-
Notifications
You must be signed in to change notification settings - Fork 136
String xsd:double value crash toRrdf()
#202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
String xsd:double value crash toRrdf()
#202
Conversation
xsd:double value crash toRrdf()
|
@BigBlueHat @davidlehn please let me know what you think. Thanks! |
|
An extra test should not bring about much harm. It can be deduplicated and, if necessary, merged into the test suite of the API specification later. |
Please don't assume "it works in the json-ld.org playground" means the output is correct! We need the shared tests to be reviewed to ensure they are actually correct. It's entirely possible that the playground js code is doing something wrong and needs fixing too. Better to have the test in the approved shared suite so everyone agrees on behavior.
Great, please update the patch. Looks like it's probably a few blank lines, trailing commas, and some trailing whitespace to fix? Open an issue and/or PR about using other tooling.
Again, seems like this patch should match the style of the existing code, which doesn't yet use annotations. Probably best to discuss it in an issue then slowly or bulk add everywhere needed.
I wasn't questioning the need or reason to have a test. Did you determine if the test suite had anything to cover this? That a test needed to be written seems to imply no. This 100% needs to be in the spec api test suite if that is the case. All implementations need to cover the issue. I know it's a bit awkward to figure out how to add tests to the spec suite. Naming there has been a mess. But adding something that can get some feedback and adjusted later is not too bad. At least in the js implementation there are (clunky) features to run single or a group of tests. I use that all the time. It's often as easy as local per-implementation tests and the general tests need to be in the api test suite anyway.
It is absolutely harmful to have general tests that only exist in one implementation! Interoperability is what JSON-LD is about. All implementations need to know they function the same on a test suite shared and approved by the community.
Duplication is a problem. It would mix up which tests are Python specific vs ones that need to be added to the API test suite and removed. If it is a duplicate, than please clearly document why and refer to the API test. But what would that be for? It's now later and it is necessary. Please add it to the spec api test suite. I also find adding to the test suite sometimes encourages me or others to add more variations. Especially when it's about a handful of triples. Like here, do positive, negative, zero, lots of sig digits, values near maximums and minimums, maybe some non-numbers if it makes sense, etc. All that being said, thank you for the test and fix. :-) |
| r'(\d)0*E\+?0*(\d)', r'\1E\2', | ||
| ('%1.15E' % value)) | ||
| object['datatype'] = datatype or XSD_DOUBLE | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Match style and remove appropriate newlines.
| return { | ||
| **object, | ||
| 'value': _canonicalize_double(value), | ||
| 'datatype': datatype or XSD_DOUBLE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Match style and remove trailing commas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason to add trailing commas is to reduce the size of a diff on a subsequent change.
If, below
'datatype': datatype or XSD_DOUBLE
someone would decide another item in the dictionary, — the diff will cover two lines:
- the existing line, with an extra comma being added,
- and the next line, which has been inserted.
Is there a style guide in this project which would forbid certain style elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am aware of pros/cons of this and other style choices. The code is the style guide. Please match it for now.
| object['datatype'] = datatype or XSD_DOUBLE | ||
|
|
||
| elif _is_double(value): | ||
| return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use this construct? Would matching the other code and setting value and datatype work? Or if it is to maybe avoid the IRI conditional below, could style use that style and a return, rather than create a new object. But in that case, everything else here should match. Maybe better to use existing style and refactor it all later if needed.
|
|
||
| def test_offline_pyld_bug_reproduction(self): | ||
| """Test reproducing the PyLD bug with captured Wikidata data structure.""" | ||
| # This is the exact problematic data structure captured from Wikidata Q399 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments here and below should be adjust to only say what is happening, if needed at all. Text about a contemporary bug will not make much sense after the issue is fixed. That seems like text for the PR comments, not something that needs to be in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was primarily written as an illustration to prove that the bug exists to those who read this PR. Thanks for pointing this out, I will clean it up if/when this PR is being prepared to merge after the more urgent changes. At that point, this test might also require refactoring because it will be running under an updated test system.
| data = { | ||
| "@context": { | ||
| "xsd": "http://www.w3.org/2001/XMLSchema#", | ||
| "geoLongitude": "http://www.w3.org/2003/01/geo/wgs84_pos#longitude" | ||
| }, | ||
| "@graph": [ | ||
| { | ||
| "@id": "http://www.wikidata.org/entity/Q399", | ||
| "geoLongitude": { | ||
| "@type": "xsd:double", | ||
| "@value": "45" # This string number causes the PyLD bug | ||
| } | ||
| } | ||
| ] | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think tests should be as minimal as possible. This could inline xsd prefix, but if this were to be slightly expanded to have various values, xsd: is easier to read. How about something like:
| data = { | |
| "@context": { | |
| "xsd": "http://www.w3.org/2001/XMLSchema#", | |
| "geoLongitude": "http://www.w3.org/2003/01/geo/wgs84_pos#longitude" | |
| }, | |
| "@graph": [ | |
| { | |
| "@id": "http://www.wikidata.org/entity/Q399", | |
| "geoLongitude": { | |
| "@type": "xsd:double", | |
| "@value": "45" # This string number causes the PyLD bug | |
| } | |
| } | |
| ] | |
| } | |
| data = { | |
| "@context": { | |
| "xsd": "http://www.w3.org/2001/XMLSchema#" | |
| }, | |
| "@graph": [ | |
| { | |
| "@id": "ex:1", | |
| "ex:p": { | |
| "@type": "xsd:double", | |
| "@value": "45" | |
| } | |
| } | |
| ] | |
| } |
| # The bug was in PyLD's _object_to_rdf method where string values | ||
| # with @type: "xsd:double" were not being converted to float | ||
| result = pyld.jsonld.to_rdf(data) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove trailing whitespace here and elsewhere.
| Tests for to_rdf functionality, specifically focusing on double/float handling bugs. | ||
| """ | ||
|
|
||
| import json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused.
Why
Here's a JSON-LD document.
{ "@context": { "xsd": "http://www.w3.org/2001/XMLSchema#", "geoLongitude": "http://www.w3.org/2003/01/geo/wgs84_pos#longitude" }, "@graph": [ { "@id": "http://www.wikidata.org/entity/Q399", "geoLongitude": { "@type": "xsd:double", "@value": "45" } } ] }Conversion of this document to a RDF dataset works at JSON-LD playground, but it will crash
pyld2.0.4:What
This PR