-
Notifications
You must be signed in to change notification settings - Fork 204
Move JRuby extension to SnakeYAML Engine #612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6321402
to
31c1ee3
Compare
The failures might be caused by the schema. Version 2.5 is taking the JSON schema, version 2.6 (released yesterday) supports also the Core schema which was used in SnakeYAML. But it must be explicitly enabled (JSON is the default) |
@asomov Thanks for that, but it doesn't seem to impact the failures too much. I get 32F and 2E with my latest local fixes. Nearly all of these fail due to the I'll push my latest after some cleanup. |
@headius it looks like a bug - I am pretty sure that the |
@asomov I figured out how to disable it, by passing an empty Optional into the DocumentStartEvent constructor. Now I'm down to 7F, 1E...mostly failing tests that want the YAML directive to be present. 🤣 |
See jruby/jruby#7570 for some of the justification for this move. We only require the parser from SnakeYAML, but in the original form it is encumbered with Java object serialization code that keeps getting flagged as a CVE risk. We disagree with the assessment, at least as it pertains to JRuby (we do not use the code in question) but our inclusion of the library continues to get flagged by auditing tools. This commit starts the process of moving to the successor library, SnakeYAML Engine. The parser API is largely unchanged, except as seen in this commit. No Java exceptions are thrown, but a number of Psych tests fail (possibly due to Engine being YAML 1.2 only).
This eliminates the %YAML 1.2 directive at the start of each emit, which improves tests passing but also breaks a few tests that *expect* the YAML directive to be present.
31c1ee3
to
d772d8a
Compare
With latest push, I'm down to only one error! The header handling was me not honoring Psych's way of indicating whether a version header should or should not be printed. The remaining error is a test case that triggers a SyntaxError in Engine: assert_to_yaml(
[{"arrival"=>"EDI", "departure"=>"LAX", "fareref"=>"DOGMA", "currency"=>"GBP"}, {"arrival"=>"MEL", "departure"=>"SYD", "fareref"=>"MADF", "currency"=>"AUD"}, {"arrival"=>"MCO", "departure"=>"JFK", "fareref"=>"DFSF", "currency"=>"USD"}], <<EOY
-
&F fareref: DOGMA
&C currency: GBP
&D departure: LAX
&A arrival: EDI
- { *F: MADF, *C: AUD, *D: SYD, *A: MEL }
- { *F: DFSF, *C: USD, *D: JFK, *A: MCO }
EOY
) The |
Oops, here's that error:
Ruby trace:
Full trace with Java lines (JIT threshold 0 to reduce interpreter frames):
|
@hsbt @tenderlove This is very close to 100%. My reading of the YAML 1.2 spec seems to indicate that it should be mostly a superset of 1.1, but maybe you have some concerns about the JRuby extension updating to 1.2 before the C extension. For context, see jruby/jruby#7570 where we are dealing with some CVE reports against the old SnakeYAML. To summarize, SnakeYAML has features that allow dumping and loading Java objects, similar to Psych's support for dumping and loading Ruby objects. And like Psych, SnakeYAML has had this feature reported as being exploitable. There's ongoing debate about how to handle this in SnakeYAML, since the exploit is the feature (object serialization), but we can dodge the issue by moving to SnakeYAML Engine since it does not have the same feature. Note that JRuby has never used this serialization feature, but just having SnakeYAML ship with JRuby causes us to be flagged as well. |
48c3af0
to
63428e3
Compare
@headius sorry, I did not quite catch how I can help. The Engine indeed had some modifications to be closer to the YAML 1.2 spec. |
@asomov I don't know how to interpret that syntax error. It may be a valid rejection of off-spec yaml. |
@headius Thanks for your works. I'm +1 to migrate SnakeYAML to SnakeYAML-Engine for JRuby. It's great first step to support YAML 1.2 spec because I didn't understand the details of YAML 1.2 spec yet 😇 |
The last two things that fail are both within Lines 219 to 252 in 1f23e6e
The YAML in these two sections follows, and as far as I can tell it does not parse with any online YAML checkers: -
&F fareref: DOGMA
&C currency: GBP
&D departure: LAX
&A arrival: EDI
- { *F: MADF, *C: AUD, *D: SYD, *A: MEL }
- { *F: DFSF, *C: USD, *D: JFK, *A: MCO } ---
ALIASES: [&f fareref, &c currency, &d departure, &a arrival]
FARES:
- *f: DOGMA
*c: GBP
*d: LAX
*a: EDI
- *f: MADF
*c: AUD
*d: SYD
*a: MEL
- *f: DFSF
*c: USD
*d: JFK
*a: MCO The errors for these two cases, with the text coming from SnakeYAML, are below:
and
I've stared at the specs for a bit but can't figure out how this YAML fits into the grammar. Perhaps these cases are no longer valid in YAML 1.2? @asomov @hsbt @tenderlove |
@headius this is valid in YAML 1.2:
mind the extra space between the alias and the colon. It is required because a colon can be a part of alias (*F: and *F are both valid aliases) |
@asomov Aha! If I update the YAML with the extra space, it does pass correctly under SnakeYAML Engine! @hsbt @tenderlove: Would we prefer to test only the new 1.2 syntax (which parses fine on 1.0 and 1.1) or test both with a version guard of some kind? I will push a change to the test for now. |
This allows these tests to pass on SnakeYAML Engine -- which is a 1.2-only YAML library -- while still passing on libyaml 1.1.
If the 1.2 syntax works on 1.0 and 1.1, then probably just update it to use the 1.2 syntax? I don't think there is a reason to have version specific tests (at least I can't think of a reason). |
Done! The most recent commit in this branch updates the test and everything is green. |
If we could embed indy call sites here they would cache as constants; this is the best we can do at the moment.
@hsbt @tenderlove Any objections to merging this? Does it need a major version update? Any concerns about releasing it soon? |
I am unsure. If it's backwards compatible, then I don't see any need for a major version increase. |
👍 I'm +1 to bump "5.1.0" for this. |
Thanks so much everyone! Looking forward to incorporating this in JRuby! |
Oh, @hsbt @tenderlove if we can push a 5.1.pre gem I can run it through JRuby's CI and make sure everything looks ok there too. |
See jruby/jruby#7570 for some of the justification for this move. We only require the parser from SnakeYAML, but in the original form it is encumbered with Java object serialization code that keeps getting flagged as a CVE risk. We disagree with the assessment, at least as it pertains to JRuby (we do not use the code in question) but our inclusion of the library continues to get flagged by auditing tools.
This PR starts the process of moving to the successor library, SnakeYAML Engine. The parser API is largely unchanged, except as seen in this commit. No Java exceptions are thrown, but a number of Psych tests fail (possibly due to Engine being YAML 1.2 only).