Fix invalid JSXExpressions having identifier-ish things in their trivia, improve error messages for comma expressions in JSX #31480

andrewbranch · 2019-05-20T19:35:26Z

I'm not totally sure why that debug assertion was there and specific to identifiers, but removing it only seems to require that the synthetically added identifier get its escapedText set.

Fixes #25487

weswigham · 2019-05-20T19:57:05Z

While that may be what's wrong with the node (and what causes a crash when the assertion is removed), based on the PR the assertion was added in ( #21581 ) there absolutely should not be an identifier within a trivia span as far as we know - meaning the node end position is incorrect for the preceding node (since it includes this identifier-scannable span). AFAIK we should only have a stream of tokens (eg, parens or keywords) when scanning over this range.

andrewbranch · 2019-05-20T20:20:30Z

I mean, how else would you parse this?

<div>{<div />x}</div>

The parser hits the x when doing a parseExpected(SyntaxKind.CloseBraceToken) and then moves on. The end of the node is set to after x, so when the language service adds synthetic nodes, it scans x as an identifier. I can't see any reason to treat this differently than, say

<div>{<div />""}</div>

which parses the same, but since it's a token, doesn't hit the debug assertion.

This can cause a problem if we scan an Identifier because we don't give it any text here

So I gave it text here ¯\(ツ)/¯

tests/cases/fourslash/jsxExpressionFollowedByIdentifier.ts

weswigham · 2019-05-20T23:38:14Z

The parser hits the x when doing a parseExpected(SyntaxKind.CloseBraceToken) and then moves on.

parseExpected only advances if a match was found - the issue is the scanJsxText inside parseJsxExpression that's forcing the cursor forward without assigning the intervening text to a real node (because we do actually parse JSX text out). There's no discussion of it in #4228 (which introduced it) so it's a bit of an exercise in software archaeology.

(Also it looks like we usually more gracefully handle similar parse errors by synthesizing a binary expression with a comma operator - see parseJsxElementOrSelfClosingElementOrFragment)

andrewbranch · 2019-05-21T16:44:27Z

synthesizing a binary expression with a comma operator

A-ha, this is exactly what I was missing. I can do that.

So as a general rule, what kinds of syntax is the parser allowed to skip over? To me, it feels just as odd for a string literal token to end up in another syntax's trivia as it does for an identifier. But the debug assertion only cares about identifiers. I'm trying to figure out how I could have known, without feedback, that this solution is better than “fixing” synthetic identifiers, which also does seem to solve the problem. Is it that the parser should try as hard as possible to create a node for everything it sees, even when invalid?

This reverts commit 422b541.

weswigham · 2019-05-21T17:44:56Z

So as a general rule, what kinds of syntax is the parser allowed to skip over?

Keywords/punctuation, whitespace and comments. (There is still some punctuation we capture for transformer reasons.) Things we don't make real AST nodes for when parsing well-formed code.

But the debug assertion only cares about identifier

I think the assertion is just an early error to warn about the crash on access - finding a string literal or almost anything that would usually have a real AST node in the trivia is an error. The category of "trivia" is primarily just comments and whitespace and is expanded to include unparsed keywords and punctuation in this context.

Is it that the parser should try as hard as possible to create a node for everything it sees, even when invalid?

Yeah, usually I think we just stuff invalid junk we want to consume into a "missing node" which is marked appropriately for incremental reparsing.

I'm trying to figure out how I could have known, without feedback, that this solution is better than “fixing” synthetic identifiers, which also does seem to solve the problem.

Deep software archaeology. When editing code I didn't write, I'm in the habit of looking up the blames of relevant code (in this case, both the PR adding the assertion and the PR adding the scanning in jsx expression parsing) I want to change to find the justification for it being as it is and seeing if the current state is well-reasoned or apparently ad-hoc. In the case of the later, finding and fixing the root cause of the prior ad-hoc change is often able to create higher quality code than stacking more ad-hoc fixes (which is what I'm suggesting here).

andrewbranch · 2019-05-21T17:56:53Z

🕵 related: #31510

tests/cases/fourslash/jsxExpressionFollowedByIdentifier.ts

andrewbranch

@RyanCavanaugh looks like all these horrifying baselines that changed were generated by you. Would you mind taking a look?

andrewbranch · 2019-05-21T22:42:42Z

tests/baselines/reference/jsxParsingError1.js

@@ -16,5 +16,4 @@ const elem = <div className={class1, class2}/>;
 // This should be a parse error
 var class1 = "foo";
 var class2 = "bar";
-var elem = <div className={class1} class2/>;
-/>;;
+var elem = <div className={class1, class2}/>;


This emit is invalid although slightly less so than the previous? We could potentially put these inside a ParenthesizedExpression, but we're already emitting a grammar error, and it seems like there's no requirement for the emit to be totally valid if the input wasn't totally valid.

andrewbranch · 2019-05-21T22:43:28Z

tests/baselines/reference/tsxErrorRecovery1.errors.txt

    declare namespace JSX { interface Element { } }

    function foo() {
    	var x = <div>  { </div>
-    	         ~~~
-!!! error TS17008: JSX element 'div' has no corresponding closing tag.
    	                 ~~
 !!! error TS1109: Expression expected.
    }
    // Shouldn't see any errors down here


Now we don't. I think this gets significantly better.

andrewbranch · 2019-05-21T22:44:18Z

tests/baselines/reference/jsxInvalidEsprimaTestSuite.js

@@ -123,7 +123,7 @@ var x = <div>one</div>, <div>two</div>;
 var x = <div>one</div> /* intervening comment */, /* intervening comment */ <div>two</div>;
 ;
 //// [20.jsx]
-<a>{"str"}}</a>;
+<a>{"str"};}</a>;


This is weird, but barely weirder than the previous

andrewbranch · 2019-05-21T22:45:57Z

tests/baselines/reference/jsxAndTypeAssertion.js

@@ -28,18 +28,21 @@ var foo = /** @class */ (function () {
    return foo;
 }());
 var x;
-x = <any> {test} <any></any> };
+x = <any> {test}: <any></any> };


The TypeScript generating this was x = <any> { test: <any></any> };. Garbage in, garbage out?

andrewbranch · 2019-05-21T22:49:49Z

tests/baselines/reference/jsxParsingError1.errors.txt

+                                 ~~~~~~
+!!! error TS2695: Left side of comma operator is unused and has no side effects.
+                                 ~~~~~~~~~~~~~~
+!!! error TS18007: JSX expressions may not use the comma operator. Did you mean to write an array?


These two errors make a lot more sense to me than the previous three, although I could do without the first. I considered making the error range start on the comma operator instead of at the beginning of the expression. Thoughts?

andrewbranch · 2019-05-21T22:51:36Z

src/compiler/parser.ts

+                // Only an AssignmentExpression is valid here per the JSX spec,
+                // but we can unambiguously parse a comma sequence and provide
+                // a better error message in grammar checking.
+                node.expression = parseExpression();


When we successfully parse something we know is illegal, is that always a grammar error, or is there ever a reason why that should be a parse error? I know a parse error will prevent the node from being reused incrementally—in this case the tree should be well-formed, so I think reusing it isn’t a problem.

andrewbranch · 2019-05-21T23:01:12Z

Also it looks like we usually more gracefully handle similar parse errors by synthesizing a binary expression with a comma operator - see parseJsxElementOrSelfClosingElementOrFragment

Thinking this was going to be the way forward, I improved error messages for comma expressions as JSX expressions.

However, I don't think it's quite feasible to parse extra stuff in a JSXExpression as the right-hand side of a binary expression, because we have no idea how much extra stuff there might be, what kind of stuff it is, or if the closing } will indeed ever be found. The trick we do for <div /><div /> only works because it’s a special case where a JSX element follows a JSX element—that kind of logic wouldn't work very well if we needed to couple a JSX element and N other subsequent syntaxes.

By conditionally scanning for more JSX text based on whether the expected } was found, we still end up with a pretty gross parse tree, but in <div>{<div />x}</div> the x} becomes JSXText instead of trivia, so the crash is avoided and every token is accounted for.

andrewbranch · 2019-06-13T21:59:06Z

@weswigham did you ever have a chance to look at my second iteration of this? I think it’s much more aligned with your initial recommendation.

andrewbranch added 2 commits May 20, 2019 12:02

Allow synthetic identifiers to exist and give them escapedText

422b541

Add test

0a3c755

andrewbranch requested a review from sheetalkamat May 20, 2019 19:35

DanielRosenwasser reviewed May 20, 2019

View reviewed changes

tests/cases/fourslash/jsxExpressionFollowedByIdentifier.ts Outdated Show resolved Hide resolved

Use range instead of two markers

897e10a

Revert "Allow synthetic identifiers to exist and give them escapedText"

77a76c1

This reverts commit 422b541.

andrewbranch added 3 commits May 21, 2019 14:21

Improve error message for JSXExpressions that are comma expressions

feaef9c

Parse stray identifier-ish as JSXText instead of trivia

2856aab

Accept baselines, I guess

52894cf

weswigham reviewed May 21, 2019

View reviewed changes

tests/cases/fourslash/jsxExpressionFollowedByIdentifier.ts Outdated Show resolved Hide resolved

andrewbranch commented May 21, 2019

View reviewed changes

andrewbranch changed the title ~~Allow language service to parse identifiers in weird places instead of crashing~~ Fix invalid JSXExpressions having identifier-ish things in their trivia, improve error messages for comma expressions in JSX May 21, 2019

Expose ts.Diagnostics to fourslash

ad9c36e

andrewbranch requested a review from weswigham June 13, 2019 21:58

weswigham approved these changes Jun 13, 2019

View reviewed changes

RyanCavanaugh approved these changes Jun 26, 2019

View reviewed changes

andrewbranch merged commit 871bdee into microsoft:master Jun 26, 2019

andrewbranch deleted the bug/25487 branch June 26, 2019 17:14

Fix invalid JSXExpressions having identifier-ish things in their trivia, improve error messages for comma expressions in JSX #31480

Fix invalid JSXExpressions having identifier-ish things in their trivia, improve error messages for comma expressions in JSX #31480

Uh oh!

Conversation

andrewbranch commented May 20, 2019

Uh oh!

weswigham commented May 20, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andrewbranch commented May 20, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

weswigham commented May 20, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andrewbranch commented May 21, 2019

Uh oh!

weswigham commented May 21, 2019

Uh oh!

andrewbranch commented May 21, 2019

Uh oh!

Uh oh!

andrewbranch left a comment

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch May 21, 2019

Choose a reason for hiding this comment

Uh oh!

andrewbranch commented May 21, 2019

Uh oh!

andrewbranch commented Jun 13, 2019

Uh oh!

Uh oh!

weswigham commented May 20, 2019 •

edited

Loading

andrewbranch commented May 20, 2019 •

edited

Loading

weswigham commented May 20, 2019 •

edited

Loading