fix(linter/plugins)!: use TypeScript parser directly for getting tokens #16805

overlookmotel · 2025-12-13T18:16:24Z

Previously we used @typescript-eslint/typescript-estree parser to tokenize a source text. @typescript-eslint/typescript-estree uses TypeScript parser internally, and then builds the tokens array by traversing TypeScript's AST.

Instead, use TypeScript parser directly, and copy the code from TS-ESLint for converting tokens from TypeScript's format to TS-ESTree format.

Why?

This has a few advantages:

Remove a dependency.
TS-ESLint's tokens did not have start or end properties, which we want for consistency with our AST. By reimplementing ourselves, we can add those properties to tokens.
TS-ESLint does a bunch of other work besides compiling the tokens array - it converts the AST too. We discard the AST that TS-ESLint produces, so this work is pointless. By cutting out TS-ESLint, we also cut out that unnecessary work, and do only what we need to - tokenizing.
TS-ESLint would throw an error if TypeScript produces any errors (even non-fatal ones), and there is no option to tell TS-ESLint not to do that. This caused a bunch of conformance tests to fail, because ESLint's tests include invalid code.
Can add workarounds for TypeScript bugs - e.g. not being able to parse JSX closing fragments containing line breaks (<>hello<\n/>).

Implementation detail

We no longer need to get comments from TypeScript. We use the comments produced by Oxc parser instead.

`CommentToken` type

A side effect is that there's no longer any need for separate Comment and CommentToken types - as we use our own comments now. So this PR removes CommentToken type.

Have marked this as a breaking change due to removal of this type.

overlookmotel · 2025-12-13T18:16:37Z

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

0-merge - adds this PR to the back of the merge queue
hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

_{An organization admin has enabled the Graphite Merge Queue in this repository.} _{Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.}

This stack of pull requests is managed by Graphite. Learn more about stacking.

Pure refactor of tests. Unit tests for `isSpaceBetween` were using `@typescript-eslint/typescript-estree` to parse code snippets. Instead use Oxc parser, and set up global state the same way as if the file was being linted, instead of mocking `source_code.ts` to get the source text into the linter's internals. This brings the tests closer to real usage. It also prepares the way for changing how tokens are parsed in a following PR (#16805).

Copilot

Pull request overview

This pull request replaces the @typescript-eslint/typescript-estree parser with direct TypeScript parser usage for tokenization. The change removes an external dependency, adds start and end properties to tokens for consistency with Oxc AST, and enables parsing of invalid code that TS-ESLint would reject.

Key changes:

Implements custom token parsing using TypeScript's parser directly instead of TS-ESLint
Removes CommentToken type (breaking change) in favor of using Oxc's Comment type
Uses Oxc parser's comments instead of TypeScript's comments
Adds start and end properties to tokens alongside existing range property

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
pnpm-lock.yaml	Removes `@typescript-eslint/typescript-estree` and related dependencies
apps/oxlint/package.json	Removes `@typescript-eslint/typescript-estree` from dependencies
apps/oxlint/tsdown.config.ts	Updates bundle entry from `ts_eslint.cjs` to `typescript.cjs` and updates comments
apps/oxlint/src-js/plugins/typescript.cjs	New bundle entry that exports TypeScript module directly
apps/oxlint/src-js/plugins/ts_eslint.cjs	Deleted - replaced by typescript.cjs
apps/oxlint/src-js/plugins/tokens_parse.ts	New file implementing token parsing using TypeScript parser, adapted from TS-ESLint's implementation
apps/oxlint/src-js/plugins/tokens.ts	Refactors `initTokens` to use new parser, removes TS-ESLint lazy-loading, uses Oxc comments instead of TS-ESLint comments
apps/oxlint/src-js/plugins/types.ts	Changes `NodeOrToken` type from `Node \| Token \| CommentToken` to `Node \| Token \| Comment`
apps/oxlint/src-js/plugins/source_code.ts	Updates return type of `tokensAndComments` to use `Comment` instead of `CommentToken`
apps/oxlint/src-js/index.ts	Removes `CommentToken` from public exports (breaking change)
apps/oxlint/test/tokens.test.ts	Updates test setup to properly initialize parsing environment including context, buffers, and source text
apps/oxlint/test/fixtures/tokens/plugin.ts	Adds assertions to verify tokens have `start` and `end` properties matching `range`
apps/oxlint/conformance/snapshot.md	Shows improved test pass rates: 36 more tests passing overall, with 3 rules moving to fully passing

Files not reviewed (1)

pnpm-lock.yaml: Language not supported

Comments suppressed due to low confidence (1)

apps/oxlint/src-js/plugins/tokens.ts:211

The TODO comment indicates that range[0] should be replaced with start now that tokens have start property. However, this TODO has not been addressed in this PR. The function still uses token.range[0] and comment.range[0] instead of token.start and comment.start. According to the stored memory about token property access, the codebase convention is to use start and end properties for system-produced tokens. This inconsistency should be resolved for better code consistency and performance (property access is faster than array indexing).

  // TODO: Replace `range[0]` with `start` throughout this function
  // once we have our own tokens which have `start` property

  // Fast paths for file with no comments, or file which is only comments
  const commentsLength = comments.length;
  if (commentsLength === 0) {
    tokensAndComments = tokens;
    return;
  }

  const tokensLength = tokens.length;
  if (tokensLength === 0) {
    tokensAndComments = comments;
    return;
  }

  // File contains both tokens and comments.
  // Fill `tokensAndComments` with the 2 arrays interleaved in source order.
  tokensAndComments = [];

  let tokenIndex = 0,
    commentIndex = 0,
    token = tokens[0],
    comment = comments[0],
    tokenStart = token.range[0],
    commentStart = comment.range[0];

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

apps/oxlint/src-js/plugins/tokens.ts

charliecreates

The new TypeScript-native tokenization introduces a hard regression in debug/test builds: debugCheckValidRanges now throws on end === start, and the conformance snapshot shows multiple Invalid token range: … failures, indicating this is currently breaking suites. tokens_parse.ts depends on parent pointers for JSX token classification, so setParentNodes must be intentionally kept true to prevent subtle token-type regressions. Finally, the null as any + __proto__ approach in tokens_parse.ts weakens safety/maintainability and is worth tightening given this is a core parsing primitive.

Additional notes (1)

Maintainability | apps/oxlint/src-js/plugins/tokens.ts:168-187
initTokensAndComments now sources comments from ast.comments (Oxc) but tokens come from TypeScript. You do validate ranges independently, but you are implicitly assuming both systems use the same indexing scheme and that both are based on the same exact sourceText.

Given the new snapshot failures and the presence of BOM handling in setupSourceForFile, it’s worth explicitly verifying that TS token offsets and Oxc comment offsets are aligned (especially around BOM and newline normalization). If they diverge, merging/sorting will produce incorrect order and potentially overlapping ranges.

Summary of changes

Tokenization pipeline refactor

Replaced tokenization based on @typescript-eslint/typescript-estree with a new TypeScript-native tokenizer:
- apps/oxlint/src-js/plugins/tokens.ts now calls parseTokens() instead of TS-ESLint parse().
- Added apps/oxlint/src-js/plugins/tokens_parse.ts, copying/adapting TS-ESLint logic to walk a TS AST and convert TS tokens into the project’s Token shape (now including start/end and lazy loc).
- Added a new lazy-loaded bundle entry apps/oxlint/src-js/plugins/typescript.cjs and updated apps/oxlint/tsdown.config.ts to build it.

Types and API surface changes

Removed the CommentToken type export and switched to using Comment from plugins/types.ts:
- tokensAndComments is now typed as (Token | Comment)[].
- NodeOrToken now includes Comment.
- BaseToken now extends Span directly so tokens gain start/end.

Test and fixtures updates

Updated token fixture plugin to assert token.start/end === token.range[...].
Reworked tokens.test.ts to avoid mocking source_code.ts; it now uses real parse/buffer setup via parseRaw() and file context helpers.

Dependency changes

Removed @typescript-eslint/typescript-estree from apps/oxlint devDependencies and lockfile.

Conformance output changes

Snapshot updated for improved conformance results and new error outputs (notably Invalid token range: …).

apps/oxlint/src-js/plugins/tokens_parse.ts

graphite-app · 2025-12-13T22:54:32Z

Merge activity

Dec 13, 10:54 PM UTC: overlookmotel added this pull request to the Graphite merge queue.
Dec 13, 11:01 PM UTC: Merged by the Graphite merge queue.

…ns (#16805) Previously we used `@typescript-eslint/typescript-estree` parser to tokenize a source text. `@typescript-eslint/typescript-estree` uses TypeScript parser internally, and then builds the tokens array by traversing TypeScript's AST. Instead, use TypeScript parser directly, and copy the code from TS-ESLint for converting tokens from TypeScript's format to TS-ESTree format. ### Why? This has a few advantages: 1. Remove a dependency. 2. TS-ESLint's tokens did not have `start` or `end` properties, which we want for consistency with our AST. By reimplementing ourselves, we can add those properties to tokens. 3. TS-ESLint does a bunch of other work besides compiling the tokens array - it converts the AST too. We discard the AST that TS-ESLint produces, so this work is pointless. By cutting out TS-ESLint, we also cut out that unnecessary work, and do only what we need to - tokenizing. 4. TS-ESLint would throw an error if TypeScript produces any errors (even non-fatal ones), and there is no option to tell TS-ESLint not to do that. This caused a bunch of conformance tests to fail, because ESLint's tests include invalid code. 5. Can add workarounds for TypeScript bugs - e.g. not being able to parse JSX closing fragments containing line breaks (`<>hello<\n/>`). ### Implementation detail We no longer need to get comments from TypeScript. We use the comments produced by Oxc parser instead. ### `CommentToken` type A side effect is that there's no longer any need for separate `Comment` and `CommentToken` types - as we use our own comments now. So this PR removes `CommentToken` type. Have marked this as a breaking change due to removal of this type.

Pure refactor. `| Comment` in `NodeOrToken | Comment` is redundant. Since #16805, `NodeOrToken` = `Node | Token | Comment` anyway. Shorten to just `NodeOrToken`.

…ens methods (#16808) #16805 added `start` and `end` properties to tokens. Use them instead of the slower and more verbose `range[0]` and `range[1]` in all tokens methods. Note: Only use `start` and `end` for tokens and comments, as we produce them and know for sure that these properties are present. Continue to use `range` for objects passed to methods by user.

overlookmotel mentioned this pull request Dec 13, 2025

test(linter/plugins): tests use Oxc parser #16803

Merged

github-actions bot added A-linter Area - Linter A-cli Area - CLI A-linter-plugins Area - Linter JS plugins labels Dec 13, 2025

overlookmotel mentioned this pull request Dec 13, 2025

test(linter/plugins): refactor tokens tests #16804

Merged

github-actions bot added the C-bug Category - Bug label Dec 13, 2025

This was referenced Dec 13, 2025

feat(linter/plugins)!: do not export NodeOrToken type #16806

Merged

refactor(linter/plugins): shorten NodeOrToken | Comment type #16807

Merged

perf(linter/plugins): use start and end instead of range in tokens methods #16808

Merged

overlookmotel force-pushed the 12-13-test_linter_plugins_refactor_tokens_tests branch from c457c5c to bfbf14d Compare December 13, 2025 18:42

overlookmotel force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch from 6821b23 to 3d24979 Compare December 13, 2025 18:42

overlookmotel self-assigned this Dec 13, 2025

graphite-app bot changed the base branch from 12-13-test_linter_plugins_refactor_tokens_tests to graphite-base/16805 December 13, 2025 18:52

graphite-app bot force-pushed the graphite-base/16805 branch from bfbf14d to d734b86 Compare December 13, 2025 18:58

graphite-app bot force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch from 3d24979 to babcbb4 Compare December 13, 2025 18:58

graphite-app bot changed the base branch from graphite-base/16805 to main December 13, 2025 18:59

graphite-app bot force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch from babcbb4 to a211e0f Compare December 13, 2025 18:59

overlookmotel changed the title ~~fix(linter/plugins): use TypeScript parser directly for getting tokens~~ fix(linter/plugins)!: use TypeScript parser directly for getting tokens Dec 13, 2025

overlookmotel changed the base branch from main to graphite-base/16805 December 13, 2025 21:13

overlookmotel force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch from a211e0f to 1973405 Compare December 13, 2025 21:13

overlookmotel changed the base branch from graphite-base/16805 to 12-13-refactor_linter_plugins_stronger_debug_checks_for_tokens December 13, 2025 21:13

This was referenced Dec 13, 2025

refactor(linter/plugins): stronger debug checks for tokens #16810

Merged

test(linter/plugins): tokens test report each token individually #16811

Merged

overlookmotel marked this pull request as ready for review December 13, 2025 21:15

overlookmotel requested a review from camc314 as a code owner December 13, 2025 21:15

Copilot AI review requested due to automatic review settings December 13, 2025 21:15

charliecreates bot requested a review from CharlieHelps December 13, 2025 21:15

Copilot started reviewing on behalf of overlookmotel December 13, 2025 21:15 View session

Copilot AI reviewed Dec 13, 2025

View reviewed changes

apps/oxlint/src-js/plugins/tokens.ts Show resolved Hide resolved

charliecreates bot reviewed Dec 13, 2025

View reviewed changes

apps/oxlint/src-js/plugins/tokens_parse.ts Show resolved Hide resolved

apps/oxlint/src-js/plugins/tokens_parse.ts Show resolved Hide resolved

apps/oxlint/src-js/plugins/tokens_parse.ts Show resolved Hide resolved

apps/oxlint/src-js/plugins/tokens_parse.ts Show resolved Hide resolved

charliecreates bot removed the request for review from CharlieHelps December 13, 2025 21:21

overlookmotel force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch 2 times, most recently from eeff8a9 to e2d81cd Compare December 13, 2025 22:20

graphite-app bot added the 0-merge Merge with Graphite Merge Queue label Dec 13, 2025

graphite-app bot force-pushed the 12-13-refactor_linter_plugins_stronger_debug_checks_for_tokens branch from d608f7c to c06d243 Compare December 13, 2025 22:55

graphite-app bot force-pushed the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch from e2d81cd to f814c82 Compare December 13, 2025 22:55

Base automatically changed from 12-13-refactor_linter_plugins_stronger_debug_checks_for_tokens to main December 13, 2025 23:00

graphite-app bot removed the 0-merge Merge with Graphite Merge Queue label Dec 13, 2025

graphite-app bot merged commit f814c82 into main Dec 13, 2025
20 checks passed

graphite-app bot deleted the 12-12-fix_linter_plugins_use_typescript_parser_directly_for_getting_tokens branch December 13, 2025 23:01

Boshen mentioned this pull request Dec 15, 2025

release(apps): oxlint v1.33.0 && oxfmt v0.18.0 #16888

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix(linter/plugins)!: use TypeScript parser directly for getting tokens #16805

fix(linter/plugins)!: use TypeScript parser directly for getting tokens #16805

Uh oh!

overlookmotel commented Dec 13, 2025 •

edited

Loading

Uh oh!

overlookmotel commented Dec 13, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

charliecreates bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

graphite-app bot commented Dec 13, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

fix(linter/plugins)!: use TypeScript parser directly for getting tokens #16805

fix(linter/plugins)!: use TypeScript parser directly for getting tokens #16805

Uh oh!

Conversation

overlookmotel commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why?

Implementation detail

CommentToken type

Uh oh!

overlookmotel commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How to use the Graphite Merge Queue

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

charliecreates bot left a comment

Choose a reason for hiding this comment

Summary of changes

Tokenization pipeline refactor

Types and API surface changes

Test and fixtures updates

Dependency changes

Conformance output changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

graphite-app bot commented Dec 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merge activity

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

overlookmotel commented Dec 13, 2025 •

edited

Loading

`CommentToken` type

overlookmotel commented Dec 13, 2025 •

edited

Loading

graphite-app bot commented Dec 13, 2025 •

edited

Loading