Prepare Tree-sitter for Master Merge #3398
Draft
sdottaka wants to merge 80 commits into
Integrate tree-sitter as an optional syntax highlighting engine that supplements the existing keyword-based CrystalEdit parsers.

When a grammar DLL and highlight query (.scm) are present in the TreeSitterGrammars directory, tree-sitter provides full AST-based highlighting; otherwise the existing parser runs unchanged.

Core components:
- TreeSitterParser.h/.cpp: CTreeSitterParser, CTreeSitterColorMap, CTreeSitterLanguage, and TreeSitterRegistry classes
- ParseLine virtual override in CMergeEditView for tree-sitter results
- Incremental parsing via ts_tree_edit() on each edit operation
- Lazy reparse with dirty flag (fires once per paint cycle)
- Status bar indicator showing [TS:language] in encoding pane
- Post-build step to copy grammar DLLs from Release to Debug/Test

Supported languages: bash, c, c-sharp, cpp, css, dtd, flow, fsharp, fsharp_signature, go, html, java, javascript, json, php, php_only, python, ruby, rust, tsx, typescript, xml.

Grammar DLLs are built separately via build-grammars.ps1.
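The "lazy reparse with dirty flag" item can be pictured with a small sketch. This is not the PR's CTreeSitterParser: the class name LazyReparser and its methods OnEdit, EnsureParsed, ParseCount, and IsDirty are hypothetical, and the actual parse is replaced by a counter so the example stays self-contained.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a parser wrapper; not the PR's CTreeSitterParser.
class LazyReparser {
public:
    // Record an edit: only mark the cached result stale, do no parsing here.
    void OnEdit(std::string newText) {
        m_text = std::move(newText);
        m_dirty = true;
    }

    // Intended to be called from the paint path: reparses at most once per
    // batch of edits, then clears the flag.
    void EnsureParsed() {
        if (!m_dirty)
            return;
        ++m_parseCount;  // placeholder for the real (incremental) parse
        m_dirty = false;
    }

    int ParseCount() const { return m_parseCount; }
    bool IsDirty() const { return m_dirty; }

private:
    std::string m_text;
    bool m_dirty = false;
    int m_parseCount = 0;
};
```

With this shape, several edits between two paints cost a single reparse, which is presumably the point of deferring the work to the paint cycle.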
- build-grammars.ps1: downloads and compiles grammar DLLs from GitHub releases using MSVC cl.exe/link.exe
- grammars.json: defines 17 grammar repos and release tags
- fsharp-highlights.scm: F# syntax highlight queries for tree-sitter
Wire in scope-aware highlighting (locals.scm) and language injection (injections.scm) alongside the existing highlights.scm support.

- CTreeSitterLanguage: add LoadQuery() helper, load all three .scm files
- CTreeSitterParser: add RunLocalsQuery() for scope/def/ref tracking, RunInjectionQuery() for embedded language highlighting, GetSetProperty() for #set! predicate parsing; RunHighlightQuery() cross-references locals
- TreeSitterRegistry: add GetLanguageForName() for injection language lookup
- build-grammars.ps1: resolve and copy locals.scm and injections.scm files
- Fix type mismatch (RefInfo vs PendingRef) and remove dead code
- Add tree-sitter shared items to solution and projects
- Update SampleStatic project to include tree-sitter
- Fix build-grammars.ps1 to use Git Bash explicitly
- Add missing <algorithm> include
- Minor solution cleanup and add Italian translation
* fix: bundle inherited tree-sitter queries for grammars
  Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0
  Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>
* refine tree-sitter query bundling helpers
  Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0
  Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>
* polish tree-sitter query bundle handling
  Agent-Logs-Url: https://github.com/Thorium/winmerge/sessions/234ce03d-a145-4b8c-b4c2-37eed3e33cf0
  Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>
* Earlier Copilot feedback addressed.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Thorium <229355+Thorium@users.noreply.github.com>
* Doc - Italian language - Updated (#3319)
* Update Italian.po
* Fix issue #3321: [BUG] Incorrect string used with beta releases
* Show error message when entering path in header bar (#3322)
* Prioritize explicitly selected plugins over archive detection (#3324)
* Prioritize explicitly selected plugins over archive detection
* Update Src/7zCommon.cpp
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update Src/7zCommon.cpp
  ---------
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (#3323)
* Use 7-Zip IsArc API for archive detection and refactor format guessing logic
* Update ArchiveSupport/Merge7z/Merge7zCommon.cpp
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Restore extension-only fallback in GuessFormatEx and handle NEED_MORE result
  Agent-Logs-Url: https://github.com/WinMerge/winmerge/sessions/47af4d0f-fc0a-4e33-ab81-8ec95c0f599e
  Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>
* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (2)
* Use 7-Zip IsArc API for archive detection and refactor format guessing logic (3)
* Prioritize explicitly selected plugins over archive detection
* Update Src/7zCommon.cpp
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update Src/7zCommon.cpp
* Update Merge7zCommon.cpp
  ---------
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>
* Merge7z: Bump revision to 2600.1
* Merge7z: Bump revision to 2600.1 (2)
* Update French Manual (#3325)
* Refactor: unify open parameters and move recurse to OpenFolderParams (#3326)
* Update Manual/French.po
* Refactor: unify open parameters and move recurse to OpenFolderParams (#3326) (2)
  (cherry picked from commit 83af229)
* Add Folder comparison mode with archive extraction support (#3320)
* Update Manual/French.po
* Update Brazilian.po (#3328)
  Added translation for "Add Folder comparison mode with archive extraction support (#3320)"
* Update German.po (#3329)
* update zh-cn translation (#3331)
* Update Turkish.po (#3333)
  New string entries
* Update Korean (#3334)
* Code review fixes for 5 oldest source files #3327 #1
* Code review fixes for 5 oldest source files #3327 #2
* Update Turkish.po
* Update TranslationsStatus
* Update ChangeLog&ReleaseNotes
* Italian language (#3335)
* Stabilize tree-sitter highlight precedence
  Make overlapping captures resolve deterministically so syntax colors stay consistent across panes and languages. Also accept local.* capture prefixes so newer query conventions keep local symbol highlighting working.
* Unify tree-sitter block ordering
  Use one parser-wide block order counter so injected-language highlights cannot collide with primary highlight ordering when the final precedence tie-breaker runs.

---------

Co-authored-by: bovirus <1262554+bovirus@users.noreply.github.com>
Co-authored-by: Takashi Sawanaka <sdottaka@users.sourceforge.net>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: sdottaka <98126+sdottaka@users.noreply.github.com>
Co-authored-by: t3chnob0y <t3chnob0y@users.noreply.github.com>
Co-authored-by: Marcellomco <70959309+Marcellomco@users.noreply.github.com>
Co-authored-by: René T. Nicolaus <12006431+Havoc7891@users.noreply.github.com>
Co-authored-by: YG <1246410+yingang@users.noreply.github.com>
Co-authored-by: bilimiyorum <131397022+bilimiyorum@users.noreply.github.com>
Co-authored-by: VenusGirl❤ <venusgirl@outlook.com>
* Finish tree-sitter runtime integration for compare views
  Wire the runtime grammar bundle, compare-view UI, and same-file navigation together so tree-sitter features are actually available in built binaries. This also updates the F# grammar bundle to include tags and disables Go to Definition when the current caret position cannot resolve.
* Fix tree-sitter follow-up packaging issues
  Guard the WiX grammar component reference when harvested files are absent, and remove the redundant TreeSitterWrapper include to avoid the _T macro redefinition warning.
# Conflicts:
# ArchiveSupport/Merge7z/BuildArc.cmd
# Docs/Users/ChangeLog.html
# Docs/Users/ChangeLog.md
# Docs/Users/ReleaseNotes.html
# Docs/Users/ReleaseNotes.md
# DownloadDeps.cmd
# Src/FilepathEdit.cpp
# Src/Merge.vcxproj.filters
# Src/res/new_folder.bmp
# Translations/TranslationsStatus.md
# Translations/WinMerge/Arabic.po
# Translations/WinMerge/Basque.po
# Translations/WinMerge/Brazilian.po
# Translations/WinMerge/Bulgarian.po
# Translations/WinMerge/Catalan.po
# Translations/WinMerge/ChineseSimplified.po
# Translations/WinMerge/ChineseTraditional.po
# Translations/WinMerge/Corsican.po
# Translations/WinMerge/Croatian.po
# Translations/WinMerge/Czech.po
# Translations/WinMerge/Danish.po
# Translations/WinMerge/Dutch.po
# Translations/WinMerge/English.pot
# Translations/WinMerge/Finnish.po
# Translations/WinMerge/French.po
# Translations/WinMerge/Galician.po
# Translations/WinMerge/German.po
# Translations/WinMerge/Greek.po
# Translations/WinMerge/Hebrew.po
# Translations/WinMerge/Hungarian.po
# Translations/WinMerge/Italian.po
# Translations/WinMerge/Japanese.po
# Translations/WinMerge/Korean.po
# Translations/WinMerge/Lithuanian.po
# Translations/WinMerge/Norwegian.po
# Translations/WinMerge/Persian.po
# Translations/WinMerge/Polish.po
# Translations/WinMerge/Portuguese.po
# Translations/WinMerge/Romanian.po
# Translations/WinMerge/Russian.po
# Translations/WinMerge/Serbian.po
# Translations/WinMerge/Sinhala.po
# Translations/WinMerge/Slovak.po
# Translations/WinMerge/Slovenian.po
# Translations/WinMerge/Spanish.po
# Translations/WinMerge/Swedish.po
# Translations/WinMerge/Tamil.po
# Translations/WinMerge/Turkish.po
# Translations/WinMerge/Ukrainian.po
# Translations/WinMerge/Vietnamese.po
…s and FolderCompare projects are not yet buildable. MFC dependencies still need to be removed from TreeSitterParser.
* Fix tree-sitter go to definition from context menus
  Update right-click navigation to resolve the symbol under the mouse and prefer tagged type definitions when the position-based lookup stays on the current line.
* Update tree-sitter context-menu definition handling
# Conflicts:
# Src/Merge.vcxproj
# Src/MergeDoc.cpp
# Src/MergeDoc.h
Replace ITextBuffer* parameter in NotifyEdit with TextEdit struct. Move notification to buffer layer (AddUndoRecord) for consistency.
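To illustrate what passing an edit description (rather than a buffer pointer) to a listener might look like, here is a rough sketch. The field names below are an assumption loosely modeled on the byte offsets in tree-sitter's TSInputEdit; the PR's actual TextEdit struct is not shown in this conversation, and ReplaceRange is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>

// Hypothetical edit description; NOT the PR's TextEdit definition.
// Offsets are byte positions in the text before/after the change.
struct TextEdit {
    std::size_t startByte;   // where the change begins
    std::size_t oldEndByte;  // end of the replaced range in the old text
    std::size_t newEndByte;  // end of the inserted range in the new text
};

// Apply a replacement to `text` and return the record a listener would receive.
TextEdit ReplaceRange(std::string& text, std::size_t start, std::size_t length,
                      const std::string& replacement) {
    text.replace(start, length, replacement);
    return TextEdit{start, start + length, start + replacement.size()};
}
```

A value like this is enough for an incremental parser to adjust its tree, so the notification no longer needs access to the buffer itself.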
- Move TreeSitterParser and TreeSitterWrapper from Externals/crystaledit/editlib to Src/
- Move tree-sitter library from Externals/crystaledit/editlib/ to Externals/ (top-level)
- Remove TreeSitter references from editlibparsers.vcxitems (CrystalEdit shared items)
- Update include paths in WinMerge source files to reference local TreeSitter headers
- Update project files and solution configuration

This decouples tree-sitter from CrystalEdit, making CrystalEdit a pure text editor library while keeping tree-sitter as a WinMerge-specific feature.
…esign

Remove stored buffer reference from CTreeSitterParser and pass ITextBuffer* explicitly to methods that need it. This eliminates hidden state and makes buffer dependencies explicit at call sites.

Changes:
- Remove m_pBuffer, SetBuffer(), and GetBuffer() from CTreeSitterParser
- Add ITextBuffer* parameter to FindDefinition() and TryGetTagDefinitionByNameAt()
- Introduce TreeSitterParseContext struct to hold both parser and buffer references
- Update MergeDoc to create and own TreeSitterParseContext instances
- Update ParseLineTreeSitter() to use context for lazy reparse with explicit buffer
- Update all call sites in MergeEditView to pass buffer parameter
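A minimal sketch of the "explicit buffer" idea follows. The types here are simplified stand-ins: ITextBuffer is reduced to two hypothetical methods, Parser and ParseContext are illustrative names, and the lookup is a toy substring search rather than anything tree-sitter does.

```cpp
#include <string>
#include <vector>

// Simplified stand-in; the real ITextBuffer interface is richer.
struct ITextBuffer {
    virtual ~ITextBuffer() = default;
    virtual int GetLineCount() const = 0;
    virtual std::string GetLine(int line) const = 0;
};

// In-memory buffer used only for this example.
struct VectorBuffer : ITextBuffer {
    std::vector<std::string> lines;
    int GetLineCount() const override { return static_cast<int>(lines.size()); }
    std::string GetLine(int line) const override { return lines[line]; }
};

class Parser {
public:
    // The buffer is a parameter, so the parser keeps no hidden pointer to it.
    // Toy lookup: returns the first line containing `name`, or -1.
    int FindDefinition(const ITextBuffer& buffer, const std::string& name) const {
        for (int i = 0; i < buffer.GetLineCount(); ++i)
            if (buffer.GetLine(i).find(name) != std::string::npos)
                return i;
        return -1;
    }
};

// Pairs a parser with the buffer it should read; meant to be owned by the document.
struct ParseContext {
    Parser* parser = nullptr;
    const ITextBuffer* buffer = nullptr;
};
```

Because the pairing lives in a context object owned by the document, the same parser code can be pointed at a different buffer without any SetBuffer()-style mutation.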
Keep only the highest priority highlight when multiple captures match the same token range, preventing conflicting color indices.
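One way to picture this rule is the sketch below. The Capture record, its fields, and KeepHighestPriority are hypothetical names, and the tie-break (the earlier capture wins on equal priority) is an assumption made for the example rather than something stated in the commit.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical capture record: token range, color index, and a priority.
struct Capture {
    int start;
    int end;
    int colorIndex;
    int priority;
};

// For each distinct [start, end) range keep only the capture with the highest
// priority. On equal priority the earlier capture is kept (an assumption).
std::vector<Capture> KeepHighestPriority(const std::vector<Capture>& captures) {
    std::map<std::pair<int, int>, Capture> best;
    for (const Capture& c : captures) {
        auto key = std::make_pair(c.start, c.end);
        auto it = best.find(key);
        if (it == best.end())
            best.emplace(key, c);
        else if (c.priority > it->second.priority)
            it->second = c;
    }
    std::vector<Capture> result;
    for (const auto& entry : best)
        result.push_back(entry.second);
    return result;  // ordered by (start, end) because std::map is sorted
}
```

After this pass each token range carries exactly one color index, so the renderer never has to choose between conflicting highlights.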
- Add ISyntaxParser::FindMatchingBrace() with default false implementation
- Implement FindMatchingBrace() in CrystalLineParserAdapter using legacy logic
- Implement FindMatchingBrace() in TreeSitterParserAdapter using AST structure
- Refactor CCrystalTextView::OnMatchBrace() to delegate to parser
- Add m_nCurrentTextType to track current parser type for UI state
- Remove redundant m_CurSourceDef->flags writes from menu handlers
- Update OnUpdateSourceType/OnToggleSourceHeader to use m_nCurrentTextType

This reduces CCrystalTextView's dependency on m_CurSourceDef and provides cleaner abstraction for syntax-aware brace matching.
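The "default false implementation" pattern can be sketched as follows. Only the names ISyntaxParser and FindMatchingBrace come from the commit text; the signature, the ParenParser class, and the text-scanning logic are assumptions for illustration and do not reflect the legacy or AST-based implementations.

```cpp
#include <cstddef>
#include <string>

// Simplified interface; the signature is an assumption for this sketch.
struct ISyntaxParser {
    virtual ~ISyntaxParser() = default;
    // Default: this parser cannot match braces, so the caller should fall back.
    virtual bool FindMatchingBrace(const std::string& /*text*/, std::size_t /*pos*/,
                                   std::size_t& /*match*/) const {
        return false;
    }
};

// Toy implementation that tracks nesting depth of '(' and ')' in plain text.
struct ParenParser : ISyntaxParser {
    bool FindMatchingBrace(const std::string& text, std::size_t pos,
                           std::size_t& match) const override {
        if (pos >= text.size())
            return false;
        int depth = 0;
        if (text[pos] == '(') {
            // Scan forward until the depth returns to zero.
            for (std::size_t i = pos; i < text.size(); ++i) {
                if (text[i] == '(') ++depth;
                else if (text[i] == ')' && --depth == 0) { match = i; return true; }
            }
        } else if (text[pos] == ')') {
            // Scan backward, starting at pos itself, until the depth returns to zero.
            for (std::size_t i = pos + 1; i-- > 0;) {
                if (text[i] == ')') ++depth;
                else if (text[i] == '(' && --depth == 0) { match = i; return true; }
            }
        }
        return false;
    }
};
```

A view can then call FindMatchingBrace() on whatever parser is active and treat a false result as "use the fallback path", without knowing which implementation it is talking to.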
- Replace m_CurSourceDef->type with m_nCurrentTextType in UI update handlers
- Update CopyProperties to use m_nCurrentTextType instead of m_CurSourceDef
- Change OnMatchBrace fallback to read comment syntax from m_nCurrentTextType
- Add null safety check to ParseLine legacy fallback
- Document m_CurSourceDef as legacy-only (used when m_pSyntaxParser is null)

This further reduces dependency on m_CurSourceDef, confining it to legacy parser fallback scenarios only.
…-refactor

# Conflicts:
# Translations/WinMerge/Portuguese.po
# Conflicts:
# Externals/crystaledit/editlib/ccrystaltextview.cpp
# Externals/crystaledit/editlib/ccrystaltextview.h
# Externals/crystaledit/editlib/editlib.vcxitems.filters
# Externals/crystaledit/editlib/parsers/html.cpp
# Src/DiffWrapper.cpp
# Src/Merge.cpp
# Src/Merge.vcxproj.filters
# Src/MergeDoc.cpp
# Src/MergeEditView.cpp
# Src/SyntaxParserHelper.cpp
# Testing/FolderCompare/FolderCompare.vcxproj.filters
# Testing/GoogleTest/UnitTests/UnitTests.vcxproj.filters
# Translations/WinMerge/Portuguese.po
The current feature/tree-sitter branch still has several issues that should be addressed before it can be merged into master.

This PR focuses on resolving the following items:

- Move Tree-sitter-related parser files from CrystalEdit to WinMerge, since CrystalEdit itself does not use them. (Completed)
- Improve syntax highlighting consistency with the existing CrystalEdit parsers.
- Fix stability issues, including crashes.
- Improve editing performance by avoiding full highlight cache rebuilds after every edit.
- Rework the parser architecture to allow multiple parser implementations (e.g. CrystalEdit parsers and Tree-sitter) through a common language-services layer. This includes:
  - LangServices namespace containing common types and interfaces such as TEXTBLOCK, TextDefinition, ITextBlock, ISyntaxParser, and ISyntaxParserFactory.
  - … CCrystalTextView through ISyntaxParserFactory.
  - SyntaxParserRegistry as a singleton to manage parser factory registration and parser creation.
  - … CCrystalTextView and crystallineparser.h, enabling use in non-UI contexts such as comment-difference ignoring.
- Replace the current "Enable Tree-sitter" checkbox with parser selection modes:
  These modes are implemented by changing the parser factory registration order.
- Stop downloading and building Tree-sitter language modules during every release build. Instead, add dedicated Visual Studio projects (.vcxproj) for each supported language module and build them as part of the WinMerge solution.
- Include Tree-sitter runtime DLLs and language-module DLLs in all installer packages (x86 IS5, x64 IS5, and x64 IS6).
The goal is to make the Tree-sitter implementation stable, performant, maintainable, and ready for integration into the main branch.
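The description says the parser selection modes are implemented by changing the parser factory registration order. A rough sketch of how a registry could make order matter is shown below. The names SyntaxParserRegistry, ISyntaxParser, and ISyntaxParserFactory come from the description, but every method (Instance, Register, Clear, CreateParser, Create, Name) and the NamedParser/NamedFactory helpers are hypothetical, as is the "first factory that returns a parser wins" rule.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ISyntaxParser {
    virtual ~ISyntaxParser() = default;
    virtual std::string Name() const = 0;
};

struct ISyntaxParserFactory {
    virtual ~ISyntaxParserFactory() = default;
    // Returns nullptr when this factory does not support the language.
    virtual std::unique_ptr<ISyntaxParser> Create(const std::string& language) const = 0;
};

class SyntaxParserRegistry {
public:
    static SyntaxParserRegistry& Instance() {
        static SyntaxParserRegistry instance;  // function-local static singleton
        return instance;
    }
    void Clear() { m_factories.clear(); }
    void Register(std::shared_ptr<ISyntaxParserFactory> factory) {
        m_factories.push_back(std::move(factory));
    }
    // Assumed rule: the first registered factory that yields a parser wins.
    std::unique_ptr<ISyntaxParser> CreateParser(const std::string& language) const {
        for (const auto& factory : m_factories)
            if (auto parser = factory->Create(language))
                return parser;
        return nullptr;
    }
private:
    SyntaxParserRegistry() = default;
    std::vector<std::shared_ptr<ISyntaxParserFactory>> m_factories;
};

// Toy parser and factory used only to make the ordering visible.
struct NamedParser : ISyntaxParser {
    explicit NamedParser(std::string n) : name(std::move(n)) {}
    std::string Name() const override { return name; }
    std::string name;
};

class NamedFactory : public ISyntaxParserFactory {
public:
    NamedFactory(std::string name, std::vector<std::string> languages)
        : m_name(std::move(name)), m_languages(std::move(languages)) {}
    std::unique_ptr<ISyntaxParser> Create(const std::string& language) const override {
        for (const auto& supported : m_languages)
            if (supported == language)
                return std::make_unique<NamedParser>(m_name);
        return nullptr;
    }
private:
    std::string m_name;
    std::vector<std::string> m_languages;
};
```

Under these assumptions, registering a Tree-sitter factory ahead of a CrystalEdit factory prefers Tree-sitter wherever a grammar exists and falls through otherwise; reversing the order flips the preference without touching the view code.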