Support highlighting Markdown/RST in comments #665

taminomara · 2025-07-25T12:29:31Z

This PR adds an optional feature that highlights Markdown, MySt (a Markdown extension used with Sphinx), and RST syntax in comments and provides Go To Definition implementation for cross-references. This feature is disabled by default; it can be enabled by specifying doc.syntax in .emmyrc.json.

Caveats

We use Semantic Tokens to provide syntax highlighting. Since there are no Semantic Tokens for bold and italic text, those are not highlighted.

taminomara · 2025-07-25T12:29:59Z

Note: this is a rather large PR, see individual commits to get more digestible diffs.

CppCXY · 2025-07-25T12:41:22Z

Sorry, but I do not agree with any changes to the AST, especially those that use regular expressions to match and modify the usage of regions.

taminomara · 2025-07-25T12:45:05Z

especially those that use regular expressions to match and modify the usage of regions.

I can redo those without regexes, but might you tell me what exactly you disagree with and why?

requiring spaces after region markers,
adding new tokens,
adding new nodes,
the whole concept in general?

taminomara · 2025-07-25T12:45:43Z

...

handling of whitespaces in beginning of comments?

CppCXY · 2025-07-25T12:48:26Z

You just want to highlight the documentation in comments, so you only need to parse the comments once during semantic token request and highlight them in the appropriate places, rather than modifying the entire structure of the syntax tree.

taminomara · 2025-07-25T12:55:35Z

Hm, I see your point. I can implement this. Although there will be complications if we want to support Go To Definition in comments.

I see it like this:

make a cache for comment markup, in case if IDE requests semantic tokens twice;
in Go To Definition request, if the user clicked inside of the documentation comment, we can check the comment markup cache to see if there's a cross reference in that place;
in Document Selection Range request, we can either ignore comments, or use comment markup cache once again.

Would you agree with this implementation?

CppCXY · 2025-07-25T12:58:21Z

This is a multithreaded language server, and caching introduces data races

CppCXY · 2025-07-25T12:59:08Z

Even if you re-parse everything, it will still be faster than using a cache.

taminomara · 2025-07-25T12:59:19Z

Comment markup cache doesn't need its own AST, it will only save highlighted ranges.

taminomara · 2025-07-25T13:01:51Z

caching introduces data races

I can make cache async/multi-threaded, that's not a big issue.

Even if you re-parse everything, it will still be faster than using a cache.

That's possible, I'll need to benchmark this.

taminomara · 2025-07-25T13:11:08Z

I'll start with re-parsing on every request, and see if caching is warranted later.

Two more questions:

do you agree with changes in 06cc982? I've struggled with this behavior quite a bit, because no other language server does this.
would you be interested in a separate PR that requires a space/end-of-line after "#region", provided I don't use regexes for that? Current behavior seems a bit too unspecific to me.

taminomara · 2025-07-29T17:07:10Z

Reworked this PR to avoid changes to AST.

Caching wasn't required, it's fast enough to re-parse docs on every relevant request.

crates/emmylua_parser/src/desc_parser/md.rs

crates/emmylua_code_analysis/src/semantic/mod.rs

crates/emmylua_parser/src/desc_parser/md.rs

crates/emmylua_parser/src/text/reader.rs

CppCXY · 2025-07-30T03:40:55Z

When the code block area uses Lua syntax, should we use luaLexer to parse it and highlight keywords, strings, and numbers?

taminomara · 2025-07-30T16:53:04Z

When the code block area uses Lua syntax, should we use luaLexer to parse it and highlight keywords, strings, and numbers?

Done:

CppCXY · 2025-07-31T02:40:27Z

Your changes to error have caused the error location to change.

local p

--- @return fun(): (integer?, any?, ...)
local function f(tasks)
end

CppCXY · 2025-07-31T02:50:51Z

I got error:


thread 'tokio-runtime-worker' panicked at crates\emmylua_parser\src\text\text_range.rs:18:9:
assertion failed: start_offset <= end_offset
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread 'tokio-runtime-worker' panicked at crates\emmylua_parser\src\text\text_range.rs:18:9:
assertion failed: start_offset <= end_offset

---```lua
---
--- local t = 213
---```
---
--- .. code-block:: lua
---
---    local t = 123
---    yes = 1123
local t = 123

taminomara · 2025-07-31T16:41:47Z

Fixed bugs, added more tests, implemented autocomplete for ref targets, improved go to definition to handle modules.

crates/emmylua_parser/src/lexer/lua_lexer.rs

crates/emmylua_parser/src/desc_parser/dummy.rs

CppCXY · 2025-08-01T02:30:29Z

Why does the highlighting look like this?

taminomara · 2025-08-01T12:58:44Z

based on conversations I've had with people, this will be polarising to a lot of people

Just for context, is that because highlighting in comments is distracting for some people, or are there other concerns?

Does it only parse comments with --- or all comments? If not I suggest this is limited to only --- comments.

I've disabled highlighting for any comment that doesn't start with three dashes. It should also detect comment blocks adorned with long dashed lines:

---------------   <- not highlighted
--- Comment       <- highlighted
---------------   <- highlighted
--- Comment       <- highlighted
---------------   <- not highlighted

---------------   <- not highlighted
-- Comment        <- not highlighted
---------------   <- not highlighted
-- Comment 2      <- not highlighted
---------------   <- not highlighted

Can we configure whether to use markdown or rst?

Yes. Also, std will use markdown even if user configured rst.

lewis6991 · 2025-08-01T13:01:53Z

Just for context, is that because highlighting in comments is distracting for some people, or are there other concerns?

Yes, and I've also had someone say they want treesitter (which has more highlights and is faster) to handle this kind of highlighting, not the server.

taminomara · 2025-08-01T14:00:45Z

Yes, and I've also had someone say they want treesitter (which has more highlights and is faster) to handle this kind of highlighting, not the server.

Then there should be a way to disable highlighting even if emmyrc specifies otherwise. I think I'll do it as a separate PR though.

taminomara · 2025-08-02T16:18:29Z

I ran parsing and completions on my codebase, fixed some minor things. I believe I've done all features I wanted for this PR, it's ready for merge.

In next PR I'll add option to force-disable markdown parsing even if .emmyrc enables it; it will be coupled with updates to VSCode plugin to expose this new option to users.

clason · 2025-08-02T17:46:57Z

Please don't invent new off-spec options that only VS Code is set up to handle it. It's enough to make this opt-in via .emmrc.json (or the LSP configuration requests), anything beyond that makes it much more (and too) difficult to reason about.

If you want to do anything, split the doc.syntax option into two: one for enabling/disabling it, and one that controls the type of syntax. This way, the latter can be set per-configuration in a local .emmyrc.json while enabling this extension can be done in the client configuration.

I would also (strongly) recommend to name and structure the options in a way to make clear that they relate to semantic tokens. You probably also want to document the (off-spec?) tokens you send for documentation.

taminomara · 2025-08-02T18:54:20Z

If you want to do anything, split the doc.syntax option into two

Do you think it's needed? I like this idea, but I don't have a strong opinion here.

I would also (strongly) recommend to name and structure the options in a way to make clear that they relate to semantic tokens.

It's not just semantic tokens, though, but also hovers, completion, text ranges, and go to definition. Plus, option names must reflect that they affect highlighting markup within descriptions, and don't affect highlighting @tags, types, and so on.

So, this is what I propose then (ideas for better naming are welcome):

doc.syntax for overall syntax configuration,
semantic_tokens.enableForDocumentationMarkup to enable/disable semantic tokens,
not sure if options for other sub-systems are necessary, but we can have hover.enableForDocumentationMarkup and others.

As for defaults, I want to avoid scenario where a user sets doc.syntax and doesn't see anything because semantic_tokens.enableForDocumentationMarkup is disabled. So we can either have doc.syntax = "none" and semantic_tokens.enableForDocumentationMarkup = true, or doc.syntax = "md" and semantic_tokens.enableForDocumentationMarkup = false. IMO, the latter option seems more reasonable.

Please don't invent new off-spec options that only VS Code is set up to handle it.

With settings being more granular this is indeed not necessary. Bear in mind, though, that there's already lots of off-spec things happening in VSCode extension, mostly because (1) VSCode doesn't support overlapping semantic tokens, and (2) VSCode doesn't allow customizing LSP configuration call the same way neovim does. So some changes will have to be made regardless.

You probably also want to document the (off-spec?) tokens you send for documentation.

Actually, no off-spec tokens so far. Maybe I'll add some for italic/bold text, but that will be later).

clason · 2025-08-02T19:04:26Z

I don't think the "active" LSP features need to be opt-in, as they need to be actively triggered on a documentation location anyway. (It would be different if references also included documentation locations -- which I feel they shouldn't either way.)

I still -- strongly -- recommend the option split, since the syntax format is a project choice while the highlighting is a user choice. This needs to be respected. If this requires making doc semantic tokens opt-out, so be it.

And I understand that the extension needs special support for this; that is fine. I amjust concerned with the server configuration not being driven by that -- everything should be (also) configurable through standard interfaces (config file and ConfigurationChanged notifications).

Right now, our LSP only requests settings when client is VSCode or Neovim. To make matters worse, it requests *different* settings for these clients: - for VSCode, it requests VSCode-specific about files and extensions, - for Neovim, it requests settings in scope `"Lua"`, and uses them as overrides for `.emmyrc`. This PR aims to unify this behavior and make it easier for users to override settings without having to create `.emmyrc.json`. This is especially handy when you want to override a setting for a single project, but don't want to change project's `.emmyrc.json`. For example, switching documentation highlighting on per-project basis will become easier (see EmmyLuaLs#665). # Changes - When client is VSCode, LSP will request settings under `"emmylua.rc"` scope, as well as VSCode-specific settings mentioned above. I'll add `"emmylua.rc"` category to the plugin settings page in a separate PR when this is merged; until then, these changes will not affect user experience. - When client is Neovim, LSP will request settings under `"emmylua.rc"` scope. If there's no settings under this scope, it will fall back to using the `"Lua"` scope. Currently, both LuaLs and EmmyLua use the same scope `"Lua"`. This is good for backwards compatibility and for users switching from one to another, but risks introducing forward incompatibilities in the long run. The new option will allow users to configure EmmyLua without affecting LuaLs. I'll send a PR to update nvim-lspconfig's documentation for EmmyLua when this is merged. - For any other client, LSP will request settings under `"emmylua.rc"` scope. Currently, there are no clients that provide this scope (EmmyLua2 for Intellij doesn't provide any config scopes), so it shouldn't break anything, but it will provide future client implementations with a better way to override LSP configuration.

Right now, our LSP only requests settings when client is VSCode or Neovim. To make matters worse, it requests *different* settings for these clients: - for VSCode, it requests VSCode-specific about files and extensions, - for Neovim, it requests settings in scope `"Lua"`, and uses them as overrides for `.emmyrc`. This PR aims to unify this behavior and make it easier for users to override settings without having to create `.emmyrc.json`. This is especially handy when you want to override a setting for a single project, but don't want to change project's `.emmyrc.json`. For example, switching documentation highlighting on per-project basis will become easier (see discussion in EmmyLuaLs#665). # Changes - When client is VSCode, LSP will request settings under `"emmylua"` scope, as well as VSCode-specific settings mentioned above. I'll add relevant settings to the plugin settings page in a separate PR when this is merged; until then, these changes will not affect user experience. - When client is Neovim, LSP will request settings under `"emmylua"` scope. If there's no such scope, it will fall back to using the `"Lua"` scope. Currently, both LuaLs and EmmyLua use the same scope `"Lua"`. This is good for backwards compatibility and for users switching from one to another, but risks introducing forward incompatibilities in the long run. The new option will allow users to configure EmmyLua without affecting LuaLs. I'll send a PR to update nvim-lspconfig's documentation for EmmyLua when this is merged. - For any other client, LSP will request settings under `"emmylua"` scope. Currently, there are no clients that provide this scope (EmmyLua2 for Intellij doesn't provide any config scopes), so it shouldn't break anything, but it will provide future client implementations with a better way to override LSP configuration.

taminomara · 2025-08-03T19:25:44Z

Added semanticTokens.renderDocumentationMarkup, default is false.

The reason for this name is for consistency with possible future options.

For example, currently we don't provide documentation modifier for semantic tokens related to documentation comments. In future, we might implement this, but we'll need to make it opt-out because at least in VSCode, most themes render documentation comments in horrendously bright colors. Option name renderDocumentationAsComments will sound consistent with renderDocumentationMarkup.

And I understand that the extension needs special support for this; that is fine. I amjust concerned with the server configuration not being driven by that -- everything should be (also) configurable through standard interfaces (config file and ConfigurationChanged notifications).

I've created #680 to address this issue. When it's merged, VSCode and NeoVim (and other editors) will be able to configure things using the same mechanism.

Right now, our LSP only requests settings when client is VSCode or NeoVim. To make matters worse, it requests *different* settings for these clients: - for VSCode, it requests VSCode-specific about files and extensions, - for NeoVim, it requests settings in scope `"Lua"`, and uses them as overrides for `.emmyrc`. This PR aims to unify this behavior and make it easier for users to override settings without having to create `.emmyrc.json`. This is especially handy when you want to override a setting for a single project, but don't want to change project's `.emmyrc.json`. For example, switching documentation highlighting on per-project basis will become easier (see discussion in EmmyLuaLs#665). # Changes - When client is VSCode, LSP will request settings under `"emmylua"` scope, as well as VSCode-specific settings mentioned above. I'll add relevant settings to the plugin settings page in a separate PR when this is merged; until then, these changes will not affect user experience. - When client is NeoVim, LSP will request settings under `"emmylua"` scope. If there's no such scope, it will fall back to using the `"Lua"` scope. Currently, both LuaLs and EmmyLua use the same scope `"Lua"`. It works in NeoVim because options for every LSP server are independent. In VSCode, however, they all share the same namespace, meaning that we can't reuse `"Lua"`. I'd like to keep things consistent, hence this rename. I'll send a PR to update nvim-lspconfig's documentation for EmmyLua when this is merged. - For any other client, LSP will request settings under `"emmylua"` scope. Currently, there are no clients that provide this scope (EmmyLua2 for Intellij doesn't provide any config scopes), so it shouldn't break anything, but it will provide future client implementations with a better way to override LSP configuration. # Tests Tested this with VSCode and NeoVim.

crates/emmylua_parser_desc/src/testlib.rs

crates/emmylua_ls/src/handlers/test_lib/mod.rs

crates/emmylua_parser_desc/src/util.rs

CppCXY · 2025-08-04T06:40:40Z

please donot change my local keyword color
now:

before:

CppCXY · 2025-08-06T06:33:07Z

please resolve conflicts

CppCXY · 2025-08-07T12:59:01Z

If everything works well, I will merge it before the release tomorrow, but please resolve the conflicts first.

Simplify reader's implementation, rename methods to be more consistent, add 1-character lookbehind support, expose unconsumed (tail) range and text.

Fix lifetimes, fix `current_token_range` not being reset.

This will allow parsing code snippets in comments line-by-line.

See more info in PR or merge commit.

taminomara · 2025-08-07T17:26:50Z

Rebased.

Since you want to release tomorrow, I've also added support for bold/italic highlighting via emmy annotator, see also EmmyLua/VSCode-EmmyLua#248.

taminomara marked this pull request as draft July 25, 2025 13:02

taminomara force-pushed the doc-hl branch 2 times, most recently from 5cb40f6 to 13a4679 Compare July 29, 2025 17:06

taminomara marked this pull request as ready for review July 29, 2025 17:06

taminomara force-pushed the doc-hl branch from ff6aff2 to cbfbe3a Compare July 29, 2025 19:34

CppCXY reviewed Jul 30, 2025

View reviewed changes

taminomara force-pushed the doc-hl branch from 8eb92ff to 443260c Compare July 30, 2025 16:52

taminomara mentioned this pull request Jul 30, 2025

Make syntax of @see compatible with LuaLs #671

Closed

taminomara marked this pull request as draft July 31, 2025 02:03

taminomara force-pushed the doc-hl branch from 443260c to cca37f4 Compare July 31, 2025 16:37

taminomara marked this pull request as ready for review July 31, 2025 16:39

CppCXY reviewed Aug 1, 2025

View reviewed changes

crates/emmylua_parser/src/lexer/lua_lexer.rs Outdated Show resolved Hide resolved

crates/emmylua_parser/src/lexer/lua_lexer.rs Show resolved Hide resolved

crates/emmylua_parser/src/desc_parser/dummy.rs Outdated Show resolved Hide resolved

taminomara force-pushed the doc-hl branch from a7d45ff to fd70d83 Compare August 1, 2025 16:50

taminomara mentioned this pull request Aug 3, 2025

Unify initialization process for all editors #680

Merged

taminomara force-pushed the doc-hl branch from a34c937 to dbdc1bb Compare August 3, 2025 19:19

taminomara force-pushed the doc-hl branch from dbdc1bb to 956fcd3 Compare August 3, 2025 19:41

CppCXY reviewed Aug 4, 2025

View reviewed changes

crates/emmylua_parser_desc/src/testlib.rs Show resolved Hide resolved

crates/emmylua_ls/src/handlers/test_lib/mod.rs Outdated Show resolved Hide resolved

crates/emmylua_parser_desc/src/util.rs Outdated Show resolved Hide resolved

taminomara force-pushed the doc-hl branch from 956fcd3 to 2d8eb32 Compare August 7, 2025 09:34

taminomara added 6 commits August 7, 2025 21:07

Simplify and refactor Reader

6ccace7

Simplify reader's implementation, rename methods to be more consistent, add 1-character lookbehind support, expose unconsumed (tail) range and text.

Minor fixes for LuaDocParser

8184dec

Fix lifetimes, fix `current_token_range` not being reset.

Expose state of LuaLexer

010b3ea

This will allow parsing code snippets in comments line-by-line.

Add description parsers for Markdown/RST

28de3de

See more info in PR or merge commit.

Support LSP features for Markdown/RST in comments

b0c3771

Add tests for Markdown/RST in comments

e1a17ad

taminomara force-pushed the doc-hl branch from 2d8eb32 to e1a17ad Compare August 7, 2025 17:10

Support emphasis in emmy annotator

8d610ec

CppCXY merged commit 66c5874 into EmmyLuaLs:main Aug 8, 2025
21 checks passed

Support highlighting Markdown/RST in comments #665

Support highlighting Markdown/RST in comments #665

Uh oh!

Conversation

taminomara commented Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Caveats

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

CppCXY commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

CppCXY commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

CppCXY commented Jul 25, 2025

Uh oh!

CppCXY commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

taminomara commented Jul 25, 2025

Uh oh!

taminomara commented Jul 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

CppCXY commented Jul 30, 2025

Uh oh!

taminomara commented Jul 30, 2025

Uh oh!

CppCXY commented Jul 31, 2025

Uh oh!

CppCXY commented Jul 31, 2025

Uh oh!

taminomara commented Jul 31, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

CppCXY commented Aug 1, 2025

Uh oh!

taminomara commented Aug 1, 2025

Uh oh!

lewis6991 commented Aug 1, 2025

Uh oh!

taminomara commented Aug 1, 2025

Uh oh!

taminomara commented Aug 2, 2025

Uh oh!

clason commented Aug 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

taminomara commented Aug 2, 2025

Uh oh!

clason commented Aug 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

taminomara commented Aug 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

CppCXY commented Aug 4, 2025

Uh oh!

CppCXY commented Aug 6, 2025

Uh oh!

CppCXY commented Aug 7, 2025

taminomara commented Jul 25, 2025 •

edited

Loading

taminomara commented Jul 29, 2025 •

edited

Loading

clason commented Aug 2, 2025 •

edited

Loading

clason commented Aug 2, 2025 •

edited

Loading

taminomara commented Aug 3, 2025 •

edited

Loading