Skip to content

Show line & column numbers#2056

Merged
thomas-zahner merged 11 commits intolycheeverse:masterfrom
thomas-zahner:show-span
Feb 25, 2026
Merged

Show line & column numbers#2056
thomas-zahner merged 11 commits intolycheeverse:masterfrom
thomas-zahner:show-span

Conversation

@thomas-zahner
Copy link
Copy Markdown
Member

@thomas-zahner thomas-zahner commented Feb 19, 2026

Fixes #1304

We call line & column numbers span and RawUriSpan in our code. Not sure if this is the best name, as it currently isn't really a span but a position, so the span is missing the end. But maybe we could update the span to include an end. (maybe not in this PR)

I've removed html5ever as an experiment in the second commit, to show that it makes it possible to unwrap the column. It seems like we tried to not rely too much on html5ever and the code comments discourage its usage. Maybe it's a good point to remove it then, but I'm not quite sure and don't now the background story. @mre What do you think?

Then, I'm also open to discuss how the output looks like. Do you have a better proposal? The current one seems simple and quite straight forward, basically minimal changes.

Note that @mihaigalos's suggestion with [./README.md:123:54]: would make the current grouping by input source impossible or it would lead to repetition of the input source.

@thomas-zahner thomas-zahner force-pushed the show-span branch 3 times, most recently from 5bcf30c to 71947a4 Compare February 20, 2026 13:50
@thomas-zahner thomas-zahner requested review from katrinafyi and mre and removed request for mre February 20, 2026 16:23
Copy link
Copy Markdown
Member

@katrinafyi katrinafyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable and seems to be just wiring through the source locations into the output.

About the formatting, there are reasons to prefer filename.md:4:1 if possible. Some editors or consoles can turn that into a clickable link. I'm not intrinsically opposed to repeating the filename (compilers and linters just do that, but often with fewer reports).

In any case, about the current format, I think it is a little hard to notice because it's two numbers with little context and placed after a possibly very long URL.

Here is one very loose suggestion: It contextualises the numbers with the filename (but not the full path), and it moves them closer to the left. This makes them more noticeable and doesn't take up too much space since numbers are usually small.

[a/x/b/a/a.md]:
   [ERROR] (a.md:39:1) https://a.com/#asd  | Error (cached)
     [404] (a.md:4:100) https://google.com/ajfidsajfidosa | Rejected status code: 404 Not Found (configurable with "accept" option)

Alternatively or additionally, we could also consider a new formatter for a more line-oriented output like grep:

grep -R struct lychee-lib --line-number
Image

As the same URLs in a single file are now visualised multiple times
(since they now contain spans) the tests were failing. But the tests
weren't really accurate as they only tested if they weren't shown multiple
times. This has nothing to do with caching or the actual amount of
requests performed. This is covered by the updated and more accurate
test test_process_internal_host_caching.
@thomas-zahner
Copy link
Copy Markdown
Member Author

thomas-zahner commented Feb 23, 2026

@katrinafyi Thank you for your thoughts and the output suggestion. I totally agree that we could consider adding a more grep like output format.

Thinking about it, it seems like there is a lot we could do. As said currently the "span" is only a location, as it's missing the an end point. Not sure how easy it is to update it to become a "real" span with an end point. If we had this we could also do the what grep does; to reprint the affected input line and highlight the broken URL. Even better is the more modern rg approach which groups the lines by file.

rg output
lychee-lib/src/types/status.rs
77:            s = serializer.serialize_struct("Status", 2)?;
81:            s = serializer.serialize_struct("Status", 2)?;
85:            s = serializer.serialize_struct("Status", 1)?;

But this is not yet possible because we're missing the endpoint of the span, but it would be great to have such an output format in the long run. We could even try to mimic the output of a compiler.

404: Rejected status code: 404 Not Found
  --> fixtures/TEST.md:71:9
   |
71 |         This is a [link](https://example.com).
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^

For now in this PR I'd like to keep it simple and not implement a new format or the span endpoint. I'd also prefer to not changing the default output fundamentally.

I think it is a little hard to notice because it's two numbers with little context and placed after a possibly very long URL

Totally agree.

What about the following? Is it still intuitively understandable? It's just that I think it might not be worth repeating the file name (as in (a.md:39:1)) because it will not make a clickable link as it's an incomplete path. (almost like generating broken links 🤔)

[a/x/b/a/a.md]:
   [ERROR] (at 39:1) https://a.com/#asd  | Error (cached)
     [404] (at 4:100) https://google.com/ajfidsajfidosa | Rejected status code: 404 Not Found (configurable with "accept" option)

@katrinafyi
Copy link
Copy Markdown
Member

@thomas-zahner The (at 39:1) format looks good to me! I also agree that JSON format can be used for more specialised transformations.

About other output formats, yeah there's a lot we could do here and it will probably take longer to work out what's best. I do like the conciseness of rg, but we have a bit more information to display for each link. The compiler-like output gives us lots of space for extra details, but it might also be too spacious. I also think that the ^ underline is less effective for lychee than compilers - the link is usually obvious and we won't be pointing to subtle details like a missing semicolon.

Lots to think about later :)

@mre
Copy link
Copy Markdown
Member

mre commented Feb 23, 2026

Great PR.

In regard to the formatting, the one thing I don't like is that the position info breaks URL alignment, making the output harder to scan.

I would like to discuss two additional options:

Option one puts the position after the URL:

[a/x/b/a/a.md]:
   [ERROR] https://a.com/#asd (at 39:1) | Error (cached)
     [404] https://google.com/ajfidsajfidosa (at 4:100) | Rejected status code: 404

URLs stay front and center, which is arguably what matters most. Alignment is a non-issue since the position just floats at the end. The downside is that position info gets buried after a long URL and is easy to miss.

Option two pads the position prefix up to a threshold, so URLs still line up within each group:

[a/x/b/a/a.md]:
   [ERROR] (at 39:1)  https://a.com/#asd | Error (cached)
     [404] (at 4:100) https://google.com/ajfidsajfidosa | Rejected status code: 404

Position is prominent and easy to scan down the left side. It only requires one pass per file group to compute the padding, not the whole output. The threshold cap (say, up to 999:99) prevents weird wide gaps in edge cases, with the occasional misalignment beyond that being acceptable.

Any thoughts?

thomas-zahner and others added 2 commits February 24, 2026 11:16
Co-authored-by: Matthias Endler <matthias@endler.dev>
@katrinafyi
Copy link
Copy Markdown
Member

katrinafyi commented Feb 24, 2026

Another formatting suggestion, because I don't love the line number being put in the middle of the status information. This is a bit like ripgrep's output with line numbers at the beginning of lines.

[a/x/b/a/a.md]:
39:1: [ERROR] https://a.com/#asd | Error (cached)
4:100:  [404] https://google.com/ajfidsajfidosa | Rejected status code: 404

It might be complicated to do this left/right justification of the error and line number where they can take space from each other. But I think it looks a bit nicer than having gaps on both sides for alignment.

I would also be okay with @ mre's first suggestion:

   [ERROR] https://a.com/#asd (at 39:1) | Error (cached)


/// Set [`Self::span`]
#[must_use]
pub const fn with_span(mut self, span: RawUriSpan) -> Self {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly we can mark this fn const but the other ones fail with E0493

@thomas-zahner
Copy link
Copy Markdown
Member Author

Another formatting suggestion [...] This is a bit like ripgrep [...]

I really like the idea and tried it in thomas-zahner@e329eb4. But as mentioned in the commit message it screws up a few things, like task formatting. Basically this concept doesn't really fit in the current format/mode concept. Instead we could maybe make a dedicated format for ripgrep's approach. But this would preferably be a separate PR.

I would also be okay with @ mre's first suggestion:

Cool, that is much easier to implement. See a5e5cd1

@thomas-zahner
Copy link
Copy Markdown
Member Author

thomas-zahner commented Feb 25, 2026

So I'm going to merge this. This allows me to tackle #614. In summary I tried to keep the changes to the output formats minimal while also being a helpful improvement. In the future it would be pretty cool to try pursue the idea of ripgreps format I started in thomas-zahner@e329eb4.

@thomas-zahner thomas-zahner merged commit 13147ca into lycheeverse:master Feb 25, 2026
7 checks passed
@mre mre mentioned this pull request Feb 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: Line numbers and columns in output

4 participants