Complete the initial LSH implementation #624

Merged
lhecker merged 11 commits into main from dev/lhecker/syntax-highlighting
Mar 27, 2026

@lhecker
Member

@lhecker lhecker commented Aug 25, 2025

Previous commits have added all of the infinity stones.
This one introduces the remaining, more complex .lsh files.

Closes #18

@lhecker lhecker changed the title from "Move build.rs into its own directory" to "A first draft for simple syntax highlighting" Aug 25, 2025
Base automatically changed from dev/lhecker/build-system to main August 25, 2025 17:54
@MamiyaOtaru

looking forward to the other languages. they seemed to work well enough in https://github.com/microsoft/edit/tree/4f36e2afe2a84ed339829ea6c63242fb9f4b7de3 (well, really long Here Strings made the rest of the file red, but ones with a reasonable length were fine)
[image]

@trumblejoe

PR awaiting 2 pending checks still.

@lhecker
Member Author

lhecker commented Oct 27, 2025

Working on this feature? In the current economy?!
(I'll try...)

@Consolatis

Consolatis commented Oct 28, 2025

Thanks for your work on this. I've played around with this branch a bit and want to share my thoughts on the current implementation.

I tried to hack together basic C highlighting and noticed there is some state missing for "constants" (convention with all-uppercase variables usually defined with #define or part of some enum, not usual variables defined with const). Those are neither a Keyword nor a Variable.

Could the regexes be loaded from a file in the future? It might save some work to implement the basic functionality now by parsing a hardcoded string rather than defining everything in build/lsh/definitions.rs in Rust code. Then it just needs a decision on which file(s) to load and from where, plus the actual file loading, which could be implemented later.

Another thing I wondered about is why the HighlightKind enum contains any colors, I have a feeling that those should be translated by themes from states like Comment or String.
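
A minimal sketch of the split suggested here, assuming the highlighter only emits semantic kinds and a theme translates them into colors. `TokenKind`, `Color`, and `Theme` are hypothetical names for illustration and are not taken from this branch:

```rust
/// Semantic categories a highlighter could emit (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TokenKind {
    Comment,
    String,
    Keyword,
    Constant,
}

/// A terminal color, modeled here as an index into a 16-color palette.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Color(pub u8);

/// The theme, not the highlighter, owns the kind-to-color mapping.
pub struct Theme {
    pub comment: Color,
    pub string: Color,
    pub keyword: Color,
    pub constant: Color,
}

impl Theme {
    /// Translates a semantic kind into this theme's color.
    pub fn color_for(&self, kind: TokenKind) -> Color {
        match kind {
            TokenKind::Comment => self.comment,
            TokenKind::String => self.string,
            TokenKind::Keyword => self.keyword,
            TokenKind::Constant => self.constant,
        }
    }
}
```

With this shape, swapping themes changes colors without touching any language definition.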

Edit:
Some more feedback: support for Look::WordAscii (\b) and/or Look::WordStartAscii (\<) + Look::WordEndAscii (\>) would be very useful. Currently I can differentiate between break and breakage but not between break and unbreak. I've tried to implement it myself in build/lsh/compiler.rs but I am not familiar enough with Rust and the regex-syntax crate.
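
To illustrate what an ASCII word boundary on both sides buys, here is a hand-written sketch that does not use the regex-syntax crate or anything from build/lsh/compiler.rs. `is_word_byte` and `keyword_at` are hypothetical helpers:

```rust
/// ASCII word byte: letters, digits, and underscore.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns true if `keyword` occurs in `line` at byte offset `pos`
/// with a word boundary both before and after it.
pub fn keyword_at(line: &str, pos: usize, keyword: &str) -> bool {
    let bytes = line.as_bytes();
    let end = pos + keyword.len();
    // The keyword itself must be present at `pos`.
    if end > bytes.len() || &bytes[pos..end] != keyword.as_bytes() {
        return false;
    }
    // Boundary before: start of line, or the previous byte is not a word byte.
    let boundary_before = pos == 0 || !is_word_byte(bytes[pos - 1]);
    // Boundary after: end of line, or the next byte is not a word byte.
    let boundary_after = end == bytes.len() || !is_word_byte(bytes[end]);
    boundary_before && boundary_after
}
```

The trailing check is what separates break from breakage; the leading check is the missing piece that would separate break from unbreak.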

@lhecker
Member Author

lhecker commented Nov 22, 2025

FYI I've continued development in the https://github.com/microsoft/edit/tree/dev/lhecker/syntax-highlighting-alt branch, where I've since written a custom language with a proper compiler. It even has rudimentary support for variables. It's fairly time-consuming to develop, so it's taking a while. It's also certainly not done yet, but please feel free to check it out already. You can find the definition files under crates/lsh/definitions.

@MamiyaOtaru

nice to have PS highlighting back! quick comparison with highlighting as it was in the version I linked above
[image: comp]

Unrelated to highlighting: mouse scrolling worked in the file dialog in the old one, but it scrolls the edit window instead in the current syntax-highlighting-alt.

@DanPartelly

Why not tree-sitter integration for syntax highlighting and text objects? You basically get countless languages for "free". (implementation not considered)

@DHowett
Member

DHowett commented Dec 16, 2025

> Why not tree-sitter

FWIW, this has been hashed out over in #18. In fact, the justification for having a simple syntax highlighter that is expressly not tree-sitter is described in the issue body of #18.

TL;DR (and it really is not much longer than this comment here): tree-sitter is huge. Edit is 250KiB. We don't want edit to become much larger than it already is.

@lhecker lhecker force-pushed the dev/lhecker/syntax-highlighting branch from 33890d3 to 3f386a9 January 26, 2026 22:15
@lhecker lhecker changed the title from "A first draft for simple syntax highlighting" to "Add simple syntax highlighting" Jan 26, 2026
@lhecker
Member Author

lhecker commented Jan 26, 2026

Well, so this ain't a "draft for syntax highlighting" anymore by any measure. This PR is now 7000 lines and contains an entire frigging compiler lol. (...which btw is extremely poorly engineered. 😭😭)

To anyone who reads this: Have fun testing this! Definitions are in crates/lsh/definitions.

@1001encore

Hi, I've been experimenting with this and managed to get a basic markdown.lsh working without lazy matching, since the engine only supports greedy matching. It compiles to ~200 instructions. I thought it might be useful, so I'm posting it below.
It doesn't render yet since the runtime seems to be a work in progress, but I verified it via a CLI harness around lsh::compiler.

#[display_name = "Markdown"]
#[path = "**/*.md"]
#[path = "**/*.markdown"]
pub fn markdown() {
    // Headers
    if /#+ .*/ {
        yield keyword;
    }

    // Bold
    if /\*\*[^*]*\*\*/ {
        yield string;
    }

    // Links
    if /\[[^\]]*\]\([^)]*\)/ {
        yield function;
    }
}
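
For readers wondering how the bold rule gets by without lazy matching: the negated class `[^*]*` can never run past a `*`, so the greedy match stops at the first candidate closer. A plain-Rust sketch of the same logic, where `bold_span_len` is a hypothetical helper and not part of lsh:

```rust
/// Length in bytes of a `**bold**` span at the start of `input`,
/// mirroring the pattern `\*\*[^*]*\*\*`. Returns None if there is no match.
pub fn bold_span_len(input: &str) -> Option<usize> {
    // Opening `**`.
    let rest = input.strip_prefix("**")?;
    // `[^*]*`: greedily consume everything that is not `*`.
    let body_len = rest.find('*').unwrap_or(rest.len());
    // The greedy run stopped at the first `*` (or at the end of input),
    // so the closing `**` has to start exactly here.
    if rest[body_len..].starts_with("**") {
        Some(2 + body_len + 2)
    } else {
        None
    }
}
```

The trade-off is that a lone `*` inside the bold text makes the whole span fail to match, which a lazy `.*?` would have tolerated.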

@lhecker
Member Author

lhecker commented Feb 11, 2026

I'm glad you had a chance to try it out! I'll try to simplify the syntax in the future.

Did you try my markdown.lsh version? It's included in this PR: https://github.com/microsoft/edit/blob/cb98cf5b0fe6a5e192f37e82ef3716cd9a63db7f/crates/lsh/definitions/markdown.lsh

@1001encore

oh, I totally missed that, my bad. I was on the compiler branch the whole time!

lhecker added a commit that referenced this pull request Mar 20, 2026
This PR contains no CLI frontend, etc., for the compiler,
as I split out everything but the compiler to reduce the PR size.

Part of #624
@lhecker lhecker mentioned this pull request Mar 20, 2026
lhecker added a commit that referenced this pull request Mar 23, 2026
This adds the accompanying runtime for #753.

Part of #624
@lhecker lhecker mentioned this pull request Mar 23, 2026
lhecker added a commit that referenced this pull request Mar 25, 2026
This integrates the LSH compiler and runtime into edit.

Part of #624
lhecker added a commit that referenced this pull request Mar 25, 2026
Adds settings.json loading and parsing.
For now, that's just for a `files.associations` key.

Part of #22
Part of #624
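
The thread does not show the shape of the `files.associations` value. The sketch below assumes it maps file name patterns to language names, as the VS Code setting of the same name does, and handles only simple `*.ext` patterns. `language_for` is a hypothetical helper, not edit's implementation:

```rust
/// Returns the language of the first association whose pattern matches
/// `file_name`. Only patterns of the form `*suffix` are supported here.
pub fn language_for<'a>(
    file_name: &str,
    associations: &[(&str, &'a str)],
) -> Option<&'a str> {
    associations.iter().find_map(|(pattern, language)| {
        // Patterns without a leading `*` are skipped in this sketch.
        let suffix = pattern.strip_prefix('*')?;
        file_name.ends_with(suffix).then_some(*language)
    })
}
```

A real implementation would need full glob support such as the `**/*.md` paths used in the definition files.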
@lhecker lhecker force-pushed the dev/lhecker/syntax-highlighting branch from 463097d to 2fc81f5 March 25, 2026 19:54
@lhecker lhecker requested a review from DHowett March 25, 2026 19:54
@lhecker lhecker changed the title from "Add simple syntax highlighting" to "Complete the initial LSH implementation" Mar 25, 2026
Member

@DHowett DHowett left a comment

15/19

double_quote_string();
} else if /true|false|null/ {
if /\w+/ {
// Not a keyword after all.
Member

in JSON would this be an error token?

Member

do we have a token type for "this is obviously incorrect"

Member Author

Not yet, no. I didn't add that, because VS Code doesn't do anything special either. In an ideal world, Edit would get LSP support in the future which would significantly improve this.

@DHowett
Member

DHowett commented Mar 26, 2026

make sure you update the description!

Member

@DHowett DHowett left a comment

The PowerShell one seems somewhat broken, and YAML has some minor bugs, but I think this is a good place to start iterating.

@lhecker lhecker requested a review from DHowett March 27, 2026 17:40
@lhecker lhecker merged commit 05452ab into main Mar 27, 2026
6 checks passed
@lhecker lhecker deleted the dev/lhecker/syntax-highlighting branch March 27, 2026 18:01
@avih

avih commented Mar 27, 2026

Not familiar with "LSH", and I'm not building it myself, so I didn't try it out.

Could it please be clarified whether the syntax schemes and/or the color themes are embedded into the binary? or as external files?

And if external files, where could a collection of such files be found?

Thanks.

@lhecker
Member Author

lhecker commented Mar 27, 2026

First off: This is the first (!) version of the simple (!) syntax highlighter. You cannot expect this to be feature complete.
(Edit: It may be worth mentioning that someone deleted their comment above, so my response may seem out of place.)

Alright...

> Could it please be clarified whether the syntax schemes and/or the color themes are embedded into the binary? or as external files?

They're embedded and not configurable. I will release v2.0.0 without it being configurable, because I expect v3.0.0 to be all about configurability anyway.

> And if external files, where could a collection of such files be found?

Hence, no external files. It's still a single binary.

> I don't see any documentation in the commit - does adding a new .lsh require a recompile?

Yes. The source code gets translated to the VM bytecode at compile time. The compiler is fairly poorly written by my standards, and I do not want it to process possibly untrusted, dangerous .lsh files.
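
As a rough illustration of that model, the sketch below embeds a tiny bytecode program as a `static` and interprets it at runtime. The instruction set, `PROGRAM`, and `run` are all invented for this example; the real LSH bytecode is not described in this thread:

```rust
/// Highlight kinds emitted by this toy VM (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Number,
    Other,
}

/// A made-up instruction set, far simpler than a real one would be.
#[derive(Clone, Copy)]
pub enum Op {
    /// Advance past any ASCII digits.
    SkipDigits,
    /// Advance to the end of the line.
    SkipRest,
    /// Emit a span of the given kind ending at the current offset.
    Yield(Kind),
}

/// Stand-in for what a build step would generate from a definition file.
/// As a `static`, it is baked into the binary; nothing is loaded at runtime.
pub static PROGRAM: &[Op] = &[
    Op::SkipDigits,
    Op::Yield(Kind::Number),
    Op::SkipRest,
    Op::Yield(Kind::Other),
];

/// Runs `program` over one line and returns (end offset, kind) pairs.
pub fn run(program: &[Op], line: &str) -> Vec<(usize, Kind)> {
    let bytes = line.as_bytes();
    let mut offset = 0;
    let mut spans = Vec::new();
    for op in program {
        match *op {
            Op::SkipDigits => {
                while offset < bytes.len() && bytes[offset].is_ascii_digit() {
                    offset += 1;
                }
            }
            Op::SkipRest => offset = bytes.len(),
            Op::Yield(kind) => spans.push((offset, kind)),
        }
    }
    spans
}
```

Because only the interpreter ships, the compiler never sees user-supplied input, which matches the concern about untrusted .lsh files.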

@avih

This comment was marked as resolved.

@3052

This comment was marked as off-topic.

@avih

This comment was marked as off-topic.

@avih

This comment has been minimized.

@avih

This comment has been minimized.

@DHowett

This comment was marked as off-topic.

@3052

This comment was marked as off-topic.

@lhecker
Member Author

lhecker commented Mar 27, 2026

@3052 If you'd like to further discuss this, create a "Discussion" and I'll respond there. We maintainers will not respond on this issue anymore.

And because I am the primary maintainer, I'll also take my right to collapse all prior comments above, because I'm trying to focus on drafting the preview release and this is distracting.

@microsoft microsoft locked as too heated and limited conversation to collaborators Mar 27, 2026