Skip to content
This repository was archived by the owner on Apr 22, 2020. It is now read-only.

Add Elixir language support #478

Merged
merged 17 commits into from
Mar 15, 2017
Merged

Conversation

nietaki
Copy link
Contributor

@nietaki nietaki commented Mar 2, 2017

Implements the changes for #319 .

This PR adds support for Elixir. As with some other languages, for simplicity and performance it doesn't support string interpolation highlighting, but should cover pretty much the rest of the language syntax.

For context: I have been working full time with Elixir for the last year and have tested the code with snippets from a handful of projects. I have submitted the highlighting for feedback on elixir forum and the Elixir slack channel and didn't get any issues raised (yet).

If you want to play around with the highlighting without checking out the code, go to this rawgit page - I wrote this to make implementing and testing this easier for myself.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@nietaki
Copy link
Contributor Author

nietaki commented Mar 2, 2017

I signed the CLA! :)

@googlebot
Copy link

CLAs look good, thanks!

@nietaki nietaki mentioned this pull request Mar 2, 2017
@sheharyarn
Copy link

I'm so happy right now 😭 ❤️

@nietaki
Copy link
Contributor Author

nietaki commented Mar 5, 2017

One bug and one improvement have been pointed out in the forums by @notriddle, please hold off on merging while I write the fixes.

@nietaki
Copy link
Contributor Author

nietaki commented Mar 5, 2017

I don't think adding string interpolation support is really viable using the framework here, unless there's something about it that I'm missing. Take this (valid) example: "string #{ inspect(%{{"foo"} => "bar\""}) } interpolation" - there's not practical way of tracking where the string interpolation ends.

I think the state of the branch as it is right now might be as far as we get. Especially since using string literals inside a string interpolation expression is not a common use-case.

Copy link
Contributor

@mikesamuel mikesamuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interrupted, but I thought I'd send out what I have.

src/lang-ex.js Outdated
@@ -0,0 +1,78 @@
/**
* @license
* Copyright (C) 2017 Google Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you're a Google employee, the copyright should be attributed to you and by putting that license on the file, you're saying you're ok with distributing your work under that license.

src/lang-ex.js Outdated
// # comments
[PR['PR_COMMENT'], /^#.*/, null, '#'],
// a (possibly multiline) charlist
[PR['PR_LITERAL'], /^'(?:[^'\\]|\\.)*'?/, null, '\''],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\\. allows any escaped character except newlines. I mention this since I don't whether Elixir allows JS style line continuations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out! The JS-style line continuations are supported and did break the highlighter, both in charlists and regular strings.

Now fixed and added to the tests.

src/lang-ex.js Outdated
'!%&()*+,-;<=>?[\\]^{|}'],
// Borrowed from lang-erlang.js:
[PR['PR_LITERAL'],
/^(?:0o[0-7](?:[0-7]|_[0-7])*|0x[\da-f](?:[\da-f]|_[\da-f])*|\d(?:\d|_\d)*(?:\.\d(?:\d|_\d)*)?(?:e[+\-]?\d(?:\d|_\d)*)?)/i,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, So 0o and 0O precede octal literals, and underscores can separate myriads?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right about the underscores. The 0x and 0o part shouldn't be case insensitive - I fixed that now.

[PR['PR_ATTRIB_NAME'], /^(?:__(?:CALLER|ENV|MODULE|DIR)__)/],
// keywords
[PR['PR_KEYWORD'],
/^(?:alias|case|catch|def(?:delegate|exception|impl|macrop?|module|overridable|p?|protocol|struct)|do|else|end|fn|for|if|in|import|quote|raise|require|rescue|super|throw|try|unless|unquote(?:_splicing)?|use|when|with|yield)\b/],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, the |p? variant follows def so does not mean this token matches the empty string.

src/lang-ex.js Outdated
// A double-quoted multi-line string
// or a triple double-quoted multi-line string.
[PR['PR_STRING'],
/^(?:"(?:(?:""(?:""?(?!")|[^\\"]|\\.)*"{0,3})|(?:[^"\\]|\\.)*"?))/],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lookahead means that """""" (6 double quotes in a row) is not a double quoted string?

IIUC, this might match """x" but not the whole of """x"y"""?

Maybe just do two cases:

/^(?:"""(?:(?!""")[^\\]|\.)""")|""|"(?!")([^\\"]|\.)")/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub mangled your suggestion a bit, I just reworked it on my own.

Since the string recognition isn't a shortcut pattern anymore, there isn't as much benefit in putting both """ """ and " " strings in one pattern, I separated them for simplicity.

Also: I looked into it more closely and a correct heredoc requires a newline after the opening """ - the compilation/interpreter fails otherwise. I Modified the code to take it into account.

[PR['PR_PLAIN'], /^[$a-z]\w*[\!\?]?/],
// sigils with the same starting and ending character.
// Key part: X(?:[^X\r\n\\]|\\.)+X where X is the sigil character
[PR['PR_ATTRIB_VALUE'], /^~[A-Z](?:\/(?:[^\/\r\n\\]|\\.)+\/|\|(?:[^\|\r\n\\]|\\.)+\||"(?:[^"\r\n\\]|\\.)+"|'(?:[^'\r\n\\]|\\.)+')[A-Z]*/i],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, stopped here. Resume later.

Copy link
Contributor

@mikesamuel mikesamuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks broadly good. Thanks for doing this.

The only blocking issue is the copyright one, so feel free to treat all the other comments as FYI.

' `END`ATV~C&lt;hello&gt;`END`PLN\n' +
' `END`KWDend`END`PLN\n' +
'\n' +
'`END`KWDend`END')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing tests.

@nietaki
Copy link
Contributor Author

nietaki commented Mar 9, 2017

Thanks for reviewing, I'll update the copyright and look into the issues you highlighted when I have a second, probably on the weekend.

@nietaki
Copy link
Contributor Author

nietaki commented Mar 14, 2017

@mikesamuel How's this looking now?

@mikesamuel mikesamuel merged commit 760e6e7 into googlearchive:master Mar 15, 2017
@mikesamuel
Copy link
Contributor

mikesamuel commented Mar 15, 2017 via email

EricFromCanada pushed a commit to EricFromCanada/code-prettify that referenced this pull request Aug 15, 2017
* First, crude version of the elixir language handler.

* multiple elixir plugin improvements - attributes, atoms, strings, ...

* elixir-lang - added support for number literals with underscores

* elixir-lang: added support for atoms as keys in keyword lists

* Elixir: added support for sigils, simplified the keywords regex

* Elixir: added more keywords from Kernel.SpecialForms

* Elixir: adding missing keywords and more permissive atom/variable names

* Elixir: better support for binaries/bitstrings

* Elixir: added support for the `iex` prompt for interactive examples

* added a failing test for elixir

* Elixir: fixed the failing test for elixir syntax highlighting

* Elixir: Highlighting constructs like `%{"THIS": :foo}` as atoms

* Elixir: fixed false-positive highlighting for 0XFF and 0O77

* Elixir: fixing the license text.

* Elixir: fixing multiline charlists

* Elixir: more robust multiline strings and charlists

* Elixir: making string recognition simpler and more correct
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants