The Neon Genesis Evolution Plan
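"Though the North Sea is far, a whirlwind can carry you there; though the morning is gone, the evening is not too late." -- Preface to the Pavilion of Prince Teng (滕王阁序)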
It is better late than never.
Introduction
The lack of compiler documentation and of a language specification with elaborate, testable examples is a long-standing obstacle that keeps the Nim language from gaining the popularity it deserves. Compiler documentation gives developers an overview of the compiler internals and helps them get started with developing the Nim compiler. A language specification clarifies how the language works. Armed with detailed, testable examples, it can prevent regressions and act as a precise guide for developers, demonstrating that each feature, and the language as a whole, works.
The RFC is intended to be progressive and to grow gradually. The work should start with commenting the compiler code. A testable specification can be rolled out at the same time. Finally, the comments and the specification should help form a guide summarizing how the Nim compiler works.
The tasks are divided into three steps: comment the compiler code, build a testable specification and write a compiler guide.
Comment the compiler code
Compiler code without comments tends to be hard to understand. Comments give developers insight into why the code is written the way it is and how it is meant to work. Comments can become obsolete and meaningless as time goes by. However, it is worthwhile to update them periodically, since the code and its comments will be read thousands of times.
The Nim CI should measure the comment ratio of the code, ideally per pull request. The comment ratio is defined as comment lines / total lines (excluding empty lines).
- add the comment ratio metric to the Nim CI
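The ratio defined above can be sketched in a few lines of Nim. This is only an illustration, not a proposed CI implementation: the proc name `commentRatio` is hypothetical, and the sketch assumes a line is a comment only when it starts with `#`, so trailing comments and the inner lines of `#[ ... ]#` blocks are not counted.

```nim
import std/strutils

proc commentRatio(source: string): float =
  ## Comment lines divided by non-empty lines.
  ## Simplification (an assumption of this sketch): a line counts as a
  ## comment only if it starts with `#` after stripping whitespace.
  var comments, total = 0
  for line in source.splitLines:
    let stripped = line.strip()
    if stripped.len == 0:
      continue                      # empty lines are excluded from the total
    inc total
    if stripped.startsWith("#"):
      inc comments
  if total == 0: 0.0 else: comments / total

when isMainModule:
  const sample = """
# a comment
let x = 1

## a doc comment
echo x
"""
  # 2 comment lines out of 4 non-empty lines
  echo commentRatio(sample)
```

A per-pull-request metric would apply the same count to the lines a pull request adds rather than to whole files; how those lines are obtained from the CI is left open here.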
Contributions to the compiler documentation should take the form of pull requests. Each pull request should comment a particular module or piece of functionality in the Nim compiler.
- mConStrStr => document compiler procs regarding `&` (Nim#20257)
- todo ...
Build a testable specification
A specification should describe what should work or fail, with abundant testable examples. It should cover as many cases as possible.
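Cases that should fail can be made testable as well. A minimal sketch, assuming the `compiles` built-in from the system module is an acceptable way to assert that an expression is rejected by the compiler:

```nim
# What should work: `ord` converts a bool to its integer value explicitly.
doAssert compiles(ord(true) + 1)
doAssert ord(true) + 1 == 2

# What should fail: a bool is not an integer, so arithmetic on it
# must not type-check.
doAssert not compiles(true + 1)
doAssert not compiles(1 + false)
```

Because `compiles` is evaluated at compile time, a rejected expression does not stop the test file itself from compiling.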
As a convention, a `describe` block can be used. The implementation starts from

```nim
template describe*(x: string, body: typed) =
  block:
    body
```

It may be extended for better documentation generation in the future. The testable specification for the Nim manual should be put under the `tests/spec/manual` directory. A simple example is given below:
```nim
# manual_bool_type.nim
import stdtest/specutils

describe "The internal integer value of a Boolean type":
  describe "0 for false":
    doAssert ord(false) == 0
  describe "1 for true":
    doAssert ord(true) == 1

describe "A Boolean type is an Ordinal type":
  doAssert bool is Ordinal
  doAssert true is Ordinal
  doAssert false is Ordinal

describe "The size of a Boolean type is one byte":
  doAssert sizeof(bool) == 1
  doAssert sizeof(true) == 1
  doAssert sizeof(false) == 1
```

- Lexical Analysis
- Encoding
- Indentation
- Comments
- Multiline comments
- Identifiers & Keywords
- Identifier equality
- Keywords as identifiers
- String literals
- Triple quoted string literals
- Raw string literals
- Generalized raw string literals
- Character literals
- Numeric literals
- Custom numeric literals
- Operators
- Other tokens
- Syntax
- Associativity
- Precedence
- Dot-like operators
- Grammar
- Order of evaluation
- Constants and Constant Expressions
- Restrictions on Compile-Time Execution
- Types
- Ordinal types
- Pre-defined integer types
- Subrange types
- Pre-defined floating-point types
- Boolean type
- Character type
- Enumeration types
- String type
- cstring type
- Structured types
- Array and sequence types
- Open arrays
- Varargs
- Unchecked arrays
- Tuples and object types
- Object construction
- Object variants
- cast uncheckedAssign
- Set type
- Bit fields
- Reference and pointer types
- Nil
- Mixing GC'ed memory with ptr
- Procedural type
- Distinct type
- Modeling currencies
- Avoiding SQL injection attacks
- Auto type
- Type relations
- Type equality
- Subtype relation
- Convertible relation
- Assignment compatibility
- Overload resolution
- Overloading based on 'var T'
- Lazy type resolution for untyped
- Varargs matching
- iterable
- Overload disambiguation
- Named argument overloading
- Statements and expressions
- Statement list expression
- Discard statement
- Void context
- Var statement
- Let statement
- Tuple unpacking
- Const section
- Static statement/expression
- If statement
- Case statement
- When statement
- When nimvm statement
- Return statement
- Yield statement
- Block statement
- Break statement
- While statement
- Continue statement
- Assembler statement
- Using statement
- If expression
- When expression
- Case expression
- Block expression
- Table constructor
- Type conversions
- Type casts
- The addr operator
- The unsafeAddr operator
- Procedures
- Export marker
- Method call syntax
- Properties
- Command invocation syntax
- Closures
- Creating closures in loops
- Anonymous procedures
- Do notation
- Func
- Routines
- Type bound operators
- Nonoverloadable builtins
- Var parameters
- Var return type
- Future directions
- NRVO
- Overloading of the subscript operator
- Methods
- Multi-methods
- Inhibit dynamic method resolution via procCall
- Iterators and the for statement
- Implicit items/pairs invocations
- First-class iterators
- Converters
- Type sections
- Exception handling
- Try statement
- Try expression
- Except clauses
- Custom exceptions
- Defer statement
- Raise statement
- Exception hierarchy
- Imported exceptions
- Effect system
- Exception tracking
- EffectsOf annotation
- Tag tracking
- Side effects
- GC safety effect
- Effects pragma
- Generics
- Is operator
- Type classes
- Implicit generics
- Generic inference restrictions
- Symbol lookup in generics
- Open and Closed symbols
- Mixin statement
- Bind statement
- Delegating bind statements
- Templates
- Typed vs untyped parameters
- Passing a code block to a template
- Varargs of untyped
- Symbol binding in templates
- Identifier construction
- Lookup rules for template parameters
- Hygiene in templates
- Limitations of the method call syntax
- Macros
- Debug example
- bindSym
- Post-statement blocks
- For loop macro
- Case statement macros
- Special Types
- static[T]
- typedesc[T]
- typeof operator
- Modules
- Import statement
- Include statement
- Module names in imports
- Collective imports from a directory
- Pseudo import/include paths
- From import statement
- Export statement
- Scope rules
- Block scope
- Tuple or object scope
- Module scope
- Packages
- Compiler Messages
- Pragmas
- deprecated pragma
- compileTime pragma
- noreturn pragma
- acyclic pragma
- final pragma
- shallow pragma
- pure pragma
- asmNoStackFrame pragma
- error pragma
- fatal pragma
- warning pragma
- hint pragma
- line pragma
- linearScanEnd pragma
- computedGoto pragma
- immediate pragma
- redefine pragma
- compilation option pragmas
- push and pop pragmas
- register pragma
- global pragma
- Disabling certain messages
- used pragma
- experimental pragma
- Implementation Specific Pragmas
- Bitsize pragma
- Align pragma
- Noalias pragma
- Volatile pragma
- nodecl pragma
- Header pragma
- IncompleteStruct pragma
- Compile pragma
- Link pragma
- passc pragma
- localPassC pragma
- passl pragma
- Emit pragma
- ImportCpp pragma
- Namespaces
- Importcpp for enums
- Importcpp for procs
- Wrapping constructors
- Wrapping destructors
- Importcpp for objects
- ImportJs pragma
- ImportObjC pragma
- CodegenDecl pragma
- cppNonPod pragma
- compile-time define pragmas
- User-defined pragmas
- pragma pragma
- Custom annotations
- Macro pragmas
- Foreign function interface
- Importc pragma
- Exportc pragma
- Extern pragma
- Bycopy pragma
- Byref pragma
- Varargs pragma
- Union pragma
- Packed pragma
- Dynlib pragma for import
- Dynlib pragma for export
- Threads
- Thread pragma
- Threadvar pragma
- Threads and exceptions
- Guards and locks
- Guards and locks sections
- Protecting global variables
- Protecting general locations
These examples should be rendered as readable HTML documentation. They can be linked from the Nim manual or included in it in foldable form (folded by default).
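One possible direction for documentation generation is to let `describe` record its titles as a nested outline, which a later step could render to HTML. The sketch below is hypothetical and not part of the RFC: the globals `specDepth` and `outline` are invented for illustration, and the body is taken as `untyped` here, a choice of this sketch rather than of the RFC's `typed` version.

```nim
import std/strutils

var specDepth = 0          # hypothetical: nesting level of the current block
var outline: seq[string]   # hypothetical: one entry per describe title

template describe(title: string, body: untyped) =
  # Record the title, indented by the current nesting level, then run the body.
  outline.add "  ".repeat(specDepth) & "- " & title
  inc specDepth
  block:
    body
  dec specDepth

describe "The internal integer value of a Boolean type":
  describe "0 for false":
    doAssert ord(false) == 0
  describe "1 for true":
    doAssert ord(true) == 1

# Prints the three titles as an indented list.
echo outline.join("\n")
```

Collecting the titles at run time keeps the template small; a real generator would more likely extract them at compile time so that the documentation can be built without running the tests.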
Write a compiler guide
A compiler guide helps developers learn the internals of the Nim compiler without having to read all of the compiler code. What is a magic function? How are Nim methods implemented? Is it efficient to use `&` to concatenate multiple strings? These are the questions that a compiler guide should attempt to answer. The guide should focus on explaining specific terms and the gist of the implementation.
For instance:
The `&` function is a magic proc used to concatenate strings and chars. The Nim compiler applies some optimizations to make it perform as well as the in-place version.
There are four overloads of the `&` function in the system module.
```nim
proc `&`*(x: string, y: char): string {.magic: "ConStrStr", noSideEffect.}
proc `&`*(x, y: char): string {.magic: "ConStrStr", noSideEffect.}
proc `&`*(x, y: string): string {.magic: "ConStrStr", noSideEffect.}
proc `&`*(x: char, y: string): string {.magic: "ConStrStr", noSideEffect.}
```
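A short sketch that exercises each overload once. The second half shows one reading of "the in-place version", namely appending to a single string with `add`; that reading is an assumption of this example, and the check only compares results, not performance.

```nim
# One call per overload listed above.
doAssert "ab" & 'c' == "abc"     # (string, char)
doAssert 'a' & 'b' == "ab"       # (char, char)
doAssert "ab" & "cd" == "abcd"   # (string, string)
doAssert 'a' & "bc" == "abc"     # (char, string)

# Chained `&` versus appending in place: both produce the same string.
let middle = "bc"
let chained = 'a' & middle & "de" & 'f'
var inPlace = "a"
inPlace.add middle
inPlace.add "de"
inPlace.add 'f'
doAssert chained == inPlace
```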
All of them have the same magic `mConStrStr`, which is needed in the optimization phases.
```nim
s = "Hello " & name & ", how do you feel" & 'z'
```
Here is the AST of the right-hand side expression:
```nim
StmtList
  Infix
    Ident "&"
    Infix
      Ident "&"
      Infix
        Ident "&"
        StrLit "Hello "
        Ident "name"
      StrLit ", how do you feel"
    CharLit 122
```
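A tree like the one above can be reproduced with `std/macros`. The sketch below uses a small hypothetical helper, `astOf`, that returns `treeRepr` of its argument as a string. Since the expression is passed directly as an argument rather than as a statement block, the tree is expected to start at the outer `Infix` node, without the `StmtList` wrapper.

```nim
import std/[macros, strutils]

macro astOf(body: untyped): string =
  ## Hypothetical helper: the tree representation of `body` as a string
  ## literal. The argument is untyped, so `name` need not be declared.
  newLit(body.treeRepr)

const tree = astOf("Hello " & name & ", how do you feel" & 'z')

echo tree
# Number of `&` calls in the expression, one per nested Infix node.
echo tree.count("Ident \"&\"")
```

The nesting shows that the chain is parsed as left-associative binary calls, which is the shape the compiler has to recognize before it can treat the whole chain as a single concatenation.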
- [Is it efficient to use the `&` function to concatenate multiple strings?](https://ringabout.github.io/neon/concatenate)
- todo ...