This file provides guidance to AI agents when working with code in this repository.
# Build
go build .
# Run
go run . [flags] grammar.y
# Test (no test suite exists; validate by generating a grammar)
go vet ./...

There is no Makefile; standard Go tooling is used throughout.
This tool (package main) reads a YACC grammar (.y file) and writes a Go parser. The main logic lives in goyacc.go (~3700 lines); union layout inference is in unionsize.go (uses go/packages to inspect the target package's types at generation time).
The upstream source is cmd/goyacc in the Go x/tools repository (GitHub mirror). This fork adds the enhancements described below.
- Lexing: gettok()/getword() tokenize the .y input
- Grammar parsing: setup() reads productions, types, and directives into global arrays
- State generation: stagen() builds the LALR(1) automaton (states, items, lookahead sets)
- Table output: output()/go2out() write the action/goto tables to the output file
- Code emission: cppcode(), cpyact(), and related functions copy user code sections verbatim into the output
- Pitem/Item: a production rule with a dot position and lookahead set
- Symb: a grammar symbol (terminal or nonterminal)
- Lkset: a bitset representing lookahead tokens
- Row: one row of the action table (actions + default)
- Error: a custom error message keyed by (state, token)
Discriminated unsafe.Pointer union (%union): yySymType holds a data [N]uintptr array (sized to the largest member) plus a ptrs [M]unsafe.Pointer GC-keepalive array. All member types are stored/read via uintptr casts, eliminating interface boxing. Array sizes and pointer-word offsets (for the GC-keepalive array) are inferred automatically at generation time via go/packages (inferUnionLayout in unionsize.go). Typed getter (<member>()) and setter (set<member>()) methods are generated for each member.
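The layout described above can be pictured with a hand-written stand-in. This sketch is an assumption about the general shape, not the code the generator emits: the symType and astNode types, the member names num and node, and the array sizes of 1 are all picked by hand here, whereas the real sizes come from inferUnionLayout.

```go
package main

import (
	"fmt"
	"unsafe"
)

// astNode is a hypothetical pointer-typed union member.
type astNode struct{ name string }

// symType is a hand-written stand-in for a generated yySymType.
type symType struct {
	data [1]uintptr        // raw storage, sized to the largest member
	ptrs [1]unsafe.Pointer // keeps pointer members visible to the GC
}

// setnum stores the int member in the raw word and clears the keepalive
// slot, since no pointer is live any more.
func (v *symType) setnum(n int) {
	v.data[0] = uintptr(n)
	v.ptrs[0] = nil
}

// num reads the int member back from the raw word.
func (v *symType) num() int { return int(v.data[0]) }

// setnode stores a pointer member: the address goes into data, and the
// real pointer goes into ptrs so the GC keeps the node alive.
func (v *symType) setnode(p *astNode) {
	v.ptrs[0] = unsafe.Pointer(p)
	v.data[0] = uintptr(unsafe.Pointer(p))
}

// node reinterprets the first data word as a *astNode. This relies on
// ptrs[0] keeping the target alive while the value is in use.
func (v *symType) node() *astNode {
	return *(**astNode)(unsafe.Pointer(&v.data[0]))
}

func main() {
	var v symType
	v.setnum(-42)
	fmt.Println(v.num())
	v.setnode(&astNode{name: "expr"})
	fmt.Println(v.node().name)
}
```

The point of the second array is that the GC does not treat uintptr words as references, so a pointer stored only in data would not keep its target alive.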
Custom error messages: // error: "message" comments in grammar rules are collected into a lookup table keyed by (state, token) pair and emitted into the generated parser.
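A lookup keyed by a (state, token) pair might look like the sketch below. The emitted table's actual name and representation are not stated here, so errKey, errorMessages, errorMessage, the sample entry, and the "syntax error" fallback are all illustrative assumptions.

```go
package main

import "fmt"

// errKey is a hypothetical (state, token) pair used as the lookup key.
type errKey struct{ state, token int }

// errorMessages stands in for the emitted table; the entry is invented.
var errorMessages = map[errKey]string{
	{state: 12, token: 59}: "unexpected ';' after operator",
}

// errorMessage returns the custom message for (state, token), or a
// generic fallback when the grammar supplied none.
func errorMessage(state, token int) string {
	if msg, ok := errorMessages[errKey{state, token}]; ok {
		return msg
	}
	return "syntax error"
}

func main() {
	fmt.Println(errorMessage(12, 59))
	fmt.Println(errorMessage(3, 40))
}
```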
The emitted file is valid but unformatted Go. Callers should post-process with goimports and/or gofumpt.
Generated parsers expose:
- yySymType: the semantic value type on the parser stack
- yyLexer interface: Lex(lval *yySymType) int
- yyParser interface: Parse(yylex yyLexer) int
- yyParse(yylex yyLexer) int: convenience entry point
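As a rough illustration of the caller's side, the sketch below implements the Lex method for a toy lexer. So that it compiles on its own, it declares simplified stand-ins for yySymType and yyLexer; in a real project these come from the generated file, and values would be written through the generated setters rather than a plain field. The digitLexer type, the NUM token code, and the convention of returning 0 at end of input are assumptions of the sketch.

```go
package main

import "fmt"

// Stand-ins for generated declarations, so this sketch compiles alone.
type yySymType struct{ num int }

type yyLexer interface {
	Lex(lval *yySymType) int
}

// NUM is a hypothetical token code; generated parsers define their own.
const NUM = 57346

// digitLexer yields one NUM token per ASCII digit and skips everything else.
type digitLexer struct {
	input string
	pos   int
}

// Lex stores the digit's value in lval and returns NUM, or returns 0
// once the input is exhausted.
func (l *digitLexer) Lex(lval *yySymType) int {
	for l.pos < len(l.input) {
		c := l.input[l.pos]
		l.pos++
		if c >= '0' && c <= '9' {
			lval.num = int(c - '0')
			return NUM
		}
	}
	return 0
}

func main() {
	var lex yyLexer = &digitLexer{input: "4 2"}
	var lval yySymType
	for tok := lex.Lex(&lval); tok != 0; tok = lex.Lex(&lval) {
		fmt.Println(tok, lval.num)
	}
}
```

With a generated parser in the same package, such a lexer would be passed to yyParse instead of being driven by hand as in main above.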