I know I have been posting a lot about this tool over the last few weeks, but I am having fun developing it, and I hope it proves useful to someone. Version 0.4.0 of tree-sitter-bnf-tools is out, and the headline this time is completeness: the BNF dialect now covers the full set of tree-sitter grammar-level directives, and check has grown substantially smarter at catching problems before tree-sitter generate gets a chance to. This shortens the cycle of writing a grammar, testing it, finding errors, and going back to rewrite the grammar.
The Missing Directives
Tree-sitter grammars have a handful of top-level fields — word, externals, precedences, reserved — that previously had no BNF equivalent. They’re all there now:
- %word ruleName declares the identifier token for keyword extraction and better error recovery.
- %externals name1, name2 lists tokens defined by an external C scanner instead of the grammar.
- %precedences [a, b], [c, d] declares named precedence groups in descending priority order.
- %reserved setName: [r1, r2] defines named reserved-word sets, with per-occurrence overrides via (body %reserved setName) in rule bodies.
Named precedence levels can now be used in %prec: writing %prec 'unary' emits prec('unary', …) in grammar.js, as long as the name is declared in some %precedences group. And for completeness, negative integers (%prec -1) and regex flag suffixes (/pattern/i) are now valid syntax too.
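For readers who have not used these fields on the tree-sitter side, here is a hand-written grammar.js sketch of what they look like there. It is not output from the tool: the grammar name and every rule name are made up for illustration, and reserved is left out to keep it short. It only loads under tree-sitter generate, which provides grammar, seq, prec and the other DSL functions.

```javascript
// grammar.js: hand-written illustration, NOT generated by tree-sitter-bnf-tools.
// The grammar name and all rule names below are hypothetical.
module.exports = grammar({
  name: 'demo',

  // Field that %word maps to: the token used for keyword extraction.
  word: $ => $.identifier,

  // Field that %externals maps to: tokens produced by an external scanner.
  externals: $ => [$.string_content],

  // Field that %precedences maps to: one group, highest priority first.
  precedences: $ => [['unary', 'binary']],

  rules: {
    source_file: $ => repeat($._expression),

    _expression: $ => choice(
      $.identifier,
      $.number,
      $.string,
      $.unary,
      $.binary,
    ),

    // A named level is only accepted because 'unary' appears in `precedences`.
    unary: $ => prec('unary', seq('-', $._expression)),

    binary: $ => prec.left('binary', seq($._expression, '+', $._expression)),

    string: $ => seq('"', optional($.string_content), '"'),

    // Regex with a flag suffix, the grammar.js counterpart of /pattern/i.
    identifier: $ => /[a-z_]+/i,

    number: $ => /\d+/,
  },
});
```

If 'unary' were removed from the precedences array, tree-sitter generate would reject the prec('unary', …) call, which is the same constraint the post describes for %prec.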
Check Yourself Before You Wreck Yourself
Now, check is smarter. A whole class of issues that used to slip through check and blow up later at tree-sitter generate time now fails early:
- Undefined rule references in %conflicts, %inline, %supertypes, %externals, and %reserved are now errors, not warnings.
- A hidden start rule (via _ prefix or listed in %supertypes) is caught immediately.
- Invalid %inline targets (the start rule, an external token, or a pure token body) are all reported.
- %word's target must be a pure token with a unique body; both violations are now caught.
- A %supertypes rule with a pure-token body or multi-step alternatives is rejected.
- A name declared in both %externals and a rule body is an error.
Error messages also got better: syntax errors now report file, line, column, and a source snippet for every problem in the file (up to 10), instead of a bare SyntaxError.
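To give a feel for what this kind of early validation involves, here is a minimal sketch of three of the checks listed above: undefined references in directives, a hidden start rule, and a name that is both an external token and a rule. It is only an illustration under assumptions. The in-memory grammar shape, the function name checkGrammar, and the message wording are all hypothetical and are not the tool's actual implementation or output.

```javascript
// Sketch only: a hypothetical in-memory grammar model, not the tool's real data structures.
function checkGrammar(g) {
  const errors = [];
  const ruleNames = new Set(Object.keys(g.rules));
  // Assumption: a directive may refer to a grammar rule or to an external token.
  const known = new Set([...ruleNames, ...g.externals]);

  // Every name used in a directive must be defined somewhere.
  for (const [directive, names] of Object.entries(g.directives)) {
    // %conflicts holds groups of names, the others hold flat lists; flat() handles both.
    for (const name of names.flat()) {
      if (!known.has(name)) {
        errors.push(`%${directive}: undefined rule '${name}'`);
      }
    }
  }

  // The start rule must be visible: no leading underscore, not listed as a supertype.
  const supertypes = g.directives.supertypes ?? [];
  if (g.start.startsWith('_') || supertypes.includes(g.start)) {
    errors.push(`start rule '${g.start}' is hidden`);
  }

  // A name cannot be both an external token and a grammar rule.
  for (const name of g.externals) {
    if (ruleNames.has(name)) {
      errors.push(`'${name}' is declared in %externals and also defined as a rule`);
    }
  }

  return errors;
}

// Hypothetical grammar that trips all three checks.
const example = {
  start: '_program',
  rules: { _program: {}, expression: {}, identifier: {}, comment: {} },
  externals: ['comment'],
  directives: {
    conflicts: [['expression', 'statement']],
    inline: ['expression'],
    supertypes: [],
  },
};

const errors = checkGrammar(example);
for (const message of errors) {
  console.log(message);
}
```

Collecting every problem into a list instead of stopping at the first one mirrors the behaviour described for syntax errors, where all problems in the file are reported together.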
One More Thing
convert --generate now writes a minimal tree-sitter.json to the output directory when one doesn’t exist. This satisfies tree-sitter ≥ 0.25’s ABI 15 requirement and silences the fallback-to-ABI-14 warning without any extra steps.
Documentation
If you are not sure what all this is about, we've got you covered as well. The tutorial now includes a section dedicated to parsing and tree-sitter concepts (LR parsing, shift-reduce conflicts, GLR, keyword extraction, external scanners), and we also included a worked example that builds a real grammar from scratch with the tool.
Install or upgrade via Cargo:
cargo install ts-bnf-tool