CoCoDo: Compiler Coding Dojo

2021

CoCoDo is a coding dojo where you can enjoy an entire day of compiler programming under gentle guidance of field experts.

Due to the cancellation of the ‹Programming› 2020 conference, CoCoDo 2020 did NOT take place in March 2020. We went again in 2021.

Programme (times in UTC!)

13:00–14:30: A Tutorial on the Spoofax Language Workbench (Eelco Visser)
15:00–16:30: An Interactive Exploration of a Simple Compiler (Marcus Denker)
17:00–18:30: From Abstract Syntax Trees to Machine Code with LLVM (Dimitri Racordon)
- (Cocodol) (slides) (getting started)

Feel free to browse the website (especially the “Technologies” tab above).

Compiler construction comprises, but is not limited to, lexical analysis, syntactic analysis, preprocessing, context handling, code generation, code optimisation, virtual machines, interpreters, smell detection, clone management, portability, migration, refactoring, domain-specific language design, linking and loading, assembling and disassembling, generics and reflection, numerous paradigms and so much more.

Organised by Johan Fabry and Vadim Zaytsev as a part of Raincode Labs academic activities.

Post-Proceedings Program Committee

Andrei Chiş (feenk, Switzerland)
Johan Fabry (Raincode Labs, Belgium) — co-chair
Adrian Johnstone (Royal Holloway, UK)
Stefan Marr (University of Kent, UK)
Fabio Niephaus (Hasso Platner Institut, Germany)
Elizabeth Scott (Royal Holloway, UK)
Anthony Sloane (Australia)
Tijs van der Storm (CWI & University of Groningen, The Netherlands)
Vadim Zaytsev (University of Twente, The Netherlands) — co-chair

After the conference, we have collected submissions for peer review by the members of the PC. Timely submitted manuscripts that were accepted as a result of reviewing, were printed in the ‹Programming› Companion volume.

Johan Fabry, Vadim Zaytsev, 4th Raincode Labs Compiler Coding Dojo, Welcome from the Chairs (doi:10.1145/3464432, pp. XI–XVIII)
Dimitri Racordon, From ASTs to Machine Code with LLVM (doi:10.1145/3464432.3464777, pp. 68–76)

2019

CoCoDo 2019 took place on 2 April 2019 in Genova, Italy as a part of the ‹Programming› conference.

Programme

08:45–09:00: Opening and introductions (Vadim Zaytsev)
09:00–10:30: Scrap your DSL Boilerplate with a Universe of Syntaxes, their Programs and Proofs (Guillaume Allais with Agda)
11:00–12:30: Language Engineering with Rascal (Tijs van der Storm with Rascal)
14:00–15:30: Tool Battle: 1 Infrastructure, 2 Parser Generators!
- Jason Lecerf with SmaCC
- Andrei Chiş with PetitParser
16:00–17:00: Discussion, presentations and pitches

2018

CoCoDo 2018 took place on 9 April 2018 in Nice, France as a part of the ‹Programming› conference.

Programme

08:45–09:00: Opening and introductions (Vadim Zaytsev)
09:00–10:30: Peter D. Mosses with FunCons
11:00–12:30: Jesper Öqvist with JastAdd
14:30–16:00: Anya Helene Bagge with Rascal
16:30–18:00: Presentations and pitches:
- Johan Fabry
- Friedrich Steimann

2017

CoCoDo 2017 took place on 4 April 2017 in Brussels, Belgium as a part of the ‹Programming› conference.

Programme

09:00–09:15: Opening and introductions (Vadim Zaytsev)
09:15–10:30: Attribute Grammars for DSLs for Music and 3D Graphics (Elizabeth Scott, Adrian Johnstone) with ART
11:00–12:00: Metaprogramming in Late Phases of Compilation (Anya Helene Bagge) with Rascal
13:30–15:00: Experimenting with Racket as a Language Workbench (Robby Findler) with Racket
15:30–17:00: The Future of Compilers (presentations and pitches)
- Rik Arends with MakePad
- Ralf Lämmel with The Software Languages Book
- Adrian Johnstone with FunCons
- Nicolas Laurent with Whimsy and Autumn

Technologies

ANTLR

ANTLR version 4.9.3 (download)
Main website: antlr.org
Works with Java, C#, JavaScript, Python2, Python3
Wikipedia: (English) (Dutch) (French) (German) (Russian)
Parsing algorithm: ALL(*)

Example from antlr.org:

grammar Expr;
prog: (expr NEWLINE)* ;
expr: expr ('*'|'/') expr
    | expr ('+'|'-') expr
    | INT
    | '(' expr ')'
    ;
NEWLINE : [\r\n]+ ;
INT     : [0-9]+ ;

More examples: pragprog.com
Maintained by Terence Parr
Built on top of ANTLR:
- EMFText
- Xtext

Beaver

Beaver version 0.9.11 (sourceforge)
Main website: beaver.sourceforge.net
Works with Java
Parsing algorithm: LALR(1)

Example from beaver.sourceforge.net:

%%

%left RPAREN;
%left MULT, DIV;
%left PLUS, MINUS;

%typeof NUMBER = "Number";
%typeof expr = "Expr";

%%

expr = expr.a MULT  expr.b   {: return new Expr(a.value * b.value); :}
     | expr.a DIV   expr.b   {: return new Expr(a.value / b.value); :}
     | expr.a PLUS  expr.b   {: return new Expr(a.value + b.value); :}
     | expr.a MINUS expr.b   {: return new Expr(a.value - b.value); :}
     | NUMBER.n              {: return new Expr(n.doubleValue());   :}
     | LPAREN expr.e RPAREN  {: return e; :}
     ;

More examples: beaver.sourceforge.net
Used with:
- JFlex
- JLex

Bison

Bison version 3.8 (download)
Main website: gnu.org
Works with C, C++, Java
Wikipedia: (English) (French) (German) (Russian)
Parsing algorithms: LALR(1), GLR

Example from gnu.org:

%token TYPE DOTDOT ID

%left '+' '-'
%left '*' '/'

%%
type_decl: TYPE ID '=' type ';' ;

type: '(' id_list ')'
    | expr DOTDOT expr;
id_list: ID
       | id_list ',' ID;
expr: '(' expr ')'
    | expr '+' expr
    | expr '-' expr
    | expr '*' expr
    | expr '/' expr
    | ID;

More examples: gnu.org
Maintained by Akim Demaille and Paul Eggert
Used with:
- flex (English) (French) (Russian)

BiYacc

BiYacc (bitbucket)
Main website: @prl_tokyo/biyacc
Works with Haskell
Parsing algorithm: BX

Example from biyacc.yozora.moe:

Abstract

data Arith = Add Arith Arith
           | Sub Arith Arith
           | Mul Arith Arith
           | Div Arith Arith
           | Num Natural
           | Var String
  deriving (Show, Eq, Read)

Concrete

Expr   -> Expr '+' Term
        | Expr '-' Term
        | Term
        ;

Term   -> Term '*' Factor
        | Term '/' Factor
        | Factor
        ;

Factor -> '-' Factor
        | Int
        | Name
        | '(' Expr ')'
        ;

Actions

Arith +> Expr
Add x y +> (x +> Expr) '+' (y +> Term);
Sub x y +> (x +> Expr) '-' (y +> Term);
exp     +> (exp +> Term);

Arith +> Term
Mul x y +>  (x +> Term) '*' (y +> Factor);
Div x y +>  (x +> Term) '/' (y +> Factor);
exp     +>  (exp +> Factor);

Arith +> Factor
Sub (Num 0) y +> '-' (y +> Factor);
Num i         +> (i +> Int);
Var n         +> (n +> Name);
exp           +> '(' (exp +> Expr) ')';

More examples: biyacc.yozora.moe
Maintained by Zirun Zhu, Yongzhe Zhang, Hsiang-Shang Ko, Pedro Martins, João Saraiva and Zhenjiang Hu

BtYacc

BtYacc version 3.0 (download)
Main website: siber.com
Works with C++
Parsing algorithms: backtracking, bottom-up

Example from siber.com:

%left LO '+' '-'
%left HI '*' '/' '%'
%nonassoc UNARY

%%

expr: expr op1 expr %prec LO
    | expr op2 expr %prec HI
    | unary expr %prec UNARY
    ;

op1 : '+'
    | '-'
    ;

op2 : '*'
    | '/'
    | '%'
    ;

unary : '+' | '-' | '*' | '&' ;

Maintained by Chris Dodd and Vadim Maslov

byacc

Berkeley Yacc

byacc version 1.9 (download)
Main website: invisible-island.net
Works with C
Wikipedia: (English)
Parsing algorithm: LALR

Example from dinosaur.compilertools.net:

%left  '+'  '-'
%left  '*'  '/'

%%

expr    :       expr  '+'  expr
        |       expr  '-'  expr
        |       expr  '*'  expr
        |       expr  '/'  expr
        |       '-'  expr      %prec  '*'
        |       NAME
        ;

Maintained by Robert Corbett
Used with:
- flex (English) (French) (Russian)
- lex (English) (French) (Dutch) (Russian)

BYACC/J

Berkeley Yacc Java extension

BYACC/J version 1.15 (sourceforge)
Main website: byaccj.sourceforge.net
Works with C, C++, Java
Parsing algorithm: LALR

Example from byaccj.sourceforge.net:

%{
import java.lang.Math;
import java.io.*;
import java.util.StringTokenizer;
%}

/* YACC Declarations */
%token NUM
%left '-' '+'
%left '*' '/'
%left NEG /* negation--unary minus */
%right '^' /* exponentiation */

/* Grammar follows */
%%
input: /* empty string */
 | input line
 ;

line: '\n'
 | exp '\n' { System.out.println(" " + $1.dval + " "); }
 ;

exp: NUM { $$ = $1; }
 | exp '+' exp { $$ = new ParserVal($1.dval + $3.dval); }
 | exp '-' exp { $$ = new ParserVal($1.dval - $3.dval); }
 | exp '*' exp { $$ = new ParserVal($1.dval * $3.dval); }
 | exp '/' exp { $$ = new ParserVal($1.dval / $3.dval); }
 | '-' exp %prec NEG { $$ = new ParserVal(-$2.dval); }
 | exp '^' exp { $$ = new ParserVal(Math.pow($1.dval, $3.dval)); }
 | '(' exp ')' { $$ = $2; }
 ;
%%

String ins;
StringTokenizer st;

void yyerror(String s) { System.out.println("par:"+s); }

boolean newline;
int yylex()
{
	String s;
	int tok;
	Double d;
	if (!st.hasMoreTokens())
		if (!newline)
		{
			newline=true;
			return '\n';
		}
		else
			return 0;
	s = st.nextToken();
	try
	{
		d = Double.valueOf(s);
		yylval = new ParserVal(d.doubleValue());
		tok = NUM;
	}
	catch (Exception e)
	{
		tok = s.charAt(0);
	}
	return tok;
}

void dotest()
{
	BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
	while (true)
	{
		System.out.print("expression:");
		try { ins = in.readLine(); }
		catch (Exception e) {}
		st = new StringTokenizer(ins);
		newline=false;
		yyparse();
	}
}

public static void main(String args[])
{
	Parser par = new Parser(false);
	par.dotest();
}

Maintained by Tomas Hurka

Coco/R

Coco/R (download)
Main website: ssw.jku.at
Works with C#, Java, C++, F#, VB.NET, Swift, Oberon
Wikipedia: (English) (German) (Russian)
Parsing algorithms: recursive descent, AG, LL(k)

Example from ssw.jku.at:

COMPILER Sample

CHARACTERS
  digit = '0'..'9'.
TOKENS
  number = digit {digit}.
IGNORE '\r' + '\n'

PRODUCTIONS
  Sample                     (. int n; .)
  = {
      "calc" 
      Expr<out n>            (. Console.WriteLine(n); .)
    }.

  Expr<out int n>            (. int n1; .)
  = Term<out n>
    {
      '+'
      Term<out n1>           (. n = n + n1; .)
    }.

  Term<out int n> 
  = number                   (. n = Convert.Int32(t.val); .)
  .
END Sample.

More examples: ssw.jku.at
Maintained by Hanspeter Mössenböck, Markus Löberbauer and Albrecht Wöß
Built on top of Coco/R:

Copper

Copper version 1.0.0 (github)
Main website: melt.cs.umn.edu
Works with Silver
Parsing algorithms: context-aware, LALR(1)

Example from @melt-umn/copper:

%cf{
    non terminal stmt;
    non terminal Double stmts,expr;

    start with stmts;

    precedence left PLUS, BINARY_MINUS;
    precedence left TIMES, DIVIDE;
    precedence left UNARY_MINUS;

    stmts ::=
      stmts:hd stmt:tl   {: RESULT = (env.containsKey("RESULT") ? env.get("RESULT") : 0.0/0.0); :}
    | stmt:s             {: RESULT = (env.containsKey("RESULT") ? env.get("RESULT") : 0.0/0.0);  :}
    ;

    stmt ::=
      UNASSIGNED_ID:i ASSIGN expr:e SEMI      {: env.put(i,e); :}
    | expr:e SEMI                             {: env.put("_e" + (nextUnnamed++),e); :}
    ;

    expr ::=
      expr:l PLUS expr:r           {: RESULT = l + r;    :}
    | expr:l BINARY_MINUS expr:r   {: RESULT = l - r;    :}
    | expr:l TIMES expr:r          {: RESULT = l * r;    :}
    | expr:l DIVIDE expr:r         {: RESULT = l / r;    :}
    | UNARY_MINUS expr:e           {: RESULT = -1.0 * e; :}
        %layout ()
    | LPAREN expr:e RPAREN         {: RESULT = e;        :}
    | NUMBER:n                     {: RESULT = n;        :}
    | ASSIGNED_ID:i                {: RESULT = i;        :}
    | UNASSIGNED_ID:u
      {:
          error(_pos,"Undefined symbol '" + u + "'");
          RESULT = 0.0/0.0;
      :}
    ;

%cf}

More examples: @melt-umn/copper
Maintained by Eric Van Wyk

DCG

Definite Clause Grammars

SWI Prolog version 8.4.1 (download)
Main website: swi-prolog.org
Works with Prolog
Wikipedia: (English) (Russian)
Parsing algorithm: DCG

Example from @grammarware/slps:

expr(E) --> lassoc(ops,atom,binary,E).

expr(apply(N,Es)) -->
	name(N),
	+(atom,Es).

expr(ifThenElse(E1,E2,E3)) -->
	reserved("if"),
	expr(E1),
	reserved("then"),
	expr(E2),
	reserved("else"),
	expr(E3).

More examples: pathwayslms.com
DCG contributors

Eli

Eli version 4.8.1 (sourceforge)
Main website: eli-project.sourceforge.net
Parsing algorithm: AG

Example from eli-project.sourceforge.net:

ClassDeclaration:
  Modifiers 'class' Identifier Super Interfaces ClassBody .

Super:
  'extends' InhName / .

Interfaces:
  ['implements' InterfaceTypeList] .

InterfaceTypeList:
  InterfaceType /
  InterfaceTypeList ',' InterfaceType .

ClassBody:
  '{' ClassBodyDeclarations '}' .

ClassBodyDeclarations:
  / ClassBodyDeclarationList .

ClassBodyDeclarationList:
  ClassBodyDeclaration /
  ClassBodyDeclarationList ClassBodyDeclaration .

More examples: eli-project.sourceforge.net

Ensō

Ensō (github)
Main website: @enso-lang/enso
Parsing algorithm: GLL

Example from cs.utexas.edu:

start G
G ::= [Grammar] "start" start:</rules[it]> rules:R*
R ::= [Rule] name:sym "::=" arg:A
A ::= [Alt] alts:C + @"|"
C ::= [Create] "[" name:sym "]" arg:S | S
S ::= [Sequence] elements:F*
F ::= [Field] name:sym  ":" arg:P | P
P ::= [Lit] value:str
    | [Value] kind:("int" | "str" | "real" | "sym")
    | [Ref] "<" path:Path  ">"
    | [Call] rule:</rules[it]>
    | [Code] "{" code:Expr "}"
    | [Regular] arg:P "*" Sep? { optional && many }
    | [Regular] arg:P "?" { optional }
    | "(" A ")"
Sep ::= "@"sep:P

More examples: enso-lang.org
Maintained by William R. Cook, Alex Loh and Tijs van der Storm

Frown

Frown version 0.6.1 (download)
Main website: cs.ox.ac.uk
Works with Haskell
Parsing algorithm: LALR(k)

Example from cs.ox.ac.uk:

import Char

data Expr  =  Ident  Char
           |  Apply  Expr Expr
              deriving (Show)

type Terminal  =  Char

%{

Terminal  =  guard {isAlpha} as "letter"
          |  '('
          |  ')';

Nonterminal  =  expr {Expr};

expr  {Ident x}    :  "letter" {x};
      {Apply t u}  |  expr {t}, '(', expr {u}, ')';
     
}%

frown ts  =  fail "syntax error"

Maintained by Ralf Hinze and Arjan Oosting
Built on top of Frown:
- frown 0.6.1-6

GDK

Grammar Deployment Kit

GDK version 1.4.4 (sourceforge)
Main website: gdk.sourceforge.net
Works with C
Parsing algorithm: LL

Example from @slebok/zoo:

specification : rule+;
rule          : ident ":" disjunction ";";
disjunction   : {conjunction "|"};
conjunction   : term+;
term          : basis repetition?;
basis         : ident
              | literal
              | "%epsilon"
              | alternation
              | group
              ;
repetition    : "+" | "*" | "?";
alternation   : "{" basis basis "}" repetition;
group : "(" disjunction ")" ;

More examples: gdk.sourceforge.net
Maintained by Jan Kort

GOLD

GOLD Parsing System

GOLD version 5 (download)
Main website: goldparser.org
Works with Intel Asm, C, C#, D, Delphi, Java, Pascal, Python3, VB, VB.NET, C++
Wikipedia: (English) (German)
Parsing algorithms: LALR, DFA

Example from goldparser.org:

Id = {Letter}{AlphaNumeric}*

<Statement> ::= if Id then <Statement>
              | if Id then <Then Stm> else <Statement>
              | Id ':=' Id

<Then Stm>  ::= if Id then <Then Stm> else <Then Stm>
              | Id ':=' Id

More examples: goldparser.org, goldparser.org
Devin Cook and others
Built on top of GOLD:

Happy

Happy version 1.18.5 (download)
Main website: haskell.org
Works with Haskell
Parsing algorithm: GLR

Example from haskell.org:

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp1                    { Exp1 $1 }

Exp1  : Exp1 '+' Term           { Plus $1 $3 }
      | Exp1 '-' Term           { Minus $1 $3 }
      | Term                    { Term $1 }

Term  : Term '*' Factor         { Times $1 $3 }
      | Term '/' Factor         { Div $1 $3 }
      | Factor                  { Factor $1 }

Factor			  
      : int                     { Int $1 }
      | var                     { Var $1 }
      | '(' Exp ')'             { Brack $2 }

More examples: haskell.org
Maintained by Andy Gill and Simon Marlow
Used with:
- Alex

Iguana

Iguana (github)
Main website: iguana-parser.github.io
Works with Rascal, Java, Scala
Parsing algorithms: GLL, data dependency

Example from @iguana-parser/examples:

Start ::= Element

Element ::= s=STag Content ETag(s)
STag    ::= '<' n:Name Attribute* '>' {n.yield}
ETag(s) ::= '</' n:Name [n.yield == s] '>'

Attribute ::= Name "=" String
Content ::= Element* | Text

@Layout
L ::= Whitespaces

regex
{
	Name ::= [a-zA-Z][a-zA-Z0-9]*
	Text ::= [a-zA-Z0-9]+
	String ::= '\"' [a-zA-Z0-9]+ '\"'
	Whitespaces ::= [\n\r\t\f\ ]*
}

More examples: @iguana-parser/examples
Maintained by Ali Afroozeh and Anastasia Izmaylova

Irony

Irony version 1.1.0 (github)
Main website: @IronyProject/Irony
Works with C#
Wikipedia: (English)
Parsing algorithm: LALR(1)

Example from @IronyProject/Irony:

Expr.Rule = Term | UnExpr | BinExpr | PostFixExpr;
Term.Rule = number | ParExpr | identifier;
ParExpr.Rule = "(" + Expr + ")";
UnExpr.Rule = UnOp + Term;
UnOp.Rule = ToTerm("+") | "-" | "++" | "--";
BinExpr.Rule = Expr + BinOp + Expr;
BinOp.Rule = ToTerm("+") | "-" | "*" | "/" | "**";
PostFixExpr.Rule = Term + PostFixOp;
PostFixOp.Rule = ToTerm("++") | "--";
AssignmentStmt.Rule = identifier + AssignmentOp + Expr;
AssignmentOp.Rule = ToTerm("=") | "+=" | "-=" | "*=" | "/=";
Statement.Rule = AssignmentStmt | Expr | Empty;
ProgramLine.Rule = Statement + NewLine;
Program.Rule = MakeStarRule(Program, ProgramLine);
this.Root = Program;

RegisterOperators(1, "+", "-");
RegisterOperators(2, "*", "/");
RegisterOperators(3, Associativity.Right, "**");

Irony contributors

JastAdd

JastAdd version 2.3.5 (download)
Main website: jastadd.org
Parsing algorithm: RAG

Example from @jastadd/examples:

Program ::= Block /PredefinedType:TypeDecl*/;
Block ::= BlockStmt*;
abstract BlockStmt;
abstract Stmt: BlockStmt;
abstract Decl: BlockStmt ::= <Name:String>;
abstract TypeDecl:Decl;
ClassDecl: TypeDecl ::= [Superclass:IdUse] Body:Block;
VarDecl: Decl ::= Type:Access;
AssignStmt: Stmt ::= Variable:Access Value:Exp;
WhileStmt: Stmt ::= Condition:Exp Body:Stmt;
abstract Exp;
abstract Access:Exp;
abstract IdUse: Access ::= <Name:String>;
Use: IdUse;
Dot:Access ::= ObjectReference:Access IdUse;
BooleanLiteral : Exp ::= <Value:String>;

More examples: @jastadd/examples, jastadd.org
Maintained by Görel Hedin, Emma Söderberg, Emma Söderberg and Jesper Öqvist

JavaCC

JavaCC version 7.0.10 (download)
Main website: javacc.org
Works with Java, C, C++
Wikipedia: (English) (French) (German) (Russian)
Parsing algorithm: LL(k)

Example from engr.mun.ca:

double Primary() throws NumberFormatException :
{
	Token t ;
	double d ;
}
{
	t=<NUMBER>
	{ return Double.parseDouble( t.image ) ; }
|
	<PREVIOUS>
	{ return previousValue ; }
|
	<OPEN_PAR> d=Expression() <CLOSE_PAR>
	{ return d ; }
|
	<MINUS> d=Primary()
	{ return -d ; }
}

More examples: engr.mun.ca
Maintained by Glassfish Kenai, Paul Cager, Tom Copeland and sreeni.

jparsec

jparsec version 3.1 (github)
Main website: @jparsec/jparsec
Works with Java
Parsing algorithm: recursive descent

Example from @jparsec/jparsec:

Terminals operators = Terminals.operators(","); // only one operator supported so far
Parser<?> integerTokenizer = Terminals.IntegerLiteral.TOKENIZER;
Parser<String> integerSyntacticParser = Terminals.IntegerLiteral.PARSER;
Parser<?> ignored = Parsers.or(Scanners.JAVA_BLOCK_COMMENT, Scanners.WHITESPACES);
Parser<?> tokenizer = Parsers.or(operators.tokenizer(), integerTokenizer); // tokenizes the operators and integer
Parser<List<String>> integers = integerSyntacticParser.sepBy(operators.token(","))
    .from(tokenizer, ignored.skipMany());
assertEquals(Arrays.asList("1", "2", "3"), integers.parse("1, /*this is comment*/2, 3");

Maintained by Arnaud Bailly, Gregory Ssi-Yan-Kai, Ben Yu, Alex Michael Berry

Kiama

Kiama version 2.5.0 (github)
Main website: @inkytonik/kiama
Works with Scala
Parsing algorithms: Packrat, AG

Example from @inkytonik/kiama:

class SyntaxAnalyser(positions : Positions) extends Parsers(positions)
{
    lazy val stmt : Parser[Stmt] =
        ";" ^^ (_ => Null()) | sequence | asgnStmt | whileStmt

    lazy val asgnStmt =
        variable ~ ("=" ~> exp) <~ ";" ^^ Asgn

    lazy val whileStmt =
        ("while" ~> "(" ~> exp <~ ")") ~ stmt ^^ While

    lazy val sequence =
        "{" ~> (stmt*) <~ "}" ^^ Seqn

    lazy val exp : PackratParser[Exp] =
        exp ~ ("+" ~> term) ^^ Add |
            exp ~ ("-" ~> term) ^^ Sub |
            term

    lazy val term : PackratParser[Exp] =
        term ~ ("*" ~> factor) ^^ Mul |
            term ~ ("/" ~> factor) ^^ Div |
            factor

    lazy val factor : Parser[Exp] =
        double | integer | variable | "-" ~> exp ^^ Neg | "(" ~> exp <~ ")"

    lazy val double =
        """[0-9]+\.[0-9]+""".r ^^ (s => Num(s.toDouble))

    lazy val integer =
        "[0-9]+".r ^^ (s => Num(s.toInt))

    lazy val variable =
        idn ^^ Var

    lazy val idn =
        not(keyword) ~> "[a-zA-Z][a-zA-Z0-9]*".r

    lazy val keyword =
        keywords("[^a-zA-Z0-9]".r, List("while"))

}

More examples: @inkytonik/kiama
Maintained by Anthony Sloane

Kleenex

Kleenex (github)
Main website: kleenexlang.org
Works with C
Parsing algorithm: FST

Example from @diku-kmc/kleenexlang:

main := (num /[^0-9]/ | other)*
num := digit{1,3} ("," digit{3})*
digit := /[0-9]/
other := /./

Maintained by Niels Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, Sebastian Paaske Tørholm

kwParsing

kwParsing (download)
Main website: starship.skyport.net
Works with Python2
Parsing algorithm: ?

Example from starship.skyport.net:

GRAMMARSTRING ="""
       Value ::  ## indicates Value is the root nonterminal for the grammar
         @R SetqRule :: Value >> ( setq var Value )
         @R ListRule :: Value >> ( ListTail
         @R TailFull :: ListTail >> Value ListTail
         @R TailEmpty :: ListTail >> )
         @R Varrule :: Value >> var
         @R Intrule :: Value >> int
         @R Strrule :: Value >> str
         @R PrintRule :: Value >> ( print Value )
"""

Maintained by Aaron Watters

Laja

Laja version 2-007-beta (sourceforge)
Main website: laja.tengstrand.nu
Works with Java
Parsing algorithms: scannerless, top-down, backtracking

Example from laja.tengstrand.nu:

grammar example {
   digit = "5";
   example = digit;

   Example example;
   example.setDigit(String digit);
}

More examples: laja.tengstrand.nu
Maintained by Joakim Tengstrand

LGI

Language Generator by Instil

LGI version 0.95 (sourceforge)
Main website: sourceforge.net
Works with Java
Parsing algorithm: Packrat

Example from sourceforge.net:

Grammar    <- Spacing Definition+ EndOfFile
Definition <- Identifier LEFTARROW Expression
Expression <- Sequence (SLASH Sequence)*
Sequence   <- Prefix*
Prefix     <- (AND / NOT)? Suffix
Suffix     <- Primary (QUESTION / STAR / PLUS)?
Primary    <- Identifier !LEFTARROW
            / OPEN Expression CLOSE
            / Literal / DOT

Identifier <- '[a-zA-Z_][a-zA-Z_0-9]*' Spacing
Literal    <- "'" (!"'" Char)* "'" Spacing
            / '"' (!'"' Char)* '"' Spacing
Char       <- '\\[nrt\'"\+\-\*\.\?\^\$\{\}\(\)\[\]\\]'
            / '\\[0-2][0-7][0-7]'
            / '\\[0-7][0-7]?'
            / !'\\' .

Maintained by Edgar A. Duenez-Guzman

MetaEdit

MetaEdit+

MetaEdit (download)
Main website: metacase.com
Wikipedia: (English)
Parsing algorithm: drag and drop
Example from metacase.com:
More examples: metacase.com
Maintained by MetaCase

ML-Yacc

ML-Yacc version 2.4 (download)
Main website: smlnj.org
Works with ML
Parsing algorithm: LALR

Example from smlnj.org:

fun lookup "bogus" = 10000
  | lookup s = 0

%%

%eop EOF SEMI

%pos int

%left SUB PLUS
%left TIMES DIV
%right CARAT

%term ID of string | NUM of int | PLUS | TIMES | PRINT |
      SEMI | EOF | CARAT | DIV | SUB
%nonterm EXP of int | START of int option

%name Calc

%subst PRINT for ID
%prefer PLUS TIMES DIV SUB
%keyword PRINT SEMI

%noshift EOF
%value ID ("bogus")
%nodefault
%verbose
%%

START : PRINT EXP (print EXP;
                   print "\n";
                   flush_out std_out; SOME EXP)
      | EXP (SOME EXP)
      | (NONE)
EXP : NUM             (NUM)
    | ID              (lookup ID)
    | EXP PLUS EXP    (EXP1+EXP2)
    | EXP TIMES EXP   (EXP1*EXP2)
    | EXP DIV EXP     (EXP1 div EXP2)
    | EXP SUB EXP     (EXP1-EXP2)
    | EXP CARAT EXP   (let fun e (m,0) = 1
                              | e (m,l) = m*e(m,l-1)
                       in e (EXP1,EXP2)       
                       end)

More examples: smlnj.org
Maintained by David R. Tarditi and Andrew W. Appel

Mouse

Mouse version 2.3 (sourceforge)
Main website: romanredz.se
Works with Java
Parsing algorithms: recursive descent, backtracking

Example from romanredz.se:

MethodDeclaration
    = MethodModifier* MethodHeader MethodBody ;

MethodHeader
    = Result MethodDeclarator Throws?
    / TypeParameters Annotation* Result MethodDeclarator Throws?
    ;

MethodDeclarator
    = Identifier LPAR FormalParameterList? RPAR Dim* ;

FormalParameterList
    = (ReceiverParameter / FormalParameter)(COMMA FormalParameter)* ;

FormalParameter
    = VariableModifier* UnannType VariableDeclaratorId
    / VariableModifier* UnannType Annotation* ELLIPSIS VariableDeclaratorId !COMMA
    ;

VariableModifier
    = Annotation
    / FINAL
    ;

ReceiverParameter
    = VariableModifier* UnannType (Identifier DOT)? THIS ;

Result
    = UnannType
    / VOID
    ;

MethodModifier
    = Annotation
    / PUBLIC
    / PROTECTED
    / PRIVATE
    / ABSTRACT
    / STATIC
    / FINAL
    / SYNCHRONIZED
    / NATIVE
    / STRICTFP
    ;

Throws
    = THROWS ExceptionTypeList ;

ExceptionTypeList
    = ExceptionType (COMMA ExceptionType)* ;

ExceptionType
    = ClassType
    / TypeVariable
    ;

MethodBody
    = Block
    / SEMI
    ;

Maintained by Roman Redziejowski

MPS

parboiled

parboiled version 1.3.1 (download)
Main website: @sirthias/parboiled
Works with Java, Scala
Parsing algorithm: Packrat

Example from @sirthias/parboiled:

import org.parboiled.scala._

class SimpleCalculator extends Parser
{
  def Expression: Rule0 = rule { Term ~ zeroOrMore(anyOf("+-") ~ Term) }
  def Term = rule { Factor ~ zeroOrMore(anyOf("*/") ~ Factor) }
  def Factor = rule { Number | "(" ~ Expression ~ ")" }
  def Number = rule { oneOrMore("0" - "9") }
}

Maintained by Edgar A. Duenez-Guzman

Parsec

Parsec version 3.1.15.0 (download)
Main website: wiki.haskell.org
Works with Haskell, F#, Java, JavaScript, Erlang
Wikipedia: (English)
Parsing algorithms: monadic, recursive descent

Example from legacy.cs.uu.nl:

import ParsecExpr

expr    :: Parser Integer
expr    = buildExpressionParser table factor
        <?> "expression"

table   = [[op "*" (*) AssocLeft, op "/" div AssocLeft]
          ,[op "+" (+) AssocLeft, op "-" (-) AssocLeft]
          ]          
        where
          op s f assoc
             = Infix (do{ string s; return f}) assoc

factor  = do{ char '('
            ; x <- expr
            ; char ')'
            ; return x 
            }
        <|> number
        <?> "simple expression"

number  :: Parser Integer
number  = do{ ds <- many1 digit
            ; return (read ds)
            }
        <?> "number"

Maintained by Antoine Latter
Built on top of Parsec:

ParseLib

ParseLib version 2015.1.1 (download)
Main website: hackage.haskell.org
Works with Haskell
Parsing algorithm: monadic

Example from haskell.org:

exprparser :: Parser Expr
exprparser = buildExpressionParser table term <?> "expression"
table = [ [Prefix (m_reservedOp "~" >> return (Uno Not))]
        , [Infix (m_reservedOp "&" >> return (Duo And)) AssocLeft]
        , [Infix (m_reservedOp "=" >> return (Duo Iff)) AssocLeft]
        ]
term = m_parens exprparser
       <|> fmap Var m_identifier
       <|> (m_reserved "true" >> return (Con True))
       <|> (m_reserved "false" >> return (Con False))

Maintained by Jurriën Stutterheim and João Paulo Pizani Flor
Built on top of ParseLib:
- ParseLib in Frege/JVM

parsnip

parsnip version 0.23 (sourceforge)
Main website: parsnip-parser.sourceforge.net
Works with C++
Parsing algorithm: Packrat

Example from spikebucket.blogspot.be:

double multiply(double x, double y) { return x*y; }
double add(double x, double y) { return x+y; }
double subtract(double x, double y) { return x-y; }
double divide(double x, double y) { return x/y; }

typedef Parser<string, double>::type NumParser;

NumParser ops = op_table(real)
                 ->infix_left("+", 10, add)
                 ->infix_left("-", 10, subtract)
                 ->infix_left("*", 20, multiply)
                 ->infix_left("/", 20, divide);

ParseResult<double> result;
 
result = parse("3+4*2", ops);

if (result.parse_finished())
{
  std::cout << parse.data(); 
}

Maintained by Alex Rubinsteyn

PetitParser

Helvetia / Pharo version 9 (download) (download)
Main website: scg.unibe.ch
Works with Smalltalk
Parsing algorithms: scannerless, monadic, Packrat, contexts

Example from lukas-renggli.ch:

identifier := (#letter asParser , (#letter asParser / #digit asParser) star) flatten.
number :=  #digit asParser plus token trim ==> [ :token | token value asNumber ].

term := PPUnresolvedParser new.
prod := PPUnresolvedParser new.
prim := PPUnresolvedParser new.
 
term def: (prod , $+ asParser trim , term ==> [ :nodes | nodes first + nodes last ])
   / prod.
prod def: (prim , $* asParser trim , prod ==> [ :nodes | nodes first * nodes last ])
   / prim.
prim def: ($( asParser trim , term , $) asParser trim ==> [ :nodes | nodes second ])
   / number.

start := term end.

Maintained by Tudor Gîrba, Jan Kurš
Built on top of Helvetia / Pharo:

PLY

Python Lex-Yacc

PLY version 3.11 (download)
Main website: dabeaz.com
Works with Python2, Python3
Parsing algorithm: LR

Example from dabeaz.com:

names = { }

def p_statement_assign(t):
    'statement : NAME EQUALS expression'
    names[t[1]] = t[3]

def p_statement_expr(t):
     'statement : expression'
    print(t[1])

def p_expression_binop(t):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression'''
    if t[2] == '+'  : t[0] = t[1] + t[3]
    elif t[2] == '-': t[0] = t[1] - t[3]
    elif t[2] == '*': t[0] = t[1] * t[3]
    elif t[2] == '/': t[0] = t[1] / t[3]

def p_expression_uminus(t):
    'expression : MINUS expression %prec UMINUS'
    t[0] = -t[2]

def p_expression_group(t):
    'expression : LPAREN expression RPAREN'
    t[0] = t[2]

def p_expression_number(t):
    'expression : NUMBER'
    t[0] = t[1]

def p_expression_name(t):
    'expression : NAME'
    try:
        t[0] = names[t[1]]
    except LookupError:
        print("Undefined name '%s'" % t[1])
        t[0] = 0

More examples: dabeaz.com
Maintained by David Beazley

PRECC

PREttier Compiler-Compiler

PRECC version 2.60 (download)
Main website: preccx.sourceforge.net
Works with C
Parsing algorithm: LL(k)

Example from preccx.sourceforge.net:

integer = [ <'+'>|<'-'> ]
          unsigned_int
          [ {<'E'>|<'e'>} [<'+'>] unsigned_int ]

unsigned_int = (isdigit)+

Maintained by Peter Breuer and Jonathan Bowen

PyBison

PyBison version 0.6.1
Main website: @lukeparser/pybison
Works with Python2
Parsing algorithm: LALR(1)

Example from freenet.mcnabhosting.com:

def on_exp(self, target, option, names, values):
    """
    exp : NUMBER
        | exp PLUS exp
        | exp MINUS exp
        | exp TIMES exp
        | exp DIVIDE exp
        | MINUS exp %prec NEG
        | exp POW exp
        | LPAREN exp RPAREN
    """
    if option == 0:
        return float(values[0])
    elif option == 1:
        return values[0] + values[2]
    elif option == 2:
        return values[0] - values[2]
    elif option == 3:
        return values[0] * values[2]
    elif option == 4:
        return values[0] / values[2]
    elif option == 5:
        return - values[1]
    elif option == 6:
        return values[0] ** values[2]
    elif option == 7:
        return values[1]

More examples: freenet.mcnabhosting.com
Maintained by David McNab
Not to be confused with PyBison by Scott Hassan!

PyLR

PyLR (download)
Main website: starship.skyport.net
Works with Python2
Parsing algorithm: LR

Example from starship.skyport.net:

_class SimpleMathParser
_lex mathlex.mathlex()
_code from PyLR.Lexers import mathlex
"""
expression: expression PLUS term (addfunc)
     |      term;

term: term TIMES factor (timesfunc)
  |   factor;

factor: LPAREN expression RPAREN (parenfunc)
   |    INT;
"""

Maintained by Scott

pyparsing

pyparsing version 2.2.0 (sourceforge)
Main website: pyparsing.wikispaces.com
Works with Python2, Python3
Parsing algorithms: scannerless, recursive descent

Example from pyparsing.wikispaces.com:

from pyparsing import Word, alphas
greet = Word( alphas ) + "," + Word( alphas ) + "!"
hello = "Hello, World!"
print (hello, "->", greet.parseString( hello ))

More examples: pyparsing.wikispaces.com
Maintained by Paul McGuire

pysec

pysec
Main website: valuedlessons.com
Works with Python2
Parsing algorithms: monadic, recursive descent

Example from valuedlessons.com:

from Pysec import Parser, choice, quoted_chars, group_chars, option_chars, digits, between, pair, spaces, match, quoted_collection

json_choices = []
json         = choice(json_choices)
text         = quoted_chars("'", "'")
number       = group_chars([option_chars(["-"]), digits, option_chars([".", digits])]) >> Parser.lift(float)
joiner       = between(spaces, match(","), spaces)
mapping_pair = pair(text, spaces & match(":") & spaces & json)
collection   = quoted_collection("[", spaces, json,         joiner, "]") >> Parser.lift(list)
mapping      = quoted_collection("{", spaces, mapping_pair, joiner, "}") >> Parser.lift(dict)
json_choices.extend([text, number, mapping, collection])

print json.parseString("{'a' : -1.0, 'b' : 2.0, 'z' : {'c' : [1.0, [2.0, [3.0]]]}}")

Maintained by Peter Thatcher

Racket

Racket version 8.3 (download)
Main website: racket-lang.org
Works with Racket
Wikipedia: (English) (French) (German) (Russian)
Parsing algorithm: LALR(1)

Example from gist.github.com:

(define simple-math-lexer
  (lexer
   ("-" (token--))
   ("+" (token-+))
   ("let" (token-LET))
   ("in" (token-IN))
   ((re-+ number10) (token-NUM (string->number lexeme)))
   (identifier      (token-VAR lexeme))
   ;; recursively calls the lexer which effectively skips whitespace
   (whitespace (simple-math-lexer input-port))
   ((eof) (token-EOF))))

(define simple-math-parser
  (parser
   (start exp)
   (end EOF)
   (error void)
   (tokens a b)
   (precs (left - +))
   (grammar
    (exp ((LET VAR NUM IN exp)
          (make-let-exp $2 (num-exp $3) $5))
         ((NUM) (num-exp $1))
         ((VAR) (var-exp $1))
         ((exp + exp) (make-arith-exp + $1 $3))
         ((exp - exp) (make-arith-exp - $1 $3))))))

More examples: matt.might.net, docs.racket-lang.org, docs.racket-lang.org

Ragel

Ragel version 7.0.4 (download)
Main website: colm.net
Works with C, C++, Intel Asm, Obj-C, D, Go, Ruby, Java
Wikipedia: (English) (French) (German) (Russian)
Parsing algorithm: FSM

Example from colm.net:

action dgt      { printf("DGT: %c\n", fc); }
action dec      { printf("DEC: .\n"); }
action exp      { printf("EXP: %c\n", fc); }
action exp_sign { printf("SGN: %c\n", fc); }
action number   { /*NUMBER*/ }

number = (
    [0-9]+ $dgt ( '.' @dec [0-9]+ $dgt )?
    ( [eE] ( [+\-] $exp_sign )? [0-9]+ $exp )?
) %number;

main := ( number '\n' )*;

Maintained by Adrian D. Thurston
Built on top of Ragel:
- Colm

Rascal

Rascal version 0.18.3 (download)
Main website: rascal-mpl.org
Wikipedia: (English)
Parsing algorithm: GLL

Example from @usethesource/rascal:

start syntax Program 
  = program: "begin" Declarations decls {Statement  ";"}* body "end" ;

syntax Declarations 
  = "declare" {IdType ","}* decls ";" ;  
 
syntax IdType = idtype: Id id ":" Type t;

syntax Statement 
  = assign: Id var ":="  Expression val 
  | cond: "if" Expression cond "then" {Statement ";"}*  thenPart "else" {Statement ";"}* elsePart "fi"
  | cond: "if" Expression cond "then" {Statement ";"}*  thenPart "fi"
  | loop: "while" Expression cond "do" {Statement ";"}* body "od"
  ;  
     
syntax Type 
  = natural:"natural" 
  | string :"string" 
  | nil    :"nil-type"
  ;

syntax Expression 
  = id: Id name
  | strcon: String string
  | natcon: Natural natcon
  | bracket "(" Expression e ")"
  > left concat: Expression lhs "||" Expression rhs
  > left ( add: Expression lhs "+" Expression rhs
         | min: Expression lhs "-" Expression rhs
         )
  ;

           
lexical Id  = [a-z][a-z0-9]* !>> [a-z0-9];
lexical Natural = [0-9]+ ;
lexical String = "\"" ![\"]*  "\"";

layout Layout = WhitespaceAndComment* !>> [\ \t\n\r%];

lexical WhitespaceAndComment 
   = [\ \t\n\r]
   | @category="Comment" "%" ![%]+ "%"
   | @category="Comment" "%%" ![\n]* $
   ;

public start[Program] program(str s) {
  return parse(#start[Program], s);
}

public start[Program] program(str s, loc l) {
  return parse(#start[Program], s, l);
}

More examples: tutor.rascal-mpl.org
Maintained by Paul Klint, Jurgen Vinju and Tijs van der Storm

RecDescent

Parse::RecDescent

RecDescent version 1.967015 (download)
Main website: metacpan.org
Works with Perl
Parsing algorithm: recursive descent

Example from metacpan.org:

parser = Parse::RecDescent->new(q{
<autoaction: { [@item] } >
 
expression: and_expr '||' expression | and_expr
and_expr:   not_expr '&&' and_expr   | not_expr
not_expr:   '!' brack_expr       | brack_expr
brack_expr: '(' expression ')'       | identifier
identifier: /[a-z]+/i
});

Maintained by Jeremy T. Braun

SableCC

SableCC version 3.7 (download)
Main website: sablecc.org
Works with Java
Wikipedia: (English)
Parsing algorithm: LALR(1)

Example from engr.mun.ca:

Grammar expression;

Lexer

  num = digit+;
  digit = '0'..'9';
  blanks = (' ' | eol | tab)+;

  eol = cr | lf | cr lf;
  cr = #13;
  lf = #10;
  tab = #9;

Parser

  Ignored
    blanks;

  program =
    exp ';';

  exp =
    {mul:} exp [op:]'*' exp |
    {add:} exp [op:]'+' exp |
    {num:} num;

    Precedence
      Left mul;
      Left add;

More examples: @SableCC/sablecc
Maintained by Etienne M. Gagnon, Patrick Pelletier, Jean Privat, Jérôme Dassonville and Lucas Satabin

Silver

Silver version 0.4.4 (github)
Main website: melt.cs.umn.edu
Works with Silver
Parsing algorithms: context-aware, LALR(1), AG

Example from melt.cs.umn.edu:

nonterminal Root_c ;
synthesized attribute pp :: String ;
synthesized attribute ast_Root :: Root;
attribute pp, ast_Root occurs on Root_c ;

concrete production root_c
r::Root_c ::= e::Expr_c
{ r.pp = e.pp;
  r.ast_Root = root(e.ast_Expr);
}

synthesized attribute ast_Expr :: Expr ;
nonterminal Expr_c with pp, ast_Expr;
nonterminal Term_c with pp, ast_Expr;
nonterminal Factor_c with pp, ast_Expr;

concrete production add_c
sum::Expr_c ::= e::Expr_c ’+’ t::Term_c
{ sum.pp = e.pp ++ " + " ++ t.pp ;
  sum.ast_Expr = add(e.ast_Expr, t.ast_Expr );
}

concrete production exprTerm_c
e::Expr_c ::= t::Term_c
{ e.pp = t.pp ;
  e.ast_Expr = t.ast_Expr ;
}

concrete production mul_c
prd::Term_c ::= t::Term_c ’*’ f::Factor_c
{ prd.pp = t.pp ++ " * " ++ f.pp ;
  prd.ast_Expr = mul(t.ast_Expr, f.ast_Expr);
}

concrete production termFactor_c
t::Term_c ::= f::Factor_c
{ t.pp = f.pp ;
  t.ast_Expr = f.ast_Expr ;
}

concrete production integerConstant_c
ic::Factor_c ::= i::IntLit_t
{ ic.pp = i.lexeme ;
  ic.ast_Expr = integerConstant(i);
}

More examples: melt.cs.umn.edu
Maintained by Ted Kaminski, Lucas Kramer and Eric Van Wyk

SimpleParse

SimpleParse version 2.1.1a2 (sourceforge)
Main website: simpleparse.sourceforge.net
Works with Python2
Parsing algorithms: top-down, one-pass

Example from simpleparse.sourceforge.net:

declaration = r'''# note use of raw string when embedding in python code...
file           :=  [ \t\n]*, section+
section        :=  '[', identifier!, ']'!, ts, '\n', body
body           :=  statement*
statement      :=  (ts, semicolon_comment) / equality / nullline
nullline       :=  ts, '\n'
comment        :=  -'\n'*
equality       :=  ts, identifier, ts, '=', ts, identified, ts, '\n'
identifier     :=  [a-zA-Z], [a-zA-Z0-9_]*
identified     :=  string / number / identifier
ts             :=  [ \t]*
'''

Maintained by Mike C. Fletcher

SJPT

Simple Java Parsing Toolkit

SJPT version 0.10 (sourceforge)
Main website: sjpt.sourceforge.net
Works with Java
Parsing algorithms: LL(1), LR(0), SLR(1), LR(1), LALR(1)

Example from prdownloads.sourceforge.net:

import ro.infoiasi.donald.compiler.parser0.runtime.*;
   
terminal SEMI, PLUS, MINUS, TIMES, DIVIDE, LPAREN, RPAREN;
terminal Integer NUMBER, ID;
   
non terminal Object expr_list, expr_part;
non terminal Integer expr, factor, term;
    
expr_list ::= expr_list expr_part
            | expr_part;
      
expr_part ::= expr:e {: System.out.println(" = " + e); :} SEMI;
   
expr      ::= factor:f PLUS expr:e {: RESULT = new Integer(f.intValue() + e.intValue()); :}
            | factor:f MINUS expr:e {: RESULT = new Integer(f.intValue() - e.intValue()); :}
            | factor:f {: RESULT = new Integer(f.intValue()); :};
   
factor    ::= factor:f TIMES term:t {: RESULT = new Integer(f.intValue() * t.intValue()); :}
            | factor:f DIVIDE term:t {: RESULT = new Integer(f.intValue() / t.intValue()); :}
            | term:t {: RESULT = new Integer(t.intValue()); :};
   
   
term      ::= LPAREN expr:e RPAREN {: RESULT = e; :}
            | NUMBER:n {: RESULT = n; :}
            | ID:i {: RESULT = i; :};

Maintained by Catalin Hritcu

SmaCC

SmaCC version 2.0.5 (download) (github)
Main website: refactoryworkers.com
Works with Smalltalk
Parsing algorithms: LR(1), LALR(1), GLR

Example from refactoryworkers.com:

<number> : [0-9]+ (\. [0-9]*) ? ;
<whitespace> : \s+;
%left "+" "-";
%left "*" "/";
%right "^";

Expression 
	: Expression 'exp1' "+" Expression 'exp2' {exp1 + exp2}
	| Expression 'exp1' "-" Expression 'exp2' {exp1 - exp2}
	| Expression 'exp1' "*" Expression 'exp2' {exp1 * exp2}
	| Expression 'exp1' "/" Expression 'exp2' {exp1 / exp2}
	| Expression 'exp1' "^" Expression 'exp2' {exp1 raisedTo: exp2}
	| "(" Expression 'expression' ")" {expression}
	| Number 'number' {number}
	;
Number 
	: <number> 'numberToken' {numberToken value asNumber}
	;

More examples: @SquareBracketAssociates/Booklet-Smacc
Maintained by John Brant and Thierry Goubier

SPARK

Scanning, Parsing, and Rewriting Kit

SPARK version 0.7 pre-alpha-7 (download)
Main website: pages.cpsc.ucalgary.ca
Works with Python2
Parsing algorithm: ?

Example from pages.cpsc.ucalgary.ca:

class ExprParser(GenericParser):
	def __init__(self, start='expr'):
		GenericParser.__init__(self, start)

	def p_rules(self args):
		'''
			expr ::= expr + term
			expr ::= term
			term ::= term * factor
			term ::= factor
			factor ::= number
			factor ::= float
		'''

Maintained by John Aycock

Spirit

Spirit version 3.0.0 (download)
Main website: boost-spirit.com
Works with C++
Wikipedia: (English) (German)
Parsing algorithms: backtracking, LL(*)

Example from English:

#include <string>
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
 
int main()
{
  namespace qi = boost::spirit::qi;

  std::string input;
 
  std::cout << "Input a line: \n";
  getline(std::cin, input);
  std::cout << "Got '" << input << "'.\n";
 
  unsigned count = 0;

  auto rule = *(qi::lit("cat") [ ++qi::_val ] | qi::omit[qi::char_]);
  qi::parse(input.begin(), input.end(), rule, count);
  
  std::cout << "The input contained " << count << " occurrences of 'cat'\n";
}

Maintained by Joel de Guzman, Hartmut Kaiser, Carl Barron, François Barel, Dan Nuffer, Ben Hanson, Martijn van der Lee, John David, Juan Carlos Arevalo-Baeza, Martin Wille, Peter Simons, Jeff Westfahl, João Abecasis, Dan Marsden, Nicola Musatti, Tobias Schwinger, Thomas Heller, Bryce Lelbach, Jeroen Habraken, Agustín Bergé, Kohei Takahashi

Spoofax

Spoofax version 2.5.16 (download)
Main website: metaborg.org
Works with SDF3
Wikipedia: (English)
Parsing algorithm: SGLR

Example from @MetaBorgCube/metaborg-pascal:

module Pascal

context-free start-symbols Program

context-free syntax
  
  Program.Program = <
    <Programheading> 
    <Block>.
  >

  Programheading.ProgramHeading = <
    program <Identifier> (<{Identifier ", "}*>);
  > {case-insensitive}

  Block.Block = <
    <LabelDeclarations?>
    <ConstantDefinitions?>
    <TypeDefinitions?>
    <VariableDeclarations?>
    
    <{ProcedureOrFunctionDeclaration "\n\n"}*>
    
    begin  
      <{Statement ";\n"}*>     
      <Term?>
    end
  > {case-insensitive} 

  LabelDeclarations.LabelDecls = <
    label 
       <{Label ", "}+>;
  > {case-insensitive}
  
  ConstantDefinitions.ConstDefs = <
    const 
       <{ConstantDefinition "\n"}+>
  > {case-insensitive}

  ConstantDefinition.ConstDef =
    <<Identifier> = <Constant>;>
  
  TypeDefinitions.TypeDefs = <
    type 
      <{TypeDefinition "\n"}+>
  > {case-insensitive}

  TypeDefinition.TypeDef =
    <<Identifier> = <Type>;>
  
  VariableDeclarations.VarDecls = <
    var 
       <{VariableDeclaration "\n"}+>
  > {case-insensitive}

  VariableDeclaration.VarDecl =
    <<{VariableIdentifier ", "}+> : <Type>;>
    
  VariableIdentifier.VarId = <<Identifier>>

...

More examples: declare-your-language.metaborg.org
Maintained by Eelco Visser et al
Used with:

SugarJ

SugarJ (download) (github)
Main website: erdweg.org
Works with Java
Example from student.informatik.tu-darmstadt.de:
More examples: @sugar-lang/case-studies
Maintained by Sebastian Erdweg

SwiftParsec

SwiftParsec version 4.0.1 (github)
Main website: @davedufresne/SwiftParsec
Works with Swift
Parsing algorithms: monadic, recursive descent

Example from @davedufresne/SwiftParsec:

let noneOf = StringParser.noneOf

let quotedChars = noneOf("\"") <|>
    StringParser.string("\"\"").attempt *>
    GenericParser(result: "\"")

let character = StringParser.character

let quote = character("\"")
let quotedField = quote *> quotedChars.many.stringValue <*
    (quote <?> "quote at end of field")

let field = quotedField <|> noneOf("\r\n,\n\r").many.stringValue
let record = field.separatedBy(character(","))

let endOfLine = StringParser.crlf.attempt <|>
    (character("\n") *> character("\r")).attempt <|>
    character("\n") <|>
    character("\r") <?> "end of line"

let csv = record.separatedBy(endOfLine)

More examples: @davedufresne/SwiftParsec
Maintained by David Dufresne

Takmela

Takmela (github)
Main website: @samiz/takmela
Works with Java
Parsing algorithms: recursive descent, memoisation

Example from @samiz/takmela:

expr : expr PLUS term 
    | term ;
term : 
      ID 
    | NUM
    | expr OPENP expr CLOSEP;
    
ID: [a-zA-Z] [a-zA-Z0-9]*;

NUM: [0-9]+;

PLUS: '+';
OPENP: '(';
CLOSEP: ')';
WS skip: ' '*;

Maintained by Mohamed Samy

TXL

FreeTxl version 10.8a (download)
Main website: txl.ca
Works with TXL, XML
Wikipedia: (English)
Parsing algorithms: top-down, backtracking

Example from txl.ca:

define program 
        [compilation_unit]
end define

define compilation_unit
        [definition_module] 
    |   [opt 'IMPLEMENTATION] [program_module]
end define

define definition_module
                                        [NL]
        DEFINITION MODULE [id] ;        [NL][NL]
                [repeat import_]
                [repeat definition]
        END [id] .                      [NL]
end define

define program_module
        MODULE [id] [opt priority] ;    [NL][NL]
                [repeat import_]
                [block] [id] .          
end define

More examples: txl.ca
Maintained by James Cordy

vembyr

vembyr version 1.1 (sourceforge)
Main website: code.google.com
Works with C++, Python2, Ruby
Parsing algorithm: Packrat
Maintained by Jon Rafkind

Waxeye

Waxeye version 0.8.0 (sourceforge)
Main website: waxeye.org
Works with C, Java, JavaScript, Python2, Ruby, Scheme
Parsing algorithm: Packrat

Example from waxeye.org:

calc  <- ws sum

sum   <- prod *([+-] ws prod)

prod  <- unary *([*/] ws unary)

unary <= '-' ws unary
       | :'(' ws sum :')' ws
       | num

num   <- +[0-9] ?('.' +[0-9]) ws

ws    <: *[ \t\n\r]

More examples: @orlandohill/waxeye
Maintained by Orlando Hill

Whimsy

Autumn version 1.2.0 (github)
Main website: @norswap/whimsy
Works with Kotlin
Parsing algorithms: PEG, monadic

Example from @norswap/whimsy:

import norswap.autumn.Grammar
import norswap.autumn.parsers.*

class RegexGrammar: Grammar()
{
     fun meta_char()
        = char_set("|*+?()\\")

    fun regular_char()
        = seq { not { meta_char() } && char_any() }

    fun quoted_char()
        = seq { string("\\") && char_any() }

    fun character()
        = choice { quoted_char() || regular_char() }

    fun paren_group(): Boolean
        = seq { string("(") && alternation() && string(")") }

    fun atom()
        = choice { paren_group() || character() }

    fun repetition_char()
        = char_set("*+?")

    fun repetition()
        = seq { atom() && repeat0 { repetition_char() } }

    fun concatenation()
        = repeat1 { repetition() }

    fun alternation()
        = around1 ({ concatenation() } , { string("|") })

    override fun root() =
        alternation()
}

Maintained by Nicolas Laurent

Whole

Whole Platform

Whole version 1.0.0 (sourceforge)
Main website: whole.sourceforge.net
Works with Java
Example from flickr.com:
Maintained by Riccardo Solmi

xtc

eXTensible Compiler

xtc version 2.4.0 (download)
Main website: cs.nyu.edu
Works with Java
Parsing algorithm: Packrat

Example from cs.nyu.edu:

Module        <- Spacing Intro Production* EOF
Intro         <- ModuleDecl Dependency* Header? Body? Footer? Option?
ModuleDecl    <- "module" FSpacing ModuleRef SEMICOLON
Dependency    <- Modification / Instantiation / Import
Modification  <- "modify" FSpacing ModuleRef ModuleTarget? SEMICOLON
Instantiation <- "instantiate" FSpacing ModuleRef ModuleTarget? SEMICOLON
Import        <- "import" FSpacing ModuleRef ModuleTarget? SEMICOLON
ModuleRef     <- QName ModuleParams?
ModuleParams  <- OPEN ( QName (COMMA QName)* )? CLOSE
ModuleTarget  <- "as" FSpacing QName
Header        <- "header" Spacing Action
Body          <- "body" Spacing Action
Footer        <- "footer" Spacing Action
Option        <- "option" FSpacing Attribute (COMMA Attribute)* SEMICOLON
Production    <- Full / Addition / Removal / Override
Full          <- PAttributes QName Identifier EQUAL Choice SEMICOLON
Addition      <- QName Identifier PLUSEQUAL
                 ( SName ELLIPSIS SLASH Choice SEMICOLON
                 / Choice SLASH SName ELLIPSIS SEMICOLON )
Removal       <- QName Identifier MINUSEQUAL
                 SName ( COMMA SName )* SEMICOLON
Override      <- QName Identifier COLONEQUAL Choice SEMICOLON
                 / QName Identifier COLONEQUAL ELLIPSIS SLASH Choice SEMICOLON
                 / QName Identifier COLONEQUAL Choice SLASH ELLIPSIS SEMICOLON
                 / PAttributes QName Identifier COLONEQUAL ELLIPSIS SEMICOLON

More examples: cs.nyu.edu
Maintained by Robert Grimm

Xtext

Xtext version 2.25.0 (download)
Main website: eclipse.org
Works with Java
Wikipedia: (English) (German) (Spanish)

Example from eclipse.org:

grammar org.example.domainmodel.Domainmodel
with org.eclipse.xtext.common.Terminals
 
generate domainmodel "http://www.example.org/domainmodel/Domainmodel"
 
Domainmodel :
    (elements+=Type)*;
  
Type:
    DataType | Entity;
  
DataType:
    'datatype' name=ID;
 
Entity:
    'entity' name=ID ('extends' superType=[Entity])? '{'
        (features+=Feature)*
    '}';
 
Feature:
    (many?='many')? name=ID ':' type=[Type];

More examples: eclipse.org
Used with:
- ANTLR

yacc

Yet Another Compiler-compiler

yacc
Main website: dinosaur.compilertools.net
Works with C
Wikipedia: (English) (Dutch) (French) (German) (Russian)
Parsing algorithm: LALR

Example from dinosaur.compilertools.net:

%left  '+'  '-'
%left  '*'  '/'

%%

expr    :       expr  '+'  expr
        |       expr  '-'  expr
        |       expr  '*'  expr
        |       expr  '/'  expr
        |       '-'  expr      %prec  '*'
        |       NAME
        ;

Maintained by Stephen C. Johnson
Used with:
- lex (English) (French) (Dutch) (Russian)

YAPPS

Yet Another Python Parser System

YAPPS version 2 (download)
Main website: theory.stanford.edu
Works with Python1.5
Parsing algorithm: recursive descent

Example from theory.stanford.edu:

parser Calculator:
    option:    "context-insensitive-scanner"
    token END: "$"
    token NUM: "[0-9]+"

    rule goal:           expr END                         -> << expr        >>
    rule expr:           factor expr_tail<<factor>>       -> << expr_tail   >>
    rule expr_tail<<v>>:                                  -> << v           >>
                       | "+" factor expr_tail<<v+factor>> -> << expr_tail   >>
                       | "-" factor expr_tail<<v-factor>> -> << expr_tail   >>
    rule factor:         term factor_tail<<term>>         -> << factor_tail >>
    rule factor_tail<<v>>:                                -> << v           >>
                       | "*" term factor_tail<<v*term>>   -> << factor_tail >>
                       | "/" term factor_tail<<v/term>>   -> << factor_tail >>
    rule term:           NUM                              -> << atoi(NUM)   >>
                       | "(" expr ")"                     -> << expr        >>

Maintained by Amit Patel

Raincode Labs Compiler Coding Dojo

CoCoDo @ ‹Programming›

22 March 2021, online

2021

Programme (times in UTC!)

Post-Proceedings Program Committee

2019

Programme

Links

2018

Programme

Links

2017

Programme

Links

Technologies

ANTLR

Beaver

Bison

BiYacc

BtYacc

BtYacc

byacc

Berkeley Yacc

BYACC/J

Berkeley Yacc Java extension

Coco/R

Coco/R

Copper

Copper

DCG

Definite Clause Grammars

Eli

Ensō

Frown

GDK

Grammar Deployment Kit

GOLD

GOLD Parsing System

Happy

Iguana

Irony

JastAdd

JavaCC

jparsec

jparsec

Kiama

Kleenex

kwParsing

kwParsing

Laja

LGI

Language Generator by Instil

MetaEdit

MetaEdit+

ML-Yacc

ML-Yacc

Mouse

Mouse

MPS

parboiled

Parsec

ParseLib

ParseLib

parsnip

parsnip

PetitParser

PLY

Python Lex-Yacc

PRECC

PREttier Compiler-Compiler

PyBison

PyBison

PyLR

PyLR

pyparsing

pyparsing

pysec

pysec

Racket