BISON • Generates LALR (or GLR) parsers • Code in C, C++ or Java • reentrant with %define api.pure set • used by ALL THE THINGS • PHP • Ruby • PostgreSQL • Go
BISON IN C
LEMON • Generates LALR(1) parsers • reentrant AND thread safe • non-terminal destructors (leak avoidance) • push parsing (the tokenizer calls the parser) • used by SQLite
REENTRANT VS THREAD SAFE • Process • Thread • Locking • Scope • Reentrant
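A minimal sketch of the distinction, in Python purely for illustration (the slides show no code, and every name here is hypothetical). The first scanner keeps its position in module-level state, in the spirit of C's strtok(), so a second scan started midway clobbers the first. The second keeps all state on an object, which makes it reentrant.

```python
# Non-reentrant: one shared, module-level scan position (hypothetical names).
_words = []
_pos = 0


def start_scan(text):
    """Begin a scan, overwriting whatever scan was already in progress."""
    global _words, _pos
    _words, _pos = text.split(), 0


def next_word():
    """Return the next word of the single global scan, or None when done."""
    global _pos
    if _pos >= len(_words):
        return None
    word = _words[_pos]
    _pos += 1
    return word


class Scanner:
    """Reentrant: every scan carries its own state, so scans can interleave."""

    def __init__(self, text):
        self.words = text.split()
        self.pos = 0

    def next_word(self):
        if self.pos >= len(self.words):
            return None
        word = self.words[self.pos]
        self.pos += 1
        return word


def interleave_global():
    start_scan("a b")
    seen = [next_word()]
    start_scan("x y")          # the second scan replaces the first one's state
    seen.append(next_word())
    seen.append(next_word())   # we wanted "b" from the first scan here
    return seen


def interleave_reentrant():
    first, second = Scanner("a b"), Scanner("x y")
    return [first.next_word(), second.next_word(), first.next_word()]


print(interleave_global())      # ['a', 'x', 'y']
print(interleave_reentrant())   # ['a', 'x', 'b']
```

Reentrant is not automatically thread safe: giving each thread its own Scanner is fine, but two threads sharing one Scanner would still need a lock around next_word().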
COMPILE IT • transform programming language to computer language
INTERPRET IT • directly executes programming language
UNDER THE HOOD WHAT USES THIS STUFF?
PHP RE2C + Bison + these crazy opcodes….
LALR(1) WRITTEN BY HAND How - pythonic
HHVM Flex and Bison and JIT – OH MY!
SQLITE Lemon is tasty!
WRITING PARSERS AND LEXERS THEORIES OF CODING
STEP 1: THINK SMALL • Writing a general-purpose parser is hard – that’s why you use PHP • Writing a single-purpose parser is much easier • markup text (Markdown) • configuration or definition files (Behat/Gherkin syntax) • complex validation (addresses in multiple formats)
STEP 2: SEPARATE AND UNOPTIMIZED • premature optimization yada yada • combine after it’s ready to be used (or not at all if you’ll need to change it later) • lexer and parser each have unique, well-defined goals • the ability to potentially switch parser styles later will help you!
STEP 3: LEXER • the lexer's job is to recognize tokens • it can do this via a giant switch statement of doom • or maybe a giant loop • or maybe a list of goto statements • or maybe a complex class with methods • …. or you can just use a generator
LET’S BREAK THAT DOWN 1. Define a token format 2. Define a grammar format (what are we looking for?) 3. Go over the input data (usually a string) and make matches: compare, regex, ctype_*, or whatever makes sense 4. Keep track of your current state 5. Have an output format – AST, tree, whatever
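The five steps above can be sketched as a small hand-written lexer. This is one possible shape, not the method from the talk: it is written in Python for illustration, uses regex matching for step 3, and the token names (NUMBER, PLUS, and so on) and the arithmetic input language are assumptions made up for the example.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    """Step 1: the token format (hypothetical: kind, matched text, offset)."""
    kind: str
    text: str
    pos: int


# Step 2: what we are looking for, as (token kind, regex) pairs.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("STAR", r"\*"),
    ("SLASH", r"/"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),   # whitespace is matched but never emitted
]
MASTER = re.compile("|".join(f"(?P<{kind}>{rx})" for kind, rx in TOKEN_SPEC))


def lex(source: str) -> Iterator[Token]:
    """Steps 3 to 5: walk the input, track the position, yield tokens."""
    pos = 0                                  # step 4: the current state
    while pos < len(source):
        match = MASTER.match(source, pos)    # step 3: try to match here
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            # step 5: the output format is a flat stream of Token tuples
            yield Token(match.lastgroup, match.group(), match.start())


for token in lex("2 + 3*4"):
    print(token.kind, token.text)
```

The lexer only recognizes tokens. It has no idea whether "2 + * 3" is a valid expression, which is the parser's job.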
STEP 4: PARSER • Loop over our tokens • Look at the values and decide what to do
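As a rough illustration of "loop over the tokens and decide what to do", here is a tiny hand-written parser in Python. Note the swap: this is recursive descent, not the LALR(1) tables that Bison or Lemon would generate. The (kind, text) token pairs, the token names, and the nested-tuple tree format are all assumptions for the sketch.

```python
from typing import List, Tuple

Token = Tuple[str, str]   # hypothetical token format: (kind, text)


class Parser:
    def __init__(self, tokens: List[Token]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        """Kind of the next token, or 'EOF' once the input is used up."""
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def take(self, kind: str) -> str:
        """Consume one token of the given kind and return its text."""
        if self.peek() != kind:
            raise SyntaxError(
                f"expected {kind}, found {self.peek()} at token {self.pos}"
            )
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def expr(self):
        # expr := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in ("PLUS", "MINUS"):
            op = self.take(self.peek())
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (("*" | "/") factor)*
        node = self.factor()
        while self.peek() in ("STAR", "SLASH"):
            op = self.take(self.peek())
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | "(" expr ")"
        if self.peek() == "LPAREN":
            self.take("LPAREN")
            node = self.expr()
            self.take("RPAREN")
            return node
        return int(self.take("NUMBER"))


def parse(tokens: List[Token]):
    """Parse a whole token list into a nested-tuple tree."""
    parser = Parser(tokens)
    tree = parser.expr()
    if parser.peek() != "EOF":
        raise SyntaxError(f"unexpected {parser.peek()} at token {parser.pos}")
    return tree


tokens = [("NUMBER", "2"), ("PLUS", "+"), ("NUMBER", "3"),
          ("STAR", "*"), ("NUMBER", "4")]
print(parse(tokens))   # ('+', 2, ('*', 3, 4))
```

Operator precedence falls out of the call structure: term() binds tighter than expr(), so 3*4 is grouped before the addition.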
STEP 5: DO SOMETHING WITH IT! 1. Compile – write out to something that can be run (html) 2. Interpret – run through another program to get output (templates to html) 3. Analyze – run through to analyze the data inside (code analysis/sniffer tools) 4. Validate – check for proper “spelling and grammar” 5. ??? 6. PROFIT
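To make the first three options concrete, here is a small Python sketch that takes one parsed tree and interprets it, compiles it, and analyzes it. The nested-tuple tree format (operator, left, right) and the choice of postfix notation as the compile target are assumptions for the example, not something the slides prescribe.

```python
import operator

# Hypothetical tree format: an int leaf, or an (operator, left, right) tuple.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def interpret(node):
    """Interpret: walk the tree and compute the value directly."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](interpret(left), interpret(right))


def compile_to_rpn(node):
    """Compile: emit another runnable form, here postfix (RPN) tokens."""
    if isinstance(node, int):
        return [str(node)]
    op, left, right = node
    return compile_to_rpn(left) + compile_to_rpn(right) + [op]


def count_operations(node):
    """Analyze: report a fact about the input without running it."""
    if isinstance(node, int):
        return 0
    _, left, right = node
    return 1 + count_operations(left) + count_operations(right)


tree = ("+", 2, ("*", 3, 4))              # the tree for 2 + 3*4
print(interpret(tree))                    # 14
print(" ".join(compile_to_rpn(tree)))     # 2 3 4 * +
print(count_operations(tree))             # 2
```

All three functions consume the same tree, which is one practical payoff of keeping the lexer, the parser, and the "do something" stage separate.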
“If you’re not sure how to do a job – ask!” - silly poster on my laundry room wall