# Cooking parsers with Winnow

---

### Santiago

locations 🇦🇷 🇳🇱

experience 🐍 🦀

Notes:

- First time speaking at a Rust conference

---

### Content

1. Beginnings
2. Parsers
3. Winnow
4. Demo

---

### An Idea 💡

Photo by John Paulsen on Unsplash, edited by me

Note:

- Everything began
- idea in a garage
- would take me into an endless rabbit hole
- makes no money
- I love it

---

### What idea?

Exchanging and tuning recipes

Notes:

- I'm picky: liking pineapple doesn't mean you like pineapple pizza

----

#### Short history

----

## recipe-lang

```recp
Take {potatoes}(3) and wrap them in &{aluminium foil}.
Place them directly into the coals of the grill.
Let cook for t{1 hour} until the potatoes are fork-tender.
```

[reciperium/recipe-lang](https://github.com/reciperium/recipe-lang)

---

### Why Rust as a solo dev?

#### INSANE CONFIDENCE

Note:

- performance by default
- cheap hosting
- scales well
- things just work
- fast iteration (next slide)

---

Note:

- easy to maintain
- easy to reproduce
- easy to refactor

---

## What did I need?

- A way to write recipes that is easy for both humans and machines

---

## What's the problem with recipes?

Many sections

- ingredients
- materials
- instructions
- and more

---

## recipe-lang

```recp
Take {potatoes}(3) and wrap them in &{aluminium foil}.
Place them directly into the coals of the grill.
Let cook for t{1 hour} until the potatoes are fork-tender.
```

[playground](https://play.reciperium.com)

---

### Why build a parser in Rust?

- for fun
- for performance
- for portability

---

## What is a parser?

----

### First: What is a grammar?

> a finite set of rules to generate strings
> that are in the grammar

```sql
SELECT ( ALL | DISTINCT )? (
  <asterisk> | (
    <select_sublist> ( <comma> <select_sublist> )*
  )
)
```

- SELECT is a terminal
- `<select_sublist>` is a non-terminal

Notes:

- it's actually a context-free grammar
- it's context-free because a rule applies without having to consider the surrounding context
- popular notation: Backus-Naur Form (BNF)
- terminal: like a literal value
- non-terminal: reference to another rule (play this rule)
- Precedence: order of operations
- Associativity: left to right, right to left, etc.

----

### Then: What is a parser?

> A tool that maps a series of tokens to grammar rules to create a syntax tree, identifying errors if the token sequence is invalid

----

```json
{
  "name": "H J Simpson",
  "age": 40,
  "address": {
    "street": "Evrgrn Terace",
    "number": 742
  }
}
```

----

### Disclaimer

---

### Choosing a parser

| Grammars | Combinators | Regex |
| --- | --- | --- |
| New syntax | Familiar lang | New syntax |
| Maintainable | Maintainable | Depends |
| Reusable* | | Reusable* |
| Rely on macros | | |

\* In theory

Notes:

- This is almost never the case
- ANTLR works in many languages, but not Rust
- all code is portable to other languages
- is there a correct one? No
- combinators: a mix of parser functions combined into a parser
- combinators are more intuitive to me, easier to test
- grammars: context-free grammar

---

### Combinator example `winnow`

```rs
use winnow::prelude::*;
use winnow::token::take_while;

fn hex_primary(input: &mut &str) -> PResult<u8> {
    take_while(2, |c: char| c.is_ascii_hexdigit())
        .try_map(|input| u8::from_str_radix(input, 16))
        .parse_next(input)
}
```

docs.rs/winnow/latest/winnow/#example
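
----

### Combinator example, by hand

A rough, dependency-free sketch of what the `winnow` chain above does step by step. `take_hex2` and `hex_primary_by_hand` are hypothetical helpers written for this slide, not winnow APIs, and errors are simplified to `Option`.

```rust
/// Takes exactly two ASCII hex digits from the front of `input`.
/// Roughly what `take_while(2, |c| c.is_ascii_hexdigit())` does.
/// `input` only advances on success.
fn take_hex2<'a>(input: &mut &'a str) -> Option<&'a str> {
    let s: &'a str = *input;
    let bytes = s.as_bytes();
    if bytes.len() >= 2 && bytes[0].is_ascii_hexdigit() && bytes[1].is_ascii_hexdigit() {
        // Both bytes are ASCII, so index 2 is a valid char boundary.
        let (head, tail) = s.split_at(2);
        *input = tail;
        Some(head)
    } else {
        None
    }
}

/// Takes two hex digits, then converts them to a `u8`
/// (the `try_map` step in the winnow version).
fn hex_primary_by_hand(input: &mut &str) -> Option<u8> {
    let digits = take_hex2(input)?;
    u8::from_str_radix(digits, 16).ok()
}

fn main() {
    let mut input = "2F14DF";
    let red = hex_primary_by_hand(&mut input);
    // `input` now holds whatever was not consumed.
    println!("{:?}, remaining: {:?}", red, input);
}
```

Like the winnow version, the helper leaves `input` untouched on failure, so a caller can try a different parser next.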

----

### Grammar example `pest`

```
alpha = { 'a'..'z' | 'A'..'Z' }
digit = { '0'..'9' }

ident = { (alpha | digit)+ }

ident_list = _{ !digit ~ ident ~ (" " ~ ident)+ }
```

pest.rs/
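
----

### The same grammar, by hand

A minimal sketch of the `pest` rules above as plain Rust functions, assuming ASCII input. The function names simply mirror the rule names; this illustrates how grammar rules map to code and is not what `pest` generates.

```rust
/// ident = { (alpha | digit)+ }
/// Consumes one or more ASCII letters or digits; leaves `input` untouched on failure.
fn ident<'a>(input: &mut &'a str) -> Option<&'a str> {
    let s: &'a str = *input;
    let end = s
        .find(|c: char| !c.is_ascii_alphanumeric())
        .unwrap_or(s.len());
    if end == 0 {
        return None;
    }
    let (head, tail) = s.split_at(end);
    *input = tail;
    Some(head)
}

/// ident_list = _{ !digit ~ ident ~ (" " ~ ident)+ }
/// Restores `input` to its starting position when the rule does not match.
fn ident_list<'a>(input: &mut &'a str) -> Option<Vec<&'a str>> {
    let start: &'a str = *input;
    // `!digit`: negative lookahead, consumes nothing.
    if start.starts_with(|c: char| c.is_ascii_digit()) {
        return None;
    }
    let mut idents = vec![ident(input)?];
    // `(" " ~ ident)+`: repeat while a space followed by an ident matches.
    loop {
        let checkpoint: &'a str = *input;
        match checkpoint.strip_prefix(' ') {
            Some(rest) => {
                *input = rest;
                match ident(input) {
                    Some(id) => idents.push(id),
                    None => {
                        // Space without an ident: undo the space and stop.
                        *input = checkpoint;
                        break;
                    }
                }
            }
            None => break,
        }
    }
    // `+` requires at least one repetition, so at least two idents overall.
    if idents.len() < 2 {
        *input = start;
        return None;
    }
    Some(idents)
}

fn main() {
    let mut input = "abc 123x rest!";
    let idents = ident_list(&mut input);
    println!("{:?}, remaining: {:?}", idents, input);
}
```

Each rule becomes one function, and the checkpoint/restore steps are the manual version of backtracking.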

----

### Regex

```re text-wrap
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
```

---

## What's in the Rust market?

- grammars: [pest](https://github.com/pest-parser/pest), [peg](https://github.com/kevinmehall/rust-peg)
- combinators: [winnow](https://github.com/winnow-rs/winnow), [nom](https://github.com/rust-bakery/nom), [chumsky](https://github.com/zesterer/chumsky)
- regex: [regex](https://github.com/rust-lang/regex)

[and more](https://github.com/rosetta-rs/parse-rosetta-rs "target='_blank'")

---

## Winnow

- Excellent documentation with tutorials
- Very extensible
- Performant
- A fork of `nom` by [epage](https://github.com/epage)

----

## Setup

```sh
cargo add winnow
```

----

### Main usage

```rs [1-14|2|4,8|8,12]
use winnow::combinator::alt;
use winnow::{PResult, Parser};

pub fn parse_foobar_bool(i: &mut &str) -> PResult<bool> {
    alt((
        "foo".value(true),
        "bar".value(false)
    )).parse_next(i)
}
fn main() {
    let input = "foo";
    let out = parse_foobar_bool.parse(input).unwrap();
    println!("{}", out);
}
```

notes:

- `PResult` wraps winnow errors in `ErrMode`
- `Parser` is implemented for common types (`&str`, `&[u8]`, `char`, etc.), so literals can be used as parsers
- lines 4-9 are a combinator
- `parse_next` is used inside combinators
- `parse` is for our main parser

---

## Demo

---

### JSONOO (JSON Object Only)

_/Jay • SO • NOOOO/_

----

```json
{"event": "subscribed", "name": "bruce", "isadmin": false}
{"event": "subscribed", "name": "marta", "isadmin": true}
```

---

### alt

----

```rs
use winnow::ascii::{alpha1, digit1};
use winnow::combinator::alt;
use winnow::{IResult, Parser};

fn parser(input: &str) -> IResult<&str, &str> {
    alt((alpha1, digit1)).parse_peek(input)
}
```

---

### delimited

----

```rs
use winnow::combinator::delimited;
use winnow::Parser;

let mut parser = delimited("(", "abc", ")");

assert_eq!(parser.parse_peek("(abc)"), Ok(("", "abc")));
```

---

### separated

----

```rs
// { "key1": "value1", "key2": "value2" }
delimited(
    ("{", space0),
    separated(
        0..,
        separated_pair(parse_string, ":", parse_token),
        (",", space0),
    ),
    (space0, "}"),
)
```

---

## Error handling

----

### ErrMode

- `Backtrack`: recoverable (default)
- `Cut`: unrecoverable
- ~`Incomplete`~

Notes:

- `Incomplete` is only relevant for streaming

----

### delimited

----

```rs [1-14|7|8-10]
use winnow::combinator::{cut_err, delimited};
use winnow::error::{StrContext, StrContextValue};
use winnow::Parser;
let mut parser = delimited(
    "(",
    "abc",
    cut_err(")").context(
        StrContext::Expected(
            StrContextValue::CharLiteral(')')
        )
    )
);

assert_eq!(parser.parse_peek("(abc)"), Ok(("", "abc")));
```

----

### Context

```rs
StrContext::Label // what is currently being parsed
StrContext::Expected // expected grammar item

StrContextValue::CharLiteral
StrContextValue::StringLiteral
StrContextValue::Description
```

---

## Wrap up

- [winnow][winnow] is a well-documented, powerful library
- check out the [crafting interpreters] book
- let me know if you write a parser!

[winnow]: https://github.com/winnow-rs/winnow
[crafting interpreters]: https://craftinginterpreters.com

---

## Thanks

[@woile@hachyderm.io](https://hachyderm.io/@woile)

[woile](https://github.com/woile)

santiwilly@gmail.com