home ~ projects ~ socials

Rust nom Parser V6

Not sure how many version of the parser I've gone through. This is the latest.

It goes through things character and attempts to match parsers at each point. This approach makes a lot of sense to me and feels like maybe the way it's designed to be used.

This does some inline editing for everything that's matched to output the results directly.

Some things would require multiple passes (e.g. footnotes, possibly).

use nom::branch::alt;
use nom::bytes::complete::tag;
use nom::bytes::complete::take;
use nom::character::complete::not_line_ending;
use nom::combinator::eof;
use nom::error::Error;
use nom::multi::many_till;
use nom::sequence::preceded;
use nom::sequence::tuple;
use nom::IResult;
use nom::Parser;

fn main() {
    let (a, b) = parse("\n>> TITLE: quick \n>> DATE: 2023-03-26\n >> TITLE: fox").unwrap();
    dbg!(&a);
    dbg!(&b);
}

fn parse(source: &str) -> IResult<&str, (Vec<String>, &str)> {
    let (c, d) = many_till(part, eof)(source)?;
    Ok((c, d))
}

fn part(source: &str) -> IResult<&str, String> {
    let (e, f) = alt((
        preceded(
            tag::<&str, &str, Error<&str>>("\n"),
            tuple((tag(">> TITLE: "), not_line_ending)),
        )
        .map(|t| format!("<title>{}</title>", t.1.trim())),
        preceded(
            tag::<&str, &str, Error<&str>>("\n"),
            tuple((tag(">> DATE: "), not_line_ending)),
        )
        .map(|t| format!("<date>{}</date>", t.1.trim())),
        take(1u32).map(|t: &str| t.to_string()),
    ))(source)?;
    Ok((e, f))
}
-- end of line --