Sanitize HTML With Ammonia In Rust

Overview

I'm using the ammonia Rust crate to sanitize HTML for my Twitch bot. Here's how I use it:

Code

use ammonia::Builder;
use maplit::{hashmap, hashset};

fn main() {
    // Sample input: a disallowed <div> wrapping a <span> with two attributes.
    let source = r#"
        <div>
            <span id="alfa" class="bravo">charlie</span>
        </div>
    "#;
    let scrubbed = sanitize_html(source);
    dbg!(scrubbed);
}

fn sanitize_html(source: &str) -> String {
    // Tags that are allowed through.
    let tags = hashset!["span"];
    // Attributes that are allowed on each permitted tag.
    let tag_attrs = hashmap![
        "span" => hashset!["id"]
    ];
    Builder::new()
        .tags(tags)
        .tag_attributes(tag_attrs)
        .clean(source)
        .to_string()
}

Details

  • Only the tags, and the attributes for those tags, that are explicitly added will be allowed through

  • Permitted tags are added to the `tags` hashset and passed to the `Builder` via `.tags()`

  • Attributes are defined for each tag in `tag_attrs` and added via `.tag_attributes()`

  • The output is returned as a string. In the example above, the result (ignoring surrounding whitespace) is:

    <span id="alfa">charlie</span>

Code

cargo add ammonia
cargo add maplit

The maplit crate provides the macros used in the example to make the hashsets and hashmap. It's not required. The standard library's `HashSet` and `HashMap` work as well.

References

  • ammonia - an allow-list based HTML sanitization library

    "Designed to prevent cross-site scripting, layout breaking, and clickjacking caused by untrusted user-provided HTML being mixed into a larger web page"

  • html5ever - This is what ammonia uses under the hood for parsing

  • ammonia::Builder - The main struct for setting up a sanitize run