Parsing

Parsing HTML Documents

The Document struct in dom_query is designed to handle full HTML documents. You can create a Document by passing in HTML content, which can be provided in several formats: &str, String, or StrTendril.

extern crate dom_query;

use dom_query::Document;
use tendril::StrTendril;

// HTML content as a string slice
let contents_str = r#"<!DOCTYPE html>
<html><head><title>Test Page</title></head><body></body></html>"#;
let doc = Document::from(contents_str);

// HTML content as a String
let contents_string = contents_str.to_string();
let doc = Document::from(contents_string);

// HTML content as a StrTendril
let contents_tendril = StrTendril::from(contents_str);
let doc = Document::from(contents_tendril);

// Checking the root element of the `Document`
assert!(doc.root().is_document());

When parsing a full HTML document, Document recognizes a <!DOCTYPE> if it appears at the start of the input. In that case, the Doctype is added as the first child of the root Document node. If the input has no <!DOCTYPE>, no Doctype node is added.

assert!(doc.root().first_child().unwrap().is_doctype());

Parsing HTML Fragments

For cases where you need to parse only a part of an HTML document, such as a snippet or component, dom_query provides Document::fragment(). This function also accepts &str, String, or StrTendril, but behaves a little differently from Document::from() in that it treats the input as a fragment instead of a full document.

use dom_query::Document;
use tendril::StrTendril;

// Parsing an HTML fragment from a string slice
let contents_str = r#"<div><p>Example Fragment</p></div>"#;
let fragment = Document::fragment(contents_str);

// Parsing from a String
let contents_string = contents_str.to_string();
let fragment = Document::fragment(contents_string);

// Parsing from a StrTendril
let contents_tendril = StrTendril::from(contents_str);
let fragment = Document::fragment(contents_tendril);

// Checking the root element of the fragment
assert!(!fragment.root().is_document());
assert!(fragment.root().is_fragment());

Note that Document::fragment() ignores Doctype declarations and parses only the fragment itself.

// Confirming Doctype is excluded in the fragment
assert!(!fragment.root().first_child().unwrap().is_doctype());

Document::fragment() is also used internally within the library to create new elements within the document tree.

Querying

Selecting Elements

The dom_query crate provides several selection methods to locate HTML elements in the document. Using CSS-like selectors, you can select both single and multiple elements.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Test Page</title>
    </head>
    <body>
        <h1>Test Page</h1>
        <ul>
            <li>One</li>
            <li><a href="/2">Two</a></li>
            <li><a href="/3">Three</a></li>
        </ul>
    </body>
</html>"#;
let document = Document::from(html);

// Select a single element
let a = document.select("ul li:nth-child(2)");
let text = a.text().to_string();
assert!(text == "Two");

// Selecting multiple elements
document.select("ul > li:has(a)").iter().for_each(|sel| {
    assert!(sel.is("li"));
});

// Optionally select an element with `try_select`, which returns an `Option`
let no_sel = document.try_select("p");
assert!(no_sel.is_none());

The Selection::is method checks whether at least one element in the current selection matches a given selector, without searching the elements' descendants. dom_query supports the pseudo-classes provided by the selectors crate, plus a few of its own.

See also: List of supported CSS pseudo-classes

Selecting a Single Match and Multiple Matches

To retrieve only the first match of a selector, use the Selection::select_single method. It is useful when you want a single match without iterating through all matches.

use dom_query::Document;

let doc: Document = r#"<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
    <ul class="list">
        <li>1</li><li>2</li><li>3</li>
    </ul>
    <ul class="list">
        <li>4</li><li>5</li><li>6</li>
    </ul>
</body>
</html>"#.into();

// selecting a first match
let single_selection = doc.select_single(".list");
assert_eq!(single_selection.length(), 1);
assert_eq!(single_selection.inner_html().to_string().trim(), 
    "<li>1</li><li>2</li><li>3</li>");

// selecting all matches
let selection = doc.select(".list");
assert_eq!(selection.length(), 2);
// but when you call property methods usually
// you will get the result of the first match
assert_eq!(selection.inner_html().to_string().trim(), 
    "<li>1</li><li>2</li><li>3</li>");

// This creates a Selection from the first node in the selection
let first_selection = doc.select(".list").first();
assert_eq!(first_selection.length(), 1);
assert_eq!(first_selection.inner_html().to_string().trim(), 
    "<li>1</li><li>2</li><li>3</li>");

// This approach also creates a new Selection from the next node, each iteration
let next_selection = doc.select(".list").iter().next().unwrap();
assert_eq!(next_selection.length(), 1);
assert_eq!(next_selection.inner_html().to_string().trim(), 
    "<li>1</li><li>2</li><li>3</li>");

// currently, to get data from all matches you need to iterate over them:
let all_matched: String = selection
.iter()
.map(|s| s.inner_html().trim().to_string())
.collect();

assert_eq!(
    all_matched,
    "<li>1</li><li>2</li><li>3</li><li>4</li><li>5</li><li>6</li>"
);

// same as the previous example, but a little cheaper, because we iterate over the nodes
// and do not create a new Selection on each iteration
let all_matched: String = doc
        .select(".list").nodes()
        .iter()
        .map(|s| s.inner_html().trim().to_string())
        .collect();

assert_eq!(
    all_matched,
    "<li>1</li><li>2</li><li>3</li><li>4</li><li>5</li><li>6</li>"
);

Descendant selections

Elements can be selected in relation to a parent element. Here, a Document is queried for ul elements, and then descendant selectors are applied within that context.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Test Page</title>
    </head>
    <body>
        <h1>Test Page</h1>
        <ul class="list-a">
            <li>One</li>
            <li><a href="/2">Two</a></li>
            <li><a href="/3">Three</a></li>
        </ul>
        <ul class="list-b">
            <li><a href="/4">Four</a></li>
        </ul>
    </body>
</html>"#;
let document = Document::from(html);

// selecting parent elements
let ul = document.select("ul");
ul.select("li").iter().for_each(|el| {
    // descendant select matches only inside the children context
    assert!(el.is("li"));
});

// a descendant selector may also include elements above the parent in the tree.
// This can be useful for specifying the exact element you want to select
let el = ul.select("body ul.list-b li").first();
let text = el.text();
assert_eq!("Four", text.to_string());

Selecting with Precompiled Matchers

For repeated queries, dom_query allows using precompiled matchers. This approach enhances performance when matching the same pattern across multiple documents.

use dom_query::{Document, Matcher};

let html1 = r#"<!DOCTYPE html>
    <html><head><title>Test Page 1</title></head><body></body></html>"#;
let html2 = r#"<!DOCTYPE html>
    <html><head><title>Test Page 2</title></head><body></body></html>"#;
let doc1 = Document::from(html1);
let doc2 = Document::from(html2);

// create a matcher once, reuse on different documents
let title_matcher = Matcher::new("title").unwrap();

let title_el1 = doc1.select_matcher(&title_matcher);
assert_eq!(title_el1.text(), "Test Page 1".into());

let title_el2 = doc2.select_matcher(&title_matcher);
assert_eq!(title_el2.text(), "Test Page 2".into());

let title_single = doc1.select_single_matcher(&title_matcher);
assert_eq!(title_single.text(), "Test Page 1".into());

Selecting Ancestor Elements

You can use Node::ancestors() to retrieve the sequence of ancestor nodes for a given element in the document tree, which can be helpful when you need to navigate upward from a specific node.


use dom_query::{Document, Selection};

let doc: Document = r#"<!DOCTYPE html>
<html>
    <head>Test</head>
    <body>
        <div id="great-ancestor">
            <div id="grand-parent">
                <div id="parent">
                    <div id="child">Child</div>
                </div>
            </div>
        </div>
    </body>
</html>
"#.into();

// Select an element
let child_sel = doc.select("#child");
assert!(child_sel.exists());

// Access the selected node
let child_node = child_sel.nodes().first().unwrap();

// Get all ancestor nodes for the `#child` node
let ancestors = child_node.ancestors(None);
let ancestor_sel = Selection::from(ancestors);

// or just: let ancestor_sel = child_sel.ancestors(None);

// In this case, all ancestor nodes up to the root <html> are included
assert!(ancestor_sel.is("html")); // Root <html> is included
assert!(ancestor_sel.is("#parent")); // Direct parent is also included

// `Selection::is` performs a shallow match, so it will not match `#child` in this selection.
assert!(!ancestor_sel.is("#child"));

// You can limit the number of ancestor nodes returned by specifying `max_limit`
let limited_ancestors = child_node.ancestors(Some(2));
let limited_ancestor_sel = Selection::from(limited_ancestors);

// With a limit of 2, only `#grand-parent` and `#parent` ancestors are included
assert!(limited_ancestor_sel.is("#grand-parent"));
assert!(limited_ancestor_sel.is("#parent"));
assert!(!limited_ancestor_sel.is("#great-ancestor")); // This node is excluded due to the limit

Note that ancestors() can be called on both NodeRef and Selection. NodeRef::ancestors() returns a vector of ancestor nodes, while Selection::ancestors() returns a new Selection containing the ancestor nodes.

Selecting with pseudo-classes (:has, :has-text, :contains)

The dom_query crate provides versatile selector pseudo-classes, built on both its own functionality and the capabilities of the selectors crate. These pseudo-classes allow targeting elements based on attributes, text content, and context within the document.

use dom_query::Document;

let html = include_str!("../test-pages/rustwiki_2024.html");
let doc = Document::from(html);

// Search for list items (`li`) within a `tr` element that contains an `a` element
// with the title "Programming paradigm"
let paradigm_selection = doc.select(
    r#"table tr:has(a[title="Programming paradigm"]) td.infobox-data ul > li"#
    );

println!("Rust programming paradigms:");
for item in paradigm_selection.iter() {
    println!(" {}", item.text());
}
println!("{:-<50}", "");

// Select items based on `th` containing text "Influenced by" and
// the following `tr` containing `td` with list items.
let influenced_by_selection = doc.select(
    r#"table tr:has-text("Influenced by") + tr td ul > li > a"#
    );

println!("Rust influenced by:");
for item in influenced_by_selection.iter() {
    println!(" {}", item.text());
}
println!("{:-<50}", "");

// Extract all links within a paragraph containing "foreign function interface" text.
// Since a part of the text is in a separate tag, we use the `:contains` pseudo-class.
let links_selection = doc.select(
    r#"p:contains("Rust has a foreign function interface") a[href^="/"]"#
    );

println!("Links in the FFI block:");
for item in links_selection.iter() {
    println!(" {}", item.attr("href").unwrap());
}
println!("{:-<50}", "");

// :only-text selects an element that contains only a single text node, 
// with no child elements.
// It can be combined with other pseudo-classes to achieve more specific selections.
// For example, to select a <div> inside an <a>
// that has no siblings and no child elements other than text.
println!("Single <div> inside an <a> with text only:");
for el in doc.select("a div:only-text:only-child").iter() {
    println!("{}", el.text().trim());
}

Key Points:

  • :has(selector): Finds elements that contain a matching element anywhere within.
  • :has-text("text"): Matches elements based on their immediate text content, ignoring any nested elements. This makes it ideal for selecting nodes where the direct text is crucial for differentiation.
  • :contains("text"): Selects elements containing the specified text within them, useful when searching in a block of text.
  • :only-text: Selects elements that contain only a single text node, with no other child nodes.

These pseudo-classes allow for precise and expressive searches within the DOM, enabling the selection of content-rich elements based on structural or attribute-driven conditions. For a full list of supported pseudo-classes, refer to the Supported CSS Pseudo-Classes List.

Filtering Selection

You can filter a selection based on another selection. This can be useful when you need to narrow down a selection to only include elements that are also part of another selection.

use dom_query::Document;

let doc: Document = r#"<!DOCTYPE html>
<html lang="en">
    <head>TEST</head>
    <body>
        <div class="content">
            <p>Content text has a <a href="/0">link</a></p>
        </div>
        <footer>
            <a href="/1">Footer Link</a>
        </footer>
    </body>
</html>
"#.into();

// Selecting all links in the document
let sel_with_links = doc.select("a[href]");

assert_eq!(sel_with_links.length(), 2);

// Selecting every element inside
let content_sel = doc.select("div.content *");

// Filter selection by content selection, so now we get only links (actually only 1 link) that are inside
let filtered_sel = sel_with_links.filter_selection(&content_sel);

assert_eq!(filtered_sel.length(), 1);

You can also use Selection::filter, Selection::try_filter (which returns an Option<Selection>), and Selection::filter_matcher, which filters a selection using a pre-compiled Matcher.

Adding Selection

You can combine multiple selections. This can be useful when you want to work with a combined set of elements.

use dom_query::Document;

let doc: Document = r#"<!DOCTYPE html>
<html>
    <head>Test</head>
    <body>
       <div id="great-ancestor">
           <div id="grand-parent">
               <div id="parent">
                   <div id="first-child">Child</div>
                   <div id="second-child">Child</div>
               </div>
           </div>
       </div>
    </body>
</html>"#.into();

let first_sel = doc.select("#first-child");
assert_eq!(first_sel.length(), 1);
let second_sel = doc.select("#second-child");
assert_eq!(second_sel.length(), 1);
let children_sel = first_sel.add_selection(&second_sel);
assert_eq!(children_sel.length(), 2);

Additionally, there are other methods available:

  • Selection::add to add a single element.
  • Selection::try_add which returns an Option<Selection>.
  • Selection::add_matcher to add elements using a pre-compiled Matcher.

HTML and Text Content Extraction

Extracting HTML and Inner HTML

Serialization enables extracting HTML content of elements, either with or without outer tags. This can be useful for accessing structured content within elements.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head><title>Test</title></head>
    <body><div class="content"><h1>Test Page</h1></div></body>
</html>"#;
let doc = Document::from(html);
let heading_selector = doc.select("div.content");

// Serialization including the outer HTML tag
let content = heading_selector.html();
assert_eq!(content.to_string(), r#"<div class="content"><h1>Test Page</h1></div>"#);

// Serialization excluding the outer HTML tag
let inner_content = heading_selector.inner_html();
assert_eq!(inner_content.to_string(), "<h1>Test Page</h1>");

The html() and inner_html() methods return serialized content as StrTendril. If no elements match the selector, html() and inner_html() will return an empty value, whereas try_html() and try_inner_html() return an Option<StrTendril>, allowing for handling of None.

// Using `try_html()`, which returns an Option<StrTendril>.
// If there are no matching elements, it returns None.
let opt_no_content = doc.select("div.no-content").try_html();
assert_eq!(opt_no_content, None);

// The `html()` method will return an empty `StrTendril` if there are no matches
let no_content = doc.select("div.no-content").html();
assert_eq!(no_content, "".into());

// Similarly, `inner_html()` and `try_inner_html()` work the same way
assert_eq!(doc.select("div.no-content").try_inner_html(), None);
assert_eq!(doc.select("div.no-content").inner_html(), "".into());

Extracting Descendant Text

The text() method retrieves all descendant text content within the selected element, concatenating any nested text nodes into a single string.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head><title>Test</title></head>
    <body><div><h1>Test <span>Page</span></h1></div></body>
</html>"#;
let doc = Document::from(html);
let body_selection = doc.select("body div").first();
let text = body_selection.text();
assert_eq!(text.to_string(), "Test Page");

Extracting Immediate Text

The immediate_text() method retrieves the immediate text content of the selected element, excluding any text content from its descendants.

This is useful when you need to access the text content of an element without including the text content of its child elements.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head><title>Test</title></head>
    <body><div><h1>Test <span>Page</span></h1></div></body>
</html>"#;

let doc = Document::from(html);

let body_selection = doc.select("body div h1").first();
// accessing immediate text without descendants
let text = body_selection.immediate_text();
assert_eq!(text.to_string(), "Test ");

Accessing and Manipulating the element's attributes

The dom_query crate provides several methods for accessing and manipulating the attributes of an HTML element.

Note: All methods listed below apply to both Selection and Node.

Getting an attribute value

Use the attr() method to retrieve the value of a specific attribute; it returns None if the attribute does not exist. Use the attr_or() method to retrieve the value or fall back to a default when the attribute does not exist.

use dom_query::Document;

let html = r#"<!DOCTYPE html>
<html>
    <head><title>Test</title></head>
    <body><input hidden="" id="k" class="important" type="hidden" name="k" data-k="100"></body>
</html>"#;

let doc = Document::from(html);

let mut input_selection = doc.select("input[name=k]");

let val = input_selection.attr("data-k").unwrap();
assert_eq!(val.to_string(), "100");

// try to get an attribute that does not exist
let val_or = input_selection.attr_or("data-l", "0");
assert_eq!(val_or.to_string(), "0");

Removing an attribute

You can use the remove_attr() method to remove a specific attribute from the element. If it is called on a Selection, it removes the attribute from all elements in the selection.

input_selection.remove_attr("data-k");

Removing multiple attributes

You can use the remove_attrs() method to remove multiple attributes from the element. If it is called on a Selection, it removes all listed attributes from all elements in the selection.

input_selection.remove_attrs(&["id", "class"]);

Setting an attribute value

You can use the set_attr() method to set the value of a specific attribute. If it is called on a Selection, it sets the attribute on all elements in the selection.

input_selection.set_attr("data-k", "200");

Checking if an attribute exists

You can use the has_attr() method to check whether a specific attribute exists on the element. If it is called on a Selection, it checks whether the attribute exists on the first element in the selection.

let is_hidden = input_selection.has_attr("hidden");
assert!(is_hidden);

Removing all attributes

You can use the remove_all_attrs() method to remove all attributes from the element. If it is called on a Selection, it removes all attributes from all elements in the selection.

input_selection.remove_all_attrs();
assert_eq!(input_selection.html(), r#"<input>"#.into());

Manipulating the DOM

Manipulating the Selection

The dom_query crate provides various methods to manipulate the DOM. Below are some examples demonstrating how to append new HTML nodes, set new content, remove selections, and replace selections with new HTML.

use dom_query::Document;

let html_contents = r#"<!DOCTYPE html>
<html>
    <head><title>Test</title></head>
    <body>
        <div class="content">
            <p>9,8,7</p>
        </div>
        <div class="remove-it">
            Remove me
        </div>
        <div class="replace-it">
            <div>Replace me</div>
        </div>
    </body>
</html>"#;

let doc = Document::from(html_contents);

// Select the div with class "content"
let mut content_selection = doc.select("body .content");

// Append a new HTML node to the selection
content_selection.append_html(r#"<div class="inner">inner block</div>"#);
assert!(doc.select("body .content .inner").exists());

// Set a new content to the selection, replacing existing content
let mut set_selection = doc.select(".inner");
set_selection.set_html(r#"<p>1,2,3</p>"#);
assert_eq!(doc.select(".inner").html(),
    r#"<div class="inner"><p>1,2,3</p></div>"#.into());

// Remove the selection with class "remove-it"
doc.select(".remove-it").remove();
assert!(!doc.select(".remove-it").exists());

// Replace the selection with new HTML, the current selection will not change
let mut replace_selection = doc.select(".replace-it");
replace_selection.replace_with_html(r#"<div class="replaced">Replaced</div>"#);
assert_eq!(replace_selection.text().trim(), "Replace me");

// But the document will reflect the changes
assert_eq!(doc.select(".replaced").text(),"Replaced".into());


// Prepend more elements to the selection
content_selection.prepend_html(r#"<p class="third">3</p>"#);
content_selection.prepend_html(r#"<p class="first">1</p><p class="second">2</p>"#);

// Now the added paragraphs are in front of the 'div'
assert!(doc.select(r#".content > .first + .second + .third + div:has-text("1,2,3")"#).exists());

Explanation:

  • Append HTML:
    • The append_html method is used to add a new HTML node to the existing selection.
  • Set HTML:
    • The set_html method replaces the existing content of the selection with new HTML.
  • Remove Selection:
    • The remove method deletes the elements matching the selector from the document.
  • Replace with HTML:
    • The replace_with_html method replaces the selected elements with new HTML. Note that the selection itself remains unchanged, but the document reflects the new content.
  • Prepend HTML:
    • The prepend_html method is used to add a new HTML node at the beginning of the existing selection.

Renaming Elements Without Changing the Contents

The dom_query crate allows you to rename selected elements without changing their contents. Selection::rename renames every element in the selection, while Node::rename renames a single element.

use dom_query::Document;

let doc: Document = r#"<!DOCTYPE html>
<html>
<head><title>Test</title></head>
<body>
    <div class="content">
        <div>1</div>
        <div>2</div>
        <div>3</div>
        <span>4</span>
    </div>
</body>
</html>"#.into();

let mut sel = doc.select("div.content > div, div.content > span");
// Before renaming, there are 3 `div` and 1 `span`
assert_eq!(sel.length(), 4);

sel.rename("p");

// After renaming, there are no `div` and `span` elements
assert_eq!(doc.select("div.content > div, div.content > span").length(), 0);
// But there are four `p` elements
assert_eq!(doc.select("div.content > p").length(), 4);

Creating and Manipulating Elements

The dom_query crate allows you to create and manipulate HTML elements with ease. Below are examples demonstrating how to create new elements, set attributes, append HTML, and replace content.

use dom_query::Document;

let doc: Document = r#"<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
    <div id="main">
        <p id="first">It's</p>
    </div>
</body>
</html>"#.into();

// Selecting the node to which we want to attach a new element
let main_sel = doc.select_single("#main");
let main_node = main_sel.nodes().first().unwrap();

// Creating a simple element
let el = doc.tree.new_element("p");
// Setting attributes
el.set_attr("id", "second");
// Setting text content
el.set_text("test");
main_node.append_child(&el);
assert!(doc.select(r#"#main #second:has-text("test")"#).exists());

// Appending a more complex element using `append_html`
main_node.append_html(r#"<p id="third">Wonderful</p>"#);
assert_eq!(doc.select("#main #third").text().as_ref(), "Wonderful");
assert!(doc.select("#first").exists());

// There are also `prepend_child` and `prepend_html` methods, which
// insert content at the beginning of the node.
main_node.prepend_html(r#"<p id="minus-one">-1</p><p id="zero">0</p>"#);
assert!(doc.select("#main > #minus-one + #zero + #first + #second + #third").exists());

// Replacing existing element content with new HTML using `set_html`
main_node.set_html(r#"<p id="the-only">Wonderful</p>"#);
assert_eq!(doc.select("#main #the-only").text().as_ref(), "Wonderful");
assert!(!doc.select("#first").exists());

// Completely replacing the contents of the node, 
// including itself, using `replace_with_html`
main_node.replace_with_html(
    r#"<span>Tweedledum</span> and <span>Tweedledee</span>"#
);
assert!(!doc.select("#main").exists());
assert_eq!(doc.select("span + span").text().as_ref(), "Tweedledee");

Explanation:

  • Creating a Simple Element:

    • Use doc.tree.new_element() to create a new element.
    • Set attributes using node.set_attr().
    • Set text content using node.set_text().
    • Append the new element to the selected node using node.append_child().
  • Appending HTML:

    • Use append_html to add a more complex HTML node to the existing selection.
    • This method is more convenient for adding multiple elements to the selected node.
  • Prepending HTML:

    • Use prepend_html to add new HTML nodes at the beginning of the existing selection.
    • Use prepend_child to prepend a new or an existing element node to the selected node.
  • Setting New HTML Content:

    • Use set_html to replace the existing content of the selected node with new HTML.
    • It changes the inner HTML contents of the node.
  • Replacing Node Contents Completely:

    • Use replace_with_html to replace the entire content of the node, including the node itself.

Additionally, methods like replace_with_html, set_html, append_html and prepend_html can specify more than one element in the provided string.

Supported CSS pseudo-classes in dom_query

Implementation with selectors:

  • :empty
  • :first-child
  • :last-child
  • :has
  • :is
  • :where
  • :last-of-type
  • :not
  • :only-child
  • :only-of-type
  • :nth-child
  • :nth-last-child

Implementation with dom_query:

  • :any-link
  • :link
  • :has-text
  • :contains
  • :only-text

Notes

:has-text – checks whether one of the element's child nodes has the specified text.

:contains – checks whether the combined text of all child nodes contains the specified text.

:only-text – checks whether the element contains only a single text node, with no other child nodes.
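The difference between :has-text and :contains can be illustrated with a toy model. The sketch below does not use dom_query and is not the library's implementation; the Node enum and the functions has_text, contains_text, and sample_paragraph are hypothetical names that only mirror the behavior described in the notes above, assuming that "has the specified text" means a substring match within a single direct text node:

```rust
// Toy DOM: just enough structure to contrast the two checks.
enum Node {
    Text(String),
    Element(Vec<Node>),
}

// Concatenates the text of a node and all of its descendants into `out`.
fn collect_text(node: &Node, out: &mut String) {
    match node {
        Node::Text(t) => out.push_str(t),
        Node::Element(children) => {
            for child in children {
                collect_text(child, out);
            }
        }
    }
}

// `:has-text`-style check: some direct child text node contains `needle`.
fn has_text(node: &Node, needle: &str) -> bool {
    match node {
        Node::Element(children) => children
            .iter()
            .any(|c| matches!(c, Node::Text(t) if t.contains(needle))),
        Node::Text(_) => false,
    }
}

// `:contains`-style check: the combined text of the subtree contains `needle`.
fn contains_text(node: &Node, needle: &str) -> bool {
    let mut text = String::new();
    collect_text(node, &mut text);
    text.contains(needle)
}

// Models <p>Rust has a <a>foreign function interface</a></p>.
fn sample_paragraph() -> Node {
    Node::Element(vec![
        Node::Text("Rust has a ".to_string()),
        Node::Element(vec![Node::Text("foreign function interface".to_string())]),
    ])
}
```

In this model, a phrase that spans the paragraph's own text and the nested link is found only by contains_text, which is why a :contains-style check suits text that is split across tags.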

WASM32 Compilation

When compiling dom_query to WebAssembly (target wasm32-unknown-unknown) using wasm-pack, you may encounter runtime panics related to memory allocation, such as:

panicked at 'assertion failed: psize <= size + max_overhead'

This issue currently occurs due to compatibility problems between the latest versions of the selectors crate and the dlmalloc crate. The issue specifically manifests when using pseudo-classes, including the selectors crate's own pseudo-classes such as :not and :has.

If you must compile dom_query for a wasm32 application, consider using an alternative to dlmalloc. The following allocators have been tested and work successfully:

  • wee_alloc
  • lol_alloc
  • mini-alloc

Solution:

  1. Add mini-alloc to your Cargo.toml:
[dependencies]
mini-alloc = "0.6.0"
  2. Set mini-alloc as the global allocator in your lib.rs or main.rs:
#[cfg(target_arch = "wasm32")]
#[global_allocator]
static ALLOC: mini_alloc::MiniAlloc = mini_alloc::MiniAlloc::INIT;
  3. Build or test your WebAssembly project.