XPath Tester
Test XPath against XML or HTML in your browser, verify namespace mappings and scalar results, and copy node ledgers or runtime snippets.
Introduction
XPath expressions are often written as if they were short bits of search text, but they behave more like precise instructions for walking a parsed document tree. A feed importer, test assertion, scraper, XML transform, browser automation check, or SOAP response parser may all depend on one expression returning the same node, text value, count, or attribute after the document changes.
The tree matters because XML and HTML are not read the same way as plain text. Elements, attributes, text nodes, comments, and the document root each have their own position in the model. A path can start at the root, search descendants, select attributes, filter with predicates, or call a function that turns a group of nodes into one string, number, or boolean.
The compact syntax is useful, but it also makes mistakes easy to miss. XML namespaces change the expanded name that XPath must match. Positions are one-based. A broad descendant search can return more nodes than expected. HTML parsers can repair copied markup before the XPath runs, while strict XML parsers stop on malformed input. A selector that works on a small sample may fail on a namespaced production feed or a page fragment copied from a browser.
- Location path
- The step sequence that moves from a context node to selected nodes, such as from a catalog to each title.
- Axis
- The relationship used by a step, such as child, descendant, attribute, parent, or self.
- Predicate
- A bracketed filter such as [@genre="fiction"] or [1] that narrows the current node set.
- Scalar result
- A single string, number, or boolean returned by functions such as string(), count(), or boolean().
Good XPath testing is less about finding any match and more about proving that the expression, parser, and namespace assumptions match the runtime where the selector will be used. That is why node evidence is so important. A count of 2 may be correct, but the paths and node types show whether those two matches are the intended records, attributes, or text nodes.
- Use the same parser style as the target environment before copying the selector.
- Carry namespace prefixes with the XPath expression, especially when XML uses a default namespace.
- Inspect matched nodes before trusting a scalar function that collapses several matches into one value.
How to Use This Tool:
Use a representative document and keep the XPath test aligned with the parser, namespace mappings, result type, and runtime style that will use the selector later.
- Paste markup into XML or HTML document, drop a local XML, HTML, or TXT file, or choose Browse file. Load sample fills a small namespaced catalog with known title, attribute, count, and string examples.
- Set Parser mode before evaluating the expression. Choose Strict XML for well-formed XML feeds, SOAP payloads, configuration files, and structured exports. Choose HTML document for copied page markup that may rely on browser repair.
- Enter one value in XPath expression. The preset buttons show common paths for element nodes, attribute nodes, count(), and string() against the sample catalog.
- Add Namespace prefixes as one prefix=URI mapping per line when the source uses XML namespaces. Root namespace declarations are read when available, but a default XML namespace still needs an explicit XPath prefix such as default:entry.
- Leave Result type on Auto detect for ordinary tests, or force Node snapshot, String value, Number value, or Boolean value when the target runtime expects a specific result category.
- Choose whether Matched output should favor serialized nodes, text values, or generated paths. Raise the Advanced Match limit only when the audit reports truncation and the broad match is intentional.
- Check Matched Nodes, Node Match Ledger, Expression Audit, and XPath Syntax Notes before copying. Use Runtime Snippet only after parser errors, namespace warnings, empty matches, and truncation warnings are resolved.
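The one-mapping-per-line format for Namespace prefixes is simple enough to sketch. The helper below is hypothetical, not the tool's own code: the function name, the blank-line handling, and the error message are assumptions made for illustration.

```python
def parse_namespace_mappings(text: str) -> dict[str, str]:
    """Parse 'prefix=URI' lines into a prefix-to-URI dictionary.

    Hypothetical helper: blank lines are skipped and malformed lines
    raise ValueError. The real tool may behave differently.
    """
    mappings: dict[str, str] = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        # Split on the first '=' only, so URIs that contain '=' survive intact.
        prefix, separator, uri = line.partition("=")
        prefix, uri = prefix.strip(), uri.strip()
        if not separator or not prefix or not uri:
            raise ValueError(f"line {line_number}: expected prefix=URI, got {raw!r}")
        mappings[prefix] = uri
    return mappings


mappings = parse_namespace_mappings("bk=urn:books\n\ndefault=urn:example:feed?v=1")
print(mappings)
```

Splitting on the first equals sign only is a deliberate choice here, because a namespace URI may itself contain that character.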
Interpreting Results:
Matched Nodes is the copy-friendly result, while Node Match Ledger is the stronger evidence view. Confirm the count, generated path, node type, node name, and compact value before moving the XPath into a test suite, scraper, import job, or transformation rule.
A scalar result can be valid and still hide the shape of the source. For String value, Number value, or Boolean value, test the underlying path as a Node snapshot when repeated records, optional elements, or namespaces are in play. That prevents a single returned value from hiding a selector that is too broad.
An empty match should be read as a diagnostic clue, not as immediate proof that the data is absent. Review Expression Audit for parser status, namespace resolver readiness, match-limit truncation, and local processing evidence. Then check spelling, case, predicate position, parser mode, and whether XML default namespaces need explicit prefixes.
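One way to turn an empty match into a diagnostic is to ask whether the element name exists under a namespace the expression did not use. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath and expects relative paths such as .//entry. The feed document, its namespace URI, and the helper name are hypothetical.

```python
import xml.etree.ElementTree as ET


def namespaces_for_local_name(root, local_name):
    """Return the namespace URIs ('' for none) under which local_name occurs."""
    found = set()
    for element in root.iter():
        tag = element.tag
        if not isinstance(tag, str):
            continue  # comments and processing instructions have a non-string tag
        if tag.startswith("{"):
            uri, _, name = tag[1:].partition("}")
        else:
            uri, name = "", tag
        if name == local_name:
            found.add(uri)
    return found


# Hypothetical feed that puts every element in a default namespace.
FEED = '<feed xmlns="urn:example:feed"><entry/><entry/></feed>'
root = ET.fromstring(FEED)

unprefixed = root.findall(".//entry")  # unprefixed name: no namespace
hint = namespaces_for_local_name(root, "entry")
prefixed = root.findall(".//default:entry", {"default": "urn:example:feed"})

print(len(unprefixed), hint, len(prefixed))
```

The unprefixed path finds nothing even though two entry elements are present, which is the situation the namespace check is meant to expose.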
Match Type Mix is useful when the expression returns an unexpected blend of element, attribute, text, or scalar evidence. Use it as a quick shape check, then read the underlying ledger rows before copying code or exporting JSON.
Technical Details:
XPath 1.0 evaluates expressions against a logical tree. The original markup string is no longer the authority once parsing has happened. Elements, attributes, text nodes, comments, processing instructions, and the document root become separate node kinds, and each step in a location path applies from a current context node.
A location step combines an axis, a node test, and optional predicates. Abbreviated syntax hides some of that machinery. //bk:title means a descendant search for title elements in the bk namespace, while //bk:book/@id switches to the attribute axis after selecting book elements.
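A small sketch can make the two steps concrete. It uses Python's standard xml.etree.ElementTree, which implements only part of XPath: paths must be relative (.//bk:title rather than //bk:title) and a final @id step is not supported, so the attribute axis is emulated with get(). The catalog below is a hypothetical stand-in for the sample, and its title text is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the sample catalog; title text is invented.
SAMPLE = """<catalog xmlns:bk="urn:books">
  <bk:book id="bk101" genre="fiction"><bk:title>First Title</bk:title></bk:book>
  <bk:book id="bk102" genre="reference"><bk:title>Second Title</bk:title></bk:book>
</catalog>"""

NS = {"bk": "urn:books"}
root = ET.fromstring(SAMPLE)

# Descendant search for title elements in the bk namespace.
title_texts = [title.text for title in root.findall(".//bk:title", NS)]

# Emulates //bk:book/@id: select the book elements, then read the attribute.
book_ids = [book.get("id") for book in root.findall(".//bk:book", NS)]

print(title_texts)
print(book_ids)
```

A full XPath 1.0 engine would return attribute nodes for the second expression; here the result is a plain list of strings.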
XPath 1.0 returns node sets, strings, numbers, or booleans. Functions such as count() and boolean() intentionally collapse source structure into one scalar value, so the most reliable review pattern is to verify the node path first and then test the scalar expression.
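The review pattern of checking nodes before trusting the scalar can be sketched as one function that returns both. This is an illustration using Python's xml.etree.ElementTree, which has no count() function, so the count is taken as the length of the matched list. The function name and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the sample catalog; title text is invented.
SAMPLE = """<catalog xmlns:bk="urn:books">
  <bk:book id="bk101"><bk:title>First Title</bk:title></bk:book>
  <bk:book id="bk102"><bk:title>Second Title</bk:title></bk:book>
</catalog>"""


def nodes_then_scalar(root, path, namespaces):
    """Return (evidence, count): per-node evidence first, then the scalar."""
    nodes = root.findall(path, namespaces)
    evidence = [(node.tag, (node.text or "").strip()) for node in nodes]
    return evidence, len(nodes)


root = ET.fromstring(SAMPLE)
evidence, count = nodes_then_scalar(root, ".//bk:book/bk:title", {"bk": "urn:books"})

for tag, text in evidence:
    print(tag, text)
print("count:", count)
```

Reading the evidence rows first shows whether the two counted nodes are the intended titles before the number is used anywhere else.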
Transformation Core
A repeatable XPath check follows the same evaluation path each time:
- Parse the source as strict XML or as an HTML document tree.
- Collect available namespace declarations and merge them with any manual prefix=URI mappings.
- Resolve the requested result category, or infer a scalar category for common functions when automatic detection is used.
- Evaluate the expression with the namespace resolver attached to the parsed document.
- For node results, store rows up to the match limit with generated path, node type, node name, displayed value, and serialized evidence.
- For scalar results, store the returned string, number, or boolean as a single result row.
- Report parser, XPath, namespace, match-limit, and processing-location checks as audit evidence.
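The steps above can be sketched as a single function. This is not the tool's implementation: it is a minimal Python illustration built on xml.etree.ElementTree, with strict XML parsing only, relative paths only, scalar inference limited to a count(...) wrapper, and no generated path column. The function name, the audit keys, and the default limit are all assumptions.

```python
import io
import xml.etree.ElementTree as ET


def run_xpath_check(source, path, manual_ns=None, match_limit=50):
    """Return (rows, audit) for one expression against one XML source."""
    audit = {"parser": "ok", "xpath": "ok", "namespaces": "ok", "truncated": False}
    declared = {}
    try:
        # Collect prefix declarations; a default namespace gets no prefix here.
        for _event, (prefix, uri) in ET.iterparse(io.StringIO(source), events=("start-ns",)):
            if prefix:
                declared.setdefault(prefix, uri)
            else:
                audit["namespaces"] = "default namespace needs an explicit prefix"
        root = ET.fromstring(source)
    except ET.ParseError as exc:
        audit["parser"] = f"error: {exc}"
        return [], audit

    namespaces = {**declared, **(manual_ns or {})}  # manual mappings win

    # Infer a scalar category for the one function this sketch understands.
    wants_count = path.startswith("count(") and path.endswith(")")
    node_path = path[len("count("):-1] if wants_count else path

    try:
        nodes = root.findall(node_path, namespaces)
    except SyntaxError as exc:  # unknown prefix or unsupported syntax
        audit["xpath"] = f"error: {exc}"
        return [], audit

    if wants_count:
        return [{"type": "number", "name": "count", "value": len(nodes)}], audit

    audit["truncated"] = len(nodes) > match_limit
    rows = [
        {"type": "element", "name": node.tag, "value": (node.text or "").strip()}
        for node in nodes[:match_limit]
    ]
    return rows, audit


# Hypothetical stand-in for the sample catalog; title text is invented.
SAMPLE = """<catalog xmlns:bk="urn:books">
  <bk:book id="bk101"><bk:title>First Title</bk:title></bk:book>
  <bk:book id="bk102"><bk:title>Second Title</bk:title></bk:book>
</catalog>"""

title_rows, title_audit = run_xpath_check(SAMPLE, ".//bk:book/bk:title")
count_rows, _ = run_xpath_check(SAMPLE, "count(.//bk:book)")
_, broken_audit = run_xpath_check("<a><b></a>", ".//b")
```

Returning the audit dictionary next to the rows keeps parser, namespace, and truncation evidence attached to every result, including the empty ones.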
| Piece | Example | Review cue |
|---|---|---|
| Root path | /catalog/bk:book | Starts from the document root, so the top-level element and namespace must match. |
| Descendant search | //bk:title | Useful for exploration, but broad paths can hide duplicate or unexpected matches. |
| Attribute axis | //bk:book/@id | Returns attribute nodes, not element text, so node type review matters. |
| Predicate | //bk:book[@genre="fiction"][1] | Filters run in order, and XPath positions are one-based. |
| Scalar function | count(//bk:book) | Returns one value, so inspect the counted path when the source shape is uncertain. |
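Predicate order and one-based positions are easy to get wrong, so the sketch below emulates them with plain Python lists instead of relying on a particular engine's predicate support. It assumes a hypothetical three-book catalog in which all books share one parent, so the emulation corresponds to /catalog/bk:book[@genre="fiction"][1] and /catalog/bk:book[1][@genre="fiction"]. The helper names are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-book catalog; the first book is not fiction.
CATALOG = """<catalog xmlns:bk="urn:books">
  <bk:book id="bk101" genre="reference"/>
  <bk:book id="bk102" genre="fiction"/>
  <bk:book id="bk103" genre="fiction"/>
</catalog>"""

books = ET.fromstring(CATALOG).findall("bk:book", {"bk": "urn:books"})


def by_genre(nodes, genre):
    """Emulates the predicate [@genre="..."]."""
    return [node for node in nodes if node.get("genre") == genre]


def at_position(nodes, position):
    """Emulates a positional predicate; XPath positions start at 1."""
    return nodes[position - 1:position] if position >= 1 else []


# [@genre="fiction"][1]: filter first, then take the first survivor.
fiction_then_first = [b.get("id") for b in at_position(by_genre(books, "fiction"), 1)]

# [1][@genre="fiction"]: take the first book, then test its genre.
first_then_fiction = [b.get("id") for b in by_genre(at_position(books, 1), "fiction")]

print(fiction_then_first, first_then_fiction)
```

Swapping the two predicates changes the answer from one book to none, which is why the ledger rows deserve a look whenever a position appears in an expression.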
| Rule | Effect | Common failure cue |
|---|---|---|
| Strict XML parsing | Requires well-formed markup before any XPath result can be trusted. | Unclosed tags, duplicate attributes, or namespace errors stop the test early. |
| HTML parsing | Builds a browser-style tree and may add, close, move, or normalize elements. | A selector copied from XML may not match after HTML repair changes the tree. |
| Default namespace | Unprefixed XPath names do not inherit the document default namespace. | Namespaced elements return no matches until the expression uses a bound prefix. |
| Match limit | Node evidence rows are capped at the chosen limit, which can be set from 10 to 500 rows. | The audit reports truncation when more nodes matched than rows were kept. |
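The strict parsing rule can be observed with any well-formedness-checking parser. The sketch below uses Python's xml.etree.ElementTree as a stand-in for Strict XML mode. The function name and the test fragments are hypothetical, and the exact error text depends on the underlying parser.

```python
import xml.etree.ElementTree as ET


def strict_parse_status(source):
    """Return 'ok' for well-formed XML, otherwise the parser's error text."""
    try:
        ET.fromstring(source)
    except ET.ParseError as exc:
        return f"error: {exc}"
    return "ok"


statuses = {
    "well formed": strict_parse_status("<a><b/></a>"),
    "unclosed tag": strict_parse_status("<a><b></a>"),
    "duplicate attribute": strict_parse_status('<a id="1" id="2"/>'),
    "unbound prefix": strict_parse_status("<a><bk:b/></a>"),
}

for label, status in statuses.items():
    print(f"{label}: {status}")
```

Each of the three broken fragments stops at the parse step, so no XPath expression is evaluated against them at all.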
Worked Mechanism Path
In the sample catalog, bk=urn:books binds the bk prefix to the book namespace. The expression //bk:book/bk:title searches descendants for book elements in that namespace and then selects their title children. The node result returns two title rows with paths and text values. Changing the expression to count(//bk:book) keeps the same namespace rule but returns one numeric scalar with value 2.
Privacy Notes:
The pasted document, local file contents, XPath expression, and namespace mappings are evaluated in the current browser session. The audit includes a processing-location check, and the active XPath test does not need a backend request.
Local evaluation is still not a reason to paste secrets casually. Treat customer records, access tokens, private feeds, credentials, and unreleased markup as confidential, use a browser session you trust, and clear the source text when the test is done.
Worked Examples:
Namespaced catalog titles
With the sample loaded, //bk:book/bk:title plus bk=urn:books produces 2 node matches. Matched Nodes can show serialized title elements, while Node Match Ledger lists each title path, type, name, and text value. Removing the namespace mapping turns the same expression into a namespace problem, not a missing-title problem.
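The difference between a namespace problem and a missing-title problem shows up clearly in code. In Python's xml.etree.ElementTree, an unmapped prefix raises SyntaxError instead of quietly returning nothing; other runtimes report the same condition in their own way. The catalog and the function name below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the sample catalog; title text is invented.
SAMPLE = """<catalog xmlns:bk="urn:books">
  <bk:book id="bk101"><bk:title>First Title</bk:title></bk:book>
  <bk:book id="bk102"><bk:title>Second Title</bk:title></bk:book>
</catalog>"""


def classify(root, path, namespaces):
    """Separate 'the prefix cannot be resolved' from 'nothing matched'."""
    try:
        nodes = root.findall(path, namespaces)
    except SyntaxError as exc:
        return f"namespace or syntax problem: {exc}"
    return f"{len(nodes)} node matches"


root = ET.fromstring(SAMPLE)
with_mapping = classify(root, ".//bk:book/bk:title", {"bk": "urn:books"})
without_mapping = classify(root, ".//bk:book/bk:title", {})

print(with_mapping)
print(without_mapping)
```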
Counting records before an import
For a feed with the sample shape, count(//bk:book) returns a Number value of 2. If the value is 0, check Namespace resolver in Expression Audit before rewriting the path, because the record element may still be present under a namespace.
Attributes instead of element text
The expression //bk:book/@id selects the book identifier attributes. The ledger should show type attribute, name id, and values such as bk101. If Matched output is set to paths, copied evidence becomes a generated path list rather than serialized node text.
Copied HTML with repaired structure
A table fragment copied from a web page may fail in Strict XML. Switching Parser mode to HTML document lets the browser build the tree used for page markup. Inspect the ledger afterward because repaired rows, cells, and text nodes can sit in a different place than the raw fragment suggests.
FAQ:
Why does a valid-looking XPath return no matches?
Check namespaces, parser mode, spelling, case, and predicate position. Elements in an XML default namespace need an explicit XPath prefix, because unprefixed XPath names do not inherit that namespace. Expression Audit flags missing namespace mappings, parser errors, and empty matches.
Should XML and HTML use the same parser mode?
Use Strict XML for XML that must be well formed. Use HTML document for copied page markup where the browser may close tags, move nodes, or normalize the tree before XPath evaluation.
What does Auto detect do for result type?
Auto detect treats common scalar functions such as count(), string(), number(), and boolean() as scalar results. Other expressions normally return a node snapshot unless you force another result type.
Why inspect nodes before trusting a scalar value?
Scalar functions return one value, so they can hide repeated records or a path that matched the wrong branch. Test the underlying expression as a Node snapshot, then switch to the scalar result type.
Can the runtime snippets be used without changes?
Runtime Snippet is a starting point. Keep the parser choice, namespace mappings, and result category consistent in the target environment, then adapt library-specific constants or registration calls as needed.
Glossary:
- XPath
- A query language for selecting nodes or scalar values from an XML-style document tree.
- Location path
- A sequence of XPath steps that moves from one context in the tree to selected nodes.
- Axis
- The relationship used by a path step, such as child, descendant, attribute, parent, or self.
- Predicate
- A bracketed filter that narrows a node set by position, attribute, text, or another condition.
- Namespace prefix
- A short expression prefix that maps to a namespace URI used by the source document.
- Node snapshot
- A stored list of matched nodes, useful for reviewing paths, node names, types, and values.
- Scalar result
- A single string, number, or boolean returned by an XPath expression.
References:
- XML Path Language (XPath) Version 1.0, W3C, 16 November 1999.
- Document Object Model XPath, W3C, 26 February 2004.
- Namespaces in XML 1.0 (Third Edition), W3C, 8 December 2009.
- HTML Standard: DOMParser, WHATWG.
- How to use XPath selectors in Scrapy, Simplified Guide.