.NET and XPath

So I’m working on an XPath presentation for my team at work, and I was trying to hack up a sample using some of the more interesting XPath functions, like string-join. PHP’s DOMXPath throws a fit when I use this function, so I cracked open MSDN and saw that XPathNavigator in the 2.0 framework claims to support “the XQuery 1.0 and XPath 2.0 Data Model[s].” Nifty, huh? Especially since string-join is defined in those specs. (Note that this table claims it is available in XPath 1.0. Apparently nobody bothered to check the XPath 1.0 specification, which does not mention it at all.)

PHP’s implementation must be broken then. Off I go and code a Winforms project that I can use to run my example. Right? Yeah, right…

For the sake of simplicity, I coded a small CLI program that will run an XPath query against an empty document:

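using System;
using System.Xml;
using System.Xml.XPath;

public class XPathCLI {
    public static void Main(string[] args) {
        // Require exactly one argument: the XPath expression to evaluate.
        if (args.Length != 1) {
            Console.Error.WriteLine("usage: XPathCLI.exe <xpath-expression>");
            return;
        }

        // An empty document is enough here; there are no nodes to select.
        XmlDocument doc = new XmlDocument();

        // Evaluate returns a string, double, bool, or XPathNodeIterator.
        XPathNavigator nav = doc.CreateNavigator();
        Console.WriteLine(nav.Evaluate(args[0]).ToString());
    }
}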

Now let’s make sure it’s working:

$ ./XPathCLI.exe 'concat("hello ", "world")'
hello world

Looks good. Now let’s try the examples listed under string-join:

$ ./XPathCLI.exe "string-join({'Now', 'is', 'the', 'time', '...'}, \" \")"

Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
  at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
  at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]

$ ./XPathCLI.exe "string-join({abra, cadabra}, \"\")"

Unhandled Exception: System.Xml.XPath.XPathException: invalid token: '{'
  at Mono.Xml.XPath.Tokenizer.ParseToken () [0x00000]
  at Mono.Xml.XPath.Tokenizer.advance () [0x00000]
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]

$ ./XPathCLI.exe 'string-join((), "separator")'

Unhandled Exception: System.Xml.XPath.XPathException: Error during parse of string-join((), "separator") ---> Mono.Xml.XPath.yyParser.yyException: irrecoverable syntax error
  at Mono.Xml.XPath.XPathParser.yyparse (yyInput yyLex) [0x00000]
  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000] --- End of inner exception stack trace ---

  at Mono.Xml.XPath.XPathParser.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathExpression.Compile (System.String xpath, IXmlNamespaceResolver nsmgr, IStaticXsltContext ctx) [0x00000]
  at System.Xml.XPath.XPathExpression.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathNavigator.Compile (System.String xpath) [0x00000]
  at System.Xml.XPath.XPathNavigator.Evaluate (System.String xpath) [0x00000]
  at XPathCLI.Main (System.String[] args) [0x00000]

Ok, that didn’t go too well. Apparently Mono doesn’t like some of the syntax. Let’s use a node-selecting expression instead:

$ ./XPathCLI.exe 'string-join(//something, "separator")'

Unhandled Exception: System.Xml.XPath.XPathException: function string-join not found
  at System.Xml.XPath.ExprFunctionCall.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]
  at System.Xml.XPath.CompiledExpression.Evaluate (System.Xml.XPath.BaseIterator iter) [0x00000]

Uh… ok. Let’s start over on MS.NET. It must be a Mono bug, right?

>XPathCLI.exe "string-join({'Now', 'is' 'the', 'time', '...'}, \" \")"

Unhandled Exception: System.Xml.XPath.XPathException: 'string-join({'Now', 'is''the', 'time', '...'}, " ")' has an invalid token.
   at MS.Internal.Xml.XPath.XPathScanner.NextLex()
   at MS.Internal.Xml.XPath.XPathParser.ParseMethod(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParsePrimaryExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseFilterExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParsePathExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseUnionExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseUnaryExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseMultiplicativeExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseAdditiveExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseRelationalExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseEqualityExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseAndExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseOrExpr(AstNode qyInput)
   at MS.Internal.Xml.XPath.XPathParser.ParseXPathExpresion(String xpathExpresion)
   at System.Xml.XPath.XPathExpression.Compile(String xpath, IXmlNamespaceResolver nsResolver)
   at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
   at XPathCLI.Main(String[] args)

Let’s jump straight to the one that made it past Mono’s parser to crash in the evaluator:

>XPathCLI.exe "string-join(//something, \"separator\")"

Unhandled Exception: System.Xml.XPath.XPathException: Namespace Manager or XsltContext needed. This query has a prefix, variable, or user-defined function.
   at MS.Internal.Xml.XPath.CompiledXpathExpr.get_QueryTree()
   at System.Xml.XPath.XPathNavigator.Evaluate(XPathExpression expr, XPathNodeIterator context)
   at System.Xml.XPath.XPathNavigator.Evaluate(String xpath)
   at XPathCLI.Main(String[] args)
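
The MS.NET message hints at an escape hatch: an unknown function name is treated as a user-defined function and looked up through an XsltContext. The sketch below shows roughly what supplying one might look like. StringJoinFunction, StringJoinContext, and StringJoinDemo are hypothetical names made up for this example, the sample document is invented, and the function only handles a node-set plus a string separator. It does not turn the engine into an XPath 2.0 processor (sequence literals will still fail to parse), and I have not tried it on Mono.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical function object: joins the string values of a node-set.
class StringJoinFunction : IXsltContextFunction {
    public int Minargs { get { return 2; } }
    public int Maxargs { get { return 2; } }
    public XPathResultType ReturnType { get { return XPathResultType.String; } }
    public XPathResultType[] ArgTypes {
        get { return new XPathResultType[] { XPathResultType.NodeSet, XPathResultType.String }; }
    }

    public object Invoke(XsltContext context, object[] args, XPathNavigator docContext) {
        // Assumption: the first argument is a node-set and the second a string.
        XPathNodeIterator nodes = (XPathNodeIterator)args[0];
        List<string> values = new List<string>();
        while (nodes.MoveNext()) {
            values.Add(nodes.Current.Value);
        }
        return string.Join((string)args[1], values.ToArray());
    }
}

// Hypothetical context that resolves only the unprefixed name "string-join".
class StringJoinContext : XsltContext {
    public override bool Whitespace { get { return true; } }
    public override bool PreserveWhitespace(XPathNavigator node) { return true; }
    public override int CompareDocument(string baseUri, string nextBaseUri) { return 0; }
    public override IXsltContextVariable ResolveVariable(string prefix, string name) { return null; }

    public override IXsltContextFunction ResolveFunction(string prefix, string name, XPathResultType[] argTypes) {
        if (string.IsNullOrEmpty(prefix) && name == "string-join") {
            return new StringJoinFunction();
        }
        return null; // anything else stays unresolved
    }
}

public class StringJoinDemo {
    // Compiles the expression, attaches the custom context, and evaluates it.
    public static object Evaluate(XmlDocument doc, string xpath) {
        XPathExpression expr = XPathExpression.Compile(xpath);
        expr.SetContext(new StringJoinContext());
        return doc.CreateNavigator().Evaluate(expr);
    }

    public static void Main(string[] args) {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<list><item>Now</item><item>is</item><item>the</item><item>time</item></list>");
        Console.WriteLine(Evaluate(doc, "string-join(//item, ' ')"));
    }
}
```

Note that this is bolting a hand-written function onto an XPath 1.0 engine, which is a long way from the framework supporting string-join itself.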

From these runs we can draw a few conclusions:

  • Mono, MS.NET, and PHP do not support XPath 2.0. I cannot find any PHP documentation that claims a specific version of XPath support, but, as noted in the intro paragraph, MSDN claims XPath 2.0 support and MS.NET does not deliver. (Mono may be following the MS.NET implementation instead of the spec, so whether this is a Mono bug or not is debatable.)
  • Mono, MS.NET, and PHP do not support the {...} construct, which is present in the XPath 2.0 “Precedence Order” section but not actually defined elsewhere. This construct is not present at all in the XPath 1.0 specification. Whether this is a specification or implementation defect is left as an open question.
  • Mono, MS.NET, and PHP do not implement the string-join function defined in at least XPath 2.0.

And from those conclusions we can draw a few more.

  • Nobody gives a whip about following the XPath specification.
  • The XPath specification is broken. Or confusing. Or (more likely) both.

The real question, then, is whether people skip the XPath 2.0 specification because they don’t want to implement it, or because parts of it make no sense. It seems odd to me that an implementation would support concat and not string-join, especially since they are defined right next to each other.

In any case, if you’re not implementing all of it, don’t claim that you do. Incorrect documentation is worse than no documentation.
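
For anyone who just needs the joined string and doesn’t care which layer produces it, a plain XPath 1.0 selection plus a join in C# gets the same result. This is only a minimal sketch of that workaround: JoinWorkaround and JoinNodes are hypothetical names, and the sample document is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public class JoinWorkaround {
    // Hypothetical helper: selects nodes with XPath 1.0 and joins their
    // string values in C# instead of inside the XPath expression.
    public static string JoinNodes(XPathNavigator nav, string xpath, string separator) {
        List<string> values = new List<string>();
        XPathNodeIterator it = nav.Select(xpath);
        while (it.MoveNext()) {
            values.Add(it.Current.Value);
        }
        return string.Join(separator, values.ToArray());
    }

    public static void Main(string[] args) {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<list><item>abra</item><item>cadabra</item></list>");
        // prints "abracadabra"
        Console.WriteLine(JoinNodes(doc.CreateNavigator(), "//item", ""));
    }
}
```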


10 Replies to “.NET and XPath”

  1. The fact is very simple: XPath 2.0 is not supported by any of them. Anything that is only in 2.0 must be regarded as an invalid expression. So, every implementation mentioned here is correct.

    The MSDN documentation does not say that it supports XPath 2.0 or XQuery 1.0 (especially functions and operators).

    The XQuery 1.0 / XPath 2.0 F&O *working draft* is of course wrong. You cannot find the corresponding part in the recommendation.

    And yes, I don’t like XQuery and XPath 2.0 which are part of pro-xsd family.

  2. What does this mean then?

    “The XPathNavigator class in the System.Xml.XPath namespace is an abstract class which defines a cursor model for navigating and editing XML information items as instances of the XQuery 1.0 and XPath 2.0 Data Model.”

  3. Stumbled across this post searching for some XPath 2.0 related material. Just to clarify, Microsoft at one point planned to support XQuery 1.0, which is built on top of XPath 2.0, and in fact they provided support for earlier versions of the specifications inside some of the earlier CTPs for the .NET 2.0 framework. It’s likely you stumbled upon documentation that hadn’t been updated to match the fact that they removed that support before the final release of .NET 2.0.

    As a matter of interest, Atsushi started work on both XPath 2.0 and XQuery, but when MSFT removed their implementation he removed it and placed it in the Mono.Xml.Ext namespace: http://www.mono-project.com/XML#Mono.Xml.Ext

  4. I’m a bit behind on this discussion, but I’ll comment anyway. As I recall, the data model is not the language. There may very well be support for the XQuery 1.0/XPath 2.0 Data Model in the .NET Framework, but there is support for compiling and executing XQuery statements. I’ve been struggling with XQuery in .NET for a few weeks now. I’ve used saxon and it works quite well, but I do wish there were native framework support for it.

  5. That should be: …but there is *no* support for compiling and executing XQuery statements.

    I’ve read elsewhere that Microsoft cut XQuery in favor of XSL 2.0 due to customer demand.
