E4X: I want my S-expressions back

This page is a mirrored copy of an article originally posted on the (now sadly defunct) LShift blog; see the archive index here.

Sat, 24 June 2006

E4X is a new ECMA standard (ECMA-357) specifying an extension to ECMAScript for streamlining work with XML documents.

It adds objects representing XML to ECMAScript, and extends the syntax to allow literal XML fragments to appear in code. It also supports a very XPath-like notation for use in extracting data from XML objects. So far, so good - all these things are somewhat useful. However, there are serious problems with the design of the extension.

If E4X objects were real objects, if there were a means of splicing a sequence of child nodes into XML literal syntax, and if E4X supported XML namespace prefixes properly, most of my objections would be dealt with. As it stands, the overall verdict is “clunky at best”.

These are my main complaints:

It doesn’t do anything like Scheme’s unquote-splicing, and so using E4X to produce XML objects is verbose, error-prone and dangerous in concurrent settings.

There seems to be no way of splicing in a sequence of items - I’d like to do something like the following:

function buildItems() {
  return [<item>Hello</item>,
          <item>World!</item>];
}
var doc = <mydocument>{buildItems()}</mydocument>;

and have doc contain

<mydocument>
  <item>Hello</item>
  <item>World!</item>
</mydocument>

What actually results is the more-or-less useless

<mydocument>Hello,World!</mydocument>

The closest I can get to the result I’m after is

function buildItems(n) {
  n.mydocument += <item>Hello</item>;
  n.mydocument += <item>World!</item>;
}
var doc = <mydocument></mydocument>;
buildItems(doc);

It’s full of redundant redundancy - it’s as verbose as XML, when you can do so much better.
There’s no toXML() method (or similar) for use in papering over the yawning chasm between the XML objects and the rest of the language: you can’t even make a Javascript object able to seamlessly render itself to XML.
The new types E4X introduces aren’t even proper objects - they’re a whole new class of primitive datum!
Because they’re not proper objects, you can’t extend the system. You ought to be able to implement an interface on your own objects and benefit from the language’s XPath-like searching and filtering operations. E4X is so close to offering a comprehension facility for Javascript, but it’s been short-sightedly restricted to a single class of non-extensible primitives.
You can’t even construct XML tags programmatically! If the name of the tag doesn’t appear literally in your Javascript code, you’re out of luck (unless you resort to eval…) [[Update: I was wrong about this - you can write <{expr}> and have the result of evaluating expr substituted into the tag.]]
E4X XML objects have no notion of namespace prefixes (which are required for quality implementations of XPath and anything to do with XML signatures). Prefixes only turn up in the API as a means of producing (namespaceURI,localname) pairs. This is actually how it should be, but because there’s already broken software out there that depends on prefix support, by not supporting prefixes properly you preclude ECMAScript+E4X from being used for XML signatures or ECMAScript-native XPath implementations.

In my opinion, E4X violates several programming language design principles: most importantly, those of regularity, simplicity and orthogonality, but also preservation of information, automation and structure. SXML, perhaps in combination with eager comprehensions, provides a far superior model for producing and consuming XML. Sadly, there’s no real alternative for ECMAScript yet - we’re limited either to library extensions, or to using the DOM without any syntactic or library support at all.
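To make the SXML comparison concrete, here is a minimal sketch in plain Javascript (no E4X syntax) of what an S-expression-style representation might look like. It is an illustration only: the helper names render, splice and escapeXml are hypothetical and come from no library, and the convention that an element is an array of the form [tagName, optionalAttributes, ...children] is an assumption made for this example.

```javascript
// Hypothetical SXML-style helpers; none of this is part of E4X or any library.
// Assumed convention: an element is [tagName, optionalAttrsObject, ...children].

// Escape the characters that are significant in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Marker wrapper: nodes inside splice([...]) are inlined into the parent,
// playing the role of Scheme's unquote-splicing.
function splice(nodes) {
  return { spliced: nodes };
}

// A plain object that is neither an array nor a splice marker is an attribute map.
function isAttrs(x) {
  return x !== null && typeof x === 'object' &&
         !Array.isArray(x) && !('spliced' in x);
}

// Render a node (element array, splice marker, or text) to an XML string.
function render(node) {
  if (!Array.isArray(node)) {
    if (node !== null && typeof node === 'object' && 'spliced' in node) {
      return node.spliced.map(function (n) { return render(n); }).join('');
    }
    return escapeXml(node); // anything else is treated as text
  }
  const tag = node[0];
  let attrs = {};
  let children = node.slice(1);
  if (children.length > 0 && isAttrs(children[0])) {
    attrs = children[0];
    children = children.slice(1);
  }
  const attrText = Object.keys(attrs)
    .map(function (k) { return ' ' + k + '="' + escapeXml(attrs[k]) + '"'; })
    .join('');
  const body = children.map(function (n) { return render(n); }).join('');
  return body === ''
    ? '<' + tag + attrText + '/>'
    : '<' + tag + attrText + '>' + body + '</' + tag + '>';
}

// The buildItems() example from earlier in the post, as ordinary arrays.
function buildItems() {
  return [['item', 'Hello'], ['item', 'World!']];
}

const doc = ['mydocument', splice(buildItems())];
console.log(render(doc));
// prints: <mydocument><item>Hello</item><item>World!</item></mydocument>
```

Because the document is built from ordinary arrays, the usual array operations (map, filter, concat) work on it unchanged, which is the kind of regularity the complaints above find missing in E4X.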

Comments

On 25 June, 2006 at 12:18 am, Chris Double wrote:

It’s also frustrating that you can’t take an E4X node and insert it directly into the browser DOM.

There is no ‘toHTML’, which prevents the use of E4X for dynamically generating valid HTML - you have to stick with XHTML. Unfortunately some browsers require things like ‘SCRIPT’ tags to have a closing tag rather than the shortcut method, i.e. they need <script></script>, not <script/>. E4X is incapable of handling this and is non-extensible for generating your own output methods.

As an s-expression style approach I tend to use code like this:

http://mg.to/2006/02/27/easy-dom-creation-for-jquery-and-prototype

Chris.

On 26 June, 2006 at 10:08 am, tonyg wrote:

Thanks for the link, Chris - something similar had occurred to me, too (both the SXML and Seaside HTML canvas approaches should translate to Javascript pretty well), but I haven’t had a chance to explore the design space properly yet.

As it happens, I got into experimenting with E4X through hacking a Seaside-a-like onto your server-side javascript implementation. I’ll post about that soon :-)

On 26 June, 2006 at 11:18 am, Chris Double wrote:

Nice! I look forward to seeing what you’ve done :)

On 7 March, 2008 at 11:21 am, tonyg wrote:

As it happens, I was wrong about the lack of computed tags:

js> var y = 'sometag';
js> (<{y}/>).toXMLString();
<sometag/>

On 7 March, 2008 at 11:42 am, tonyg wrote:

Second embarrassing update: If I’d used an XMLList instead of an array, my buildItems() would have worked just as I intended:

function buildItems() {
  return <><item>Hello</item>
           <item>World!</item></>;
}
var doc = <mydocument>{buildItems()}</mydocument>;

On 17 September, 2008 at 10:11 pm, Prestaul wrote:

I think that this article was a badly researched, knee-jerk reaction. This works:

function buildItems() {
  return <>
    <item>Hello</item>
    <item>World!</item>
  </>;
}
var doc = <mydocument>{buildItems()}</mydocument>;

So does this:

function buildItems() {
  var list = <></>; // or: new XMLList()
  list += <item>Hello</item>;
  list += <item>World!</item>;
  return list;
}
var doc = <mydocument>{buildItems()}</mydocument>;