A SRFI-10-style extension to JSON

This page is a mirrored copy of an article originally posted on the (now sadly defunct) LShift blog; see the archive index here.

Tue, 11 October 2005

Background

The data language JSON is a great replacement for XML for many applications. It’s very similar in spirit to Lisp and Scheme S-expressions, as well as to XML: it is a pure data language, with no intrinsic semantics.

XML doesn’t allow direct literal representation of any data types other than strings and XML nodes (to oversimplify slightly); S-expressions don’t allow direct literal representation of anything but atoms, pairs and vectors; and JSON doesn’t allow direct literal representation of anything but atoms, lists, and dictionaries.

There are currently no extensions to XML that remedy this by letting programmers extend the XML reader to support literal notation for other classes of object. For S-expressions, however, there is SRFI 10: Sharp-Comma External Form, which lets the S-expression reader be extended to construct literal objects at read time.

Reader/Writer Extensibility for JSON

I’ve implemented a SRFI-10-like construction for JSON, allowing extension of the JSON reader and writer, which makes it easier to use JSON for things like serialization or high-level messaging protocols.

You can download the modified json.js from here.

The basic extension is the addition of @constructor ... notation to the reader: whenever the JSON parser sees an “@” sign, it parses a single identifier, and then a complete JSON value (of any type, not only a dictionary) called the argument. The identifier is looked up in a table of named constructor functions, and if a match is found, the corresponding constructor function is called with the argument. If no matching constructor function is found, a parse error is signalled.
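
The reader rule described above can be sketched as a small recursive-descent parser. This is an illustration under stated assumptions, not the code from the modified json.js: the function name parseExtended is hypothetical, identifiers are assumed to match [A-Za-z_][A-Za-z0-9_]*, and strings, numbers and keywords are handed to the built-in JSON.parse. Note that the second argument of today's built-in JSON.parse is a reviver function, not a constructor table, which is why the sketch uses a separate function.

```javascript
// Sketch only: parse JSON extended with "@identifier <value>" forms.
// `constructors` maps identifier names to functions of one argument.
function parseExtended(text, constructors) {
  var pos = 0;
  function fail(msg) { throw new SyntaxError(msg + " at position " + pos); }
  function skipWs() { while (pos < text.length && /\s/.test(text[pos])) pos++; }

  function parseValue() {
    skipWs();
    var ch = text[pos];
    if (ch === "@") return parseConstructor();
    if (ch === "{") return parseObject();
    if (ch === "[") return parseArray();
    return parseAtom();
  }

  function parseConstructor() {
    pos++; // skip "@"
    var m = /^[A-Za-z_][A-Za-z0-9_]*/.exec(text.slice(pos));
    if (!m) fail("expected constructor name");
    pos += m[0].length;
    var arg = parseValue(); // the argument is any complete value
    if (!Object.prototype.hasOwnProperty.call(constructors, m[0])) {
      fail("unknown constructor " + m[0]); // no match: parse error
    }
    return constructors[m[0]](arg);
  }

  function parseAtom() {
    // Strings, numbers, true/false/null: find the token, let JSON.parse decode it.
    var m = /^(?:"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null)/
      .exec(text.slice(pos));
    if (!m) fail("unexpected token");
    pos += m[0].length;
    return JSON.parse(m[0]);
  }

  function parseArray() {
    pos++; // skip "["
    var out = [];
    skipWs();
    if (text[pos] === "]") { pos++; return out; }
    for (;;) {
      out.push(parseValue());
      skipWs();
      if (text[pos] === ",") { pos++; continue; }
      if (text[pos] === "]") { pos++; return out; }
      fail("expected ',' or ']'");
    }
  }

  function parseObject() {
    pos++; // skip "{"
    var out = {};
    skipWs();
    if (text[pos] === "}") { pos++; return out; }
    for (;;) {
      skipWs();
      if (text[pos] !== '"') fail("expected string key");
      var key = parseAtom();
      skipWs();
      if (text[pos] !== ":") fail("expected ':'");
      pos++;
      out[key] = parseValue();
      skipWs();
      if (text[pos] === ",") { pos++; continue; }
      if (text[pos] === "}") { pos++; return out; }
      fail("expected ',' or '}'");
    }
  }

  var result = parseValue();
  skipWs();
  if (pos !== text.length) fail("trailing input");
  return result;
}

// Usage: each "@point [x, y]" form is replaced by the constructor's result.
var shape = parseExtended('{"corner": @point [1, 2]}', {
  point: function (arg) { return { x: arg[0], y: arg[1] }; }
});
```

In this sketch the argument is parsed before the constructor table is consulted, so a malformed argument is reported ahead of an unknown constructor name; either order satisfies the rule as stated.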

The corresponding extension to the writer is the toJsonString method. If an object that the JSON.stringify function encounters possesses a toJsonString implementation, the method is called and its result is returned as the stringification of the object.
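
The writer side can be sketched in the same spirit. The function name stringifyExtended is hypothetical and this is not the modified json.js itself: it simply checks each object for a toJsonString method before falling back to ordinary JSON output, and it delegates strings, numbers, booleans and null to the built-in JSON.stringify.

```javascript
// Sketch only: a stringifier that defers to an object's toJsonString method.
function stringifyExtended(value) {
  var isObject = value !== null && typeof value === "object";

  // The extension hook: the object decides its own external form.
  if (isObject && typeof value.toJsonString === "function") {
    return value.toJsonString();
  }
  if (Array.isArray(value)) {
    return "[" + value.map(function (item) {
      var s = stringifyExtended(item);
      return s === undefined ? "null" : s; // mirror JSON.stringify for arrays
    }).join(",") + "]";
  }
  if (isObject) {
    var parts = [];
    for (var key in value) {
      if (!Object.prototype.hasOwnProperty.call(value, key)) continue;
      var s = stringifyExtended(value[key]);
      // Skip values with no JSON form (undefined, functions).
      if (s !== undefined) parts.push(JSON.stringify(key) + ":" + s);
    }
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value); // atoms
}

// Usage with a hypothetical proxy class that carries only an id.
function ObjectRef(id) { this.id = id; }
ObjectRef.prototype.toJsonString = function () {
  return "@dbobject " + JSON.stringify(this.id);
};

var wireText = stringifyExtended({
  action: "retrieveNextObject",
  arg: new ObjectRef(42)
});
// wireText is '{"action":"retrieveNextObject","arg":@dbobject 42}'
```

Because toJsonString returns the final text verbatim, each implementation is responsible for emitting a well-formed @constructor form; the stringifier does not validate it.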

Example

As an example, imagine an application where a client and server share some notion of a database of objects, indexable by some unique identifier. The client side uses proxies to manipulate these objects, and the proxies send JSON RPC requests to the server (perhaps using XMLHttpRequest) to query and update the database held by the server. There are two ways we could represent one of these database objects: as just its unique identifier, which requires a manual lookup on both the server and client side whenever a reference to the object is to be sent over the wire, or using the @constructor extension, in which case the object's class supplies a toJsonString method:

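MyDatabaseObject.prototype.toJsonString = function () {
  // The argument must itself be valid JSON, so encode the id rather than
  // concatenating it raw (a string id would otherwise be emitted unquoted).
  return "@dbobject " + JSON.stringify(this.id);
};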

An example message from the client to the server might be

{
  "action": "retrieveNextObject",
  "arg": @dbobject 42
}

When the server parses the message out of the HTTP request (or other transport), it calls JSON.parse with an extra argument:

var requestFromClient = JSON.parse(requestText, {
  "dbobject": function (arg) {
    return DB.lookupById(arg);
  }
});

When the parser sees the “@dbobject ...” part of the input string, it will call the constructor given to JSON.parse, which looks up and returns the object in the database by its identifier, making for a convenient, hassle-free deserialization of a complex object.
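
For comparison, the first representation mentioned in the example, sending only the bare identifier, needs nothing beyond standard JSON but pushes the lookup into every handler. The sketch below assumes a hypothetical in-memory DB object and a hypothetical handleRequest function; neither comes from the original application.

```javascript
// Hypothetical in-memory stand-in for the server-side database.
var DB = {
  objects: { "42": { id: 42, name: "example record" } },
  lookupById: function (id) {
    var key = String(id);
    if (!Object.prototype.hasOwnProperty.call(this.objects, key)) {
      throw new Error("no object with id " + id);
    }
    return this.objects[key];
  }
};

// Without the extension the message carries a plain number, and every
// handler has to remember which fields are really object references.
function handleRequest(requestText) {
  var request = JSON.parse(requestText);    // "arg" is just 42 here
  var target = DB.lookupById(request.arg);  // manual lookup at the use site
  return { action: request.action, target: target };
}

var handled = handleRequest('{"action": "retrieveNextObject", "arg": 42}');
```

With the @dbobject form, that lookup moves into the single constructor function handed to the parser, so handlers receive the database object directly.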

Comments

On 12 October, 2005 at 12:31 pm, Paul Crowley wrote:

Useful, and a need I’ve felt myself in AJAX-like applications. But shouldn’t the syntax look more like a programming language? JSON should be executable JavaScript; it shouldn’t be hard to use syntax that looks like a function call, and then the function could look up and return the so-named object.

On 17 October, 2005 at 2:05 pm, tonyg wrote:

Actually, I don’t think JSON should be executable JavaScript. I think it serves my purpose better if it’s just a data language, with only very limited code-execution ability.

Function call syntax is another option for this read-time-evaluation extension, though - it may not even be harder to parse. (The immediate attraction of the @ prefix is firstly that it’s very easy to parse, and secondly that it doesn’t look like function application, since it isn’t really function application.)