leastfixedpoint

XML CDATA and escaping

This page is a mirrored copy of an article originally posted on the (now sadly defunct) LShift blog; see the archive index here.

Thu, 25 October 2007

XML’s syntax for a CDATA section looks like this:

  <![CDATA[some text]]>

Tag syntax within a CDATA section is suspended, so this is well-formed XML:

  <![CDATA[some <more> text]]>

even though it looks like the “<more>” tag is unclosed.
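As a quick check of that claim, here is a minimal sketch using Python's standard-library xml.etree.ElementTree parser. The wrapping element name "root" is an assumption added for illustration, since a CDATA section has to sit inside some element to form a complete document.

```python
import xml.etree.ElementTree as ET

# A CDATA section must live inside an element; "root" is just an illustrative name.
doc = "<root><![CDATA[some <more> text]]></root>"

root = ET.fromstring(doc)

# The "<more>" inside the CDATA section is plain character data, not a tag,
# so the element has text content and no child elements.
text = root.text
child_count = len(root)

print(repr(text), child_count)
```

If the same text were written without the CDATA brackets, the parser would treat “<more>” as an opening tag and reject the document for the missing close tag.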

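There’s only one thing you can’t say in a CDATA section: “]]>”, because that is the sequence that ends the section. But there’s a trick to save us, even here. To print an arbitrary string in a CDATA enclosure, replace each instance of “]]>” with “]]]]><![CDATA[>”, and then put the normal “<![CDATA[” and “]]>” brackets around it. The replacement ends the current section just after the “]]” and starts a new one that begins with “>”, so the terminator never appears whole inside any single section: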

  my ]]> text

becomes

  <![CDATA[my ]]]]><![CDATA[> text]]>
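The replacement rule above can be sketched in a few lines of Python. The function names cdata_wrap and round_trip and the wrapping element "root" are hypothetical, chosen only for this illustration. The sketch also assumes the input contains only characters that XML permits at all, since CDATA cannot rescue, for example, most control characters.

```python
import xml.etree.ElementTree as ET


def cdata_wrap(s: str) -> str:
    """Wrap s in CDATA brackets, splitting every "]]>" across two sections."""
    # Each "]]>" becomes "]]" + end-of-section + start-of-section + ">".
    return "<![CDATA[" + s.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def round_trip(s: str) -> str:
    """Parse the wrapped string back and return the recovered text."""
    # "root" is an illustrative wrapper element so the parser sees a full document.
    root = ET.fromstring("<root>" + cdata_wrap(s) + "</root>")
    return root.text


wrapped = cdata_wrap("my ]]> text")
print(wrapped)
print(round_trip("my ]]> text"))
```

The parser hands the adjacent CDATA sections back as one run of character data, so the original string comes out unchanged on the other side.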

Comments

On 27 October, 2007 at 4:16 am, Chui Tey wrote:

I’m (pleasantly) surprised that Google Feed Reader actually formatted your post correctly!

On 26 June, 2010 at 7:42 pm, Javier wrote:

thanks.