
Convert XML to JSON using XSLT

With the increasing use of separate services operating on the same data, the need for portable data formats arose. XML was one of the first to be widely used, but lately JSON has been booming.

I don’t have a particular bias here; both serve well in the appropriate environment, although JSON carrying the same data can be roughly 20% smaller.
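The saving depends heavily on the shape of the data, so it is worth measuring on your own documents. Below is a rough sketch of such a measurement. The sample record and the function name sizeReduction are made up for illustration, and the JSON is written compactly by hand rather than produced by the stylesheet in this post.

```javascript
// Hypothetical sample: the same record written as XML and as compact JSON.
var xmlSample = '<person><name>Ada</name><city>London</city></person>';
var jsonSample = '{"person":{"name":"Ada","city":"London"}}';

// Returns how much smaller the JSON text is, as a percentage of the XML size,
// measured in UTF-8 bytes. A negative value means the JSON is larger.
function sizeReduction(xmlText, jsonText) {
    var encoder = new TextEncoder();
    var xmlBytes = encoder.encode(xmlText).length;
    var jsonBytes = encoder.encode(jsonText).length;
    return (1 - jsonBytes / xmlBytes) * 100;
}

console.log(sizeReduction(xmlSample, jsonSample).toFixed(1) + "% smaller");
```

Attribute-heavy XML or pretty-printed JSON can shift the number a lot in either direction.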

So can they be interchangeable? Just recently, I needed to convert XML data into a JSON format for easier consumption on the client.

The fastest and (sometimes) easiest way to process XML to another format is XSLT.

The full XSLT code is at the bottom of this post and on GitHub.

Performing the transformation

Using XSLT to transform XML to another format is pretty easy, as it’s meant to be. :)

Depending on the environment you are working in, there are different ways to perform the transformation, so here are some.

What’s important to note here is that the same XSLT is used in all of these methods.

Specifying the stylesheet in XML

The easiest would be to just add a stylesheet reference to your XML document, like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="xml2json.xsl"?>
<!-- ... XML ... -->

Just open such an XML file in a browser, and the JSON will be there. Here’s an example page: you will see JSON when you open it, but if you view the source code, you will see only XML. The browser is applying the transformation.

Use an executable to transform XML to JSON

Microsoft provides msxsl.exe, a free command line utility to perform transformations, but it works only with MSXML 4.0 libraries. So it’s not really usable on Windows 7, for example.

I created a similar, but .NET-based, command line utility: here is xsltr.exe, which you can download.

C# code excerpt

It boils down to this…

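// requires: using System.Xml; using System.Xml.XPath; using System.Xml.Xsl;
XPathDocument doc = new XPathDocument(xmlfile);
// true enables debugging of the stylesheet
XslCompiledTransform transform = new XslCompiledTransform(true);
transform.Load(xslfile);
// the using block flushes and closes the output file
using (XmlTextWriter writer = new XmlTextWriter(outfile, null))
{
    transform.Transform(doc, null, writer);
}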

Compiling the XSLT

But if you need the performance, here is a command line utility together with the compiled XSLT.

I used xsltc.exe to create a compiled XSLT from the source code. It compiles the XSLT code to an IL assembly, which performs the transformation much faster.

Transform XML to JSON in a browser using JavaScript

To work with XML, DOMParser can be used in all the modern browsers – Firefox, Opera, Chrome, Safari… Of course, Internet Explorer has its own Microsoft.XMLDOM class.

Here’s a demo page that performs the transformation. There are a couple of XML files that you can transform, but you can also enter arbitrary XML and transform it.

If you prefer to work with libraries, I tried jsxml and it worked flawlessly.

The pure JavaScript code boils down to these pieces.

Load a string into an XML DOM

JavaScript code excerpt

// code for regular browsers
if (window.DOMParser) {
    var parser = new DOMParser();
    demo.xml = parser.parseFromString(xmlString, "application/xml");
}
// code for IE
if (window.ActiveXObject) {
    demo.xml = new ActiveXObject("Microsoft.XMLDOM");
    demo.xml.async = false;
    demo.xml.loadXML(xmlString);
}
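When the XML comes from user input, as on the demo page, it may be malformed. As far as I know, DOMParser does not throw in that case but returns a document that contains a parsererror element, so a small check like the one below can catch the problem before the XSLT is applied. The helper name hasParseError is my own invention, and the check covers only the DOMParser branch, not the ActiveX one.

```javascript
// Hypothetical helper: reports whether a document returned by
// DOMParser.parseFromString() describes a parse error instead of real XML.
function hasParseError(xmlDoc) {
    return xmlDoc.getElementsByTagName("parsererror").length > 0;
}

// Usage in a browser (assumes xmlString holds the XML text):
// var doc = new DOMParser().parseFromString(xmlString, "application/xml");
// if (hasParseError(doc)) { alert("The XML is not well-formed."); }
```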

Apply the XSLT

JavaScript code excerpt

// code for regular browsers
if (document.implementation && document.implementation.createDocument)
{
    var xsltProcessor = new XSLTProcessor();
    xsltProcessor.importStylesheet(demo.xslt);
    result = xsltProcessor.transformToFragment(demo.xml, document);
}
else if (window.ActiveXObject) {
    // code for IE
    result = demo.xml.transformNode(demo.xslt);
}

You can see this in action on the demo page.
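Note that the two branches return different things: transformToFragment gives a DocumentFragment, while transformNode gives a string. To end up with a JavaScript object either way, something along these lines should work. The name resultToObject is a hypothetical helper, not part of the demo page.

```javascript
// Hypothetical helper: turns the result of either branch into an object.
function resultToObject(result) {
    // IE's transformNode() returns a string; transformToFragment() returns
    // a DocumentFragment whose textContent holds the generated JSON text.
    var text = (typeof result === "string") ? result : result.textContent;
    return JSON.parse(text);
}
```

JSON.parse throws a SyntaxError when the text is not valid JSON. With this stylesheet that can happen if a text value contains a double quote, since values are written out without escaping.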

XSLT Code

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="utf-8"/>

    <xsl:template match="/*[node()]">
        <xsl:text>{</xsl:text>
        <xsl:apply-templates select="." mode="detect" />
        <xsl:text>}</xsl:text>
    </xsl:template>

    <xsl:template match="*" mode="detect">
        <xsl:choose>
            <xsl:when test="name(preceding-sibling::*[1]) = name(current()) and name(following-sibling::*[1]) != name(current())">
                    <xsl:apply-templates select="." mode="obj-content" />
                <xsl:text>]</xsl:text>
                <xsl:if test="count(following-sibling::*[name() != name(current())]) &gt; 0">, </xsl:if>
            </xsl:when>
            <xsl:when test="name(preceding-sibling::*[1]) = name(current())">
                    <xsl:apply-templates select="." mode="obj-content" />
                    <xsl:if test="name(following-sibling::*) = name(current())">, </xsl:if>
            </xsl:when>
            <xsl:when test="following-sibling::*[1][name() = name(current())]">
                <xsl:text>"</xsl:text><xsl:value-of select="name()"/><xsl:text>" : [</xsl:text>
                    <xsl:apply-templates select="." mode="obj-content" /><xsl:text>, </xsl:text>
            </xsl:when>
            <xsl:when test="count(./child::*) > 0 or count(@*) > 0">
                <xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : <xsl:apply-templates select="." mode="obj-content" />
                <xsl:if test="count(following-sibling::*) &gt; 0">, </xsl:if>
            </xsl:when>
            <xsl:when test="count(./child::*) = 0">
                <xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:apply-templates select="."/><xsl:text>"</xsl:text>
                <xsl:if test="count(following-sibling::*) &gt; 0">, </xsl:if>
            </xsl:when>
        </xsl:choose>
    </xsl:template>

    <xsl:template match="*" mode="obj-content">
        <xsl:text>{</xsl:text>
            <xsl:apply-templates select="@*" mode="attr" />
            <xsl:if test="count(@*) &gt; 0 and (count(child::*) &gt; 0 or text())">, </xsl:if>
            <xsl:apply-templates select="./*" mode="detect" />
            <xsl:if test="count(child::*) = 0 and text() and not(@*)">
                <xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:value-of select="text()"/><xsl:text>"</xsl:text>
            </xsl:if>
            <xsl:if test="count(child::*) = 0 and text() and @*">
                <xsl:text>"text" : "</xsl:text><xsl:value-of select="text()"/><xsl:text>"</xsl:text>
            </xsl:if>
        <xsl:text>}</xsl:text>
        <xsl:if test="position() &lt; last()">, </xsl:if>
    </xsl:template>

    <xsl:template match="@*" mode="attr">
        <xsl:text>"</xsl:text><xsl:value-of select="name()"/>" : "<xsl:value-of select="."/><xsl:text>"</xsl:text>
        <xsl:if test="position() &lt; last()">,</xsl:if>
    </xsl:template>

    <xsl:template match="node/@TEXT | text()" name="removeBreaks">
        <xsl:param name="pText" select="normalize-space(.)"/>
        <xsl:choose>
            <xsl:when test="not(contains($pText, '&#xA;'))"><xsl:copy-of select="$pText"/></xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="concat(substring-before($pText, '&#xD;&#xA;'), ' ')"/>
                <xsl:call-template name="removeBreaks">
                    <xsl:with-param name="pText" select="substring-after($pText, '&#xD;&#xA;')"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>

The XSLT code turned out to be more complicated than I thought. I imagined that the transformation would be more natural and not so case-based, but it just isn’t possible (or I don’t see a way).
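To make the cases more concrete, here is a small example traced by hand through the templates above. It was not run through an XSLT processor, so treat the expected output as my reading of the stylesheet rather than a verified result, and the sample document is made up. Repeated sibling elements become an array, attributes become properties, and the text of an element that has attributes ends up under a "text" key.

```javascript
// Made-up input document.
var sampleXml =
    '<root>' +
        '<item id="1">a</item>' +
        '<item id="2">b</item>' +
        '<name>x</name>' +
    '</root>';

// Output expected from the stylesheet for sampleXml (hand-traced, see above).
var expectedJson =
    '{"root" : {"item" : [{"id" : "1", "text" : "a"}, ' +
    '{"id" : "2", "text" : "b"}], "name" : "x"}}';

var data = JSON.parse(expectedJson);
console.log(data.root.item.length);  // prints 2
console.log(data.root.item[1].text); // prints b
console.log(data.root.name);         // prints x
```

All values come out as strings, because the stylesheet wraps every value in quotes. The id above is "1", not the number 1.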


Bojan Bjelić

Working hard on a bunch of stuff, positive future above all. I blog mostly about software, productivity and the digital world.


17 Responses to “Convert XML to JSON using XSLT”

  1. raju says:

    can we apply xsl to a text file to get json response

  2. Edward says:

    Thank you for this template, this is the best XML to JSON XSLT implementation I’ve come across!

  3. Jean-Philippe Martin says:

    Great script !
    I made a modification : On the line 48 :
    -
    +

    Because my xml contained carriage returns in the CDATA : ex:

    • Jean-Philippe Martin says:

      Oups my code got stripped !
      I added a normalize-space around text() on line 48.

      My CDATA was :
      <![CDATA[ObjDesCadres_Type3]]>

      • Hey Jean-Philippe, glad to see you found good use of it. The removeBreaks template should be taking care of the line breaks…
        If you send me the XML you’re processing to my email, I’ll take a look.
        Best, Bojan

  4. Michael B. says:

    Thank you for this Bojan, but it looks like your dropbox file or folder has disappeared. I have a lot of xml files that I’d like to convert to json (on the command line), and the .NET utility that you refer to cannot be found. ie, from this, above:

    I created a similar, but .NET based command line utility and, here is xsltr.exe that you can download.

    Or am I missing something? Thanks!

    ( As far as I can tell yours would be the only command line utility “out there”… )

  5. Mihir says:

    I am totally new to this, and thus don’t know how to fix this when my XML contains some text: including our “Ready4Retail” standard: a 90-plus-point inspection. Any ideas?
    It’s not parsing properly and gives an error:
    Error: Parse error on line 1:
    …ards including our “Ready4Retail” standa
    ———————–^
    Expecting ‘EOF’, ‘}’, ‘:’, ‘,’, ‘]’

    Thanks

  6. Vivek says:

    Bojan,

    This script is awesome! Thanks for publishing it.

    Is there a way to strip the root node from the JSON output? Appreciate your input on this.

    • Hi Vivek,
      You’re very welcome.
      Didn’t try this, but to skip the root, the solution should be to select whatever node comes first under the root. To achieve this, the value of the match="/*[node()]" attribute on line 5 should be replaced by another XPath. Let me know if this works out.
      Cheers, Bojan

  7. cuq says:

    Thanks! a LOT!

  8. henry says:

    this is great
