XML format

Easy Data Transform can input from and output to XML format files. The default file extension is ".xml".

 

XML (Extensible Markup Language) format is commonly used for exchanging data between programs.

 

For example:

 

[Image: convert to XML]

 

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

  <record>

    <CategoryID>1</CategoryID>

    <CategoryName>Beverages</CategoryName>

    <Description>Soft drinks, coffees &amp; teas</Description>

    <In-stock>true</In-stock>

  </record>

  <record>

    <CategoryID>2</CategoryID>

    <CategoryName>Condiments</CategoryName>

    <Description>Sweet and savory sauces</Description>

    <In-stock>false</In-stock>

  </record>

  <record>

    <CategoryID>3</CategoryID>

    <CategoryName>Confections</CategoryName>

    <Description>Candies and sweet breads</Description>

    <In-stock>true</In-stock>

  </record>

</root>
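
As a rough illustration of the mapping above (not Easy Data Transform's actual implementation), the following Python sketch builds the same XML from a header and a list of rows using the standard library. The `rows_to_xml` helper is hypothetical, and the `root` and `record` tag names are simply taken from the example output.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(header, rows, root_tag="root", record_tag="record"):
    """Build one record element per row, with one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, record_tag)
        for name, value in zip(header, row):
            # ElementTree escapes special characters such as '&' when serializing.
            ET.SubElement(record, name).text = value
    return root

header = ["CategoryID", "CategoryName", "Description", "In-stock"]
rows = [
    ["1", "Beverages", "Soft drinks, coffees & teas", "true"],
    ["2", "Condiments", "Sweet and savory sauces", "false"],
    ["3", "Confections", "Candies and sweet breads", "true"],
]

root = rows_to_xml(header, rows)
ET.indent(root)  # pretty-print; requires Python 3.9 or later
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Note that the `&` in the first description is written as `&amp;` in the XML text, as in the example above.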

 

The dot ('.') character is used in the column header to show nesting. For example:

[Image: unflatten-table]

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

  <record>

    <name>Avocado Dip</name>

    <carb>2</carb>

    <cholesterol>5</cholesterol>

    <fiber>0</fiber>

    <protein>1</protein>

    <sodium>210</sodium>

    <minerals>

      <ca>0</ca>

      <fe>0</fe>

    </minerals>

    <vitamins>

      <a>0</a>

      <c>0</c>

    </vitamins>

  </record>

</root>

 

Note that the columns may be input in a different order from the nodes in the XML. You can use the Reorder Cols or Stack transforms to change the ordering. However, during XML output, nodes with no children are always ordered before nodes with children, regardless of the column order.
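
The dotted-header convention and the ordering rule can be sketched in Python as follows. This is only an approximation under stated assumptions: the column headers (`minerals.ca`, `vitamins.a` and so on) are inferred from the XML example, and the `add_path`, `leaves_first` and `row_to_record` helpers are hypothetical names, not part of Easy Data Transform.

```python
import xml.etree.ElementTree as ET

def add_path(parent, dotted_name, value):
    """Create (or reuse) nested elements for a dotted column name, then set the leaf text."""
    node = parent
    for part in dotted_name.split("."):
        child = next((c for c in node if c.tag == part), None)
        if child is None:
            child = ET.SubElement(node, part)
        node = child
    node.text = value

def leaves_first(element):
    """Recursively reorder children so that elements without children come first.
    sorted() is stable, so the original column order is otherwise preserved."""
    element[:] = sorted(element, key=lambda child: len(child) > 0)
    for child in element:
        leaves_first(child)

def row_to_record(header, row, record_tag="record"):
    record = ET.Element(record_tag)
    for name, value in zip(header, row):
        add_path(record, name, value)
    leaves_first(record)
    return record

# The nested columns are deliberately listed first to show the reordering.
header = ["minerals.ca", "minerals.fe", "name", "carb", "vitamins.a", "vitamins.c"]
row = ["0", "0", "Avocado Dip", "2", "0", "0"]

record = row_to_record(header, row)
ET.indent(record)  # requires Python 3.9 or later
print(ET.tostring(record, encoding="unicode"))
```

In this sketch `name` and `carb` are written before `minerals` and `vitamins`, even though the `minerals` columns come first in the header.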

 

Any dots in XML element names are converted to hyphens ('-') on input.
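
A minimal Python sketch of why this conversion is needed on input: once nesting is expressed with dots in column names, a dot inside an element name would be ambiguous. The `flatten` helper and the `serving.size` element are hypothetical, chosen only for illustration, and the sketch ignores attributes and repeated elements.

```python
import xml.etree.ElementTree as ET

def flatten(element, prefix=""):
    """Return {column_name: text} for the leaf elements below `element`."""
    columns = {}
    for child in element:
        # Replace dots in the element name so they cannot be confused
        # with the dot used as the nesting separator.
        name = prefix + child.tag.replace(".", "-")
        if len(child):  # has children: recurse with a dotted prefix
            columns.update(flatten(child, name + "."))
        else:
            columns[name] = child.text or ""
    return columns

xml_text = """<root>
  <record>
    <name>Avocado Dip</name>
    <serving.size>30</serving.size>
    <minerals><ca>0</ca><fe>0</fe></minerals>
  </record>
</root>"""

rows = [flatten(record) for record in ET.fromstring(xml_text)]
print(rows)
```

Here the `serving.size` element becomes the column `serving-size`, while the `ca` element nested inside `minerals` becomes the column `minerals.ca`.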

 

The underscore ('_') character is used at the start of a column header name to identify it as an XML attribute. For example:

[Image: unflatten-table-attributes]

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

  <record carb="2" cholesterol="5" fiber="0" name="Avocado Dip" protein="1" sodium="210">

    <minerals>

      <ca>0</ca>

      <fe>0</fe>

    </minerals>

    <vitamins>

      <a>0</a>

      <c>0</c>

    </vitamins>

  </record>

</root>
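
The underscore convention can be sketched in Python along the same lines. This assumes column headers such as `_name` and `_carb` for the attributes of each record (inferred from the XML above) and handles record-level attributes only; `row_to_record` is a hypothetical helper, not Easy Data Transform's implementation.

```python
import xml.etree.ElementTree as ET

def row_to_record(header, row, record_tag="record"):
    record = ET.Element(record_tag)
    for name, value in zip(header, row):
        if name.startswith("_"):
            # Leading underscore: store the value as an attribute,
            # using the header name without the underscore.
            record.set(name[1:], value)
        else:
            # Otherwise treat dots as nesting and set the leaf element's text.
            node = record
            for part in name.split("."):
                child = next((c for c in node if c.tag == part), None)
                node = child if child is not None else ET.SubElement(node, part)
            node.text = value
    return record

header = ["_name", "_carb", "minerals.ca", "minerals.fe"]
row = ["Avocado Dip", "2", "0", "0"]

record = row_to_record(header, row)
print(ET.tostring(record, encoding="unicode"))
```

The order of attributes within an element is not significant in XML, so the attributes may be written in a different order from the columns.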

 

Repeated XML values can be input in either long or wide format. For example:

 

<?xml version="1.0" encoding="UTF-8"?>

<ITEMS>

  <ITEM>

    <PARAM name="a" value="1"/>

    <PARAM name="b" value="2"/>

  </ITEM>

</ITEMS>

 

Input as Long (more rows):

[Image: long-format-example]

Input as Wide (more columns):

 

[Image: wide-format-example]
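
The difference between the two layouts can be sketched in Python. The column names used here, including the numeric suffixes in the wide layout, are illustrative assumptions only and may not match the names Easy Data Transform generates.

```python
import xml.etree.ElementTree as ET

xml_text = """<ITEMS>
  <ITEM>
    <PARAM name="a" value="1"/>
    <PARAM name="b" value="2"/>
  </ITEM>
</ITEMS>"""

items = ET.fromstring(xml_text)

# Long: one row per repeated PARAM element, so more rows.
long_rows = [
    {"_name": param.get("name"), "_value": param.get("value")}
    for item in items
    for param in item.findall("PARAM")
]

# Wide: one row per ITEM element, with a numbered group of columns
# for each repeated PARAM, so more columns.
wide_rows = []
for item in items:
    row = {}
    for i, param in enumerate(item.findall("PARAM"), start=1):
        row[f"_name{i}"] = param.get("name")
        row[f"_value{i}"] = param.get("value")
    wide_rows.append(row)

print(long_rows)
print(wide_rows)
```

With this input, the long layout gives two rows of two columns and the wide layout gives one row of four columns.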

 

You are responsible for ensuring that the names of XML nodes and attributes are valid (e.g. start with a letter or underscore and do not contain spaces).
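
If you want to check names before output, a simplified check might look like the Python sketch below. It is an ASCII-only approximation of the XML naming rules, not the full rule from the XML specification, and `is_valid_xml_name` is a hypothetical helper. It checks a single node or attribute name, so any leading underscore or dotted nesting in a column header would need to be split off first.

```python
import re

# Simplified, ASCII-only approximation of an XML name:
# the first character is a letter or underscore; the rest may also be
# digits, hyphens or dots. Colons and non-ASCII letters, which XML
# also permits, are deliberately left out.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")

def is_valid_xml_name(name):
    """Return True if `name` passes the simplified check above."""
    return bool(_NAME_RE.match(name))

for candidate in ["In-stock", "CategoryName", "Category Name", "2nd"]:
    print(candidate, is_valid_xml_name(candidate))
```

In this sketch `Category Name` is rejected because it contains a space, and `2nd` is rejected because it starts with a digit.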