XML format


Easy Data Transform can read from and write to XML files. The default file extension is ".xml".

 

XML (Extensible Markup Language) format is commonly used for exchanging data between programs.

 

For example:

 

[Image: convert to XML]

 

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

 <record>

   <CategoryID>1</CategoryID>

   <CategoryName>Beverages</CategoryName>

   <Description>Soft drinks, coffees &amp; teas</Description>

   <In-stock>true</In-stock>

 </record>

 <record>

   <CategoryID>2</CategoryID>

   <CategoryName>Condiments</CategoryName>

   <Description>Sweet and savory sauces</Description>

   <In-stock>false</In-stock>

 </record>

 <record>

   <CategoryID>3</CategoryID>

   <CategoryName>Confections</CategoryName>

   <Description>Candies and sweet breads</Description>

   <In-stock>true</In-stock>

 </record>

</root>
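
The mapping above (one record element per row, one child element per column) can be sketched outside Easy Data Transform in a few lines of Python. This is only an illustration using the standard library's ElementTree module, not a description of how Easy Data Transform works internally. The function name table_to_xml and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

def table_to_xml(headers, rows, root_tag="root", record_tag="record"):
    """Build one record element per row and one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, record_tag)
        for header, value in zip(headers, row):
            ET.SubElement(record, header).text = value
    return root

headers = ["CategoryID", "CategoryName", "Description", "In-stock"]
rows = [
    ["1", "Beverages", "Soft drinks, coffees & teas", "true"],
    ["2", "Condiments", "Sweet and savory sauces", "false"],
    ["3", "Confections", "Candies and sweet breads", "true"],
]

root = table_to_xml(headers, rows)
ET.indent(root, space="  ")  # pretty-print; needs Python 3.9 or later
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Note that the ampersand in the first description is escaped as &amp;amp; automatically when the tree is serialized.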

 

The dot ('.') character is used in the column header to show nesting. For example:

[Image: unflatten-table]

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

 <record>

   <name>Avocado Dip</name>

   <carb>2</carb>

   <cholesterol>5</cholesterol>

   <fiber>0</fiber>

   <protein>1</protein>

   <sodium>210</sodium>

   <minerals>

     <ca>0</ca>

     <fe>0</fe>

   </minerals>

   <vitamins>

     <a>0</a>

     <c>0</c>

   </vitamins>

 </record>

</root>

 

Note that the columns may be input in a different order from the nodes in the XML. You can use the Reorder Cols or Stack transforms to change the ordering. However, during XML output, nodes with no children are always placed before nodes with children, regardless of the column order.
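
As a rough illustration of dotted headers and the ordering rule, the Python sketch below splits each header on '.' to build nested elements and writes childless nodes before nodes with children. The headers are assumed for the example (the actual table is shown only as an image), the helper names are hypothetical, and the ordering is applied at the top level of the record only.

```python
import xml.etree.ElementTree as ET

def add_path(parent, dotted_name, value):
    """Create nested elements for a header such as 'minerals.ca'."""
    *branches, leaf = dotted_name.split(".")
    node = parent
    for tag in branches:
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    ET.SubElement(node, leaf).text = value

def row_to_record(headers, row):
    record = ET.Element("record")
    pairs = list(zip(headers, row))
    # Childless nodes first, then nodes with children.
    # The sort is stable, so column order is kept within each group.
    pairs.sort(key=lambda pair: "." in pair[0])
    for header, value in pairs:
        add_path(record, header, value)
    return record

# Assumed headers; a nested column is deliberately listed first.
headers = ["minerals.ca", "name", "minerals.fe", "carb", "vitamins.a", "vitamins.c"]
row = ["0", "Avocado Dip", "0", "2", "0", "0"]

record = row_to_record(headers, row)
tags = [child.tag for child in record]
print(tags)
```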

 

Any dots in XML element names are converted to hyphens ('-') on input.
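
The reason for this conversion is that a dot in a column header already means nesting. A minimal Python sketch of the idea follows. The flatten function and the sample element names are hypothetical, and the code is not Easy Data Transform's own input logic.

```python
import xml.etree.ElementTree as ET

def flatten(element, prefix=""):
    """Return {column_name: text} for every leaf element under element."""
    columns = {}
    for child in element:
        # A dot inside a tag name would be read as nesting, so swap it for '-'.
        name = prefix + child.tag.replace(".", "-")
        if len(child):  # the element has children, so recurse into it
            columns.update(flatten(child, name + "."))
        else:
            columns[name] = child.text or ""
    return columns

record = ET.fromstring(
    "<record>"
    "<unit.price>4</unit.price>"
    "<size><net.weight>200</net.weight></size>"
    "</record>"
)
columns = flatten(record)
print(columns)
```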

 

The underscore ('_') character is used at the start of a column header name to identify it as an XML attribute. For example:

[Image: unflatten-table-attributes]

Is equivalent to:

 

<?xml version="1.0" encoding="UTF-8"?>

<root>

 <record carb="2" cholesterol="5" fiber="0" name="Avocado Dip" protein="1" sodium="210">

   <minerals>

     <ca>0</ca>

     <fe>0</fe>

   </minerals>

   <vitamins>

     <a>0</a>

     <c>0</c>

   </vitamins>

 </record>

</root>
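
The underscore convention can be sketched in Python as follows. The headers are assumed for the example (the actual table is shown only as an image), row_to_record is a hypothetical helper, and only top-level columns are handled.

```python
import xml.etree.ElementTree as ET

def row_to_record(headers, row):
    """Headers starting with '_' become attributes; others become child elements."""
    record = ET.Element("record")
    for header, value in zip(headers, row):
        if header.startswith("_"):
            record.set(header[1:], value)  # '_carb' becomes carb="..."
        else:
            ET.SubElement(record, header).text = value
    return record

# Assumed headers mixing attribute and element columns.
headers = ["_name", "_carb", "protein"]
row = ["Avocado Dip", "2", "1"]

record = row_to_record(headers, row)
print(ET.tostring(record, encoding="unicode"))
```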

 

Repeated XML values can be input in either long or wide format. For example:

 

<?xml version="1.0" encoding="UTF-8"?>

<ITEMS>

 <ITEM>

   <PARAM name="a" value="1"/>

   <PARAM name="b" value="2"/>

 </ITEM>

</ITEMS>

 

Input as Long (more rows):

[Image: long-format-example]

Input as Wide (more columns):

 

[Image: wide-format-example]
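
The difference between the two layouts can be sketched in Python using the XML above. The column names, in particular the numbered name1/value1 style used for the wide layout, are assumptions made for illustration and may not match the headers Easy Data Transform generates.

```python
import xml.etree.ElementTree as ET

xml_text = """<ITEMS>
  <ITEM>
    <PARAM name="a" value="1"/>
    <PARAM name="b" value="2"/>
  </ITEM>
</ITEMS>"""

root = ET.fromstring(xml_text)

# Long: one row per repeated PARAM element (more rows).
long_rows = [
    {"name": param.get("name"), "value": param.get("value")}
    for item in root.iter("ITEM")
    for param in item.iter("PARAM")
]

# Wide: one row per ITEM, with a numbered group of columns
# for each PARAM (more columns).
wide_rows = []
for item in root.iter("ITEM"):
    row = {}
    for i, param in enumerate(item.iter("PARAM"), start=1):
        row[f"name{i}"] = param.get("name")
        row[f"value{i}"] = param.get("value")
    wide_rows.append(row)

print(long_rows)
print(wide_rows)
```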

 

You can set the names for the root and record tags and add extra XML below the root.

 

[Image: XML related fields]

 

Exclamation marks at the start of column names are removed from the tag names when outputting to XML. This can be useful when you want duplicate tag names with the same parent, because the column names themselves remain distinct.

 

[Image: XML duplicate child tag names]
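
A small Python sketch of the exclamation mark rule, with hypothetical column names and values: the two phone columns have different headers in the table but produce the same tag in the output.

```python
import xml.etree.ElementTree as ET

def tag_for(header):
    """Drop leading '!' characters so '!phone' and 'phone' share a tag."""
    return header.lstrip("!")

# Hypothetical table with two columns that should both become <phone>.
headers = ["name", "phone", "!phone"]
row = ["Ann", "555-0100", "555-0199"]

record = ET.Element("record")
for header, value in zip(headers, row):
    ET.SubElement(record, tag_for(header)).text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```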

 

You are responsible for ensuring that the names of XML nodes and attributes are valid (e.g. start with a letter or underscore and do not contain spaces).
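
If you want to check names before building a table, a simplified test can be sketched in Python. The pattern below is an ASCII-only approximation and is deliberately stricter than the XML specification, which also permits colons and many non-ASCII characters. The function name is hypothetical.

```python
import re

# Simplified ASCII-only pattern: the first character is a letter or
# underscore; the rest are letters, digits, underscores, hyphens or dots.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_plausible_xml_name(name):
    """Return True if name passes the simplified check above."""
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["CategoryID", "In-stock", "_id", "2nd", "unit price", ""]:
    print(repr(candidate), is_plausible_xml_name(candidate))
```

Remember that a leading underscore or a dot in a column header also has a special meaning in Easy Data Transform (attribute and nesting respectively), so a check like this applies to the final tag or attribute name rather than to the raw column header.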

 

If you need fine-grained control over the XML output, you can:

output the XML to a file

read the file back in as plain text

apply transforms

write it back out as plain text

 

[Image: output XML as plain text]

 

See also:

Video: How to convert CSV to XML

Video: How to convert Excel to XML

Video: How to convert XML to CSV

Video: How to convert XML to Excel