Using XML-based Configurations

This section explains how to use hierarchical and structured XML datasets.

Hierarchical properties

The XML document we used in the section about composite configuration was quite simple. Because of their tree-like nature, XML documents can represent data that is structured in many ways. This section explains how to deal with such structured documents.

Structured XML

Consider the following scenario: an application operates on database tables and wants to load a definition of the database schema from its configuration. An XML document provides this information. It could look as follows:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<database>
  <tables>
    <table tableType="system">
      <name>users</name>
      <fields>
        <field>
          <name>uid</name>
          <type>long</type>
        </field>
        <field>
          <name>uname</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>firstName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>lastName</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>email</name>
          <type>java.lang.String</type>
        </field>
      </fields>
    </table>
    <table tableType="application">
      <name>documents</name>
      <fields>
        <field>
          <name>docid</name>
          <type>long</type>
        </field>
        <field>
          <name>name</name>
          <type>java.lang.String</type>
        </field>
        <field>
          <name>creationDate</name>
          <type>java.util.Date</type>
        </field>
        <field>
          <name>authorID</name>
          <type>long</type>
        </field>
        <field>
          <name>version</name>
          <type>int</type>
        </field>
      </fields>
    </table>
  </tables>
</database>

			

This XML is largely self-explanatory: there is an arbitrary number of table elements, each of which has a name and a list of fields. A field in turn consists of a name and a data type. To access the data stored in this document, it must be included in the configuration definition file:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<configuration>
  <properties fileName="usergui.properties"/>
  <xml fileName="gui.xml"/>
  <xml fileName="tables.xml"/>
</configuration>

			

The additional xml element causes the document with the table definitions to be loaded. When we now want to read some of the properties we face a problem: the syntax for constructing configuration keys we learned so far is not powerful enough to access all of the data stored in the tables document.

Because the document contains a list of tables, some properties are defined more than once. For example, the configuration key tables.table.name refers to a name element inside a table element inside a tables element. This combination occurs twice in the tables document.

Multiple definitions of a property do not cause problems and are supported by all Configuration classes. If such a property is queried using getProperty(), the method recognizes that there are multiple values for that property and returns a collection with all these values. So we could write


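// Requires: import java.util.Collection;
Object prop = config.getProperty("tables.table.name");
if (prop instanceof Collection)
{
	// Several values exist for this key, so a collection was returned
	System.out.println("Number of tables: " + ((Collection<?>) prop).size());
}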

			

An alternative to this code is the getList() method of Configuration. If a property is known to have multiple values (as the table name property does in this example), getList() retrieves all values at once. Note: it is legal to call getString() or one of the other getter methods on a property with multiple values; the call returns the first element of the list.
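The behaviour of multi-valued keys can be made concrete with a small, self-contained sketch. It does not use the Configuration classes; the class FlatKeysSketch and its flatten() method are hypothetical stand-ins that only mimic how a flat view collects every value found under a dotted key. The embedded XML is an abridged version of the tables document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical stand-in, not the library's implementation: it only mimics
// how a flat, non-hierarchical view collects values under dotted keys.
class FlatKeysSketch {
    // Abridged version of the tables document from this section.
    static final String SAMPLE =
        "<database><tables>"
        + "<table><name>users</name><fields>"
        + "<field><name>uid</name></field><field><name>uname</name></field>"
        + "</fields></table>"
        + "<table><name>documents</name><fields>"
        + "<field><name>docid</name></field><field><name>name</name></field>"
        + "</fields></table>"
        + "</tables></database>";

    // Maps every dotted key (root element omitted) to all values found for it.
    static Map<String, List<String>> flatten(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Map<String, List<String>> result = new LinkedHashMap<>();
            collect(root, "", result);
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    private static void collect(Element element, String prefix,
                                Map<String, List<String>> result) {
        boolean hasChildElements = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                hasChildElements = true;
                String key = prefix.isEmpty()
                    ? child.getNodeName() : prefix + "." + child.getNodeName();
                collect((Element) child, key, result);
            }
        }
        // Only leaf elements carry a value in this sketch.
        if (!hasChildElements && !prefix.isEmpty()) {
            result.computeIfAbsent(prefix, k -> new ArrayList<>())
                  .add(element.getTextContent().trim());
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> flat = flatten(SAMPLE);
        List<String> names = flat.get("tables.table.name");
        System.out.println("All table names: " + names);          // like getList()
        System.out.println("First table name: " + names.get(0));  // like getString()
        System.out.println("All field names: "
            + flat.get("tables.table.fields.field.name"));
    }
}
```

In this sketch the list stored under tables.table.fields.field.name holds the four field names of both tables in document order, with nothing that records which table each name came from.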

Accessing structured properties

Okay, we can obtain a list with the names of all defined tables. In the same way we can retrieve a list with the names of all table fields: just pass the key tables.table.fields.field.name to the getList() method. In our example this list would contain 10 elements, the names of all fields of all tables. This is fine, but how do we know which field belongs to which table?

The answer is that with our current approach we have no way to obtain this knowledge. If XML documents are loaded this way, their exact structure is lost: all field names are found and stored, but the information about which field belongs to which table is not. Fortunately, Configuration provides a way of dealing with structured XML documents. To enable this feature, the configuration definition file has to be slightly altered. It becomes:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<configuration>
  <properties fileName="usergui.properties"/>
  <xml fileName="gui.xml"/>
  <hierarchicalXml fileName="tables.xml"/>
</configuration>

			

Note that one xml element was replaced by a hierarchicalXml element. This element tells the configuration factory to use the class HierarchicalXMLConfiguration instead of the default class for processing XML documents. As the name implies, this class preserves the hierarchy of XML documents and thus keeps their structure.

When working with such hierarchical properties, the configuration keys used to query properties support an extended syntax. Any component of a key can be followed by a numerical value in parentheses that determines the index of the affected property. This is best explained by some examples:

We will now provide some configuration keys and show the results of a getProperty() call with these keys as arguments.

tables.table(0).name
Returns the name of the first table (all indices are 0-based), in this example the string users.
tables.table(0)[@tableType]
Returns the value of the tableType attribute of the first table (system).
tables.table(1).name
Analogous to the first example, returns the name of the second table (documents).
tables.table(2).name
Here the name of a third table is queried, but because there are only two tables the result is null. The fact that a null value is returned for invalid indices can be used to find out how many values are defined for a certain property: just increment the index in a loop as long as valid objects are returned.
tables.table(1).fields.field.name
Returns a collection with the names of all fields that belong to the second table. With keys of this kind it is now possible to find out which fields belong to which table.
tables.table(1).fields.field(2).name
The additional index after field selects a certain field. This expression represents the name of the third field in the second table (creationDate).
tables.table.fields.field(0).type
This key may be a bit unusual but nevertheless completely valid. It selects the data types of the first fields in all tables. So here a collection would be returned with the values [long, long].
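The index rules shown in these examples can be sketched with plain DOM code. The class IndexedKeySketch and its resolve() method are hypothetical and are not the library's key parser; they only illustrate that an index is applied separately below each parent element. One deliberate difference: where getProperty() returns null for an invalid index, this sketch returns an empty list. The embedded XML is an abridged version of the tables document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not the library's key parser.
class IndexedKeySketch {
    // Abridged version of the tables document from this section.
    static final String SAMPLE =
        "<database><tables>"
        + "<table tableType='system'><name>users</name><fields>"
        + "<field><name>uid</name><type>long</type></field>"
        + "<field><name>uname</name><type>java.lang.String</type></field>"
        + "</fields></table>"
        + "<table tableType='application'><name>documents</name><fields>"
        + "<field><name>docid</name><type>long</type></field>"
        + "<field><name>name</name><type>java.lang.String</type></field>"
        + "<field><name>creationDate</name><type>java.util.Date</type></field>"
        + "</fields></table>"
        + "</tables></database>";

    // One key part: element name, optional (index), optional [@attribute].
    private static final Pattern PART =
        Pattern.compile("([^.(\\[]+)(?:\\((\\d+)\\))?(?:\\[@([^\\]]+)\\])?");

    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    // Resolves a key relative to the root element; returns all matching values.
    static List<String> resolve(Element root, String key) {
        List<Element> current = new ArrayList<>();
        current.add(root);
        String attribute = null;
        for (String part : key.split("\\.")) {
            Matcher m = PART.matcher(part);
            if (!m.matches()) {
                throw new IllegalArgumentException("Bad key part: " + part);
            }
            List<Element> next = new ArrayList<>();
            for (Element parent : current) {
                List<Element> matches = childrenNamed(parent, m.group(1));
                if (m.group(2) == null) {
                    next.addAll(matches);            // no index: keep all matches
                } else {
                    int index = Integer.parseInt(m.group(2));
                    if (index < matches.size()) {
                        next.add(matches.get(index)); // index applies per parent
                    }
                }
            }
            current = next;
            attribute = m.group(3); // only the last part's attribute is used
        }
        List<String> values = new ArrayList<>();
        for (Element e : current) {
            values.add(attribute == null
                ? e.getTextContent().trim() : e.getAttribute(attribute));
        }
        return values;
    }

    private static List<Element> childrenNamed(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element && child.getNodeName().equals(name)) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Counts tables by incrementing the index until nothing is returned.
    static int countTables(Element root) {
        int count = 0;
        while (!resolve(root, "tables.table(" + count + ").name").isEmpty()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE);
        System.out.println(resolve(root, "tables.table(1).fields.field(2).name"));
        System.out.println(resolve(root, "tables.table.fields.field(0).type"));
        System.out.println("Tables: " + countTables(root));
    }
}
```

The countTables() method applies the loop idea mentioned for tables.table(2).name: it keeps incrementing the index until a lookup yields nothing.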

These examples should make the usage of indices quite clear. Because each configuration key can contain an arbitrary number of indices, it is possible to navigate through complex structures of XML documents; each XML element can be uniquely identified. So at the end of this section we can draw the following conclusion: for simple XML documents that define only some simple properties and do not have a complex structure, the default XML configuration class is suitable. If documents are more complex and their structure is important, the hierarchy-aware class should be used, which is enabled by the hierarchicalXml element as shown in the example configuration definition file above.

Union configuration

In an earlier section about the configuration definition file for ConfigurationFactory it was stated that configuration files included first can override properties in configuration files included later, and an example use case for this behaviour was given. There may be times when there are other requirements.

Let's continue the example of the application that processes database tables and reads the definitions of the affected tables from its configuration. Now suppose that this application grows larger and must be maintained by a team of developers. Each developer works on a separate set of tables. In such a scenario it would be problematic if the definitions for all tables were kept in a single file. This file would need to be changed very often and, being checked out almost constantly, could become a bottleneck for team development. It would be much better if each developer had an associated file with table definitions and all this information could be linked together at the end.

ConfigurationFactory provides support for such a use case, too. It is possible to specify in the configuration definition file that a logical union configuration is to be constructed from a set of configuration sources. Then all properties defined in the provided sources are collected and can be accessed as if they had been defined in a single source. To demonstrate this feature, let us assume that a developer of the database application has defined a specific XML file with a table definition named tasktables.xml:


<?xml version="1.0" encoding="ISO-8859-1" ?>

<config>
  <table tableType="application">
    <name>tasks</name>
    <fields>
      <field>
        <name>taskid</name>
        <type>long</type>
      </field>
      <field>
        <name>name</name>
        <type>java.lang.String</type>
      </field>
      <field>
        <name>description</name>
        <type>java.lang.String</type>
      </field>
      <field>
        <name>responsibleID</name>
        <type>long</type>
      </field>
      <field>
        <name>creatorID</name>
        <type>long</type>
      </field>
      <field>
        <name>startDate</name>
        <type>java.util.Date</type>
      </field>
      <field>
        <name>endDate</name>
        <type>java.util.Date</type>
      </field>
    </fields>
  </table>
</config>

		

This file defines the structure of an additional table, which should be added to the existing table definitions. To achieve this, the configuration definition file has to be changed: a new section is added that contains the include elements of all configuration sources that are to be combined.


<?xml version="1.0" encoding="ISO-8859-1" ?>
<!-- Configuration definition file that demonstrates the
     override and additional sections -->

<configuration>
  <override>
    <properties fileName="usergui.properties"/>
    <xml fileName="gui.xml"/>
  </override>
  
  <additional>
    <hierarchicalXml fileName="tables.xml"/>
    <hierarchicalXml fileName="tasktables.xml" at="tables"/>
  </additional>
</configuration>

		

Compared to the older versions of this file, a couple of changes have been made. One major difference is that the elements for including configuration sources are no longer direct children of the root element, but are now contained in either an override or an additional section. The names of these sections already imply their purpose.

The override section is not strictly necessary. Elements in this section are treated as if they were children of the root element, i.e. properties in the included configuration sources override properties in sources included later. So the override tags could have been omitted, but for the sake of clarity it is recommended to use them when there is also an additional section.

It is the additional section that introduces a new behaviour. All configuration sources listed here are combined into a union configuration. In our example we have put two hierarchicalXml elements in this area that load the available files with database table definitions. The syntax of elements in the additional section is analogous to the syntax described so far. The only difference is an additionally supported at attribute that specifies the position in the logical union configuration where the included properties are to be added. In this example we set the at attribute of the second element to tables. This is because the file starts with a table element, but to be compatible with the other table definition file its content should be accessible under the key tables.table.

After these modifications have been performed, the configuration obtained from the ConfigurationFactory will allow access to three database tables. A call of config.getString("tables.table(2).name"); will return the value tasks. In an analogous way it is possible to retrieve the fields of the third table.
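The effect of the at attribute can be pictured as grafting one tree into another. The following sketch is a simplified model under that assumption, not the code ConfigurationFactory runs; the class UnionSketch, its Node type, and the addSource() method are hypothetical names. Only the table names are modelled, and the fields are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a union configuration; not the library's code.
class UnionSketch {
    // Minimal tree node: a name, an optional value, and ordered children.
    static final class Node {
        final String name;
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String name, String value) {
            this.name = name;
            this.value = value;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        // Returns the index-th child with the given name, or null if absent.
        Node child(String childName, int index) {
            int seen = 0;
            for (Node c : children) {
                if (c.name.equals(childName) && seen++ == index) {
                    return c;
                }
            }
            return null;
        }
    }

    static Node table(String tableName) {
        return new Node("table", null).add(new Node("name", tableName));
    }

    // Adds the children of a source's root element below the node addressed
    // by 'at' (a dotted path, created on demand); null means the union root.
    static void addSource(Node unionRoot, Node sourceRoot, String at) {
        Node target = unionRoot;
        if (at != null) {
            for (String part : at.split("\\.")) {
                Node next = target.child(part, 0);
                if (next == null) {
                    next = new Node(part, null);
                    target.add(next);
                }
                target = next;
            }
        }
        target.children.addAll(sourceRoot.children);
    }

    static Node buildUnion() {
        // tables.xml: <database><tables><table>...</table>...</tables></database>
        Node tablesXml = new Node("database", null)
            .add(new Node("tables", null).add(table("users")).add(table("documents")));
        // tasktables.xml: <config><table>...</table></config>
        Node taskTablesXml = new Node("config", null).add(table("tasks"));

        Node union = new Node("union", null);
        addSource(union, tablesXml, null);          // no at attribute
        addSource(union, taskTablesXml, "tables");  // at="tables"
        return union;
    }

    public static void main(String[] args) {
        Node union = buildUnion();
        // Corresponds to the key tables.table(2).name
        Node third = union.child("tables", 0).child("table", 2);
        System.out.println(third.child("name", 0).value); // prints "tasks"
    }
}
```

Without the at value, the tasks table would in this model be attached directly below the union root and would not appear under tables.table.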

Note that it is also possible to override properties defined in an additional section. This can be done by placing a configuration source in the override section that defines properties that are also defined in one of the sources listed in the additional section. The example does not make use of that. Note also that the order of the override and additional sections in a configuration definition file does not matter. Sources in an override section are always treated with higher priority (otherwise they could not override the values of other sources).
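The priority rule can be summarised in a few lines of code. This is a minimal sketch assuming flat string keys; the class OverrideSketch and the keys db.schema and db.user are made up for illustration and do not appear in the example files. It models only the lookup order, not how ConfigurationFactory is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that only models the lookup order described above.
class OverrideSketch {
    private final List<Map<String, String>> overrideSources = new ArrayList<>();
    // Simplification: a real union keeps all values of a key, this map keeps one.
    private final Map<String, String> union = new HashMap<>();

    // Override sources keep their declaration order: earlier ones win.
    void addOverride(Map<String, String> source) {
        overrideSources.add(source);
    }

    // All sources of the additional section end up in one union.
    void addAdditional(Map<String, String> source) {
        union.putAll(source);
    }

    // Override sources are always consulted before the union.
    String getString(String key) {
        for (Map<String, String> source : overrideSources) {
            if (source.containsKey(key)) {
                return source.get(key);
            }
        }
        return union.get(key);
    }

    static OverrideSketch build() {
        // "db.schema" and "db.user" are made-up keys for illustration only.
        Map<String, String> additional = new HashMap<>();
        additional.put("db.schema", "production");
        additional.put("db.user", "app");
        Map<String, String> override = new HashMap<>();
        override.put("db.schema", "test");

        OverrideSketch config = new OverrideSketch();
        config.addAdditional(additional); // declared first, still lower priority
        config.addOverride(override);
        return config;
    }

    public static void main(String[] args) {
        OverrideSketch config = build();
        System.out.println(config.getString("db.schema")); // value from the override source
        System.out.println(config.getString("db.user"));   // only defined in the union
    }
}
```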