XML serialization and parsing support
Juneau supports converting arbitrary POJOs to and from XML using ultra-efficient serializers and parsers.
The XML serializer converts POJOs directly to XML without the need for intermediate DOM objects.
Likewise, the XML parser uses a StAX parser and creates POJOs directly without intermediate DOM objects.
Unlike frameworks such as JAXB, Juneau does not require POJO classes to be annotated to produce
and consume XML.
For example, it can serialize and parse instances of any of the following POJO types:
- Simple types (e.g. String, Integer, Boolean, Float).
- Collections and maps (e.g. HashSet, TreeMap) containing anything on this list.
- Classes that can be converted to and from Strings (e.g. classes containing toString(), fromString(), valueOf(), constructor(String)).
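The StAX-based approach mentioned above, where beans are populated straight from the event stream with no DOM tree in between, can be sketched with the JDK alone. This is only an illustration of the technique, not Juneau's parser: the SketchPerson bean, the StaxPojoSketch class, and the element names are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical bean used only for this sketch. */
final class SketchPerson {
    int id;
    String name;
}

final class StaxPojoSketch {

    /**
     * Reads <person><id>..</id><name>..</name></person> straight into a bean.
     * No DOM objects are created; fields are set as StAX events arrive.
     */
    static SketchPerson parse(String xml) {
        SketchPerson p = new SketchPerson();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT) {
                    String n = r.getLocalName();
                    if (n.equals("id"))
                        p.id = Integer.parseInt(r.getElementText().trim());
                    else if (n.equals("name"))
                        p.name = r.getElementText();
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return p;
    }
}
```

Calling StaxPojoSketch.parse(...) on a small person document returns a populated bean. A real parser would discover the properties through reflection rather than hard-coding them as done here.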
In addition to the types shown above, Juneau includes the ability to define transforms that convert non-standard object and property types to serializable forms (e.g. to transform Calendars to and from ISO8601 strings, or byte[] arrays to and from base-64 encoded strings).
These transforms can be associated with serializers/parsers, or can be associated with classes or bean properties through type and method annotations.
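To make the idea of a transform concrete, here is a minimal sketch of the two conversions mentioned above, written against the JDK only. It does not use Juneau's transform API; the TransformSketch class and its method names are hypothetical, and it relies on java.time and java.util.Base64, so it assumes Java 8 or later rather than the Java 1.6 baseline.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Base64;
import java.util.Calendar;
import java.util.GregorianCalendar;

/** Hypothetical pair of to-string/from-string conversions. */
final class TransformSketch {

    /** Calendar to an ISO8601 instant in GMT, e.g. 2001-07-04T12:08:56Z. */
    static String calendarToIso(Calendar c) {
        return DateTimeFormatter.ISO_INSTANT.format(c.toInstant());
    }

    /** ISO8601 date-time string (with zone or offset) back to a Calendar. */
    static Calendar isoToCalendar(String s) {
        return GregorianCalendar.from(ZonedDateTime.parse(s));
    }

    /** byte[] to a base-64 encoded string. */
    static String bytesToBase64(byte[] b) {
        return Base64.getEncoder().encodeToString(b);
    }

    /** Base-64 encoded string back to byte[]. */
    static byte[] base64ToBytes(String s) {
        return Base64.getDecoder().decode(s);
    }
}
```

Each pair is symmetric, which is the property that matters for a transform: whatever the serializer writes out, the parser must be able to turn back into the original type.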
Refer to POJO Categories for a complete definition of supported POJOs.
While annotations are not required to produce or consume XML, several XML annotations are provided for handling namespaces and fine-tuning the format of the XML produced.
The Juneau XML serialization and parsing support does not require any external prerequisites. It only requires Java 1.6 or above.
The example shown here is from the Address Book resource located in the juneau-examples-rest microservice project.
The POJO model consists of a List of Person beans, with each Person containing zero or more Address beans.
When you point a browser at /sample/addressBook, the POJO is rendered as HTML:
By appending ?Accept=mediaType&plainText=true to the URL, you can view the data in the various supported XML formats:
In addition to serializing POJOs to XML, Juneau includes support for serializing the POJO metamodel to XML Schema, with support for multiple namespaces.
{@link org.apache.juneau.xml.XmlSerializer} is the class used to convert POJOs to XML.
{@link org.apache.juneau.xml.XmlDocSerializer} is a subclass that adds an XML declaration element to the output before the POJO is serialized.
The XML serializer includes many configurable settings.
Static reusable instances of XML serializers are provided with commonly-used settings:
In addition, DTO beans are provided that use the XML serializer and parser for the following languages:
Refer to the package-level Javadocs for more information about those formats.
The examples shown in this document will use single-quote, readable settings.
For brevity, the examples will use public fields instead of getters/setters to reduce the size of the examples.
In the real world, you'll typically want to use standard bean getters and setters.
To start off simple, we'll begin with the following simplified bean and build upon it.
The following code shows how to convert this to simple XML (no namespaces):
Side note: Serializers can also be created by cloning existing serializers:
The code above produces the following output:
The first thing you may notice is how the bean instance is represented by the element
When objects have no name associated with them, Juneau provides a default generalized name that maps to the equivalent JSON data type.
Some cases when objects do not have names:
The generalized name reflects the JSON-equivalent data type.
Juneau produces JSON-equivalent XML, meaning any valid JSON document can be losslessly converted into an XML equivalent.
In fact, all of the Juneau serializers and parsers are built upon this JSON-equivalency.
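The JSON-equivalency idea can be sketched in a few lines: every loose value is written under an element named after its JSON data type. This is a simplified illustration and not Juneau's serializer. The JsonEquivalentXmlSketch class is hypothetical, it omits the type attributes that are needed when a value's type cannot be inferred, and the exact layout of Juneau's output may differ.

```java
import java.util.Collection;
import java.util.Map;

/** Hypothetical serializer mapping Java values to JSON-type element names. */
final class JsonEquivalentXmlSketch {

    /** Serializes a loose value under its generalized element name. */
    static String toXml(Object o) {
        StringBuilder sb = new StringBuilder();
        write(sb, elementNameFor(o), o);
        return sb.toString();
    }

    /** Picks the JSON-equivalent data type name for a value. */
    static String elementNameFor(Object o) {
        if (o == null) return "null";
        if (o instanceof CharSequence) return "string";
        if (o instanceof Number) return "number";
        if (o instanceof Boolean) return "boolean";
        if (o instanceof Map) return "object";
        if (o instanceof Collection) return "array";
        throw new IllegalArgumentException("Unsupported in this sketch: " + o.getClass());
    }

    private static void write(StringBuilder sb, String name, Object o) {
        if (o == null) {
            sb.append('<').append(name).append("/>");
            return;
        }
        sb.append('<').append(name).append('>');
        if (o instanceof Map) {
            // Map entries are named after their keys (assumed to be valid element names).
            for (Map.Entry<?, ?> e : ((Map<?, ?>) o).entrySet())
                write(sb, String.valueOf(e.getKey()), e.getValue());
        } else if (o instanceof Collection) {
            // Collection entries have no names, so each gets its generalized name.
            for (Object child : (Collection<?>) o)
                write(sb, elementNameFor(child), child);
        } else {
            sb.append(escape(o.toString()));
        }
        sb.append("</").append(name).append('>');
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because map entries carry no type information in this sketch, its output cannot be parsed back losslessly. That gap is what type attributes are for.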
The following examples show how different data types are represented in XML. They mirror how the data structures are represented in JSON.
The representation of loose (not a direct bean property value) simple types is shown below:
Data type | JSON example | XML |
---|---|---|
string | ||
boolean | ||
integer | 123 | |
float | 1.23 | |
null |
Loose maps and beans use the element
Object
or superclass/interface value type).
Data type | JSON example | XML |
---|---|---|
Map<String,String> |
{
k1: |
|
Map<String,Number> |
{
k1: 123,
k2: 1.23,
k3: |
|
Map<String,Object> |
{
k1: |
Loose collections and arrays use the element
Data type | JSON example | XML |
---|---|---|
String[] |
[
|
|
Number[] |
[
123,
1.23,
|
|
Object[] |
[
|
|
String[][] |
[
[ |
|
|
[ 123 ] | |
|
[
|
|
List<String> |
[
|
|
List<Number> |
[
123,
1.23,
|
|
List<Object> |
[
|
Data type | JSON example | XML |
---|---|---|
|
{
a: |
Data type | JSON example | XML |
---|---|---|
|
{
a: {
k1: |
Just because Juneau allows you to serialize ordinary POJOs to XML doesn't mean you are limited to just JSON-equivalent XML.
Several annotations are provided in the {@link org.apache.juneau.xml.annotation} package for customizing the output.
The {@link org.apache.juneau.annotation.Bean#typeName() @Bean.typeName()} annotation can be used to override the Juneau default name on bean elements. Type names serve two distinct purposes:
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: |
On bean properties, a
In the following example, a type attribute is used on property 'b' but not property 'a' since 'b' is of type Object and therefore the bean class cannot be inferred.
Java | Without annotation | With annotation |
---|---|---|
|
string, number, boolean, object, array, and null are reserved keywords that cannot be used as type names.
Beans with type names are often used in conjunction with the {@link org.apache.juneau.annotation.Bean#beanDictionary() @Bean.beanDictionary()} and {@link org.apache.juneau.annotation.BeanProperty#beanDictionary() @BeanProperty.beanDictionary()} annotations so that the beans can be resolved at parse time. These annotations are not necessary during serialization, but are needed during parsing in order to resolve the bean types.
The following examples show how type names are used under various circumstances.
Note that array dimensions are represented with the caret
Pay special attention to when
Java | XML |
---|---|
|
|
|
|
|
Bean type names are also used for resolution when abstract fields are used. The following examples show how they are used in a variety of circumstances.
Java | XML |
---|---|
|
|
|
|
|
|
|
On a side note, characters that cannot be represented in XML 1.0 are encoded using a simple encoding.
Note in the examples below, some characters such as
Java | XML |
---|---|
|
|
|
While it's true that these characters CAN be represented in XML 1.1, it's impossible to parse XML 1.1 text in Java without the XML containing an XML declaration. Unfortunately, this, and the uselessness of the {@link javax.xml.stream.XMLInputFactory#IS_REPLACING_ENTITY_REFERENCES} setting in Java forced us to make some hard design decisions that may not be the most elegant.
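As a rough illustration of what such an encoding involves, the sketch below checks each code point against the XML 1.0 Char production and replaces anything outside it with a token. The _xHHHH_ token format, the Xml10CharSketch class, and its method names are assumptions for this example. The sketch also makes no attempt to escape input that already happens to contain such a token, which a lossless scheme would need to handle.

```java
/** Hypothetical encoder for characters that XML 1.0 cannot represent. */
final class Xml10CharSketch {

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces unrepresentable code points with an _xHHHH_ token (assumed format). */
    static String encode(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isXml10Char(cp))
                sb.appendCodePoint(cp);
            else
                sb.append(String.format("_x%04X_", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

With this sketch a backspace character (U+0008) becomes _x0008_, while tabs, newlines, and ordinary text pass through unchanged.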
The {@link org.apache.juneau.xml.annotation.Xml#childName() @Xml.childName()} annotation can be used to specify the name of XML child elements for bean properties of type collection or array.
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: [ |
||
|
{ a: [123,456] } |
The {@link org.apache.juneau.xml.annotation.Xml#format() @Xml.format()} annotation can be used to tweak the XML format of a POJO.
The value is set to an enum value of type {@link org.apache.juneau.xml.annotation.XmlFormat}.
This annotation can be applied to both classes and bean properties.
The {@link org.apache.juneau.xml.annotation.XmlFormat#ATTR} format can be applied to bean properties to serialize
them as XML attributes instead of elements.
Note that this only supports properties of simple types (e.g. strings, numbers, booleans).
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#ATTRS} format can be applied to bean classes to force all bean properties to be serialized as XML attributes instead of child elements.
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#ELEMENT} format can be applied to bean properties to override the {@link org.apache.juneau.xml.annotation.XmlFormat#ATTRS} format applied on the bean class.
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#ATTRS} format can be applied to a single bean property
of type Map<String,Object>
to denote arbitrary XML attribute values on the element.
These can be mixed with other {@link org.apache.juneau.xml.annotation.XmlFormat#ATTR} annotated properties, but
there must not be an overlap in bean property names and map keys.
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: {
k1: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#COLLAPSED} format can be applied to bean properties
of type array/Collection.
This causes the child objects to be serialized directly inside the bean element.
This format must be used in conjunction with {@link org.apache.juneau.xml.annotation.Xml#childName()}
to differentiate which collection the values came from if you plan on parsing the output back into beans.
Note that child names must not conflict with other property names.
Data type | JSON example | Without annotation | With annotation |
---|---|---|---|
|
{
a: [ |
The {@link org.apache.juneau.xml.annotation.XmlFormat#ELEMENTS} format can be applied to a single bean property
of either a simple type or array/Collection.
It allows free-form child elements to be formed.
All other properties on the bean MUST be serialized as attributes.
Data type | JSON example | With annotation |
---|---|---|
|
{
a: |
|
|
{
a: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#MIXED} format is similar to {@link org.apache.juneau.xml.annotation.XmlFormat#ELEMENTS}
except element names on primitive types (string/number/boolean/null) are stripped from the output.
This format is particularly useful when combined with bean dictionaries to produce mixed content.
The bean dictionary isn't used during serialization, but it is needed during parsing to resolve bean types.
The {@link org.apache.juneau.xml.annotation.XmlFormat#MIXED_PWS} format is identical to {@link org.apache.juneau.xml.annotation.XmlFormat#MIXED} except whitespace characters are preserved in the output.
Data type | JSON example | Without annotations | With annotations |
---|---|---|---|
|
{
a: [
|
Whitespace (tabs and newlines) is not added to MIXED child nodes in readable-output mode. This helps ensure strings in the serialized output can be losslessly parsed back into their original forms when they contain whitespace characters. If the {@link javax.xml.stream.XMLInputFactory#IS_REPLACING_ENTITY_REFERENCES} setting were not useless in Java, we could support lossless readable XML for MIXED content. But as of Java 8, it still does not work.
XML suffers from other deficiencies as well that affect MIXED content. For example,
The examples below show how whitespace is handled under various circumstances:
Data type | XML |
---|---|
It should be noted that when using
The {@link org.apache.juneau.xml.annotation.XmlFormat#TEXT} format is similar to {@link org.apache.juneau.xml.annotation.XmlFormat#MIXED}
except it's meant for solitary objects that get serialized as simple child text nodes.
Any object that can be serialized to a String can be used.
The {@link org.apache.juneau.xml.annotation.XmlFormat#TEXT_PWS} format is the same except whitespace is preserved in the output.
Data type | JSON example | Without annotations | With annotations |
---|---|---|---|
|
{
a: |
The {@link org.apache.juneau.xml.annotation.XmlFormat#XMLTEXT} format is similar to {@link org.apache.juneau.xml.annotation.XmlFormat#TEXT}
except it's meant for strings containing XML that should be serialized as-is to the document.
Any object that can be serialized to a String can be used.
During parsing, the element content gets parsed with the rest of the document and then re-serialized to XML before being set as the
property value. This process may not be perfect (e.g. double quotes may be replaced by single quotes, etc...).
Data type | JSON example | With TEXT annotation | With XMLTEXT annotation |
---|---|---|---|
|
{
a: |
Let's go back to the example of our original Person bean class:
However, this time we'll leave namespaces enabled on the serializer:
Now when we run this code, we'll see namespaces added to our output:
This isn't too exciting since we haven't specified any namespaces yet.
Therefore, everything is defined under the default Juneau namespace.
Namespaces can be defined at the following levels:
It's typically best to specify the namespaces used at the package level.
We'll do that here for the package containing our test code.
We're defining four namespaces in this package and designating
Take special note that the
Other XML annotations are also modelled after JAXB.
However, since many of the features of JAXB are already implemented for all serializers and parsers
at a higher level through various general annotations such as {@link org.apache.juneau.annotation.Bean} and {@link org.apache.juneau.annotation.BeanProperty},
it was decided to maintain separate Juneau XML annotations instead of reusing JAXB annotations.
This may change in some future implementation, but for now it was decided that having separate Juneau XML annotations was less confusing.
On our bean class, we'll specify to use the
Now when we serialize the bean, we get the following:
We can simplify the output by setting the default namespace on the serializer so that all the elements do not need to be prefixed:
This produces the following equivalent where the elements don't need prefixes since they're already in the default document namespace:
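The difference between prefixed elements and a default document namespace can also be seen with the JDK's own StAX writer. The sketch below is independent of Juneau: the DefaultNamespaceSketch class, the "per" prefix, and the namespace URI are all hypothetical, and the element names are picked only for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical demo of prefixed vs. default-namespace output. */
final class DefaultNamespaceSketch {

    // Assumed namespace URI, used only by this sketch.
    static final String NS = "http://example.org/person/";

    /** Writes the document with every element carrying the "per" prefix. */
    static String prefixed(String name) {
        return write("per", name);
    }

    /** Writes the same content with NS as the default namespace (no prefixes). */
    static String defaulted(String name) {
        return write("", name);
    }

    private static String write(String prefix, String name) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement(prefix, "person", NS);
            if (prefix.isEmpty())
                w.writeDefaultNamespace(NS);   // xmlns="..."
            else
                w.writeNamespace(prefix, NS);  // xmlns:per="..."
            w.writeStartElement(prefix, "name", NS);
            w.writeCharacters(name);
            w.writeEndElement();
            w.writeEndElement();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Both variants describe the same infoset: the elements belong to the same namespace either way, and only the amount of prefix noise in the text differs.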
One important property on the XML serializer class is {@link org.apache.juneau.xml.XmlSerializerContext#XML_autoDetectNamespaces XML_autoDetectNamespaces}.
This property tells the serializer to make a first-pass over the data structure to look for namespaces defined on classes and bean properties.
In high-performance environments, you may want to consider disabling auto-detection and providing your own explicit list of namespaces to the serializer
to avoid this scanning step.
The following code will produce the same output as before, but will perform slightly better since it avoids this prescan step.
The {@link org.apache.juneau.annotation.Bean @Bean} and {@link org.apache.juneau.annotation.BeanProperty @BeanProperty} annotations
are used to customize the behavior of beans across the entire framework.
In addition to using them to identify the resource URI for the bean shown above, they have various other uses:
For example, we now add a birthDate property, and associate a transform with it to convert it to an ISO8601 date-time string in GMT time.
By default, Calendars are treated as beans by the framework, which is usually not how you want them serialized.
Using transforms, we can convert them to standardized string forms.
Next, we alter our code to pass in the birthdate:
Now when we rerun the sample code, we'll get the following:
Another useful feature is the {@link org.apache.juneau.annotation.Bean#propertyNamer()} annotation that allows you to plug in your own
logic for determining bean property names.
The {@link org.apache.juneau.PropertyNamerDashedLC} is an example of an alternate property namer.
It converts bean property names to lowercase-dashed format.
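For a feel of what a lowercase-dashed namer does, here is a minimal stand-alone sketch of the conversion. It is not the PropertyNamerDashedLC source: the DashedLowerCaseSketch class is hypothetical, and its handling of edge cases such as consecutive capitals may differ from the real class.

```java
/** Hypothetical camelCase to lowercase-dashed converter. */
final class DashedLowerCaseSketch {

    /** Inserts a dash before each interior upper-case letter and lower-cases it. */
    static String toDashedLowerCase(String name) {
        StringBuilder sb = new StringBuilder(name.length() + 4);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0)
                    sb.append('-');
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, a property named birthDate would be written as birth-date, while a name with no capitals is left untouched.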
In our example, let's add a list-of-beans property to our sample class:
The Address class has the following properties defined:
Next, add some quick-and-dirty code to add an address to our person bean:
Now when we run the sample code, we get the following:
Juneau provides the {@link org.apache.juneau.xml.XmlSchemaSerializer} class for generating XML-Schema documents
that describe the output generated by the {@link org.apache.juneau.xml.XmlSerializer} class.
This class shares the same properties as XmlSerializer.
Since the XML output differs based on settings on the XML serializer class, the XML-Schema serializer
class must have the same property values as the XML serializer class it describes.
To help facilitate creating an XML Schema serializer with the same properties as the corresponding
XML serializer, the {@link org.apache.juneau.xml.XmlSerializer#getSchemaSerializer()} method
has been added.
XML-Schema requires a separate file for each namespace.
Unfortunately, this does not mesh well with the Juneau serializer architecture, which serializes to single writers.
To get around this limitation, the schema serializer will produce a single output, but with multiple
schema documents separated by the null character ('\u0000').
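Splitting such a combined output back into individual schema documents takes only a few lines. The sketch below works on any string that uses the null character as a separator. The SchemaSplitSketch class and its method name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits null-separated schema output. */
final class SchemaSplitSketch {

    /** Returns one entry per non-blank document in the combined output. */
    static List<String> splitDocuments(String combined) {
        List<String> docs = new ArrayList<String>();
        for (String part : combined.split("\u0000")) {
            if (!part.trim().isEmpty())
                docs.add(part);
        }
        return docs;
    }
}
```

Each returned string can then be written to its own file, one per namespace.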
Let's start with an example where everything is in the same namespace.
We'll use the classes from before, but remove the references to namespaces.
Since we have not defined a default namespace, everything is defined under the default Juneau namespace.
The code for creating our POJO model and generating XML Schema is shown below:
Now if we add in some namespaces, we'll see how multiple namespaces are handled.
The schema consists of 4 documents separated by a
For convenience, the {@link org.apache.juneau.xml.XmlSchemaSerializer#getValidator(SerializerSession,Object)} method is provided to create a {@link javax.xml.validation.Validator} using the input from the serialize method.
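Once you have a {@link javax.xml.validation.Validator}, using it is plain JDK code. The sketch below builds one from a small hand-written schema rather than from generated output, so the ValidatorSketch class, the schema, and the element names are all assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Hypothetical demo of validating XML against an XML-Schema document. */
final class ValidatorSketch {

    // Hand-written schema standing in for generated schema output.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='person'><xs:complexType><xs:sequence>"
        + "<xs:element name='id' type='xs:int'/>"
        + "<xs:element name='name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML validates against XSD, false if validation fails. */
    static boolean isValid(String xml) {
        Validator v;
        try {
            SchemaFactory f = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            v = f.newSchema(new StreamSource(new StringReader(XSD))).newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("Schema itself is invalid", e);
        }
        try {
            // With no ErrorHandler set, validation errors surface as SAXException.
            v.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A document whose id element holds a non-numeric value fails validation in this sketch, since the schema declares that element as xs:int.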
The XML serializer is designed to be used against POJO tree structures.
It expects that there are no loops in the POJO model (e.g. children with references to parents, etc...).
If you try to serialize models with loops, you will usually cause a StackOverflowError to
be thrown (if {@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_maxDepth} is not reached first).
If you still want to use the XML serializer on such models, Juneau provides the
{@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_detectRecursions} setting.
It tells the serializer to look for instances of an object in the current branch of the tree and
skip serialization when a duplicate is encountered.
For example, let's make a POJO model out of the following classes:
Now we create a model with a loop and serialize the results.
What we end up with is the following, which does not serialize the contents of the c field:
Without recursion detection enabled, this would cause a stack-overflow error.
Recursion detection introduces a performance penalty of around 20%.
For this reason the setting is disabled by default.
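The general technique behind this kind of recursion detection can be sketched as a tree walk that remembers which objects are on the current branch. This is an illustration of the idea, not Juneau's implementation: the RecursionSketch and Node classes are hypothetical, and node names are used as element names purely for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical serializer that skips objects already on the current branch. */
final class RecursionSketch {

    /** Tree node whose children may point back at an ancestor. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<Node>();
        Node(String name) { this.name = name; }
    }

    static String serialize(Node root) {
        StringBuilder sb = new StringBuilder();
        // Identity-based set: two equal-but-distinct objects are not a loop.
        Set<Node> branch = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        write(root, branch, sb);
        return sb.toString();
    }

    private static void write(Node n, Set<Node> branch, StringBuilder sb) {
        if (!branch.add(n))
            return;                  // already an ancestor on this branch: skip it
        sb.append('<').append(n.name).append('>');
        for (Node child : n.children)
            write(child, branch, sb);
        sb.append("</").append(n.name).append('>');
        branch.remove(n);            // leaving this branch, so siblings may reuse n
    }
}
```

The extra set lookup on every object is the kind of bookkeeping that makes recursion detection slower than a plain walk. Note that an object appearing in two separate branches is still serialized twice, since only the current branch is tracked.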
See the following classes for all configurable properties that can be used on this serializer:
The {@link org.apache.juneau.xml.XmlParser} class is the class used to parse Juneau-generated XML back into POJOs.
A static reusable instance of XmlParser is also provided for convenience:
Let's build upon the previous example and parse the generated XML back into the original bean.
We start with the XML that was generated.
This code produced the following:
The code to convert this back into a bean is:
We print it out to JSON to show that all the data has been preserved:
{
id: 1,
name:
The XML parser is not limited to parsing back into the original bean classes.
If the bean classes are not available on the parsing side, the parser can also be used to parse into a generic model consisting of Maps, Collections, and primitive objects.
You can parse into any Map type (e.g. HashMap, TreeMap), but using {@link org.apache.juneau.ObjectMap} is recommended since it has many convenience methods for converting values to various types.
The same is true when parsing collections. You can use any Collection (e.g. HashSet, LinkedList) or array (e.g. Object[], String[], String[][]), but using {@link org.apache.juneau.ObjectList} is recommended.
When the map or list type is not specified, or is the abstract Map, Collection, or List types, the parser will use ObjectMap and ObjectList by default.
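To show what parsing into a generic model means in practice, the sketch below reads XML into nested Maps and Strings using only the JDK's StAX reader. It is far simpler than the Juneau parser: the GenericModelSketch class is hypothetical, it uses plain LinkedHashMap instead of ObjectMap, it keeps every leaf as a String, it ignores attributes, and it keeps only the last value when child names repeat.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical parser producing Maps (for parents) and Strings (for leaves). */
final class GenericModelSketch {

    /** Parses the root element into either a Map of children or a String. */
    static Object parse(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            r.nextTag();                       // advance to the root start tag
            Object result = readElement(r);
            r.close();
            return result;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    /** Reads from just after a start tag through its matching end tag. */
    private static Object readElement(XMLStreamReader r) throws XMLStreamException {
        Map<String, Object> children = new LinkedHashMap<String, Object>();
        StringBuilder text = new StringBuilder();
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = r.getLocalName();
                children.put(name, readElement(r));
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA) {
                text.append(r.getText());
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                break;
            }
        }
        // Elements with children become Maps; everything else is plain text.
        return children.isEmpty() ? text.toString() : children;
    }
}
```

The result can be navigated with ordinary Map lookups and casts, which is exactly the chore that convenience methods on a richer map type are meant to remove.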
See the following classes for all configurable properties that can be used on this parser:
Juneau provides fully-integrated support for XML serialization/parsing in the REST server and client APIs.
The next two sections describe these in detail.
There are four general ways of defining REST interfaces with support for XML: two using the built-in Juneau Server API, and two using the JAX-RS integration component.
In general, the Juneau REST server API is much more configurable and easier to use than JAX-RS, but beware that the author may be slightly biased in this statement.
The quickest way to implement a REST resource with XML support is to create a subclass of {@link org.apache.juneau.rest.RestServletDefault}.
This class provides support for JSON, XML, HTML, URL-Encoding, and others.
The AddressBookResource example shown in the first chapter uses the RestServletJenaDefault class, which is a subclass of RestServletDefault with additional support for RDF languages.
The start of the class definition is shown below:
Notice how serializer and parser properties can be specified using the @RestResource.properties() annotation.
The
The remaining properties are specific to the HTML serializer.
The $L{...} variables represent localized strings pulled from the resource bundle identified by the messages annotation.
These variables are replaced at runtime based on the HTTP request locale.
Several built-in runtime variable types are defined, and the API can be extended to include user-defined variables.
See {@link org.apache.juneau.rest.RestServlet#getVarResolver()} for more information.
This document won't go into all the details of the Juneau RestServlet class.
Refer to the {@link org.apache.juneau.rest} documentation for more information on the REST servlet class in general.
The rest of the code in the resource class consists of REST methods that simply accept and return POJOs.
The framework takes care of all content negotiation, serialization/parsing, and error handling.
Below are 3 of those methods to give you a general idea of the concept:
The resource class can be registered with the web application like any other servlet, or can be defined as a child of another resource through the {@link org.apache.juneau.rest.annotation.RestResource#children()} annotation.
For fine-tuned control of media types, the {@link org.apache.juneau.rest.RestServlet} class
can be subclassed directly.
The serializers/parsers can be specified through annotations at the class and/or method levels.
An equivalent AddressBookResource class could be defined to only support XML using the following definition:
Likewise, serializers and parsers can be specified/augmented/overridden at the method level like so:
The {@link org.apache.juneau.rest.annotation.RestMethod#serializersInherit()} and
{@link org.apache.juneau.rest.annotation.RestMethod#parsersInherit()} annotations control how various artifacts
are inherited from the parent class.
Refer to {@link org.apache.juneau.rest} for additional information on using these annotations.
XML media type support in JAX-RS can be achieved by using the {@link org.apache.juneau.rest.jaxrs.DefaultProvider} class.
It implements the JAX-RS MessageBodyReader and MessageBodyWriter interfaces for all Juneau supported media types.
The DefaultProvider class definition is shown below:
That's the entire class. It consists of only annotations to hook up media types to Juneau serializers and parsers.
The
To enable the provider, you need to make the JAX-RS environment aware of it. In Wink, this is accomplished by adding an entry to a config file.
Simply include a reference to the provider in the configuration file.
org.apache.juneau.rest.jaxrs.DefaultProvider
Properties can be specified on providers through the {@link org.apache.juneau.rest.jaxrs.JuneauProvider#properties()} annotation.
Properties can also be specified at the method level by using the {@link org.apache.juneau.rest.annotation.RestMethod#properties} annotation, like so:
In general, the Juneau REST API is considerably more flexible than the JAX-RS API, since you can specify and override
serializers, parsers, properties, transforms, converters, guards, etc... at both the class and method levels.
Therefore, the JAX-RS API has the following limitations that the Juneau Server API does not:
XmlSerializer with one set of properties on one class, and another instance with different properties on another class.
To provide support for only XML media types, you can define your own provider class, like so:
Then register it with Wink the same way as DefaultProvider.
The {@link org.apache.juneau.rest.client.RestClient} class provides an easy-to-use REST client interface with
pluggable media type handling using any of the Juneau serializers and parsers.
Defining a client to support XML media types on HTTP requests and responses can be done in one line of code:
The client handles all content negotiation based on the registered serializers and parsers.
The following code is pulled from the main method of the ClientTest class in the sample web application, and is run against the AddressBookResource class running within the sample app.
It shows how the client can be used to interact with the REST API while completely hiding the negotiated content type and working with nothing more than beans.
String root =
Number of entries = 2
Deleted person Barack Obama, response = DELETE successful
Deleted person George Walker Bush, response = DELETE successful
Number of entries = 0
Created person Barack Obama, uri = http://localhost:9080/sample/addressBook/people/3
Created person George Walker Bush, uri = http://localhost:9080/sample/addressBook/people/4
Created address http://localhost:9080/sample/addressBook/addresses/7
Created address http://localhost:9080/sample/addressBook/addresses/8
Changed name, response = PUT successful
New name = Barack Hussein Obama