Jena-based RDF serialization and parsing support
Juneau supports serializing and parsing arbitrary POJOs to and from the RDF formats handled through Jena,
including RDF/XML, abbreviated RDF/XML, N-Triple, Turtle, and N3.
Juneau can serialize and parse instances of any of the following POJO types:
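Java primitive objects (e.g. String, Integer, Boolean, Float).
Java collections framework objects (e.g. HashSet, TreeMap) containing anything on this list.
Classes with standard transformations to and from Strings (e.g. classes containing toString(), fromString(), valueOf(), constructor(String)).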
In addition to the types shown above, Juneau includes the ability to define 'swaps' to transform non-standard
object and property types to serializable forms (e.g. to transform Calendars to and from ISO8601 strings,
or byte[] arrays to and from base-64 encoded strings).
These can be associated with serializers/parsers, or can be associated with classes or bean properties through
type and method annotations.
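To make the idea of a swap concrete, here is a minimal JDK-only sketch of the byte[] to base-64 transformation mentioned above. The class name ByteArrayBase64SwapSketch and its swap/unswap methods are hypothetical and are not Juneau's swap API; they only illustrate the two-way conversion a swap performs.

```java
import java.util.Base64;

// Hypothetical illustration only; this is NOT Juneau's swap API.
class ByteArrayBase64SwapSketch {

    // "Swap": convert the non-standard type (byte[]) into a serializable form (String).
    static String swap(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // "Unswap": rebuild the original byte[] from the serialized string.
    static byte[] unswap(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }
}
```

With this sketch, the UTF-8 bytes of "hello" are swapped to the string aGVsbG8= and unswapped back to the same bytes.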
Refer to POJO Categories for a complete definition of supported POJOs.
Juneau uses the Jena library for these formats.
The predefined serializers and parsers convert POJOs to and from RDF models and then use Jena to convert
them to and from the various RDF languages.
Jena libraries must be provided on the classpath separately if you plan on making use of the RDF support.
The minimum list of required jars is:
jena-core-2.7.1.jar
jena-iri-0.9.2.jar
log4j-1.2.16.jar
slf4j-api-1.6.4.jar
slf4j-log4j12-1.6.4.jar
The example shown here is from the Address Book resource located in the org.apache.juneau.sample.war application.
The POJO model consists of a List of Person beans, with each Person containing zero or more Address beans.
When you point a browser at /sample/addressBook, the POJO is rendered as HTML.
By appending ?Accept=mediaType&plainText=true to the URL, you can view the data in the various supported RDF formats.
The {@link org.apache.juneau.jena.RdfSerializer} class is the top-level class for all Jena-based serializers.
Language-specific serializers are defined as inner subclasses of the RdfSerializer class:
Static reusable instances of serializers are also provided with default settings:
Abbreviated RDF/XML is currently the most widely accepted and readable RDF syntax, so the examples shown here will use that format.
For brevity, the examples use public fields instead of getters and setters.
In the real world, you'll typically want to use standard bean getters and setters.
To start off simple, we'll begin with the following simplified bean and build it up.
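As a rough illustration of the kind of simplified bean meant here, the following sketch uses public fields as described above. The field names id and name are assumptions chosen for illustration and may differ from the original listing.

```java
// Hypothetical sketch of a simplified bean; the field names are assumptions.
class Person {

    // Bean properties (public fields are used only to keep the example short).
    public int id;
    public String name;

    // No-arg constructor, typically needed so a parser can instantiate the bean.
    public Person() {}

    // Convenience constructor for building sample data.
    public Person(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
```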
The following code shows how to convert this to abbreviated RDF/XML:
Serializers can also be created by cloning existing serializers:
This code produces the following output:
Notice that we've taken an arbitrary POJO and converted it to RDF.
The Juneau serializers and parsers are designed to work with arbitrary POJOs without requiring any annotations.
That being said, several annotations are provided to customize how POJOs are handled to produce usable RDF.
You'll notice in the previous example that Juneau namespaces are used to represent bean property names.
These are used by default when namespaces are not explicitly specified.
The juneau namespace is used for generic names for objects that don't have namespaces associated with them.
The juneaubp namespace is used on bean properties that don't have namespaces associated with them.
The easiest way to specify namespaces is through annotations.
In this example, we're going to associate the prefix 'per' with our bean class and all properties of this class.
We do this by adding the following annotation to our class:
In general, the best approach is to define the namespace URIs at the package level using a
package-info.java file, like so:
This assigns a default prefix of
Now when we rerun the sample code, we'll get the following:
Namespace auto-detection ({@link org.apache.juneau.xml.XmlSerializerContext#XML_autoDetectNamespaces}) is
enabled on serializers by default.
This causes the serializer to make a first pass over the data structure to look for namespaces.
In high-performance environments, you may want to consider disabling auto-detection and providing an
explicit list of namespaces to the serializer to avoid this scanning step.
This code change will produce the same output as before, but will perform slightly better since it doesn't have to crawl the POJO tree before serializing the result.
Bean properties of type java.net.URI or java.net.URL have special meaning to the RDF serializer.
They are interpreted as resource identifiers.
In the following code, we're adding two new properties.
The first property is annotated with
The second un-annotated property is interpreted as a reference to another resource.
We alter our code to pass in values for these new properties.
Now when we run the sample code, we get the following:
The {@link org.apache.juneau.annotation.URI} annotation can also be used on classes and properties
to identify them as URLs when they're not instances of java.net.URI or java.net.URL
(not needed if
is already specified).
The following properties would have produced the same output as before.
Note that the
Also take note of the {@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_uriResolution}, {@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_uriRelativity}, and {@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_uriContext} settings that can be specified on the serializer to resolve relative and context-root-relative URIs to fully-qualified URIs.
This can be useful if you want to keep the URI authority and context root information out of the bean logic layer.
The following code produces the same output as before, but the URIs on the beans are relative.
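The general idea of resolving a relative URI against a base can be sketched with the JDK alone. This is not Juneau's resolution logic and does not use the settings named above; the class UriResolutionSketch and the base URI used in the usage note are assumptions for illustration only.

```java
import java.net.URI;

// Hypothetical illustration only; Juneau's own URI resolution is configured
// through serializer settings and is not shown here.
class UriResolutionSketch {

    // Resolves a possibly relative URI against a base URI (authority plus context path).
    // Absolute URIs are returned unchanged by java.net.URI.resolve.
    static URI resolve(String base, String uri) {
        return URI.create(base).resolve(uri);
    }
}
```

For example, resolving person/1 against the assumed base http://myhost/sample/addressBook/ yields http://myhost/sample/addressBook/person/1, so the bean only needs to carry the relative part.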
The {@link org.apache.juneau.annotation.Bean} and {@link org.apache.juneau.annotation.BeanProperty}
annotations are used to customize the behavior of beans across the entire framework.
In addition to using them to identify the resource URI for the bean shown above, they have various other
uses:
For example, we now add a birthDate property, and associate a swap with it to transform it to an
ISO8601 date-time string in GMT.
By default, Calendars are treated as beans by the framework, which is usually not how you want them serialized.
Using swaps, we can convert them to standardized string forms.
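The conversion such a swap performs can be sketched with java.time. The class CalendarIso8601SwapSketch below is hypothetical and is not the swap class shipped with Juneau; it only shows a Calendar being rendered as an ISO8601 date-time string in GMT and parsed back.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical illustration only; this is NOT Juneau's Calendar swap.
class CalendarIso8601SwapSketch {

    // Renders the calendar's instant as an ISO8601 string in GMT/UTC,
    // regardless of the calendar's own time zone.
    static String swap(Calendar calendar) {
        return calendar.toInstant().toString();
    }

    // Parses an ISO8601 instant string back into a Calendar set to UTC.
    static Calendar unswap(String iso8601) {
        Instant instant = Instant.parse(iso8601);
        return GregorianCalendar.from(ZonedDateTime.ofInstant(instant, ZoneOffset.UTC));
    }
}
```

A calendar set to midnight on 12 August 1946 in the GMT-05:00 zone is swapped to 1946-08-12T05:00:00Z, and unswapping that string gives a calendar representing the same instant.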
And we alter our code to pass in the birthdate.
Now when we rerun the sample code, we'll get the following:
Collections and arrays are converted to RDF sequences.
In our example, let's add a list-of-beans property to our sample class:
The Address class has the following properties defined:
Next, add some quick-and-dirty code to add an address to our person bean:
Now when we run the sample code, we get the following:
For all RDF languages, the POJOs get broken down into simple triples.
Unfortunately, for tree-structured data like the POJOs shown above, this causes the root node of the tree
to become lost.
There is no easy way to identify that person/1 is the root node in our tree once it is in triple form,
and in some cases it's impossible.
By default, the {@link org.apache.juneau.jena.RdfParser} class handles this by scanning all the nodes and
identifying the nodes without incoming references.
However, this is inefficient, especially for large models.
And in cases where the root node is referenced by another node in the model by URL, it's not possible to
locate the root at all.
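The scanning approach described above can be sketched as follows. This is a simplified illustration, not RdfParser's implementation: triples are modeled as plain String arrays of {subject, predicate, object}, and the class name RootFinderSketch is hypothetical. It also assumes literal values never coincide with a subject URI.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is NOT how RdfParser is implemented.
class RootFinderSketch {

    // Each triple is {subject, predicate, object}.
    // A root candidate is any subject that never appears as the object of a triple,
    // i.e. a node without incoming references. Every triple must be scanned.
    static Set<String> findRoots(List<String[]> triples) {
        Set<String> subjects = new LinkedHashSet<>();
        Set<String> objects = new LinkedHashSet<>();
        for (String[] triple : triples) {
            subjects.add(triple[0]);
            objects.add(triple[2]);
        }
        subjects.removeAll(objects);
        return subjects;
    }
}
```

With triples for person/1 (a name and a reference to address/1) and address/1 (a city), the only root candidate is person/1. If another triple points back at person/1, no candidate remains, which is the failure case described above.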
To resolve this issue, the property {@link org.apache.juneau.jena.RdfSerializerContext#RDF_addRootProperty}
was introduced.
When enabled, this adds a special root attribute to the root node to make it easy to locate by the parser.
To enable, set the
Now when we rerun the sample code, we'll see the added root attribute on the root resource.
XML-Schema data-types can be added to non-String literals through the
{@link org.apache.juneau.jena.RdfSerializerContext#RDF_addLiteralTypes} setting.
To enable, set the
Now when we rerun the sample code, we'll see the XML-Schema data-types added to the non-String literals.
The RDF serializer is designed to be used against tree structures.
It expects that there are no loops in the POJO model (e.g. children with references to parents).
If you try to serialize models with loops, you will usually cause a StackOverflowError to be thrown
(if {@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_maxDepth} is not reached first).
If you still want to use the RDF serializer on such models, Juneau provides the
{@link org.apache.juneau.serializer.SerializerContext#SERIALIZER_detectRecursions} setting.
It tells the serializer to look for instances of an object in the current branch of the tree and skip
serialization when a duplicate is encountered.
Recursion detection introduces a performance penalty of around 20%.
For this reason the setting is disabled by default.
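The technique described above, tracking the objects on the current branch and skipping any that reappear, can be sketched with an identity-based set. This is an illustration under assumptions, not Juneau's implementation: the Node type and the visit method are hypothetical, and the traversal simply records names instead of producing RDF.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is NOT Juneau's recursion detection code.
class RecursionDetectionSketch {

    // Minimal tree node that may (incorrectly) point back at an ancestor.
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Returns node names in the order they would be serialized.
    static List<String> visit(Node root) {
        // Identity-based set holding only the objects on the current branch.
        Set<Node> branch = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        List<String> out = new ArrayList<>();
        visit(root, branch, out);
        return out;
    }

    private static void visit(Node node, Set<Node> branch, List<String> out) {
        // If the node is already on the current branch, it is a loop: skip it.
        if (!branch.add(node)) return;
        out.add(node.name);
        for (Node child : node.children) visit(child, branch, out);
        // Leaving this branch, so the node may legitimately appear again elsewhere.
        branch.remove(node);
    }
}
```

Because only the current branch is tracked, a node shared by two sibling branches is still serialized twice, while a child that points back to its parent is skipped. The extra bookkeeping on every object is where a performance cost of this kind comes from.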
See the following classes for all configurable properties that can be used on this serializer:
The {@link org.apache.juneau.jena.RdfParser} class is the top-level class for all Jena-based parsers.
Language-specific parsers are defined as inner subclasses of the RdfParser class:
The RdfParser.Xml parser handles both regular and abbreviated RDF/XML.
Static reusable instances of parsers are also provided with default settings:
For an example, we will build upon the previous example and parse the generated RDF/XML back into the original bean.
We print it out to JSON to show that all the data has been preserved:
{
uri:
The RDF parser is not limited to parsing back into the original bean classes.
If the bean classes are not available on the parsing side, the parser can also be used to parse into a
generic model consisting of Maps, Collections, and primitive objects.
You can parse into any Map type (e.g. HashMap, TreeMap), but using {@link org.apache.juneau.ObjectMap}
is recommended since it has many convenience methods for converting values to various types.
The same is true when parsing collections. You can use any Collection (e.g. HashSet, LinkedList) or array
(e.g. Object[], String[], String[][]), but using {@link org.apache.juneau.ObjectList} is recommended.
When the map or list type is not specified, or is the abstract Map, Collection, or List type, the parser
will use ObjectMap and ObjectList by default.
In the following example, we parse into an ObjectMap and use the convenience methods for
performing data conversion on values in the map.
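To show what this kind of value conversion looks like on a generic model, here is a JDK-only sketch that reads an entry as an int whether it was stored as a number or as text. The class GenericMapSketch and its getInt helper are hypothetical; they are not ObjectMap's API and only illustrate the sort of work such convenience methods save.

```java
import java.util.Map;

// Hypothetical illustration only; this is NOT ObjectMap's API.
class GenericMapSketch {

    // Reads an entry as an int, accepting either a Number or its string form.
    static int getInt(Map<String, Object> map, String key) {
        Object value = map.get(key);
        if (value == null) {
            throw new IllegalArgumentException("No entry for key: " + key);
        }
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        // Fall back to parsing the textual form of the value.
        return Integer.parseInt(value.toString().trim());
    }
}
```

Without a helper like this, every caller working on a plain Map has to repeat the type checks and parsing itself.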
However, there are caveats when parsing into generic models due to the nature of RDF.
Watch out for the following:
ObjectMap and ObjectList since various methods are provided for converting to the correct type anyway.
We can see some of these when we render the ObjectMap back to JSON.
System.
This is what's produced:
{
uri:
As a general rule, parsing into beans is often more efficient than parsing into generic models.
And working with beans is often less error prone than working with generic models.
See the following classes for all configurable properties that can be used on this parser: