Partitions

Partitions are entry stores assigned to a naming context. The idea behind a partition is that it stores a subset of the Directory Information Base (DIB). Partitions can be implemented in any way so long as they adhere to the server's partition interfaces.

Status

Presently the server has a single partition implementation. This implementation is used for both the system partition and user partitions. It uses JDBM as the underlying B+Tree implementation for storing entries.

Other implementations are possible. I'm particularly interested in memory based partitions, either BTree based or built on something like Prevayler.

Partitions have simple interfaces that can be used to align any data source to the LDAP data model, making it accessible via JNDI or via LDAP over the wire. This makes the server very flexible as a bridge for standardizing access to disparate data sources and formats. Dynamic mapping based backends are also interesting.

System Partition

The system partition is a very special partition that is hardcoded to hang off of the ou=system naming context. It is always present and contains administrative and operational information needed by the server to operate. Hence its name.

The server's subsystems will use this partition to store information critical to their operation. Things like triggers, stored procedures, access control instructions and schema information can be maintained here.

Root Nexus

Several partitions can be assigned to different naming contexts within the server so long as their names do not overlap, meaning one partition's naming context may not be contained within another's. The root nexus is a fake partition that does not really store entries. It maps other entry storing partitions to naming contexts and routes backing store calls to the partition containing the entry associated with the operation.
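
Conceptually the nexus does a longest-suffix match on the distinguished name targeted by each operation. Below is a minimal sketch of that lookup, assuming a hypothetical Partition type and simplified string matching on normalized DNs; the real nexus matches DN components properly:

import java.util.HashMap;
import java.util.Map;

// hypothetical marker for anything that stores entries
interface Partition {}

class RootNexusSketch
{
    // maps a normalized suffix DN to the partition mounted there
    private final Map<String, Partition> bySuffix = new HashMap<String, Partition>();

    void register( String suffix, Partition partition )
    {
        bySuffix.put( suffix.toLowerCase(), partition );
    }

    // picks the partition whose suffix is the longest match for the DN;
    // real DN matching is component-wise, this string test is simplified
    Partition route( String dn )
    {
        String normalized = dn.toLowerCase();
        Partition best = null;
        int bestLen = -1;

        for ( Map.Entry<String, Partition> e : bySuffix.entrySet() )
        {
            String suffix = e.getKey();
            boolean matches = normalized.equals( suffix )
                    || normalized.endsWith( "," + suffix );

            if ( matches && suffix.length() > bestLen )
            {
                best = e.getValue();
                bestLen = suffix.length();
            }
        }

        return best;
    }
}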

User Partitions

User partitions are partitions added by users. When you download and start using the server you may want to create a separate partition to store the entries of your application. To us, user partitions (sometimes also referred to as application partitions) are simply those that are not the system partition! In the following section we describe how a user partition can be created in the server.

Adding User Partitions

Adding new application partitions to the server is a matter of setting the right JNDI environment properties. These properties are used in both standalone and embedded configurations. We will show you how to configure partitions by example, using properties files and programmatically.

Using Properties Files

Obviously properties files are not the best way to configure a large system like an LDAP server. However properties files are the JNDI standard for pulling in configuration. The server's JNDI provider tries to honor this. Hence the use of a properties file for configuration. Below we have the configuration of two user defined partitions within a properties file. These partitions are for the naming contexts: dc=apache,dc=org and ou=test.

# all multivalued properties are space separated like the list of partitions here
server.db.partitions=apache test

# apache partition configuration
server.db.partition.suffix.apache=dc=apache,dc=org
server.db.partition.indices.apache=ou cn objectClass uid
server.db.partition.attributes.apache.dc=apache
server.db.partition.attributes.apache.objectClass=top domain extensibleObject

# test partition configuration
server.db.partition.suffix.test=ou=test
server.db.partition.indices.test=ou objectClass
server.db.partition.attributes.test.ou=test
server.db.partition.attributes.test.objectClass=top organizationalUnit extensibleObject

Although somewhat ugly, the way we use properties for settings is portable across JNDI LDAP providers. Hopefully we can build a tool on top of this to save the user some hassle. Another approach may be to use XML or some other format that is easier to work with, and generate these properties from it. For now this is the best non-specific (to the server's provider) means we have to inject settings through JNDI environment Hashtables while still being able to load settings via properties files. Properties files are the common denominator. Another, easier means to configure the server is possible programmatically.
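
A minimal sketch of pulling such a properties file into a JNDI environment Hashtable follows; the file name, context factory class and credentials below are placeholders for illustration, not the server's actual values:

import java.io.FileInputStream;
import java.util.Hashtable;
import java.util.Properties;

import javax.naming.Context;
import javax.naming.directory.InitialDirContext;

public class LoadConfig
{
    public static void main( String[] args ) throws Exception
    {
        // load the partition settings from a properties file
        // (the file name here is just an assumption for this example)
        Properties props = new Properties();
        props.load( new FileInputStream( "server.properties" ) );

        // copy them into the environment Hashtable handed to JNDI
        Hashtable env = new Hashtable( props );

        // the standard JNDI properties still apply; the factory class
        // name below is a placeholder for the server provider's factory
        env.put( Context.INITIAL_CONTEXT_FACTORY, "com.example.ServerContextFactory" );
        env.put( Context.PROVIDER_URL, "ou=system" );
        env.put( Context.SECURITY_PRINCIPAL, "uid=admin,ou=system" );
        env.put( Context.SECURITY_CREDENTIALS, "secret" );
        env.put( Context.SECURITY_AUTHENTICATION, "simple" );

        InitialDirContext ctx = new InitialDirContext( env );
    }
}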

Partition Id

Briefly we'll explain these properties and the scheme used. A partition's properties are grouped together using the partition's id. All partition ids are listed, space separated, using the server.db.partitions property: above it lists the ids for the two partitions, apache and test.

Naming Context

Partitions need to know the naming context they will store entries for. This naming context is also referred to as the suffix, since all entries in the partition share this common suffix. The suffix is a distinguished name. The property key for the suffix of a partition is composed of the property key base, server.db.partition.suffix., concatenated with the id of the partition: server.db.partition.suffix.${id}. For example if the partition id is foo, then the suffix key would be server.db.partition.suffix.foo.

User Defined Indices

Partitions can have indices on attributes. Unlike OpenLDAP, where you can build specific types of indices, the server's indices are of a single type. For each partition, a key is assembled from the partition id and the property key base: server.db.partition.indices.${id}. So again for foo the key for attribute indices would be server.db.partition.indices.foo. The value of the key is a space separated list of attributeType names to index. For example the apache partition has indices built on ou, cn, objectClass and uid.

Suffix Entry

When a partition is created, the root entry of the context corresponding to the suffix of the partition must be created as well. This entry is composed of single-valued and multi-valued attributes. We must specify these attributes as well as their values. To do so we again use a key composed of a base, however this time we use both the id of the partition and the name of the attribute: server.db.partition.attributes.${id}.${name}. So for partition foo and attribute bar the following key would be used: server.db.partition.attributes.foo.bar. The value of the key is a space separated list of values for the bar attribute. For example the apache partition's suffix has an objectClass attribute and its values are set to: top domain extensibleObject.
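
Pulling the three kinds of keys together, the complete property set for the hypothetical foo partition would look something like this (the suffix, index list and attribute values are made up purely for illustration):

# hypothetical 'foo' partition showing all three kinds of keys together
server.db.partitions=foo
server.db.partition.suffix.foo=dc=foo,dc=com
server.db.partition.indices.foo=ou objectClass
server.db.partition.attributes.foo.dc=foo
server.db.partition.attributes.foo.objectClass=top domain extensibleObject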

Programmatically

This is simple: create a Hashtable and stuff it with those same properties. But that's a real pain. The other option is to set all the properties that way minus the ones for the suffix entry's attributes. We have a shortcut where you can set an Attributes object within the Hashtable and it will get picked up instead of using the standard property scheme above.

Simply put the Attributes into the Hashtable using the key server.db.partition.attributes.${id}. Below we show how this can be done for two partitions with the ids testing and example:

import java.util.Hashtable;

import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;

// the environment Hashtable that will later seed the InitialDirContext;
// EnvKeys comes from the server's JNDI provider
Hashtable extras = new Hashtable();

// build the suffix entry attributes for the testing partition
BasicAttributes attrs = new BasicAttributes( true );
BasicAttribute attr = new BasicAttribute( "objectClass" );
attr.add( "top" );
attr.add( "organizationalUnit" );
attr.add( "extensibleObject" );
attrs.put( attr );
attr = new BasicAttribute( "ou" );
attr.add( "testing" );
attrs.put( attr );

// declare both partitions, then configure the testing partition
extras.put( EnvKeys.PARTITIONS, "testing example" );
extras.put( EnvKeys.SUFFIX + "testing", "ou=testing" );
extras.put( EnvKeys.INDICES + "testing", "ou objectClass" );
extras.put( EnvKeys.ATTRIBUTES + "testing", attrs );

// build the suffix entry attributes for the example partition
attrs = new BasicAttributes( true );
attr = new BasicAttribute( "objectClass" );
attr.add( "top" );
attr.add( "domain" );
attr.add( "extensibleObject" );
attrs.put( attr );
attr = new BasicAttribute( "dc" );
attr.add( "example" );
attrs.put( attr );

// configure the example partition
extras.put( EnvKeys.SUFFIX + "example", "dc=example" );
extras.put( EnvKeys.INDICES + "example", "ou dc objectClass" );
extras.put( EnvKeys.ATTRIBUTES + "example", attrs );

Ok that does not look any shorter. We'll improve on this in the future. Perhaps we'll enable the use of configuration beans that can be used with an SPI specific to the server. However this starts making your code server provider specific: you could no longer just change properties and use the Sun provider to keep your code location independent.

Future Progress

Partition Nesting

Today we have some limitations to the way we can partition the DIB. Namely we can't have a partition within a partition, even though sometimes this makes sense. Eventually we intend to enable this kind of functionality using a special type of nexus which is both a router and a backing store for entries. It would be smart enough to know when to route versus when to use its own database. There is a JIRA improvement specifically aimed at achieving this goal.

Partition Variety

Obviously we want as many different kinds of partitions as possible. Some really cool ideas have floated around out there for a while. Here's a list of theoretically possible partition types that might be useful or just cool:

  • Partitions that use JDBC to store entries. These would probably be way too slow. However they might be useful if some mapping were to be used to represent an existing application's database schema as an LDAP DIT. This would allow us to expose any database data via LDAP.
  • Partitions using other LDAP servers to store their entries. Why do this when it introduces latency? Perhaps you want to proxy other servers or make other servers behave like this server.
  • A partition that serves out the Windows registry via LDAP. A standard mechanism to map the Windows registry to an LDAP DIT is pretty simple. This would be a neat way to expose client machine registry management.
  • A partition based on Sleepycat's Berkeley DB Java Edition (JE). I was going to try this to see how it performs against JDBM.
  • A partition based on an in-memory BTree implementation. This would be fast and really cool for storing things like schema info. It would also be cool for staging data between memory and disk.
  • A partition based on Prevayler. This is like an in-memory partition but you can save it at the end of the day. This might be really useful, especially for things like the system partition which almost always needs to be in memory. The system partition can approximate this today by using really large caches equal in size to the number of entries it holds.

Partitioning entries under a single context?

Other aspirations include entry partitioning within a container context. Imagine having 250 million entries under ou=citizens,dc=census,dc=gov. You don't want all 250 million in one partition, but would like to sub-partition these entries under the same context based on some attribute. Basically the attribute's value is used to hash entries across buckets, where each bucket is another partition, so that within a single context we are partitioning entries. Yeah this is a bit wild, but it would be useful in several situations. A rough sketch of the bucket routing idea follows.
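
To make the idea concrete, here is a minimal sketch of hashing an attribute value across a fixed set of bucket partitions; the Partition type and class below are hypothetical and purely illustrative:

// hypothetical marker for anything that stores entries
interface Partition {}

class BucketingNexusSketch
{
    private final Partition[] buckets; // each bucket is its own partition

    BucketingNexusSketch( Partition[] buckets )
    {
        this.buckets = buckets;
    }

    // routes an entry using the value of the partitioning attribute,
    // e.g. a citizen's uid under ou=citizens,dc=census,dc=gov
    Partition bucketFor( String partitioningAttributeValue )
    {
        // clear the sign bit so the modulus is never negative
        int hash = partitioningAttributeValue.hashCode() & 0x7fffffff;
        return buckets[hash % buckets.length];
    }
}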