java.lang.Object
  org.openjena.atlas.data.AbstractDataBag<E>
    org.openjena.atlas.data.SortedDataBag<E>
      org.openjena.atlas.data.DistinctDataBag<E>
public class DistinctDataBag<E> extends SortedDataBag<E>
This data bag will gather distinct items in memory until a size threshold is passed, at which point it will write out all of the items to disk using the supplied serializer.
After adding is finished, call iterator() to set up the data bag for reading back items and iterating over them. The iterator will retrieve only distinct items.

IMPORTANT: You may not add any more items after this call. You may subsequently call iterator() multiple times, which will give you a new iterator for each invocation. If you do not consume the entire iterator, you should call Iter.close(Iterator) to close any FileInputStreams associated with the iterator.

Additionally, make sure to call SortedDataBag.close() when you are finished to free any system resources (preferably in a finally block).
Implementation Notes: Data is stored without duplicates in a HashSet as it arrives. When it is time to spill, that data is sorted and written to disk. An iterator that eliminates adjacent duplicates is used in conjunction with the SortedDataBag's iterator.
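The implementation notes above can be illustrated with a small, self-contained sketch. This is not the Jena source: `SpillingDistinctStrings`, its `threshold` parameter, `spillCount()` and `distinctSorted()` are hypothetical names, the bag holds only single-line strings (so a plain text file can stand in for the serializer), and a fixed item count stands in for the threshold policy. It only shows the general technique: deduplicate in a HashSet, spill sorted runs to disk, then merge the runs while skipping adjacent duplicates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Hypothetical stand-in (not the Jena class): a distinct bag of single-line strings. */
final class SpillingDistinctStrings implements AutoCloseable {
    /** Current head of one sorted run, plus the rest of that run. */
    private static final class Head implements Comparable<Head> {
        final String value;
        final Iterator<String> rest;
        Head(String value, Iterator<String> rest) { this.value = value; this.rest = rest; }
        @Override public int compareTo(Head other) { return value.compareTo(other.value); }
    }

    private final int threshold;                       // assumed policy: spill at this many items
    private final Set<String> memory = new HashSet<>();
    private final List<Path> spillFiles = new ArrayList<>();

    SpillingDistinctStrings(int threshold) { this.threshold = threshold; }

    void add(String item) {
        memory.add(item);                              // the HashSet drops in-memory duplicates
        if (memory.size() >= threshold) spill();
    }

    int spillCount() { return spillFiles.size(); }

    /** Sorts the in-memory items, writes them as one run (one item per line), clears memory. */
    private void spill() {
        List<String> run = new ArrayList<>(memory);
        Collections.sort(run);
        try {
            Path file = Files.createTempFile("distinct-bag", ".spill");
            Files.write(file, run, StandardCharsets.UTF_8);
            spillFiles.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        memory.clear();
    }

    /** Merges all sorted runs and skips adjacent duplicates. Returns a list for brevity. */
    List<String> distinctSorted() {
        List<BufferedReader> readers = new ArrayList<>();
        PriorityQueue<Head> heads = new PriorityQueue<>();
        List<String> result = new ArrayList<>();
        try {
            List<String> inMemory = new ArrayList<>(memory);
            Collections.sort(inMemory);
            offer(heads, inMemory.iterator());         // unspilled items form one more run
            for (Path file : spillFiles) {
                BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                readers.add(reader);
                offer(heads, reader.lines().iterator());
            }
            String previous = null;
            while (!heads.isEmpty()) {
                Head head = heads.poll();
                if (!head.value.equals(previous)) {    // equal items are adjacent in merge order
                    result.add(head.value);
                    previous = head.value;
                }
                offer(heads, head.rest);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (BufferedReader reader : readers) {
                try { reader.close(); } catch (IOException ignored) { /* best effort */ }
            }
        }
    }

    private static void offer(PriorityQueue<Head> heads, Iterator<String> run) {
        if (run.hasNext()) heads.add(new Head(run.next(), run));
    }

    @Override
    public void close() {
        for (Path file : spillFiles) {
            try { Files.deleteIfExists(file); } catch (IOException e) { throw new UncheckedIOException(e); }
        }
        spillFiles.clear();
        memory.clear();
    }
}
```

With a threshold of 3 and the inputs pear, apple, pear, fig, apple, kiwi, fig, date, the sketch spills two runs and the merge yields apple, date, fig, kiwi, pear. Duplicates that land in different runs (apple and fig here) survive the HashSet and are removed only during the merge, which is why the adjacent-duplicate filter is needed. A streaming implementation would return a lazy iterator instead of a list.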
Constructor Summary |
---|
DistinctDataBag(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, Comparator<E> comparator) |
Method Summary | |
---|---|
boolean | isDistinct(): Find out if the bag is distinct. |
boolean | isSorted(): Find out if the bag is sorted. |
Iterator<E> | iterator(): Returns an iterator over a set of elements of type E. |
Methods inherited from class org.openjena.atlas.data.SortedDataBag |
---|
add, close, flush |
Methods inherited from class org.openjena.atlas.data.AbstractDataBag |
---|
addAll, addAll, isEmpty, send, size |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DistinctDataBag(ThresholdPolicy<E> policy, SerializationFactory<E> serializerFactory, Comparator<E> comparator)
Method Detail |
---|
public boolean isSorted()
Description copied from interface: DataBag
Find out if the bag is sorted.
Specified by: isSorted in interface DataBag<E>
Overrides: isSorted in class SortedDataBag<E>
public boolean isDistinct()
Description copied from interface: DataBag
Find out if the bag is distinct.
Specified by: isDistinct in interface DataBag<E>
Overrides: isDistinct in class SortedDataBag<E>
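public Iterator<E> iterator()
Description copied from class: SortedDataBag
Returns an iterator over a set of elements of type E. If you do not consume the entire iterator, call Iter.close(Iterator) to be sure any open file handles are closed.
Specified by: iterator in interface Iterable<E>
Overrides: iterator in class SortedDataBag<E>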
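The need to close a partially consumed iterator can be sketched without the library. Everything below is hypothetical: `LineIterator` stands in for an iterator backed by a spill file, and `IteratorDemo.closeIfCloseable` is an assumed helper in the spirit of Iter.close(Iterator); this page does not show how the real method is implemented. The point is only the pattern: stop early, then release the file handle in a finally block.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical file-backed iterator; it holds an open file handle until closed. */
final class LineIterator implements Iterator<String>, AutoCloseable {
    private final BufferedReader reader;
    private String next;
    private boolean open = true;

    LineIterator(Path file) {
        try {
            reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
            next = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isOpen() { return open; }

    @Override public boolean hasNext() { return next != null; }

    @Override public String next() {
        if (next == null) throw new NoSuchElementException();
        String current = next;
        try {
            next = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return current;
    }

    @Override public void close() {
        try {
            reader.close();
            open = false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class IteratorDemo {
    /** Assumed helper: closes the iterator only if it is closeable. */
    static void closeIfCloseable(Iterator<?> it) {
        if (it instanceof AutoCloseable) {
            try {
                ((AutoCloseable) it).close();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /** Reads only the first item, then releases the handle even though items remain. */
    static String firstItem(Iterator<String> it) {
        try {
            return it.hasNext() ? it.next() : null;
        } finally {
            closeIfCloseable(it);
        }
    }

    /** Writes the given lines to a temporary file, for demonstration purposes. */
    static Path writeTemp(String... lines) {
        try {
            Path file = Files.createTempFile("iterator-demo", ".txt");
            Files.write(file, Arrays.asList(lines), StandardCharsets.UTF_8);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `IteratorDemo.firstItem(new LineIterator(IteratorDemo.writeTemp("a", "b", "c")))` returns the first line and leaves the iterator closed, even though two items were never read.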