org.apache.hadoop.hbase.codec.prefixtree.encode.column
Class ColumnSectionWriter

java.lang.Object
  extended by org.apache.hadoop.hbase.codec.prefixtree.encode.column.ColumnSectionWriter

@InterfaceAudience.Private
public class ColumnSectionWriter
extends Object

Takes the tokenized family or qualifier data and flattens it into a stream of bytes. The family section is written after the row section, and the qualifier section is written after the family section.

The family and qualifier tries, or "column tries", are structured differently from the row trie. A column trie cannot be reassembled without external data about the offsets of its leaf nodes; these external pointers are stored in the nubs and leaves of the row trie. For each cell in a row, the row trie contains a list of offsets into the column sections (along with pointers to timestamps and other per-cell fields). Each offset points to the last column node/token that makes up the column name. To assemble the column name, the trie is traversed in reverse (right to left): each token points to the start of its "parent" node, which is the node to its left.
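The reverse traversal described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `ReverseColumnTrieSketch` and `Node` classes, the use of array indexes in place of byte offsets, and the sample qualifiers are assumptions made for illustration and do not reflect the actual PrefixTree byte layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory model of a reverse column trie; not the HBase on-disk format. */
final class ReverseColumnTrieSketch {

  /** One column node: its token bytes and the index of its parent (-1 for a root). */
  static final class Node {
    final byte[] token;
    final int parentIndex;

    Node(String token, int parentIndex) {
      this.token = token.getBytes(StandardCharsets.UTF_8);
      this.parentIndex = parentIndex;
    }
  }

  /** Walks from the last node of a column name back to the root, then joins tokens left to right. */
  static String assemble(Node[] nodes, int lastNodeIndex) {
    Deque<byte[]> tokens = new ArrayDeque<>();
    int length = 0;
    for (int i = lastNodeIndex; i >= 0; i = nodes[i].parentIndex) {
      tokens.push(nodes[i].token); // push places tokens closer to the root at the head
      length += nodes[i].token.length;
    }
    byte[] name = new byte[length];
    int pos = 0;
    for (byte[] token : tokens) { // iterates from the root token to the last token
      System.arraycopy(token, 0, name, pos, token.length);
      pos += token.length;
    }
    return new String(name, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Qualifiers "count", "country" and "county" share the prefix "count".
    Node[] nodes = {
      new Node("count", -1), // index 0: shared prefix that is also a full column name
      new Node("ry", 0),     // index 1: last node of "country"
      new Node("y", 0),      // index 2: last node of "county"
    };
    System.out.println(assemble(nodes, 1));
    System.out.println(assemble(nodes, 0));
  }
}
```

In this model the row trie would hold only the index of the last node (1 for "country"), which is why the column name cannot be recovered from the column trie alone.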

This choice was made to reduce the size of the column trie by storing the minimum amount of offset data. As a result, to find a specific qualifier within a row, you must do a binary search of the column nodes, reassembling each one as you search. Future versions of the PrefixTree might encode the columns in both a forward and a reverse trie, which would convert binary searches into more efficient trie searches; this would be beneficial for wide rows.
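The binary search described above might look roughly like the following sketch. It is not the HBase implementation: the `ColumnBinarySearchSketch` class, the parallel `tokens`/`parents` arrays, and the `cellNodeIndexes` array (standing in for the per-cell offsets held by the row trie) are all hypothetical, and it compares ASCII strings where the real code would compare bytes.

```java
/** Hypothetical sketch of finding a qualifier in a row by binary search over reverse-trie nodes. */
final class ColumnBinarySearchSketch {

  /** Rebuilds a column name by following parent links from its last node back to the root. */
  static String assemble(String[] tokens, int[] parents, int lastNodeIndex) {
    StringBuilder name = new StringBuilder();
    for (int i = lastNodeIndex; i >= 0; i = parents[i]) {
      name.insert(0, tokens[i]); // prepend, because the walk runs right to left
    }
    return name.toString();
  }

  /**
   * Returns the position of the cell whose qualifier equals target, or -1 if absent.
   * cellNodeIndexes must list the row's cells in sorted qualifier order.
   */
  static int findCell(String[] tokens, int[] parents, int[] cellNodeIndexes, String target) {
    int lo = 0;
    int hi = cellNodeIndexes.length - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      // Each probe pays the cost of reassembling the candidate qualifier.
      String candidate = assemble(tokens, parents, cellNodeIndexes[mid]);
      int cmp = candidate.compareTo(target);
      if (cmp == 0) {
        return mid;
      } else if (cmp < 0) {
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    String[] tokens = {"count", "ry", "y"}; // nodes for "count", "country", "county"
    int[] parents = {-1, 0, 0};
    int[] cellNodeIndexes = {0, 1, 2};      // the row's cells, sorted by qualifier
    System.out.println(findCell(tokens, parents, cellNodeIndexes, "county"));
    System.out.println(findCell(tokens, parents, cellNodeIndexes, "counts"));
  }
}
```

Each probe reassembles a full qualifier, so the cost per lookup grows with both the number of cells in the row and the token depth; a forward trie would avoid that repeated reassembly.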


Field Summary
static int EXPECTED_NUBS_PLUS_LEAVES
           
 
Constructor Summary
ColumnSectionWriter()
          construct
ColumnSectionWriter(PrefixTreeBlockMeta blockMeta, Tokenizer builder, ColumnNodeType nodeType)
           
 
Method Summary
 ColumnSectionWriter compile()
          methods
protected  void compilerInternals()
           
 ArrayList<ColumnNodeWriter> getColumnNodeWriters()
          get/set
 ArrayList<TokenizerNode> getLeaves()
           
 ArrayList<TokenizerNode> getNonLeaves()
           
 int getNumBytes()
           
 int getOutputArrayOffset(int sortedIndex)
           
 void reconstruct(PrefixTreeBlockMeta blockMeta, Tokenizer builder, ColumnNodeType nodeType)
           
 void reset()
           
 void writeBytes(OutputStream os)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EXPECTED_NUBS_PLUS_LEAVES

public static final int EXPECTED_NUBS_PLUS_LEAVES
See Also:
Constant Field Values
Constructor Detail

ColumnSectionWriter

public ColumnSectionWriter()
construct


ColumnSectionWriter

public ColumnSectionWriter(PrefixTreeBlockMeta blockMeta,
                           Tokenizer builder,
                           ColumnNodeType nodeType)
Method Detail

reconstruct

public void reconstruct(PrefixTreeBlockMeta blockMeta,
                        Tokenizer builder,
                        ColumnNodeType nodeType)

reset

public void reset()

compile

public ColumnSectionWriter compile()
methods


compilerInternals

protected void compilerInternals()

writeBytes

public void writeBytes(OutputStream os)
                throws IOException
Throws:
IOException

getColumnNodeWriters

public ArrayList<ColumnNodeWriter> getColumnNodeWriters()
get/set


getNumBytes

public int getNumBytes()

getOutputArrayOffset

public int getOutputArrayOffset(int sortedIndex)

getNonLeaves

public ArrayList<TokenizerNode> getNonLeaves()

getLeaves

public ArrayList<TokenizerNode> getLeaves()


Copyright © 2007–2015 The Apache Software Foundation. All rights reserved.