org.apache.hadoop.hbase.mapred
Class BuildTableIndex
java.lang.Object
org.apache.hadoop.hbase.mapred.BuildTableIndex
public class BuildTableIndex
- extends Object
Example class for indexing table columns. Runs a MapReduce job to index
the specified table columns.
- Each row is modeled as a Lucene document: the row key is indexed in
its untokenized form, and each column name-value pair becomes a Lucene
field name-value pair.
- A file passed on the command line is used to populate an
IndexConfiguration, which sets various Lucene parameters and specifies
whether to optimize the index and which columns to index and/or store,
in tokenized or untokenized form. For an example, see the
createIndexConfContent method in TestTableIndex.
- The number of reduce tasks determines the number of indexes (partitions).
The indexes are stored in the output path of the job configuration.
- The index is built in the reduce phase. Users can use the map phase to
join rows from different tables or to pre-parse or analyze column content.
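The row-to-document mapping described above can be sketched in plain Java, without Lucene on the classpath. Everything named here is a hypothetical stand-in for illustration: the `IndexField` class, the `rowToDocument` method, the `ROWKEY` field name, and the set of tokenized columns (which plays the role of the per-column choice made through the IndexConfiguration). It is not the code this class actually runs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.Collections;

public class RowDocumentSketch {

    /** Hypothetical stand-in for a Lucene field: name, value, tokenized flag. */
    public static final class IndexField {
        public final String name;
        public final String value;
        public final boolean tokenized;

        public IndexField(String name, String value, boolean tokenized) {
            this.name = name;
            this.value = value;
            this.tokenized = tokenized;
        }

        @Override
        public String toString() {
            return name + "=" + value + (tokenized ? " (tokenized)" : " (untokenized)");
        }
    }

    /**
     * Models one table row as a "document": the row key becomes an
     * untokenized field, and every column name-value pair becomes a field
     * whose tokenization depends on the (assumed) configuration.
     */
    public static List<IndexField> rowToDocument(String rowKey,
                                                 Map<String, String> columns,
                                                 Set<String> tokenizedColumns) {
        List<IndexField> document = new ArrayList<>();
        // "ROWKEY" is an assumed field name, chosen for this sketch only.
        document.add(new IndexField("ROWKEY", rowKey, false));
        for (Map.Entry<String, String> column : columns.entrySet()) {
            boolean tokenized = tokenizedColumns.contains(column.getKey());
            document.add(new IndexField(column.getKey(), column.getValue(), tokenized));
        }
        return document;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("contents:", "some free text to search");
        columns.put("info:id", "42");

        List<IndexField> document =
            rowToDocument("row-0001", columns, Collections.singleton("contents:"));
        for (IndexField field : document) {
            System.out.println(field);
        }
    }
}
```

Keeping the row key untokenized means a search hit can be mapped back to exactly one table row by an exact-match lookup on that field.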
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
BuildTableIndex
public BuildTableIndex()
run
public void run(String[] args)
throws IOException
- Throws:
IOException
createJob
public JobConf createJob(Configuration conf,
int numMapTasks,
int numReduceTasks,
String indexDir,
String tableName,
String columnNames)
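As the class description notes, the number of reduce tasks determines the number of index partitions. The sketch below illustrates how rows could be spread across that many partitions, assuming hash partitioning with the same formula as Hadoop's default `HashPartitioner`. Whether this job uses the default partitioner is an assumption, and `partitionFor` is a hypothetical helper that hashes a plain `String` row key rather than a Hadoop key type.

```java
public class IndexPartitionSketch {

    /**
     * Assigns a row key to one of numReduceTasks partitions. Masking with
     * Integer.MAX_VALUE clears the sign bit so the result is never negative.
     */
    public static int partitionFor(String rowKey, int numReduceTasks) {
        return (rowKey.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // one index (partition) per reduce task
        int[] rowsPerIndex = new int[numReduceTasks];

        // Hypothetical row keys, generated only for this illustration.
        for (int i = 0; i < 1000; i++) {
            rowsPerIndex[partitionFor("row-" + i, numReduceTasks)]++;
        }
        for (int p = 0; p < numReduceTasks; p++) {
            System.out.println("index partition " + p + ": " + rowsPerIndex[p] + " rows");
        }
    }
}
```

With a single reduce task every row lands in one index; with more reduce tasks, a search must consult every partition, because any one row key lives in exactly one of them.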
main
public static void main(String[] args)
throws IOException
- Throws:
IOException
Copyright © 2006 The Apache Software Foundation