org.apache.hadoop.hbase.mapreduce
Class BuildTableIndex
java.lang.Object
org.apache.hadoop.hbase.mapreduce.BuildTableIndex
public class BuildTableIndex
extends Object
Example table column indexing class. Runs a MapReduce job to index the
specified table columns.

Each row is modeled as a Lucene document: the row key is indexed in its
untokenized form, and each column name-value pair becomes a Lucene field
name-value pair (see the sketch after this overview).

A file passed on the command line is used to populate an IndexConfiguration,
which sets various Lucene parameters and specifies whether to optimize the
index, which columns to index and/or store, and whether to do so in tokenized
or untokenized form. For an example, see the createIndexConfContent method in
TestTableIndex.

The number of reduce tasks determines the number of index partitions. The
index(es) are stored in the output path of the job configuration.

The index is built in the reduce phase. The map phase can be used to join rows
from different tables or to pre-parse/analyze column content, etc.
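For illustration, the following is a minimal sketch of the row-to-document
mapping described above, not the class's actual reducer code. It assumes the
Lucene 2.x/3.x Field API; the "rowkey" field name and the Store/Index flags
are placeholder assumptions, since the real choices come from the
IndexConfiguration.

import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class RowToDocumentSketch {
  // Builds a Lucene document for one table row: the row key is indexed
  // untokenized, and each column name/value pair becomes a field.
  public static Document toDocument(String rowKey, Map<String, String> columns) {
    Document doc = new Document();
    // "rowkey" is a hypothetical field name; stored and indexed without tokenization.
    doc.add(new Field("rowkey", rowKey, Field.Store.YES, Field.Index.NOT_ANALYZED));
    for (Map.Entry<String, String> e : columns.entrySet()) {
      // Whether a column is tokenized and/or stored is really driven by the
      // IndexConfiguration; ANALYZED/YES are placeholder choices here.
      doc.add(new Field(e.getKey(), e.getValue(), Field.Store.YES, Field.Index.ANALYZED));
    }
    return doc;
  }
}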
Method Summary

static org.apache.hadoop.mapreduce.Job  createSubmittableJob(org.apache.hadoop.conf.Configuration conf, String[] args)
          Creates a new job.
static void                             main(String[] args)
          The main entry point.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail

BuildTableIndex
public BuildTableIndex()
Method Detail

createSubmittableJob
public static org.apache.hadoop.mapreduce.Job createSubmittableJob(org.apache.hadoop.conf.Configuration conf,
String[] args)
throws IOException
Creates a new job.
Parameters:
conf - The configuration to use.
args - The command line arguments.
Throws:
IOException - When reading the configuration fails.
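A minimal driver sketch, assuming HBaseConfiguration.create() is available
(older releases construct new HBaseConfiguration() instead); it simply
forwards the command-line arguments to createSubmittableJob and submits the
resulting job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.BuildTableIndex;
import org.apache.hadoop.mapreduce.Job;

public class IndexJobDriver {
  public static void main(String[] args) throws Exception {
    // Assumed: HBaseConfiguration.create(); older releases use
    // new HBaseConfiguration() instead.
    Configuration conf = HBaseConfiguration.create();
    // Build the index job from the command-line arguments.
    Job job = BuildTableIndex.createSubmittableJob(conf, args);
    // Submit and wait; exit non-zero if the index build fails.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}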
main
public static void main(String[] args)
throws Exception
The main entry point.
Parameters:
args - The command line arguments.
Throws:
Exception - When running the job fails.
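A sketch of invoking the entry point in-process; every flag name and value
below is hypothetical, so check the usage message printed by BuildTableIndex
for the actual argument layout:

public class LaunchBuildTableIndex {
  public static void main(String[] unused) throws Exception {
    // All flag names and values are hypothetical placeholders.
    String[] args = new String[] {
        "-indexConf", "index-conf.xml",   // file used to populate the IndexConfiguration
        "-indexDir", "/tmp/table-index",  // output path for the index partition(s)
        "-table", "webtable",             // table whose columns are indexed
        "-columns", "contents:"           // column(s) to index
    };
    org.apache.hadoop.hbase.mapreduce.BuildTableIndex.main(args);
  }
}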