org.apache.accumulo.examples.wikisearch.reader
Class AggregatingRecordReader

java.lang.Object
  extended by org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
      extended by org.apache.accumulo.examples.wikisearch.reader.LongLineRecordReader
          extended by org.apache.accumulo.examples.wikisearch.reader.AggregatingRecordReader
All Implemented Interfaces:
Closeable

public class AggregatingRecordReader
extends LongLineRecordReader

Aggregates Text values into a single record, using a start token and an end token to delimit each record. A typical use case is XML data, where a record runs from an opening tag to its closing tag. This class does not work with data in which the start and end tokens are nested.
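
The token-delimited aggregation described above can be illustrated with a small stand-alone sketch. It is not the implementation of AggregatingRecordReader and does not use Hadoop. The class name TokenAggregationSketch, the method aggregate, and the meaning given to the returnPartialMatches flag (keep a trailing record that has a start token but no end token) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the AggregatingRecordReader source. */
class TokenAggregationSketch {

    /**
     * Returns every span of the input that begins with startToken and ends
     * with the first endToken found after it. Text outside such spans is skipped.
     * If returnPartialMatches is true (assumed semantics), a trailing span with
     * a start token but no end token is returned as well.
     */
    static List<String> aggregate(String input, String startToken, String endToken,
                                  boolean returnPartialMatches) {
        if (startToken.isEmpty() || endToken.isEmpty()) {
            throw new IllegalArgumentException("tokens must be non-empty");
        }
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int start = input.indexOf(startToken, pos);
            if (start < 0) {
                break; // no further records
            }
            int end = input.indexOf(endToken, start + startToken.length());
            if (end < 0) {
                // Start token without a matching end token.
                if (returnPartialMatches) {
                    records.add(input.substring(start));
                }
                break;
            }
            int stop = end + endToken.length();
            records.add(input.substring(start, stop));
            pos = stop; // continue scanning after the end token
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<doc>a</doc>junk<doc>b</doc><doc>c";
        // prints [<doc>a</doc>, <doc>b</doc>]
        System.out.println(aggregate(xml, "<doc>", "</doc>", false));
        // prints [<doc>a</doc>, <doc>b</doc>, <doc>c]
        System.out.println(aggregate(xml, "<doc>", "</doc>", true));
        // Nested tokens: the first end token closes the outer record too early.
        // prints [<a><a>x</a>]
        System.out.println(aggregate("<a><a>x</a></a>", "<a>", "</a>", false));
    }
}
```

The last call shows one way nesting can go wrong in a first-match scheme like this one: the record is cut at the inner end token, so the outer element is never returned whole.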


Field Summary
static String END_TOKEN
           
static String RETURN_PARTIAL_MATCHES
           
static String START_TOKEN
           
 
Constructor Summary
AggregatingRecordReader()
           
 
Method Summary
 org.apache.hadoop.io.LongWritable getCurrentKey()
           
 org.apache.hadoop.io.Text getCurrentValue()
           
 void initialize(org.apache.hadoop.mapreduce.InputSplit genericSplit, org.apache.hadoop.mapreduce.TaskAttemptContext context)
           
 boolean nextKeyValue()
           
 
Methods inherited from class org.apache.accumulo.examples.wikisearch.reader.LongLineRecordReader
close, getProgress
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

START_TOKEN

public static final String START_TOKEN
See Also:
Constant Field Values

END_TOKEN

public static final String END_TOKEN
See Also:
Constant Field Values

RETURN_PARTIAL_MATCHES

public static final String RETURN_PARTIAL_MATCHES
See Also:
Constant Field Values

Constructor Detail

AggregatingRecordReader

public AggregatingRecordReader()

Method Detail

getCurrentKey

public org.apache.hadoop.io.LongWritable getCurrentKey()
Overrides:
getCurrentKey in class LongLineRecordReader

getCurrentValue

public org.apache.hadoop.io.Text getCurrentValue()
Overrides:
getCurrentValue in class LongLineRecordReader

initialize

public void initialize(org.apache.hadoop.mapreduce.InputSplit genericSplit,
                       org.apache.hadoop.mapreduce.TaskAttemptContext context)
                throws IOException
Overrides:
initialize in class LongLineRecordReader
Throws:
IOException

nextKeyValue

public boolean nextKeyValue()
                     throws IOException
Overrides:
nextKeyValue in class LongLineRecordReader
Throws:
IOException


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.