Package | Description |
---|---|
org.apache.nutch.segment | A segment stores all data from one generate/fetch/update cycle: fetch list, protocol status, raw content, parsed content, and extracted outgoing links. |
Modifier and Type | Method and Description |
---|---|
org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,MetaWrapper> | SegmentMerger.ObjectInputFormat.getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter) |
org.apache.hadoop.mapred.RecordWriter<org.apache.hadoop.io.Text,MetaWrapper> | SegmentMerger.SegmentOutputFormat.getRecordWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.mapred.JobConf job, String name, org.apache.hadoop.util.Progressable progress) |
Modifier and Type | Method and Description |
---|---|
void | SegmentMerger.map(org.apache.hadoop.io.Text key, MetaWrapper value, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,MetaWrapper> output, org.apache.hadoop.mapred.Reporter reporter) |
void | SegmentMerger.reduce(org.apache.hadoop.io.Text key, Iterator<MetaWrapper> values, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,MetaWrapper> output, org.apache.hadoop.mapred.Reporter reporter) NOTE: in selecting the latest version we rely exclusively on the segment name (not all segment data contain time information). |
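The NOTE on SegmentMerger.reduce says the latest version of a record is chosen by segment name alone. The following is a minimal sketch of that idea, not the SegmentMerger source. It assumes segment names are timestamp-style strings (for example yyyyMMddHHmmss), so that lexicographic order matches chronological order. The class LatestSegmentSketch, the type Tagged, and the method latest are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; these names do not exist in Nutch.
class LatestSegmentSketch {

    /** A value tagged with the name of the segment it came from (hypothetical type). */
    static final class Tagged {
        final String segmentName;
        final String payload;

        Tagged(String segmentName, String payload) {
            this.segmentName = segmentName;
            this.payload = payload;
        }
    }

    /**
     * Returns the value whose segment name sorts last, or null if there are no values.
     * Assumption: segment names are timestamp-like, so String order is time order.
     */
    static Tagged latest(Iterable<Tagged> values) {
        Tagged best = null;
        for (Tagged v : values) {
            // Keep the candidate with the lexicographically greatest segment name.
            if (best == null || best.segmentName.compareTo(v.segmentName) < 0) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Tagged> versions = Arrays.asList(
                new Tagged("20140101120000", "old"),
                new Tagged("20140305080000", "newest"),
                new Tagged("20131231235959", "oldest"));
        System.out.println(latest(versions).payload); // prints "newest"
    }
}
```

Comparing names rather than stored timestamps keeps the selection usable for segment parts that carry no time information, at the cost of depending on the naming scheme: if segments were named arbitrarily, this ordering would no longer reflect recency.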
Copyright © 2014 The Apache Software Foundation