org.apache.mahout.classifier.bayes
Class WikipediaDatasetCreatorMapper

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorMapper
All Implemented Interfaces:
java.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

public class WikipediaDatasetCreatorMapper
extends org.apache.hadoop.mapred.MapReduceBase
implements org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Maps over Wikipedia xml format and output all document having the category listed in the input category file


Constructor Summary
WikipediaDatasetCreatorMapper()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf job)
           
 void map(org.apache.hadoop.io.LongWritable key, org.apache.hadoop.io.Text value, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> output, org.apache.hadoop.mapred.Reporter reporter)
           
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.io.Closeable
close
 

Constructor Detail

WikipediaDatasetCreatorMapper

public WikipediaDatasetCreatorMapper()
Method Detail

map

public void map(org.apache.hadoop.io.LongWritable key,
                org.apache.hadoop.io.Text value,
                org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> output,
                org.apache.hadoop.mapred.Reporter reporter)
         throws java.io.IOException
Specified by:
map in interface org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
java.io.IOException

configure

public void configure(org.apache.hadoop.mapred.JobConf job)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
Overrides:
configure in class org.apache.hadoop.mapred.MapReduceBase


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.