org.apache.mahout.text
Class SequenceFilesFromCsvFilter
java.lang.Object
  org.apache.hadoop.conf.Configured
    org.apache.mahout.common.AbstractJob
      org.apache.mahout.text.SequenceFilesFromDirectory
        org.apache.mahout.text.SequenceFilesFromDirectoryFilter
          org.apache.mahout.text.SequenceFilesFromCsvFilter
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.fs.PathFilter, org.apache.hadoop.util.Tool
public final class SequenceFilesFromCsvFilter
- extends SequenceFilesFromDirectoryFilter
An example parser that converts CSV files to sequence files.
Method Summary
void addOptions()
    Override this method to add options to the command line of the SequenceFilesFromDirectory job.
static void main(String[] args)
Map<String,String> parseOptions()
    Override this method to parse your additional options from the command line.
protected void process(org.apache.hadoop.fs.FileStatus fst, org.apache.hadoop.fs.Path current)
Methods inherited from class org.apache.mahout.common.AbstractJob
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, getInputPath, getOption, getOutputPath, hasOption, keyFor, maybePut, parseArguments, parseDirectories, prepareJob, shouldRunNextPhase
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
KEY_COLUMN_OPTION
public static final String[] KEY_COLUMN_OPTION
VALUE_COLUMN_OPTION
public static final String[] VALUE_COLUMN_OPTION
SequenceFilesFromCsvFilter
public SequenceFilesFromCsvFilter(org.apache.hadoop.conf.Configuration conf,
String keyPrefix,
Map<String,String> options,
ChunkedWriter writer)
throws IOException
- Throws:
IOException
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
addOptions
public void addOptions()
- Description copied from class:
SequenceFilesFromDirectory
- Override this method to add options to the command line of the SequenceFilesFromDirectory job.
Remember to call super.addOptions(); otherwise the standard options (input/output directories, etc.) will not be available.
- Overrides:
addOptions
in class SequenceFilesFromDirectory
parseOptions
public Map<String,String> parseOptions()
throws IOException
- Description copied from class:
SequenceFilesFromDirectory
- Override this method to parse your additional options from the command line. Remember to call
super.parseOptions(); otherwise the standard options (input/output directories, etc.) will not be available.
- Overrides:
parseOptions
in class SequenceFilesFromDirectory
- Throws:
IOException
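The override contract described for addOptions and parseOptions can be illustrated with a minimal, self-contained sketch. It does not use Mahout: BaseJob and CsvJob are hypothetical stand-ins for the parent job and a CSV subclass, and the option names "input", "output", "keyColumn" and "valueColumn" are assumptions chosen for illustration. The point it shows is that each override first delegates to the superclass method and then adds its own entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Self-contained sketch of the override pattern; none of these types are Mahout classes.
class OptionOverrideSketch {

    // Hypothetical stand-in for the parent job that owns the standard options.
    static class BaseJob {
        private final Map<String, String> registered = new LinkedHashMap<>();
        private final Map<String, String> commandLine;

        BaseJob(Map<String, String> commandLine) {
            this.commandLine = commandLine;
        }

        void register(String name, String description) {
            registered.put(name, description);
        }

        String argument(String name) {
            return commandLine.get(name);
        }

        Map<String, String> registeredOptions() {
            return registered;
        }

        // Registers the standard options (names are assumptions).
        void addOptions() {
            register("input", "input directory");
            register("output", "output directory");
        }

        // Parses the standard options into a map.
        Map<String, String> parseOptions() {
            Map<String, String> parsed = new LinkedHashMap<>();
            parsed.put("input", argument("input"));
            parsed.put("output", argument("output"));
            return parsed;
        }
    }

    // Hypothetical subclass that adds two CSV-specific options.
    static class CsvJob extends BaseJob {
        CsvJob(Map<String, String> commandLine) {
            super(commandLine);
        }

        @Override
        void addOptions() {
            super.addOptions(); // keeps the standard options available
            register("keyColumn", "index of the column used as key");
            register("valueColumn", "index of the column used as value");
        }

        @Override
        Map<String, String> parseOptions() {
            Map<String, String> parsed = super.parseOptions(); // standard options first
            parsed.put("keyColumn", argument("keyColumn"));
            parsed.put("valueColumn", argument("valueColumn"));
            return parsed;
        }
    }

    public static void main(String[] args) {
        Map<String, String> cli = new LinkedHashMap<>();
        cli.put("input", "in");
        cli.put("output", "out");
        cli.put("keyColumn", "0");
        cli.put("valueColumn", "1");

        CsvJob job = new CsvJob(cli);
        job.addOptions();
        System.out.println(job.registeredOptions().keySet());
        System.out.println(job.parseOptions());
    }
}
```

If either super call were removed from CsvJob, the "input" and "output" entries would be missing from the result, which is the failure the description warns about.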
process
protected void process(org.apache.hadoop.fs.FileStatus fst,
org.apache.hadoop.fs.Path current)
throws IOException
- Specified by:
process
in class SequenceFilesFromDirectoryFilter
- Throws:
IOException
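This page does not document what process does with each file. As a rough illustration only, the sketch below assumes that every line of a CSV file becomes one key/value pair: the key is the keyPrefix followed by the content of a key column, and the value is the content of a value column. The class CsvKeyValueSketch, the method toPairs, the comma delimiter and the rule of skipping short rows are all assumptions, not Mahout behavior. The sketch uses plain Java collections instead of a Hadoop sequence file writer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Mahout.
class CsvKeyValueSketch {

    // Turns CSV lines into (key, value) pairs using two column indexes.
    // Assumes a plain comma delimiter with no quoting or escaping.
    static List<Map.Entry<String, String>> toPairs(List<String> lines,
                                                   String keyPrefix,
                                                   int keyColumn,
                                                   int valueColumn) {
        if (keyColumn < 0 || valueColumn < 0) {
            throw new IllegalArgumentException("column indexes must be non-negative");
        }
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String line : lines) {
            String[] columns = line.split(",", -1); // -1 keeps trailing empty columns
            if (columns.length <= Math.max(keyColumn, valueColumn)) {
                continue; // assumption: rows with too few columns are skipped
            }
            pairs.add(new AbstractMap.SimpleEntry<>(
                    keyPrefix + columns[keyColumn], columns[valueColumn]));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("doc1,hello world");
        lines.add("doc2,second document");
        for (Map.Entry<String, String> pair : toPairs(lines, "/csv/", 0, 1)) {
            System.out.println(pair.getKey() + " -> " + pair.getValue());
        }
    }
}
```

A real implementation would write each pair through the ChunkedWriter passed to the constructor and would need a proper CSV parser if fields can contain quoted delimiters.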
Copyright © 2008-2011 The Apache Software Foundation. All Rights Reserved.