org.apache.mahout.text
Class SequenceFilesFromMailArchives
java.lang.Object
org.apache.mahout.text.SequenceFilesFromMailArchives
public final class SequenceFilesFromMailArchives
extends Object
Converts a directory of gzipped mail archives into SequenceFiles of specified chunkSize.
This class is similar to SequenceFilesFromDirectory, except it uses block-compressed SequenceFiles and parses out the subject and body text of each mail message into a separate key/value pair.
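As a rough illustration of the per-message step described above, the following sketch turns one raw mail message into a single key/value pair using only the JDK. It is not Mahout's implementation: the class `MailMessageSketch` and the method `toKeyValue` are hypothetical names, and the layout (Message-ID as key, subject followed by body as value) is an assumption, since this page does not state what the key and value contain. Folded (multi-line) headers and MIME parts are ignored for brevity.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Mahout.
final class MailMessageSketch {

    private MailMessageSketch() {
    }

    /**
     * Splits one raw message into a (key, value) pair.
     * Assumed layout: key = Message-ID header, value = subject + "\n" + body.
     */
    static Map.Entry<String, String> toKeyValue(String rawMessage) {
        // Normalize line endings so the header/body split is uniform.
        String normalized = rawMessage.replace("\r\n", "\n");

        // Headers end at the first blank line; the rest is the body.
        int split = normalized.indexOf("\n\n");
        String headers = split >= 0 ? normalized.substring(0, split) : normalized;
        String body = split >= 0 ? normalized.substring(split + 2).trim() : "";

        String key = headerValue(headers, "Message-ID");
        String subject = headerValue(headers, "Subject");
        return new AbstractMap.SimpleImmutableEntry<>(key, subject + "\n" + body);
    }

    /** Returns the value of the first header with the given name, or "" if absent. */
    private static String headerValue(String headers, String name) {
        String prefix = name + ":";
        for (String line : headers.split("\n")) {
            // Header names are case-insensitive.
            if (line.regionMatches(true, 0, prefix, 0, prefix.length())) {
                return line.substring(prefix.length()).trim();
            }
        }
        return "";
    }
}
```

Calling `MailMessageSketch.toKeyValue(raw)` on each message in an archive would yield the pairs that a writer could then append to a block-compressed SequenceFile.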
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SequenceFilesFromMailArchives
public SequenceFilesFromMailArchives()
createSequenceFiles
public void createSequenceFiles(MailOptions options)
throws IOException
Throws:
IOException
main
public static void main(String[] args)
throws Exception
Throws:
Exception
Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.