org.apache.mahout.text
Class SequenceFilesFromMailArchives

java.lang.Object
  extended by org.apache.mahout.text.SequenceFilesFromMailArchives

public final class SequenceFilesFromMailArchives
extends Object

Converts a directory of gzipped mail archives into SequenceFiles of the specified chunk size. This class is similar to SequenceFilesFromDirectory, except that it writes block-compressed SequenceFiles and parses the subject and body text of each mail message into its own key/value pair.
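
The per-message parsing described above can be illustrated with a minimal, standard-library-only sketch. It is not Mahout's implementation: the class name MailArchiveSketch, the key layout (prefix + "/" + Message-ID), and the value layout (subject, a newline, then the body) are assumptions chosen for illustration. The results are collected in an in-memory map rather than written to a Hadoop SequenceFile.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Mahout.
final class MailArchiveSketch {

    // Reads a gzipped mbox stream and returns one (key, value) pair per message.
    // Assumed layout: key = prefix + "/" + Message-ID, value = subject + "\n" + body.
    static Map<String, String> parse(InputStream gzippedMbox, String prefix) throws IOException {
        Map<String, String> records = new LinkedHashMap<>();
        String messageId = null;
        String subject = "";
        StringBuilder body = new StringBuilder();
        boolean inMessage = false;
        boolean inBody = false;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(gzippedMbox), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("From ")) {
                    // mbox separator line: emit the previous message, then reset state.
                    // Simplification: assumes body lines starting with "From " are escaped.
                    if (inMessage && messageId != null) {
                        records.put(prefix + "/" + messageId, subject + "\n" + body);
                    }
                    messageId = null;
                    subject = "";
                    body.setLength(0);
                    inMessage = true;
                    inBody = false;
                } else if (!inMessage) {
                    continue; // ignore anything before the first separator
                } else if (inBody) {
                    body.append(line).append('\n');
                } else if (line.isEmpty()) {
                    inBody = true; // a blank line ends the headers
                } else {
                    String id = headerValue(line, "Message-ID:");
                    String subj = headerValue(line, "Subject:");
                    if (id != null) {
                        messageId = id;
                    } else if (subj != null) {
                        subject = subj;
                    }
                }
            }
        }
        // Emit the final message; messages without a Message-ID are skipped in this sketch.
        if (inMessage && messageId != null) {
            records.put(prefix + "/" + messageId, subject + "\n" + body);
        }
        return records;
    }

    // Returns the trimmed header value if the line starts with the given
    // header name (case-insensitive), otherwise null.
    static String headerValue(String line, String name) {
        if (line.regionMatches(true, 0, name, 0, name.length())) {
            return line.substring(name.length()).trim();
        }
        return null;
    }

    // Test helper: gzips a string so the example needs no files on disk.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String mbox = "From a@example.org Mon Jan  2 00:00:00 2012\n"
                + "Message-ID: <1@example.org>\n"
                + "Subject: Hello\n"
                + "\n"
                + "First body\n";
        Map<String, String> records = parse(new ByteArrayInputStream(gzip(mbox)), "archive");
        for (Map.Entry<String, String> e : records.entrySet()) {
            System.out.println(e.getKey() + " => " + e.getValue().replace("\n", "\\n"));
        }
    }
}
```

In the real class the pairs are written to block-compressed SequenceFiles through the Hadoop API; the map here only stands in for that output so the sketch stays self-contained.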


Nested Class Summary
 class SequenceFilesFromMailArchives.PrefixAdditionFilter
           
 
Constructor Summary
SequenceFilesFromMailArchives()
           
 
Method Summary
 void createSequenceFiles(MailOptions options)
           
static void main(String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SequenceFilesFromMailArchives

public SequenceFilesFromMailArchives()

Method Detail

createSequenceFiles

public void createSequenceFiles(MailOptions options)
                         throws IOException
Throws:
IOException

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception


Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.