Class Summary |
Bump125 |
Helps with making nice intervals at arbitrary scale. |
MatrixDumper |
Exports a Matrix in various text formats:
* CSV file
Input format: Hadoop SequenceFile with a Text key and a MatrixWritable value, containing 1 pair.
TODO:
Needs a class for the key; the key type should not be hard-coded to Text. |
SequenceFileDumper |
|
SplitInput |
A utility for splitting files into training and test sets in order to perform
cross-validation. It accepts the input format used by the Bayes classifiers, any other
input that has one item per line, or SequenceFiles (key/value). |
SplitInputJob |
|
SplitInputJob.SplitInputComparator |
Randomly permutes key/value pairs |
SplitInputJob.SplitInputMapper |
Mapper which downsamples the input by downsamplingFactor |
SplitInputJob.SplitInputReducer |
Reducer which uses MultipleOutputs to randomly allocate key value pairs
between test and training outputs |
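
The random allocation between test and training sets described for SplitInput and SplitInputJob.SplitInputReducer can be illustrated with a minimal, self-contained sketch. It is not the Mahout implementation and uses neither Hadoop nor MultipleOutputs. The class name RandomSplitSketch, the method split, and the parameters testFraction and seed are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of the Mahout API.
final class RandomSplitSketch {

    /** Holds the two partitions produced by a split. */
    static final class Split<T> {
        final List<T> training = new ArrayList<>();
        final List<T> test = new ArrayList<>();
    }

    /**
     * Assigns each item independently to the test set with probability
     * testFraction, otherwise to the training set. The seed makes the
     * allocation reproducible.
     */
    static <T> Split<T> split(List<T> items, double testFraction, long seed) {
        if (testFraction < 0.0 || testFraction > 1.0) {
            throw new IllegalArgumentException("testFraction must be in [0, 1]");
        }
        Random rng = new Random(seed);
        Split<T> out = new Split<>();
        for (T item : items) {
            // nextDouble() returns a value in [0, 1), so a fraction of 0.0
            // sends everything to training and 1.0 sends everything to test.
            if (rng.nextDouble() < testFraction) {
                out.test.add(item);
            } else {
                out.training.add(item);
            }
        }
        return out;
    }
}
```

Because each item is assigned independently, the size of the test set is only approximately testFraction of the input. A caller that needs an exact count would instead shuffle the items and cut the list at a fixed index.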