org.apache.mahout.ga.watchmaker.cd.hadoop
Class DatasetSplit

java.lang.Object
  extended by org.apache.mahout.ga.watchmaker.cd.hadoop.DatasetSplit

public final class DatasetSplit
extends Object

Separates the input data into a training set and a testing set.


Nested Class Summary
static class DatasetSplit.DatasetTextInputFormat
          TextInputFormat that uses a DatasetSplit.RndLineRecordReader as a RecordReader
static class DatasetSplit.RndLineRecordReader
          A RecordReader that skips some lines from the input.
 
Constructor Summary
DatasetSplit(org.apache.hadoop.conf.Configuration conf)
           
DatasetSplit(double threshold)
           
DatasetSplit(long seed, double threshold)
           
 
Method Summary
 long getSeed()
           
 double getThreshold()
           
 boolean isTraining()
           
 void setTraining(boolean training)
           
 void storeJobParameters(org.apache.hadoop.conf.Configuration conf)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DatasetSplit

public DatasetSplit(long seed,
                    double threshold)
Parameters:
seed - seed for the random number generator
threshold - fraction of the total dataset that will be used for training
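
To illustrate what a seed and a threshold could mean for a split like this, here is a minimal, self-contained sketch. It is not the Mahout implementation: the class ThresholdSplitSketch and its select method are hypothetical names, and the rule used (draw one uniform random number per line and treat the line as training data when the draw falls below the threshold) is an assumption about how such a split might work, not something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of a seeded, threshold-based split; not Mahout code. */
final class ThresholdSplitSketch {

    /**
     * Returns the lines that fall into the requested subset.
     *
     * @param lines     input lines, one record per line
     * @param seed      seed for the random number generator, so the split is repeatable
     * @param threshold fraction of the dataset expected to land in the training set
     * @param training  true to return the training lines, false to return the testing lines
     */
    static List<String> select(List<String> lines, long seed, double threshold, boolean training) {
        Random rng = new Random(seed);
        List<String> selected = new ArrayList<>();
        for (String line : lines) {
            // Assumed rule: one draw per line; draws below the threshold mean "training".
            boolean inTraining = rng.nextDouble() < threshold;
            if (inTraining == training) {
                selected.add(line);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            lines.add("record-" + i);
        }
        // The same seed and threshold yield complementary subsets.
        System.out.println("training: " + select(lines, 1L, 0.7, true));
        System.out.println("testing:  " + select(lines, 1L, 0.7, false));
    }
}
```

Because both calls replay the same random sequence from the same seed, every line lands in exactly one of the two subsets. The threshold is only an expected fraction under this rule, so the actual training share varies with the seed, especially on small inputs.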

DatasetSplit

public DatasetSplit(double threshold)

DatasetSplit

public DatasetSplit(org.apache.hadoop.conf.Configuration conf)

Method Detail

getSeed

public long getSeed()

getThreshold

public double getThreshold()

isTraining

public boolean isTraining()

setTraining

public void setTraining(boolean training)

storeJobParameters

public void storeJobParameters(org.apache.hadoop.conf.Configuration conf)


Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.