org.apache.mahout.clustering.spectral.eigencuts
Class EigencutsDriver
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.mahout.common.AbstractJob
org.apache.mahout.clustering.spectral.eigencuts.EigencutsDriver
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public class EigencutsDriver
- extends AbstractJob
Method Summary
static void main(String[] args)
static DistributedRowMatrix performEigenDecomposition(org.apache.hadoop.conf.Configuration conf, DistributedRowMatrix input, LanczosState state, int numEigenVectors, int overshoot, org.apache.hadoop.fs.Path tmp)
- Sets up the working Paths, configures the return values, and handles the remaining administrative work involved in running an eigen-decomposition and then the verifier.
static void run(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path output, int dimensions, int eigenrank, double halflife, double epsilon, double tau)
- Runs the Eigencuts clustering algorithm using the supplied arguments.
int run(String[] arg0)
Methods inherited from class org.apache.mahout.common.AbstractJob
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, getAnalyzerClassFromOption, getCLIOption, getCombinedTempPath, getGroup, getInputPath, getOption, getOption, getOutputPath, getOutputPath, getTempPath, getTempPath, hasOption, keyFor, maybePut, parseArguments, parseDirectories, prepareJob, prepareJob, prepareJob, setS3SafeCombinedInputPath, shouldRunNextPhase
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
EPSILON_DEFAULT
public static final double EPSILON_DEFAULT
- See Also:
- Constant Field Values
TAU_DEFAULT
public static final double TAU_DEFAULT
- See Also:
- Constant Field Values
OVERSHOOT_MULTIPLIER
public static final double OVERSHOOT_MULTIPLIER
- See Also:
- Constant Field Values
EigencutsDriver
public EigencutsDriver()
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
run
public int run(String[] arg0)
throws Exception
- Throws:
Exception
run
public static void run(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path output,
int dimensions,
int eigenrank,
double halflife,
double epsilon,
double tau)
throws IOException,
InterruptedException,
ClassNotFoundException
- Run the Eigencuts clustering algorithm using the supplied arguments
- Parameters:
conf - the Configuration to use
input - the Path to the directory containing input affinity tuples
output - the Path to the output directory
eigenrank - the number of top eigenvectors/eigenvalues to use
dimensions - the int number of dimensions of the square affinity matrix
halflife - the double minimum half-life threshold
epsilon - the double coefficient for setting minimum half-life threshold
tau - the double tau threshold for cutting links in the affinity graph
- Throws:
IOException
InterruptedException
ClassNotFoundException
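The relationship between the `input` and `dimensions` parameters can be pictured with a small plain-Java sketch. It assumes each affinity tuple is a text line of the form `row,col,value`; this page does not state the file layout, so treat that format as an assumption. The class `AffinityTupleSketch` and its method are hypothetical helpers for illustration and are not part of Mahout.

```java
import java.util.Arrays;
import java.util.List;

public class AffinityTupleSketch {

    /**
     * Builds a dense dimensions x dimensions affinity matrix from
     * "row,col,value" lines. The line format is an assumption.
     */
    public static double[][] toMatrix(List<String> lines, int dimensions) {
        double[][] affinity = new double[dimensions][dimensions];
        for (String line : lines) {
            String[] parts = line.trim().split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected row,col,value but got: " + line);
            }
            int row = Integer.parseInt(parts[0].trim());
            int col = Integer.parseInt(parts[1].trim());
            // The matrix is square, so both indices must lie in [0, dimensions).
            if (row < 0 || row >= dimensions || col < 0 || col >= dimensions) {
                throw new IllegalArgumentException("Index out of range in: " + line);
            }
            affinity[row][col] = Double.parseDouble(parts[2].trim());
        }
        return affinity;
    }

    public static void main(String[] args) {
        // Three points: 0 and 1 are strongly linked, 1 and 2 only weakly.
        List<String> lines = Arrays.asList("0,1,0.9", "1,0,0.9", "1,2,0.1", "2,1,0.1");
        System.out.println(Arrays.deepToString(toMatrix(lines, 3)));
    }
}
```

Under this reading, `dimensions` would be 3 for the sample data, since the tuples index a 3 x 3 matrix. Entries that never appear in the input stay at zero.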
performEigenDecomposition
public static DistributedRowMatrix performEigenDecomposition(org.apache.hadoop.conf.Configuration conf,
DistributedRowMatrix input,
LanczosState state,
int numEigenVectors,
int overshoot,
org.apache.hadoop.fs.Path tmp)
throws IOException
- Sets up the working Paths, configures the return values, and handles the
remaining administrative work involved in running an eigen-decomposition
and then the verifier.
- Throws:
IOException
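The separate `numEigenVectors` and `overshoot` arguments suggest that the solver is asked for more vectors than are ultimately kept, which is common with Lanczos iteration because some of the vectors it produces may not pass verification. A minimal sketch of how a caller might derive the overshoot from the desired rank follows. The multiplier of 1.5 is an assumed placeholder, since this page does not list the value of `OVERSHOOT_MULTIPLIER`, and `OvershootSketch` is a hypothetical helper, not part of Mahout.

```java
public class OvershootSketch {

    // Assumed placeholder; the real OVERSHOOT_MULTIPLIER value is not shown on this page.
    static final double ASSUMED_MULTIPLIER = 1.5;

    /** Scales the desired rank by the multiplier and truncates to an int. */
    public static int overshoot(int eigenrank, double multiplier) {
        if (eigenrank <= 0) {
            throw new IllegalArgumentException("eigenrank must be positive");
        }
        if (multiplier < 1.0) {
            throw new IllegalArgumentException("multiplier must be at least 1.0");
        }
        return (int) (eigenrank * multiplier);
    }

    public static void main(String[] args) {
        int eigenrank = 10;
        int requested = overshoot(eigenrank, ASSUMED_MULTIPLIER);
        System.out.println("keep " + eigenrank + " of " + requested + " requested vectors");
    }
}
```

The truncating cast means small ranks gain few extra vectors, so a caller that needs a guaranteed margin might round up instead.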
Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.