AbstractClusterStory provides a partial implementation of ClusterStory by parsing the topology tree.
Path to the list of inputs for the map-reduce job.
Path with a custom InputFormat to the list of inputs for the map-reduce job.
Path with a custom InputFormat and Mapper to the list of inputs for the map-reduce job.
Path to the list of inputs for the map-reduce job.
Path with a custom InputFormat to the list of inputs for the map-reduce job.
Path with a custom InputFormat and Mapper to the list of inputs for the map-reduce job.
Mapper class to the chain mapper.
Mapper class to the chain reducer.
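The entries above describe the two flavors of registering job inputs. A minimal sketch, assuming the new (org.apache.hadoop.mapreduce) API; the paths, job name, and MyMapper class are placeholders, not taken from this index, and factory methods vary slightly between Hadoop releases:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class AddInputsSketch {
      // Identity-style placeholder mapper for the per-path case (assumption).
      public static class MyMapper extends Mapper<LongWritable, Text, LongWritable, Text> {}

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "add-inputs-sketch");
        // Add a Path to the job-wide list of inputs.
        FileInputFormat.addInputPath(job, new Path("/data/plain"));
        // Add a Path with its own InputFormat and Mapper.
        MultipleInputs.addInputPath(job, new Path("/data/logs"),
            TextInputFormat.class, MyMapper.class);
      }
    }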
ArrayListBackedIterator instead
InputFormat that tries to deduce the types of the input files automatically.
BaileyBorweinPlouffe.BbpMapper.
BaileyBorweinPlouffe.BbpInputFormat.
BinaryPartitioner instead.
BinaryComparable keys using a configurable part of the bytes array returned by BinaryComparable.getBytes().
InputDemuxer to a particular file.
MachineNode object.
LoggedDiscreteCDF.
ChainMapper instead
ChainReducer instead
OutputCommitter.commitJob(JobContext) or OutputCommitter.abortJob(JobContext, int) instead.
OutputCommitter.commitJob(org.apache.hadoop.mapreduce.JobContext) or OutputCommitter.abortJob(org.apache.hadoop.mapreduce.JobContext, org.apache.hadoop.mapreduce.JobStatus.State) instead.
OutputCommitter.commitJob(JobContext) or OutputCommitter.abortJob(JobContext, JobStatus.State) instead.
MachineNode object.
JobClient.
RecordWriter to future operations.
InputSplit to future operations.
RecordWriter to future operations.
Cluster.
RecordWriter to future operations.
RecordWriter to future operations.
ClusterMetrics or TaskTrackerInfo instead
ClusterStory represents all configurations of a MapReduce cluster, including nodes, network topology, and slot configurations.
LoggedNetworkTopology object.
MultiFilterRecordReader.emit(org.apache.hadoop.mapred.join.TupleWritable) every Tuple from the collector (the outer join of child RRs).
MultiFilterRecordReader.emit(org.apache.hadoop.mapreduce.lib.join.TupleWritable) every Tuple from the collector (the outer join of child RRs).
CombineFileInputFormat
InputFormat that returns CombineFileSplit's in the InputFormat.getSplits(JobContext) method.
CombineFileRecordReader
CombineFileSplit.
CombineFileSplit
ComposableInputFormat instead
ComposableRecordReader instead
CompositeInputFormat instead
CompositeInputSplit instead
CompositeRecordReader instead
JobConf.
JobConf.
JobConf.
Configuration.
Counters that logically belong together.
Counters instead.
Group of counters, comprising counters from a particular counter Enum class.
JobStoryProducer object for the given trace.
CombineFileInputFormat.createPool(List).
CombineFileInputFormat.createPool(PathFilter...).
JobHistoryParser that parses JobHistory files produced by JobHistory in the same source code tree as rumen.
DBConfiguration instead
DBInputFormat instead.
DBWritable instead
DBWritable.
DefaultInputDemuxer acts as a pass-through demuxer.
Outputter that outputs to a plain file.
Nodes.
Job.
DoubleValueSum instead
FieldSelectionMapper and FieldSelectionReducer instead
FileInputFormat instead.
InputFormats.
OutputCommitter that commits files specified in job output directory i.e.
OutputCommitter that commits files specified in job output directory i.e.
OutputFormat.
OutputFormats that read from FileSystems.
FileSplit instead.
FilterOutputFormat instead.
FilterRecordWriter is a convenience wrapper class that implements RecordWriter.
FilterRecordWriter is a convenience wrapper class that extends the RecordWriter.
LoggedNetworkTopology object.
DataInput.
DataOutput.
TypedBytesInput.
DataInput.
TypedBytesOutput.
DataOutput.
TypedBytesInput.
DataInput.
TypedBytesOutput.
DataOutput.
SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS is incremented by MapRunner after invoking the map function.
SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS is incremented by the framework after invoking the reduce function.
Counters.Group.getCounter(String) instead
Counters.Counter of the given group with the given name.
Counters.Counter of the given group with the given name.
Counter for the given counterName.
Counter for the given groupName and counterName.
RawComparator comparator for grouping keys of inputs to the reduce.
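The counter entries above (counters of a given Enum that logically belong together, and counters looked up by group and name) can be exercised from any task context. A minimal sketch; the enum, group, and mapper names are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
      // Counters that logically belong together live in one Enum "group".
      public enum MyCounters { GOOD_RECORDS, BAD_RECORDS }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        if (value.getLength() > 0) {
          // Increment the Counter of the given Enum type by the specified amount.
          context.getCounter(MyCounters.GOOD_RECORDS).increment(1);
        } else {
          // Or look a Counter up by groupName and counterName.
          context.getCounter("my-group", "bad-records").increment(1);
        }
      }
    }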
InputFormat implementation for the map-reduce job, defaults to TextInputFormat if not specified explicitly.
InputFormat class for the job.
Paths for the map-reduce job.
Paths for the map-reduce job.
InputSplit object for a map.
InputWriter class.
Job with no particular Cluster.
Job with no particular Cluster.
Job with no particular Cluster and a given jobName.
RunningJob object to track an ongoing job.
JobClient.getJob(JobID).
JobConf for the job.
RunningJob.getID().
JobID object that this task attempt belongs to.
JobID object that this tip belongs to.
JobPriority for this job.
ClusterStatus.getJobTrackerStatus() instead.
Cluster.getJobTrackerStatus() instead.
SequenceFileRecordReader.next(Object, Object).
KeyFieldBasedComparator options
KeyFieldBasedComparator options
KeyFieldBasedPartitioner options
KeyFieldBasedPartitioner options
InputSplit.
LoggedJob object read directly from the trace.
MachineNode by its host name.
WrappedMapper.Context for custom implementations.
CompressionCodec for compressing the map outputs.
Mapper class for the job.
Mapper class for the job.
MapRunnable class for the job.
true.
TaskAttemptInfo for a given task-attempt, considering impact of locality.
TaskAttemptInfo with a TaskAttemptID associated with taskType, taskNumber, and taskAttemptNumber.
JobClient.getMapTaskReports(JobID)
mapreduce.map.maxattempts property.
mapred.map.max.attempts property.
mapreduce.reduce.maxattempts property.
mapred.reduce.max.attempts property.
JobConf.getMemoryForMapTask() and JobConf.getMemoryForReduceTask()
JobStory.
JobStory.
OutputCommitter implementation for the map-reduce job, defaults to FileOutputCommitter if not specified explicitly.
OutputCommitter for the task-attempt.
SequenceFile.CompressionType for the output SequenceFile.
SequenceFile.CompressionType for the output SequenceFile.
CompressionCodec for compressing the job outputs.
CompressionCodec for compressing the job outputs.
OutputFormat implementation for the map-reduce job, defaults to TextOutputFormat if not specified explicitly.
OutputFormat class for the job.
RawComparator comparator used to compare keys.
Path to the output directory for the map-reduce job.
Path to the output directory for the map-reduce job.
OutputReader class.
WritableComparable comparator for grouping keys of inputs to the reduce.
Object.hashCode() to partition.
BinaryComparable.getBytes() to partition.
Object.hashCode() to partition.
Partitioner used to partition Mapper-outputs to be sent to the Reducers.
Partitioner class for the job.
Path for a file that is unique for the task within the job output directory.
Path for a file that is unique for the task within the job output directory.
RecordReader consumed i.e.
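Several entries above refer to partitioning on Object.hashCode(). A minimal sketch of what that means in practice; this mirrors the usual hash-partitioning logic rather than quoting any class from this index:

    import org.apache.hadoop.mapreduce.Partitioner;

    public class HashLikePartitioner<K, V> extends Partitioner<K, V> {
      @Override
      public int getPartition(K key, V value, int numReduceTasks) {
        // Mask off the sign bit so the modulo result is a valid partition index.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
      }
    }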
RackNode by its name.
RecordReader for the given InputSplit.
RecordReader for the given InputSplit.
RecordWriter for the given job.
RecordWriter for the given job.
RecordWriter for the given task.
RecordWriter for the given task.
RecordWriter for the given task.
Reducer class for the job.
Reducer class for the job.
WrappedReducer.Context for custom implementations.
true.
JobClient.getReduceTaskReports(JobID)
TaskType
TaskStatus.State of the task-attempt.
SequenceFile
SequenceFile
SequenceFile
SequenceFile
RawComparator comparator used to compare keys.
true.
FileInputFormat.listStatus(JobConf) when they're too big.
TaskAttemptInfo for a given task-attempt, without regard to impact of locality (e.g.
TaskAttemptInfo with a TaskAttemptID associated with taskType, taskNumber, and taskAttemptNumber.
TaskCompletionEvent.getTaskAttemptId() instead.
TaskID object that this task attempt belongs to.
TaskID.getTaskIDsPattern(String, Integer, TaskType, Integer)
TaskInfo for a given task.
TaskInfo for the given task-attempt.
TaskType corresponding to the character
SequenceFileRecordReader.next(Object, Object).
Path to the task's temporary output directory for the map-reduce job
Path to the task's temporary output directory for the map-reduce job
StreamJob.run(String[]) instead.
JobHistoryParser to parse job histories for hadoop 0.20 (META=1).
HashPartitioner instead.
Object.hashCode().
Mapper instead.
Reducer instead.
Enum type, by the specified amount.
JobTracker.
Outputter to a specific path.
InnerJoinRecordReader instead.
InputDemuxer demultiplexes the input files into individual input streams.
InputFormat instead.
InputFormat describes the input-specification for a Map-Reduce job.
InputSampler
TotalOrderPartitioner.
InputFormat.
InputSplit instead.
InputSplit represents the data to be processed by an individual Mapper.
InverseMapper instead.
Mapper that swaps keys and values.
ControlledJob instead.
JobBuilder builds one job.
Job and Cluster instead
JobConf, and connect to the default JobTracker.
Configuration, and connect to the default JobTracker.
Configuration instead
JobConfigurationParser parses the job configuration xml file, and extracts various framework specific properties.
JobContext instead.
JobControl instead
JobHistoryParser defines the interface of a Job History file parser.
JobHistoryParserFactory is a singleton class that attempts to determine the version of job history and return a proper parser.
JobPriority instead
QueueInfo instead
JobStatus instead
JobStory represents the runtime information available for a completed Map-Reduce job.
JobStoryProducer produces the sequence of JobStory's.
LoggedJob instances.
JoinRecordReader instead
JsonGenerator to write objects in JSON format.
KeyFieldBasedComparator instead
KeyFieldBasedPartitioner instead
KeyFieldBasedComparator.
KeyValueLineRecordReader instead
KeyValueTextInputFormat instead
InputFormat for plain text files.
RunningJob.killTask(TaskAttemptID, boolean)
LazyOutputFormat instead.
LineReader instead.
Mapper that extracts text matching a regular expression.
LoggedDiscreteCDF is a discrete approximation of a cumulative distribution function, with this class set up to meet the requirements of the Jackson JSON parser/generator.
LoggedJob is a representation of a hadoop job, with the details of this class set up to meet the requirements of the Jackson JSON parser/generator.
LoggedLocation is a representation of a point in a hierarchical network, represented as a series of membership names, broadest first.
LoggedNetworkTopology represents a tree that in turn represents a hierarchy of hosts.
LoggedSingleRelativeRanking represents an X-Y coordinate of a single point in a discrete CDF.
LoggedTask represents a [hadoop] task that is part of a hadoop job.
LoggedTaskAttempt represents an attempt to run a hadoop task in a hadoop job.
LongSumReducer instead.
LongValueMax instead
LongValueMin instead
LongValueSum instead
MachineNode represents the configuration of a cluster node.
map(...) methods of the Mappers in the chain.
Mapper.
MapFileOutputFormat instead
OutputFormat that writes MapFiles.
Mapper instead.
Context passed on to the Mapper implementations.
Level for the map task.
Level for the reduce task.
JobConf.MAPRED_MAP_TASK_ENV or JobConf.MAPRED_REDUCE_TASK_ENV
JobConf.MAPRED_MAP_TASK_JAVA_OPTS or JobConf.MAPRED_REDUCE_TASK_JAVA_OPTS
JobConf.MAPRED_JOB_MAP_MEMORY_MB_PROPERTY and JobConf.MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
JobConf.MAPRED_MAP_TASK_ULIMIT or JobConf.MAPRED_REDUCE_TASK_ULIMIT
Mapper instead.
MapRunnable implementation.
MapTaskAttemptInfo represents the information with regard to a map task attempt.
MarkableIterator is a wrapper iterator class that implements the MarkableIteratorInterface.
CombineFileInputFormat instead
CombineFileSplit instead
MultiFileWordCount.MapClass.
CombineFileInputFormat, one should extend it, to return a (custom) RecordReader.
MultiFilterRecordReader instead
MultipleInputs instead
InputFormat and Mapper for each path
MultipleOutputs instead
MultipleOutputs instead
MultipleOutputs instead
MultipleOutputs instead
MultithreadedMapper instead.
DBRecordReader.nextKeyValue()
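One entry above notes that CombineFileInputFormat is meant to be extended so that it returns a (custom) RecordReader. A hedged sketch of that shape; MyChunkReader is a placeholder per-chunk reader, not anything defined in this index, and a real one would actually read records from each chunk:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader;
    import org.apache.hadoop.mapreduce.lib.input.CombineFileSplit;

    public class MyCombineInputFormat extends CombineFileInputFormat<LongWritable, Text> {
      @Override
      public RecordReader<LongWritable, Text> createRecordReader(
          InputSplit split, TaskAttemptContext context) throws IOException {
        // CombineFileRecordReader walks the chunks of the CombineFileSplit and
        // hands each one to a new instance of the per-chunk reader class.
        return new CombineFileRecordReader<LongWritable, Text>(
            (CombineFileSplit) split, context, MyChunkReader.class);
      }

      // Placeholder per-chunk reader (assumption): a real one would open the chunk's
      // path and emit records; this stub simply reports that it is finished.
      public static class MyChunkReader extends RecordReader<LongWritable, Text> {
        public MyChunkReader(CombineFileSplit split, TaskAttemptContext ctx, Integer idx) {}
        @Override public void initialize(InputSplit split, TaskAttemptContext ctx) {}
        @Override public boolean nextKeyValue() { return false; }
        @Override public LongWritable getCurrentKey() { return null; }
        @Override public Text getCurrentValue() { return null; }
        @Override public float getProgress() { return 1.0f; }
        @Override public void close() {}
      }
    }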
HistoryEvent
NLineInputFormat instead
Node represents a node in the cluster topology.
NullOutputFormat instead.
Job instead.
org.apache.hadoop.mapred package.
org.apache.hadoop.mapreduce package.
OuterJoinRecordReader instead
<key, value> pairs output by Mappers and Reducers.
OutputCommitter instead.
OutputCommitter describes the commit of task output for a Map-Reduce job.
OutputFormat instead.
OutputFormat describes the output-specification for a Map-Reduce job.
Utils.OutputFileUtils.OutputLogFilter instead.
OverrideRecordReader instead
Parser instead
Partitioner instead.
HistoryEvent Properties.
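The OutputCommitter entry above describes the commit of task output for a Map-Reduce job, and earlier entries point at commitJob/abortJob as the replacements for the deprecated cleanup calls. A hedged skeleton of the contract (the class name is a placeholder and there is deliberately no real commit logic):

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.OutputCommitter;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;

    public class NoOpOutputCommitter extends OutputCommitter {
      @Override public void setupJob(JobContext context) throws IOException {}
      @Override public void commitJob(JobContext context) throws IOException {}
      @Override public void setupTask(TaskAttemptContext context) throws IOException {}
      @Override public boolean needsTaskCommit(TaskAttemptContext context) throws IOException {
        return false; // nothing to promote from a temporary location
      }
      @Override public void commitTask(TaskAttemptContext context) throws IOException {}
      @Override public void abortTask(TaskAttemptContext context) throws IOException {}
    }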
HistoryEvent Properties.
RackNode represents a rack node in the cluster topology.
Type.BOOL code.
Type.BYTE code.
Type.BYTES code.
Type.DOUBLE code.
ResultSet.
Type.FLOAT code.
Type.INT code.
Type.LIST code.
Type.LONG code.
Type.MAP code.
Type.MAP code.
Type.BOOL code.
Type.BYTE code.
Type.BYTES code.
Type.DOUBLE code.
Type.FLOAT code.
Type.INT code.
Type.LIST code.
Type.LONG code.
Type.MAP code.
Type.STRING code.
Type.VECTOR code.
Type.STRING code.
Type.
Type.VECTOR code.
Type.VECTOR code.
RecordReader reads <key, value> pairs from an InputSplit.
Mapper.
RecordWriter writes the output <key, value> pairs to an output file.
RecordWriter writes the output <key, value> pairs to an output file.
reduce(...) method of the Reducer with the map(...) methods of the Mappers in the chain.
Reducer.
Iterator to iterate over values for a given group of records.
Reducer instead.
Context passed on to the Reducer implementations.
ReduceTaskAttemptInfo represents the information with regard to a reduce task attempt.
JobTracker.
RegexMapper
Mapper that extracts text matching a regular expression.
ResetableIterator instead
Reducer.run(org.apache.hadoop.mapreduce.Reducer.Context) method to control how the reduce task works.
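The chaining entries above (combining the reduce(...) method of the Reducer with the map(...) methods of the Mappers in the chain) come from the ChainMapper/ChainReducer helpers. A hedged sketch using the new-API versions with the identity Mapper and Reducer so it stays self-contained; real jobs would pass their own classes and per-stage Configurations:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
    import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;

    public class ChainSketch {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "chain-sketch");
        Configuration none = new Configuration(false);
        // Pre-shuffle chain: each map(...) feeds the next Mapper in order.
        ChainMapper.addMapper(job, Mapper.class,
            LongWritable.class, Text.class, LongWritable.class, Text.class, none);
        // The single Reducer of the chain ...
        ChainReducer.setReducer(job, Reducer.class,
            LongWritable.class, Text.class, LongWritable.class, Text.class, none);
        // ... followed by the map(...) methods of the post-reduce Mappers.
        ChainReducer.addMapper(job, Mapper.class,
            LongWritable.class, Text.class, LongWritable.class, Text.class, none);
      }
    }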
DumpTypedBytes.
LoadTypedBytes.
Job instead
SequenceFileAsBinaryInputFormat instead
SequenceFileAsBinaryOutputFormat instead
OutputFormat that writes keys, values to SequenceFiles in binary (raw) format
SequenceFileAsTextInputFormat instead
SequenceFileAsTextRecordReader instead
SequenceFileInputFilter instead
SequenceFileInputFormat instead.
InputFormat for SequenceFiles.
SequenceFileOutputFormat instead.
OutputFormat that writes SequenceFiles.
RecordReader for SequenceFiles.
RecordReader for SequenceFiles.
SkipBadRecords.COUNTER_MAP_PROCESSED_RECORDS is incremented by MapRunner after invoking the map function.
SkipBadRecords.COUNTER_REDUCE_PROCESSED_GROUPS is incremented by the framework after invoking the reduce function.
Reducer.reduce(Object, Iterable, org.apache.hadoop.mapreduce.Reducer.Context)
InputFormat implementation for the map-reduce job.
InputFormat for the job.
Paths as the list of inputs for the map-reduce job.
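The SequenceFile entries above (an OutputFormat that writes SequenceFiles, plus the output SequenceFile.CompressionType and CompressionCodec settings) fit together as job configuration. A hedged sketch; the output path and codec choice are assumptions, not taken from this index:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public class SequenceFileOutputSketch {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "seqfile-output-sketch");
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path("/out/seq"));
        // Compress whole blocks of records rather than individual values.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
        SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);
      }
    }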
Paths as the list of inputs for the map-reduce job.
InputWriter class.
JobPriority for this job.
KeyFieldBasedComparator options used to compare keys.
KeyFieldBasedComparator options used to compare keys.
KeyFieldBasedPartitioner options used for Partitioner
KeyFieldBasedPartitioner options used for Partitioner
bytes[offset:] in Python syntax.
CompressionCodec for the map outputs.
Mapper class for the job.
Mapper for the job.
MapRunnable class for the job.
JobConf.setMemoryForMapTask(long mem) and JobConf.setMemoryForReduceTask(long mem)
bytes[left:(right+1)] in Python syntax.
OutputCommitter implementation for the map-reduce job.
SequenceFile.CompressionType for the output SequenceFile.
SequenceFile.CompressionType for the output SequenceFile.
CompressionCodec to be used to compress job outputs.
CompressionCodec to be used to compress job outputs.
OutputFormat implementation for the map-reduce job.
OutputFormat for the job.
RawComparator comparator used to compare keys.
Path of the output directory for the map-reduce job.
Path of the output directory for the map-reduce job.
OutputReader class.
RawComparator comparator for grouping keys in the input to the reduce.
Partitioner class used to partition Mapper-outputs to be sent to the Reducers.
Partitioner for the job.
Reducer class to the chain job.
Reducer class for the job.
Reducer for the job.
bytes[:(offset+1)] in Python syntax.
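The byte-range entries above (bytes[offset:], bytes[left:(right+1)], bytes[:(offset+1)]) describe the slice of BinaryComparable.getBytes() that BinaryPartitioner hashes. A hedged sketch of configuring that slice; the offsets and job name are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.partition.BinaryPartitioner;

    public class BinaryPartitionSketch {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "binary-partition-sketch");
        job.setPartitionerClass(BinaryPartitioner.class);
        // Hash only bytes[5:10) of each key, i.e. left offset 5 and right offset 9 inclusive.
        BinaryPartitioner.setOffsets(job.getConfiguration(), 5, 9);
        // Alternatively, fix just one end of the slice:
        // BinaryPartitioner.setLeftOffset(job.getConfiguration(), 5);   // bytes[5:]
        // BinaryPartitioner.setRightOffset(job.getConfiguration(), 9);  // bytes[:10]
      }
    }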
SequenceFile
SequenceFile
SequenceFile
SequenceFile
Reducer.
TaskCompletionEvent.setTaskAttemptId(TaskAttemptID) instead.
Cluster.JobTrackerStatus instead.
StreamBackedIterator instead
StreamJob.setConf(Configuration) and run with StreamJob.run(String[]).
StringValueMax instead
StringValueMin instead
Submitter.runJob(JobConf)
TaskAttemptContext instead.
TaskID.
TaskAttemptID.TaskAttemptID(String, int, TaskType, int, int).
TaskID.
TaskAttemptInfo is a collection of statistics about a particular task-attempt gleaned from job-history of the job.
TaskCompletionEvent instead
TaskID.TaskID(String, int, TaskType, int)
TaskID.TaskID(org.apache.hadoop.mapreduce.JobID, TaskType, int)
JobID.
JobID.
TaskReport instead
TextInputFormat instead.
InputFormat for plain text files.
TextOutputFormat instead.
OutputFormat that writes plain text files.
TokenCounterMapper instead.
TotalOrderPartitioner
TupleWritable instead
Writables.
UniqValueCount instead
UserDefinedValueAggregatorDescriptor instead
ValueAggregator instead
ValueAggregatorBaseDescriptor instead
ValueAggregatorCombiner instead
ValueAggregatorDescriptor instead
ValueAggregatorJob instead
ValueAggregatorJobBase instead
ValueAggregatorMapper instead
ValueAggregatorReducer instead
ValueHistogram instead
Mapper which wraps a given one to allow custom WrappedMapper.Context implementations.
WrappedRecordReader instead
Reducer which wraps a given one to allow for custom WrappedReducer.Context implementations.
PreparedStatement.
out.
ZombieCluster rebuilds the cluster topology using the information obtained from job history logs.
ZombieJob is a layer above LoggedJob raw JSON objects.
ZombieJob with the same semantics as the LoggedJob passed in this parameter
ZombieJob with the same semantics as the LoggedJob passed in this parameter
JobStorys from job trace.