Package org.apache.hadoop.mapred

A system for scalable, fault-tolerant, distributed computation over large data collections.


Interface Summary
InputFormat An input data format.
JobConfigurable That which may be configured.
JobHistory.Listener Callback interface for reading back log events from JobHistory.
Mapper Maps input key/value pairs to a set of intermediate key/value pairs (a sketch appears after this summary).
MapRunnable Expert: Permits greater control of map processing.
OutputCollector Passed to Mapper and Reducer implementations to collect output data.
OutputFormat An output data format.
Partitioner Partitions the key space.
RecordReader Reads key/value pairs from an input FileSplit.
RecordWriter Writes key/value pairs to an output file.
Reducer Reduces a set of intermediate values which share a key to a smaller set of values.
Reporter Passed to application code to permit alteration of status.
RunningJob Includes details on a running MapReduce job.
SequenceFileInputFilter.Filter The filter interface.
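
The Mapper, Reducer, OutputCollector and Reporter types above fit together as in the following minimal word-count sketch. It assumes the pre-generics form of these interfaces in this version of the API (exact method signatures and Writable types vary between releases), and the class names are illustrative only, not part of this package.

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    /** Illustrative word-count map and reduce implementations. */
    public class WordCount {

      /** Emits (token, 1) for every whitespace-separated token in a line of input text. */
      public static class MapClass extends MapReduceBase implements Mapper {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(WritableComparable key, Writable value,
                        OutputCollector output, Reporter reporter) throws IOException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            output.collect(word, ONE);   // intermediate pair: (word, 1)
          }
        }
      }

      /** Sums the counts collected for each word. */
      public static class ReduceClass extends MapReduceBase implements Reducer {
        public void reduce(WritableComparable key, Iterator values,
                           OutputCollector output, Reporter reporter) throws IOException {
          int sum = 0;
          while (values.hasNext()) {
            sum += ((IntWritable) values.next()).get();
          }
          output.collect(key, new IntWritable(sum));
        }
      }
    }

MapReduceBase supplies no-op configure and close methods, so only map and reduce need to be written here.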
 

Class Summary
ClusterStatus Summarizes the size and current state of the cluster.
DefaultJobHistoryParser Default parser for job history files.
FileSplit A section of an input file.
InputFormatBase A base class for InputFormat.
IsolationRunner  
JobClient JobClient interacts with the JobTracker network interface.
JobConf A map/reduce job configuration.
JobHistory Provides methods for writing to and reading from job history.
JobHistory.HistoryCleaner Deletes history files older than one month.
JobHistory.JobInfo Helper class for logging or reading back events related to job start, finish or failure.
JobHistory.MapAttempt Helper class for logging or reading back events related to start, finish or failure of a Map Attempt on a node.
JobHistory.ReduceAttempt Helper class for logging or reading back events related to start, finish or failure of a Reduce Attempt on a node.
JobHistory.Task Helper class for logging or reading back events related to a Task's start, finish or failure.
JobHistory.TaskAttempt Base class for Map and Reduce TaskAttempts.
JobStatus Describes the current status of a job.
JobTracker JobTracker is the central location for submitting and tracking MR jobs in a network environment.
MapFileOutputFormat An OutputFormat that writes MapFiles.
MapReduceBase Base class for Mapper and Reducer implementations.
MapRunner Default MapRunnable implementation.
OutputFormatBase A base class for OutputFormat.
SequenceFileInputFilter A class that allows a map/red job to work on a sample of sequence files.
SequenceFileInputFilter.FilterBase Base class for Filters.
SequenceFileInputFilter.MD5Filter This class returns a set of records by examining the MD5 digest of the key against a filtering frequency f.
SequenceFileInputFilter.PercentFilter This class returns a percentage of records; the percentage is determined by a filtering frequency f using the criterion record# % f == 0.
SequenceFileInputFilter.RegexFilter Filters records by matching the key against a regex.
SequenceFileInputFormat An InputFormat for SequenceFiles.
SequenceFileOutputFormat An OutputFormat that writes SequenceFiles.
SequenceFileRecordReader A RecordReader for SequenceFiles.
StatusHttpServer Creates an embedded Jetty server to answer HTTP requests.
TaskReport A report on the state of a task.
TaskTracker TaskTracker is a process that starts and tracks MR Tasks in a networked environment.
TaskTracker.Child The main() for child processes.
TaskTracker.MapOutputServlet This class is used in TaskTracker's Jetty to serve the map outputs to other nodes.
TextInputFormat An InputFormat for plain text files.
TextInputFormat.LineRecordReader  
TextOutputFormat An OutputFormat that writes plain text files.
TextOutputFormat.LineRecordWriter  
 

Enum Summary
JobHistory.Keys Job history files contain key="value" pairs, where keys belong to this enum.
JobHistory.RecordTypes Record types are identifiers for each line of log in history files.
JobHistory.Values This enum contains some of the values commonly used by history log events.
 

Package org.apache.hadoop.mapred Description

A system for scalable, fault-tolerant, distributed computation over large data collections.

Applications implement the Mapper and Reducer interfaces. These are submitted as a JobConf and applied to data stored in a FileSystem.
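
As a rough sketch of that submission path (assuming the illustrative MapClass and ReduceClass from the interface summary above and hypothetical input/output paths; the exact JobConf setters available differ between releases), a job is configured and run like this:

    import java.io.IOException;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.TextInputFormat;
    import org.apache.hadoop.mapred.TextOutputFormat;

    public class WordCountDriver {
      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        // Types of the output key/value pairs.
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // The Mapper and Reducer implementations sketched earlier.
        conf.setMapperClass(WordCount.MapClass.class);
        conf.setReducerClass(WordCount.ReduceClass.class);

        // How input records are read and output records are written.
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // Hypothetical FileSystem paths; substitute real locations.
        conf.setInputPath(new Path("/user/example/input"));
        conf.setOutputPath(new Path("/user/example/output"));

        // Submit the job and block until it completes.
        JobClient.runJob(conf);
      }
    }

JobClient.runJob submits the JobConf to the JobTracker and monitors the job until it completes, returning a RunningJob handle for inspecting its status.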

See Google's original Map/Reduce paper for background information.



Copyright © 2006 The Apache Software Foundation