org.apache.hadoop.mapred
Class JobTracker

java.lang.Object
  extended by org.apache.hadoop.mapred.JobTracker
All Implemented Interfaces:
org.apache.hadoop.mapred.InterTrackerProtocol, org.apache.hadoop.mapred.JobSubmissionProtocol, org.apache.hadoop.mapred.MRConstants

public class JobTracker
extends Object
implements org.apache.hadoop.mapred.MRConstants, org.apache.hadoop.mapred.InterTrackerProtocol, org.apache.hadoop.mapred.JobSubmissionProtocol

JobTracker is the central location for submitting and tracking MR jobs in a network environment.

Author:
Mike Cafarella

Field Summary
static int FILE_NOT_FOUND
static long HEARTBEAT_INTERVAL
static org.apache.commons.logging.Log LOG
static int SUCCESS
static long TASKTRACKER_EXPIRY_INTERVAL
static int TRACKERS_OK
static int UNKNOWN_TASKTRACKER

Method Summary
 Vector completedJobs()
 int emitHeartbeat(org.apache.hadoop.mapred.TaskTrackerStatus trackerStatus, boolean initialContact)
          Process incoming heartbeat messages from the task trackers.
 Vector failedJobs()
static InetSocketAddress getAddress(Configuration conf)
 ClusterStatus getClusterStatus()
          Get the current status of the cluster
 String getFilesystemName()
          Get the local filesystem name
 int getInfoPort()
 org.apache.hadoop.mapred.JobInProgress getJob(String jobid)
 org.apache.hadoop.mapred.JobProfile getJobProfile(String jobid)
          Grab a handle to a job that is already known to the JobTracker
 org.apache.hadoop.mapred.JobStatus getJobStatus(String jobid)
          Grab a handle to a job that is already known to the JobTracker
 String getJobTrackerMachine()
 TaskReport[] getMapTaskReports(String jobid)
          Grab a bunch of info on the tasks that make up the job
 TaskReport[] getReduceTaskReports(String jobid)
 long getStartTime()
 org.apache.hadoop.mapred.TaskTrackerStatus getTaskTracker(String trackerID)
 int getTotalSubmissions()
static JobTracker getTracker()
 int getTrackerPort()
 void killJob(String jobid)
          Kill the indicated job
 org.apache.hadoop.mapred.MapOutputLocation[] locateMapOutputs(String jobId, int[] mapTasksNeeded, int reduce)
          A TaskTracker wants to know the physical locations of completed, but not yet closed, tasks.
static void main(String[] argv)
          Start the JobTracker process.
 void offerService()
          Run the JobTracker service loop forever
 org.apache.hadoop.mapred.Task pollForNewTask(String taskTracker)
          A tracker wants to know if there's a Task to run.
 String[] pollForTaskWithClosedJob(String taskTracker)
          A tracker wants to know if any of its Tasks have been closed (because the job completed, whether successfully or not)
 void reportTaskTrackerError(String taskTracker, String errorClass, String errorMessage)
          Report a problem to the job tracker.
 Vector runningJobs()
static void startTracker(Configuration conf)
 org.apache.hadoop.mapred.JobStatus submitJob(String jobFile)
          JobTracker.submitJob() kicks off a new job.
 Collection taskTrackers()

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

HEARTBEAT_INTERVAL

public static final long HEARTBEAT_INTERVAL
See Also:
Constant Field Values

TASKTRACKER_EXPIRY_INTERVAL

public static final long TASKTRACKER_EXPIRY_INTERVAL
See Also:
Constant Field Values

SUCCESS

public static final int SUCCESS
See Also:
Constant Field Values

FILE_NOT_FOUND

public static final int FILE_NOT_FOUND
See Also:
Constant Field Values

TRACKERS_OK

public static final int TRACKERS_OK
See Also:
Constant Field Values

UNKNOWN_TASKTRACKER

public static final int UNKNOWN_TASKTRACKER
See Also:
Constant Field Values

Method Detail

startTracker

public static void startTracker(Configuration conf)
                         throws IOException
Throws:
IOException

getTracker

public static JobTracker getTracker()

getAddress

public static InetSocketAddress getAddress(Configuration conf)

offerService

public void offerService()
Run the JobTracker service loop forever


getTotalSubmissions

public int getTotalSubmissions()

getJobTrackerMachine

public String getJobTrackerMachine()

getTrackerPort

public int getTrackerPort()

getInfoPort

public int getInfoPort()

getStartTime

public long getStartTime()

runningJobs

public Vector runningJobs()

failedJobs

public Vector failedJobs()

completedJobs

public Vector completedJobs()

taskTrackers

public Collection taskTrackers()

getTaskTracker

public org.apache.hadoop.mapred.TaskTrackerStatus getTaskTracker(String trackerID)

emitHeartbeat

public int emitHeartbeat(org.apache.hadoop.mapred.TaskTrackerStatus trackerStatus,
                         boolean initialContact)
Process incoming heartbeat messages from the task trackers.

Specified by:
emitHeartbeat in interface org.apache.hadoop.mapred.InterTrackerProtocol
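
Example (illustrative only): a heartbeat loop as a TaskTracker might run it. The JobTracker reference and the TaskTrackerStatus are assumed to already exist (TaskTrackerStatus appears fully qualified on this page and may only be usable from inside the org.apache.hadoop.mapred package), and reading the return code against the TRACKERS_OK and UNKNOWN_TASKTRACKER constants above is an assumption rather than something this page specifies.

    // Sketch only: report in at HEARTBEAT_INTERVAL until interrupted.
    void heartbeatLoop(JobTracker jobTracker,
                       org.apache.hadoop.mapred.TaskTrackerStatus myStatus)
            throws InterruptedException {
        boolean initialContact = true;
        while (true) {
            int code = jobTracker.emitHeartbeat(myStatus, initialContact);
            // Assumed meaning: UNKNOWN_TASKTRACKER asks this tracker to
            // re-introduce itself on the next heartbeat.
            initialContact = (code == JobTracker.UNKNOWN_TASKTRACKER);
            Thread.sleep(JobTracker.HEARTBEAT_INTERVAL);
        }
    }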

pollForNewTask

public org.apache.hadoop.mapred.Task pollForNewTask(String taskTracker)
A tracker wants to know if there's a Task to run. Returns a task we'd like the TaskTracker to execute right now. Eventually this method should take the load on the various TaskTrackers into account and incorporate knowledge of DFS file placement, but for now it simply takes a single item from the pending task list and hands it back.

Specified by:
pollForNewTask in interface org.apache.hadoop.mapred.InterTrackerProtocol
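
Example (illustrative only): the polling pattern implied by this method, seen from the tracker's side. The jobTracker and trackerName variables and the launchLocally helper are hypothetical, and the Task API itself is not shown on this page.

    // Sketch only: ask for work and hand any returned Task to a local runner.
    org.apache.hadoop.mapred.Task task = jobTracker.pollForNewTask(trackerName);
    if (task != null) {
        launchLocally(task);   // hypothetical helper that starts the task
    }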

pollForTaskWithClosedJob

public String[] pollForTaskWithClosedJob(String taskTracker)
A tracker wants to know if any of its Tasks have been closed (because the job completed, whether successfully or not)

Specified by:
pollForTaskWithClosedJob in interface org.apache.hadoop.mapred.InterTrackerProtocol
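
Example (illustrative only): checking for tasks whose jobs have been closed. The jobTracker and trackerName variables and the discardTaskOutput helper are hypothetical, and a null return is handled defensively.

    // Sketch only: clean up local state for tasks that belong to closed jobs.
    String[] closedTasks = jobTracker.pollForTaskWithClosedJob(trackerName);
    if (closedTasks != null) {
        for (String taskId : closedTasks) {
            discardTaskOutput(taskId);   // hypothetical cleanup helper
        }
    }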

locateMapOutputs

public org.apache.hadoop.mapred.MapOutputLocation[] locateMapOutputs(String jobId,
                                                                     int[] mapTasksNeeded,
                                                                     int reduce)
A TaskTracker wants to know the physical locations of completed, but not yet closed, tasks. This exists so the reduce task thread can locate map task outputs.

Specified by:
locateMapOutputs in interface org.apache.hadoop.mapred.InterTrackerProtocol
Parameters:
jobId - the job id
mapTasksNeeded - an array of the mapIds that we need
reduce - the reduce's id
Returns:
an array of MapOutputLocation
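
Example (illustrative only): how a reduce-side caller might look up the map outputs it still needs. The jobId, reduce id, and map task numbers are placeholders, and what is done with each MapOutputLocation is not specified on this page.

    // Sketch only: ask where the outputs of map tasks 0..2 live for this reduce.
    int[] neededMaps = {0, 1, 2};
    org.apache.hadoop.mapred.MapOutputLocation[] locations =
        jobTracker.locateMapOutputs(jobId, neededMaps, reduceId);
    // Each returned location would then be used to fetch the map output
    // (the MapOutputLocation API is documented elsewhere).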

getFilesystemName

public String getFilesystemName()
                         throws IOException
Get the local filesystem name

Specified by:
getFilesystemName in interface org.apache.hadoop.mapred.InterTrackerProtocol
Throws:
IOException

reportTaskTrackerError

public void reportTaskTrackerError(String taskTracker,
                                   String errorClass,
                                   String errorMessage)
                            throws IOException
Description copied from interface: org.apache.hadoop.mapred.InterTrackerProtocol
Report a problem to the job tracker.

Specified by:
reportTaskTrackerError in interface org.apache.hadoop.mapred.InterTrackerProtocol
Parameters:
taskTracker - the name of the task tracker
errorClass - the kind of error (eg. the class that was thrown)
errorMessage - the human readable error message
Throws:
IOException - if there was a problem in communication or on the remote side
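
Example (illustrative only): forwarding a caught exception to the JobTracker using the parameters as documented above. The tracker name is a made-up example, and the enclosing method declares the IOException this call may throw.

    // Sketch only: report a tracker-side failure to the JobTracker.
    void reportFailure(JobTracker jobTracker, Exception e) throws IOException {
        jobTracker.reportTaskTrackerError("tracker_host1:50050",   // example tracker name
                                          e.getClass().getName(),  // errorClass
                                          e.getMessage());         // errorMessage
    }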

submitJob

public org.apache.hadoop.mapred.JobStatus submitJob(String jobFile)
                                             throws IOException
JobTracker.submitJob() kicks off a new job. It creates a 'JobInProgress' object, which contains both a JobProfile and a JobStatus. Those two sub-objects are sometimes shipped outside of the JobTracker, but the JobInProgress adds information that is useful to the JobTracker alone. The JobInProgress is added to the jobInitQueue, which is processed asynchronously to handle split computation and build up the right TaskTracker/block mapping.

Specified by:
submitJob in interface org.apache.hadoop.mapred.JobSubmissionProtocol
Throws:
IOException
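
Example (illustrative only): the client side of this flow, using only methods shown on this page. The job file path is a made-up example; real clients normally go through a higher-level job client rather than calling the submission protocol directly.

    // Sketch only: submit a job description file to the local JobTracker
    // singleton and return its initial status.
    org.apache.hadoop.mapred.JobStatus submitExample() throws IOException {
        JobTracker jobTracker = JobTracker.getTracker();
        return jobTracker.submitJob("/tmp/hadoop/job_0001.xml");   // example path
    }

The returned JobStatus, together with getJobStatus(jobid) and getJobProfile(jobid), can then be polled while the jobInitQueue works through split computation.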

getClusterStatus

public ClusterStatus getClusterStatus()
Description copied from interface: org.apache.hadoop.mapred.JobSubmissionProtocol
Get the current status of the cluster

Specified by:
getClusterStatus in interface org.apache.hadoop.mapred.JobSubmissionProtocol
Returns:
summary of the state of the cluster
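
Example (illustrative only): fetching the cluster summary from the JobTracker singleton. The accessors on ClusterStatus are documented on its own page, so nothing further is assumed here.

    // Sketch only: obtain the cluster summary object.
    ClusterStatus summary = JobTracker.getTracker().getClusterStatus();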

killJob

public void killJob(String jobid)
Description copied from interface: org.apache.hadoop.mapred.JobSubmissionProtocol
Kill the indicated job

Specified by:
killJob in interface org.apache.hadoop.mapred.JobSubmissionProtocol

getJobProfile

public org.apache.hadoop.mapred.JobProfile getJobProfile(String jobid)
Description copied from interface: org.apache.hadoop.mapred.JobSubmissionProtocol
Grab a handle to a job that is already known to the JobTracker

Specified by:
getJobProfile in interface org.apache.hadoop.mapred.JobSubmissionProtocol

getJobStatus

public org.apache.hadoop.mapred.JobStatus getJobStatus(String jobid)
Description copied from interface: org.apache.hadoop.mapred.JobSubmissionProtocol
Grab a handle to a job that is already known to the JobTracker

Specified by:
getJobStatus in interface org.apache.hadoop.mapred.JobSubmissionProtocol

getMapTaskReports

public TaskReport[] getMapTaskReports(String jobid)
Description copied from interface: org.apache.hadoop.mapred.JobSubmissionProtocol
Grab a bunch of info on the tasks that make up the job

Specified by:
getMapTaskReports in interface org.apache.hadoop.mapred.JobSubmissionProtocol

getReduceTaskReports

public TaskReport[] getReduceTaskReports(String jobid)
Specified by:
getReduceTaskReports in interface org.apache.hadoop.mapred.JobSubmissionProtocol

getJob

public org.apache.hadoop.mapred.JobInProgress getJob(String jobid)

main

public static void main(String[] argv)
                 throws IOException,
                        InterruptedException
Start the JobTracker process. This is used only for debugging. As a rule, JobTracker should be run as part of the DFS Namenode process.

Throws:
IOException
InterruptedException
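
Example (illustrative only): starting the tracker programmatically for debugging, in the same spirit as main(). A default Configuration is used; whether startTracker(conf) returns promptly or runs the service itself is not specified on this page, so nothing further is assumed.

    // Sketch only: start a debugging JobTracker with the default configuration
    // (assumes the usual import of org.apache.hadoop.conf.Configuration).
    public static void debugStart() throws IOException {
        Configuration conf = new Configuration();
        JobTracker.startTracker(conf);
    }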


Copyright © 2006 The Apache Software Foundation