Hadoop MAPREDUCE 2.0.3-alpha Release Notes
These release notes include new developer and user-facing incompatibilities, features, and major improvements.
Changes since Hadoop 2.0.2
- MAPREDUCE-4977.
Major improvement reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (documentation)
Documentation for pluggable shuffle and pluggable sort
- MAPREDUCE-4971.
Minor improvement reported by Arun C Murthy and fixed by Arun C Murthy
Minor extensibility enhancements
- MAPREDUCE-4969.
Major bug reported by Arpit Agarwal and fixed by Arpit Agarwal (test)
TestKeyValueTextInputFormat test fails with Open JDK 7
- MAPREDUCE-4953.
Major bug reported by Andy Isaacson and fixed by Andy Isaacson (pipes)
HadoopPipes misuses fprintf
- MAPREDUCE-4949.
Minor improvement reported by Sandy Ryza and fixed by Sandy Ryza (examples)
Enable multiple pi jobs to run in parallel
- MAPREDUCE-4948.
Critical bug reported by Junping Du and fixed by Junping Du (client)
TestYARNRunner.testHistoryServerToken failed on trunk
- MAPREDUCE-4946.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (mr-am)
Type conversion of map completion events leads to performance problems with large jobs
- MAPREDUCE-4936.
Critical bug reported by Daryn Sharp and fixed by Arun C Murthy (mrv2)
JobImpl uber checks for cpu are wrong
- MAPREDUCE-4934.
Critical bug reported by Thomas Graves and fixed by Thomas Graves (build)
Maven RAT plugin is not checking all source files
- MAPREDUCE-4928.
Major improvement reported by Suresh Srinivas and fixed by Suresh Srinivas (applicationmaster , security)
Use token request messages defined in hadoop common
The renewer field of the Protobuf message GetDelegationTokenRequestProto is changed from optional to required. This change is not wire compatible with older releases.
- MAPREDUCE-4925.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (examples)
The pentomino option parser may be buggy
- MAPREDUCE-4924.
Trivial bug reported by Robert Kanter and fixed by Robert Kanter (mrv1)
flakey test: org.apache.hadoop.mapred.TestClusterMRNotification.testMR
- MAPREDUCE-4923.
Minor bug reported by Sandy Ryza and fixed by Sandy Ryza (mrv1 , mrv2 , task)
Add toString method to TaggedInputSplit
- MAPREDUCE-4921.
Blocker bug reported by Daryn Sharp and fixed by Daryn Sharp (client)
JobClient should acquire HS token with RM principal
- MAPREDUCE-4920.
Major bug reported by Vinod Kumar Vavilapalli and fixed by Suresh Srinivas
Use security token protobuf definition from hadoop common
- MAPREDUCE-4913.
Major bug reported by Jason Lowe and fixed by Jason Lowe (mr-am)
TestMRAppMaster#testMRAppMasterMissingStaging occasionally exits
- MAPREDUCE-4907.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (mrv1 , tasktracker)
TrackerDistributedCacheManager issues too many getFileStatus calls
- MAPREDUCE-4905.
Major test reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
test org.apache.hadoop.mapred.pipes
- MAPREDUCE-4902.
Trivial bug reported by Albert Chu and fixed by Albert Chu
Fix typo "receievd" should be "received" in log output
- MAPREDUCE-4899.
Major improvement reported by Derek Dagit and fixed by Derek Dagit
Provide a plugin to the Yarn Web App Proxy to generate tracking links for M/R applications given the ID
- MAPREDUCE-4895.
Major bug reported by Dennis Y and fixed by Dennis Y
Fix compilation failure of org.apache.hadoop.mapred.gridmix.TestResourceUsageEmulators
- MAPREDUCE-4894.
Blocker bug reported by Siddharth Seth and fixed by Siddharth Seth (jobhistoryserver , mrv2)
Renewal / cancellation of JobHistory tokens
- MAPREDUCE-4893.
Major bug reported by Bikas Saha and fixed by Bikas Saha (applicationmaster)
MR AppMaster can do sub-optimal assignment of containers to map tasks leading to poor node locality
- MAPREDUCE-4890.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (mr-am)
Invalid TaskImpl state transitions when task fails while speculating
- MAPREDUCE-4884.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (contrib/streaming , test)
streaming tests fail to start MiniMRCluster due to "Queue configuration missing child queue names for root"
- MAPREDUCE-4861.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla
Cleanup: Remove unused mapreduce.security.token.DelegationTokenRenewal
- MAPREDUCE-4856.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (test)
TestJobOutputCommitter uses same directory as TestJobCleanup
- MAPREDUCE-4848.
Major bug reported by Jason Lowe and fixed by Jerry Chen (mr-am)
TaskAttemptContext cast error during AM recovery
- MAPREDUCE-4845.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (client)
ClusterStatus.getMaxMemory() and getUsedMemory() exist in MR1 but not MR2
- MAPREDUCE-4842.
Blocker bug reported by Jason Lowe and fixed by Mariappan Asokan (mrv2)
Shuffle race can hang reducer
- MAPREDUCE-4838.
Major improvement reported by Arun C Murthy and fixed by Zhijie Shen
Add extra info to JH files
- MAPREDUCE-4836.
Major bug reported by Ravi Prakash and fixed by Ravi Prakash
Elapsed time for running tasks on AM web UI tasks page is 0
- MAPREDUCE-4833.
Critical bug reported by Robert Joseph Evans and fixed by Robert Parker (applicationmaster , mrv2)
Task can get stuck in FAIL_CONTAINER_CLEANUP
- MAPREDUCE-4832.
Critical bug reported by Robert Joseph Evans and fixed by Jason Lowe (applicationmaster)
MR AM can get in a split brain situation
- MAPREDUCE-4825.
Major bug reported by Jason Lowe and fixed by Jason Lowe (mr-am)
JobImpl.finished doesn't expect ERROR as a final job state
- MAPREDUCE-4822.
Trivial improvement reported by Robert Joseph Evans and fixed by Chu Tong (jobhistoryserver)
Unnecessary conversions in History Events
- MAPREDUCE-4819.
Blocker bug reported by Jason Lowe and fixed by Bikas Saha (mr-am)
AM can rerun job after reporting final job status to the client
- MAPREDUCE-4817.
Critical bug reported by Jason Lowe and fixed by Thomas Graves (applicationmaster , mr-am)
Hardcoded task ping timeout kills tasks localizing large amounts of data
- MAPREDUCE-4813.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (applicationmaster)
AM timing out during job commit
- MAPREDUCE-4811.
Minor improvement reported by Ravi Prakash and fixed by Ravi Prakash (jobhistoryserver , mrv2)
JobHistoryServer should show when it was started in WebUI About page
- MAPREDUCE-4810.
Minor improvement reported by Jason Lowe and fixed by Jerry Chen (applicationmaster)
Add admin command options for ApplicationMaster
- MAPREDUCE-4809.
Major sub-task reported by Arun C Murthy and fixed by Mariappan Asokan
Change visibility of classes for pluggable sort changes
- MAPREDUCE-4808.
Major new feature reported by Arun C Murthy and fixed by Mariappan Asokan
Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
- MAPREDUCE-4807.
Major sub-task reported by Arun C Murthy and fixed by Mariappan Asokan
Allow MapOutputBuffer to be pluggable
- MAPREDUCE-4803.
Minor test reported by Mariappan Asokan and fixed by Mariappan Asokan (test)
Duplicate copies of TestIndexCache.java
- MAPREDUCE-4802.
Major improvement reported by Ravi Prakash and fixed by Ravi Prakash (mr-am , mrv2 , webapps)
Takes a long time to load the task list on the AM for large jobs
- MAPREDUCE-4801.
Critical bug reported by Jason Lowe and fixed by Jason Lowe
ShuffleHandler can generate large logs due to prematurely closed channels
- MAPREDUCE-4797.
Major bug reported by Jason Lowe and fixed by Jason Lowe (applicationmaster)
LocalContainerAllocator can loop forever trying to contact the RM
- MAPREDUCE-4787.
Major bug reported by Ravi Prakash and fixed by Robert Parker (test)
TestJobMonitorAndPrint is broken
- MAPREDUCE-4786.
Major bug reported by Ravi Prakash and fixed by Ravi Prakash (mrv2)
Job End Notification retry interval is 5 milliseconds by default
- MAPREDUCE-4782.
Blocker bug reported by Mark Fuhs and fixed by Mark Fuhs (client)
NLineInputFormat skips first line of last InputSplit
- MAPREDUCE-4778.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (jobtracker , scheduler)
Fair scheduler event log is only written if directory exists on HDFS
- MAPREDUCE-4777.
Minor improvement reported by Sandy Ryza and fixed by Sandy Ryza
In TestIFile, testIFileReaderWithCodec relies on testIFileWriterWithCodec
- MAPREDUCE-4774.
Major bug reported by Ivan A. Veselovsky and fixed by Jason Lowe (applicationmaster , mrv2)
JobImpl does not handle asynchronous task events in FAILED state
- MAPREDUCE-4772.
Critical bug reported by Robert Joseph Evans and fixed by Robert Joseph Evans (mrv2)
Fetch failures can take way too long for a map to be restarted
- MAPREDUCE-4771.
Major bug reported by Jason Lowe and fixed by Jason Lowe (mrv2)
KeyFieldBasedPartitioner not partitioning properly when configured
- MAPREDUCE-4764.
Major improvement reported by Ivan A. Veselovsky and fixed by
repair test org.apache.hadoop.mapreduce.security.TestBinaryTokenFile
- MAPREDUCE-4763.
Minor improvement reported by Ivan A. Veselovsky and fixed by
repair test org.apache.hadoop.mapreduce.security.TestUmbilicalProtocolWithJobToken
- MAPREDUCE-4752.
Major improvement reported by Robert Joseph Evans and fixed by Robert Joseph Evans (mrv2)
Reduce MR AM memory usage through String Interning
- MAPREDUCE-4751.
Major bug reported by Ravi Prakash and fixed by Vinod Kumar Vavilapalli
AM stuck in KILL_WAIT for days
- MAPREDUCE-4748.
Blocker bug reported by Robert Joseph Evans and fixed by Jason Lowe (mrv2)
Invalid event: T_ATTEMPT_SUCCEEDED at SUCCEEDED
- MAPREDUCE-4746.
Major bug reported by Robert Parker and fixed by Robert Parker (applicationmaster)
The MR Application Master does not have a config to set environment variables
- MAPREDUCE-4741.
Minor bug reported by Jason Lowe and fixed by Vinod Kumar Vavilapalli (applicationmaster , mrv2)
WARN and ERROR messages logged during normal AM shutdown
- MAPREDUCE-4740.
Blocker bug reported by Robert Joseph Evans and fixed by Robert Joseph Evans (mrv2)
only .jars can be added to the Distributed Cache classpath
- MAPREDUCE-4736.
Trivial improvement reported by Brandon Li and fixed by Brandon Li (test)
Remove obsolete option [-rootDir] from TestDFSIO
- MAPREDUCE-4733.
Major bug reported by Jason Lowe and fixed by Jason Lowe (applicationmaster , mrv2)
Reducer can fail to make progress during shuffle if too many reducers complete consecutively
- MAPREDUCE-4730.
Blocker bug reported by Jason Lowe and fixed by Jason Lowe (applicationmaster , mrv2)
AM crashes due to OOM while serving up map task completion events
- MAPREDUCE-4729.
Major bug reported by Thomas Graves and fixed by Vinod Kumar Vavilapalli (jobhistoryserver)
job history UI not showing all job attempts
- MAPREDUCE-4724.
Major bug reported by Thomas Graves and fixed by Thomas Graves (jobhistoryserver)
job history web ui applications page should be sorted to display last app first
- MAPREDUCE-4723.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
Fix warnings found by findbugs 2
- MAPREDUCE-4721.
Major bug reported by Ravi Prakash and fixed by Ravi Prakash (jobhistoryserver)
Task startup time in JHS is same as job startup time.
- MAPREDUCE-4720.
Major bug reported by Robert Joseph Evans and fixed by Ravi Prakash
Browser thinks History Server main page JS is taking too long
- MAPREDUCE-4712.
Major bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli (jobhistoryserver)
mr-jobhistory-daemon.sh doesn't accept --config
- MAPREDUCE-4705.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (jobhistoryserver , mrv2)
Historyserver links expire before the history data does
- MAPREDUCE-4703.
Major improvement reported by Ahmed Radwan and fixed by Ahmed Radwan (mrv1 , mrv2 , test)
Add the ability to restart the MiniMRClientCluster using the configuration it had before it was stopped.
- MAPREDUCE-4681.
Major bug reported by Arun C Murthy and fixed by Arun C Murthy
HDFS-3910 broke MR tests
- MAPREDUCE-4678.
Minor bug reported by Chris McConnell and fixed by Chris McConnell (examples)
Running the Pentomino example with defaults throws java.lang.NegativeArraySizeException
- MAPREDUCE-4674.
Minor bug reported by Robert Justice and fixed by Robert Justice
Hadoop examples secondarysort has a typo "secondarysrot" in the usage
- MAPREDUCE-4666.
Minor improvement reported by Jason Lowe and fixed by Jason Lowe (jobhistoryserver)
JVM metrics for history server
- MAPREDUCE-4654.
Critical bug reported by Colin Patrick McCabe and fixed by Sandy Ryza (test)
TestDistCp is @ignored
- MAPREDUCE-4637.
Major bug reported by Tom White and fixed by Mayank Bansal (mrv2)
Killing an unassigned task attempt causes the job to fail
Handle TaskAttempt diagnostic updates while in the NEW and UNASSIGNED states.
- MAPREDUCE-4616.
Minor improvement reported by Tony Burton and fixed by Tony Burton (documentation)
Improvement to MultipleOutputs javadocs
- MAPREDUCE-4607.
Major bug reported by Bikas Saha and fixed by Bikas Saha
Race condition in ReduceTask completion can result in Task being incorrectly failed
- MAPREDUCE-4596.
Major task reported by Siddharth Seth and fixed by Siddharth Seth (applicationmaster , mrv2)
Split StateMachine state from states seen by MRClientProtocol (for Job, Task, TaskAttempt)
- MAPREDUCE-4554.
Major bug reported by Benoy Antony and fixed by Benoy Antony (job submission , security)
Job Credentials are not transmitted if security is turned off
- MAPREDUCE-4521.
Major bug reported by Jason Lowe and fixed by Ravi Prakash (mrv2)
mapreduce.user.classpath.first incompatibility with 0.20/1.x
- MAPREDUCE-4520.
Major new feature reported by Arun C Murthy and fixed by Arun C Murthy
Add experimental support for MR AM to schedule CPUs along with memory
- MAPREDUCE-4517.
Minor improvement reported by James Kinley and fixed by Jason Lowe (applicationmaster)
Too many INFO messages written out during AM to RM heartbeat
- MAPREDUCE-4479.
Major bug reported by Mariappan Asokan and fixed by Mariappan Asokan (test)
Fix parameter order in assertEquals() in TestCombineInputFileFormat.java
- MAPREDUCE-4458.
Major improvement reported by Robert Joseph Evans and fixed by Robert Parker (mrv2)
Warn if java.library.path is used for AM or Task
- MAPREDUCE-4425.
Critical bug reported by Siddharth Seth and fixed by Jason Lowe (mrv2)
Speculation + Fetch failures can lead to a hung job
- MAPREDUCE-4279.
Major bug reported by Rahul Jain and fixed by Devaraj K (jobtracker)
getClusterStatus() fails with null pointer exception when running jobs in local mode
- MAPREDUCE-4278.
Major bug reported by Araceli Henley and fixed by Sandy Ryza
cannot run two local jobs in parallel from the same gateway.
- MAPREDUCE-4272.
Major bug reported by Luke Lu and fixed by Yu Gao (task)
SortedRanges.Range#compareTo is not spec compliant
- MAPREDUCE-4266.
Major task reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (build)
remove Ant remnants from MR
- MAPREDUCE-4229.
Major improvement reported by Todd Lipcon and fixed by Miomir Boljanovic (jobtracker)
Counter names' memory usage can be decreased by interning
- MAPREDUCE-4123.
Critical bug reported by Nishan Shetty and fixed by Devaraj K (mrv2)
./mapred groups gives NoClassDefFoundError
- MAPREDUCE-4049.
Major sub-task reported by Avner BenHanoch and fixed by Avner BenHanoch (performance , task , tasktracker)
plugin for generic shuffle service
Allow ReduceTask to load a third-party plugin for shuffle (and merge) instead of the default shuffle.
- MAPREDUCE-3678.
Major new feature reported by Bejoy KS and fixed by Harsh J (mrv1 , mrv2)
The Map tasks logs should have the value of input split it processed
A map task's syslog now carries basic information about the InputSplit it processed.
- MAPREDUCE-2454.
Minor new feature reported by Mariappan Asokan and fixed by Mariappan Asokan
Allow external sorter plugin for MR
MAPREDUCE-4807 Allow external implementations of the sort phase in a Map task
- MAPREDUCE-2264.
Major bug reported by Adam Kramer and fixed by Devaraj K (jobtracker)
Job status exceeds 100% in some cases
- MAPREDUCE-1806.
Major bug reported by Paul Yang and fixed by Gera Shegalov (harchive)
CombineFileInputFormat does not work with paths not on default FS
- MAPREDUCE-1700.
Major bug reported by Tom White and fixed by Tom White (task)
User supplied dependencies may conflict with MapReduce system JARs