Hadoop YARN 2.1.0-beta Release Notes
These release notes include new developer and user-facing incompatibilities, features, and major improvements.
Changes since Hadoop 2.0.5-alpha
- YARN-1056.
Trivial bug reported by Karthik Kambatla and fixed by Karthik Kambatla
Fix configs yarn.resourcemanager.resourcemanager.connect.{max.wait.secs|retry_interval.secs}
Fix configs yarn.resourcemanager.resourcemanager.connect.{max.wait.secs|retry_interval.secs} so that *resourcemanager* appears only once, make them consistent with other such yarn configs, and add entries in yarn-default.xml.
- YARN-1046.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla
Disable mem monitoring by default in MiniYARNCluster
Have been running into this frequently in spite of MAPREDUCE-3709 on CentOS 6 machines. However, when I try to run it independently on those machines, I have not been able to reproduce it.
{noformat}
2013-08-07 19:17:35,048 WARN [Container Monitor] monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(444)) - Container [pid=16556,containerID=container_1375928243488_0001_01_000001] is running beyond virtual memory limits. Current usage: 132.4 MB of 512 MB physical memory used; 1.2 GB of 1.0 GB virtual memory used. Killing container.
{noformat}
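For context, a minimal sketch of how a test could turn the memory checks off before starting a MiniYARNCluster; the two configuration keys are standard node-manager settings, and whether MiniYARNCluster should apply them by default is exactly what this issue changes, so treat this as illustrative only.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class MiniClusterWithoutMemChecks {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Turn off physical and virtual memory enforcement so containers are not
    // killed by the ContainersMonitor during tests.
    conf.setBoolean(YarnConfiguration.NM_PMEM_CHECK_ENABLED, false);
    conf.setBoolean(YarnConfiguration.NM_VMEM_CHECK_ENABLED, false);

    MiniYARNCluster cluster = new MiniYARNCluster("test", 1, 1, 1);
    cluster.init(conf);
    cluster.start();
    // ... run the test against the mini cluster ...
    cluster.stop();
  }
}
{code}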
- YARN-1045.
Major improvement reported by Siddharth Seth and fixed by Jian He
Improve toString implementation for PBImpls
The generic toString implementation that is used in most of the PBImpls {code}getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");{code} is rather inefficient - replacing "\n" and "\s" to generate a one line string. Instead, we can use {code}TextFormat.shortDebugString(getProto());{code}.
If we can get this into 2.1.0 - great, otherwise the next release.
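As a rough illustration of the two approaches (the wrapper class and getProto() here are placeholders for the generated PBImpl pattern, not actual YARN classes):
{code}
import com.google.protobuf.Message;
import com.google.protobuf.TextFormat;

abstract class ExamplePBImpl {
  // Placeholder for the generated protobuf message held by a PBImpl.
  protected abstract Message getProto();

  // Old approach: render the multi-line text form, then flatten it with regexes.
  public String toStringViaReplace() {
    return getProto().toString().replaceAll("\\n", ", ").replaceAll("\\s+", " ");
  }

  // Proposed approach: let protobuf produce a single-line form directly.
  @Override
  public String toString() {
    return TextFormat.shortDebugString(getProto());
  }
}
{code}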
- YARN-1043.
Major bug reported by Yusaku Sako and fixed by Jian He
YARN Queue metrics are getting pushed to neither file nor Ganglia
YARN Queue metrics are not getting pushed to file or Ganglia via Hadoop Metrics 2.
QueueMetrics are still accessible via JMX and RM REST API (<hostname>:8088/ws/v1/cluster/scheduler).
- YARN-968.
Blocker bug reported by Kihwal Lee and fixed by Vinod Kumar Vavilapalli
RM admin commands don't work
If an RM admin command is issued using the CLI, I get something like the following:
13/07/24 17:19:40 INFO client.RMProxy: Connecting to ResourceManager at xxxx.com/1.2.3.4:1234
refreshQueues: Unknown protocol: org.apache.hadoop.yarn.api.ResourceManagerAdministrationProtocolPB
- YARN-961.
Blocker bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
ContainerManagerImpl should enforce token on server. Today it is [TOKEN, SIMPLE]
We should only accept SecurityAuthMethod.TOKEN for ContainerManagementProtocol. Today it also accepts SIMPLE for unsecured environment.
- YARN-960.
Blocker bug reported by Alejandro Abdelnur and fixed by Daryn Sharp
TestMRCredentials and TestBinaryTokenFile are failing on trunk
Not sure, but this may be a fallout from YARN-701 and/or related to YARN-945.
Making it a blocker until full impact of the issue is scoped.
- YARN-945.
Blocker bug reported by Bikas Saha and fixed by Vinod Kumar Vavilapalli
AM register failing after AMRMToken
{noformat}
2013-07-19 15:53:55,569 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54313: readAndProcess from client 127.0.0.1 threw exception [org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN]]
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN]
at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1531)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1482)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:788)
at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:587)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:562)
{noformat}
- YARN-937.
Blocker bug reported by Arun C Murthy and fixed by Alejandro Abdelnur
Fix unmanaged AM in non-secure/secure setup post YARN-701
Fix unmanaged AM in non-secure/secure setup post YARN-701 since app-tokens will be used in both scenarios.
- YARN-932.
Major bug reported by Sandy Ryza and fixed by Karthik Kambatla
TestResourceLocalizationService.testLocalizationInit can fail on JDK7
It looks like this is occurring when testLocalizationInit doesn't run first. Somehow yarn.nodemanager.log-dirs is getting set by one of the other tests (to ${yarn.log.dir}/userlogs), but yarn.log.dir isn't being set.
- YARN-927.
Major task reported by Bikas Saha and fixed by Bikas Saha
Change ContainerRequest to not have more than 1 container count and remove StoredContainerRequest
The downside is having to use more than 1 container request when requesting more than 1 container at * priority. For most other use cases that have specific locations we anyway need to make multiple container requests. This will also remove unnecessary duplication caused by StoredContainerRequest. It will make getMatchingRequest() always available and make removeContainerRequest() easy to use.
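A hedged sketch of what the change means for callers: to ask for N containers at the same priority, the AM now adds N single-container requests. Method and constructor names follow the 2.1.0 client API as described here, so treat them as illustrative.
{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class SingleCountRequests {
  // Request n identical containers at "*" (any host/rack) by adding n requests.
  static void requestContainers(AMRMClient<ContainerRequest> amRmClient, int n) {
    Resource capability = Resource.newInstance(1024, 1);
    Priority priority = Priority.newInstance(0);
    for (int i = 0; i < n; i++) {
      amRmClient.addContainerRequest(
          new ContainerRequest(capability, null, null, priority));
    }
  }
}
{code}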
- YARN-926.
Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Jian He
ContainerManagerProtcol APIs should take in requests for multiple containers
AMs typically have to launch multiple containers on a node and the current single container APIs aren't helping. We should have all the APIs take in multiple requests and return multiple responses.
The client libraries could expose both the single and multi-container requests.
- YARN-922.
Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
Change FileSystemRMStateStore to use directories
Store each app and its attempts in the same directory so that removing application state is only one operation
- YARN-919.
Minor bug reported by Mayank Bansal and fixed by Mayank Bansal
Document setting default heap sizes in yarn env
Right now there are no defaults in the yarn-env scripts for the resource manager and node manager, and if users want to override them, they have to go to the documentation, find the variables, and change the script.
There is no straightforward way to change it in the script. Just updating the variables with defaults.
- YARN-918.
Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
ApplicationMasterProtocol doesn't need ApplicationAttemptId in the payload after YARN-701
Once we use AMRMToken irrespective of kerberos after YARN-701, we don't need ApplicationAttemptId in the RPC payload. This is an API change, so doing it as a blocker for 2.1.0-beta.
- YARN-912.
Major bug reported by Bikas Saha and fixed by Mayank Bansal
Create exceptions package in common/api for yarn and move client-facing exceptions to it
Exceptions like InvalidResourceBlacklistRequestException, InvalidResourceRequestException, InvalidApplicationMasterRequestException, etc. are currently inside the ResourceManager and not visible to clients.
- YARN-909.
Minor bug reported by Chuan Liu and fixed by Chuan Liu (nodemanager)
Disable TestLinuxContainerExecutorWithMocks on Windows
This unit test tests a Linux specific feature. We should skip this unit test on Windows. A similar unit test 'TestLinuxContainerExecutor' was already skipped when running on Windows.
- YARN-897.
Blocker bug reported by Djellel Eddine Difallah and fixed by Djellel Eddine Difallah (capacityscheduler)
CapacityScheduler wrongly sorted queues
The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity defines the sort order. This ensures that the queue with the least UsedCapacity receives resources next. On containerAssignment we correctly update the order, but we fail to do so on container completions. This corrupts the TreeSet structure, and under-capacity queues might starve for resources.
- YARN-894.
Minor bug reported by Chuan Liu and fixed by Chuan Liu (nodemanager)
NodeHealthScriptRunner timeout checking is inaccurate on Windows
In the {{NodeHealthScriptRunner}} method, we set the HealthChecker status based on the Shell execution results. Some statuses are based on the exception thrown during the Shell script execution.
Currently, we catch a non-ExitCodeException from ShellCommandExecutor, and if Shell has the timeout status set at the same time, we also set the HealthChecker status to timeout.
We have the following execution sequence in Shell (a simplified sketch follows below):
1) In the main thread, schedule a delayed timer task that will kill the original process upon timeout.
2) In the main thread, open a buffered reader and read the process's output stream.
3) When the timeout happens, the timer task calls {{Process#destroy()}} to kill the process.
On Linux, when the timeout happens and the process is killed, the buffered reader throws an IOException with the message "Stream closed" in the main thread.
On Windows, we don't get the IOException. Only -1 is returned from the reader, indicating the stream is finished. As a result, the timeout status is not set on Windows, and {{TestNodeHealthService}} fails on Windows because of this.
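A simplified, self-contained sketch of the sequence above (not the actual Shell code): a timer kills the child process on timeout, and the caller has to consult a timed-out flag, because on Windows the reader simply reaches end-of-stream instead of throwing.
{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class TimeoutProbe {
  public static void main(String[] args) throws IOException {
    final Process p = new ProcessBuilder("ping", "127.0.0.1").start();
    final AtomicBoolean timedOut = new AtomicBoolean(false);

    Timer timer = new Timer(true);
    timer.schedule(new TimerTask() {
      @Override public void run() {
        timedOut.set(true);
        p.destroy();                 // step 3: kill the process on timeout
      }
    }, 1000);

    BufferedReader reader =
        new BufferedReader(new InputStreamReader(p.getInputStream()));
    try {
      while (reader.readLine() != null) {
        // step 2: drain the process output in the main thread
      }
    } catch (IOException e) {
      // Linux: "Stream closed" typically surfaces here after destroy()
    } finally {
      timer.cancel();
    }
    // Windows: readLine() just returns null (EOF), so the flag is the only
    // reliable signal that a timeout occurred.
    System.out.println("timed out? " + timedOut.get());
  }
}
{code}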
- YARN-883.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Expose Fair Scheduler-specific queue metrics
When the Fair Scheduler is enabled, QueueMetrics should include fair share, minimum share, and maximum share.
- YARN-877.
Major sub-task reported by Junping Du and fixed by Junping Du (scheduler)
Allow for black-listing resources in FifoScheduler
YARN-750 already addressed blacklisting in the YARN API and the CS scheduler; this JIRA adds the implementation for the FifoScheduler.
- YARN-875.
Major bug reported by Bikas Saha and fixed by Xuan Gong
Application can hang if AMRMClientAsync callback thread has exception
Currently that thread will die and then never call back, so the app can hang. A possible solution could be to catch Throwable in the callback and then call client.onError().
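An illustrative sketch of that proposed fix (these are stand-in interfaces, not the actual AMRMClientAsync classes): the heartbeat/callback thread wraps every handler invocation so that an exception in user code is routed to onError() instead of silently killing the thread.
{code}
// Stand-ins for the real callback handler; only the error-routing pattern matters.
interface CallbackHandler {
  void onHeartbeat();          // placeholder for the real callback methods
  void onError(Throwable t);   // placeholder for the error notification
}

class CallbackThread extends Thread {
  private final CallbackHandler handler;
  private volatile boolean stopped = false;

  CallbackThread(CallbackHandler handler) {
    this.handler = handler;
  }

  @Override
  public void run() {
    while (!stopped) {
      try {
        handler.onHeartbeat();
      } catch (Throwable t) {
        // Without this catch the thread dies silently and the AM can hang.
        handler.onError(t);
        stopped = true;
      }
    }
  }
}
{code}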
- YARN-874.
Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
Tracking YARN/MR test failures after HADOOP-9421 and YARN-827
HADOOP-9421 and YARN-827 broke some YARN/MR tests. Tracking those..
- YARN-873.
Major sub-task reported by Bikas Saha and fixed by Xuan Gong
YARNClient.getApplicationReport(unknownAppId) returns a null report
How can the client find out that the app does not exist?
- YARN-869.
Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
ResourceManagerAdministrationProtocol should neither be public(yet) nor in yarn.api
This is an admin-only API that we don't yet know whether people can or should write new tools against. I am going to move it to yarn.server.api and make it @Private.
- YARN-866.
Major test reported by Wei Yan and fixed by Wei Yan
Add test for class ResourceWeights
Add test case for the class ResourceWeights
- YARN-865.
Major improvement reported by Xuan Gong and fixed by Xuan Gong
RM webservices can't query based on application Types
The ResourceManager web service API to get the list of apps doesn't have a query parameter for appTypes.
- YARN-861.
Critical bug reported by Devaraj K and fixed by Vinod Kumar Vavilapalli (nodemanager)
TestContainerManager is failing
https://builds.apache.org/job/Hadoop-Yarn-trunk/246/
{code:xml}
Running org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.249 sec <<< FAILURE!
testContainerManagerInitialization(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager) Time elapsed: 286 sec <<< FAILURE!
junit.framework.ComparisonFailure: expected:<[asf009.sp2.ygridcore.ne]t> but was:<[localhos]t>
at junit.framework.Assert.assertEquals(Assert.java:85)
{code}
- YARN-854.
Blocker bug reported by Ramya Sunil and fixed by Omkar Vinit Joshi
App submission fails on secure deploy
App submission on secure cluster fails with the following exception:
{noformat}
INFO mapreduce.Job: Job jobID failed with state FAILED due to: Application applicationID failed 2 times due to AM Container for appattemptID exited with exitCode: -1000 due to: App initialization failed (255) with output: main : command provided 0
main : user is qa_user
javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. [Caused by org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): DIGEST-MD5: digest response format violation. Mismatched response.]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:65)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:348)
Caused by: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): DIGEST-MD5: digest response format violation. Mismatched response.
at org.apache.hadoop.ipc.Client.call(Client.java:1298)
at org.apache.hadoop.ipc.Client.call(Client.java:1250)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:204)
at $Proxy7.heartbeat(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
... 3 more
.Failing this attempt.. Failing the application.
{noformat}
- YARN-853.
Major bug reported by Devaraj K and fixed by Devaraj K (capacityscheduler)
maximum-am-resource-percent doesn't work after refreshQueues command
If we update the yarn.scheduler.capacity.maximum-am-resource-percent / yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent configuration and then do the refreshNodes, it uses the new config value to calculate Max Active Applications and Max Active Application Per User. If we add a new node after issuing the 'rmadmin -refreshQueues' command, it uses the old maximum-am-resource-percent config value to calculate Max Active Applications and Max Active Application Per User.
- YARN-852.
Minor bug reported by Chuan Liu and fixed by Chuan Liu
TestAggregatedLogFormat.testContainerLogsFileAccess fails on Windows
The YARN unit test case fails on Windows when comparing the expected message with the log message in the file. The expected message constructed in the test case has two problems: 1) it uses Path.separator to concatenate the path string. Path.separator is always a forward slash, which does not match the backslash used in the log message. 2) On Windows, the default file owner is the Administrators group if the file is created by an Administrators user. The test expects the owner to be the current user.
- YARN-851.
Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Share NMTokens using NMTokenCache (api-based) instead of the memory-based approach used currently.
This is a follow-up ticket for YARN-694, changing the way NMTokens are shared.
- YARN-850.
Major sub-task reported by Jian He and fixed by Jian He
Rename getClusterAvailableResources to getAvailableResources in AMRMClients
- YARN-848.
Major bug reported by Hitesh Shah and fixed by Hitesh Shah
Nodemanager does not register with RM using the fully qualified hostname
If the hostname is misconfigured to not be fully qualified ( i.e. hostname returns foo and hostname -f returns foo.bar.xyz ), the NM ends up registering with the RM using only "foo". This can create problems if DNS cannot resolve the hostname properly.
Furthermore, HDFS uses fully qualified hostnames which can end up affecting locality matches when allocating containers based on block locations.
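A small stand-alone check that shows the difference the report is talking about; the real NM resolves its name through Hadoop utilities, so this is only an illustration of short vs. fully qualified names.
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameCheck {
  public static void main(String[] args) throws UnknownHostException {
    InetAddress addr = InetAddress.getLocalHost();
    System.out.println("short name : " + addr.getHostName());
    // getCanonicalHostName() does a reverse lookup and returns the FQDN when
    // DNS is configured correctly -- the form the NM should register with.
    System.out.println("canonical  : " + addr.getCanonicalHostName());
  }
}
{code}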
- YARN-846.
Major sub-task reported by Jian He and fixed by Jian He
Move pb Impl from yarn-api to yarn-common
- YARN-845.
Major sub-task reported by Arpit Gupta and fixed by Mayank Bansal (resourcemanager)
RM crash with NPE on NODE_UPDATE
The following stack trace is generated in the RM:
{code}
n, service: 68.142.246.147:45454 }, ] resource=<memory:1536, vCores:1> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:44544, vCores:29>usedCapacity=0.90625, absoluteUsedCapacity=0.90625, numApps=1, numContainers=29 usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=<memory:44544, vCores:29> cluster=<memory:49152, vCores:48>
2013-06-17 12:43:53,655 INFO capacity.ParentQueue (ParentQueue.java:completedContainer(696)) - completedContainer queue=root usedCapacity=0.90625 absoluteUsedCapacity=0.90625 used=<memory:44544, vCores:29> cluster=<memory:49152, vCores:48>
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(832)) - Application appattempt_1371448527090_0844_000001 released container container_1371448527090_0844_01_000005 on node: host: hostXX:45454 #containers=4 available=2048 used=6144 with event: FINISHED
2013-06-17 12:43:53,656 INFO capacity.CapacityScheduler (CapacityScheduler.java:nodeUpdate(661)) - Trying to fulfill reservation for application application_1371448527090_0844 on node: hostXX:45454
2013-06-17 12:43:53,656 INFO fica.FiCaSchedulerApp (FiCaSchedulerApp.java:unreserve(435)) - Application application_1371448527090_0844 unreserved on node host: hostXX:45454 #containers=4 available=2048 used=6144, currently has 4 at priority 20; currentReservation <memory:6144, vCores:4>
2013-06-17 12:43:53,656 INFO scheduler.AppSchedulingInfo (AppSchedulingInfo.java:updateResourceRequests(168)) - checking for deactivate...
2013-06-17 12:43:53,657 FATAL resourcemanager.ResourceManager (ResourceManager.java:run(422)) - Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.unreserve(FiCaSchedulerApp.java:432)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.unreserve(LeafQueue.java:1416)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainer(LeafQueue.java:1346)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignOffSwitchContainers(LeafQueue.java:1221)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainersOnNode(LeafQueue.java:1180)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignReservedContainer(LeafQueue.java:939)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:803)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:665)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:727)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:83)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:413)
at java.lang.Thread.run(Thread.java:662)
2013-06-17 12:43:53,659 INFO resourcemanager.ResourceManager (ResourceManager.java:run(426)) - Exiting, bbye..
2013-06-17 12:43:53,665 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped SelectChannelConnector@hostXX:8088
2013-06-17 12:43:53,765 ERROR delegation.AbstractDelegationTokenSecretManager (AbstractDelegationTokenSecretManager.java:run(513)) - InterruptedExcpetion recieved for ExpiredTokenRemover thread java.lang.InterruptedException: sleep interrupted
2013-06-17 12:43:53,766 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(200)) - Stopping ResourceManager metrics system...
2013-06-17 12:43:53,767 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(206)) - ResourceManager metrics system stopped.
2013-06-17 12:43:53,767 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(572)) - ResourceManager metrics system shutdown complete.
2013-06-17 12:43:53,768 WARN amlauncher.ApplicationMasterLauncher (ApplicationMasterLauncher.java:run(98)) - org.apache.hadoop.yarn.server.resourcemanager.amlauncher.ApplicationMasterLauncher$LauncherThread interrupted. Returning.
2013-06-17 12:43:53,768 INFO ipc.Server (Server.java:stop(2167)) - Stopping server on 8033
2013-06-17 12:43:53,770 INFO ipc.Server (Server.java:run(686)) - Stopping IPC Server listener on 8033
2013-06-17 12:43:53,770 INFO ipc.Server (Server.java:stop(2167)) - Stopping server on 8032
2013-06-17 12:43:53,770 INFO ipc.Server (Server.java:run(828)) - Stopping IPC Server Responder
2013-06-17 12:43:53,771 INFO ipc.Server (Server.java:run(686)) - Stopping IPC Server listener on 8032
2013-06-17 12:43:53,771 INFO ipc.Server (Server.java:run(828)) - Stopping IPC Server Responder
2013-06-17 12:43:53,771 INFO ipc.Server (Server.java:stop(2167)) - Stopping server on 8030
2013-06-17 12:43:53,773 INFO ipc.Server (Server.java:run(686)) - Stopping IPC Server listener on 8030
2013-06-17 12:43:53,773 INFO ipc.Server (Server.java:stop(2167)) - Stopping server on 8031
2013-06-17 12:43:53,773 INFO ipc.Server (Server.java:run(828)) - Stopping IPC Server Responder
2013-06-17 12:43:53,774 INFO ipc.Server (Server.java:run(686)) - Stopping IPC Server listener on 8031
2013-06-17 12:43:53,775 INFO ipc.Server (Server.java:run(828)) - Stopping IPC Server Responder
{code}
- YARN-841.
Major sub-task reported by Siddharth Seth and fixed by Vinod Kumar Vavilapalli
Annotate and document AuxService APIs
For users writing their own AuxServices, these APIs should be annotated and need better documentation. Also, the classes may need to move out of the NodeManager.
- YARN-840.
Major sub-task reported by Jian He and fixed by Jian He
Move ProtoUtils to yarn.api.records.pb.impl
- YARN-839.
Minor bug reported by Chuan Liu and fixed by Chuan Liu
TestContainerLaunch.testContainerEnvVariables fails on Windows
The unit test case fails on Windows because the job ID or container ID is not printed out as part of the container script. Later, the test tries to read the pid from the output file, and fails.
Exception in trunk:
{noformat}
Running org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.903 sec <<< FAILURE!
testContainerEnvVariables(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) Time elapsed: 1307 sec <<< ERROR!
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testContainerEnvVariables(TestContainerLaunch.java:278)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
{noformat}
- YARN-837.
Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen
ClusterInfo.java doesn't seem to belong to org.apache.hadoop.yarn
- YARN-834.
Blocker sub-task reported by Arun C Murthy and fixed by Zhijie Shen
Review/fix annotations for yarn-client module and clearly differentiate *Async apis
Review/fix annotations for yarn-client module
- YARN-833.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
Move Graph and VisualizeStateMachine into yarn.state package
Graph and VisualizeStateMachine are only used by the state machine; they should belong to the state package.
- YARN-831.
Blocker sub-task reported by Jian He and fixed by Jian He
Remove resource min from GetNewApplicationResponse
- YARN-829.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
Rename RMTokenSelector to be RMDelegationTokenSelector
Its name will then be consistent with that of RMDelegationTokenIdentifier.
- YARN-828.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
Remove YarnVersionAnnotation
YarnVersionAnnotation is not used at all, and the version information can be accessed through YarnVersionInfo instead.
- YARN-827.
Critical sub-task reported by Bikas Saha and fixed by Jian He
Need to make Resource arithmetic methods accessible
org.apache.hadoop.yarn.server.resourcemanager.resource has stuff like Resources and Calculators that help compare/add resources, etc. Without these, users will be forced to replicate the logic, potentially incorrectly.
- YARN-826.
Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen
Move Clock/SystemClock to util package
Clock/SystemClock should belong to util.
- YARN-825.
Blocker sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
Fix yarn-common javadoc annotations
- YARN-824.
Major sub-task reported by Jian He and fixed by Jian He
Add static factory to yarn client lib interface and change it to abstract class
Do this for AMRMClient, NMClient, and YarnClient, and annotate their impls as private.
The purpose is not to expose the impls.
- YARN-823.
Major sub-task reported by Jian He and fixed by Jian He
Move RMAdmin from yarn.client to yarn.client.cli and rename as RMAdminCLI
- YARN-822.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Rename ApplicationToken to AMRMToken
API change. At present this token is used on the scheduler API AMRMProtocol. The current name is a little confusing, as it suggests the token might be used by the application to talk to the complete YARN system (RM/NM), but that is not the case after YARN-694. The NM will have a specific NMToken, so it is better to name this one AMRMToken.
- YARN-821.
Major sub-task reported by Jian He and fixed by Jian He
Rename FinishApplicationMasterRequest.setFinishApplicationStatus to setFinalApplicationStatus to be consistent with getter
- YARN-820.
Major sub-task reported by Bikas Saha and fixed by Mayank Bansal
NodeManager has invalid state transition after error in resource localization
- YARN-814.
Major sub-task reported by Hitesh Shah and fixed by Jian He
Difficult to diagnose a failed container launch when error due to invalid environment variable
The container's launch script sets up environment variables, symlinks etc.
If there is any failure when setting up the basic context (before the actual user's process is launched), nothing is captured by the NM. This makes it impossible to diagnose the reason for the failure.
To reproduce, set an env var where the value contains characters that throw syntax errors in bash.
- YARN-812.
Major bug reported by Ramya Sunil and fixed by Siddharth Seth
Enabling app summary logs causes 'FileNotFound' errors
RM app summary logs have been enabled as per the default config:
{noformat}
#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
# - hadoop.log.dir (Hadoop Log directory)
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false
log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender
log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file}
log4j.appender.RMSUMMARY.MaxFileSize=256MB
log4j.appender.RMSUMMARY.MaxBackupIndex=20
log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout
log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
{noformat}
This, however, throws errors when running commands as a non-superuser:
{noformat}
-bash-4.1$ hadoop dfs -ls /
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /var/log/hadoop/hadoopqa/rm-appsummary.log (No such file or directory)
at java.io.FileOutputStream.openAppend(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:192)
at java.io.FileOutputStream.<init>(FileOutputStream.java:116)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
at org.apache.log4j.Logger.getLogger(Logger.java:104)
at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:858)
at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
at org.apache.hadoop.fs.FsShell.<clinit>(FsShell.java:41)
Found 1 items
drwxr-xr-x - hadoop hadoop 0 2013-06-12 21:28 /user
{noformat}
- YARN-806.
Major sub-task reported by Jian He and fixed by Jian He
Move ContainerExitStatus from yarn.api to yarn.api.records
- YARN-805.
Blocker sub-task reported by Jian He and fixed by Jian He
Fix yarn-api javadoc annotations
- YARN-803.
Major improvement reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (resourcemanager , scheduler)
factor out scheduler config validation from the ResourceManager to each scheduler implementation
Per discussion in YARN-789 we should factor out from the ResourceManager class the scheduler config validations.
- YARN-799.
Major bug reported by Chris Riccomini and fixed by Chris Riccomini (nodemanager)
CgroupsLCEResourcesHandler tries to write to cgroup.procs
The implementation of
bq. ./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
tells the container-executor to write PIDs to cgroup.procs:
{code}
public String getResourcesOption(ContainerId containerId) {
  String containerName = containerId.toString();
  StringBuilder sb = new StringBuilder("cgroups=");
  if (isCpuWeightEnabled()) {
    sb.append(pathForCgroup(CONTROLLER_CPU, containerName) + "/cgroup.procs");
    sb.append(",");
  }
  if (sb.charAt(sb.length() - 1) == ',') {
    sb.deleteCharAt(sb.length() - 1);
  }
  return sb.toString();
}
{code}
Apparently, this file has not always been writeable:
https://patchwork.kernel.org/patch/116146/
http://lkml.indiana.edu/hypermail/linux/kernel/1004.1/00536.html
https://lists.linux-foundation.org/pipermail/containers/2009-July/019679.html
The RHEL version of the Linux kernel that I'm using has a CGroup module that has a non-writeable cgroup.procs file.
{quote}
$ uname -a
Linux criccomi-ld 2.6.32-131.4.1.el6.x86_64 #1 SMP Fri Jun 10 10:54:26 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
{quote}
As a result, when the container-executor tries to run, it fails with this error message:
bq. fprintf(LOGFILE, "Failed to write pid %s (%d) to file %s - %s\n",
This is because the executor is given a resource by the CgroupsLCEResourcesHandler that includes cgroup.procs, which is non-writeable:
{quote}
$ pwd
/cgroup/cpu/hadoop-yarn/container_1370986842149_0001_01_000001
$ ls -l
total 0
-r--r--r-- 1 criccomi eng 0 Jun 11 14:43 cgroup.procs
-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.rt_period_us
-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.rt_runtime_us
-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.shares
-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 notify_on_release
-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 tasks
{quote}
I patched CgroupsLCEResourcesHandler to use /tasks instead of /cgroup.procs, and this appears to have fixed the problem.
I can think of several potential resolutions to this ticket:
1. Ignore the problem, and make people patch YARN when they hit this issue.
2. Write to /tasks instead of /cgroup.procs for everyone.
3. Check permissions on /cgroup.procs prior to writing to it, and fall back to /tasks (see the sketch below).
4. Add a config to yarn-site that lets admins specify which file to write to.
Thoughts?
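A minimal sketch of resolution 3, assuming plain filesystem permission checks; this is not the actual CgroupsLCEResourcesHandler code and the paths are illustrative.
{code}
import java.io.File;

public class CgroupTaskFileChooser {
  // Prefer cgroup.procs when the kernel exposes it as writable; otherwise fall
  // back to the per-thread "tasks" file (older kernels such as RHEL 6's 2.6.32
  // ship a read-only cgroup.procs).
  static String taskFileFor(String containerCgroupDir) {
    File procs = new File(containerCgroupDir, "cgroup.procs");
    if (procs.exists() && procs.canWrite()) {
      return procs.getAbsolutePath();
    }
    return new File(containerCgroupDir, "tasks").getAbsolutePath();
  }

  public static void main(String[] args) {
    // Pass the container's cgroup directory, e.g. /cgroup/cpu/hadoop-yarn/<container-id>
    String dir = args.length > 0 ? args[0] : "/cgroup/cpu/hadoop-yarn";
    System.out.println(taskFileFor(dir));
  }
}
{code}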
- YARN-795.
Major bug reported by Wei Yan and fixed by Wei Yan (scheduler)
Fair scheduler queue metrics should subtract allocated vCores from available vCores
The fair scheduler's queue metrics don't subtract allocated vCores from available vCores, causing the available vCores returned to be incorrect.
This is happening because {code}QueueMetrics.getAllocateResources(){code} doesn't return the allocated vCores.
- YARN-792.
Major sub-task reported by Jian He and fixed by Jian He
Move NodeHealthStatus from yarn.api.record to yarn.server.api.record
- YARN-791.
Blocker sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api , resourcemanager)
Ensure that RM RPC APIs that return nodes are consistent with /nodes REST API
- YARN-789.
Major improvement reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (scheduler)
Enable zero capabilities resource requests in fair scheduler
Per discussion in YARN-689, reposting updated use case:
1. I have a set of services co-existing with a Yarn cluster.
2. These services run out of band from Yarn. They are not started as yarn containers and they don't use Yarn containers for processing.
3. These services use, dynamically, different amounts of CPU and memory based on their load. They manage their CPU and memory requirements independently. In other words, depending on their load, they may require more CPU but not memory or vice-versa.
By using YARN as the RM for these services I'm able to share and utilize the resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on all the resources.
These services run an AM that reserves resources on their behalf. When this AM gets the requested resources, the services bump up their CPU/memory utilization out of band from Yarn. If the Yarn allocations are released/preempted, the services back off on their resource utilization. By doing this, Yarn and these services correctly share the cluster resources, with the Yarn RM being the only one that does the overall resource bookkeeping.
The services' AM, so as not to break the container lifecycle, starts containers in the corresponding NMs. These container processes basically sleep forever (i.e. sleep 10000d). They use almost no CPU or memory (less than 1MB). Thus it is reasonable to assume their required CPU and memory utilization is NIL (more on hard enforcement later). Because of this almost-NIL utilization of CPU and memory, it is possible to specify, when doing a request, zero as one of the dimensions (CPU or memory).
The current limitation is that the increment is also the minimum.
If we set the memory increment to 1MB, then when doing a pure CPU request we would have to specify 1MB of memory. That would work. However, it would allow discretionary memory requests without the desired normalization (increments of 256, 512, etc).
If we set the CPU increment to 1 CPU, then when doing a pure memory request we would have to specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we don't have fractional CPUs, it would mean that all my pure memory requests would waste 1 CPU, thus reducing the overall utilization of the cluster.
Finally, on hard enforcement.
* For CPU: hard enforcement can be done via a cgroup cpu controller. Using an absolute minimum of a few CPU shares (i.e. 10) in the LinuxContainerExecutor, we ensure there are enough CPU cycles to run the sleep process. This absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the shares for 1 CPU are 1024.
* For memory: hard enforcement is currently done by ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MB would take care of zero memory resources. And again, this absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the increment memory is several MBs if not 1GB.
- YARN-787.
Blocker sub-task reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (api)
Remove resource min from Yarn client API
Per discussions in YARN-689 and YARN-769 we should remove minimum from the API as this is a scheduler internal thing.
- YARN-782.
Critical improvement reported by Sandy Ryza and fixed by Sandy Ryza (nodemanager)
vcores-pcores ratio functions differently from vmem-pmem ratio in misleading way
The vcores-pcores ratio functions differently from the vmem-pmem ratio in the sense that the vcores-pcores ratio has an impact on allocations and the vmem-pmem ratio does not.
If I double my vmem-pmem ratio, the only change that occurs is that my containers, after being scheduled, are less likely to be killed for using too much virtual memory. But if I double my vcore-pcore ratio, my nodes will appear to the ResourceManager to contain double the amount of CPU space, which will affect scheduling decisions.
The lack of consistency will exacerbate the already difficult problem of resource configuration.
- YARN-781.
Major sub-task reported by Devaraj Das and fixed by Jian He
Expose LOGDIR that containers should use for logging
The LOGDIR is known. We should expose this to the container's environment.
- YARN-777.
Major sub-task reported by Jian He and fixed by Jian He
Remove unreferenced objects from proto
- YARN-773.
Major sub-task reported by Jian He and fixed by Jian He
Move YarnRuntimeException from package api.yarn to api.yarn.exceptions
- YARN-767.
Major bug reported by Jian He and fixed by Jian He
Initialize Application status metrics when QueueMetrics is initialized
Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed
For now these metrics are created only when they are needed; we want them to be visible when QueueMetrics is initialized.
- YARN-764.
Major bug reported by nemon lou and fixed by nemon lou (resourcemanager)
blank Used Resources on Capacity Scheduler page
Even when there are jobs running, Used Resources is empty on the Capacity Scheduler page for a leaf queue. (I use Google Chrome on Windows 7.)
After changing Resource.java's toString method to replace "<>" with "{}", this bug gets fixed.
- YARN-763.
Major bug reported by Bikas Saha and fixed by Xuan Gong
AMRMClientAsync should stop heartbeating after receiving shutdown from RM
- YARN-761.
Major bug reported by Vinod Kumar Vavilapalli and fixed by Zhijie Shen
TestNMClientAsync fails sometimes
See https://builds.apache.org/job/PreCommit-YARN-Build/1101//testReport/.
It passed on my machine though.
- YARN-760.
Major bug reported by Sandy Ryza and fixed by Niranjan Singh (nodemanager)
NodeManager throws AvroRuntimeException on failed start
NodeManager wraps exceptions that occur in its start method in AvroRuntimeExceptions, even though it doesn't use Avro anywhere else.
- YARN-759.
Major sub-task reported by Bikas Saha and fixed by Bikas Saha
Create Command enum in AllocateResponse
Use command enums for shutdown/resync instead of booleans.
- YARN-757.
Blocker bug reported by Bikas Saha and fixed by Bikas Saha
TestRMRestart failing/stuck on trunk
- YARN-756.
Major sub-task reported by Jian He and fixed by Jian He
Move PreemptionContainer/PreemptionContract/PreemptionMessage/StrictPreemptionContract/PreemptionResourceRequest to api.records
- YARN-755.
Major sub-task reported by Bikas Saha and fixed by Bikas Saha
Rename AllocateResponse.reboot to AllocateResponse.resync
For work-preserving RM restart the AMs will be resyncing instead of rebooting. Rebooting is an action that currently satisfies the resync requirement. Changing the name now so that it continues to make sense in the real resync case.
- YARN-753.
Major sub-task reported by Jian He and fixed by Jian He
Add individual factory method for api protocol records
- YARN-752.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (api , applications)
In AMRMClient, automatically add corresponding rack requests for requested nodes
A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on. When a node is present without its rack, it makes sense for the client to automatically add the node's rack.
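An illustrative sketch of the behavior being added; the resolver below is a hypothetical stand-in (the real client uses Hadoop's topology resolution). Given node-level asks, it derives the set of racks that must accompany them.
{code}
import java.util.LinkedHashSet;
import java.util.Set;

public class RackExpansion {
  // Hypothetical stand-in for topology resolution; a real implementation would
  // map "node1.example.com" to something like "/rack1".
  static String resolveRack(String node) {
    return "/default-rack";
  }

  // For each requested node, collect the rack it lives on so that a matching
  // rack-level request can be added alongside the node-level one.
  static Set<String> racksFor(String[] nodes) {
    Set<String> racks = new LinkedHashSet<String>();
    if (nodes != null) {
      for (String node : nodes) {
        racks.add(resolveRack(node));
      }
    }
    return racks;
  }

  public static void main(String[] args) {
    System.out.println(racksFor(new String[] {"node1", "node2"}));
  }
}
{code}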
- YARN-750.
Major sub-task reported by Arun C Murthy and fixed by Arun C Murthy
Allow for black-listing resources in YARN API and Impl in CS
YARN-392 and YARN-398 enhance the scheduler API to allow for white-lists of resources.
This jira is a companion to allow for black-listing (in CS).
- YARN-749.
Major sub-task reported by Arun C Murthy and fixed by Arun C Murthy
Rename ResourceRequest (get,set)HostName to (get,set)ResourceName
We should rename ResourceRequest (get,set)HostName to (get,set)ResourceName since the name can be host, rack or *.
- YARN-748.
Major sub-task reported by Jian He and fixed by Jian He
Move BuilderUtils from yarn-common to yarn-server-common
- YARN-746.
Major sub-task reported by Steve Loughran and fixed by Steve Loughran
rename Service.register() and Service.unregister() to registerServiceListener() & unregisterServiceListener() respectively
Make it clear what you are registering on a {{Service}} by naming the methods {{registerServiceListener()}} and {{unregisterServiceListener()}} respectively.
This only affects a couple of production classes; {{Service.register()}} is used in some of the lifecycle tests of YARN-530. There are no tests of {{Service.unregister()}}, which is something that could be corrected.
- YARN-742.
Major bug reported by Kihwal Lee and fixed by Jason Lowe (nodemanager)
Log aggregation causes a lot of redundant setPermission calls
In one of our clusters, namenode RPC is spending 45% of its time on serving setPermission calls. Further investigation has revealed that most calls are redundantly made on /mapred/logs/<user>/logs. Also mkdirs calls are made before this.
- YARN-739.
Major sub-task reported by Siddharth Seth and fixed by Omkar Vinit Joshi
NM startContainer should validate the NodeId
The NM validates certain fields from the ContainerToken on a startContainer call. It should also validate the NodeId (which needs to be added to the ContainerToken).
- YARN-737.
Major sub-task reported by Jian He and fixed by Jian He
Some Exceptions no longer need to be wrapped by YarnException and can be directly thrown out after YARN-142
- YARN-736.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Add a multi-resource fair sharing metric
Currently, at a regular interval, the fair scheduler computes a fair memory share for each queue and application inside it. This fair share is not used for scheduling decisions, but is displayed in the web UI, exposed as a metric, and used for preemption decisions.
With DRF and multi-resource scheduling, assigning a memory share as the fair share metric to every queue no longer makes sense. It's not obvious what the replacement should be, but probably something like fractional fairness within a queue, or distance from an ideal cluster state.
- YARN-735.
Major sub-task reported by Jian He and fixed by Jian He
Make ApplicationAttemptID, ContainerID, NodeID immutable
- YARN-733.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
TestNMClient fails occasionally
The problem happens at:
{code}
// getContainerStatus can be called after stopContainer
try {
  ContainerStatus status = nmClient.getContainerStatus(
      container.getId(), container.getNodeId(),
      container.getContainerToken());
  assertEquals(container.getId(), status.getContainerId());
  assertEquals(ContainerState.RUNNING, status.getState());
  assertTrue("" + i, status.getDiagnostics().contains(
      "Container killed by the ApplicationMaster."));
  assertEquals(-1000, status.getExitStatus());
} catch (YarnRemoteException e) {
  fail("Exception is not expected");
}
{code}
NMClientImpl#stopContainer returns, but the container hasn't been stopped immediately. ContainerManagerImpl implements stopContainer in an async style, so the container's status is in transition, and NMClientImpl#getContainerStatus immediately after stopContainer will get either the RUNNING status or the COMPLETE one.
There will be a similar problem wrt NMClientImpl#startContainer.
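One way a test can cope with the asynchronous stop described above is to poll for the terminal state rather than assert on the first status. A generic sketch; the status supplier is a stand-in for NMClient#getContainerStatus.
{code}
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public class PollUntilComplete {
  enum State { NEW, RUNNING, COMPLETE }

  // Poll the supplied status until it reaches COMPLETE or the timeout expires.
  static boolean waitForComplete(Callable<State> statusSupplier, long timeoutMs)
      throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (statusSupplier.call() == State.COMPLETE) {
        return true;
      }
      TimeUnit.MILLISECONDS.sleep(100);
    }
    return false;
  }
}
{code}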
- YARN-731.
Major sub-task reported by Siddharth Seth and fixed by Zhijie Shen
RPCUtil.unwrapAndThrowException should unwrap remote RuntimeExceptions
Will be required for YARN-662. Also, remote NPEs show up incorrectly for some unit tests.
- YARN-727.
Blocker sub-task reported by Siddharth Seth and fixed by Xuan Gong
ClientRMProtocol.getAllApplications should accept ApplicationType as a parameter
Now that an ApplicationType is registered on ApplicationSubmission, getAllApplications should be able to use this string to query for a specific application type.
- YARN-726.
Critical bug reported by Siddharth Seth and fixed by Mayank Bansal
Queue, FinishTime fields broken on RM UI
The queue shows up as "Invalid Date"
Finish Time shows up as a Long value.
- YARN-724.
Major sub-task reported by Jian He and fixed by Jian He
Move ProtoBase from api.records to api.records.impl.pb
Simply move ProtoBase to records.impl.pb
- YARN-720.
Major sub-task reported by Siddharth Seth and fixed by Zhijie Shen
container-log4j.properties should not refer to mapreduce properties
This refers to yarn.app.mapreduce.container.log.dir and yarn.app.mapreduce.container.log.filesize. These should either be moved into the MR codebase, or the parameters should be renamed.
- YARN-719.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
Move RMIdentifier from Container to ContainerTokenIdentifier
This needs to be done for YARN-684 to happen.
- YARN-717.
Major sub-task reported by Jian He and fixed by Jian He
Copy BuilderUtil methods into token-related records
This is separated from YARN-711, as after changing yarn.api.token from an interface to an abstract class, e.g. ClientTokenPBImpl would have to extend two classes, both TokenPBImpl and the ClientToken abstract class, which is not allowed in Java.
We may remove the ClientToken/ContainerToken/DelegationToken interfaces and just use the common Token interface.
- YARN-716.
Major task reported by Siddharth Seth and fixed by Siddharth Seth
Make ApplicationID immutable
- YARN-715.
Major bug reported by Siddharth Seth and fixed by Vinod Kumar Vavilapalli
TestDistributedShell and TestUnmanagedAMLauncher are failing
Tests are timing out. Looks like this is related to YARN-617.
{code}
2013-05-21 17:40:23,693 ERROR [IPC Server handler 0 on 54024] containermanager.ContainerManagerImpl (ContainerManagerImpl.java:authorizeRequest(412)) - Unauthorized request to start container.
Expected containerId: user Found: container_1369183214008_0001_01_000001
2013-05-21 17:40:23,694 ERROR [IPC Server handler 0 on 54024] security.UserGroupInformation (UserGroupInformation.java:doAs(1492)) - PriviledgedActionException as:user (auth:SIMPLE) cause:org.apache.hado
Expected containerId: user Found: container_1369183214008_0001_01_000001
2013-05-21 17:40:23,695 INFO [IPC Server handler 0 on 54024] ipc.Server (Server.java:run(1864)) - IPC Server handler 0 on 54024, call org.apache.hadoop.yarn.api.ContainerManagerPB.startContainer from 10.
Expected containerId: user Found: container_1369183214008_0001_01_000001
org.apache.hadoop.yarn.exceptions.YarnRemoteException: Unauthorized request to start container.
Expected containerId: user Found: container_1369183214008_0001_01_000001
at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:43)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:413)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:440)
at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:72)
at org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
{code}
- YARN-714.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
AMRM protocol changes for sending NMToken list
An NMToken will be sent to the AM on an allocate call if:
1) the AM doesn't already have an NMToken for the underlying NM, or
2) the key rolled over on the RM and the AM gets a new container on the same NM.
On the allocate call, the RM will send a consolidated list of all required NMTokens.
- YARN-711.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Jian He
Copy BuilderUtil methods into individual records
BuilderUtils is one giant utils class which has all the factory methods needed for creating records. It is painful for users to figure out how to create records. We are better off having the factories in each record, that way users can easily create records.
As a first step, we should just copy all the factory methods into individual classes, deprecate BuilderUtils and then slowly move all code off BuilderUtils.
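A sketch of the factory-per-record pattern described above, using a hypothetical record; the real records are protobuf-backed and created through a record factory, so only the shape of the static newInstance(...) is the point here.
{code}
// Hypothetical record illustrating the per-record factory style that replaces
// a central BuilderUtils: callers write ExampleRecord.newInstance(...) instead
// of BuilderUtils.newExampleRecord(...).
public abstract class ExampleRecord {
  public static ExampleRecord newInstance(final String name, final int value) {
    return new ExampleRecord() {
      @Override public String getName() { return name; }
      @Override public int getValue() { return value; }
    };
  }

  public abstract String getName();
  public abstract int getValue();
}
{code}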
- YARN-708.
Major task reported by Siddharth Seth and fixed by Siddharth Seth
Move RecordFactory classes to hadoop-yarn-api, miscellaneous fixes to the interfaces
This is required for additional changes in YARN-528.
Some of the interfaces could use some cleanup as well - they shouldn't be declaring YarnException (Runtime) in their signature.
- YARN-706.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
Race Condition in TestFSDownload
See the test failure in YARN-695
https://builds.apache.org/job/PreCommit-YARN-Build/957//testReport/org.apache.hadoop.yarn.util/TestFSDownload/testDownloadPatternJar/
- YARN-701.
Blocker sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
ApplicationTokens should be used irrespective of kerberos
- Single code path for secure and non-secure cases is useful for testing, coverage.
- Having this in non-secure mode will help us avoid accidental bugs in AMs DDos'ing and bringing down RM.
- YARN-700.
Major bug reported by Ivan Mitic and fixed by Ivan Mitic
TestInfoBlock fails on Windows because of line ending mismatch
Exception:
{noformat}
Running org.apache.hadoop.yarn.webapp.view.TestInfoBlock
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.962 sec <<< FAILURE!
testMultilineInfoBlock(org.apache.hadoop.yarn.webapp.view.TestInfoBlock) Time elapsed: 873 sec <<< FAILURE!
java.lang.AssertionError:
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertTrue(Assert.java:54)
at org.apache.hadoop.yarn.webapp.view.TestInfoBlock.testMultilineInfoBlock(TestInfoBlock.java:79)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
{noformat}
- YARN-695.
Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen
masterContainer and status are in ApplicationReportProto but not in ApplicationReport
If masterContainer and status are no longer part of ApplicationReport, they should be removed from proto as well.
- YARN-694.
Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Start using NMTokens to authenticate all communication with NM
The AM uses the NMToken to authenticate all AM-NM communication.
The NM will validate the NMToken in the following manner (see the sketch at the end of this entry):
* If the NMToken is using the current or previous master key, then the NMToken is valid. In this case the NM will update its cache with this key for the corresponding appId.
* If the NMToken is using a master key which is present in the NM's cache for the AM's appId, then it will be validated against that.
* If the NMToken is invalid, then the NM will reject the AM's calls.
Modifications for ContainerToken:
* At present RPC validates AM-NM communication based on the ContainerToken. It will be replaced with the NMToken. Also, from now on the AM will use one NMToken per NM (replacing the earlier behavior of one ContainerToken per container per NM).
* startContainer in a secured environment currently uses the ContainerToken from the UGI (YARN-617); after this it will use the one from the payload (Container).
* The ContainerToken will still exist, and it will only be used to validate the AM's container start request.
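The validation rule in the first list above, written out as a hedged sketch with stand-in types; this is not the actual NM token secret-manager logic.
{code}
import java.util.HashMap;
import java.util.Map;

public class NMTokenCheck {
  // Stand-ins: in the real code these are master-key identifiers and an
  // ApplicationId-keyed cache inside the NM's token secret manager.
  private int currentKeyId;
  private int previousKeyId;
  private final Map<String, Integer> keyIdByAppId = new HashMap<String, Integer>();

  boolean isTokenValid(String appId, int tokenKeyId) {
    if (tokenKeyId == currentKeyId || tokenKeyId == previousKeyId) {
      // Valid against a live master key: remember it for this application.
      keyIdByAppId.put(appId, tokenKeyId);
      return true;
    }
    Integer cached = keyIdByAppId.get(appId);
    if (cached != null && cached.intValue() == tokenKeyId) {
      // Older key, but it is the one already accepted for this app.
      return true;
    }
    // Anything else is rejected.
    return false;
  }
}
{code}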
- YARN-693.
Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Sending NMToken to AM on allocate call
This is part of YARN-613.
As per the updated design, the AM will receive a per-NM NMToken in the following scenarios:
* AM is receiving first container on underlying NM.
* AM is receiving container on underlying NM after either NM or RM rebooted.
** After an RM reboot, as the RM doesn't remember (persist) the information about keys issued per AM per NM, it will reissue tokens when the AM gets a new container on the underlying NM. However, on the NM side, the NM will still retain the older token until it receives a new token, to support long-running jobs (in a work-preserving environment).
** After NM reboot, RM will delete the token information corresponding to that AM for all AMs.
* AM is receiving container on underlying NM after NMToken master key is rolled over on RM side.
In all these cases, if the AM receives a new NMToken, it is supposed to store it for future NM communication until it receives a newer one.
AMRMClient should expose these NMTokens to the client.
- YARN-692.
Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Creating NMToken master key on RM and sharing it with NM as a part of RM-NM heartbeat.
This is related to YARN-613. Here we will be implementing NMToken generation on the RM side and sharing it with the NM during the RM-NM heartbeat. As part of this JIRA, the master key will only be made available to the NM, but there will be no validation done until the AM-NM communication is fixed.
- YARN-690.
Blocker bug reported by Daryn Sharp and fixed by Daryn Sharp (resourcemanager)
RM exits on token cancel/renew problems
The DelegationTokenRenewer thread is critical to the RM. When a non-IOException occurs, the thread calls System.exit to prevent the RM from running w/o the thread. It should be exiting only on non-RuntimeExceptions.
The problem is especially bad in 23 because the yarn protobuf layer converts IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which causes the renewer to abort the process. An UnknownHostException takes down the RM...
- YARN-688.
Major bug reported by Jian He and fixed by Jian He
Containers not cleaned up when NM received SHUTDOWN event from NodeStatusUpdater
Currently, both the SHUTDOWN event from NodeStatusUpdater and the CleanupContainers event happen to be on the same dispatcher thread, so the CleanupContainers event will not be processed until the SHUTDOWN event is processed. See the similar problem in YARN-495.
On a normal NM shutdown this is not a problem, since the normal stop happens on the shutdown-hook thread.
- YARN-686.
Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api)
Flatten NodeReport
The NodeReport returned by getClusterNodes or given to AMs in heartbeat responses includes both a NodeState (enum) and a NodeHealthStatus (object). As UNHEALTHY is already a NodeState, a separate NodeHealthStatus doesn't seem necessary. I propose eliminating NodeHealthStatus#getIsNodeHealthy and moving its two other methods, getHealthReport and getLastHealthReportTime, into NodeReport.
- YARN-684.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
ContainerManager.startContainer needs to only have ContainerTokenIdentifier instead of the whole Container
The NM only needs the token, the whole Container is unnecessary.
- YARN-663.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException
- YARN-661.
Major bug reported by Jason Lowe and fixed by Omkar Vinit Joshi (nodemanager)
NM fails to cleanup local directories for users
YARN-71 added deletion of local directories on startup, but in practice it fails to delete the directories because of permission problems. The top-level usercache directory is owned by the user but is in a directory that is not writable by the user. Therefore the deletion of the user's usercache directory, as the user, fails due to lack of permissions.
- YARN-660.
Major sub-task reported by Bikas Saha and fixed by Bikas Saha
Improve AMRMClient with matching requests
- YARN-656.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)
In scheduler UI, including reserved memory in "Memory Total" can make it exceed cluster capacity.
"Memory Total" is currently a sum of availableMB, allocatedMB, and reservedMB. Including reservedMB in this sum can make the total exceed the capacity of the cluster.
- YARN-655.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Fair scheduler metrics should subtract allocated memory from available memory
In the scheduler web UI, cluster metrics reports that the "Memory Total" goes up when an application is allocated resources.
- YARN-654.
Major bug reported by Bikas Saha and fixed by Xuan Gong
AMRMClient: Perform sanity checks for parameters of public methods
- YARN-651.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change ContainerManagerPBClientImpl and RMAdminProtocolPBClientImpl to throw IOException and YarnRemoteException
YARN-632 and YARN-633 change the RMAdmin and ContainerManager APIs to throw YarnRemoteException and IOException. RMAdminProtocolPBClientImpl and ContainerManagerPBClientImpl should make the same changes.
- YARN-648.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)
FS: Add documentation for pluggable policy
YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this.
- YARN-646.
Major bug reported by Dapeng Sun and fixed by Dapeng Sun (documentation)
Some issues in Fair Scheduler's document
Issues are found in the doc page for Fair Scheduler http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html:
1. In the section “Configuration”, there are two properties named “yarn.scheduler.fair.minimum-allocation-mb”; the second one should be “yarn.scheduler.fair.maximum-allocation-mb”.
2. In the section “Allocation file format”, the document says “The format contains three types of elements”, but it then lists four types of elements.
- YARN-645.
Major bug reported by Jian He and fixed by Jian He
Move RMDelegationTokenSecretManager from yarn-server-common to yarn-server-resourcemanager
RMDelegationTokenSecretManager is specific to the ResourceManager and should not belong to server-common.
- YARN-642.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (api , resourcemanager)
Fix up /nodes REST API to have 1 param and be consistent with the Java API
The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.
- YARN-639.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen (applications/distributed-shell)
Make AM of Distributed Shell Use NMClient
YARN-422 adds NMClient. AM of Distributed Shell should use it instead of using ContainerManager directly.
- YARN-638.
Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
Restore RMDelegationTokens after RM Restart
This was missed in YARN-581. After an RM restart, RMDelegationTokens need to be added both to the DelegationTokenRenewer (addressed in YARN-581) and to the delegationTokenSecretManager.
- YARN-637.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)
FS: maxAssign is not honored
maxAssign limits the number of containers that can be assigned in a single heartbeat. Currently, FS doesn't keep track of the number of assigned containers to check this.
- YARN-635.
Major sub-task reported by Xuan Gong and fixed by Siddharth Seth
Rename YarnRemoteException to YarnException
- YARN-634.
Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth
Make YarnRemoteException not backed by PB and introduce a SerializedException
LocalizationProtocol sends an exception over the wire. This currently uses YarnRemoteException. Post YARN-627, this needs to be changed and a new serialized exception is required.
- YARN-633.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change RMAdminProtocol api to throw IOException and YarnRemoteException
- YARN-632.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change ContainerManager api to throw IOException and YarnRemoteException
- YARN-631.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change ClientRMProtocol api to throw IOException and YarnRemoteException
- YARN-630.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Change AMRMProtocol api to throw IOException and YarnRemoteException
- YARN-629.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Make YarnRemoteException not be rooted at IOException
After HADOOP-9343, it should be possible for YarnException to not be rooted at IOException
- YARN-628.
Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth
Fix YarnException unwrapping
Unwrapping of YarnRemoteExceptions (currently in YarnRemoteExceptionPBImpl, RPCUtil post YARN-625) is broken, and often ends up throwing UndeclaredThrowableException. This needs to be fixed.
- YARN-625.
Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth
Move unwrapAndThrowException from YarnRemoteExceptionPBImpl to RPCUtil
- YARN-618.
Major bug reported by Jian He and fixed by Jian He
Modify RM_INVALID_IDENTIFIER to a -ve number
RM_INVALID_IDENTIFIER set to 0 doesn't seem right, as many tests set it to 0. A negative number is probably what we want.
- YARN-617.
Minor sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi
In unsecure mode, AM can fake resource requirements
Without security, it is impossible to completely avoid AMs faking resources. We can at the least make it as difficult as possible by using the same container tokens and the RM-NM shared key mechanism over unauthenticated RM-NM channel.
At a minimum, this will avoid accidental bugs in AMs in unsecure mode.
- YARN-615.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
ContainerLaunchContext.containerTokens should simply be called tokens
ContainerToken is the name of the specific token that AMs use to launch containers on NMs, so we should rename CLC.containerTokens to be simply tokens.
- YARN-613.
Major sub-task reported by Bikas Saha and fixed by Omkar Vinit Joshi
Create NM proxy per NM instead of per container
Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container.
- YARN-610.
Blocker sub-task reported by Siddharth Seth and fixed by Omkar Vinit Joshi
ClientToken (ClientToAMToken) should not be set in the environment
Similar to YARN-579, this can be set via ContainerTokens
- YARN-605.
Major bug reported by Hitesh Shah and fixed by Hitesh Shah
Failing unit test in TestNMWebServices when using git for source control
Failed tests in org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices: testNode, testNodeSlash, testNodeDefault, testNodeInfo, testNodeInfoSlash, testNodeInfoDefault, testSingleNodesXML. Each fails with: hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789
- YARN-600.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)
Hook up cgroups CPU settings to the number of virtual cores allocated
YARN-3 introduced CPU isolation and monitoring through cgroups. YARN-2 introduced CPU scheduling in the capacity scheduler, and YARN-326 will introduce it in the fair scheduler. The number of virtual cores allocated to a container should be used to weight the number of cgroups CPU shares given to it.
- YARN-599.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
Refactoring submitApplication in ClientRMService and RMAppManager
Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like it is scheduling an APP_SUBMIT event.
In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after the RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission.
Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: the method is not synchronized. If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation while the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. With the current code flow, however, that exception causes the RMApp instance already in rmContext (belonging to the faster submission) to be rejected.
- YARN-598.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)
Add virtual cores to queue metrics
QueueMetrics includes allocatedMB, availableMB, pendingMB, reservedMB. It should have equivalents for CPU.
- YARN-597.
Major bug reported by Ivan Mitic and fixed by Ivan Mitic
TestFSDownload fails on Windows because of dependencies on tar/gzip/jar tools
{{testDownloadArchive}}, {{testDownloadPatternJar}} and {{testDownloadArchiveZip}} fail with the similar Shell ExitCodeException:
{code}
testDownloadArchiveZip(org.apache.hadoop.yarn.util.TestFSDownload) Time elapsed: 480 sec <<< ERROR!
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /D:/svn/t/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/TestFSDownload: No such file or directory
gzip: 1: No such file or directory
at org.apache.hadoop.util.Shell.runCommand(Shell.java:377)
at org.apache.hadoop.util.Shell.run(Shell.java:292)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:497)
at org.apache.hadoop.yarn.util.TestFSDownload.createZipFile(TestFSDownload.java:225)
at org.apache.hadoop.yarn.util.TestFSDownload.testDownloadArchiveZip(TestFSDownload.java:503)
{code}
- YARN-595.
Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Refactor fair scheduler to use common Resources
resourcemanager.fair and resourcemanager.resources have two copies of basically the same code for operations on Resource objects
- YARN-594.
Major bug reported by Jian He and fixed by Jian He
Update test and add comments in YARN-534
This jira is simply to add some comments in the patch YARN-534 and update the test case
- YARN-593.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)
container launch on Windows does not correctly populate classpath with new process's environment variables and localized resources
On Windows, we must bundle the classpath of a launched container in an intermediate jar with a manifest. Currently, this logic incorrectly uses the nodemanager process's environment variables for substitution. Instead, it needs to use the new environment for the launched process. Also, the bundled classpath is missing some localized resources for directories, due to a quirk in the way {{File#toURI}} decides whether or not to append a trailing '/'.
- YARN-591.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
RM recovery related records do not belong to the API
We need to move ApplicationStateData and ApplicationAttemptStateData into the resourcemanager module. They are not part of the public API.
- YARN-590.
Major improvement reported by Vinod Kumar Vavilapalli and fixed by Mayank Bansal
Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
We should log such a message in the NM itself. This helps in debugging issues on the NM directly, instead of distributed debugging between the RM and NM when such an action is received from the RM.
- YARN-586.
Trivial bug reported by Zhijie Shen and fixed by Zhijie Shen
Typo in ApplicationSubmissionContext#setApplicationId
The parameter should be applicationId instead of appplicationId
- YARN-585.
Major bug reported by Zhijie Shen and fixed by Zhijie Shen
TestFairScheduler#testNotAllowSubmitApplication is broken due to YARN-514
TestFairScheduler#testNotAllowSubmitApplication is broken due to YARN-514. See the discussions in YARN-514.
- YARN-583.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache
Currently application cache files are localized under local-dir/usercache/userid/appcache/appid/. However, they should be localized under the filecache subdirectory.
- YARN-582.
Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
Restore appToken and clientToken for app attempt after RM restart
These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.
- YARN-581.
Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
Test and verify that app delegation tokens are added to tokenRenewer after RM restart
The code already saves the delegation tokens in AppSubmissionContext. Upon restart the AppSubmissionContext is used to submit the application again and so restores the delegation tokens. This jira tracks testing and verifying this functionality in a secure setup.
- YARN-579.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
Make ApplicationToken part of Container's token list to help RM-restart
Container is already persisted to help RM restart. Instead of explicitly setting the ApplicationToken in the AM's env, if we change it to be in the Container, we can avoid the env and also help restart.
- YARN-578.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi (nodemanager)
NodeManager should use SecureIOUtils for serving and aggregating logs
Log servlets for serving logs and the ShuffleService for serving intermediate outputs both should use SecureIOUtils for avoiding symlink attacks.
- YARN-577.
Major sub-task reported by Hitesh Shah and fixed by Hitesh Shah
ApplicationReport does not provide progress value of application
An application sends its progress % to the RM via AllocateRequest. A client should be able to retrieve this via the ApplicationReport.
- YARN-576.
Major bug reported by Hitesh Shah and fixed by Kenji Kikushima
RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations
If the minimum resource allocation configured for the RM scheduler is 1 GB, the RM should drop all NMs that register with a total capacity of less than 1 GB.
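A minimal sketch of such a registration-time check, with illustrative names and values (not the actual ResourceTrackerService code):
{code}
// Hedged sketch: reject NM registrations whose total capability is below
// the scheduler's minimum allocation. Names and values are illustrative.
public class NodeRegistrationCheck {
  public static boolean acceptRegistration(long nodeMemoryMb, int nodeVcores,
                                           long minAllocMb, int minAllocVcores) {
    // A node that cannot host even a single minimum-sized container is useless
    // to the scheduler, so its registration should be refused.
    return nodeMemoryMb >= minAllocMb && nodeVcores >= minAllocVcores;
  }

  public static void main(String[] args) {
    // Example: a 512 MB node against a 1024 MB minimum allocation is rejected.
    System.out.println(acceptRegistration(512, 4, 1024, 1)); // prints "false"
  }
}
{code}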
- YARN-571.
Major sub-task reported by Hitesh Shah and fixed by Omkar Vinit Joshi
User should not be part of ContainerLaunchContext
Today, a user is expected to set the user name in the CLC when either submitting an application or launching a container from the AM. This does not make sense as the user can/has been identified by the RM as part of the RPC layer.
The solution would be to move the user information into either the Container object or directly into the ContainerToken, which can then be used by the NM to launch the container. This user information would be set into the container by the RM.
- YARN-569.
Major sub-task reported by Carlo Curino and fixed by Carlo Curino (capacityscheduler)
CapacityScheduler: support for preemption (using a capacity monitor)
There is a tension between the fast-paced reactive role of the CapacityScheduler, which needs to respond quickly to application resource requests and node updates, and the more introspective, time-based considerations needed to observe and correct for capacity balance. For this purpose, instead of hacking the delicate mechanisms of the CapacityScheduler directly, we opted to add support for preemption by means of a "Capacity Monitor", which can be run optionally as a separate service (much like the NMLivelinessMonitor).
The capacity monitor (similarly to equivalent functionality in the fair scheduler) runs at intervals (e.g., every 3 seconds), observes the state of the assignment of resources to queues from the capacity scheduler, performs an off-line computation to determine whether preemption is needed and how best to "edit" the current schedule to improve capacity, and generates events that produce four possible actions:
# Container de-reservations
# Resource-based preemptions
# Container-based preemptions
# Container killing
The actions listed above are progressively more costly, and it is up to the policy to use them as desired to achieve the rebalancing goals.
Note that due to the "lag" in the effect of these actions the policy should operate at the macroscopic level (e.g., preempt tens of containers
from a queue) and not trying to tightly and consistently micromanage container allocations.
------------- Preemption policy (ProportionalCapacityPreemptionPolicy): -------------
Preemption policies are by design pluggable, in the following we present an initial policy (ProportionalCapacityPreemptionPolicy) we have been experimenting with. The ProportionalCapacityPreemptionPolicy behaves as follows:
# it gathers from the scheduler the state of the queues, in particular, their current capacity, guaranteed capacity and pending requests (*)
# if there are pending requests from queues that are under capacity it computes a new ideal balanced state (**)
# it computes the set of preemptions needed to repair the current schedule and achieve capacity balance (accounting for natural completion rates, and
respecting bounds on the amount of preemption we allow for each round)
# it selects which applications to preempt from each over-capacity queue (the last one in the FIFO order)
# it removes reservations from the most recently assigned app until the amount of resources to reclaim is obtained, or until no more reservations exist
# (if not enough) it issues preemptions for containers from the same applications (in reverse chronological order, last assigned container first), again until the necessary amount is reached or until no containers except the AM container are left,
# (if not enough) it moves on to unreserve and preempt from the next application.
# containers that have been asked to be preempted are tracked across executions. If a container remains among the ones to be preempted for more than a certain time, it is moved into the list of containers to be forcibly killed.
Notes:
(*) at the moment, in order to avoid double-counting of the requests, we only look at the "ANY" part of pending resource requests, which means we might not preempt on behalf of AMs that ask only for specific locations but not any.
(**) The ideal balanced state is one in which each queue has at least its guaranteed capacity, and the spare capacity is distributed among the queues that want more as a weighted fair share, where the weighting is based on the guaranteed capacity of a queue and the computation runs to a fixed point (a sketch follows below).
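A minimal sketch of such a weighted fair-share computation, under the simplifying assumptions of a single resource dimension and illustrative names (not the actual ProportionalCapacityPreemptionPolicy code):
{code}
// Hedged sketch: distribute total capacity so every queue gets at least
// min(demand, guaranteed), then hand out the spare weighted by guaranteed
// capacity, iterating until a fixed point. Names are illustrative only.
public class IdealBalanceSketch {
  public static double[] idealAssignment(double total, double[] guaranteed, double[] demand) {
    int n = guaranteed.length;
    double[] ideal = new double[n];
    boolean[] satisfied = new boolean[n];
    double remaining = total;
    for (int i = 0; i < n; i++) {
      ideal[i] = Math.min(demand[i], guaranteed[i]);   // guaranteed capacity first
      remaining -= ideal[i];
      satisfied[i] = ideal[i] >= demand[i];
    }
    while (remaining > 1e-6) {                         // run to a fixed point
      double weightSum = 0;
      for (int i = 0; i < n; i++) {
        if (!satisfied[i]) weightSum += guaranteed[i];
      }
      if (weightSum <= 0) break;                       // nobody wants more
      double handedOut = 0;
      for (int i = 0; i < n; i++) {
        if (satisfied[i]) continue;
        double share = remaining * guaranteed[i] / weightSum;
        double grant = Math.min(share, demand[i] - ideal[i]);
        ideal[i] += grant;
        handedOut += grant;
        if (demand[i] - ideal[i] < 1e-9) satisfied[i] = true;
      }
      remaining -= handedOut;
      if (handedOut < 1e-9) break;                     // no progress, stop
    }
    return ideal;
  }
}
{code}
For example, with a total of 100, guarantees of [50, 50] and demands of [20, 200], the sketch assigns [20, 80]: the first queue is capped by its demand and the spare flows to the second.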
Tunables of the ProportionalCapacityPreemptionPolicy:
# observe-only mode (i.e., log the actions it would take, but behave as read-only)
# how frequently to run the policy
# how long to wait between preemption and kill of a container
# which fraction of the containers I would like to obtain should I preempt (has to do with the natural rate at which containers are returned)
# deadzone size, i.e., what % of over-capacity should I ignore (if we are off perfect balance by some small % we ignore it)
# overall amount of preemption we can afford for each run of the policy (in terms of total cluster capacity)
In our current experiments this set of tunables seems to be a good start to shape the preemption action properly. More sophisticated preemption policies could take into account the different types of applications running, job priorities, the cost of preemption, or the integral of capacity imbalance. This is very much a control-theory kind of problem, and some of the lessons on designing and tuning controllers are likely to apply.
Generality:
The monitor-based scheduler edit, and the preemption mechanisms we introduced here are designed to be more general than enforcing capacity/fairness, in fact, we are considering other monitors that leverage the same idea of "schedule edits" to target different global properties (e.g., allocate enough resources to guarantee deadlines for important jobs, or data-locality optimizations, IO-balancing among nodes, etc...).
Note that by default the preemption policy we describe is disabled in the patch.
Depends on YARN-45 and YARN-567, is related to YARN-568
- YARN-568.
Major improvement reported by Carlo Curino and fixed by Carlo Curino (scheduler)
FairScheduler: support for work-preserving preemption
In the attached patch, we modified the FairScheduler to substitute its preemption-by-killing with a work-preserving version of preemption (followed by killing if the AMs do not respond quickly enough). This should allow preemption checking to run more often, but kill less often (proper tuning to be investigated). Depends on YARN-567 and YARN-45, is related to YARN-569.
- YARN-567.
Major sub-task reported by Carlo Curino and fixed by Carlo Curino (resourcemanager)
RM changes to support preemption for FairScheduler and CapacityScheduler
A common tradeoff in scheduling jobs is between keeping the cluster busy and enforcing capacity/fairness properties. FairScheduler and CapacityScheduler take opposite stances on how to achieve this.
The FairScheduler leverages task-killing to quickly reclaim resources from currently running jobs and redistribute them among new jobs, thus keeping the cluster busy but wasting useful work. The CapacityScheduler is typically tuned to limit the portion of the cluster used by each queue so that the likelihood of violating capacity is low, thus never wasting work, but risking keeping the cluster underutilized or having jobs waiting to obtain their rightful capacity.
By introducing the notion of work-preserving preemption we can remove this tradeoff. This requires a protocol for preemption (YARN-45) and ApplicationMasters that can respond to preemption efficiently (e.g., by saving their intermediate state; this will be posted for MapReduce in a separate JIRA soon), together with a scheduler that can issue preemption requests (discussed in the separate JIRAs YARN-568 and YARN-569).
The changes we track with this JIRA are common to FairScheduler and CapacityScheduler, and are mostly propagation of preemption decisions through the ApplicationMastersService.
- YARN-563.
Major sub-task reported by Thomas Weise and fixed by Mayank Bansal
Add application type to ApplicationReport
This field is needed to distinguish different types of applications (app master implementations). For example, we may run applications of type XYZ in a cluster alongside MR and would like to filter applications by type.
- YARN-562.
Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
NM should reject containers allocated by previous RM
It's possible that after an RM shutdown, before the AM goes down, the AM still calls startContainer on the NM with containers allocated by the previous RM. When the RM comes back, the NM doesn't know whether this container launch request comes from the previous RM or the current RM. We should reject containers allocated by the previous RM.
- YARN-561.
Major sub-task reported by Hitesh Shah and fixed by Xuan Gong
Nodemanager should set some key information into the environment of every container that it launches.
Information such as containerId, nodemanager hostname, nodemanager port is not set in the environment when any container is launched.
For an AM, the RM does all of this for it but for a container launched by an application, all of the above need to be set by the ApplicationMaster.
At the minimum, container id would be a useful piece of information. If the container wishes to talk to its local NM, the nodemanager related information would also come in handy.
- YARN-557.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (applications)
TestUnmanagedAMLauncher fails on Windows
{{TestUnmanagedAMLauncher}} fails on Windows due to attempting to run a Unix-specific command in distributed shell and use of a Unix-specific environment variable to determine username for the {{ContainerLaunchContext}}.
- YARN-553.
Minor sub-task reported by Harsh J and fixed by Karthik Kambatla (client)
Have YarnClient generate a directly usable ApplicationSubmissionContext
Right now, we're doing multiple steps to create a relevant ApplicationSubmissionContext for a pre-received GetNewApplicationResponse.
{code}
GetNewApplicationResponse newApp = yarnClient.getNewApplication();
ApplicationId appId = newApp.getApplicationId();
ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class);
appContext.setApplicationId(appId);
{code}
A simplified way may be to have the GetNewApplicationResponse itself provide a helper method that builds a usable ApplicationSubmissionContext for us. Something like:
{code}
GetNewApplicationResponse newApp = yarnClient.getNewApplication();
ApplicationSubmissionContext appContext = newApp.generateApplicationSubmissionContext();
{code}
[The above method can also take an arg for the container launch spec, or perhaps pre-load defaults like min-resource, etc. in the returned object, aside of just associating the application ID automatically.]
- YARN-549.
Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen
YarnClient.submitApplication should wait for application to be accepted by the RM
Currently, when submitting an application, storeApplication will be called for recovery. However, it is a blocking API, and is likely to block concurrent application submissions. Therefore, it is good to make application submission asynchronous, and postpone storeApplication. YarnClient needs to change to wait for the whole operation to complete so that clients can be notified after the application is really submitted. YarnClient needs to wait for application to reach SUBMITTED state or beyond.
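A minimal sketch of the client-side wait described above, using hypothetical interfaces rather than the real YarnClient API:
{code}
// Hedged sketch of waiting until the RM reports the application as accepted.
// The Client interface and state names here are illustrative placeholders.
public class SubmitAndWaitSketch {
  enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

  interface Client {
    void submit(String appId);          // fire the (now asynchronous) submission
    AppState getState(String appId);    // poll the RM for the current state
  }

  /** Blocks until the app reaches SUBMITTED or beyond, or the timeout expires. */
  public static boolean submitAndWait(Client client, String appId, long timeoutMs)
      throws InterruptedException {
    client.submit(appId);
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      AppState state = client.getState(appId);
      if (state.ordinal() >= AppState.SUBMITTED.ordinal()) {
        return true;                    // the RM has durably accepted the submission
      }
      Thread.sleep(200);                // back off before polling again
    }
    return false;                       // still not submitted within the timeout
  }
}
{code}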
- YARN-548.
Major sub-task reported by Vadim Bondarev and fixed by Vadim Bondarev
Add tests for YarnUncaughtExceptionHandler
- YARN-547.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
Race condition in Public / Private Localizer may result into resource getting downloaded again
Public Localizer :
At present when multiple containers try to request a localized resource
* If the resource is not present then first it is created and Resource Localization starts ( LocalizedResource is in DOWNLOADING state)
* Now if in this state multiple ResourceRequestEvents arrive then ResourceLocalizationEvents are sent for all of them.
Most of the time this does not result in a duplicate resource download, but there is a race condition present. Inside ResourceLocalization (for public download) all the requests are added to a local attempts map. If a new request comes in, it is first checked against this map before a new download starts for the same resource. For an in-progress download the request will be in the map, so if the same resource request comes in it will be rejected (i.e., the resource is already being downloaded). However, when the current download completes, the request is removed from this local map. If a LocalizerRequestEvent comes in after this removal, then, as it is not present in the local map, the resource will be downloaded again.
PrivateLocalizer :
Here a different but similar race condition is present.
* Here, inside the findNextResource method call, each LocalizerRunner tries to grab a lock on the LocalizedResource. If the lock is not acquired, it keeps trying until the resource state changes to LOCALIZED. This lock is released by the LocalizerRunner when the download completes.
* Now if another ContainerLocalizer tries to grab the lock on a resource before the LocalizedResource state changes to LOCALIZED, the resource will be downloaded again.
In both places the root cause is that all the threads try to acquire the lock on the resource, but the current state of the LocalizedResource is not taken into consideration.
- YARN-542.
Major bug reported by Vinod Kumar Vavilapalli and fixed by Zhijie Shen
Change the default global AM max-attempts value to be not one
Today, the global AM max-attempts is set to 1 which is a bad choice. AM max-attempts accounts for both AM level failures as well as container crashes due to localization issue, lost nodes etc. To account for AM crashes due to problems that are not caused by user code, mainly lost nodes, we want to give AMs some retires.
I propose we change it to atleast two. Can change it to 4 to match other retry-configs.
- YARN-541.
Blocker bug reported by Krishna Kishore Bonagiri and fixed by Bikas Saha (resourcemanager)
getAllocatedContainers() is not returning all the allocated containers
I am running an application that was written and working well with the hadoop-2.0.0-alpha but when I am running the same against 2.0.3-alpha, the getAllocatedContainers() method called on AMResponse is not returning all the containers allocated sometimes. For example, I request for 10 containers and this method gives me only 9 containers sometimes, and when I looked at the log of Resource Manager, the 10th container is also allocated. It happens only sometimes randomly and works fine all other times. If I send one more request for the remaining container to RM after it failed to give them the first time(and before releasing already acquired ones), it could allocate that container. I am running only one application at a time, but 1000s of them one after another.
My main worry is, even though the RM's log is saying that all 10 requested containers are allocated, the getAllocatedContainers() method is not returning me all of them, it returned only 9 surprisingly. I never saw this kind of issue in the previous version, i.e. hadoop-2.0.0-alpha.
- YARN-539.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi
LocalizedResources are leaked in memory in case resource localization fails
If resource localization fails, the resource remains in memory and is
1) either cleaned up the next time cache cleanup runs and there is a space crunch (if sufficient space is available in the cache, it will remain in memory), or
2) reused if a LocalizationRequest comes again for the same resource.
I think that when resource localization fails, the event should be sent to the LocalResourceTracker, which will then remove it from its cache.
- YARN-538.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza
RM address DNS lookup can cause unnecessary slowness on every JHS page load
When I run the job history server locally, every page load takes in the 10s of seconds. I profiled the process and discovered that all the extra time was spent inside YarnConfiguration#getRMWebAppURL, trying to resolve 0.0.0.0 to a hostname. When I changed my yarn.resourcemanager.address to localhost, the page load times decreased drastically.
There's no reason that we need to perform this resolution on every page load.
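A minimal sketch of memoizing the resolved URL so that the lookup happens at most once per process (illustrative, not the actual YarnConfiguration change):
{code}
// Hedged sketch: resolve the RM web-app URL once and reuse the result on
// subsequent page loads instead of re-resolving 0.0.0.0 every time.
public class RmWebAppUrlCache {
  private static volatile String cachedUrl;

  public static String getRmWebAppUrl(java.util.function.Supplier<String> expensiveResolver) {
    String url = cachedUrl;
    if (url == null) {
      synchronized (RmWebAppUrlCache.class) {
        if (cachedUrl == null) {
          cachedUrl = expensiveResolver.get();   // DNS resolution happens only here
        }
        url = cachedUrl;
      }
    }
    return url;
  }
}
{code}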
- YARN-536.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong
Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object
Remove ContainerState and ContainerStatus from the Container interface. They will not be called on the Container object.
- YARN-534.
Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)
AM max attempts is not checked when RM restart and try to recover attempts
Currently, AM max attempts is only checked when the current attempt fails, to decide whether to create a new attempt. If the RM restarts before the max attempt fails, it will not clean the state store; when the RM comes back, it will retry the attempt again.
- YARN-532.
Major bug reported by Siddharth Seth and fixed by Siddharth Seth
RMAdminProtocolPBClientImpl should implement Closeable
Required for RPC.stopProxy to work. Already done in most of the other protocols. (MAPREDUCE-5117 addressing the one other protocol missing this)
- YARN-530.
Major sub-task reported by Steve Loughran and fixed by Steve Loughran
Define Service model strictly, implement AbstractService for robust subclassing, migrate yarn-common services
# Extend the YARN {{Service}} interface as discussed in YARN-117
# Implement the changes in {{AbstractService}} and {{FilterService}}.
# Migrate all services in yarn-common to the more robust service model, test.
- YARN-525.
Major improvement reported by Thomas Graves and fixed by Thomas Graves (capacityscheduler)
make CS node-locality-delay refreshable
The config yarn.scheduler.capacity.node-locality-delay doesn't change when you change the value in capacity-scheduler.xml and then run yarn rmadmin -refreshQueues.
- YARN-523.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Jian He
Container localization failures aren't reported from NM to RM
This is mainly a pain on crashing AMs, but once we fix this, containers also can benefit - same fix for both.
- YARN-521.
Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api)
Augment AM - RM client module to be able to request containers only at specific locations
When YARN-392 and YARN-398 are completed, it would be good for AMRMClient to offer an easy way to access their functionality
- YARN-518.
Major improvement reported by Dapeng Sun and fixed by Sandy Ryza (documentation)
Fair Scheduler's document link could be added to the hadoop 2.x main doc page
Currently the doc page for Fair Scheduler looks good and it’s here, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html.
It would be better to add the document link to the YARN section of the Hadoop 2.x main doc page, so that users can easily find the doc and experimentally try the Fair Scheduler as an alternative to the Capacity Scheduler.
- YARN-515.
Blocker bug reported by Robert Joseph Evans and fixed by Robert Joseph Evans
Node Manager not getting the master key
On branch-2 the latest version I see the following on a secure cluster.
{noformat}
2013-03-28 19:21:06,243 [main] INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Security enabled - updating secret keys now
2013-03-28 19:21:06,243 [main] INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered with ResourceManager as RM:PORT with total resource of <memory:12288, vCores:16>
2013-03-28 19:21:06,244 [main] INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is started.
2013-03-28 19:21:06,245 [main] INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is started.
2013-03-28 19:21:07,257 [Node Status Updater] ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Caught exception in status-updater
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.security.BaseContainerTokenSecretManager.getCurrentKey(BaseContainerTokenSecretManager.java:121)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:407)
{noformat}
The Null pointer exception just keeps repeating and all of the nodes end up being lost. It looks like it never gets the secret key when it registers.
- YARN-514.
Major sub-task reported by Bikas Saha and fixed by Zhijie Shen (resourcemanager)
Delayed store operations should not result in RM unavailability for app submission
Currently, app submission is the only store operation performed synchronously because the app must be stored before the request returns with success. This makes the RM susceptible to blocking all client threads on slow store operations, resulting in RM being perceived as unavailable by clients.
- YARN-513.
Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)
Create common proxy client for communicating with RM
When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up.
- YARN-512.
Minor bug reported by Jason Lowe and fixed by Maysam Yabandeh (nodemanager)
Log aggregation root directory check is more expensive than it needs to be
The log aggregation root directory check first does an {{exists}} call followed by a {{getFileStatus}} call. That effectively stats the file twice. It should just use {{getFileStatus}} and catch {{FileNotFoundException}} to handle the non-existent case.
In addition we may consider caching the presence of the directory rather than checking it each time a node aggregates logs for an application.
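A minimal sketch of the cheaper check, using the generic Hadoop FileSystem API (illustrative, not the exact log aggregation service code):
{code}
// Hedged sketch: replace the exists() + getFileStatus() pair with a single
// getFileStatus() call and treat FileNotFoundException as "does not exist".
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteRootDirCheck {
  /** Returns the status of the log aggregation root dir, or null if it is missing. */
  public static FileStatus statRootDir(FileSystem fs, Path remoteRootLogDir) throws IOException {
    try {
      return fs.getFileStatus(remoteRootLogDir);   // one RPC instead of two
    } catch (FileNotFoundException e) {
      return null;                                 // handle the non-existent case here
    }
  }
}
{code}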
- YARN-507.
Minor bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)
Add interface visibility and stability annotations to FS interfaces/classes
Many of FS classes/interfaces are missing annotations on visibility and stability.
- YARN-506.
Major bug reported by Ivan Mitic and fixed by Ivan Mitic
Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
Move to common utils described in HADOOP-9413 that work well cross-platform.
- YARN-500.
Major bug reported by Nishan Shetty and fixed by Kenji Kikushima (resourcemanager)
ResourceManager webapp is using next port if configured port is already in use
- YARN-496.
Minor bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Fair scheduler configs are refreshed inconsistently in reinitialize
When FairScheduler#reinitialize is called, some of the scheduler-wide configs are refreshed and others aren't. They should all be refreshed.
Ones that are refreshed: userAsDefaultQueue, nodeLocalityThreshold, rackLocalityThreshold, preemptionEnabled
Ones that aren't: minimumAllocation, maximumAllocation, assignMultiple, maxAssign
- YARN-495.
Major bug reported by Jian He and fixed by Jian He
Change NM behavior of reboot to resync
When a reboot command is sent from the RM, the node manager doesn't clean up the containers while it is stopping.
- YARN-493.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)
NodeManager job control logic flaws on Windows
Both product and test code contain some platform-specific assumptions, such as availability of bash for executing a command in a container and signals to check existence of a process and terminate it.
- YARN-491.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)
TestContainerLogsPage fails on Windows
{{TestContainerLogsPage}} contains some code for initializing a log directory that doesn't work correctly on Windows.
- YARN-490.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (applications/distributed-shell)
TestDistributedShell fails on Windows
There are a few platform-specific assumption in distributed shell (both main code and test code) that prevent it from working correctly on Windows.
- YARN-488.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)
TestContainerManagerSecurity fails on Windows
These tests are failing to launch containers correctly when running on Windows.
- YARN-487.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)
TestDiskFailures fails on Windows due to path mishandling
{{TestDiskFailures#testDirFailuresOnStartup}} fails due to insertion of an extra leading '/' on the path within {{LocalDirsHandlerService}} when running on Windows. The test assertions also fail to account for the fact that {{Path}} normalizes '\' to '/'.
- YARN-486.
Major sub-task reported by Bikas Saha and fixed by Xuan Gong
Change startContainer NM API to accept Container as a parameter and make ContainerLaunchContext user land
Currently, id, resource request, etc. need to be copied over from Container to ContainerLaunchContext. This can be brittle. It also leads to duplication of information (such as Resource from the CLC versus Resource from the Container, and Container.tokens). Sending the Container directly to startContainer solves these problems. It also keeps the CLC clean by only having things in it that are set by the client/AM.
- YARN-485.
Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla
TestProcfsProcessTree#testProcessTree() doesn't wait long enough for the process to die
TestProcfsProcessTree#testProcessTree fails occasionally with the following stack trace
{noformat}
Stack Trace:
junit.framework.AssertionFailedError: expected:<false> but was:<true>
at org.apache.hadoop.util.TestProcfsBasedProcessTree.testProcessTree(TestProcfsBasedProcessTree.java)
{noformat}
kill -9 is executed asynchronously; the signal is delivered when the process comes out of the kernel (sys call). Checking whether the process died immediately afterwards can fail at times.
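A minimal sketch of a more tolerant test check, polling for process death with a bounded wait (the probe interface is an illustrative placeholder, not the actual test helper):
{code}
// Hedged sketch: after sending SIGKILL, poll until the process disappears
// instead of asserting immediately. isAlive() is a placeholder for whatever
// liveness check the test uses (e.g. inspecting /proc/<pid>).
public class WaitForDeath {
  interface ProcessProbe {
    boolean isAlive(int pid);
  }

  /** Returns true if the process died within timeoutMs. */
  public static boolean waitForDeath(ProcessProbe probe, int pid, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if (!probe.isAlive(pid)) {
        return true;            // signal has been delivered and the process exited
      }
      Thread.sleep(100);        // kill -9 is asynchronous; give the kernel time
    }
    return !probe.isAlive(pid); // one final check at the deadline
  }
}
{code}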
- YARN-482.
Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)
FS: Extend SchedulingMode to intermediate queues
FS allows setting {{SchedulingMode}} for leaf queues. Extending this to non-leaf queues allows using different kinds of fairness: e.g., root can have three child queues - fair-mem, drf-cpu-mem, drf-cpu-disk-mem taking different number of resources into account. In turn, this allows users to decide on the scheduling latency vs sophistication of the scheduling mode.
- YARN-481.
Major bug reported by Chris Riccomini and fixed by Chris Riccomini (client)
Add AM Host and RPC Port to ApplicationCLI Status Output
I noticed that the ApplicationCLI is just randomly not printing some of the values in the ApplicationReport. I've added the getHost and getRpcPort. These are useful for me, since I want to make an RPC call to the AM (not the tracker call).
- YARN-479.
Major bug reported by Hitesh Shah and fixed by Jian He
NM retry behavior for connection to RM should be similar for lost heartbeats
Regardless of connection loss at the start or at an intermediate point, NM's retry behavior to the RM should follow the same flow.
- YARN-476.
Minor bug reported by Jason Lowe and fixed by Sandy Ryza
ProcfsBasedProcessTree info message confuses users
ProcfsBasedProcessTree has a habit of emitting not-so-helpful messages such as the following:
{noformat}
2013-03-13 12:41:51,957 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28747 may have finished in the interim.
2013-03-13 12:41:51,958 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28978 may have finished in the interim.
2013-03-13 12:41:51,958 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 28979 may have finished in the interim.
{noformat}
As described in MAPREDUCE-4570, this is something that naturally occurs in the process of monitoring processes via procfs. It's uninteresting at best and can confuse users who think it's a reason their job isn't running as expected when it appears in their logs.
We should either make this DEBUG or remove it entirely.
- YARN-475.
Major sub-task reported by Hitesh Shah and fixed by Hitesh Shah
Remove ApplicationConstants.AM_APP_ATTEMPT_ID_ENV as it is no longer set in an AM's environment
AMs are expected to use ApplicationConstants.AM_CONTAINER_ID_ENV and derive the application attempt id from the container id.
- YARN-474.
Major bug reported by Hitesh Shah and fixed by Zhijie Shen (capacityscheduler)
CapacityScheduler does not activate applications when maximum-am-resource-percent configuration is refreshed
Submit 3 applications to a cluster where capacity scheduler limits allow only 1 running application. Modify capacity scheduler config to increase value of yarn.scheduler.capacity.maximum-am-resource-percent and invoke refresh queues.
The 2 applications not yet in running state do not get launched even though limits are increased.
- YARN-469.
Major sub-task reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)
Make scheduling mode in FS pluggable
Currently, scheduling mode in FS is limited to Fair and FIFO. The code typically has an if condition at multiple places to determine the correct course of action.
Making the scheduling mode pluggable helps in simplifying this process, particularly as we add new modes (DRF in this case).
- YARN-468.
Major sub-task reported by Aleksey Gorshkov and fixed by Aleksey Gorshkov
coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter
Coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter. The patch YARN-468-trunk.patch applies to trunk, branch-2, and branch-0.23.
- YARN-467.
Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi (nodemanager)
Jobs fail during resource localization when public distributed-cache hits unix directory limits
If we have multiple jobs which use the distributed cache with many small files, the per-directory limit is reached before the cache size limit, and the NM fails to create any new directories in the public file cache. The jobs start failing with the below exception.
java.io.IOException: mkdir of /tmp/nm-local-dir/filecache/3901886847734194975 failed
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
We need a mechanism whereby we can create a directory hierarchy and limit the number of files per directory (a sketch follows below).
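A minimal sketch of one possible mechanism, mapping a monotonically increasing localization id onto a nested path so that no single directory exceeds a per-directory cap (the layout and names are illustrative, not the scheme ultimately adopted):
{code}
// Hedged sketch: spread localized resources over a directory tree so that no
// directory holds more than perDirLimit entries. Purely illustrative layout.
public class FileCacheLayout {
  /**
   * Maps a sequential resource id to a relative path such as "3/901/886",
   * keeping at most perDirLimit entries per directory level.
   */
  public static String subDirPath(long resourceId, int perDirLimit) {
    StringBuilder path = new StringBuilder(Long.toString(resourceId % perDirLimit));
    long remaining = resourceId / perDirLimit;
    while (remaining > 0) {
      path.insert(0, remaining % perDirLimit + "/");
      remaining /= perDirLimit;
    }
    return path.toString();
  }

  public static void main(String[] args) {
    // With a per-directory limit of 1000, resource 3901886 lands under "3/901/886".
    System.out.println(subDirPath(3901886L, 1000));
  }
}
{code}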
- YARN-460.
Blocker bug reported by Thomas Graves and fixed by Thomas Graves (capacityscheduler)
CS user left in list of active users for the queue even when application finished
We have seen a user get left in the queues list of active users even though the application was removed. This can cause everyone else in the queue to get less resources if using the minimum user limit percent config.
- YARN-458.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza (nodemanager , resourcemanager)
YARN daemon addresses must be placed in many different configs
The YARN resourcemanager's address is included in four different configs: yarn.resourcemanager.scheduler.address, yarn.resourcemanager.resource-tracker.address, yarn.resourcemanager.address, and yarn.resourcemanager.admin.address
A new user trying to configure a cluster needs to know the names of all these four configs.
The same issue exists for nodemanagers.
It would be much easier if they could simply specify yarn.resourcemanager.hostname and yarn.nodemanager.hostname and default ports for the other ones would kick in.
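A minimal sketch of deriving per-service addresses from a single hostname setting plus default ports (the keys and port numbers here are illustrative assumptions):
{code}
// Hedged sketch: fall back from a single "hostname" setting to per-service
// addresses with well-known default ports. Ports and keys are illustrative.
import java.util.HashMap;
import java.util.Map;

public class AddressDefaults {
  public static String address(Map<String, String> conf, String addressKey,
                               String hostnameKey, int defaultPort) {
    String explicit = conf.get(addressKey);
    if (explicit != null) {
      return explicit;                          // an explicit address always wins
    }
    String host = conf.getOrDefault(hostnameKey, "0.0.0.0");
    return host + ":" + defaultPort;            // otherwise hostname + default port
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("yarn.resourcemanager.hostname", "rm.example.com");
    // Only the hostname is set; every service address is derived from it.
    System.out.println(address(conf, "yarn.resourcemanager.address",
        "yarn.resourcemanager.hostname", 8032));                    // rm.example.com:8032
    System.out.println(address(conf, "yarn.resourcemanager.scheduler.address",
        "yarn.resourcemanager.hostname", 8030));                    // rm.example.com:8030
  }
}
{code}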
- YARN-450.
Major sub-task reported by Bikas Saha and fixed by Zhijie Shen
Define value for * in the scheduling protocol
The ResourceRequest has a string field to specify node/rack locations. For the cross-rack/cluster-wide location (i.e., when there is no locality constraint) the "*" string is used everywhere. However, it's not defined anywhere, and each piece of code either defines a local constant or uses the string literal. Defining "*" in the protocol and removing the other local references from the code base will be good.
- YARN-448.
Major bug reported by Kihwal Lee and fixed by Kihwal Lee (nodemanager)
Remove unnecessary hflush from log aggregation
AggregatedLogFormat#writeVersion() calls hflush() after writing the version. Calling hflush does not seem to be necessary. It can add a lot of load to hdfs in a big busy cluster.
- YARN-447.
Minor improvement reported by nemon lou and fixed by nemon lou (scheduler)
applicationComparator improvement for CS
Now the compare code is :
return a1.getApplicationId().getId() - a2.getApplicationId().getId();
Will be replaced with :
return a1.getApplicationId().compareTo(a2.getApplicationId());
This will bring some benefits:
1. It leaves the applicationId comparison logic to the ApplicationId class.
2. In a future HA mode the cluster timestamp may change; the ApplicationId class already takes care of this condition.
- YARN-444.
Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api , applications/distributed-shell)
Move special container exit codes from YarnConfiguration to API
YarnConfiguration currently contains the special container exit codes INVALID_CONTAINER_EXIT_STATUS = -1000, ABORTED_CONTAINER_EXIT_STATUS = -100, and DISKS_FAILED = -101.
These are not really related to configuration, and YarnConfiguration should not become a place to put miscellaneous constants.
Per discussion on YARN-417, appmaster writers need to be able to provide special handling for them, so it might make sense to move these to their own user-facing class.
- YARN-441.
Major sub-task reported by Siddharth Seth and fixed by Xuan Gong
Clean up unused collection methods in various APIs
There's a bunch of unused methods like getAskCount() and getAsk(index) in AllocateRequest, and other interfaces. These should be removed.
In YARN, they were found in the following classes; MR will have its own set.
AllocateRequest
StartContainerResponse
- YARN-440.
Major sub-task reported by Siddharth Seth and fixed by Xuan Gong
Flatten RegisterNodeManagerResponse
RegisterNodeManagerResponse has another wrapper RegistrationResponse under it, which can be removed.
- YARN-439.
Major sub-task reported by Siddharth Seth and fixed by Xuan Gong
Flatten NodeHeartbeatResponse
NodeHeartbeatResponse has another wrapper, HeartbeatResponse, under it, which can be removed.
- YARN-426.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (nodemanager)
Failure to download a public resource on a node prevents further downloads of the resource from that node
If the NM encounters an error while downloading a public resource, it fails to empty the list of request events corresponding to the resource request in {{attempts}}. If the same public resource is subsequently requested on that node, {{PublicLocalizer.addResource}} will skip the download since it will mistakenly believe a download of that resource is already in progress. At that point any container that requests the public resource will just hang in the {{LOCALIZING}} state.
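A minimal sketch of the kind of bookkeeping fix implied here, using plain collections rather than the actual PublicLocalizer internals (names are illustrative):
{code}
// Hedged sketch: when a public download fails, drain and fail all pending
// requests for that resource and forget the in-flight entry, so a later
// request can trigger a fresh download. Names are illustrative only.
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class PublicDownloadBookkeeping {
  // resource key -> containers waiting for that resource
  private final Map<String, Queue<String>> attempts = new HashMap<>();

  /** Called when a download for resourceKey fails. */
  public List<String> onDownloadFailed(String resourceKey) {
    Queue<String> waiters = attempts.remove(resourceKey);  // clear the in-flight entry
    List<String> toNotify = new LinkedList<>();
    if (waiters != null) {
      toNotify.addAll(waiters);   // every waiting container gets a failure event
    }
    return toNotify;              // caller dispatches a localization-failed event to these
  }
}
{code}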
- YARN-422.
Major sub-task reported by Bikas Saha and fixed by Zhijie Shen
Add NM client library
Create a simple wrapper over the ContainerManager protocol to hide the details of the protocol implementation.
- YARN-417.
Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api , applications)
Create AMRMClient wrapper that provides asynchronous callbacks
Writing AMs would be easier for some if they did not have to handle heartbeating to the RM on their own.
- YARN-412.
Minor bug reported by Roger Hoover and fixed by Roger Hoover (scheduler)
FifoScheduler incorrectly checking for node locality
In the FifoScheduler, the assignNodeLocalContainers method is checking if the data is local to a node by searching for the nodeAddress of the node in the set of outstanding requests for the app. This seems to be incorrect as it should be checking hostname instead. The offending line of code is 455:
application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
Requests are keyed by hostname (e.g. host1.foo.com), whereas node addresses are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
In the CapacityScheduler, it's done using hostname. See LeafQueue.assignNodeLocalContainers, line 1129
application.getResourceRequest(priority, node.getHostName());
Note that this bug does not affect the actual scheduling decisions made by the FifoScheduler: even though it incorrectly determines that a request is not local to the node, it will still schedule the request immediately because it is rack-local. However, the bug may adversely affect job status reporting by underreporting the number of node-local tasks.
- YARN-410.
Major bug reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi
New lines in diagnostics for a failed app on the per-application page make it hard to read
We need to fix the following issues on YARN web-UI:
- Remove the "Note" column from the application list. When a failure happens, this "Note" spoils the table layout.
- When the application is not yet running, the Tracking UI link should be titled "UNASSIGNED"; for some reason it is titled "ApplicationMaster" but (correctly) links to "#".
- The per-application page has all the RM-related information like version, start-time, etc. This must be an accidental change from one of the patches.
- The diagnostics for a failed app on the per-application page do not retain new lines and wrap them around, which makes them hard to read.
- YARN-406.
Minor improvement reported by Hitesh Shah and fixed by Hitesh Shah
TestRackResolver fails when local network resolves "host1" to a valid host
- YARN-400.
Critical bug reported by Jason Lowe and fixed by Jason Lowe (resourcemanager)
RM can return null application resource usage report leading to NPE in client
RMAppImpl.createAndGetApplicationReport can return a report with a null resource usage report if full access to the app is allowed but the application has no current attempt. This leads to NPEs in client code that assumes an app report will always have at least an empty resource usage report.
- YARN-398.
Major sub-task reported by Arun C Murthy and fixed by Arun C Murthy
Enhance CS to allow for white-list of resources
Allow white-listing and black-listing of resources in the scheduler API.
- YARN-396.
Major sub-task reported by Bikas Saha and fixed by Zhijie Shen
Rationalize AllocateResponse in RM scheduler API
AllocateResponse contains an AMResponse and the cluster node count; the AMResponse carries the rest of the data. Unless there is a good reason for this object structure, there should be either AMResponse or AllocateResponse, not both.
- YARN-392.
Major sub-task reported by Bikas Saha and fixed by Sandy Ryza (resourcemanager)
Make it possible to specify hard locality constraints in resource requests
Currently it is not possible to request scheduling on specific nodes and nowhere else. The RM automatically relaxes locality to rack and * and may assign non-specified machines to the app.
- YARN-391.
Trivial improvement reported by Steve Loughran and fixed by Steve Loughran (nodemanager)
detabify LCEResourcesHandler classes
The LCEResourcesHandler classes from YARN-3 have some tab characters that snuck into the source tree. Fix this before that code starts getting branched off and it's too late.
- YARN-390.
Major bug reported by Chris Nauroth and fixed by Chris Nauroth (client)
ApplicationCLI and NodeCLI use hard-coded platform-specific line separator, which causes test failures on Windows
{{ApplicationCLI}}, {{NodeCLI}}, and the corresponding test {{TestYarnCLI}} all use a hard-coded '\n' as the line separator. This causes test failures on Windows.
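The usual fix for this class of problem, shown here only as a sketch, is to build output against the platform line separator rather than a literal '\n':
{code:java}
public class LineSeparatorSketch {
  private static final String EOL = System.getProperty("line.separator");

  // Build report text against the platform separator so CLI output and
  // test expectations agree on both Unix and Windows.
  static String joinLines(String... lines) {
    StringBuilder sb = new StringBuilder();
    for (String line : lines) {
      sb.append(line).append(EOL);
    }
    return sb.toString();
  }
}
{code}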
- YARN-387.
Blocker sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli
Fix inconsistent protocol naming
We now have different and inconsistent naming schemes for the various protocols. Such naming was hard to explain to users, mainly in direct interactions at talks, presentations, and user group meetings.
We should fix these before we go beta.
- YARN-385.
Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (api)
ResourceRequestPBImpl's toString() is missing location and # containers
ResourceRequestPBImpl's toString method includes priority and resource capability, but omits location and number of containers.
- YARN-383.
Minor bug reported by Hitesh Shah and fixed by Hitesh Shah
AMRMClientImpl should handle null rmClient in stop()
2013-02-06 09:31:33,813 INFO [Thread-2] service.CompositeService (CompositeService.java:stop(101)) - Error stopping org.apache.hadoop.yarn.client.AMRMClientImpl
org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy since it is null
at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:605)
at org.apache.hadoop.yarn.client.AMRMClientImpl.stop(AMRMClientImpl.java:150)
at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
- YARN-382.
Major improvement reported by Thomas Graves and fixed by Zhijie Shen (scheduler)
SchedulerUtils improve way normalizeRequest sets the resource capabilities
In YARN-370, we changed it from setting the capability to directly setting memory and cores:
- ask.setCapability(normalized);
+ ask.getCapability().setMemory(normalized.getMemory());
+ ask.getCapability().setVirtualCores(normalized.getVirtualCores());
We did this because it directly sets the values in the original resource object that was passed in when the AM gets allocated; without it, the AM does not get the resource normalized correctly in the submission context. See YARN-370 for more details.
I think we should find a better way of doing this long term: first, so we do not have to keep adding fields there when new resource types are added; second, because it is a bit confusing as to what it is doing and is prone to being accidentally broken again. Something closer to what Arun suggested in YARN-370 would be better, but we need to make sure it works everywhere and gets more testing before putting it in.
- YARN-381.
Minor improvement reported by Eli Collins and fixed by Sandy Ryza (documentation)
Improve FS docs
The MR2 FS docs could use some improvements.
Configuration:
- sizebasedweight - what is the "size" here? Total memory usage?
Pool properties:
- minResources - what does min amount of aggregate memory mean given that this is not a reservation?
- maxResources - is this a hard limit?
- weight: How is this ratio configured? E.g. is the base 1 and all weights relative to that?
- schedulingMode - what is the default? Is fifo pure fifo, e.g. does it wait until all tasks for the job are finished before launching the next job?
There's no mention of ACLs, even though they're supported. See the CS docs for comparison.
Also, there are a couple of typos worth fixing while we're at it, e.g. "finish. apps to run"
Worth keeping in mind that some of these will need to be updated to reflect that resource calculators are now pluggable.
- YARN-380.
Major bug reported by Thomas Graves and fixed by Omkar Vinit Joshi (client)
yarn node -status prints Last-Last-Health-Update
I assume the Last-Last-Health-Update is a typo and it should just be Last-Health-Update.
$ yarn node -status foo.com:8041
Node Report :
Node-Id : foo.com:8041
Rack : /10.10.10.0
Node-State : RUNNING
Node-Http-Address : foo.com:8042
Health-Status(isNodeHealthy) : true
Last-Last-Health-Update : 1360118400219
Health-Report :
Containers : 0
Memory-Used : 0M
Memory-Capacity : 24576
- YARN-378.
Major sub-task reported by xieguiming and fixed by Zhijie Shen (client , resourcemanager)
ApplicationMaster retry times should be set by Client
We should support different ApplicationMaster retry counts for different clients or users. That is, "yarn.resourcemanager.am.max-retries" should be settable by the client.
- YARN-377.
Minor bug reported by Tsz Wo (Nicholas), SZE and fixed by Chris Nauroth
Fix TestContainersMonitor for HADOOP-9252
HADOOP-9252 slightly changed the format of some StringUtils outputs. It caused TestContainersMonitor to fail.
Also, some methods were deprecated by HADOOP-9252. The use of them should be replaced with the new methods.
- YARN-376.
Blocker bug reported by Jason Lowe and fixed by Jason Lowe (resourcemanager)
Apps that have completed can appear as RUNNING on the NM UI
On a busy cluster we've noticed a growing number of applications appear as RUNNING on nodemanager web pages even though the applications have long since finished. Looking at the NM logs, it appears the RM never told the nodemanager that the application had finished. This is also reflected in a jstack of the NM process, since many more log aggregation threads are running than one would expect from the number of actively running applications.
- YARN-369.
Major sub-task reported by Hitesh Shah and fixed by Mayank Bansal (resourcemanager)
Handle (or throw a proper error when receiving) status updates from application masters that have not registered
Currently, an allocate call from an unregistered application is allowed and the status update for it throws a statemachine error that is silently dropped.
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: STATUS_UPDATE at LAUNCHED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:588)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:99)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:471)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:452)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
at java.lang.Thread.run(Thread.java:680)
ApplicationMasterService should likely throw an appropriate error for applications' requests that should not be handled in such cases.
- YARN-368.
Trivial bug reported by Albert Chu and fixed by Albert Chu
Fix typo "defiend" should be "defined" in error output
Noticed the following in an error log output while doing some experiments:
./1066018/nodes/hyperion987/log/yarn-achu-nodemanager-hyperion987.out:java.lang.RuntimeException: No class defiend for uda.shuffle
"defiend" should be "defined"
- YARN-365.
Major sub-task reported by Siddharth Seth and fixed by Xuan Gong (resourcemanager , scheduler)
Each NM heartbeat should not generate an event for the Scheduler
Follow up from YARN-275
https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt
- YARN-363.
Major bug reported by Jason Lowe and fixed by Kenji Kikushima
yarn proxyserver fails to find webapps/proxy directory on startup
Starting up the proxy server fails with this error:
{noformat}
2013-01-29 17:37:41,357 FATAL webproxy.WebAppProxy (WebAppProxy.java:start(99)) - Could not start proxy web server
java.io.FileNotFoundException: webapps/proxy not found in CLASSPATH
at org.apache.hadoop.http.HttpServer.getWebAppsPath(HttpServer.java:533)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:225)
at org.apache.hadoop.http.HttpServer.<init>(HttpServer.java:164)
at org.apache.hadoop.yarn.server.webproxy.WebAppProxy.start(WebAppProxy.java:90)
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.main(WebAppProxyServer.java:94)
{noformat}
- YARN-362.
Minor bug reported by Jason Lowe and fixed by Ravi Prakash
Unexpected extra results when using webUI table search
When using the search box on the web UI to search for a specific task number (e.g.: "0831"), sometimes unexpected extra results are shown. Using the web browser's built-in search-within-page does not show any hits, so these look like completely spurious results.
It looks like the raw timestamp value for time columns, which is not shown in the table, is also being searched with the search box.
- YARN-347.
Major improvement reported by Junping Du and fixed by Junping Du (client)
YARN CLI should show CPU info besides memory info in node status
With YARN-2 checked in, CPU info is taken into consideration in resource scheduling. yarn node -status <NodeID> should show CPU usage and capacity info in the same way it shows memory info.
- YARN-345.
Critical bug reported by Devaraj K and fixed by Robert Parker (nodemanager)
Many InvalidStateTransitonException errors for ApplicationImpl in Node Manager
{code:xml}
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at FINISHED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
{code:xml}
2013-01-17 04:03:46,726 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at APPLICATION_RESOURCES_CLEANINGUP
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
{code:xml}
2013-01-17 00:01:11,006 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: FINISH_APPLICATION at FINISHING_CONTAINERS_WAIT
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
{code:xml}
2013-01-17 10:56:36,975 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1358385982671_1304_01_000001 transitioned from NEW to DONE
2013-01-17 10:56:36,975 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APPLICATION_CONTAINER_FINISHED at FINISHED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
2013-01-17 10:56:36,975 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1358385982671_1304 transitioned from FINISHED to null
{code}
{code:xml}
2013-01-17 10:56:36,026 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: INIT_CONTAINER at FINISHED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:398)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:58)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:520)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:512)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
2013-01-17 10:56:36,026 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1358385982671_1304 transitioned from FINISHED to null
{code}
- YARN-333.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza
Schedulers cannot control the queue-name of an application
Currently, if an app is submitted without a queue, RMAppManager sets the RMApp's queue to "default".
A scheduler may wish to make its own decision on which queue to place an app in if none is specified. For example, when the fair scheduler user-as-default-queue config option is set to true, and an app is submitted with no queue specified, the fair scheduler should assign the app to a queue with the user's name.
- YARN-326.
Major new feature reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)
Add multi-resource scheduling to the fair scheduler
With YARN-2 in, the capacity scheduler has the ability to schedule based on multiple resources, using dominant resource fairness. The fair scheduler should be able to do multiple resource scheduling as well, also using dominant resource fairness.
More details to come on how the corner cases with fair scheduler configs such as min and max resources will be handled.
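For readers unfamiliar with dominant resource fairness, a simplified sketch of the core comparison (the fair scheduler's real policy classes additionally handle weights, min shares and tie-breaking):
{code:java}
public class DominantResourceFairnessSketch {

  // An app's dominant share is the larger of its memory share and its CPU
  // share of the cluster.
  static double dominantShare(long usedMemMB, long usedVcores,
                              long clusterMemMB, long clusterVcores) {
    double memShare = (double) usedMemMB / clusterMemMB;
    double cpuShare = (double) usedVcores / clusterVcores;
    return Math.max(memShare, cpuShare);
  }

  // Fair ordering: the app with the smaller dominant share is scheduled first.
  static int compare(double dominantShareA, double dominantShareB) {
    return Double.compare(dominantShareA, dominantShareB);
  }
}
{code}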
- YARN-319.
Major bug reported by shenhong and fixed by shenhong (resourcemanager , scheduler)
Submitting a job to a queue that is not allowed in the FairScheduler causes the client to hang forever.
The RM uses the FairScheduler; when a client submits a job to a queue that does not allow the user to submit jobs to it, the client hangs forever.
- YARN-309.
Major sub-task reported by Xuan Gong and fixed by Xuan Gong (resourcemanager)
Make RM provide heartbeat interval to NM
- YARN-297.
Major improvement reported by Arun C Murthy and fixed by Xuan Gong
Improve hashCode implementations for PB records
As [~hsn] pointed out in YARN-2, we use very small primes in all our hashCode implementations.
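As a hedged illustration of the kind of change involved (the record fields and the particular primes below are made up for the example, not the values chosen by the patch):
{code:java}
public class PbRecordHashSketch {
  private int id;
  private long clusterTimestamp;

  @Override
  public int hashCode() {
    final int prime = 371237;  // larger prime multiplier (illustrative choice)
    int result = 6521;         // larger prime seed (illustrative choice)
    result = prime * result + id;
    result = prime * result + (int) (clusterTimestamp ^ (clusterTimestamp >>> 32));
    return result;
  }

  // equals() omitted for brevity; in the real records it is kept consistent
  // with hashCode().
}
{code}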
- YARN-295.
Major sub-task reported by Devaraj K and fixed by Mayank Bansal (resourcemanager)
Resource Manager throws InvalidStateTransitonException: Invalid event: CONTAINER_FINISHED at ALLOCATED for RMAppAttemptImpl
{code:xml}
2012-12-28 14:03:56,956 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: CONTAINER_FINISHED at ALLOCATED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
- YARN-289.
Major bug reported by Sandy Ryza and fixed by Sandy Ryza
Fair scheduler allows reservations that won't fit on node
An application requests a container with 1024 MB. It then requests a container with 2048 MB. A node shows up with 1024 MB available. Even if the application is the only one running, neither request will be scheduled on it.
- YARN-269.
Major bug reported by Thomas Graves and fixed by Jason Lowe (resourcemanager)
Resource Manager not logging the health_check_script result when taking it out
The ResourceManager does not log the health_check_script result when taking a node out of service. This was added to the JobTracker in 1.x with MAPREDUCE-2451; we should do the same thing for the RM.
- YARN-249.
Major improvement reported by Ravi Prakash and fixed by Ravi Prakash (capacityscheduler)
Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)
On the jobtracker, the web ui showed the active users for each queue and how much resources each of those users were using. That currently isn't being displayed on the RM capacity scheduler web ui.
- YARN-237.
Major improvement reported by Ravi Prakash and fixed by Jian He (resourcemanager)
Refreshing the RM page forgets how many rows I had in my Datatables
If I choose 100 rows and then refresh the page, DataTables goes back to showing me 20 rows.
This user preference should be stored in a cookie.
- YARN-236.
Major bug reported by Jason Lowe and fixed by Jason Lowe (resourcemanager)
RM should point tracking URL to RM web page when app fails to start
Similar to YARN-165, the RM should redirect the tracking URL to the specific app page on the RM web UI when the application fails to start. For example, if the AM completely fails to start due to bad AM config or bad job config like invalid queuename, then the user gets the unhelpful "The requested application exited before setting a tracking URL".
Usually the diagnostic string on the RM app page has something useful, so we might as well point there.
- YARN-227.
Major bug reported by Jason Lowe and fixed by Jason Lowe (resourcemanager)
Application expiration difficult to debug for end-users
When an AM attempt expires the AMLivelinessMonitor in the RM will kill the job and mark it as failed. However there are no diagnostic messages set for the application indicating that the application failed because of expiration. Even if the AM logs are examined, it's often not obvious that the application was externally killed. The only evidence of what happened to the application is currently in the RM logs, and those are often not accessible by users.
- YARN-209.
Major bug reported by Bikas Saha and fixed by Zhijie Shen (capacityscheduler)
Capacity scheduler doesn't trigger app-activation after adding nodes
Say application A is submitted but at that time it does not meet the bar for activation because of resource limit settings for applications. After that if more hardware is added to the system and the application becomes valid it still remains in pending state, likely forever.
This might be rare to hit in real life because enough NMs heartbeat to the RM before applications can be submitted. But a change in settings or heartbeat interval might make it easier to reproduce. In RM restart scenarios, this will likely hit more often if restart is implemented by re-playing events and re-submitting applications to the scheduler before the RPC to the NMs is activated.
- YARN-200.
Major sub-task reported by Robert Joseph Evans and fixed by Ravi Prakash
yarn log does not output all needed information, and is in a binary format
yarn logs does not output attemptid, nodename, or container-id. Missing these makes it very difficult to look through the logs for failed containers and tie them back to actual tasks and task attempts.
Also, the output currently includes several binary characters. This is fine for machine readability, but difficult for human readability or even for using standard tools like grep.
The help message could also be more useful to users.
- YARN-198.
Minor improvement reported by Ramgopal N and fixed by Jian He (nodemanager)
When navigating to the NodeManager UI from the ResourceManager, there is no link to navigate back to the ResourceManager
When navigating to the NodeManager by clicking on the node link in the RM, there is no link provided on the NM to navigate back to the RM.
It would be good to have a link to navigate back to the RM.
- YARN-196.
Major bug reported by Ramgopal N and fixed by Xuan Gong (nodemanager)
Nodemanager should be more robust in handling connection failure to ResourceManager when a cluster is started
If the NM is started before the RM, the NM shuts down with the following error (a schematic retry sketch follows the stack trace below):
{code}
ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager
org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149)
at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242)
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145)
... 3 more
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131)
at $Proxy23.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
... 5 more
Caused by: java.net.ConnectException: Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857)
at org.apache.hadoop.ipc.Client.call(Client.java:1141)
at org.apache.hadoop.ipc.Client.call(Client.java:1100)
at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128)
... 7 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247)
at org.apache.hadoop.ipc.Client.call(Client.java:1117)
... 9 more
2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
at java.lang.Thread.run(Thread.java:619)
2012-01-16 15:04:13,337 INFO org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is stopped.
2012-01-16 15:04:13,392 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:9999
2012-01-16 15:04:13,493 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is stopped.
2012-01-16 15:04:13,493 INFO org.apache.hadoop.ipc.Server: Stopping server on 24290
2012-01-16 15:04:13,494 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 24290
2012-01-16 15:04:13,495 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2012-01-16 15:04:13,496 INFO org.apache.hadoop.yarn.service.AbstractService: Service:org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler is stopped.
2012-01-16 15:04:13,496 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: AsyncDispatcher thread interrupted
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76)
at java.lang.Thread.run(Thread.java:619)
{code}
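The general shape of the fix is to retry the initial registration with a bounded wait instead of failing fast. The snippet below is only a schematic sketch of that idea (hand-rolled names and sleeps), not the retry policy that was actually committed:
{code:java}
import java.io.IOException;

public class RegistrationRetrySketch {

  interface Registrar {
    void registerNodeManager() throws IOException;
  }

  // Keep retrying the NM -> RM registration until it succeeds or a maximum
  // wait has elapsed, instead of letting the NM die on the first refusal.
  static void registerWithRetries(Registrar registrar,
                                  long maxWaitMs, long retryIntervalMs)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + maxWaitMs;
    while (true) {
      try {
        registrar.registerNodeManager();
        return;                        // registered successfully
      } catch (IOException e) {
        if (System.currentTimeMillis() >= deadline) {
          throw e;                     // give up after the configured max wait
        }
        Thread.sleep(retryIntervalMs); // RM not reachable yet; try again
      }
    }
  }
}
{code}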
- YARN-193.
Major bug reported by Hitesh Shah and fixed by Zhijie Shen (resourcemanager)
Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits
- YARN-142.
Blocker task reported by Siddharth Seth and fixed by
[Umbrella] Cleanup YARN APIs w.r.t exceptions
Ref: MAPREDUCE-4067
All YARN APIs currently throw YarnRemoteException.
1) This cannot be extended in its current form.
2) The RPC layer can throw IOExceptions. These end up showing up as UndeclaredThrowableExceptions.
- YARN-125.
Minor sub-task reported by Steve Loughran and fixed by Steve Loughran
Make Yarn Client service shutdown operations robust
Make the YARN client services more robust against being shut down while not started, or shut down more than once, by null-checking fields before closing them and setting them to null afterwards to prevent double-invocation (see the sketch below). This is a subset of MAPREDUCE-3502.
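The pattern being applied, sketched in isolation (the field and type here are illustrative):
{code:java}
import java.io.Closeable;
import java.io.IOException;

public class RobustStopSketch {
  private Closeable rpcClient;   // illustrative field; could be any held resource

  // Safe to call before start() and safe to call twice: the null check skips
  // work that never happened, and nulling the field makes a repeat call a no-op.
  public synchronized void stop() throws IOException {
    if (rpcClient != null) {
      rpcClient.close();
      rpcClient = null;
    }
  }
}
{code}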
- YARN-124.
Minor sub-task reported by Steve Loughran and fixed by Steve Loughran
Make Yarn Node Manager services robust against shutdown
Add the nodemanager bits of MAPREDUCE-3502 to shut down the NodeManager services. This is done by checking that fields are non-null before shutting down/closing them, and setting the fields to null afterwards, to be resilient against re-entrancy.
No tests other than manual review.
- YARN-123.
Minor sub-task reported by Steve Loughran and fixed by Steve Loughran
Make yarn Resource Manager services robust against shutdown
Split MAPREDUCE-3502 patches to make the RM code more resilient to being stopped more than once, or before started.
This depends on MAPREDUCE-4014.
- YARN-117.
Major improvement reported by Steve Loughran and fixed by Steve Loughran
Enhance YARN service model
Having played with the YARN service model, I have identified some issues based on past work and initial use.
This JIRA issue is an overall one to cover the issues, with solutions pushed out to separate JIRAs.
h2. state model prevents stopped state being entered if you could not successfully start the service.
In the current lifecycle you cannot stop a service unless it was successfully started, but
* {{init()}} may acquire resources that need to be explicitly released
* if the {{start()}} operation fails partway through, the {{stop()}} operation may be needed to release resources.
*Fix:* make {{stop()}} a valid state transition from all states and require the implementations to be able to stop safely without requiring all fields to be non-null.
Before anyone points out that the {{stop()}} operations assume all fields are valid and will NPE if called before {{start()}}: MAPREDUCE-3431 shows that this problem already arises today, and MAPREDUCE-3502 is a fix for it. It is independent of the rest of the issues in this doc, but it will aid making {{stop()}} executable from all states other than "stopped".
MAPREDUCE-3502 is too big a patch and needs to be broken down for easier review and take up; this can be done with issues linked to this one.
h2. AbstractService doesn't prevent duplicate state change requests.
The {{ensureState()}} checks to verify whether or not a state transition is allowed from the current state are performed in the base {{AbstractService}} class -yet subclasses tend to call this *after* their own {{init()}}, {{start()}} & {{stop()}} operations. This means that these operations can be performed out of order, and even if the outcome of the call is an exception, all actions performed by the subclasses will have taken place. MAPREDUCE-3877 demonstrates this.
This is a tricky one to address. In HADOOP-3128 I used a base class instead of an interface and made the {{init()}}, {{start()}} & {{stop()}} methods {{final}}. These methods would do the checks, and then invoke protected inner methods, {{innerStart()}}, {{innerStop()}}, etc. (a sketch of that shape follows below). It should be possible to retrofit the same behaviour to everything that extends {{AbstractService}}, something that must be done before the class is considered stable (because once the lifecycle methods are declared final, all subclasses that are out of the source tree will need fixing by the respective developers).
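A minimal sketch of that shape, assuming final public lifecycle methods that perform the state checks and then delegate to protected hooks (this is only an illustration of the idea, not the class as eventually committed):
{code:java}
// Sketch: state checks live in the final public methods; subclasses only
// override the protected inner hooks, so the checks cannot be bypassed.
public abstract class SketchService {
  public enum State { NOTINITED, INITED, STARTED, STOPPED }

  private State state = State.NOTINITED;

  public final synchronized void init() {
    if (state != State.NOTINITED) {
      throw new IllegalStateException("Cannot init from state " + state);
    }
    innerInit();                  // subclass work happens only after the check
    state = State.INITED;
  }

  public final synchronized void start() {
    if (state != State.INITED) {
      throw new IllegalStateException("Cannot start from state " + state);
    }
    innerStart();
    state = State.STARTED;
  }

  public final synchronized void stop() {
    if (state == State.STOPPED) {
      return;                     // stop() is valid from every state and idempotent
    }
    innerStop();                  // must tolerate fields that were never initialized
    state = State.STOPPED;
  }

  protected abstract void innerInit();
  protected abstract void innerStart();
  protected abstract void innerStop();
}
{code}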
h2. AbstractService state change doesn't defend against race conditions.
There's no concurrency locks on the state transitions. Whatever fix for wrong state calls is added should correct this to prevent re-entrancy, such as {{stop()}} being called from two threads.
h2. Static methods to choreograph of lifecycle operations
Helper methods to move things through lifecycles. init->start is common, stop-if-service!=null another. Some static methods can execute these, and even call {{stop()}} if {{init()}} raises an exception. These could go into a class {{ServiceOps}} in the same package. These can be used by those services that wrap other services, and help manage more robust shutdowns.
h2. state transition failures are something that registered service listeners may wish to be informed of.
When a state transition fails a {{RuntimeException}} can be thrown -and the service listeners are not informed as the notification point isn't reached. They may wish to know this, especially for management and diagnostics.
*Fix:* extend {{ServiceStateChangeListener}} with a callback such as {{stateChangeFailed(Service service, Service.State targetedState, RuntimeException e)}} that is invoked from the (final) state change methods in the {{AbstractService}} class (once they delegate to their inner {{innerStart()}}, {{innerStop()}} methods); make it a no-op in the existing implementations of the interface.
h2. Service listener failures not handled
Is this an error or not? Log-and-ignore may not be what is desired.
*Proposed:* during {{stop()}} any exception by a listener is caught and discarded, to increase the likelihood of a better shutdown, but do not add try-catch clauses to the other state changes.
h2. Support static listeners for all AbstractServices
Add support to {{AbstractService}} that allow callers to register listeners for all instances. The existing listener interface could be used. This allows management tools to hook into the events.
The static listeners would be invoked for all state changes except creation (base class shouldn't be handing out references to itself at this point).
These static events could all be async, pushed through a shared {{ConcurrentLinkedQueue}}; failures logged at warn and the rest of the listeners invoked.
h2. Add some example listeners for management/diagnostics
* event to commons log for humans.
* events for machines hooked up to the JSON logger.
* for testing: something that can be told to fail.
h2. Services should support signal interruptibility
The services would benefit from a way of shutting them down on a kill signal; this can be done via a runtime hook. It should not be automatic though, as composite services will get into a very complex state during shutdown. Better to provide a hook that lets you register/unregister services to terminate, and have the relevant {{main()}} entry points tell their root services to register themselves.
- YARN-112.
Major sub-task reported by Jason Lowe and fixed by Omkar Vinit Joshi (nodemanager)
Race in localization can cause containers to fail
On one of our 0.23 clusters, I saw a case of two containers, corresponding to two map tasks of a MR job, that were launched almost simultaneously on the same node. It appears they both tried to localize job.jar and job.xml at the same time. One of the containers failed when it couldn't rename the temporary job.jar directory to its final name because the target directory wasn't empty. Shortly afterwards the second container failed because job.xml could not be found, presumably because the first container removed it when it cleaned up.
- YARN-109.
Major bug reported by Jason Lowe and fixed by Mayank Bansal (nodemanager)
.tmp file is not deleted for localized archives
When archives are localized they are initially created as a .tmp file and unpacked from that file. However the .tmp file is not deleted afterwards.
- YARN-101.
Minor bug reported by xieguiming and fixed by Xuan Gong (nodemanager)
If the heartbeat message is lost, the node status info for completed containers will be lost too.
See the sections highlighted with {color:red} markers below:
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.java
protected void startStatusUpdater() {
new Thread("Node Status Updater") {
@Override
@SuppressWarnings("unchecked")
public void run() {
int lastHeartBeatID = 0;
while (!isStopped) {
// Send heartbeat
try {
synchronized (heartbeatMonitor) {
heartbeatMonitor.wait(heartBeatInterval);
}
{color:red}
// Before we send the heartbeat, we get the NodeStatus,
// whose method removes completed containers.
NodeStatus nodeStatus = getNodeStatus();
{color}
nodeStatus.setResponseId(lastHeartBeatID);
NodeHeartbeatRequest request = recordFactory
.newRecordInstance(NodeHeartbeatRequest.class);
request.setNodeStatus(nodeStatus);
{color:red}
// But if the nodeHeartbeat fails, we have already removed the completed containers, so the RM never gets to know about them. The nodeHeartbeat failure case is not handled here.
HeartbeatResponse response =
resourceTracker.nodeHeartbeat(request).getHeartbeatResponse();
{color}
if (response.getNodeAction() == NodeAction.SHUTDOWN) {
LOG
.info("Recieved SHUTDOWN signal from Resourcemanager as part of heartbeat," +
" hence shutting down.");
NodeStatusUpdaterImpl.this.stop();
break;
}
if (response.getNodeAction() == NodeAction.REBOOT) {
LOG.info("Node is out of sync with ResourceManager,"
+ " hence rebooting.");
NodeStatusUpdaterImpl.this.reboot();
break;
}
lastHeartBeatID = response.getResponseId();
List<ContainerId> containersToCleanup = response
.getContainersToCleanupList();
if (containersToCleanup.size() != 0) {
dispatcher.getEventHandler().handle(
new CMgrCompletedContainersEvent(containersToCleanup));
}
List<ApplicationId> appsToCleanup =
response.getApplicationsToCleanupList();
//Only start tracking for keepAlive on FINISH_APP
trackAppsForKeepAlive(appsToCleanup);
if (appsToCleanup.size() != 0) {
dispatcher.getEventHandler().handle(
new CMgrCompletedAppsEvent(appsToCleanup));
}
} catch (Throwable e) {
// TODO Better error handling. Thread can die with the rest of the
// NM still running.
LOG.error("Caught exception in status-updater", e);
}
}
}
}.start();
}
private NodeStatus getNodeStatus() {
NodeStatus nodeStatus = recordFactory.newRecordInstance(NodeStatus.class);
nodeStatus.setNodeId(this.nodeId);
int numActiveContainers = 0;
List<ContainerStatus> containersStatuses = new ArrayList<ContainerStatus>();
for (Iterator<Entry<ContainerId, Container>> i =
this.context.getContainers().entrySet().iterator(); i.hasNext();) {
Entry<ContainerId, Container> e = i.next();
ContainerId containerId = e.getKey();
Container container = e.getValue();
// Clone the container to send it to the RM
org.apache.hadoop.yarn.api.records.ContainerStatus containerStatus =
container.cloneAndGetContainerStatus();
containersStatuses.add(containerStatus);
++numActiveContainers;
LOG.info("Sending out status for container: " + containerStatus);
{color:red}
// Here is the part that removes the completed containers.
if (containerStatus.getState() == ContainerState.COMPLETE) {
// Remove
i.remove();
{color}
LOG.info("Removed completed container " + containerId);
}
}
nodeStatus.setContainersStatuses(containersStatuses);
LOG.debug(this.nodeId + " sending out status for "
+ numActiveContainers + " containers");
NodeHealthStatus nodeHealthStatus = this.context.getNodeHealthStatus();
nodeHealthStatus.setHealthReport(healthChecker.getHealthReport());
nodeHealthStatus.setIsNodeHealthy(healthChecker.isHealthy());
nodeHealthStatus.setLastHealthReportTime(
healthChecker.getLastHealthReportTime());
if (LOG.isDebugEnabled()) {
LOG.debug("Node's health-status : " + nodeHealthStatus.getIsNodeHealthy()
+ ", " + nodeHealthStatus.getHealthReport());
}
nodeStatus.setNodeHealthStatus(nodeHealthStatus);
List<ApplicationId> keepAliveAppIds = createKeepAliveApplicationList();
nodeStatus.setKeepAliveApplications(keepAliveAppIds);
return nodeStatus;
}
- YARN-99.
Major sub-task reported by Devaraj K and fixed by Omkar Vinit Joshi (nodemanager)
Jobs fail during resource localization when private distributed-cache hits unix directory limits
If we have multiple jobs that use the distributed cache with small files, the per-directory limit is reached before the cache size limit, and no more directories can be created in the file cache. The jobs then start failing with the exception below.
{code:xml}
java.io.IOException: mkdir of /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{code}
We should have a mechanism to clean the cache files when the number of directories crosses a specified limit, just as we do for cache size.
- YARN-84.
Minor improvement reported by Brandon Li and fixed by Brandon Li
Use Builder to get RPC server in YARN
In HADOOP-8736, a Builder was introduced to replace all the getServer() variants. This JIRA covers the corresponding change in YARN.
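The general shape of the Builder-based construction (the setter names below are recalled from the HADOOP-8736 API and should be treated as approximate rather than authoritative):
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.Server;

public class RpcBuilderSketch {
  // Builder-style construction replacing the many RPC.getServer(...) overloads.
  static Server createServer(Configuration conf, Class<?> protocol, Object impl,
                             String bindAddress, int port, int numHandlers)
      throws IOException {
    return new RPC.Builder(conf)
        .setProtocol(protocol)
        .setInstance(impl)
        .setBindAddress(bindAddress)
        .setPort(port)
        .setNumHandlers(numHandlers)
        .build();
  }
}
{code}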
- YARN-71.
Critical bug reported by Vinod Kumar Vavilapalli and fixed by Xuan Gong (nodemanager)
Ensure/confirm that the NodeManager cleans up local-dirs on restart
We have to make sure that NodeManagers clean up their local files on restart.
It may already work like that, in which case we should have tests validating this.
- YARN-62.
Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi
AM should not be able to abuse container tokens for repetitive container launches
Clone of YARN-51.
ApplicationMaster should not be able to store container tokens and use the same set of tokens for repetitive container launches. The possibility of such abuse exists in the current code for a duration of 1d+10mins; we need to fix this.
- YARN-45.
Major sub-task reported by Chris Douglas and fixed by Carlo Curino (resourcemanager)
Scheduler feedback to AM to release containers
The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. Individual allocations of containers must be reclaimed- or reserved- to restore the global invariants when cluster load shifts. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work already completed by that task (MAPREDUCE-4584). Supplying it with this information would be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers.
[1] http://research.yahoo.com/files/yl-2012-003.pdf
- YARN-24.
Major bug reported by Jason Lowe and fixed by Sandy Ryza (nodemanager)
Nodemanager fails to start if log aggregation enabled and namenode unavailable
If log aggregation is enabled and the namenode is currently unavailable, the nodemanager fails to startup.