I am running a local yarn cluster with 8 vCores and 8Gb shared memory.
The workflow is as follows:
YarnClient sends an application request that launches AppMaster in the container.
AppMaster start, creates amRMClient and nmClient, registers with RM, and then creates 4 container requests for workflows through amRMClient.addContainerRequest
Although there are enough resources, containers are not allocated (the onContainersAllocated callback function is never called). I tried to check the nodemanager and resourcemanager logs and I do not see any line related to container requests. I kept a close eye on apache docs and couldn't figure out what I was doing wrong.
For reference: AppMaster code:
@Override public void run() { Map<String, String> envs = System.getenv(); String containerIdString = envs.get(ApplicationConstants.Environment.CONTAINER_ID.toString()); if (containerIdString == null) { // container id should always be set in the env by the framework throw new IllegalArgumentException("ContainerId not set in the environment"); } ContainerId containerId = ConverterUtils.toContainerId(containerIdString); ApplicationAttemptId appAttemptID = containerId.getApplicationAttemptId(); LOG.info("Starting AppMaster Client..."); YarnAMRMCallbackHandler amHandler = new YarnAMRMCallbackHandler(allocatedYarnContainers); // TODO: get heart-beet interval from config instead of 100 default value amClient = AMRMClientAsync.createAMRMClientAsync(1000, this); amClient.init(config); amClient.start(); LOG.info("Starting AppMaster Client OK"); //YarnNMCallbackHandler nmHandler = new YarnNMCallbackHandler(); containerManager = NMClient.createNMClient(); containerManager.init(config); containerManager.start(); // Get port, ulr information. TODO: get tracking url String appMasterHostname = NetUtils.getHostname(); String appMasterTrackingUrl = "/progress"; // Register self with ResourceManager. This will start heart-beating to the RM RegisterApplicationMasterResponse response = null; LOG.info("Register AppMaster on: " + appMasterHostname + "..."); try { response = amClient.registerApplicationMaster(appMasterHostname, 0, appMasterTrackingUrl); } catch (YarnException | IOException e) { // TODO Auto-generated catch block e.printStackTrace(); return; } LOG.info("Register AppMaster OK"); // Dump out information about cluster capability as seen by the resource manager int maxMem = response.getMaximumResourceCapability().getMemory(); LOG.info("Max mem capabililty of resources in this cluster " + maxMem); int maxVCores = response.getMaximumResourceCapability().getVirtualCores(); LOG.info("Max vcores capabililty of resources in this cluster " + maxVCores); containerMemory = Integer.parseInt(config.get(YarnConfig.YARN_CONTAINER_MEMORY_MB)); containerCores = Integer.parseInt(config.get(YarnConfig.YARN_CONTAINER_CPU_CORES)); // A resource ask cannot exceed the max. if (containerMemory > maxMem) { LOG.info("Container memory specified above max threshold of cluster." + " Using max value." + ", specified=" + containerMemory + ", max=" + maxMem); containerMemory = maxMem; } if (containerCores > maxVCores) { LOG.info("Container virtual cores specified above max threshold of cluster." + " Using max value." + ", specified=" + containerCores + ", max=" + maxVCores); containerCores = maxVCores; } List<Container> previousAMRunningContainers = response.getContainersFromPreviousAttempts(); LOG.info("Received " + previousAMRunningContainers.size() + " previous AM running containers on AM registration."); for (int i = 0; i < 4; ++i) { ContainerRequest containerAsk = setupContainerAskForRM(); amClient.addContainerRequest(containerAsk); // NOTHING HAPPENS HERE... LOG.info("Available resources: " + amClient.getAvailableResources().toString()); } while(completedYarnContainers != 4) { try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } } LOG.info("Done with allocation!"); } @Override public void onContainersAllocated(List<Container> containers) { LOG.info("Got response from RM for container ask, allocatedCnt=" + containers.size()); for (Container container : containers) { LOG.info("Allocated yarn container with id: {}" + container.getId()); allocatedYarnContainers.push(container); // TODO: Launch the container in a thread } } @Override public void onError(Throwable error) { LOG.error(error.getMessage()); } @Override public float getProgress() { return (float) completedYarnContainers / allocatedYarnContainers.size(); }
Here's the output from jps:
14594 NameNode 15269 DataNode 17975 Jps 14666 ResourceManager 14702 NodeManager
And here is the AppMaster log for initialization and 4 container requests:
23:47:09 YarnAppMaster - Starting AppMaster Client OK 23:47:09 YarnAppMaster - Register AppMaster on: andrei-mbp.local/192.168.1.4... 23:47:09 YarnAppMaster - Register AppMaster OK 23:47:09 YarnAppMaster - Max mem capabililty of resources in this cluster 2048 23:47:09 YarnAppMaster - Max vcores capabililty of resources in this cluster 2 23:47:09 YarnAppMaster - Received 0 previous AM running containers on AM registration. 23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0] 23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0> 23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0] 23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0> 23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0] 23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0> 23:47:11 YarnAppMaster - Requested container ask: Capability[<memory:512, vCores:1>]Priority[0] 23:47:11 YarnAppMaster - Available resources: <memory:7680, vCores:0> 23:47:11 YarnAppMaster - Progress indicator should not be negative
Thanks in advance.