I installed the Cloudera CDH4 release and I'm trying to get MapReduce working. I get the following error:
2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry after sleeping 2000 ms
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
    at
I can run the sample programs in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar, but I get this error when my own job is submitted to the JobTracker. It looks like the job client is trying to access the local file system again (even though I added all the libraries the job needs to the distributed cache, it still tries to access the local directory). Is this a user-permissions issue?
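Rather than pointing the job at local jar paths like /home/cloudera/yogesh/lib/hbase.jar, I was planning to try something along the lines of the sketch below, letting TableMapReduceUtil copy the HBase dependency jars from my client classpath into the job's staging area. This is only my reading of the HBase 0.92 / CDH4 API; the class name SubmitSketch is a placeholder, and exact_custodian is my output table:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.mapreduce.Job;

// Sketch only: build a job that writes to the exact_custodian table and let
// TableMapReduceUtil ship the HBase/ZooKeeper/Guava jars found on the client
// classpath to the distributed cache, instead of hard-coding local paths.
public class SubmitSketch {
    public static Job createJob() throws Exception {
        Configuration conf = HBaseConfiguration.create();           // reads hbase-site.xml from the classpath
        conf.set(TableOutputFormat.OUTPUT_TABLE, "exact_custodian");
        Job job = new Job(conf, "exact_custodian load");
        job.setJarByClass(SubmitSketch.class);                       // jar containing the mapper/reducer
        job.setOutputFormatClass(TableOutputFormat.class);
        TableMapReduceUtil.addDependencyJars(job);                   // copies dependency jars into the job staging area
        return job;
    }
}

Would that be the right way to get the jars onto the cluster, or is the problem elsewhere?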
I) Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/ shows -
Found 8 items
drwxr-xr-x - hbase hbase      0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
drwxr-xr-x - hdfs  supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
drwxr-xr-x - hdfs  supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
drwxr-xr-x - hdfs  supergroup 0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
drwxr-xr-x - hdfs  supergroup 0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
drwxrwxrwt - hdfs  supergroup 0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
drwxr-xr-x - hdfs  supergroup 0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
II) No results for the following:
hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
Instead, I think with the new APIs we use:

fs.defaultFS -> hdfs://Cloudera:8020, which I have set correctly.

However, for "fs.default.name" I do get entries on my 0.20.2 cluster (a non-Cloudera cluster):
cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
./core-default.xml: <name>fs.default.name</name>
./core-site.xml: <name>fs.default.name</name>
I think the default CDH4 configuration should add this entry in the appropriate directory (if its absence is indeed the problem).
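To double-check which filesystem the job client actually resolves, I intend to run a small check along these lines (just a sketch; the class name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Prints the default filesystem seen by a client-side Configuration. If this
// prints file:/// rather than hdfs://Cloudera:8020, the client is not picking
// up core-site.xml from /etc/hadoop/conf.
public class CheckDefaultFs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // loads core-default.xml and core-site.xml from the classpath
        System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));
        System.out.println("resolved FS  = " + FileSystem.get(conf).getUri());
    }
}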
The command I use to run my program is
hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
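Based on the "Applications should implement Tool" warning in the log above, I am also considering restructuring the driver roughly as follows (a sketch only; the class and job names are placeholders, not my actual code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Sketch of a Tool-based driver so that GenericOptionsParser handles
// -conf/-fs/-libjars and the cluster configuration is honoured at submit time.
public class MyClassName extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(getConf());
        Job job = new Job(conf, "exact_custodian job");
        job.setJarByClass(MyClassName.class);
        // ... mapper/reducer/input/output setup as in my current driver ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyClassName(), args));
    }
}

I also wonder whether I should launch the job with "hadoop jar" instead of plain "java", so that /etc/hadoop/conf (and hence fs.defaultFS) is on the classpath automatically.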