Adding a data node to a Hadoop cluster

When I start the cluster from hadoopnode1 using start-all.sh, it successfully starts the services on the master and the slave (see the output of the jps command for the slave). But when I look at the live nodes in the admin screen, the slave node does not appear. Even the hadoop fs -ls / command works fine when run from the master, but from the slave it prints this error:

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ hadoop fs -ls /
 12/05/28 01:14:20 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 0 time(s).
 12/05/28 01:14:21 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 1 time(s).
 12/05/28 01:14:22 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 2 time(s).
 12/05/28 01:14:23 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 3 time(s).
 . . .
 12/05/28 01:14:29 INFO ipc.Client: Retrying connect to server: hadoopnode1/192.168.1.120:8020. Already tried 10 time(s).

It looks like the slave (hadoopnode2) cannot find or connect to the master node (hadoopnode1).

Can you please point out what I am missing?

The following are the settings from the master and slave nodes. P.S. The master and slave are running the same versions of Linux and Hadoop, and SSH works fine, because I can start the slave from the master node.

The core-site.xml, hdfs-site.xml and mapred-site.xml settings are also the same on the master (hadoopnode1) and the slave (hadoopnode2).

OS - Ubuntu 10

Hadoop version:

 hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ hadoop version
 Hadoop 0.20.2
 Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
 Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

- Master (hadoopnode1)

 hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ uname -a
 Linux hadoopnode1 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

 hadoop@hadoopnode1:~/hadoop-0.20.2/conf$ jps
 9923 Jps
 7555 NameNode
 8133 TaskTracker
 7897 SecondaryNameNode
 7728 DataNode
 7971 JobTracker

 masters -> hadoopnode1
 slaves  -> hadoopnode1
            hadoopnode2

- Slave (hadoopnode2)

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ uname -a
 Linux hadoopnode2 2.6.35-32-generic #67-Ubuntu SMP Mon Mar 5 19:35:26 UTC 2012 i686 GNU/Linux

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ jps
 1959 DataNode
 2631 Jps
 2108 TaskTracker

 masters -> hadoopnode1

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat core-site.xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <!-- Put site-specific property overrides in this file. -->
 <configuration>
   <property>
     <name>hadoop.tmp.dir</name>
     <value>/var/tmp/hadoop/hadoop-${user.name}</value>
     <description>A base for other temp directories</description>
   </property>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://hadoopnode1:8020</value>
     <description>The name of the default file system</description>
   </property>
 </configuration>

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat mapred-site.xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <!-- Put site-specific property overrides in this file. -->
 <configuration>
   <property>
     <name>mapred.job.tracker</name>
     <value>hadoopnode1:8021</value>
     <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in process as a single map</description>
   </property>
 </configuration>

 hadoop@hadoopnode2:~/hadoop-0.20.2/conf$ cat hdfs-site.xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <!-- Put site-specific property overrides in this file. -->
 <configuration>
   <property>
     <name>dfs.replication</name>
     <value>2</value>
     <description>Default block replication</description>
   </property>
 </configuration>
+4
7 answers

It seems that the problem is not only with the slave, but also with the master node (hadoopnode1). When I checked the log on the master, I saw the same error there: can't connect to hadoopnode1.

Log from the master node (hadoopnode1). Note that the address has changed to the loopback address 127.0.0.1:

 2012-05-30 20:54:31,760 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoopnode1/127.0.0.1:8020. Already tried 0 time(s).
 2012-05-30 20:54:32,761 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoopnode1/127.0.0.1:8020. Already tried 1 time(s).
 2012-05-30 20:54:33,764 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoopnode1/127.0.0.1:8020. Already tried 2 time(s).
 2012-05-30 20:54:34,764 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoopnode1/127.0.0.1:8020. Already tried 3 time(s).
 . . .
 hadoopnode1/127.0.0.1:8020. Already tried 8 time(s).
 2012-05-30 20:54:40,782 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoopnode1/127.0.0.1:8020. Already tried 9 time(s).
 2012-05-30 20:54:40,784 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
 java.net.ConnectException: Call to hadoopnode1/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
         at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)

Here is my /etc/hosts file:

 192.168.1.120 hadoopnode1 # Added by NetworkManager
 127.0.0.1 localhost.localdomain localhost hadoopnode1
 ::1 hadoopnode1 localhost6.localdomain6 localhost6
 192.168.1.121 hadoopnode2
 # The following lines are desirable for IPv6 capable hosts
 ::1 localhost ip6-localhost ip6-loopback
 fe00::0 ip6-localnet
 ff00::0 ip6-mcastprefix
 ff02::1 ip6-allnodes
 ff02::2 ip6-allrouters
 ff02::3 ip6-allhosts
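
A quick way to double-check which address a hostname actually resolves to on each node (just a diagnostic sketch, assuming the standard Ubuntu tools are installed):

 # On either node: how does each hostname resolve?
 getent hosts hadoopnode1    # should print 192.168.1.120, not 127.0.0.1
 getent hosts hadoopnode2    # should print 192.168.1.121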

I am really confused about how to make this work. I have been trying to set up this cluster for the last 15 days. Any help is appreciated.

@Raze2dust - I deleted all the tmp files, but now the problem looks different. I think it is more of a name resolution issue.

@William Yao - I do not have curl installed, but I can ping the servers from each other, and SSH also works between them.

+2

In the web GUI you can see the number of nodes in your cluster. If you see fewer than you expected, then make sure the /etc/hosts file on the master contains only the cluster hosts, for example:

 192.168.0.1 master
 192.168.0.2 slave

If you see any 127.0.x.x IPs there, comment them out, because Hadoop will see them first as the host(s). I had the problem above and solved it that way. Hope this helps.
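
For this particular setup, that would mean an /etc/hosts along these lines on both machines, with hadoopnode1 removed from the 127.0.0.1 and ::1 lines (a sketch of the idea, not a drop-in file):

 127.0.0.1      localhost
 192.168.1.120  hadoopnode1
 192.168.1.121  hadoopnode2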

+1

Check your services with sudo jps. If the master does not display everything it should, you need to do the following:

 1. Restart Hadoop
 2. Go to /app/hadoop/tmp/dfs/name/current
 3. Open VERSION (i.e. by vim VERSION)
 4. Record the namespaceID
 5. Go to /app/hadoop/tmp/dfs/data/current
 6. Open VERSION (i.e. by vim VERSION)
 7. Replace the namespaceID with the namespaceID you recorded in step 4
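
With the hadoop.tmp.dir from the question (/var/tmp/hadoop/hadoop-${user.name}, which I am assuming expands to /var/tmp/hadoop/hadoop-hadoop for the hadoop user), the steps above look roughly like this:

 # On the master: note the namenode's namespaceID
 grep namespaceID /var/tmp/hadoop/hadoop-hadoop/dfs/name/current/VERSION

 # On each datanode (with Hadoop stopped): make the datanode's namespaceID match
 vim /var/tmp/hadoop/hadoop-hadoop/dfs/data/current/VERSION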

That should work. Good luck.

+1

Check the namenode and datanode logs (they should be in $HADOOP_HOME/logs/). Most likely the problem is that the namenode and datanode namespace IDs do not match. Delete the hadoop.tmp.dir contents on all nodes and format the namenode again ($HADOOP_HOME/bin/hadoop namenode -format), then try again.
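
A rough sequence for that approach (note that it wipes everything in HDFS; the directory below is the hadoop.tmp.dir from the question's config, with ${user.name} assumed to be hadoop):

 # On the master: stop the whole cluster
 bin/stop-all.sh

 # On every node: clear out hadoop.tmp.dir
 rm -rf /var/tmp/hadoop/hadoop-hadoop/*

 # On the master: reformat the namenode and start again
 bin/hadoop namenode -format
 bin/start-all.sh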

0

I think the problem is in slave 2: slave 2 should listen on the same port, 8020, instead of listening on 8021.

0

Add the new node's hostname to the slaves file and start the datanode and tasktracker on the new node.
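
For example, with the layout from the question (commands run from $HADOOP_HOME; hadoopnode2 stands in for the new node):

 # On the master: list the new node in conf/slaves so start-all.sh knows about it
 echo hadoopnode2 >> conf/slaves

 # On the new node: start the DataNode and TaskTracker daemons
 bin/hadoop-daemon.sh start datanode
 bin/hadoop-daemon.sh start tasktracker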

0

Actually, there are two problems in your case.

 can't connect to hadoop master node from slave 

This is a network problem. Check it with: curl 192.168.1.120:8020

The normal response is: curl: (52) Empty reply from server

In my case I got a "host not found" error, so just take a look at your firewall settings.
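
If curl is not installed, roughly the same checks can be done with standard tools; it is also worth confirming that the NameNode is listening on the LAN address rather than only on loopback (a sketch):

 # On the master: which address is port 8020 bound to?
 sudo netstat -tlnp | grep 8020    # want 192.168.1.120:8020 or 0.0.0.0:8020, not 127.0.0.1:8020

 # From the slave: is the port reachable? (assumes the telnet client is installed)
 telnet hadoopnode1 8020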

 data node down: 

For this problem, Raze2dust's method is good. Here is another way, if you see the "Incompatible namespaceIDs" error in your log:

Stop Hadoop, edit the value of namespaceID in the datanode's .../current/VERSION file to match the value of the current namenode, then start Hadoop again.

You can always check the available datanodes using: hadoop fsck /
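
For example, hadoop dfsadmin -report is another way to list every datanode and whether it is live or dead:

 hadoop fsck /
 hadoop dfsadmin -report    # per-datanode capacity and last contact, plus live/dead counts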

0
