How to start an HDFS cluster without DNS

I am setting up a local HDFS dev environment (actually Hadoop + Mesos + ZooKeeper + Kafka) to support Spark development and local integration testing. All the other components work fine, but I am having problems with HDFS. When a datanode tries to connect to the namenode, I get a DisallowedDatanodeException:

 org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode 

Most questions about this problem boil down to name resolution of the datanodes at the namenode, either statically through /etc/hosts files or via DNS. Static resolution is not an option with Docker, since I don't know the datanodes when the namenode container is created. I would like to avoid creating and maintaining an additional DNS service. Ideally, I would like to wire everything together using Docker's --link feature (see the example below).
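
For illustration, here is roughly the kind of wiring I have in mind (the image and container names are made up):

    docker run -d --name namenode my-hdfs-namenode
    docker run -d --name datanode1 --link namenode:namenode my-hdfs-datanode

The problem is that --link only injects the namenode's address into the datanode's /etc/hosts; the namenode still has no way to resolve the datanode's hostname when it registers.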

Is there a way to configure HDFS to use only IP addresses?

I found this property and set it to false, but this did not help:

dfs.namenode.datanode.registration.ip-hostname-check (default: true)
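
For reference, that property goes in hdfs-site.xml; I set it like this:

    <property>
      <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
      <value>false</value>
    </property>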

Is there a way to make a multi-node local HDFS cluster work using only IP addresses, without DNS?

1 answer

I would look at reconfiguring your Docker image to use a different hosts file [1]. In particular:

  • In your Dockerfile, do the hosts-file switch-a-roo [1] (see the sketch after this list)
  • Bring up the master node (the namenode)
  • Bring up the datanodes, linked to the master
  • Before starting the datanode, copy /etc/hosts to a new location, /tmp/hosts
  • Append the master node's name and IP to the new hosts file

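A rough sketch of what that could look like, assuming a Debian/Ubuntu-based image; the library path, variable names, and start command are illustrative and may need adjusting for your setup:

    # In the Dockerfile: the "switch-a-roo" from [1] patches libnss_files so the
    # resolver reads /tmp/hosts (writable) instead of /etc/hosts (managed by Docker)
    RUN mkdir -p /lib-override && \
        cp /lib/x86_64-linux-gnu/libnss_files.so.2 /lib-override && \
        perl -pi -e 's:/etc/hosts:/tmp/hosts:g' /lib-override/libnss_files.so.2
    ENV LD_LIBRARY_PATH /lib-override

    # In the datanode start script: build the new hosts file, then start the daemon.
    # NAMENODE_IP and NAMENODE_HOST are placeholders; with --link they can be read
    # from the environment variables Docker injects.
    cp /etc/hosts /tmp/hosts
    echo "$NAMENODE_IP $NAMENODE_HOST" >> /tmp/hosts
    $HADOOP_PREFIX/sbin/hadoop-daemon.sh start datanode
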
Hope this works for you!

[1] https://github.com/dotcloud/docker/issues/2267#issuecomment-40364340
