What does the hadoop namenode -format command do?

I am trying to learn Hadoop by following the tutorial and setting up pseudo-distributed mode on my machine.

My core-site.xml file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
        <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
      </property>
    </configuration>

My hdfs-site.xml file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>The actual number of replications can be specified when the file is created.</description>
      </property>
    </configuration>

My mapred-site.xml file:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
        <description>The host and port that the MapReduce job tracker runs at.</description>
      </property>
    </configuration>

When I run the command below, it completes successfully, but I do not understand what it actually does:

    hadoop-1.2.1$ bin/hadoop namenode -format
    14/11/26 12:37:16 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = myhost/127.0.0.8
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 1.2.1
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
    STARTUP_MSG:   java = 1.6.0_45
    ************************************************************/
    14/11/26 12:37:17 INFO util.GSet: Computing capacity for map BlocksMap
    14/11/26 12:37:17 INFO util.GSet: VM type = 64-bit
    14/11/26 12:37:17 INFO util.GSet: 2.0% max memory = 932118528
    14/11/26 12:37:17 INFO util.GSet: capacity = 2^21 = 2097152 entries
    14/11/26 12:37:17 INFO util.GSet: recommended=2097152, actual=2097152
    14/11/26 12:37:17 INFO namenode.FSNamesystem: fsOwner=myuser
    14/11/26 12:37:17 INFO namenode.FSNamesystem: supergroup=supergroup
    14/11/26 12:37:17 INFO namenode.FSNamesystem: isPermissionEnabled=true
    14/11/26 12:37:17 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
    14/11/26 12:37:17 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
    14/11/26 12:37:17 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
    14/11/26 12:37:17 INFO namenode.NameNode: Caching file names occuring more than 10 times
    14/11/26 12:37:17 INFO common.Storage: Image file /tmp/hadoop-myuser/dfs/name/current/fsimage of size 115 bytes saved in 0 seconds.
    14/11/26 12:37:18 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
    14/11/26 12:37:18 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-myuser/dfs/name/current/edits
    14/11/26 12:37:18 INFO common.Storage: Storage directory /tmp/hadoop-myuser/dfs/name has been successfully formatted.
    14/11/26 12:37:18 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at chaitanya-OptiPlex-3010/127.0.0.8
    ************************************************************/

Can someone please explain what it is doing internally?

I went through the posts below, but found no clear explanation:

What is overlay formatting?

hadoop namenode is not formatted

How can I try this out in practice on my machine, so that I can see the differences before and after the command? I am new to Hadoop, so this may be a trivial question.

+8
hadoop
5 answers

Hadoop namenode -format

  • The Hadoop namenode directory contains the fsimage and edits files, which hold the basic metadata about the file system, such as which files users have created and where their data is stored.

  • If you format the namenode, that information is deleted from the namenode directory, which is specified in hdfs-site.xml as dfs.namenode.name.dir (dfs.name.dir in Hadoop 1.x).

  • The data on the datanodes is still there afterwards, but the metadata that describes it is gone.
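
One way to watch this happen on your own machine is to snapshot the namenode directory before and after the format and diff the two listings. The following is only a minimal sketch: snapshot_dir and diff_around are hypothetical helper names (they are not part of Hadoop), and the path in the example comment assumes the default location that appears in the question's log.

```shell
# Hypothetical helpers, not part of Hadoop.

# Print "checksum size path" for every file under a directory, sorted by path.
snapshot_dir() {
    ( cd "$1" && find . -type f -exec cksum {} + | sort -k3 )
}

# Run a command and show how it changed the files under a directory.
# Usage: diff_around <directory> <command> [args...]
diff_around() {
    dir=$1
    shift
    before=$(mktemp)
    after=$(mktemp)
    snapshot_dir "$dir" > "$before"
    "$@"                              # the command under observation
    snapshot_dir "$dir" > "$after"
    diff "$before" "$after"           # exits 0 if identical, 1 if different
    status=$?
    rm -f "$before" "$after"
    return $status
}

# Example (assumed default Hadoop 1.x name directory, run from hadoop-1.2.1/):
#   diff_around /tmp/hadoop-myuser/dfs/name bin/hadoop namenode -format
```

Lines prefixed with "<" in the output existed before the format and lines prefixed with ">" exist after it, so you would expect to see files such as current/fsimage, current/edits and current/VERSION change. If the directory has already been formatted once, the format command may ask for confirmation before it overwrites anything.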

+6

The hadoop namenode -format command deletes all files in your HDFS.

The tmp directory on the local file system contains two directories, one for the datanode and one for the namenode. If you format the namenode, the namenode folder is wiped and re-initialized; the datanode folder is left as it is.

Note: if you want to format your namenode, first stop all Hadoop services, then delete the tmp folder (which contains the namenode and datanode directories) on the local file system, run the format, and start the Hadoop services again. The change should then take effect.

Reason for formatting the namenode:

The Hadoop NameNode is the central point of the HDFS file system: it stores the directory tree of all files in the file system and tracks where in the cluster the file data is stored. In short, it keeps the metadata that describes the data on the datanodes. Formatting the namenode erases that metadata, so all information about the data on the datanodes is lost and the datanodes can be reused for new data.

By default, the namenode location is "/tmp/hadoop-myuser/dfs/name".

When you format the namenode, this location is cleared.

To change the namenode and datanode locations, add the following properties to hdfs-site.xml (these are the Hadoop 2.x property names; Hadoop 1.x uses dfs.name.dir and dfs.data.dir):

    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/search/data/dfs/namenode</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/search/data/dfs/datanode</value>
    </property>

Hope this helps you .. :-)

+7

The namenode contains the metadata about the Hadoop file system.

This command (hadoop-1.2.1$ bin/hadoop namenode -format) will format the entire Hadoop Distributed File System (HDFS). Therefore, if you run it on an existing file system, you will lose all your data.

+2

In fact, formatting the namenode does not format the datanode.

It only formats the contents of your namenode (which holds the metadata about the data on the datanodes). Your namenode will no longer know where your data is. Also, namenode -format assigns a new namespaceID to the namenode.

You must then change the namespaceID in your datanode to match, otherwise the datanode will not work. It is stored in the file dfs/data/current/VERSION.

There is a JIRA open for exactly this, formatting the datanodes when the namenode is formatted: HDFS-107.
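
To check whether the two IDs have drifted apart after a format, you can compare the namespaceID lines in the two VERSION files. This is a rough sketch under a few assumptions: the function names are made up for illustration, the VERSION files are assumed to be plain key=value text (as they are in Hadoop 1.x), and the paths in the example comment assume the default /tmp/hadoop-myuser location from the question.

```shell
# Hypothetical helpers, not part of Hadoop.

# Print the namespaceID recorded in a storage VERSION file.
# Assumes key=value lines such as "namespaceID=123456789".
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Print both IDs and succeed only if they are present and equal.
# Usage: same_namespace <namenode VERSION file> <datanode VERSION file>
same_namespace() {
    nn_id=$(namespace_id "$1")
    dn_id=$(namespace_id "$2")
    echo "namenode=$nn_id datanode=$dn_id"
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}

# Example (assumed default Hadoop 1.x locations):
#   same_namespace /tmp/hadoop-myuser/dfs/name/current/VERSION \
#                  /tmp/hadoop-myuser/dfs/data/current/VERSION
```

If the function fails right after a format, the datanode is still carrying the old namespaceID, which matches the situation described above.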

+2

Steps: start all services using "start-all.sh".

Verify that the services are running using "jps". Note: if you use Hadoop 2.3.0, the following services should start:

    NameNode
    DataNode
    ResourceManager
    NodeManager

Copy a file from the local file system to HDFS using hdfs dfs -put <local file> /

Now look in the datanode location "/tmp/hadoop-myuser/dfs/data"; you will find the file there, split into blocks of up to 64 MB each (the Hadoop 1.x default block size; Hadoop 2.x defaults to 128 MB).

Then format with **hadoop namenode -format**. The file is now no longer accessible through HDFS.


0