What is the difference between the hadoop fs shell commands and the hdfs dfs shell commands?

Are they equal?

But why did "hadoop fs" show HDFS files, while "hdfs dfs" showed local files?

Here is the version information for Hadoop:

Hadoop 2.0.0-mr1-cdh4.2.1
Subversion git://ubuntu-slave07.jenkins.cloudera.com/var/lib/jenkins/workspace/CDH4.2.1-Packaging-MR1/build/cdh4/mr1/2.0.0-mr1-cdh4.2.1/source -r
Compiled by jenkins on Mon Apr 22 10:48:26 PDT 2013

+56
hadoop hdfs
Aug 09 '13 at 8:37
6 answers

Below are three commands that look similar but have slight differences:

  • hadoop fs {args}
  • hadoop dfs {args}
  • hdfs dfs {args}

     hadoop fs <args> 

FS refers to a generic file system that can point to any file system, such as the local FS, HDFS, etc. Thus, it can be used when you are dealing with different file systems, such as the Local FS, HFTP FS, S3 FS, and others.
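For example, the same fs command can target different stores just by changing the URI scheme. A sketch (the host name, bucket, and paths are placeholders, and the s3a scheme assumes an S3 connector is configured):

```shell
# hadoop fs works against whatever file system the URI scheme selects.
hadoop fs -ls file:///tmp                      # local file system
hadoop fs -ls hdfs://namenodehost/user/alice   # HDFS, explicitly
hadoop fs -ls s3a://my-bucket/logs/            # S3, via the s3a connector
hadoop fs -ls /user/alice                      # default FS from configuration
```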

  hadoop dfs <args> 

dfs is very specific to HDFS; it only works with HDFS. It has been deprecated, and we should use hdfs dfs instead.

  hdfs dfs <args> 

same as the 2nd: it works for all operations related to HDFS, and is the recommended command to use instead of hadoop dfs

Below is the list of commands classified as HDFS commands:

  **#hdfs commands** namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups 

So, even if you use hadoop dfs, it will look for hdfs and delegate the command to hdfs dfs.
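On a Hadoop 2.x installation, this delegation is visible in the deprecation warning that hadoop dfs prints before running the command (the exact warning text may vary between releases):

```shell
# 'hadoop dfs' still works, but warns and delegates to the hdfs script:
hadoop dfs -ls /
# DEPRECATED: Use of this script to execute hdfs command is deprecated.
# Instead use the hdfs command for it.

# The recommended equivalent:
hdfs dfs -ls /
```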

+68
Jun 25 '14 at 8:49

From what I can tell, there is no difference between hdfs dfs and hadoop fs . They are simply different naming conventions based on the version of Hadoop you are using. For example, the docs for 1.2.1 use hdfs dfs , while 0.19 uses hadoop fs . Note that the individual commands are described verbatim in both. They are used identically.

Also note that both commands can refer to different file systems depending on the scheme you specify (hdfs, file, s3, etc.). If no file system is specified, they fall back to the default specified in your configuration.
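That default lives in core-site.xml. A minimal sketch, assuming Hadoop 2.x (the property name fs.defaultFS is the 2.x spelling; 1.x releases used fs.default.name; the host and port are placeholders):

```xml
<!-- core-site.xml: default file system for unqualified paths.
     Hadoop 2.x property name; 1.x used fs.default.name instead.
     Host and port are placeholders. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenodehost:8020</value>
  </property>
</configuration>
```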

You are using Hadoop 2.0.0, and it looks like ( based on the 2.0.5 documentation ) that hadoop fs is the form used in the alpha versions, and it is set to use HDFS as the default scheme in your configuration. The hdfs dfs command may be left over from earlier, and since it is not specified in the configuration, it may simply be defaulting to the local file system.

So I would just stick with hadoop fs and not worry too much, since they are identical according to the documentation.

+5
Aug 09 '13 at 16:16

fs refers to any file system, whether local or HDFS, but dfs refers only to the HDFS file system. Therefore, if you need to access or transfer data between different file systems, fs is the way to go.

+4
Aug 09 '13 at 8:45

FS refers to a generic file system that can point to any file system, such as the local FS, HDFS, etc., but dfs is very specific to HDFS. Therefore, when we use fs, it can perform operations to or from either the local or the distributed file system; specifying dfs, however, always refers to HDFS.

The following are excerpts from the Hadoop documentation, which describes the two as distinct shells.

FS Shell: The FileSystem (FS) shell is invoked by bin/hadoop fs. All FS shell commands accept path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local file system the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory, such as /parent/child, can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (assuming your configuration is set to hdfs://namenodehost). Most commands in the FS shell behave like the corresponding Unix commands.

DFSShell: The HDFS shell is invoked by bin/hadoop dfs. All HDFS shell commands accept path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local file system the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory, such as /parent/child, can be specified as hdfs://namenode:namenodeport/parent/child, or simply as /parent/child (assuming your configuration is set to namenode:namenodeport). Most commands in the HDFS shell behave like the corresponding Unix commands.

So, from the above, we can conclude that it all depends on the configured scheme. When using these two commands with an absolute URI, i.e. scheme://a/b, the behavior is identical. It is only the default configured scheme value, file for fs and hdfs for dfs respectively, that causes the difference in behavior.
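A command fragment illustrating this (namenodehost is a placeholder, and the commands assume a running cluster):

```shell
# With absolute URIs the two shells behave identically:
hadoop fs -ls hdfs://namenodehost/parent/child
hdfs dfs -ls hdfs://namenodehost/parent/child
hadoop fs -ls file:///tmp        # local FS, reachable from either shell

# Without a scheme, each falls back to its configured default, which is
# the only source of the difference the question observed:
hadoop fs -ls /parent/child
hdfs dfs -ls /parent/child
```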

+3
Oct 17 '15 at 13:34

fs = file system
dfs = distributed file system

fs = other file systems + distributed file systems

FS refers to a generic file system that can point to any file system, such as the local FS, HDFS, etc., but dfs is very specific to HDFS. Therefore, when we use fs, it can perform operations to or from either the local or the distributed file system; specifying dfs, however, always refers to HDFS.

It all depends on the configured scheme. When using these two commands with an absolute URI, i.e. scheme://a/b, the behavior is identical. It is only the default configured scheme value, file for fs and hdfs for dfs respectively, that causes the difference in behavior.

+2
Sep 03 '17 at 2:42


https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html

The File System (FS) shell includes various shell-like commands that interact directly with the Hadoop Distributed File System (HDFS), as well as other file systems supported by Hadoop, such as the Local FS, WebHDFS, S3 FS, and others.

bin/hadoop fs <args>

All FS shell commands accept path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the local FS the scheme is file. The scheme and authority are optional. If not specified, the default scheme specified in the configuration is used. An HDFS file or directory, such as /parent/child, can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that your configuration is set to hdfs://namenodehost).

Most commands in the FS shell behave like the corresponding Unix commands. Differences are described with each of the commands. Error information is sent to stderr, and the output is sent to stdout.
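Because of this, the FS shell composes with ordinary Unix tooling. A sketch (the paths are placeholders, and the commands assume a configured cluster):

```shell
# stdout and stderr can be separated like with any Unix command:
hadoop fs -ls /parent/child > listing.txt 2> errors.txt

# The exit status is non-zero on failure, so it works in scripts:
if hadoop fs -test -e /parent/child; then
  echo "path exists"
fi
```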

If you are using HDFS,

hdfs dfs

is a synonym for hadoop fs.

0
Aug 28 '17 at 1:26


