Here is an explanation: https://notes.mindprince.in/2014/08/01/difference-between-s3-block-and-s3-native-filesystem-on-hadoop.html
Hadoop 0.10.0 (HADOOP-574) introduced the first Hadoop file system with S3 support. It was called the S3 block file system and was assigned the s3:// URI scheme. In this implementation, files are stored as blocks, just as they are in HDFS. Files stored in this file system are not interoperable with other S3 tools: if you go to the AWS console and try to find files written by this file system, you will not find them; instead you will find objects with names like block_-1212312341234512345 and so on.
To overcome these limitations, another S3-backed file system was introduced in Hadoop 0.18.0 (HADOOP-930). It was called the S3 native file system and was assigned the s3n:// URI scheme. This file system lets you access files on S3 that were written with other tools ... When this file system was introduced, S3 had a 5 GB file size limit, so it could only handle files smaller than 5 GB. At the end of 2010, Amazon ... raised the file size limit from 5 GB to 5 TB ...
Using the S3 block file system is no longer recommended. Various Hadoop-as-a-service providers, such as Qubole and Amazon EMR, go as far as mapping both the s3:// and s3n:// URIs to the S3 native file system to ensure this.
Therefore, always use the native file system. The 5 GB limit no longer applies. Sometimes you may have to type s3:// instead of s3n://, but just make sure that any files you create are visible in the S3 bucket browser.
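As a rough illustration, here is a minimal Java sketch that lists a directory through the S3 native file system on an older Hadoop version. The bucket name, path, and credential placeholders are assumptions for the example only; because s3n:// writes ordinary S3 objects, anything listed here is also visible to other S3 tools.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListS3nExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Credentials for the S3 native file system (s3n://) on older Hadoop
            // versions; replace the placeholders with your own keys.
            conf.set("fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY");
            conf.set("fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY");

            // "my-bucket" and "data/" are hypothetical names; the objects listed
            // are plain S3 objects, not HDFS-style blocks.
            FileSystem fs = FileSystem.get(URI.create("s3n://my-bucket/"), conf);
            for (FileStatus status : fs.listStatus(new Path("s3n://my-bucket/data/"))) {
                System.out.println(status.getPath());
            }
        }
    }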
Also see http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-plan-file-systems.html .
Amazon EMR previously used the S3 Native FileSystem with the s3n URI. Although this still works, we recommend using the s3 URI scheme for maximum performance, security, and reliability.
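A sketch of what that looks like on an EMR cluster, under the assumption that the bucket and object names are placeholders; on EMR the s3:// scheme is handled by EMRFS and credentials normally come from the instance role, so nothing is set in code:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class EmrS3Example {
        public static void main(String[] args) throws Exception {
            // The same FileSystem API works against the s3:// scheme on EMR;
            // "my-bucket" and the object key below are hypothetical.
            FileSystem fs = FileSystem.get(URI.create("s3://my-bucket/"), new Configuration());
            System.out.println(fs.exists(new Path("s3://my-bucket/data/part-00000")));
        }
    }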
It also says that you can use s3bfs:// to access the old block file system, formerly known as s3:// .
osa Jun 03 '16 at 16:44