AWS S3 - listing all objects within a folder without a prefix

Question

I'm having trouble retrieving all the objects (file names) inside a folder in AWS S3. Here is my code:

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucket)
        .withPrefix(folderName + "/")
        .withMarker(folderName + "/")

    ObjectListing objectListing = amazonWebService.s3.listObjects(listObjectsRequest)

    for (S3ObjectSummary summary : objectListing.getObjectSummaries()) {
        print summary.getKey()
    }

It returns the correct objects, but each key includes the prefix, for example folderName/fileName.

I know I could strip the prefix in plain Java, for example with substring, but I wanted to know whether the AWS SDK has a method for this.

+6

java amazon-s3 amazon-web-services

Marz Apr 22 '14 at 11:19

4 answers

For Scala developers, here is a recursive function that performs a full scan and map of the contents of an Amazon S3 bucket using the official AWS SDK for Java:

    import com.amazonaws.services.s3.AmazonS3Client
    import com.amazonaws.services.s3.model.{S3ObjectSummary, ObjectListing, GetObjectRequest}
    import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}

    def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

      def scan(acc: List[T], listing: ObjectListing): List[T] = {
        val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
        val mapped = (for (summary <- summaries) yield f(summary)).toList

        if (!listing.isTruncated) mapped.toList
        else scan(acc ::: mapped, s3.listNextBatchOfObjects(listing))
      }

      scan(List(), s3.listObjects(bucket, prefix))
    }

To call the curried map() function above, pass an already constructed and properly initialized AmazonS3Client object (refer to the official AWS SDK for Java API reference), the bucket name, and the prefix in the first parameter list. In the second parameter list, pass the function f() that you want to apply to each object summary.

For instance,

 map(s3, bucket, prefix) { s => println(s.getKey.split("/")(1)) }

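prints all file names without the prefix (this assumes the prefix is a single path segment, since it takes the second element of the split),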

 val tuple = map(s3, bucket, prefix)(s => (s.getKey, s.getOwner, s.getSize))

returns the full list of (key, owner, size) tuples under that bucket and prefix, and

 val totalSize = map(s3, "bucket", "prefix")(s => s.getSize).sum

returns the total size of the contents (note the additional folding function sum() applied at the end of the expression ;-)

You can combine map() with many other functions, as you usually would with monads in functional programming.

+3

Paolo angioletti Jun 05 '14 at 13:13

Following up on the answer above (the recursive function that performs a full scan and map): there is a bug in that code, as pointed out by @Eric, when the bucket contains more than 1000 keys. The fix is quite simple: on the last page, the mapped list must be concatenated with acc instead of being returned on its own.

    def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

      def scan_s3_bucket(acc: List[T], listing: ObjectListing): List[T] = {
        val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
        val mapped = (for (summary <- summaries) yield f(summary)).toList

        if (!listing.isTruncated) {
          acc ::: mapped.toList
        } else {
          println("list extended, more to go: new_keys '%s', current_length '%s'".format(mapped.length, acc.length))
          scan_s3_bucket(acc ::: mapped, s3.listNextBatchOfObjects(listing))
        }
      }

      scan_s3_bucket(List(), s3.listObjects(bucket, prefix))
    }
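
The fix comes down to carrying the accumulator through every page of results. Here is a minimal Java sketch of the same idea. It assumes a hypothetical in-memory Page class as a stand-in for ObjectListing (Page and PagedScan are illustrative names, not part of the AWS SDK), so it runs without AWS access:

```java
import java.util.ArrayList;
import java.util.List;

final class PagedScan {

    // Hypothetical stand-in for ObjectListing: one page of keys plus a link to the next page.
    static final class Page {
        final List<String> keys;
        final Page next; // null when the listing is not truncated

        Page(List<String> keys, Page next) {
            this.keys = keys;
            this.next = next;
        }

        boolean isTruncated() {
            return next != null;
        }
    }

    // Collects the keys of every page into one list, instead of keeping only the last page.
    static List<String> collectAll(Page first) {
        List<String> acc = new ArrayList<>();
        for (Page page = first; page != null; page = page.next) {
            acc.addAll(page.keys);
        }
        return acc;
    }
}
```

With the real SDK, the loop would presumably keep calling s3.listNextBatchOfObjects(listing) for as long as listing.isTruncated() returns true, adding each page to the same accumulator.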

+2

Joshua Jun 29 '16 at 6:47

This code helps me list the objects in a subdirectory of my bucket.

Example: "Testing" is the name of my bucket. It contains the "kdblue@gmail.com" folder, which in turn contains the "IMAGE" folder, which holds the image files.

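    import java.util.ArrayList;
    import java.util.List;
    import com.amazonaws.services.s3.model.ListObjectsRequest;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    // Assumes s3 is an initialized AmazonS3 client and Constants.BUCKET_NAME holds the bucket name.
    ArrayList<String> transferRecord = new ArrayList<>();

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
            .withBucketName(Constants.BUCKET_NAME)
            .withPrefix("kdblue@gmail.com" + "/IMAGE");

    ObjectListing objects = s3.listObjects(listObjectsRequest);
    for (;;) {
        List<S3ObjectSummary> summaries = objects.getObjectSummaries();
        if (summaries.size() < 1) {
            break; // an empty page means there is nothing left to read
        }
        for (int i = 0; i < summaries.size(); i++) {
            // Store each key directly: the list holds strings, not nested lists.
            transferRecord.add(summaries.get(i).getKey());
        }
        objects = s3.listNextBatchOfObjects(objects);
    }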

Hope this helps you.

0

kdblue Mar 05 '18 at 16:13

Dan Ciborowski - MSFT · Accepted Answer · 2014-04-22T14:49:37+0000

No. The link below lists all the available methods. The reason is the design of S3: S3 does not have subfolders. Instead, a bucket is simply a flat list of files, where each file name (key) is the "prefix" plus the file name you want. The graphical interface displays the data as if it were stored in Windows-style "folders", but S3 itself has no folder logic.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3ObjectSummary.html
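
To illustrate the point about folders, here is a rough sketch of how a "folder" view could be derived from a flat list of keys. It works on an in-memory list rather than calling S3, and FlatKeyView, files and folders are hypothetical names chosen for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class FlatKeyView {

    // Keys that sit directly under the prefix, with no further "/" after it.
    static List<String> files(List<String> keys, String prefix) {
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            if (key.startsWith(prefix)
                    && key.length() > prefix.length()
                    && key.indexOf('/', prefix.length()) < 0) {
                out.add(key);
            }
        }
        return out;
    }

    // Distinct "folders" directly under the prefix: each key is cut after the next "/".
    static List<String> folders(List<String> keys, String prefix) {
        TreeSet<String> out = new TreeSet<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            int slash = key.indexOf('/', prefix.length());
            if (slash >= 0) {
                out.add(key.substring(0, slash + 1));
            }
        }
        return new ArrayList<>(out);
    }
}
```

Nothing in the key list marks "photos/2014/" as a folder. It only shows up because several keys happen to share that leading text.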

Your best bet is to split the key on "/" and take the last element of the resulting array.
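
A minimal sketch of that approach in plain Java follows. S3KeyNames, baseName and stripPrefix are hypothetical helper names, not SDK methods. The sketch uses lastIndexOf instead of building an array with split, which gives the same result for keys that do not end in "/":

```java
final class S3KeyNames {

    // Returns the part of the key after the last "/", or the whole key if it contains no "/".
    static String baseName(String key) {
        return key.substring(key.lastIndexOf('/') + 1);
    }

    // Removes a leading prefix such as "folderName/" when present; otherwise returns the key unchanged.
    static String stripPrefix(String key, String prefix) {
        return key.startsWith(prefix) ? key.substring(prefix.length()) : key;
    }

    public static void main(String[] args) {
        System.out.println(baseName("folderName/sub/b.txt"));
        System.out.println(stripPrefix("folderName/sub/b.txt", "folderName/"));
    }
}
```

baseName drops every folder level, while stripPrefix keeps any nested path below the listed prefix. Which one fits depends on whether the "folder" has subfolders. Note that a placeholder key such as "folderName/" yields an empty string from baseName.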
