AWS S3 - listing all objects within a folder without a prefix

I'm having trouble retrieving all the objects (file names) inside a folder in AWS S3. Here is my code:

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucket)
        .withPrefix(folderName + "/")
        .withMarker(folderName + "/")

    ObjectListing objectListing = amazonWebService.s3.listObjects(listObjectsRequest)

    for (S3ObjectSummary summary : objectListing.getObjectSummaries()) {
        print summary.getKey()
    }

It returns the correct objects, but each key includes the prefix, for example foldername/filename.

I know that I can strip the prefix in plain Java, for instance with substring, but I wanted to know whether the AWS SDK has a method for this.

+6
4 answers

No. The linked page lists all the available methods. The reason is the design of S3: it has no subfolders. Instead, a bucket is simply a flat list of files, where each file name is the "prefix" plus the desired file name. The graphical interface displays the data as if it were stored in Windows-style "folders", but S3 itself has no folder logic.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3ObjectSummary.html
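To make the "flat list" idea concrete, here is a minimal, self-contained sketch of how a folder view could be derived from plain keys. It does not call the AWS SDK; the class name FlatKeys, the method children, and the sample keys are all made up for illustration. As far as I know, S3 offers a similar grouping on the server side when a listing request sets a delimiter, but the code below only models the idea.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper (not part of the AWS SDK): derives a "folder" view from flat keys.
final class FlatKeys {

    /** Returns the immediate children of prefix; "subfolders" end with '/'. */
    static SortedSet<String> children(Collection<String> keys, String prefix) {
        SortedSet<String> result = new TreeSet<>();
        for (String key : keys) {
            if (!key.startsWith(prefix) || key.length() == prefix.length()) {
                continue; // outside the "folder", or the folder placeholder key itself
            }
            String rest = key.substring(prefix.length());
            int slash = rest.indexOf('/');
            // Keep file names whole; collapse deeper keys to their first path segment.
            result.add(slash < 0 ? rest : rest.substring(0, slash + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "photos/", "photos/a.jpg", "photos/2020/b.jpg", "docs/readme.txt");
        System.out.println(children(keys, "photos/")); // prints [2020/, a.jpg]
    }
}
```

Nothing in the key list itself is a folder; the hierarchy only appears once the keys are grouped by a prefix and a "/" separator.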

Your best bet is to split the key on "/" and take the last element of the resulting array.
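A minimal sketch of that approach in plain Java, assuming the keys come from summary.getKey() as in the question. The class S3KeyNames and its two methods are hypothetical helpers, not SDK methods. The first takes the last path segment; the second removes only the listing prefix, which keeps any deeper "subfolder" part of the key.

```java
// Hypothetical helpers (not part of the AWS SDK) for trimming S3 object keys.
final class S3KeyNames {

    /** Returns the part of the key after the last '/', or the whole key if it has none. */
    static String fileName(String key) {
        // lastIndexOf returns -1 when there is no '/', so substring(0) keeps the whole key.
        return key.substring(key.lastIndexOf('/') + 1);
    }

    /** Returns the key relative to prefix, or the key unchanged if it does not start with prefix. */
    static String stripPrefix(String key, String prefix) {
        return key.startsWith(prefix) ? key.substring(prefix.length()) : key;
    }

    public static void main(String[] args) {
        String prefix = "folderName/";
        String[] keys = {"folderName/a.txt", "folderName/sub/b.txt"};
        for (String key : keys) {
            System.out.println(fileName(key) + " | " + stripPrefix(key, prefix));
        }
    }
}
```

Note that fileName returns an empty string for a key that ends in "/", such as a folder placeholder, so you may want to skip those keys.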

+5

For Scala developers, here is a recursive function that performs a full scan and map of the contents of an Amazon S3 bucket using the official AWS SDK for Java:

    import com.amazonaws.services.s3.AmazonS3Client
    import com.amazonaws.services.s3.model.{S3ObjectSummary, ObjectListing, GetObjectRequest}
    import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}

    def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

      def scan(acc: List[T], listing: ObjectListing): List[T] = {
        val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
        val mapped = (for (summary <- summaries) yield f(summary)).toList

        if (!listing.isTruncated) mapped.toList
        else scan(acc ::: mapped, s3.listNextBatchOfObjects(listing))
      }

      scan(List(), s3.listObjects(bucket, prefix))
    }

To call the curried map() function above, pass an already-built (and correctly initialized) AmazonS3Client object (refer to the official AWS SDK for Java API reference), the bucket name, and the prefix in the first parameter list. Then pass the function f() that you want to apply to each object summary in the second parameter list.

For instance,

    map(s3, bucket, prefix) { s => println(s.getKey.split("/")(1)) }

prints all the file names without the prefix (assuming a single-level prefix),

    val tuple = map(s3, bucket, prefix)(s => (s.getKey, s.getOwner, s.getSize))

returns the full list of (key, owner, size) tuples in that bucket/prefix, and

    val totalSize = map(s3, "bucket", "prefix")(s => s.getSize).sum

returns the total size of its contents (note the additional sum() folding function applied at the end of the expression ;-)

You can combine map() with many other functions, as you normally would with monads in functional programming.

+3

Following up on the answer above ("a recursive function that performs a full scan and map"): there is a bug in that code, as pointed out by @Eric, which shows up when the bucket contains more than 1000 keys. The fix is actually quite simple: on the last page, the mapped list has to be combined with acc instead of being returned on its own.

    def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

      def scan_s3_bucket(acc: List[T], listing: ObjectListing): List[T] = {
        val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
        val mapped = (for (summary <- summaries) yield f(summary)).toList

        if (!listing.isTruncated) {
          acc ::: mapped.toList
        } else {
          println("list extended, more to go: new_keys '%s', current_length '%s'".format(mapped.length, acc.length))
          scan_s3_bucket(acc ::: mapped, s3.listNextBatchOfObjects(listing))
        }
      }

      scan_s3_bucket(List(), s3.listObjects(bucket, prefix))
    }

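
For Java readers, the same accumulate-across-pages idea can be sketched as a plain loop. This is a self-contained model, not SDK code: the Page class below is a hypothetical stand-in for ObjectListing, with next playing the role of the listing that s3.listNextBatchOfObjects(listing) would return, and the names PagedScan and mapAll are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Self-contained model of a paged listing; none of these types come from the AWS SDK.
final class PagedScan {

    /** Hypothetical stand-in for one page of results. */
    static final class Page {
        final List<String> keys;
        final Page next; // null when the listing is not truncated

        Page(List<String> keys, Page next) {
            this.keys = keys;
            this.next = next;
        }

        boolean isTruncated() {
            return next != null;
        }
    }

    /** Applies f to every key on every page and returns the results from all pages. */
    static <T> List<T> mapAll(Page first, Function<String, T> f) {
        List<T> acc = new ArrayList<>();
        Page page = first;
        while (true) {
            for (String key : page.keys) {
                acc.add(f.apply(key));
            }
            if (!page.isTruncated()) {
                return acc; // return everything accumulated, not just the last page
            }
            page = page.next;
        }
    }
}
```

Returning only the last page's results, as the original Scala version did, would silently drop every earlier page as soon as the listing spans more than one page.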
+2

This code lists the contents of a subdirectory of my bucket.

Example: "Testing" is the name of my bucket. It contains the "kdblue@gmail.com" folder, which in turn contains the "IMAGE" folder, which holds the image files.

    import java.util.ArrayList;
    import com.amazonaws.services.s3.model.ListObjectsRequest;
    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;

    ArrayList<String> transferRecord = new ArrayList<>();

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
            .withBucketName(Constants.BUCKET_NAME)
            .withPrefix("kdblue@gmail.com" + "/IMAGE");

    ObjectListing objects = s3.listObjects(listObjectsRequest);
    while (true) {
        // Collect the key of every object on the current page.
        for (S3ObjectSummary summary : objects.getObjectSummaries()) {
            transferRecord.add(summary.getKey());
        }
        // Stop once S3 reports that there are no more pages.
        if (!objects.isTruncated()) {
            break;
        }
        objects = s3.listNextBatchOfObjects(objects);
    }

Hope this helps you.

0