Delete all files in a folder or with a given prefix in a Google Cloud Storage bucket with Java

I know that the idea of "folders" doesn't really exist in Google Cloud Storage, but I need a way to delete all objects in a "folder", or with a given prefix, from Java.

GcsService has a delete function, but as far as I can tell it only accepts a single GcsFilename object and does not accept wildcards (i.e. "folderName/**" does not work).

Any tips?

java google-cloud-storage google-cloud-endpoints
3 answers

The API only supports deleting one object per delete call. You can issue many deletes, either as individual HTTP requests or as a batch request, but there is no API call that deletes multiple objects using wildcards or anything similar. To delete all objects with a given prefix, you need to list the objects and then issue a delete call for each one that matches the pattern.

The command-line utility gsutil does exactly that when you ask it to remove the path "gs://bucket/dir/**". It fetches a list of objects matching the pattern, then issues a delete call for each of them.

If you need a quick fix, you could always exec the gsutil program from Java.
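A minimal sketch of that quick fix, assuming gsutil is installed and authenticated on the host (the bucket and prefix names here are placeholders; the helper and class names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

class GsutilDelete {
    // Build the gsutil command that removes everything under a prefix.
    // "-m" enables parallel operations, which speeds up bulk deletes.
    static List<String> buildCommand(String bucket, String prefix) {
        return Arrays.asList("gsutil", "-m", "rm", "gs://" + bucket + "/" + prefix + "/**");
    }

    public static void main(String[] args) throws Exception {
        List<String> cmd = buildCommand("my-bucket", "some/folder");
        System.out.println(String.join(" ", cmd));
        // To actually run it (requires gsutil on the PATH and valid credentials):
        // Process p = new ProcessBuilder(cmd).inheritIO().start();
        // int exit = p.waitFor();  // non-zero means the delete failed
    }
}
```

Shelling out like this is fragile (you depend on gsutil's presence, PATH, and exit codes), so treat it as a stopgap rather than a production approach.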

Here is code that implements the answer above, in case anyone else wants to use it:

```java
public void deleteFolder(String bucket, String folderName) throws CouldNotDeleteFile {
    try {
        // List every object under the given prefix, recursively.
        ListResult list = gcsService.list(bucket,
                new ListOptions.Builder().setPrefix(folderName).setRecursive(true).build());
        while (list.hasNext()) {
            ListItem item = list.next();
            gcsService.delete(new GcsFilename(bucket, item.getName()));
        }
    } catch (IOException e) {
        // Error handling
    }
}
```

Very late to the party, but here for current Google searches: you can delete multiple blobs efficiently using com.google.cloud.storage.StorageBatch .

Like this:

```java
public static void rmdir(Storage storage, String bucket, String dir) {
    StorageBatch batch = storage.batch();
    Page<Blob> blobs = storage.list(bucket,
            Storage.BlobListOption.currentDirectory(),
            Storage.BlobListOption.prefix(dir));
    for (Blob blob : blobs.iterateAll()) {
        batch.delete(blob.getBlobId());
    }
    batch.submit();
}
```

This should run MUCH faster than deleting one blob at a time when your bucket/folder contains a non-trivial number of items.

Edit: since this is getting some attention, I'll demonstrate error handling:

```java
public static boolean rmdir(Storage storage, String bucket, String dir) {
    List<StorageBatchResult<Boolean>> results = new ArrayList<>();
    StorageBatch batch = storage.batch();
    try {
        Page<Blob> blobs = storage.list(bucket,
                Storage.BlobListOption.currentDirectory(),
                Storage.BlobListOption.prefix(dir));
        for (Blob blob : blobs.iterateAll()) {
            results.add(batch.delete(blob.getBlobId()));
        }
    } finally {
        batch.submit();
    }
    return results.stream().allMatch(r -> r != null && r.get());
}
```

This method deletes every blob in the given folder of the bucket, returning true if they were all deleted and false otherwise. You can inspect the StorageBatchResult returned by batch.delete() for better understanding and error handling.

To make sure ALL items are deleted, you could call it like this:

```java
boolean success = false;
while (!success) {
    success = rmdir(storage, bucket, dir);
}
```
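One caveat with that loop: if some blob persistently fails to delete, it spins forever. A bounded-retry sketch (the helper and class names are hypothetical; the BooleanSupplier stands in for a call like rmdir(storage, bucket, dir)):

```java
import java.util.function.BooleanSupplier;

class Retry {
    // Retry an operation up to maxAttempts times; return whether it ever succeeded.
    static boolean retry(BooleanSupplier op, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (op.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for rmdir(...): fails twice, then succeeds on the third call.
        int[] calls = {0};
        boolean success = retry(() -> ++calls[0] >= 3, 5);
        System.out.println(success + " after " + calls[0] + " attempts");
    }
}
```

In production you would likely also add a backoff delay between attempts rather than retrying immediately.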

I realize this is an old question, but I just stumbled upon the same problem and found another way to solve it.

The Storage class in the Google Cloud Java Client for Storage includes a method for listing the blobs in a bucket, which can also take an option that filters the results to blobs whose names begin with a given prefix.

For example, deleting all files with a given prefix from a bucket can be achieved as follows:

```java
Storage storage = StorageOptions.getDefaultInstance().getService();
Iterable<Blob> blobs = storage.list("bucket_name",
        Storage.BlobListOption.prefix("prefix")).iterateAll();
for (Blob blob : blobs) {
    blob.delete(Blob.BlobSourceOption.generationMatch());
}
```
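One thing to watch out for: prefix matching is plain string matching, so a bare prefix like "mydir" also matches sibling objects such as "mydirectory/file". If you mean a folder, append a trailing slash before listing. A small helper (hypothetical name) to normalize the prefix:

```java
class PrefixUtil {
    // Ensure a folder-style prefix ends with "/" so "mydir" doesn't also
    // match objects like "mydirectory/file" that merely share the characters.
    static String folderPrefix(String dir) {
        return dir.endsWith("/") ? dir : dir + "/";
    }

    public static void main(String[] args) {
        System.out.println(folderPrefix("mydir"));   // mydir/
        System.out.println(folderPrefix("mydir/"));  // mydir/
    }
}
```

You would then pass Storage.BlobListOption.prefix(PrefixUtil.folderPrefix(dir)) instead of the raw string.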
