AmazonS3 putObject with InputStream length example

I am uploading a file to S3 using Java - this is what I have so far:

    AmazonS3 s3 = new AmazonS3Client(new BasicAWSCredentials("XX", "YY"));
    List<Bucket> buckets = s3.listBuckets();
    s3.putObject(new PutObjectRequest(buckets.get(0).getName(), fileName, stream, new ObjectMetadata()));

The file uploads, but a warning is raised when I do not set the content length:

    com.amazonaws.services.s3.AmazonS3Client putObject: No content length specified for stream data. Stream contents will be buffered in memory and could result in out of memory errors.

This is the file I am uploading, and the variable stream is an InputStream, from which I can get the byte array like this: IOUtils.toByteArray(stream).

So when I try to set the content length and MD5 (taken from here) like this:

    // get MD5 base64 hash
    MessageDigest messageDigest = MessageDigest.getInstance("MD5");
    messageDigest.reset();
    messageDigest.update(IOUtils.toByteArray(stream));
    byte[] resultByte = messageDigest.digest();
    String hashtext = new String(Hex.encodeHex(resultByte));

    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(IOUtils.toByteArray(stream).length);
    meta.setContentMD5(hashtext);

This results in the following error returned from S3:

The Content-MD5 you specified was invalid.

What am I doing wrong?

Any help appreciated!

P.S. I am on Google App Engine - I cannot write the file to disk or create a temp file because App Engine does not support FileOutputStream.

+60
java google-app-engine inputstream amazon-s3 md5
Dec 02 '11 at 4:45
7 answers

Since the original question was never answered, and I had to run into this same problem, the solution to the MD5 issue is that S3 does not want the Hex-encoded MD5 string we typically think of.

Instead, I had to do this:

    // content is a passed in InputStream
    byte[] resultByte = DigestUtils.md5(content);
    String streamMD5 = new String(Base64.encodeBase64(resultByte));
    metaData.setContentMD5(streamMD5);

Essentially, what they want is the Base64 encoding of the raw MD5 byte array, not the Hex string. When I switched over to this, it started working great for me.
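
Putting the two fixes together, a minimal sketch, assuming Commons IO and Commons Codec are on the classpath and the stream is small enough to buffer in memory (bucketName here stands in for buckets.get(0).getName() from the question):

    // Buffer the stream once, then derive both the content length and the Base64 MD5 from the same bytes.
    byte[] contentBytes = IOUtils.toByteArray(stream);

    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(contentBytes.length);
    meta.setContentMD5(new String(Base64.encodeBase64(DigestUtils.md5(contentBytes))));

    // Re-wrap the buffered bytes so putObject gets a fresh, readable stream.
    s3.putObject(new PutObjectRequest(bucketName, fileName, new ByteArrayInputStream(contentBytes), meta));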

+52
May 24 '13 at 13:46

If all you are trying to do is resolve the content length error from Amazon, then you can just read the bytes from the input stream into a Long and add that to the metadata.

    // Declared here so the snippet is self-contained; in the original it lives outside this block.
    byte[] contentBytes = null;

    /*
     * Obtain the content length of the input stream for the S3 header
     */
    try {
        InputStream is = event.getFile().getInputstream();
        contentBytes = IOUtils.toByteArray(is);
    } catch (IOException e) {
        System.err.printf("Failed while reading bytes from %s", e.getMessage());
    }

    Long contentLength = Long.valueOf(contentBytes.length);

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentLength);

    /*
     * Reobtain the tmp uploaded file as an input stream
     */
    InputStream inputStream = event.getFile().getInputstream();

    /*
     * Put the object in S3
     */
    try {
        s3client.putObject(new PutObjectRequest(bucketName, keyName, inputStream, metadata));
    } catch (AmazonServiceException ase) {
        System.out.println("Error Message:    " + ase.getMessage());
        System.out.println("HTTP Status Code: " + ase.getStatusCode());
        System.out.println("AWS Error Code:   " + ase.getErrorCode());
        System.out.println("Error Type:       " + ase.getErrorType());
        System.out.println("Request ID:       " + ase.getRequestId());
    } catch (AmazonClientException ace) {
        System.out.println("Error Message: " + ace.getMessage());
    } finally {
        if (inputStream != null) {
            inputStream.close();
        }
    }

You will need to read the input stream twice using this exact method, so if you are uploading a very large file, you might want to look at reading it once into an array and then reading it from there.
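
For example, a sketch of that single-read variant, reusing the event, s3client, bucketName and keyName names assumed from the snippet above:

    // Read the uploaded file into memory once, then derive the length and the upload stream from the same array.
    byte[] contentBytes = IOUtils.toByteArray(event.getFile().getInputstream());

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentBytes.length);

    s3client.putObject(new PutObjectRequest(bucketName, keyName,
            new ByteArrayInputStream(contentBytes), metadata));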

+33
Jun 20 '12 at 9:01

For uploading, the S3 SDK has two putObject methods:

 PutObjectRequest(String bucketName, String key, File file) 

and

 PutObjectRequest(String bucketName, String key, InputStream input, ObjectMetadata metadata) 

The InputStream + ObjectMetadata method needs, at a minimum, the Content-Length of your input stream in the metadata. If you do not supply it, then it will buffer in memory to get that information, which can cause OOM errors. Alternatively, you can do your own in-memory buffering to get the length, but then you need to obtain a second input stream.

Not asked by the OP (given the restrictions of his environment), but for someone else such as me: I find it easier and safer (if you have access to a temp file) to write the input stream to a temp file and put the temp file. No in-memory buffering and no requirement to create a second input stream.

    AmazonS3 s3Service = new AmazonS3Client(awsCredentials);
    File scratchFile = File.createTempFile("prefix", "suffix");
    try {
        FileUtils.copyInputStreamToFile(inputStream, scratchFile);
        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, id, scratchFile);
        PutObjectResult putObjectResult = s3Service.putObject(putObjectRequest);
    } finally {
        if (scratchFile.exists()) {
            scratchFile.delete();
        }
    }
+17
May 04 '15 at

When writing to S3, you need to specify the length of the S3 object to be sure that there are no out-of-memory errors.

Using IOUtils.toByteArray(stream) is also prone to OOM errors, because it is backed by a ByteArrayOutputStream.

So, the best option is to first write the input stream to a temp file on local disk, and then use that file to write to S3, specifying the length of the temp file.
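
A minimal sketch of that approach (names like s3client, bucketName, keyName and inputStream are placeholders, not from the original post):

    // Spill the stream to a temp file first, then upload with an explicit content length.
    File tempFile = File.createTempFile("s3upload", ".tmp");
    try {
        Files.copy(inputStream, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);

        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(tempFile.length());

        try (InputStream fileStream = new FileInputStream(tempFile)) {
            s3client.putObject(new PutObjectRequest(bucketName, keyName, fileStream, metadata));
        }
    } finally {
        tempFile.delete();
    }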

+6
Nov 11 '18

I am actually doing somewhat the same thing, but on my AWS S3 storage:

Code for the servlet that receives the uploaded file:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.List;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.commons.fileupload.FileItem;
    import org.apache.commons.fileupload.disk.DiskFileItemFactory;
    import org.apache.commons.fileupload.servlet.ServletFileUpload;

    import com.src.code.s3.S3FileUploader;

    public class FileUploadHandler extends HttpServlet {

        protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            doPost(request, response);
        }

        protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            PrintWriter out = response.getWriter();

            try {
                List<FileItem> multipartfiledata = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);

                // upload to S3
                S3FileUploader s3 = new S3FileUploader();
                String result = s3.fileUploader(multipartfiledata);

                out.print(result);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }
    }

Code that uploads this data as an AWS S3 object:

    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.util.List;
    import java.util.UUID;

    import org.apache.commons.fileupload.FileItem;

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.AmazonServiceException;
    import com.amazonaws.auth.ClasspathPropertiesFileCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.PutObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;

    public class S3FileUploader {

        private static String bucketName = "***NAME OF YOUR BUCKET***";
        private static String keyName = "Object-" + UUID.randomUUID();

        public String fileUploader(List<FileItem> fileData) throws IOException {
            AmazonS3 s3 = new AmazonS3Client(new ClasspathPropertiesFileCredentialsProvider());
            String result = "Upload unsuccessfull because ";
            try {
                S3Object s3Object = new S3Object();

                ObjectMetadata omd = new ObjectMetadata();
                omd.setContentType(fileData.get(0).getContentType());
                omd.setContentLength(fileData.get(0).getSize());
                omd.setHeader("filename", fileData.get(0).getName());

                ByteArrayInputStream bis = new ByteArrayInputStream(fileData.get(0).get());

                s3Object.setObjectContent(bis);
                s3.putObject(new PutObjectRequest(bucketName, keyName, bis, omd));
                s3Object.close();

                result = "Uploaded Successfully.";
            } catch (AmazonServiceException ase) {
                System.out.println("Caught an AmazonServiceException, which means your request made it to Amazon S3, but was "
                        + "rejected with an error response for some reason.");
                System.out.println("Error Message:    " + ase.getMessage());
                System.out.println("HTTP Status Code: " + ase.getStatusCode());
                System.out.println("AWS Error Code:   " + ase.getErrorCode());
                System.out.println("Error Type:       " + ase.getErrorType());
                System.out.println("Request ID:       " + ase.getRequestId());

                result = result + ase.getMessage();
            } catch (AmazonClientException ace) {
                System.out.println("Caught an AmazonClientException, which means the client encountered an internal error while "
                        + "trying to communicate with S3, such as not being able to access the network.");

                result = result + ace.getMessage();
            } catch (Exception e) {
                result = result + e.getMessage();
            }

            return result;
        }
    }

Note: I am using an AWS properties file for the credentials.
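
For reference, ClasspathPropertiesFileCredentialsProvider looks for a file named AwsCredentials.properties at the root of the classpath, roughly along these lines (placeholder values):

    accessKey = YOUR_ACCESS_KEY_ID
    secretKey = YOUR_SECRET_ACCESS_KEY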

Hope this helps.

+3
Mar 21 '14 at 10:19

I created a library that uses multipart uploads in the background to avoid buffering everything in memory and also does not write to disk: https://github.com/alexmojaki/s3-stream-upload

+3
Oct 22 '15 at 14:11

Adding the log4j-1.2.12.jar file resolved the issue for me

-8
Dec 27 '16 at 20:57


