Overriding WebHostBufferPolicySelector for non-buffered file upload

In an attempt to create a non-buffered file upload, I extended System.Web.Http.WebHost.WebHostBufferPolicySelector and overrode UseBufferedInputStream(), as described in this article: http://www.strathweb.com/2012/09/dealing-with-large-files-in-asp-net-web-api/. When a file is sent to my controller, I can see in the trace output that the overridden UseBufferedInputStream() returns false, as expected. However, using diagnostic tools, I can see memory growing while the file is being uploaded.

The heavy memory usage appears to occur in my custom MediaTypeFormatter (something similar to the FileMediaFormatter here: http://lonetechie.com/). It is in this formatter that I would like to write the incoming file to disk incrementally, but I also need to parse JSON and perform some other operations on the Content-Type: multipart/form-data upload. Therefore, I use the HttpContent method ReadAsMultipartAsync(), which appears to be the source of the memory growth. I placed trace output before and after the "await", and it appears that memory usage increases quite rapidly while the task is blocked.

Once I find the file content in the parts returned by ReadAsMultipartAsync(), I use Stream.CopyTo() to write the file contents to disk. The file is written to disk as expected, but unfortunately the source file is already in memory by that point.

Does anyone have any thoughts on what might be going wrong? ReadAsMultipartAsync() seems to buffer the entire message body; if so, why do we need var fileStream = await fileContent.ReadAsStreamAsync() to get the file contents? Is there another way to split the parts without reading them into memory? The code in my MediaTypeFormatter looks something like this:

    // save the stream so we can seek/read again later
    Stream stream = await content.ReadAsStreamAsync();

    var parts = await content.ReadAsMultipartAsync(); // <- memory usage grows rapidly

    if (!content.IsMimeMultipartContent())
    {
        throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);
    }

    //
    // pull data out of parts.Contents, process json, etc.
    //

    // find the file data in the multipart contents
    var fileContent = parts.Contents.FirstOrDefault(
        x => x.Headers.ContentDisposition.DispositionType.ToLower().Trim() == "form-data"
          && x.Headers.ContentDisposition.Name.ToLower().Trim() == "\"" + DATA_CONTENT_DISPOSITION_NAME_FILE_CONTENTS + "\"");

    // write the file to disk
    using (var fileStream = await fileContent.ReadAsStreamAsync())
    {
        using (FileStream toDisk = File.OpenWrite("myUploadedFile.bin"))
        {
            ((Stream)fileStream).CopyTo(toDisk);
        }
    }
1 answer

WebHostBufferPolicySelector only specifies whether the underlying request is bufferless. This is what Web API does under the hood:

    IHostBufferPolicySelector policySelector = _bufferPolicySelector.Value;
    bool isInputBuffered = policySelector == null ? true : policySelector.UseBufferedInputStream(httpContextBase);
    Stream inputStream = isInputBuffered
        ? requestBase.InputStream
        : httpContextBase.ApplicationInstance.Request.GetBufferlessInputStream();

So, if your implementation returns false, the request is bufferless.
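For reference, a selector of the kind described in the question could look roughly like the sketch below. The class name MultipartBufferPolicySelector and the helper IsMultipart are made up for illustration, and the choice to switch off buffering based on the request content type is an assumption (the linked article decides per controller instead).

```csharp
using System;
using System.Web;
using System.Web.Http;
using System.Web.Http.Hosting;
using System.Web.Http.WebHost;

// Hypothetical selector: disables input buffering for multipart requests only.
public class MultipartBufferPolicySelector : WebHostBufferPolicySelector
{
    public override bool UseBufferedInputStream(object hostContext)
    {
        var context = hostContext as HttpContextBase;
        if (context != null && IsMultipart(context.Request.ContentType))
        {
            return false; // Web API will call GetBufferlessInputStream()
        }
        return true; // everything else stays buffered
    }

    // Hypothetical helper, kept static so it can be checked in isolation.
    public static bool IsMultipart(string contentType)
    {
        return contentType != null
            && contentType.StartsWith("multipart/", StringComparison.OrdinalIgnoreCase);
    }
}

public static class BufferPolicyConfig
{
    // Call once at startup, e.g. from Application_Start.
    public static void Register()
    {
        GlobalConfiguration.Configuration.Services.Replace(
            typeof(IHostBufferPolicySelector),
            new MultipartBufferPolicySelector());
    }
}
```

This only controls how the raw request stream is exposed; it does not change what a multipart reader does with the data afterwards.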

However, ReadAsMultipartAsync() loads everything into a MemoryStream - because if you do not specify a provider, MultipartMemoryStreamProvider is used by default.

To automatically save files to disk during processing of each part, use MultipartFormDataStreamProvider (if you are dealing with files and form data) or MultipartFileStreamProvider (if you are dealing only with files).
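A minimal sketch of the MultipartFormDataStreamProvider approach, written as a helper that a formatter or controller could call. The names MultipartUploadReader and ReadToDiskAsync are hypothetical. It assumes the client sends a filename parameter in the Content-Disposition header of the file part, since that is how this provider tells files from ordinary form fields; parts without a filename end up in FormData and are still held in memory.

```csharp
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Http;

public static class MultipartUploadReader
{
    // Hypothetical helper: file parts are written under rootPath as they are read.
    public static async Task<MultipartFormDataStreamProvider> ReadToDiskAsync(
        HttpContent content, string rootPath)
    {
        // Check the content type before attempting to read the parts.
        if (!content.IsMimeMultipartContent())
        {
            throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);
        }

        var provider = new MultipartFormDataStreamProvider(rootPath);
        await content.ReadAsMultipartAsync(provider);
        return provider;
    }
}
```

After the call, provider.FormData holds the non-file fields (for example a JSON string under whatever field name the client uses), and each entry in provider.FileData exposes the part headers plus LocalFileName, the path of the temporary file written under rootPath.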

There are examples of this on asp.net. In those examples everything happens in controllers, but there is no reason why you could not use the same approach elsewhere, for example in a formatter.

Another option, if you really want to play with streams, is to implement a custom class that inherits from MultipartStreamProvider and starts whatever processing you want as soon as it receives a part of the stream. The usage is similar to the providers above: you pass it to the ReadAsMultipartAsync(provider) method.
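A custom provider could be sketched as follows. The class name FilePartStreamProvider and its constructor parameters are hypothetical. The idea is that GetStream is called once per part, and whatever stream it returns receives that part's bytes as they arrive; here the part with a chosen name goes straight to a file and all other parts go to memory. To my knowledge the framework closes non-memory streams once a part is complete, but treat that as an assumption to verify.

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical provider: streams one named part to disk, keeps the rest in memory.
public class FilePartStreamProvider : MultipartStreamProvider
{
    private readonly string _filePartName;
    private readonly string _targetPath;

    public FilePartStreamProvider(string filePartName, string targetPath)
    {
        _filePartName = filePartName;
        _targetPath = targetPath;
    }

    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        ContentDispositionHeaderValue disposition = headers.ContentDisposition;

        // The name may arrive quoted, e.g. "\"file\"", so strip the quotes.
        string name = disposition != null && disposition.Name != null
            ? disposition.Name.Trim('"')
            : string.Empty;

        if (name == _filePartName)
        {
            return File.Create(_targetPath); // written incrementally as the part is read
        }

        return new MemoryStream(); // small parts (JSON etc.) stay in memory
    }
}
```

It would be used as `await content.ReadAsMultipartAsync(new FilePartStreamProvider("file", path))`, after which the in-memory parts are available through the provider's Contents collection.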

Finally - if you feel suicidal - since the underlying request stream is, in theory, bufferless, you can read it directly in your controller or formatter with something like this:

    Stream stream = HttpContext.Current.Request.GetBufferlessInputStream();
    byte[] b = new byte[32 * 1024];
    int n; // number of bytes read in the current pass
    while ((n = stream.Read(b, 0, b.Length)) > 0)
    {
        // do stuff with the first n bytes of b
    }

But of course this is very, for lack of a better word, "ghetto".
