WebRequest timeout with F#-style asynchronous workflows

Question

For broader context, here is my code that downloads a list of URLs.

It seems to me that there is no good way to handle timeouts in F# when using the `use! response = request.AsyncGetResponse()` style of fetching. I have almost everything working the way I would like (error handling, asynchronous requests, and asynchronous response downloading) except for the problem that occurs when a website takes a long time to respond. My current code just hangs indefinitely. I tried it against a PHP script I wrote that waits 300 seconds, and the code waited the whole time.

I found two kinds of “solutions”, both of which are undesirable.

`AwaitIAsyncResult` + `BeginGetResponse`

This is the approach in ildjarn's answer to this other question. The problem is that if you have queued many asynchronous requests, some of them are artificially blocked on AwaitIAsyncResult. In other words, the call to make the request has been issued, but something behind the scenes is holding it back. This causes the AwaitIAsyncResult timeout to fire prematurely when many simultaneous requests are in flight. My guess is that there is a limit on the number of requests to a single domain, or simply a limit on the total number of requests.

To support my suspicion, I wrote a small WPF application that draws a timeline of when requests appear to start and end. In my code above, note where the timer is started and stopped on lines 49 and 54 (called from line 10). Below is an image of the resulting timeline.

When I move the timer start to just after the initial response arrives (so that I am only timing the download of the content), the timeline looks much more realistic. Note that these are two separate runs, with no code change other than where the timer starts. Instead of recording startTime directly before `use! response = request.AsyncGetResponse()`, I record it directly afterwards.

To further support my claim, I captured a timeline with Fiddler2. The resulting timeline is below. Clearly, the requests do not start exactly when I tell them to.

`GetResponseStream` in a new thread

In other words, the requests and response downloads are made synchronously on a secondary thread. This works because GetResponseStream respects the Timeout property of the WebRequest object. But in the process we waste all of the waiting time while the request is on the wire and the response has not yet come back, because a thread sits blocked for that whole period. We might as well write it in C#... ;)

Questions

Is this a known problem?
Is there a good solution that uses F# asynchronous workflows and still allows timeouts and error handling?
If the problem is that I am making too many requests at the same time, is the best way to limit the number of requests to use Semaphore(5, 5) or something like that?
Side question: if you have looked at my code, can you see anything stupid I have done that I could fix?

If anything is confusing, let me know.

asynchronous f# httpwebrequest

Joel Verhagen Oct 30 '11 at 21:52

2 answers

David Grenier · Answer 1 · 2011-11-21T20:44:46+0000

AsyncGetResponse simply ignores any timeout value set on the request... here is a solution we just put together:

    open System
    open System.IO
    open System.Net

    // A request paired with the channel on which the response is sent back.
    type Request = Request of WebRequest * AsyncReplyChannel<WebResponse>

    // Agent that starts each incoming request in its own child workflow
    // and replies with the response once it arrives.
    let requestAgent =
        MailboxProcessor.Start <| fun inbox -> async {
            while true do
                let! (Request (req, port)) = inbox.Receive ()

                async {
                    try
                        let! resp = req.AsyncGetResponse ()
                        port.Reply resp
                    with
                    | ex -> sprintf "Exception in child %s\n%s" (ex.GetType().Name) ex.Message |> Console.WriteLine
                } |> Async.Start
        }

    let getHTML url =
        async {
            try
                let req = "http://" + url |> WebRequest.Create
                try
                    // Wait at most 1000 ms for the agent's reply;
                    // PostAndAsyncReply raises TimeoutException after that.
                    use! resp = requestAgent.PostAndAsyncReply ((fun chan -> Request (req, chan)), 1000)
                    use str = resp.GetResponseStream ()
                    use rdr = new StreamReader (str)
                    return Some <| rdr.ReadToEnd ()
                with
                | :? System.TimeoutException ->
                    // Give up on the pending request.
                    req.Abort()
                    Console.WriteLine "RequestAgent call timed out"
                    return None
            with
            | ex ->
                sprintf "Exception in request %s\n\n%s" (ex.GetType().Name) ex.Message |> Console.WriteLine
                return None
        } |> Async.RunSynchronously;;

    getHTML "www.grogogle.com"

That is, we delegate the request to a separate agent and call that agent with an asynchronous timeout. If the agent does not reply within the specified amount of time, we abort the request and move on.
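The same "wait for a reply, but only for so long" idea can also be sketched without an agent, using the optional timeout argument of `Async.StartChild`. This is a different technique from the one in the code above, shown only as a rough illustration; `withTimeout` and its parameter names are hypothetical and do not come from the original answer.

```fsharp
open System

/// Hypothetical helper (not from the original answer): runs `work` as a
/// child computation and returns None if it has not finished within
/// `timeoutMs` milliseconds.
let withTimeout (timeoutMs: int) (work: Async<'T>) : Async<'T option> =
    async {
        // With a timeout, awaiting the child raises TimeoutException
        // if the child has not completed in time.
        let! child = Async.StartChild(work, timeoutMs)
        try
            let! result = child
            return Some result
        with :? TimeoutException ->
            return None
    }
```

For a web request, `work` would presumably be something like `req.AsyncGetResponse ()`, and the caller would still want to call `req.Abort()` on `None`, as the answer does.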

David Grenier · Answer 2 · 2011-11-21T21:28:02+0000

I see that my other answer may not answer your specific question ... here is another implementation for a task limiter that does not require the use of a semaphore.

    open System

    type IParallelLimiter =
        abstract GetToken : unit -> Async<IDisposable>

    type Message =
        | GetToken of AsyncReplyChannel<IDisposable>
        | Release

    let start count =
        let agent =
            MailboxProcessor.Start(fun inbox ->
                let newToken () =
                    { new IDisposable with
                        member x.Dispose () = inbox.Post Release }

                let rec loop n = async {
                    let! msg = inbox.Scan <| function
                        | GetToken _ when n = 0 -> None
                        | msg -> async.Return msg |> Some

                    return!
                        match msg with
                        | Release ->
                            loop (n + 1)
                        | GetToken port ->
                            port.Reply <| newToken ()
                            loop (n - 1)
                }
                loop count)

        { new IParallelLimiter with
            member x.GetToken () =
                agent.PostAndAsyncReply GetToken }

    let limiter = start 100;;

    for _ in 0..1000 do
        async {
            use! token = limiter.GetToken ()
            Console.WriteLine "Sleeping..."
            do! Async.Sleep 3000
            Console.WriteLine "Releasing..."
        } |> Async.Start
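For comparison, the semaphore-based throttling that the question asks about might look roughly like the sketch below. It uses `SemaphoreSlim` rather than `Semaphore`, because `SemaphoreSlim` offers an awaitable `WaitAsync`; the helper name `withGate` is hypothetical and not part of either answer.

```fsharp
open System.Threading

/// Hypothetical helper (an assumption, not from the answers): runs `work`
/// only after acquiring a slot in `gate`, and frees the slot afterwards.
let withGate (gate: SemaphoreSlim) (work: Async<'T>) : Async<'T> =
    async {
        // Wait asynchronously for a free slot, so no thread is blocked.
        do! gate.WaitAsync() |> Async.AwaitTask
        try
            return! work
        finally
            // Release the slot whether `work` succeeded or failed.
            gate.Release() |> ignore
    }
```

With `let gate = new SemaphoreSlim(5, 5)`, wrapping every download in `withGate gate` would allow at most five of them to run at once, which corresponds to the `Semaphore(5, 5)` idea from the question.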
