What is the best way to distribute work across multiple machines?

We are developing a .NET application that needs to make tens of thousands of small web service requests to a third-party web service. We would prefer a single "chunkier" batched call, but the third party does not support one. We have built a client that uses a configurable number of worker threads, and through testing the code is reasonably well optimized for a single multi-core machine. However, we still want to improve throughput and are looking at spreading the work across several machines. We are well versed in typical client/server/database applications, but new to designing for multiple machines. So, a few related questions:

  • Are there any client-side optimizations, other than multithreading, that we should consider to improve the speed of the HTTP request/response? (I should note that this is a non-standard web service, so it is implemented with WebClient rather than WCF or a SOAP client.)
  • Our current thinking is to use WCF to publish chunks of work to MSMQ and run clients on one or more machines to pull work off the queue. We have experience with WCF + MSMQ, but we want to make sure we are not missing better options. Are there other, better ways to do this today?
  • We have seen third-party tools like Digipede and Microsoft HPC, but they seem like overkill. Any experience with these products, or considerations before rolling our own?
4 answers

It sounds like your goal is to complete all these web service calls as fast as you can and insert the results into a table. Given that, your main performance lever will be scaling the number of simultaneous requests you can make.

Be sure to check the client-side connection limit. By default, I believe .NET allows only 2 simultaneous connections per host. I have not tried this myself, but by raising that limit (the ServicePointManager.DefaultConnectionLimit property) you should in theory see a multiplier effect: more simultaneous connections from one machine means more requests in flight. There are more details on the MS forums.
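As a minimal sketch of that suggestion, assuming a .NET Framework client, the limit can be raised once at startup before any requests are issued:

```csharp
// Sketch: raise the per-host HTTP connection limit before issuing requests.
// By default, .NET client applications allow only 2 concurrent connections
// per host, which caps throughput regardless of how many threads you run.
using System.Net;

static class ConnectionSetup
{
    public static void Configure()
    {
        // Applies to all ServicePoints created after this line.
        // 100 is an illustrative value; tune it against the target server.
        ServicePointManager.DefaultConnectionLimit = 100;
    }
}
```

The same setting can also be made in app.config under the system.net/connectionManagement element, which avoids a code change.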

The MSMQ option works well; I run this configuration myself. ActiveMQ is also a great solution, but MSMQ is already on the server.
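A minimal sketch of that setup, assuming a pre-created private queue (the queue path and the string-array payload are illustrative): one process enqueues chunks of work, and workers on any machine dequeue them.

```csharp
// Sketch: distribute chunks of work through MSMQ (System.Messaging).
// The queue path and payload shape are assumptions for illustration.
using System.Messaging;

static class WorkQueue
{
    const string QueuePath = @".\private$\workitems"; // hypothetical queue

    public static void Enqueue(string[] chunk)
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string[]) });
            queue.Send(chunk);
        }
    }

    public static string[] Dequeue()
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string[]) });
            var message = queue.Receive(); // blocks until a message arrives
            return (string[])message.Body;
        }
    }
}
```

Each worker machine simply loops on Dequeue() and processes the chunk; MSMQ handles delivery so workers never see the same message twice.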

You have a good starting point. Get it working, then iterate on performance and throughput.


At CodeMash this year, Wesley Faler gave an interesting presentation on this problem. His solution was to store "tasks" in a database and have clients pull work out and mark its status upon completion.

He then pushed the entire infrastructure to Amazon EC2.

His slides from the presentation should give you the main idea.

I did something similar across several machines locally; the basics of the workload management were similar to Faler's approach.
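The core of that database-as-queue pattern is letting each worker atomically claim one pending row so machines never grab the same task. A sketch, assuming SQL Server and a hypothetical Tasks table with Id, Payload, and Status columns:

```csharp
// Sketch: atomically claim one pending task from a shared table.
// Table and column names are assumptions for illustration.
using System.Data.SqlClient;

static class TaskTable
{
    // READPAST skips rows locked by other workers; OUTPUT returns the
    // claimed row in the same statement, so the claim is atomic.
    const string ClaimSql = @"
        UPDATE TOP (1) Tasks WITH (ROWLOCK, READPAST)
        SET Status = 'InProgress'
        OUTPUT inserted.Id, inserted.Payload
        WHERE Status = 'Pending';";

    public static (int Id, string Payload)? ClaimNext(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(ClaimSql, conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                if (!reader.Read()) return null; // no pending work
                return (reader.GetInt32(0), reader.GetString(1));
            }
        }
    }
}
```

Workers poll ClaimNext in a loop and write results (and a 'Done' status) back to the same table, which doubles as the results store.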


If you have already optimized the code, you can look at network-level optimizations to minimize the number of packets sent:

  • reuse HTTP connections (i.e. multiple requests over one open connection, via keep-alive, reduces TCP setup overhead)
  • reduce the HTTP headers in the request to a minimum to save bandwidth
  • if the server supports it, use gzip to compress the request body (you need to balance the CPU cost of compression against the bandwidth saved)
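The last point can be sketched as follows, assuming the third-party server honors a gzip-encoded request body (the URL and content type are illustrative):

```csharp
// Sketch: gzip-compress a POST body before sending it.
// Only worthwhile if the server accepts Content-Encoding: gzip.
using System;
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Text;

static class CompressedPost
{
    public static void Send(string url, string body)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "text/xml";
        request.Headers["Content-Encoding"] = "gzip";
        // Also accept compressed responses, decompressed transparently.
        request.AutomaticDecompression = DecompressionMethods.GZip;

        using (var stream = request.GetRequestStream())
        using (var gzip = new GZipStream(stream, CompressionMode.Compress))
        {
            byte[] bytes = Encoding.UTF8.GetBytes(body);
            gzip.Write(bytes, 0, bytes.Length);
        }

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            // ...consume the response...
        }
    }
}
```

For the tiny request bodies described in the question, measure first: the gzip header overhead can exceed the savings on very small payloads.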

You might want to consider Rhino Service Bus instead of using MSMQ directly; it is open source.

