Heavy Asynchronous Processing

I have an application that, in its simplest form, reads a large number of phone numbers from a database (about 15 million) and sends each number, one at a time, to a URL for processing. I designed the application as follows:

  • Bulk-export the phone numbers from SQL to a text file using SSIS. This is very fast and takes 1 or 2 minutes.
  • Load the numbers into a message queue (I am using MSMQ at the moment).
  • Dequeue the messages from a command-line application and send a request over HTTP to some service (for example, 3 calls per phone number), and finally update the database.

The problem is that it still takes a long time to complete. MSMQ also has a limit on the size of messages it can receive, so I now have to create multiple message queues. I need a lot of resiliency, but I dare not make the queue transactional because of performance. I am going to publish the message queue (currently a private queue) to Active Directory so that processes on different systems can dequeue from it and the job completes faster. In addition, my processors hit 100% while it runs, and I am currently modifying it to use a thread pool. I am ready to look into JMS right now if it handles queues better. So far, the most efficient part of the whole process is the SSIS part.

I would like to hear about a better design approach, especially if you have already processed this kind of volume. I am ready to move to Unix or write it in Lisp if that handles this situation better.

Thanks.

+5
5 answers

Here is just a super pragmatic solution:

First, split the text file into smaller files, perhaps with something like 10,000 entries each. Let's call them numbers_x.queue.
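
The split itself is cheap to script. Below is a minimal sketch, assuming the exported file has one number per line; the function name `split_numbers`, the output directory, and the zero-based chunk index are my own choices for illustration, not something the original setup prescribes.

```python
from itertools import islice
from pathlib import Path
from typing import List


def split_numbers(source: Path, out_dir: Path, chunk_size: int = 10_000) -> List[Path]:
    """Split `source` (one number per line) into numbers_<n>.queue files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: List[Path] = []
    with source.open("r", encoding="utf-8") as src:
        index = 0
        while True:
            # Read at most chunk_size lines without loading the whole file.
            lines = list(islice(src, chunk_size))
            if not lines:
                break
            chunk_path = out_dir / f"numbers_{index}.queue"
            chunk_path.write_text("".join(lines), encoding="utf-8")
            chunks.append(chunk_path)
            index += 1
    return chunks
```

With 15 million numbers and 10,000 per file this yields about 1,500 files, which is a manageable unit of work to hand out to threads or machines.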

Create a multithreaded application in which each thread processes files using the following steps:

  • Look for a file named numbers_x.done; if it exists, find the last completed number in it.
  • If the .done file was found, scan through numbers_x.queue to position yourself on the number after the last one in the .done file.
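  • Read the next number from the .queue file.
  • Make the IP call for that number.
  • Append the number to the .done file.
  • If the end of the .queue file has not been reached, go back to step 3.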
  • ,
  • .queue 1

, , , .queue .

+2

MSMQ, . ? , , ? RAM, .

0

-. MSMQ - , SQL. -, , SQL. , , SQL.

, , , , . , .

, - . , , , , .

0

? , ?

0

+ JMS - , - JMS ? "", - ? , - , ?

, -, JMS. .

: , - - . , / . , , , " " . , , .

:

  • (, ) ( , - , db )
  • , , , 10 -.
  • , ( , ). . , , .

, , , , .

- - :

  • Keep JMS with one queue (or several equivalent queues) for feeding the senders; create a second queue to notify the reader part about completed work (so that the feed queues do not get overloaded).
  • Make a reader part that feeds data (sets of numbers) to the senders and reads the notification queue.
  • Create a database to track processed numbers. It can either be shared with the senders, or be private to the reader, with the senders sending "processing reports". The reader then updates the database.
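
To make the shape of this concrete, here is a rough sketch of the reader/sender split. It deliberately swaps JMS for in-process `queue.Queue` objects and the tracking database for a plain dict, so it only illustrates the flow of batches and reports; `run_pipeline`, `send`, and all parameter names are made up for the example.

```python
import queue
import threading
from typing import Callable, Dict, Iterable, List


def run_pipeline(batches: Iterable[List[str]], send: Callable[[str], bool],
                 senders: int = 4, max_outstanding: int = 8) -> Dict[str, bool]:
    work: queue.Queue = queue.Queue(maxsize=max_outstanding)  # reader -> senders
    reports: queue.Queue = queue.Queue()                      # senders -> reader
    processed: Dict[str, bool] = {}  # stand-in for the tracking database

    def attempt(number: str) -> bool:
        try:
            return bool(send(number))
        except Exception:
            return False  # a failed call is reported, not fatal

    def sender() -> None:
        while True:
            batch = work.get()
            if batch is None:  # sentinel: no more work
                return
            reports.put([(n, attempt(n)) for n in batch])

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for t in threads:
        t.start()

    outstanding = 0
    for batch in batches:
        work.put(batch)  # blocks while max_outstanding batches are waiting
        outstanding += 1
        while True:      # pick up any reports that are already in
            try:
                report = reports.get_nowait()
            except queue.Empty:
                break
            processed.update(report)
            outstanding -= 1
    for _ in threads:
        work.put(None)
    while outstanding:   # wait for the remaining reports
        processed.update(reports.get())
        outstanding -= 1
    for t in threads:
        t.join()
    return processed
```

The bounded work queue is what keeps the feed from being overloaded: the reader cannot get more than `max_outstanding` batches ahead of the senders, and only the reader touches the tracking store.
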
0