Let me answer this clearly.
After a lot of digging, and with the help of Andreas Öhlund on the NSB team ( http://tech.groups.yahoo.com/group/nservicebus/message/17758 ), here is the correct answer to this question:
- As Udi Dahan mentioned, by design ONLY the distributor/master node should run a timeout manager in a scale-out scenario.
- Unfortunately, in earlier versions of NServiceBus 3 this is not implemented as designed.
You have the following 3 problems:
1) Starting with a Distributor profile does NOT start the timeout manager.
Workaround:
Run the timeout manager on the distributor as well, by including this code in your distributor:
    class DistributorProfileHandler : IHandleProfile<Distributor>
    {
        public void ProfileActivated()
        {
            // Make the distributor run the timeout manager itself
            Configure.Instance.RunTimeoutManager();
        }
    }
If you run the Master profile, this is not a problem, since the timeout manager is started automatically on the master node.
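As a reminder of how these profiles are selected: they are passed to the generic host on the command line. A minimal sketch, assuming the endpoints are hosted with NServiceBus.Host.exe:

    REM Distributor only - problem 1 applies:
    NServiceBus.Host.exe NServiceBus.Distributor

    REM Master (distributor + worker on one node) - the timeout manager starts automatically:
    NServiceBus.Host.exe NServiceBus.Master

    REM Worker - see problem 2 below:
    NServiceBus.Host.exe NServiceBus.Worker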
2) Workers running with the Worker profile will each run a local timeout manager.
This is not how it was designed to work, and it interferes both with polling the timeout store and with dispatching timeouts. All workers poll the timeout store with "give me the imminent timeouts for MASTERNODE". Note that they ask for the timeouts of MASTERNODE, not of W1, W2, etc. As a result, several workers can simultaneously fetch the same timeouts from the timeout store, which leads to conflicts against Raven when the timeouts are deleted from it.
Dispatching also always goes through the LOCAL .timeouts/.timeoutsdispatcher queues, while it SHOULD go through the timeout manager queues on the MasterNode/Distributor.
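To make the race concrete, here is a toy model of the polling conflict (plain C#, illustrative only, not NServiceBus code; the store and worker names are invented):

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    class PollingRaceDemo
    {
        // Stand-in for the timeout store: timeout id -> owning endpoint
        static readonly ConcurrentDictionary<Guid, string> Store =
            new ConcurrentDictionary<Guid, string>();

        static void Main()
        {
            Store[Guid.NewGuid()] = "MASTERNODE"; // one imminent timeout, owned by the master node

            // Both workers ask the same question: "give me the imminent
            // timeouts for MASTERNODE" - neither asks for W1- or W2-specific ones.
            Task.WaitAll(
                Task.Run(() => Poll("W1")),
                Task.Run(() => Poll("W2")));
        }

        static void Poll(string worker)
        {
            // Every worker sees the same set of due timeouts...
            List<Guid> due = Store.Where(kv => kv.Value == "MASTERNODE")
                                  .Select(kv => kv.Key)
                                  .ToList();
            foreach (var id in due)
            {
                // ...so only one delete can win; the loser is the analogue
                // of the delete conflict seen against Raven.
                string owner;
                Console.WriteLine(Store.TryRemove(id, out owner)
                    ? worker + ": dispatched timeout " + id
                    : worker + ": CONFLICT, timeout " + id + " was already taken");
            }
        }
    }

Whenever both workers read the store before either delete lands, one of them hits the CONFLICT branch.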
Workaround (you will need to do both):
a) Disable the timeout manager on the workers. Include this code in your workers:
    class WorkerProfileHandler : IHandleProfile<Worker>
    {
        public void ProfileActivated()
        {
            // Workers must not run their own timeout manager
            Configure.Instance.DisableTimeoutManager();
        }
    }
b) Repoint NServiceBus on the workers to use the .timeouts queue on the MasterNode/Distributor.
If you don't, any call to RequestTimeout or Defer on the worker will die with an exception saying that you have forgotten to configure a timeout manager. Include this in your worker config:
    <UnicastBusConfig TimeoutManagerAddress="{endpointname}.Timeouts@{masternode}" />
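For completeness, here is where that line sits in a worker's app.config; a sketch with invented names ("SalesWorker" for the endpoint, "MasterNodeMachine" for the distributor/master machine), assuming the standard NServiceBus 3 section registration in NServiceBus.Core:

    <configuration>
      <configSections>
        <section name="UnicastBusConfig"
                 type="NServiceBus.Config.UnicastBusConfig, NServiceBus.Core" />
      </configSections>

      <!-- Route timeout requests to the timeout manager on the master node -->
      <UnicastBusConfig TimeoutManagerAddress="SalesWorker.Timeouts@MasterNodeMachine" />
    </configuration>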
3) Erroneous "Ready" messages from the workers to the distributor.
Because the timeout manager dispatches the messages directly to the worker input queues without removing an entry from the available workers in the distributor's storage queue, the workers send erroneous "Ready" messages back to the distributor after handling a timeout. This happens even if you have fixed 1 and 2 above, and it doesn't matter whether the timeout was fetched from a local timeout manager on the worker or from one running on the distributor/MasterNode. The consequence is a build-up of an extra entry in the distributor's storage queue for each timeout handled by a worker.
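To see why this builds up, a toy model of problem 3 (plain C#, not NServiceBus internals; the queue stands in for the distributor's storage queue of available workers):

    using System;
    using System.Collections.Generic;

    class ReadyMessageDemo
    {
        // Stand-in for the distributor's storage queue: one entry per unit
        // of free capacity a worker has announced.
        static readonly Queue<string> StorageQueue = new Queue<string>();

        static void Main()
        {
            StorageQueue.Enqueue("W1"); // worker announces capacity once

            // Normal message: the distributor pops an entry to pick a worker,
            // and the worker's "Ready" afterwards puts one back. Balanced.
            var worker = StorageQueue.Dequeue();
            StorageQueue.Enqueue(worker);
            Console.WriteLine("after a normal message: " + StorageQueue.Count); // 1

            // Timeout: dispatched straight to the worker's input queue, so
            // nothing is popped - but the worker still sends "Ready". Unbalanced.
            StorageQueue.Enqueue("W1");
            Console.WriteLine("after a timeout: " + StorageQueue.Count); // 2
        }
    }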
Workaround: Use NServiceBus 3.3.15 or later.