Take a step back.
Start by specifying all the desired and undesired characteristics before you start writing a solution. Some come to mind immediately:
- "Work" is performed on thread W. "UI" is performed on thread U.
- Work is done in "units of work". Each unit of work is "short" in duration, for some definition of "short". Call the method that does one unit of work M().
- Work is performed continuously by W, in a loop, until U tells it to stop.
- U calls the cleanup method D() when all the work is done.
- D() must never start before or while M() is running.
- Exit() should be called after D(), on thread U.
- U must never block for a "long" time; it is acceptable for it to block for a "short" time.
- No deadlocks, and so on.
Does this summarize the problem space?
First, I note that at first glance the problem seems to be that U is the caller of D(). If W were the caller of D(), you would not have to worry; you would simply signal W to exit the loop, and W would call D() after the loop. But that just trades one problem for another; presumably in that scenario, U must wait for W to call D() before U calls Exit(). So moving the call to D() from U to W does not make the problem any easier.
You said you did not want to use double-checked locking. Be aware that as of CLR v2, the double-checked locking pattern is considered safe; the memory model guarantees were strengthened in v2. So double-checked locking will probably be safe for you to use.
UPDATE: you asked for more information on (1) why is double-checked locking safe in v2 but not in v1? and (2) why did I use the weasel word "probably"?
To understand why double-checked locking is unsafe in the CLR v1 memory model but safe in the CLR v2 memory model, read the following:
http://web.archive.org/web/20150326171404/https://msdn.microsoft.com/en-us/magazine/cc163715.aspx
I said "probably" because, as Joe Duffy wisely says:
"once you venture even slightly outside of the bounds of the few 'blessed' lock-free practices [...] you are opening yourself up to the worst kind of race conditions."
I don't know whether you plan to use double-checked locking correctly, or whether you plan to write your own clever, broken variant of double-checked locking that happens to break on IA64 machines. Hence: it will probably work for you, if your problem really is amenable to double-checked locking and you write the code correctly.
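For concreteness, here is the shape of a correct double-checked lock, sketched in Python rather than C# since the structure is identical (the `Expensive` class and `get_instance` function are my own illustration, not anything from the question):

```python
import threading

class Expensive:
    """Stand-in for something costly to construct exactly once."""

_instance = None
_instance_lock = threading.Lock()

def get_instance():
    global _instance
    # First check, without the lock: the fast path once initialized.
    if _instance is None:
        with _instance_lock:
            # Second check, under the lock: another thread may have
            # initialized the instance while we waited for the lock.
            if _instance is None:
                _instance = Expensive()
    return _instance
```

The first check avoids taking the lock on the hot path; the second check under the lock is what makes the pattern correct when two threads race to initialize.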
If you're interested, you should read Joe Duffy's articles:
http://www.bluebytesoftware.com/blog/2006/01/26/BrokenVariantsOnDoublecheckedLocking.aspx
and
http://www.bluebytesoftware.com/blog/2007/02/19/RevisitedBrokenVariantsOnDoubleCheckedLocking.aspx
And this SO question has a good discussion:
Need volatile modifier for double checked lock in .NET
It is probably best to find some mechanism other than double-checked locking.
There is a mechanism for waiting for one thread to complete: Thread.Join. You can make the UI thread join the worker thread; when the worker thread shuts down, the UI thread wakes up again and performs the disposal.
UPDATE: added some information about Join.
"Join" basically means "thread U tells thread W to shut down, and U goes to sleep until that happens." A brief sketch of the shutdown method:
```csharp
// do this in a thread-safe manner of your choosing
running = false;
// wait for worker thread to come to a halt
workerThread.Join();
// Now we know that the worker thread is done, so we can
// clean up and exit
Dispose();
Exit();
```
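The same shutdown-by-join shape as a runnable sketch, in Python for illustration (`threading.Thread.join` plays the role of `Thread.Join`; a `threading.Event` stands in for the thread-safe "running" flag, and all names here are my own):

```python
import threading

work_done = 0                       # units of work completed (single writer: the worker)
stop_requested = threading.Event()  # thread-safe "stop running now" flag

def M():
    global work_done
    work_done += 1                  # one "short" unit of work

def worker():
    while not stop_requested.is_set():
        M()

worker_thread = threading.Thread(target=worker)
worker_thread.start()

# UI thread, shutdown logic:
stop_requested.set()    # signal the worker to stop after its current M()
worker_thread.join()    # sleep until the worker has come to a halt
# Now we know the worker is done, so it is safe to clean up:
cleaned_up = True       # stand-in for Dispose(); Exit()
```

Because the join happens before the cleanup, D() can never overlap a running M().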
Suppose you did not want to use Join for some reason. (Perhaps the worker thread needs to keep running to do something else, but you still need to know when it is done using the objects.) We can build our own mechanism that works like Join by using wait handles. You now need two locking mechanisms: one that lets U send a signal to W saying "stop running now", and another that waits while W finishes the last call to M().
What I would do in this case:
- Make a flag "running". Use whatever mechanism you find convenient to make it thread-safe; I would personally start with a lock around it. If you decide later that you can get away with interlocked operations, you can always change it then.
- Make an AutoResetEvent to act as the gate.
So, a brief sketch:
UI thread, startup logic:
```csharp
running = true;
waithandle = new AutoResetEvent(false);
// start up worker thread here
```
UI thread, shutdown logic:
```csharp
running = false; // do this in a thread-safe manner of your choosing
waithandle.WaitOne();
// WaitOne is robust in the face of race conditions; if the worker thread
// calls Set *before* WaitOne is called, WaitOne will be a no-op. (However,
// if there are *multiple* threads all trying to "wake up" a gate that is
// waiting on WaitOne, the multiple wakeups will be lost. WaitOne is named
// WaitOne because it WAITS for ONE wakeup. If you need to wait for multiple
// wakeups, don't use WaitOne.)
Dispose();
waithandle.Close();
Exit();
```
Worker thread:
```csharp
while (running) // make thread-safe access to "running"
    M();
waithandle.Set(); // Tell waiting UI thread it is safe to dispose
```
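Assembled into one runnable sketch, here in Python for illustration: `threading.Event` stands in for the `AutoResetEvent` gate (like `WaitOne`, its `wait()` is a no-op if `set()` has already been called), and the names are my own:

```python
import threading

running = True
running_lock = threading.Lock()   # guards "running"
gate = threading.Event()          # plays the role of the AutoResetEvent

def M():
    pass                          # one short unit of work

def worker():
    while True:
        with running_lock:        # thread-safe read of "running"
            if not running:
                break
        M()
    gate.set()                    # tell the waiting UI thread it is safe to dispose

w = threading.Thread(target=worker)
w.start()

# UI thread, shutdown logic:
with running_lock:
    running = False               # thread-safe write
gate.wait()                       # robust even if the worker set the gate first
# Dispose(); Exit() would go here; the worker has finished its last M()
```

The gate guarantees the ordering we specified up front: the wait cannot return until after the worker's final call to M() has completed.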
Note that this relies on the fact that M() is short. If M() takes a long time, then you can wait a long time to exit the application, which seems bad.
Make sense?
Really, though, you shouldn't be doing this. If you want to wait for the worker thread to complete before disposing the objects it is using, just join it.
UPDATE: Some additional questions:
Is it a good idea to wait without a timeout?
Indeed, notice that in both my Join example and my WaitOne example I do not use the variants that wait for a specific amount of time before giving up. Rather, I state my assumption that the worker thread shuts down cleanly and quickly. Is that the right thing to do?
It depends! It depends on just how badly-behaved the worker thread might be, and what it does when it misbehaves.
If you can guarantee that the work is short in duration, for whatever "short" means to you, then you do not need a timeout. If you cannot guarantee that, I would suggest first rewriting the code so that you can guarantee it; life becomes much easier if you know that the code will complete quickly when asked to.
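Mechanically, a wait with a timeout looks something like this (Python sketch; `join` with a timeout simply returns when time runs out, so you must check `is_alive()` afterwards to see whether the worker actually finished; the policy applied on timeout is the part you have to design):

```python
import threading
import time

def stubborn_worker():
    time.sleep(0.05)   # pretend this is work that takes a while

w = threading.Thread(target=stubborn_worker)
w.start()

w.join(timeout=2.0)    # wait, but give up after two seconds
if w.is_alive():
    # The worker did not shut down in time. There is no good option here;
    # you must now apply whatever policy you decided on in advance
    # (log and abandon, warn the user, terminate the process, ...).
    timed_out = True
else:
    timed_out = False  # safe to Dispose() and Exit()
```

The timeout does not solve the underlying problem; it only tells you that you now have the problem.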
If you cannot, then what? The assumption of this scenario is that the worker is ill-behaved and does not terminate in a timely manner when asked. So now we have to ask ourselves: "is the worker slow by design, buggy, or hostile?"
In the first scenario, the worker is simply doing something that takes a long time and for whatever reason cannot be interrupted. What's the right thing to do here? I have no idea. This is a horrid situation to be in. Presumably the worker is not shutting down quickly because doing so is dangerous or impossible. In that case, what are you going to do when the timeout expires? You have something that is dangerous or impossible to shut down, and it is not shutting down in a timely manner. Your choices seem to be (1) do nothing, (2) do something dangerous, or (3) do something impossible. Choice three is probably out. Choice one is equivalent to waiting forever, which we have already rejected. That leaves "do something dangerous".
Knowing what to do to minimize harm to user data depends on the exact circumstances that are causing the danger; analyze them carefully, understand all the scenarios, and work out the right course of action.
Now suppose the worker is supposed to be able to shut down quickly, but does not because it has a bug. Obviously, if you can, fix the bug. If you cannot fix the bug - perhaps it is in code you do not own - then again you are in a terrible fix. You have to understand the consequences of not waiting for already-buggy, and therefore unpredictable, code to finish before disposing of the resources that you know it is using right now on another thread. And you have to know the consequences of terminating the application while a buggy worker thread is still busy doing heaven only knows what to operating system state.
If the code is hostile and actively resists being shut down, then you have already lost. You cannot halt the thread by normal means, and you cannot even abort it. There is no guarantee whatsoever that aborting a hostile thread actually terminates it; the owner of the hostile code that you have foolishly started running in your process could be doing all of its work in a finally block or other constrained region that prevents thread abort exceptions.
The best thing to do is to never get into this situation in the first place; if you have code that you think is hostile, either do not run it at all, or run it in its own process, and terminate the process, not the thread, when things go badly.
In short, there is no good answer to the question "what do I do if it takes too long?" You are in a terrible situation if that happens, and there is no easy answer. It is best to work hard to ensure you do not get into it in the first place; only run cooperative, benign, safe code that always shuts itself down cleanly and rapidly when asked.
What if the worker throws an exception?
OK, so what if it does? Again, it is best not to be in this situation in the first place; write the worker code so that it does not throw. If you cannot do that, then you have two choices: handle the exception, or don't handle it.
Suppose you don't handle the exception. Since, I believe, CLR v2, an unhandled exception in a worker thread shuts down the whole application. The reason is that in the past, what would happen is that you'd start up a bunch of worker threads, they'd all throw exceptions, and you'd end up with a running application with no worker threads doing any work, and no information for the user about it. It is better to force the author of the code to handle the situation where a worker thread goes down due to an exception; the old behavior effectively hid bugs and made it easy to write fragile applications.
Suppose you do handle the exception. Now what? Something threw an exception, which by definition is an unexpected error condition. You now have no confidence whatsoever that any of your data is consistent, or that any of your program invariants hold in any of your subsystems. So what are you going to do? There is hardly anything safe you can do at this point.
The question is: "what is best for the user in this unfortunate situation?" It depends on what the application is doing. It is entirely possible that the best thing to do is to shut down aggressively and tell the user that something unexpected failed. That may be better than trying to muddle on and possibly making things worse, say, by accidentally destroying user data while trying to clean up.
Or, it is entirely possible that the best thing to do is to make a good-faith effort to preserve the user's data, tidy up as much state as possible, and terminate as normally as possible.
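One common shape for "handle the exception": catch it on the worker thread, record it, and let the UI thread decide after the join which policy to apply (Python sketch; all names here are my own):

```python
import threading

worker_error = None  # written only by the worker, read only after join()

def M():
    raise ValueError("something unexpected went wrong")

def worker():
    global worker_error
    try:
        M()
    except Exception as e:
        # By definition we are now in an unexpected state; record the
        # failure and stop, rather than muddling on.
        worker_error = e

w = threading.Thread(target=worker)
w.start()
w.join()

if worker_error is not None:
    # Apply your policy here: report to the user, attempt a good-faith
    # save of their data, then shut down as normally as possible.
    message = f"Worker failed: {worker_error}"
```

This keeps the decision about what to do with the failure on the UI thread, where the user can be informed, rather than buried in the worker.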
Basically, both of your questions come down to the same question: "What do I do when my subsystems do not behave themselves?" If your subsystems are unreliable, either make them reliable, or come up with a policy for how to deal with an unreliable subsystem and implement that policy. That is a vague answer, I know, but that's because dealing with an unreliable subsystem is an inherently awful situation to be in. How you deal with it depends on the nature of the unreliability and the consequences of that unreliability for the user's valuable data.