On March 31, I did a webcast for O'Reilly on material from Effective Modern C++. (The webcast was recorded, and the recording is available here.) The focus of the presentation was how void futures can be used to communicate between threads in situations where condition variables and/or boolean flags might otherwise be used. The example I employed was starting a thread in a suspended state, and the code I ultimately built up to was this:
{
  std::promise<void> p;                         // created first,
                                                // destroyed last
  std::thread t2([&p]
                 {
                   try {
                     p.get_future().get();
                     funcToRun();
                   }
                   catch(...) { … }
                 }
                );
  ThreadRAII tr(std::move(t2),                  // created after p,
                ThreadRAII::DtorAction::join);  // destroyed before p
  …
  p.set_value();
  …
}                                               // if p hasn’t been set, tr’s
                                                // dtor hangs on a join
I explained that this code has the drawback that if control flows through this block such that the ThreadRAII object tr is created, but p.set_value() isn't executed (e.g., because an exception gets thrown), tr will hang in its destructor waiting to join with a thread that will never finish. For details, watch the webcast or consult pages 266-269 of the printed book (i.e., the second half of Item 39).
In Effective Modern C++, I conclude my discussion of this matter with "There are ways to address this problem, but I’ll leave them in the form of the hallowed exercise for the reader," and I refer to a blog post I made in December 2013 that, along with the comments it engendered, examines the issue in detail. In the webcast, I offer an explanation for why I don't show how to deal with the problem: "I don't know of a solution that's clean, simple, straightforward, [and] non-error-prone."
Michael Lindner took me to task for this. He wrote, "I was disappointed at the end of the webcast on void futures, because there was hype about void futures being useful, but the only example was a non-working one with no solution." Ouch. He went on to suggest that it may only be necessary to join with t2 if it's unsuspended, i.e., if funcToRun is permitted to run. If we don't need to synchronize with the completion of funcToRun, Michael suggested, we could do a detach instead of a join. This would mean that before the call to set_value, tr's DtorAction should be detach, and it should be set to join only after set_value has been called. Adding a setter to ThreadRAII for its DtorAction would make that possible, and the code could look like this:
{
  std::promise<void> p;
  std::thread t2([fut = p.get_future()]() mutable  // mutable: future::get is non-const
                 {
                   try {
                     fut.get();
                     funcToRun();
                   }
                   catch(...) { … }
                 }
                );
  ThreadRAII tr(std::move(t2),                     // before set_value is called,
                ThreadRAII::DtorAction::detach);   // configure tr's dtor to detach
  …
  p.set_value();
  tr.setDtorAction(ThreadRAII::DtorAction::join);  // after set_value is called,
                                                   // configure tr's dtor to join
  …
}                                                  // if p hasn't been set, tr's dtor detaches and
                                                   // p's dtor sets an exception that unblocks t2
With this approach, if set_value isn't called, p's destructor will write a std::future_error exception (with error code std::future_errc::broken_promise) into the shared state that the future accesses, and that will cause the call to get in t2 to unblock and throw. The catch clause handles the exception, so t2 will run to completion without calling funcToRun. We won't know when t2 will complete, and we certainly can't guarantee that it will occur before the block containing tr finishes executing, but this will avoid the problem in my code whereby tr's destructor blocks waiting for a call to set_value that never occurs.
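The unblocking behavior is easy to check in isolation. The following is a minimal sketch, not code from the webcast: the function name brokenPromiseUnblocksWaiter and the flag sawBrokenPromise are made up for illustration. A thread waits on a void future, the promise is destroyed without set_value ever being called, and the waiter records whether get threw a std::future_error carrying std::future_errc::broken_promise.

```cpp
#include <future>
#include <thread>

// Hypothetical helper: returns true if a thread blocked in get() is released
// with a broken_promise error when the promise dies without being set.
bool brokenPromiseUnblocksWaiter()
{
  bool sawBrokenPromise = false;
  std::thread t;
  {
    std::promise<void> p;
    t = std::thread([fut = p.get_future(), &sawBrokenPromise]() mutable {
      try {
        fut.get();                       // blocks until the shared state is ready
      }
      catch (const std::future_error& e) {
        sawBrokenPromise =
          (e.code() == std::future_errc::broken_promise);
      }
    });
  }                                      // p destroyed here; set_value never called
  t.join();                              // returns because get() was unblocked
  return sawBrokenPromise;               // safe to read: join synchronizes with t
}
```

Calling brokenPromiseUnblocksWaiter() from main should return true rather than hang, which is the property the detach-based code relies on.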
An interesting subtlety of this code is the lambda's capture clause, which performs what amounts to a move capture of the future produced by p. This is in contrast to the reference capture of p present in my code. My code performs a join on t2, and t2's lifetime is less than p's, so p must exist while t2 runs. It's therefore safe for t2 to refer to p. In Michael's approach, the detach means it's possible for t2 to continue running after p has been destroyed. It's thus critical that t2 not access p, and the move capture of p's future arranges for that to be the case. I'm grateful to Tomasz Kamiński for bringing this issue to my attention, though I'll note that Yehezkel Bernat suggested the same thing (presumably on stylistic grounds) in group chat during the webcast.
But what if we really need to do an unconditional join with t2--even if funcToRun is never permitted to execute? In that case, a detach-based approach won't work. The fact that std::promise's destructor writes an exception into its shared state seems like it should be usable as the basis of a solution, but there appears to be a chicken-and-egg problem:
- In order to avoid having tr hang in its destructor, p must be destroyed before tr.
- That means that p must be defined after tr, because objects are destroyed in the inverse order of their construction.
- tr must be initialized with the function its thread will execute, and that function must wait on a future that we provide.
- To get a future, we need a std::promise.
- That means that p must be defined before tr, because we need p to produce the future for initialization of tr.
- Bullets 2 and 5 are contradictory.
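The ordering rule behind bullet 2 can be observed directly. In this small sketch, Tracer and destructionOrder are hypothetical names introduced only for the demonstration; each Tracer appends its name to a log when it is destroyed.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: records its name in a log on destruction.
struct Tracer {
  Tracer(std::string n, std::vector<std::string>& l)
  : name(std::move(n)), log(l) {}
  ~Tracer() { log.push_back(name); }

  std::string name;
  std::vector<std::string>& log;
};

std::vector<std::string> destructionOrder()
{
  std::vector<std::string> log;
  {
    Tracer p("p", log);                  // constructed first
    Tracer tr("tr", log);                // constructed second
  }                                      // destroyed in reverse: tr, then p
  return log;
}
```

With p defined before tr, tr's destructor always runs first, which is exactly the order that makes a join in tr's destructor hang when p has not been set.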
Tomasz Kamiński sent me some code that showed how to avoid the apparent contradiction. The essence of his approach is to define tr before p, but to initialize it without a function to run on the std::thread it wraps. Without a function to run, there is no need for a future on which that function should block. After p has been defined, tr can receive an assignment that specifies the function to run and the action to perform in tr's destructor. This function can block on a future in the usual manner. Tomasz uses a class to encapsulate the various subtleties in his solution (e.g., tr must be declared before p, tr must be default-initialized and then assigned to). In this code (which is based on his implementation, but is not identical to it), I'm assuming that ThreadRAII has been revised to support default construction and that a default-constructed ThreadRAII object contains a default-constructed std::thread, i.e., a std::thread with no function to execute:
class SuspendedTask {
public:
  // FuncType should be callable as if of type void(std::future<void>)
  template<typename FuncType>
  explicit SuspendedTask(FuncType&& f)
  {                           // no member init list ==> default-construct tr and p
    tr = ThreadRAII(std::thread([fut = p.get_future(),
                                 f = std::forward<FuncType>(f)]() mutable  // mutable: fut is moved from
                                { f(std::move(fut)); }),
                    ThreadRAII::DtorAction::join);
  }

  std::promise<void>& get_promise() { return p; }

private:
  ThreadRAII tr;              // tr must be listed
  std::promise<void> p;       // before p!
};
This class can be used as follows:
{
  SuspendedTask t2([](std::future<void> fut){
                     try {
                       fut.get();
                       funcToRun();
                     }
                     catch(...) { … }
                   });
  …
  t2.get_promise().set_value();
  …
}
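To try this out end to end, the pieces can be assembled into a self-contained sketch. Several things here are assumptions rather than anything taken from Tomasz's code or the book: the default-constructible ThreadRAII is one plausible revision, and runTask, release, and ran are names invented for the demonstration, with ran standing in for the effect of funcToRun.

```cpp
#include <atomic>
#include <future>
#include <thread>
#include <utility>

// Assumed revision of ThreadRAII: default-constructible and move-assignable.
class ThreadRAII {
public:
  enum class DtorAction { join, detach };

  ThreadRAII() = default;                // wraps a std::thread with no function
  ThreadRAII(std::thread&& t, DtorAction a)
  : action(a), t(std::move(t)) {}

  ~ThreadRAII()
  {
    if (t.joinable()) {
      if (action == DtorAction::join) t.join();
      else t.detach();
    }
  }

  ThreadRAII(ThreadRAII&&) = default;
  ThreadRAII& operator=(ThreadRAII&&) = default;

private:
  DtorAction action = DtorAction::join;
  std::thread t;
};

class SuspendedTask {
public:
  template<typename FuncType>
  explicit SuspendedTask(FuncType&& f)
  {
    tr = ThreadRAII(std::thread([fut = p.get_future(),
                                 f = std::forward<FuncType>(f)]() mutable
                                { f(std::move(fut)); }),
                    ThreadRAII::DtorAction::join);
  }

  std::promise<void>& get_promise() { return p; }

private:
  ThreadRAII tr;                         // declared first, destroyed last
  std::promise<void> p;                  // declared last, destroyed first
};

// Hypothetical demo: reports whether the task body ran.
bool runTask(bool release)
{
  std::atomic<bool> ran{false};
  {
    SuspendedTask task([&ran](std::future<void> fut) {
      try {
        fut.get();
        ran = true;                      // stands in for funcToRun()
      }
      catch (...) {}                     // a broken promise lands here
    });
    if (release) task.get_promise().set_value();
  }                                      // p is destroyed, then tr joins
  return ran;
}
```

Under these assumptions, runTask(true) should return true and runTask(false) should return false, and in both cases the block exits with the thread joined rather than hanging.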
A key element of this design is a class that bundles together a std::thread (inside the ThreadRAII object) and a std::promise. That idea was the crux of the first comment on my December 2013 blog post; it was made by Rein Halbersma. Other commenters on that post embraced the same idea, but as the thread went on, things got increasingly complicated, and by the end, I wasn't really happy with any of the solutions presented there. That's why I didn't use any of them in the book. In retrospect, I should have spent more time on this problem.
I'm grateful to Michael Lindner and Tomasz Kamiński for pushing me to revisit the problem of thread suspension through void futures, ThreadRAII, and exception-safe code.