Wednesday, April 8, 2015

More on ThreadRAII and Thread Suspension

On March 31, I did a webcast for O'Reilly on material from Effective Modern C++. (The webcast was recorded, and the recording is available here.) The focus of the presentation was how void futures can be used to communicate between threads in situations where condition variables and/or boolean flags might otherwise be used. The example I employed was starting a thread in a suspended state, and the code I ultimately built to was this:
{
  std::promise<void> p;                        // created first,
                                               // destroyed last
  std::thread t2([&p]
                 {
                   try {
                     p.get_future().get();
                     funcToRun();
                   }
                   catch(...) { … }
                 }
                );             

  ThreadRAII tr(std::move(t2),                  // created after p,
                ThreadRAII::DtorAction::join);  // destroyed before p

  …

  p.set_value();

  …

}                                   // if p hasn’t been set,tr’s
                                    // dtor hangs on a join
I explained that this code has the drawback that if control flows through this block such that the ThreadRAII object tr is created, but p.set_value() isn't executed (e.g., because an exception gets thrown), tr will hang in its destructor waiting to join with a thread that will never finish. For details, watch the webcast or consult pages 266-269 of the printed book (i.e., the second half of Item 39).

In Effective Modern C++, I conclude my discussion of this matter with "There are ways to address this problem, but I’ll leave them in the form of the hallowed exercise for the reader," and I refer to a blog post I made in December 2013 that, along with the comments it engendered, examines the issue in detail. In the webcast, I offer an explanation for why I don't show how to deal with the problem: "I don't know of a solution that's clean, simple, straightforward, [and] non-error-prone."

Michael Lindner took me to task for this. He wrote, "I was disappointed at the end of the webcast on void futures, because there was hype about void futures being useful, but the only example was a non-working one with no solution." Ouch. He went on to suggest that it may only be necessary to join with t2 if it's unsuspended, i.e., if funcToRun is permitted to run. If we don't need to synchronize with the completion of funcToRun, Michael suggested, we could do a detach instead of a join. This would mean that before the call to set_value, tr's DtorAction should be detach, and it should be set to join only after set_value has been called. Adding a setter to ThreadRAII for its DtorAction would make that possible, and the code could look like this:
{
  std::promise<void> p;
                       
  std::thread t2([fut = std::move(p.get_future())]
                 {
                   try {
                     fut.get();
                     funcToRun();
                   }
                   catch(...) { … }
                 }
                );

  ThreadRAII tr(std::move(t2),                      // before set_value is called,
                ThreadRAII::DtorAction::detach);    // configure tr's dtor to detach

  …

  p.set_value();
  tr.setDtorAction(ThreadRAII::DtorAction::join);   // after set_value is called,
                                                    // configure tr's dtor to join

  …

}                               // if p hasn't been set, its dtor sets an exception,
                                // thus preventing tr's dtor from hanging
With this approach, if set_value isn't called, p's destructor will write a std::future_error exception into the shared state that the future accesses, and that will cause the call to get in t2 to unblock. t2 will thus be able to run to completion. We won't know when t2 will complete, and we certainly can't guarantee that it will occur before the block containing tr finishes executing, but this will avoid the problem in my code whereby tr's destructor blocks waiting for a call to set_value that never occurs.

An interesting subtlety of this code is the lambda's capture clause, which performs what amounts to a move capture of the future produced by p. This is in contrast to the reference capture of p present in my code. My code performs a join on t2, and t2's lifetime is less than p's, so p must exist while t2 runs. It's therefore safe for t2 to refer to p. In Michael's approach, the detach means it's possible for t2 to continue running after p has been destroyed. It's thus critical that t2 not access p, and the move capture of p's future arranges for that to be the case. I'm grateful to Tomasz Kamiński for bringing this issue to my attention, though I'll note that Yehezkel Bernat suggested the same thing (presumably on stylistic grounds) in group chat during the webcast.

But what if we really need to do an unconditional join with t2--even if funcToRun is never permitted to execute? In that case, a detach-based approach won't work. The fact that std::promise's destructor writes an exception into its shared state seems like it should be able to be used as the basis of a solution, but there appears to be a chicken-and-egg problem:
  • In order to avoid having tr hang in its destructor, p must be destroyed before tr. 
  • That means that p must be defined after tr, because objects are destroyed in the inverse order of their construction.
  • tr must be initialized with the function its thread will execute, and that function must wait on a future that we provide.
  • To get a future, we need a std::promise.
  • That means that p must be defined before tr, because we need p to produce the future for initialization of tr.
  • Bullets 2 and 5 are contradictory.
Tomasz Kamiński sent me some code that showed how to avoid the apparent contradiction. The essence of his approach is to define tr before p, but to initialize it without a function to run on the std::thread it wraps. Without a function to run, there is no need for a future on which that function should block. After p has been defined, tr can receive an assignment that specifies the function to run and the action to perform in tr's destructor. This function can block on a future in the usual manner. Tomasz uses a class to encapsulate the various subtleties in his solution (e.g., tr must be declared before p, tr must be default-initialized and then assigned to). In this code (which is based on his implementation, but is not identical to it), I'm assuming that ThreadRAII has been revised to support default construction and that a default-constructed ThreadRAII object contains a default-constructed std::thread, i.e., a std::thread with no function to execute:
class SuspendedTask {
public:
  // FuncType should be callable as if of type void(std::future<void>)
  template<typename FuncType> 
  explicit SuspendedTask(FuncType&& f)
  {                         // no member init list ==> default-construct tr and p

    tr = ThreadRAII(std::thread([fut = p.get_future(), f = std::forward<FuncType>(f)]{ f(std::move(fut)); }), 
                    ThreadRAII::DtorAction::join); 
  }
 
  std::promise<void>& get_promise() { return p; }
 
private:
  ThreadRAII tr;            // tr must be listed 
  std::promise<void> p;     // before p!
};
This class can be used as follows:
{
  SuspendedTask t2([](std::future<void> fut){
                     try {
                       fut.get();
                       funcToRun();
                     }
                     catch(...) { … }
                   });

  …

  t2.get_promise().set_value();

  …

}
A key element of this design is a class that bundles together a std::thread (inside the ThreadRAII object) and a std::promise. That idea was the crux of the first comment on my December 2013 blog post; it was made by Rein Halbersma. Other commenters on that post embraced the same idea, but as the thread went on, things got increasingly complicated, and by the end, I wasn't really happy with any of the solutions presented there. That's why I didn't use any of them in the book. In retrospect, I should have spent more time on this problem.

I'm grateful to Michael Lindner and Tomasz Kamiński for pushing me to revisit the problem of thread suspension through void futures, ThreadRAII, and exception-safe code.

16 comments:

  1. The SuspendedTask class seems like half an abstraction. What you need is to move the lambda that handles the future.get() call inside the SuspendedTask constructor, and to replace the st.get_promise().set_value() with a resume_task() function:

    class SuspendedTask {
    public:
    // FuncType should be callable as if of type void()
    template
    explicit SuspendedTask(FuncType&& f)
    { // no member init list ==> default-construct tr and p
    tr = ThreadRAII(
    std::thread(
    [fut = p.get_future(), f = std::forward(f)]() mutable{
    try {
    fut.get();
    f();
    }
    catch(...) {}
    }),
    ThreadRAII::DtorAction::join);
    }

    void resume_task(){
    p.set_value();
    }

    private:
    ThreadRAII tr; // tr must be listed
    std::promise p; // before p!
    };

    int main()
    {
    SuspendedTask t2(funcToRun);

    t2.resume_task();
    }

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. @Anthony Williams: A drawback to your suggestion is that the client no longer has control over what happens if invoking get on the future yields an exception. One way to deal with that would be to offer the opportunity for clients to pass two function objects to the SuspendedTask constructor, one for the case where future::get yields a value, the other for when it yields an exception.

    ReplyDelete
  4. Yes, you're right that the client no longer has control if the get() throws, but this will be because resume_task() was never called, or because something has gone wrong internally. In these cases, you can probably handle the circumstance elsewhere, but I suppose you could add an overload of the SuspendedTask constructor in case someone wants to handle it there.

    The real scenario that would need handling is if f() throws, so you might want the overload just for that case. Of course, you could just wrap f() in a lambda to handle that.

    ReplyDelete
  5. @Anthony Williams: Exposing the communication channel (promise/future pair) makes the solution more general.

    As Scott pointed out with the future you are passing both value and exception.

    Furthermore if the future is passed the the consumer can perform other actions before going to the suspended state (that need to be done anyway) or test if the value was provided. Invoking the get at beginning of the thread is not always best solution.

    Furthermore providing the single resume_task method instead of future block the possibility to fully utilize its features - likes passing specific exception that result in lack of produced value.

    In my original design I used the Consumer template instead of Suspended Task (T = void) that allowed to pass any value between tasks, not only a signal.

    ReplyDelete
  6. I agree that it is more general if the future is exposed, however I think that is a bad thing: it requires the user to write additional scaffolding for the simple case of creating a suspended task.

    In the general case, you are using a future to pass a value or exception into the middle of a thread's operation. I don't think you could describe this as a "Suspended Task", and as such I think it warrants a separate class, if it needs a specific class at all.

    ReplyDelete
  7. @Anthony Williams: I don't agree that my solution adds additional burden - if run code in separate thread I need to handle exception anyway (otherwise it will terminate), so having probably throwing get is not a problem.

    In contrast your approach is silently consuming non-only exceptions from the future but also one thrown in user provided lambda (empty catch). That is makes code more error prone, that terminating the program - it is really easier to find a source on terminate, than silently discarded exception.

    ReplyDelete
  8. I just cut-and-pasted the code from Scott's example usage. Thinking about it, I would just have the get() on its own in the try block, and return in the catch block, leaving f() outside, where an escapee exception will terminate the application.

    If the function never throws, this is still fine, but if it does then it is now clear to the user that the exception needs to be handled.

    The additional burden for the client of handling the future is small, but present, and something to avoid in library code.

    ReplyDelete
  9. @Anthony Williams:
    The additional burden for the client of handling the future is small, but present, and something to avoid in library code.

    I think that the benefits provided by the future, ie:
    1. option to perform some action before suspension
    2. way to transfer more detailed error information from original thread to one that is suspended (this is also user exception)
    3. the consumer is able to see (and handle if needed) broken_promise.
    Overthrown this small burden.

    ReplyDelete
  10. You're welcome to do what you want in your code. I don't think the tradeoff is worth it. In particular, the simple case of a suspended task as described in Scott's article is much more complicated if you expose the future. Scott's example code:

    {
    SuspendedTask t2(
    [](std::future fut){
    try {
    fut.get();
    funcToRun();
    } catch(...) {

    }
    });

    t2.get_promise().set_value();

    }

    code with my simplified version:

    {
    SuspendedTask t2(funcToRun);

    t2.resume_task();

    }

    No messing around with futures and promises, and no message chains, just a simple to-the-point class.

    If you need the benefits granted by exposing the future, then it's not unnecessary overhead, it's core functionality. I'm not sure what you gain from wrapping it in a class, rather than just writing the promise directly, but if it works for you, I won't stop you.

    ReplyDelete
  11. This comment has been removed by the author.

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. @Anthony Williams: I am hiding following code sinplet that includes subtle ordering requirements:

    ThreadRAII tr;
    std::promise p;
    std::thread t2([fut = std::move(p.get_future())]
    {
    try {
    fut.get(); funcToRun();
    } catch (...) {
    }
    });
    tr = ThreadRAII(std::move(t2), ThreadRAII::DtorAction::join);

    That from first glance looks at it could be simplified to:
    std::promise p;
    std::thread t2([fut = std::move(p.get_future())]
    {
    try {
    fut.get(); funcToRun();
    } catch (...) {
    }
    });
    ThreadRAII tr(std::move(t2), ThreadRAII::DtorAction::join);

    But that simple/obvious refactoring reintroduces original problem.

    For your simplified sniplet - you are assuming that funcToRun is not throwing exceptions - I think that is not usually a case. Otherwise the gain is marginal.

    ReplyDelete
  14. I'm assuming that funcToRun() doesn't leak exceptions under normal usage, and that if it does then I want the application to terminate, just the same as if I'd passed it to std::thread directly.

    From my point of view, funcToRun() is whatever you want to run on the thread (but suspended for now), including any necessary exception handling.

    Mostly I use the suspended task pattern for testing, and I've been happy to write the boilerplate every time, but Scott's blog post made me think about how to remove it, and I'd therefore like to minimise it as much as possible.

    Looking at your example code showing the ordering problem, I'd probably rewrite it as:

    JoiningThread tj;
    std::promise p;
    tj.launch([fut = std::move(p.get_future())]
    {
    try {
    fut.get(); funcToRun();
    } catch (...) {
    }
    });

    // ...

    p.set_value();

    JoiningThread wouldn't have a launching constructor; the launch() member function would forward its arguments to create a std::thread.

    This then reduces the potential ordering problem: you only need to ensure that the JoiningThread object is constructed before the promise, rather than having to do non-intuitive move-assignment.

    ReplyDelete
  15. This comment has been removed by the author.

    ReplyDelete
  16. I do not think that your JoiningThread solution is addressing my issue. The "non-intuitivness" for me does not lay in the move-assigment, but in the fact that the working code seems to break basic principle that objects should be definied when they are needed. In case of both sniplets, wrapper (ThreadRAII, JoiningThread) needs to be declared before promise and from the first glance it looks like this object is declared to early: it is not needed to create a promise, so it should be declared after it.

    With my wrapper I am encapsulating this need to break define variables at use point principle (and only it), so the code is not accidentally broken by otherwise looking to be legit refactoring.

    ReplyDelete