The View from Aristeia: March 2013

Monday, March 25, 2013

Thread Handle Destruction and Behavioral Consistency

Suppose you fire up a thread in a function, then return from the function without joining or detaching the thread:

void doSomeWork();

void f1()
{
  std::thread t(doSomeWork);
  ...                          // no join, no detach
}

What happens?

Your program is terminated. The destructor of a std::thread object that refers to a "joinable" thread calls std::terminate.

Now suppose you do the same thing, except instead of firing up the thread directly, you do it via std::async:

void f2()
{
  auto fut = std::async(std::launch::async, doSomeWork);
  ...                          // no get, no wait
}

Now what happens?

Your function blocks until the asychronously running thread completes. This is because the shared state for a std::async call causes the last future referring to that shared state to block in its destructor. Practically speaking, the destructor for the final future referring to a std::async shared state does an implicit join on the asynchronously running thread.

(The behavior I'm describing is mandated by the standard. Some implementations, notably Microsoft's, don't behave this way, because the standardization committee is considering changing this aspect of the standard, and Microsoft has implemented the revised behavior they believe will ultimately be adopted.)

Finally, suppose you create a packaged_task for the function to be run asynchronously, then you detach from the thread running the packaged_task, while retaining the future for the packaged_task:

void f3()
{
  std::packaged_task<void()> pt(doSomeWork);
  auto fut = pt.get_future();
  std::thread(std::move(pt)).detach();
  ...                          // no get, no wait
}

Now what happens?

Your function returns, even if the function to be run asynchronously is still running. In essence, the thread is detached. The destructor of the thread object no longer refers to a joinable thread (thanks to the call to detach), so it doesn't call std::terminate, and the destructor of the std::future refer doesn't refer to a shared state for a call to std::async, so it doesn't performs an implicit join.

"So what's your point?," you may be wondering. Well, we can think of both std::thread objects and futures as handles for asynchronously running threads, and it's interesting to note that when such handles are destroyed, in some cases, we terminate, in others we do an implicit join, and in others we do an implicit detach. As I've been known to put it, the standardaization committee, when faced with a choice of three possible behaviors, chose all three.

In fact, I'm making this post at the request of a member of the standardization committee who thought it would be worthwhile to point out this inconsistency in the standard's treatment of thread handles. Whether anything will be done about it remains to be seen. If the specification for std::async is modified such that its shared state no longer causes the blocking behavior I described in my last post, that would eliminate the implicit join behavior, but I'm not convinced that such a change is a shoe-in for adoption. The problem is that such a change to the standard could silently break the behavior of existing programs (i.e., code that depends on the implicit join in future destructors that are the final reference to a shared state coming from std::async), and the standardization committee is generally very reluctant to adopt changes that can silently change the behavior of conforming programs.

Scott

Wednesday, March 20, 2013

std::futures from std::async aren't special!

This is a slightly-revised version of my original post. It reflects information I've since received that confirms some of the suppositions I'd been making, and it rewords some things to clarify them.

It's comparatively well known that the std::future returned from std::async will block in its destructor until the asynchronously running thread has completed:

void f()
{
  std::future<void> fut = std::async(std::launch::async, 
                                     [] { /* compute, compute, compute */ });

}                                    // block here until thread spawned by
                                     // std::async completes

Only std::futures returned from std::async behave this way, so I had been under the impression that they were special. But now I believe otherwise. I now believe that all futures must behave the same way, regardless of whether they originated in std::async. This does not mean that all futures must block in their destructors. The story is more nuanced than that.

There's definiately something special about std::async, because futures you get from other sources (e.g., from a std::promise or a std:: packaged_task) don't block in their destructors. But how does the specialness of std::async affect the behavior of futures?

C++11 futures are the caller's end of a communications channel that begins with a callee that's (typically) called asynchronously. When the called function has a result to communicate to its caller, it performs a set operation on the std::promise corresponding to the future. That is, an asynchronous callee sets a promise (i.e., writes a result to the communication channel between it and its caller), and its caller gets the future (i.e., reads the result from the communications channel).

(As usual, I'm ignoring a host of details that don't affect the basic story I'm telling. Such details including return values versus exceptions, waiting versus getting, unshared versus shared futures, etc.)

Between the time a callee sets its promise and its caller does a corresponding get, an arbitrarily long time may elapse. (In fact, the get may never take place, but that's a detail I'm ignoring.) As a result, the std::promise object that was set may be destroyed before a get takes place. This means that the value with which the callee sets the promise can't be stored in the promise--the promise may not have a long enough lifetime. The value also can't be stored in the future corresponding to the promise, because the std::future returned from std::async could be moved into a std::shared_future before being destroyed, and the std::shared_future could then be copied many times to new objects, some of which would subsequently be destroyed. In that case, which future would hold the value returned by the callee?

Because neither the promise nor the future ends of the communications channel between caller and callee are suitable for storing the result of an asynchronously invoked function, it's stored in a neutral location. This location is known as the shared state. There's nothing in the C++ standard library corresponding to the shared state. No class, no type, no function. In practice, I'm guessing it's implemented as a class that's templatized on at least the type of the result to be communicated between callee and caller.

The special behavior commonly attributed to futures returned by std::async is actually determined by the shared state. Once you know what to look for, this is indicated in only moderately opqaque prose (for the standard) in 30.6.8/3, where we learn that

The thread object [for the function to be run asynchronously] is stored in the shared state and affects the behavior of any asynchronous return objects [e.g., futures] that reference that state.

and in 30.6.8/5, where we read:

the thread completion [for the function run asynchronously] synchronizes with [i.e., occurs before] [1] the return from the first function that successfully detects the ready status of the shared state or [2] with the return from the last function that releases the shared state, whichever happens first.

It's provision [2] that's relevant to us here. It tells us that if a future holds the last reference to the shared state corresponding to a call to std::async, that future's destructor must block until the thread for the asynchronously running function finishes. This is a requirement for any future object. There is nothing special about std::futures returned from std::async. Rather, the specialness of std::async is manifested in its shared state.

By the way, when I write that the "future's destructor must block," I don't mean it literally. The standard just says that the function releasing the last reference to a shared state corresponding to a std::async call can't return as long as the thread for the asynchronously running function is still executing. That behavior doesn't have to be implemented by having a future's destructor directly block. The future destructor might simply call a member function to decrement the reference count on the shared state. Inside that call, if the result of the decrement was zero and the shared state corresponded to a std::async call, the member function would simply wait until the thread running the asynchronously running function completed before it returned to the future destructor. From the future's point of view, it merely made a synchronous call to a function to decrement the reference count on the shared state. The runtime behavior, however, would be that it could block until the asynchronously running thread completed.

The provision stating that, essentially, the shared state corresponding to a call to std::async must somehow indicate that the last future referring to them must block until the associated thread has finished running, is not popular. It's been proposed to be changed, and some standard library implementations (e.g., Microsoft's) have already revised their implementations to eliminate the "futures from std::async block in their destructors" behavior. That makes it trickier for you to test the behavior of this part of the standard, because the library you use may be deliberately nonconformant in this area.

Scott

PS - The reason I got caught up in this matter was that I was trying to find a way to perform the moral equivalent of a detach on a thread spawned via std::async. Because I believed it was the std::future returned from std::async that was special, I started experimenting with things like moving that std::future into a std::shared_future in an attempt to return from the function calling std::async before the asynchronously running function had finished. But since it's the shared state that's special, not the std::future, this approach seems doomed. If you know how to get detach-like behavior when using std::async (without the cooperation of the function being run asynchronously), please let me know!

Wednesday, March 13, 2013

The Line-Length Problem

The bane of publishing code for consumption on a variety of platforms is that the available horizontal space varies. I've blogged elsewhere that I want to avoid horizontal scrolling or bad line breaks in code, and I'm working with my publisher on how to do that. I'd like your help, too.

My understanding is that on Kindle and iPad (the platforms for which I currently have some data), the size of the text you see depends on both the font size specified in the document's CSS (which you, as a reader, typically can't control) as well as on the font size specified for the device (which you, as a reader, typically can). The response to my earlier post about font choices showed a marked preference for code in a fixed-pitch font, so that's what I plan to use in Effective C++11. I've received the following information regarding how many characters fit on a line in Kindle and iPad in various combinations of device and CSS font sizes and device orientations:

It's interesting that on iPad, using the device in landscape mode shows two columns instead of one, thus providing less horizontal line space per line. As an author, this means I actually have more room to work with when the device is used in portrait mode.

As you can see, if I limit my code displays to 45 characters per line, that should display without problems under all but two combinations of settings above. I think that 45 characters per line would look strange on devices with more horizontal room, however, and the data also show that for many combinations of settings, I could use up to 60 characters per line (which is about what I'd have in a printed book). Not being a fan of lowest-common-denominator constraint satisfaction (i.e., not penalizing people with devices and settings for wider lines for the benefit of people with devices and settings for narrower lines) my thought is that I'll format my code displays twice, once with no more than 45 characters/line and once with up to 60. As an example of what that could mean in real life, here's some sample code from Item 3 of the current (third) edition of Effective C++. As is all code in that book, it's in a proportional font:

Here it is formatted in a fixed-pitch font with no more than 60 characters/line:

class TextBlock {  
public:
  ...  

  const char&
  operator[](std::size_t position) const   // operator[] for
  { return text[position]; }               // const objects

  char&
  operator[](std::size_t position)       // operator[] for
  { return text[position]; }             // non-const objects

private:
  std::string text;  
};

And here it is again with no more than 45 characters/line:

class TextBlock {  
public:
  ...  

  // operator[] for const objects
  const char&
  operator[](std::size_t position) const
  { return text[position]; }

  // operator[] for non-const objects
  char& operator[](std::size_t position)
  { return text[position]; }

private:
  std::string text;  
};

Do you think it's worth my formatting code displays twice, once for wide lines and once for narrow ones, or do you think that using narrow formatting everywhere would suffice? Don't worry about how much work it is for me. That's my problem. Focus on what would work better for you.

Assuming for the moment that formatting the code twice is preferable, there's a logistical issue that has to be addressed, namely, how to write a single manuscript that can generate documents with one of two sets of code displays. My plan had been to use Microsoft Word and to use conditional text to switch between code displays, i.e., to set up "wide" and "narrow" configurations and hide the code displays that did not correspond to the current configuration. Alas, Microsoft Word 2010 (the version I'm using) lacks support for conditional text, something that quite surprised me, because both FrameMaker and OpenOffice/LibreOffice have had it for years. Switching to a different document authoring system leads to new problems, because the publication process likely to be followed by my book is likely to involve Microsoft Word as the point of entry, meaning that even if I produce my manuscript using, say, OpenOffice, that's likely to be converted into Word as step 0, so what Word can't represent is likely to be troublesome. (Before you bombard me with suggestions to use LaTeX or some other markup language, I'm on record as viewing those as inferior to WYSIWYG systems, as I detail here.)

Do you have any ideas about how I should approach the production of code displays that look good on all "reasonable" publication platforms and that can reasonably be produced and maintained by my authoring tool, which is highly likely to be Word 2010?

Thanks,

Scott

Friday, March 1, 2013

C++ and Beyond 2013 Registration has Begun!

Registration for this year's C++ and Beyond with me, Herb Sutter, and Andrei Alexandrescu is now open! Participation is limited to 64 developers. That's about two-thirds the demand of prior years, which means that not only will C&B 2013 sell out, it's likely to sell out quickly.

For details on this year's C++ and Beyond, consult its web site. Bottom line? If you're interested in joining a small group of developers as well as me, Herb, and Andrei for three intense days of C++ and C++-related topics December 9-12 at the Salish Lodge and Spa near Seattle, you'll want to register soon.

I look forward to seeing you there!

Scott

The View from Aristeia