Thursday, October 11, 2012

Parameter Types in Constructors

[This is the first time I've tried to include code fragments in a blog post, and let's just say I was surprised at how badly Blogger deals with them.  I apologize for any formatting problems, and I welcome suggestions on how to get Blogger to swallow code displays without throwing a fit.]


I recently went through Sumant Tambe's presentation materials from his Silicon Valley Code Camp presentation, "C++11 Idioms."  He argues that an emerging idiom is to pass arguments to constructors by value, because this takes advantage of move opportunities when they are available.  I was surprised to read about this emerging idiom, in part because I had not heard of it (I'm supposed to be clued in about this kind of stuff) and in part because it runs contrary to my own thinking on the subject, which is to use perfect forwarding. 

His presentation goes on to discuss the general problem of efficient parameter passing, based on a trio of blog posts by










Suppose I have a Widget class with a std::string data member.  In C++98, I'd initialize it like this:
class Widget1 {
public:
  Widget1(const std::string& n): name(n) {}

private:
  std::string name;
};
The problem here is that if an rvalue is passed to the Widget1 constructor, name is copy-initialized from n instead of being move-initialized.  That's easy to fix using perfect forwarding, which is designed to optimize this kind of thing:
class Widget2 {
public:
  template <typename T>
  Widget2(T&& n): name(std::forward<T>(n)) {}

private:
  std::string name;
};
No muss, no fuss, no enable_if, no testing for whether the argument passed to n was an rvalue--none of that stuff.  Very nice.
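
To make the copy-versus-move difference visible, here is a minimal sketch that swaps std::string for a hypothetical Name type that counts its own copies and moves. Name, ConstRefWidget, and ForwardingWidget are names invented for this illustration. They mirror the shape of Widget1 and Widget2 but are not part of the code above.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for std::string that counts copies and moves.
struct Name {
  static int copies;
  static int moves;
  std::string text;

  Name(const char* s): text(s) {}
  Name(const Name& rhs): text(rhs.text) { ++copies; }
  Name(Name&& rhs) noexcept: text(std::move(rhs.text)) { ++moves; }
};
int Name::copies = 0;
int Name::moves = 0;

// Same shape as Widget1: const reference parameter, so the member is
// always copy-constructed.
class ConstRefWidget {
public:
  ConstRefWidget(const Name& n): name(n) {}

private:
  Name name;
};

// Same shape as Widget2: universal reference parameter plus std::forward.
class ForwardingWidget {
public:
  template <typename T>
  ForwardingWidget(T&& n): name(std::forward<T>(n)) {}

private:
  Name name;
};
```

Constructing a ConstRefWidget from a temporary Name bumps the copy counter, while the same call on ForwardingWidget bumps only the move counter. Passing a string literal to ForwardingWidget constructs the member directly from the const char*, with neither a copy nor a move.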

But let's suppose there's another way to initialize a Widget.  Instead of passing a name, we can pass an ID number, and the name can be looked up from the ID number.  In C++98, adding this would be trivial:
class Widget3 {
public:
  Widget3(const std::string& n): name(n) {}
  Widget3(int id): name(findName(id)) {}

private:
  std::string name;
};
Adding this to the version of Widget with the perfect forwarding constructor, however, leads to grief.  But not immediately. The class will compile without any trouble, and simple tests will work as expected:
class Widget4 {
public:
  template <typename T>
  Widget4(T&& n): name(std::forward<T>(n)) { std::cout << "T&&\n"; }
  Widget4(int id): name(findName(id)) { std::cout << "int\n"; }

private:
  std::string name;
};

Widget4 w1("Hello");      // calls perfect forwarding ctor
Widget4 w2(22);           // calls int constructor
But slightly trickier tests fail.  For example, if the type of the passed ID isn't exactly int--e.g., a long or a short or an unsigned--the perfect forwarding constructor is a better match, and the code will try to initialize a std::string with a numerical value.  This makes no sense, and the compiler will exhibit no reluctance in telling you that.  Given
Widget4 w3(22L);          // pass a long
gcc 4.7 says this:
ctors.cpp: In instantiation of 'Widget4(long int &&)':
ctors.cpp:40:18:   required from here
ctors.cpp:18:42: error: invalid conversion from 'long int' to 'const char *' [-
    fpermissive]

basic_string.h:487:7: error: init. arg 1 of 'basic_string<
        char; _Traits = char_traits<char>; _Alloc = allocator<char>, _Traits
      , _Alloc
    >::basic_string(
        const char; _Traits = char_traits<char>; _Alloc = allocator<char> *
      , const _Alloc &
    )' [-fpermissive]
In file included from c:\users\scott\apps\mingw\bin\../lib/gcc/i686-pc-
    mingw32/4.7.0/../../../../include/c++/4.7.0/string:54:0,
from ctors.cpp:1
The fundamental problem is that perfect forwarding and overloading make very bad bedfellows, because perfect forwarding functions want to take everything. They're the greediest functions in C++.
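
The greediness can be observed without triggering a compilation error. The sketch below uses a pair of free functions, both named chosen (a name made up for this illustration), that have the same parameter lists as the two Widget4 constructors and simply report which overload was selected.

```cpp
#include <string>

// Hypothetical probe functions: same parameter lists as Widget4's
// constructors, but they only report which overload was picked.
template <typename T>
std::string chosen(T&&) { return "T&&"; }

std::string chosen(int) { return "int"; }
```

With these, chosen(22) returns "int", because the two overloads match equally well and the non-template is preferred. But chosen(22L), chosen((short)22), chosen(22u), and chosen('x') all return "T&&", because the template can be instantiated to match the argument exactly, while the int overload requires a conversion or promotion.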

To curb their avarice, you have to limit the conditions under which they can be instantiated, and that's where enable_if comes in.  And where enable_if comes in, any semblance of beauty leaves.  For example, if you'd like to disable the perfect forwarding template for integral types (thus allowing the int overload to handle those types), you could code it this way:
class Widget5 {
public:
  template <typename T>
  Widget5(T&& n, typename std::enable_if<!std::is_integral<T>::value>::type* = 0)
    : name(std::forward<T>(n)) { std::cout << "T&&\n"; }
  Widget5(int id): name(findName(id)) { std::cout << "int\n"; }

private:
  std::string name;
};
This code behaves as we'd like, assuming (sigh) we'd like to treat a char as a number:
  Widget5 w3a("Sally");           // calls perfect forwarding ctor
  Widget5 w3b(44);                // calls int ctor
  Widget5 w3c(44L);               // calls int ctor
  Widget5 w3d((short)44);         // calls int ctor
  Widget5 w3e('x');               // calls int ctor
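
One caveat: the tests above pass only rvalues. For an lvalue argument, T is deduced as a reference type (e.g., long& for a long variable), and std::is_integral<long&>::value is false, so the template stays enabled and is again the better match. Here is a sketch of one way to handle that, assuming it's acceptable to strip the reference with std::remove_reference before applying the trait. Widget5b, getName, ctorUsed, and the findName stub are names made up for this illustration.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the findName lookup used above.
inline std::string findName(int id) { return "name#" + std::to_string(id); }

class Widget5b {
public:
  // Strip the reference before testing, so T = long& is rejected too.
  template <typename T>
  Widget5b(T&& n,
           typename std::enable_if<
             !std::is_integral<
               typename std::remove_reference<T>::type
             >::value
           >::type* = 0)
    : name(std::forward<T>(n)), via("T&&") {}

  Widget5b(int id): name(findName(id)), via("int") {}

  const std::string& getName() const { return name; }
  const char* ctorUsed() const { return via; }

private:
  std::string name;
  const char* via;   // records which constructor ran (illustration only)
};
```

With this constraint, a long variable passed to the constructor goes to the int overload, while std::string lvalues and string literals still go to the perfect forwarding constructor. The price is a condition that is even harder to read than the one in Widget5.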
If more constructor overloads are added, or if the constraints on when the perfect forwarding constructor is applicable become more complicated, the enable_if-related code can become, er, unpleasant. Such unpleasantness is presumably the noise that Sumant Tambe refers to when he rejects perfect forwarding.

In those cases, I think his preferred solution to the problem of declaring constructor parameters--take everything by value--is reasonable.   The generated code isn't quite as efficient as you'd get from perfect forwarding, but the overloading rules are a lot easier to understand, and the resulting code is easier to both read and write.
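
For comparison, here is a minimal sketch of what the pass-by-value version could look like. Widget6, getName, and the findName stub are names made up for this illustration, and this is my reading of the idiom rather than code taken from the presentation.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the findName lookup used above.
inline std::string findName(int id) { return "name#" + std::to_string(id); }

class Widget6 {
public:
  // By-value parameter: copy-constructed from lvalues, move-constructed
  // from rvalues, then moved into the data member in either case.
  Widget6(std::string n): name(std::move(n)) {}
  Widget6(int id): name(findName(id)) {}

  const std::string& getName() const { return name; }

private:
  std::string name;
};
```

Neither constructor is a template, so ordinary overload resolution applies: a long, a short, or a char converts to int and selects the int constructor, with no enable_if in sight. The cost relative to perfect forwarding is one extra move construction of the std::string.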

I still think that perfect forwarding is the language tool you should prefer when you need to write constructors and similar functions (e.g., setters) that simply shuttle values from one place to another.  That's what it's designed for.  If you need to overload such functions, however, things get very messy very quickly, and except in the most demanding of performance-sensitive applications and libraries, I think it's reasonable to fall back on pass-by-value as the parameter-passing mechanism.

But these are still early days in C++11 programming, and we're all still learning.  I welcome your thoughts on how to declare constructor parameters, the proper role of perfect forwarding, and the wisdom of passing parameters by value.

Scott

24 comments:

  1. The IS_RVALUE macro I used is clearly extremely verbose. I knew I needed a better way of finding out whether a parameter is an rvalue reference or not. So, thank you! I also like how you call perfect forwarding functions the "greediest functions" and T&& a universal reference. Very easy to remember.

    You've shown interesting pitfalls of perfect forwarding, particularly that it does not mix well with overloading. The reality is worse, unfortunately.

    As it turns out, perfect forwarding does not mix well with initializer lists either. Argument deduction/substitution fails when T&& is used in combination with an initializer list. In your example, while a std::string object can be initialized with {'A', 'B', 'C'}, Widget4 can't be initialized with it. Pass-by-value would avoid the problem.

    Interestingly gcc 4.7 supports a (now deprecated) language extension that allows the compiler to deduce T as initializer_list. Also see -fno-deduce-init-list

    Moreover, when perfect forwarding is used, IDEs offer little to no help with auto-completion. There is no context to provide suggestions. Programmers will make mistakes, and we still have no respite from the age-old problem of long, unwieldy compiler errors, although the clang team is doing wonders in this area.

    If you use perfect forwarding for the binary plus operator, you will feel an uncontrollable urge to use enable_if and constrain the possible types. Otherwise, the whole world becomes addable. Without something like Concepts, there is no easy way around it.

    IMO, perfect forwarding is a tool for generic library developers and therefore, must be used judiciously in application code. Pass-by-value is so much cleaner and easier to understand when used with well-designed libraries.

  2. I was discussing this with a client yesterday.

    Pass by const reference is still useful if you don't intend to copy or modify the supplied object, since it will accept both lvalues and rvalues without introducing spurious copies.

    On the other hand, pass by value is great if you intend to take a copy locally within the function, since it can eliminate copying when passing rvalues: potentially, a temporary object passed as the parameter could be constructed directly into the parameter (copy elision), whereas with a const reference and internal copy you need two objects: the parameter object and the internal copy.

    If you're copying the value somewhere else, such as a data member, then this loses its shine somewhat, except that you can now move the object into the data member, which may well be almost as efficient.

    Yes, "perfect forwarding" works well in these scenarios from an efficiency point of view, but there is a downside. You mentioned that perfect forwarding is greedy: this means you lose any type safety. If your member had been a std::vector rather than a std::string, the perfect forwarding constructor would have gladly accepted the long value, and constructed a vector with that many elements. Not necessarily what was wanted. To constrain a perfectly-forwarded argument to things implicitly convertible to the target type is non-trivial.

  3. In C++11 `enable_if` can be used in a much less cumbersome manner. I explain it all here: http://rmartinho.github.com/2012/06/01/almost-static-if.html

  4. About displaying lines of code on blogger I suggest you look at Syntax Highlighter.

    I've tried this in the past and it worked real fine for me.

    Hope it helps.

  5. Passing by value also implicitly strengthens the ability to fire and forget a function, that is, to call it in a separate thread and at least not have to worry about parameter lifetime.

  6. Just for reference: I think the idiom of passing constructor parameters by value is based on Dave Abrahams's article "Want Speed? Pass by Value."

  7. In terms of readability and compile times would it not be better to have one single argument constructor which takes everything, then use tag dispatch in the constructor body to actually do the work?

  8. Using Martinho's idea would be perfect:

    template
    Widget2(T&& n): name(std::forward(n)) {}

  9. The problem with passing by value is that if you don't make a copy of the argument, then this can result in significant overhead. So, ideally, we would want to pass by value if we are making a copy and pass by const reference otherwise.

    Unfortunately, this approach itself has some serious drawbacks. First of all, we embed the assumption about what the implementation will do with the argument in the interface. This is not very clean. Furthermore, it is not always clear whether we will make a copy of the argument or not (i.e., if we pass it to some other function, etc).

    In the series of posts that you mentioned, I suggest that we differentiate between copy/non-copy arguments conceptually, rather than "actually". It seems that such an approximation will give pretty good results in most cases.

  10. @Sumant Tambe: the fact that braced initializer lists don't work with perfect forwarding has nothing to do with perfect forwarding. It's a template restriction. You can't pass braced initializer lists to templates even if the parameter is passed by value. (Note that you can pass std::initializer_list objects. It's just braced lists you can't pass.)

    As for perfect forwarding and operator+, one of the things I had hoped to look into (but haven't yet had time for) was whether the use of namespaces would avoid the path to enable_if-Hell. The general idea is to put the template into (in the example you used) a matrix namespace, then let the compiler use ADL to find it. That should avoid the need for enable_if. But I haven't tried it yet.

  11. @Martinho Fernandes: Your technique is very nice, and I'm sure it will find widespread use, but it doesn't change what I view as one of the foundational problems of enable_if-Hell, which is that conditions on different templates have to be disjoint. So if I want to write perfect forwarding templates for constraint sets A, B, and C, I have to EnableIf< A && !B &&!C > and EnableIf< !A && B &&!C > and EnableIf< !A && !B && C >. With your syntax, this is a substantially less unpleasant section of Hell than using std::enable_if, but it's still unpleasant. What we really want is partial ordering of perfect forwarding templates, but we don't have it.

    Let me say again, however, that I think the techniques you show are tremendously useful, and I'm very glad you blogged about them.

  12. @Anonymous: Regarding
    "In terms of readability and compile times would it not be better to have one single argument constructor which takes everything, then use tag dispatch in the constructor body to actually do the work?", remember that we're dealing with constructors, and that means we want to use the member initialization list whenever possible. So any technique that requires doing something in the constructor body is out.

    Having said that, I think this general idea could be implemented, but we'd have to use constructor delegation, and that would mean that we'd have to write a function to take the perfectly-forwarded arguments and compute the proper tag from them. How practical that would turn out to be, I don't know.

  13. @Boris Kolpackov: What was not clear to me in your articles was why you reject perfect forwarding. You say that that solution isn't suitable for "application developers," but you then go on to develop a runtime-based solution for a problem that is fully solvable during compilation. I'd be interested to know why you can't use perfect forwarding.

  14. @Nuno Barreiro: Thanks for the pointer to SyntaxHighlighter.

  15. Thanks for your reply (about tag dispatch). Here is an example of what I meant, it does add some lines of code, but it reduces complexity in the public interface and (might) reduce compile times. It is also, I think, a good use of the C++11 delegating constructors.

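    class widget {
        std::string name;

        // Selected when T is not an integral type.
        template <typename T>
        widget(T&& n, std::false_type)
        : name { n } { }

        // Selected when T is an integral type.
        widget(int i, std::true_type)
        : name { std::to_string(i) } { }
    public:
        // Single public constructor: delegates based on the computed tag.
        template <typename T>
        widget(T&& val)
        : widget { std::forward<T>(val), typename std::is_integral<T>::type() }
        { }
    };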

    This works with gcc and clang.

  16. @dobby156: Thanks for the example. The more I think about this approach, the more appealing I find it. With only a single perfect-forwarding function, there's no need for enable_if nonsense, though the logic that would have gone into enable_if migrates into the tag computation logic. Analogous to my comment replying to Martinho Fernandes, in your approach, it would be complicated to express (during compilation) which tag should be used where there are multiple disjoint conditions. (It'd be easy using a runtime computation, but then you'd incur an otherwise unnecessary runtime cost.)

  17. Scott, I didn't explicitly mention perfect forwarding as a candidate for the same reasons that you've described here. That is, it doesn't specify the interface, it pushes the diagnostics down into the implementation code, it is not usable with virtual functions, and it doesn't play well with overload resolution. I have written about the overload resolution issue just before starting the series so it was still fresh and "obvious" in my head. But I agree, I should have mentioned this option.

    Note also that the in<> class template I described is really an attempt to "tame" the perfect forwarding mechanism somewhat. If you look at the implementation, internally it uses the same techniques as you have shown, that is, std::enable_if to restrict the interface, etc.

  18. Well, perfect forwarding inherits the "limitations" of reusing the template machinery, and we have to give up on simple initializer list syntax for parameter passing. I should have said it differently earlier: Pass-by-value without any template would solve the problem [of passing initializer list].

    The Widget4 class above has more pitfalls w.r.t. copy construction. The single argument perfect forwarding constructor doubles as a copy-ctor because it is greedy. It is immediately evident that simple copy-construction will fail to compile.

    Fixing this problem adds a lot of verbosity. Adding just a good-old copy-ctor isn't enough. First, adding a copy-ctor turns off implicit move operations. Second, Widget4 w3(w2) still selects the greedy constructor because that's a better match. So I ended up adding yet another copy-ctor that accepts a non-const Widget4. That constructor cannot be defaulted. To regain the lost move semantics, a defaulted move-ctor and move-assign must also be added. Not quite elegant.

  19. Short reply because I'm on a phone: with pass-by-value, the errors show up at the point of the mistake. With perfect forwarding, they end up one or more layers down.

  20. @Sumant Tambe: I address the issue of universal references and copy construction in my next post, "Copying Constructors in C++11."

  21. Maybe this is relevant. Many years ago I was thinking about how to avoid creating two functions each time I wanted to accept both an rvalue and a const reference.

    It seems that a small class with a little overhead can do the trick:

    template< class T >
    class fwd
    {
        bool is_rvalue;
        const T& t_;

    public:
        constexpr fwd( T&& t )
        : is_rvalue(true), t_(t) {}

        constexpr fwd( const T& t )
        : is_rvalue(false), t_(t) {}

        // perhaps implicit conversion
        // instead
        T get()
        {
            // move from the referenced object only if it was bound as an rvalue
            if( is_rvalue )
                return std::move( const_cast<T&>(t_) );
            return t_;
        }

    };


    Use like so:

    class Foo
    {
    Foo( fwd<std::string> str );
    };

  22. @Thorsten Ottosen: Your approach is essentially the same as the one Boris Kolpackov posted here. From what I can tell, both implementations have the drawback that they forward const lvalues as non-const lvalues, though I think addition of a constructor overload taking a T&, along with the obvious associated logic, would take care of that problem. The fact that you both came up with the same idea suggests that it's a good one.

  23. This comment has been removed by a blog administrator.

  24. The tag-dispatch method is as extensible as tag-dispatch is in general, although I guess having it in the constructor's init-list can be a bit unwieldy.

    That said, the extra type_traits added to C++11 have made building more complex tags easier, with the use of (nested?) std::conditional and std::integral_constant. Again, after a point it might make more sense to write your own meta-function, at which point this approach should be weighed against the others suggested.

    It is my personal opinion that SFINAE and runtime switching would be a second choice to this unless they are quite a bit cleaner.
