Saturday, March 8, 2014

If braced initializers have no type, why is the committee so insistent on deducing one for them?

I'm one Item away from having five full draft chapters for my book on effective use of C++11 and C++14. That's one Item away from being about 75% done. I'd really like to get that Item behind me.

The prospective Item title is "Distinguish () and {} when creating objects." I've put off writing it for months, because, frankly, I think the technical situation is a mess, and the advice I have isn't as helpful as I'd like. There are a number of aspects to the mess, but part of the problem is that a braced initializer such as "{ 10, 20, 30}" has no type, yet the C++11 and C++14 standards insist on deducing one for it. But only sometimes. Not always. Not for template arguments. Not for function return types. Not for auto parameters to lambda expressions. Only for auto variables. Which is why this is the case:

auto x = { 5 };    // "{ 5 }" has no type,
                   // but x's type is std::initializer_list<int>

In this area, the standard makes no distinction between copy initialization (using "=", as in the example above) and direct initialization (without "=", as in the example on the next line), so:

auto y { 5 };      // y's type is also std::initializer_list<int>

I've never understood why there's a special type deduction rule for braced initializers in the context of an auto variable, but, hey, I'm just a grunt on the ground. My job isn't to make the rules, it's to report them. 
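Here's a minimal sketch of the asymmetry, written for this post rather than taken from the draft book. The function name deduced_for_auto_variable is invented for illustration, and the commented-out lines are the contexts where C++14 refuses to deduce anything from the braces.

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper, named only for this sketch.
inline std::size_t deduced_for_auto_variable()
{
    auto x = { 5 };   // auto variable: a type is deduced for the braces
    static_assert(std::is_same<decltype(x), std::initializer_list<int>>::value,
                  "an auto variable gets std::initializer_list<int>");
    return x.size();  // the list holds one element
}

// The other contexts get no such rule, so none of these compile in C++14:
//
//   auto make_list() { return { 5 }; }          // deduced function return type
//   auto take = [](auto v) {};  take({ 5 });    // auto parameter of a lambda
```

Calling deduced_for_auto_variable() returns 1, the number of elements in the deduced list.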

There's an Item in my draft book devoted to the interaction of auto variables and braced initializers, because that interaction confuses everybody. Even so-called experts like me and Herb Sutter have known the thrill of publishing blog posts with incorrect code, because we forgot that 

int x1 = 5;
int x2(5);
int x3{ 5 };
int x4 = { 5 };

all do the same thing, but

auto x1 = 5;       // deduced type is int
auto x2(5);        // deduced type is int
auto x3{ 5 };      // deduced type is std::initializer_list<int>
auto x4 = { 5 };   // deduced type is std::initializer_list<int>

don't.
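If you'd like the compiler to confirm this for you, something along these lines should do it. It's only a sketch: the function and variable names are made up, and I've left out the direct-braced form (auto x3{ 5 }) because what a given compiler reports for it depends on which revision of the rules it implements.

```cpp
#include <initializer_list>
#include <type_traits>

// Hypothetical function, named only for this sketch.
inline int sum_of_int_forms()
{
    int a1 = 5;
    int a2(5);
    int a3{ 5 };
    int a4 = { 5 };
    return a1 + a2 + a3 + a4;   // four ints, each holding 5
}

// Hypothetical function, named only for this sketch.
inline bool auto_forms_differ()
{
    auto b1 = 5;
    auto b2(5);
    auto b4 = { 5 };
    static_assert(std::is_same<decltype(b1), int>::value, "b1 is int");
    static_assert(std::is_same<decltype(b2), int>::value, "b2 is int");
    static_assert(std::is_same<decltype(b4), std::initializer_list<int>>::value,
                  "b4 is std::initializer_list<int>");
    (void)b1; (void)b2; (void)b4;   // silence unused-variable warnings
    return !std::is_same<decltype(b1), decltype(b4)>::value;
}
```

sum_of_int_forms() returns 20, and auto_forms_differ() returns true.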

The committee is not unaware of this issue. The abstract for N3681 says, "This paper proposes to change a brace-initialized auto to not deduce to an initializer list..." That is, it proposes getting rid of type deduction for a construct that doesn't have a type and thereby eliminating a provision of C++ that confuses everybody. (As you can probably tell, I'm in favor of this proposal.)

The successors to N3681, N3912 and N3922, propose getting rid of this behavior only for direct initialization, thus introducing, in my view, even more confusion into an area that already has plenty. Under the proposed new rules, this would be the situation:

auto x1 = 5;      // deduced type is int
auto x2(5);       // deduced type is int
auto x3{ 5 };     // deduced type is int
auto x4 = { 5 };  // deduced type is std::initializer_list<int>
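A sketch of how the proposed rules could be checked, with two assumptions stated up front: the function name is invented, and the part that tests direct initialization is guarded so that it is compiled only by a C++17 (or later) compiler, where, as far as I know, the N3922 wording is part of the standard. N3922 also makes the direct form ill-formed when the braces hold more than one element, which is shown as a comment.

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Hypothetical function, named only for this sketch.
inline std::size_t proposed_rules_demo()
{
    auto x4 = { 5 };      // copy initialization: still std::initializer_list<int>
    static_assert(std::is_same<decltype(x4), std::initializer_list<int>>::value, "");

    auto x5 = { 5, 6 };   // several elements remain fine with "="
    static_assert(std::is_same<decltype(x5), std::initializer_list<int>>::value, "");

#if __cplusplus >= 201703L
    // Assumption: this compiler implements the N3922 rule.
    auto x3{ 5 };         // direct initialization: plain int
    static_assert(std::is_same<decltype(x3), int>::value, "");
    (void)x3;
    // auto x6{ 5, 6 };   // error under N3922: the direct form takes exactly one element
#endif
    return x4.size() + x5.size();   // 1 + 2
}
```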

Can somebody please explain to me why it's so important to have a special rule for deducing a type for a braced initializer list for auto variables using copy initialization, but not important enough for auto variables using direct initialization, auto parameters to lambda expressions, or auto function return types?


Scott

17 comments:

  1. I believe that it's for the ranged-for loop.

  2. @anonymous: The ranged-for loop is defined in terms of std::begin and std::end (except for arrays, which get special treatment). It doesn't directly rely on deducing a type for braced initializers, much less deducing a type for braced initializers when initializing an auto variable. If the committee wanted to ensure that you could say "for (auto i : { x, y, z })", they could make a special rule for braced initializers in ranged-for loops. Note that in this example, i would be initialized by dereferencing the iterator that std::begin returns for "{ x, y, z }", not with "{ x, y, z }" itself.
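
    A small sketch of that loop, for anyone who wants to try it. The function name sum_braced_range is invented for illustration; the static_assert simply records the type that the loop variable gets.

    ```cpp
    #include <initializer_list>
    #include <type_traits>

    // Hypothetical function, named only for this sketch.
    inline int sum_braced_range(int x, int y, int z)
    {
        int total = 0;
        for (auto i : { x, y, z }) {
            // i is one element of the braced list, not the list itself.
            static_assert(std::is_same<decltype(i), int>::value, "i is an int");
            total += i;
        }
        return total;
    }
    ```

    sum_braced_range(1, 2, 3) returns 6.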

  3. I completely agree with your concerns, Scott.

    I can't think of a single instance where I found the `auto x = { ... }` syntax useful.

    If I actually want to instantiate an `std::initializer_list`, I find that explicitly specifying the type is easier to read and makes the intent clearer.

    I wish `{}` and `()` didn't have different meanings when using `auto`, and I wish there was something like `auto x = make_initializer_list(...);`.

  4. When the intent is to reuse an initializer list, do we prefer
    std::initializer_list<int> x4 = { 5 };
    over
    auto x4 = { 5 };
    ?

  5. Scott,

    Maybe your Item should be "Beware the moving target of brace initialisation semantics".

    It must be frustrating to be running into these kinds of issues considering you so carefully waited for the standard to settle down and some kind of consensus to arise on C++11 best practice.

    If we look back to C++98/03 as a guide, maybe it will be 2016 before all the kinks are ironed out of C++11/14!

    Anyway, it's great news that you are so far through your new book!

    Regards,

    Ben

  6. @Vipul Chawathe: If I were writing the code, I'd be inclined to go with

    std::initializer_list<int> x = { 5 };

    but the Herb-Sutter-style

    auto x = std::initializer_list<int>{ 5 };

    would be fine, too, IMO. I think that

    auto x = { 5 };

    is less clear, though the commented version would be okay, I guess:

    auto x = { 5 }; // create std::initializer_list

  7. @Ben Hanson: Now that the committee has decided to issue new standards on a schedule of roughly every three years, C++ and its effective application will be a moving target all the time, I think. My hope is that the information in my book will be sufficiently settled down that it won't change very much between now and 2017 (at which point the next major language revision will presumably make all books, including mine, more or less out of date).

  8. Assuming it didn't have this special rule, I think you could still be explicit by doing:

    auto x = {{1,2,3,4}}; // Brace init with initializer_list type

    Though I'm really not sure if that's nicer than just being explicit about the type.

  9. @mmocny: Without the special rule, the code would not compile, because the braced initializer list isn't an expression with a type, so auto wouldn't be able to deduce a type from it. This is why passing a braced initializer list to a template function fails: the braced initializer list has no type for the template to deduce.
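
    To make the template case concrete, here is a minimal sketch. The names count_any, count_list, and template_deduction_demo are invented for illustration, and the failing call is left as a comment.

    ```cpp
    #include <cstddef>
    #include <initializer_list>

    // Hypothetical templates, named only for this sketch.
    template <typename T>
    std::size_t count_any(T) { return 1; }   // T must be deduced from the argument

    template <typename T>
    std::size_t count_list(std::initializer_list<T> il) { return il.size(); }

    inline std::size_t template_deduction_demo()
    {
        // count_any({ 1, 2, 3 });       // error: the braces have no type to deduce T from
        auto il = { 1, 2, 3 };           // the auto rule supplies std::initializer_list<int>
        return count_any(il)             // fine: il is a variable with a type
             + count_list({ 1, 2, 3 });  // fine: T is deduced from the elements
    }
    ```

    template_deduction_demo() returns 4: one from count_any, plus the three elements seen by count_list.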

  10. Brace initializers have no type, yes. But the end result does, and the most relevant such results are arrays.
    `int * a` is an array. Probably. `int b[] = {}` or `auto b = {}` is also an array. For sure this time. Logically, `a` and `b` are both arrays, but they have slightly different semantics, because a different amount of information can be deduced for each.

    What may be a bit confusing is why `auto b = {5}` is not an int, when it clearly could be. My take is that it is already confusing that you have used an explicit initializer list for a single value, and the compiler does not implicitly tamper with your explicit intent.

  11. @Scott "The ranged-for loop is defined in terms of std::begin and std::end (except for arrays, which get special treatment)."

    As usual, it can't possibly be that simple:
    - arrays get the special treatment you mention.
    - Then it looks for member begin or end - if either one is found, these are used (so you'd better not provide just one).
    - If neither of these is found, it uses non-member begin/end, looked up via ADL. In C++11 (N3337), it says "For the purposes of this name lookup, namespace std is an associated namespace." For C++14 (N3936) this wording is gone, since if you think about it for a moment, it's useless, as normal std::begin/end are only useful on things in std anyway (which I suppose includes std::initializer_list).

  12. auto x = { 5 }; // "{ 5 }" has no type,
    // but x's type is std::initializer_list<int>


    But hey, you can use C++ without the standard library, right? And what will the type be then?

  13. I follow a few rules of thumb regarding initialization and these rules tend to jibe with the way C++ currently defines things:

    1. Treat contents of curly braces as lists.
    2. Treat contents of parentheses as construction arguments.
    3. Treat the equals sign as "initialization assignment."

    To apply the first rule, I use the curly brace syntax only when I'm initializing a list or an aggregate. I think of the values in the curly braces as data that I'm assigning to my new object.

    int x1 = { 5 }; // Bad. x1 is a scalar.
    int x2[] = { 5 }; // Good. x2 is a list of length 1.
    int x3[] = { 5, 4 }; // Good. x3 is a list of length 2.
    vector<int> x4 = { 5 }; // Good
    vector<int> x5 = { 5, 4 }; // Good

    Regardless of how the language treats x1's construction, I reason about it as the assignment of a single-element list to an integer. I avoid it because a direct assignment is cleaner to read.

    When you toss "auto" into the mix, there's no longer a type on the left to force that conceptual cast from a list of ints to an int:

    auto x6 = { 5 }; // int? int[]? vector?
    auto x7 = { 5, 4 }; // int doesn't make sense here.

    Intuitively, I'd like x6 and x7 to be the same type. And I'd like that type to be some sort of generic list that I can then turn around to use to initialize a more concrete list object (e.g. built-in array or std::vector).

    When I'm writing a class, I include a list constructor only if my class represents a list of data and the elements of the initializer list represent the initial values of that data. It works just like a C struct or a built-in array.

    To apply the second rule, I think of each value in the parentheses as an argument that affects the construction of the object. Maybe I'm setting the *properties* of the object rather than the value/contents of the object.

    vector<int> x1(5); // Vector of length 5.
    vector<int> x2 = { 5 }; // Vector of length 1 with the contents "5."

    complex<double> x3 = { 3,4 }; // Good. We're assigning a value.
    complex<double> x4(3,4); // Bad.

    I use the parentheses-style construction in a lot of the same cases for which I would make the constructor explicit.
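
    A quick sketch of the vector case, in case the difference isn't obvious. The function name parens_vs_braces is invented for illustration, and int is assumed as the element type.

    ```cpp
    #include <vector>

    // Hypothetical function, named only for this sketch.
    inline bool parens_vs_braces()
    {
        std::vector<int> sized(5);        // parentheses: five value-initialized elements (all 0)
        std::vector<int> listed = { 5 };  // braces: one element holding 5
        return sized.size() == 5 && sized[0] == 0
            && listed.size() == 1 && listed[0] == 5;
    }
    ```

    parens_vs_braces() returns true.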

    For the third rule, I take the '=' to mean that I'm initializing the value of my object to the value on the right. I use '=' with curly brace construction and naked construction, but I don't use it with parentheses-style construction.

    widget x1 = 5; // x1 stores a representation of 5.
    widget x2(5); // x2's construction is parameterized by 5.
    widget x3{ 5, 4 }; // Bad
    widget x4 = { 5, 4 }; // x4 stores the list.

    These rules seem to jibe well with how the standard library implements its various constructors. There are certainly some gray areas, but for the most part, I very rarely have to ask myself which syntax to use. I know this isn't how Bjarne or Herb Sutter like to do it, but it's intuitive for me.

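  14. @Nils: Even without any #include directives, the type deduced for x in

    auto x = { 5 };

    is still std::initializer_list<int>. There are a few places where the core language knows about parts of the standard library, and this is one of them. (Another example is how dynamic_cast can throw a std::bad_cast exception.) Strictly speaking, the standard still requires <initializer_list> to be included, directly or indirectly, before such a use; otherwise the program is ill-formed.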

  15. On a related note, do you happen to know whether the EWG intends to broaden the use of list-initialization beyond the right-hand side of an assignment?

    For example, I wish I could write:

    r = f(x, {y, z});

    But only this is allowed:

    i = {y, z};
    f(x, i);

    It's a pity that the complexity of writing an LR(1) parser inhibits this natural extension of list-initialization.

    (This bothered me in a specific context in the past: http://stackoverflow.com/q/11420448/1170277)

  16. @Matthias Vallentin: The syntax you're asking about is legal today. Consider:

    #include <iostream>
    #include <vector>
    #include <typeinfo>

    // Return a copy of orig with val inserted at the front.
    std::vector<int> f(int val, const std::vector<int>& orig)
    {
        auto copy = orig;
        copy.insert(copy.begin(), val);
        return copy;
    }

    int main()
    {
        int x = 10, y = 20, z = 30;
        std::vector<int> r;
        r = f(x, { y , z });    // braced initializer as a function argument
        for (auto v : r) std::cout << v << ' ';
        std::cout << '\n';
    }

    As to whether EWG is considering expanding the contexts in which braced initializers can be used, I don't know.

  17. Sorry Scott, in my attempt to abstract the issue I failed to give the correct example. I meant to describe the following: if f is a binary operator, e.g., operator<<(T, U), where U can be brace-initialized, then this fails:

    class foo { };

    struct bar
    {
        template <typename... T>
        bar(T const&...) { }
    };

    foo& operator<<(foo& f, bar const&) { return f; }

    int main()
    {
        foo baz;
        baz << {1, -2, "foo", 4, 5};    // error: braced list used as an operand
        return 0;
    }
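
    For comparison, a sketch of two spellings that I'd expect to compile with the same classes: naming the type in front of the braces, or calling the operator with ordinary function-call syntax, where a braced list is accepted as an argument. The function name braced_operand_workarounds is invented for illustration, and the constructor is assumed to be the variadic template its parameter list suggests.

    ```cpp
    class foo { };

    struct bar
    {
        template <typename... T>
        bar(T const&...) { }   // accepts any number of arguments of any types
    };

    foo& operator<<(foo& f, bar const&) { return f; }

    // Hypothetical function, named only for this sketch.
    inline bool braced_operand_workarounds()
    {
        foo baz;
        // baz << {1, -2, "foo", 4, 5};                   // error: braces as an operand
        foo& r1 = baz << bar{1, -2, "foo", 4, 5};         // name the type explicitly
        foo& r2 = operator<<(baz, {1, -2, "foo", 4, 5});  // ordinary call syntax
        return &r1 == &baz && &r2 == &baz;                // both calls hand back baz
    }
    ```

    Neither spelling is as pleasant as the one I was hoping for, but both hand back a reference to baz.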
