Comments on The View from Aristeia: Breaking all the Eggs in C++

Approach with third party tools is completely wron...

2015-12-06T11:11:21.466-08:00

The approach with third-party tools is completely wrong and is likely to fail.
No "magic wand" will help if we have 10M lines of source with a requirement to do code review and to write (and physically sign!) "formal code review reports" (this is what the FDA requires of healthcare-related projects!).
The best solution for this problem is to adopt for C++ something like what "use strict" does for JavaScript.

Then it is up to the developer (!) to decide: do we update all 10,000 files of project sources with the "magic wand" (and write tons of those "formal code review reports"), or do we use the new "strict" mode only for new or refactored files?

This new "#pragma strict" or "using strict" will not be the same thing as "MISRA" or "embedded C++" or other "ugly ducklings" like "special safe coding conventions". Every big enough company (or even division within a company) has invented its own "special safe coding convention" to work around C/C++ flaws and has its own set of ugly, buggy tools to support this hell.
The new "#pragma strict" or "using strict", or whatever we call it, will be different from this hell simply because it is part of the C++ standard and every conforming compiler is forced to support it! No more reinventing the wheel, and no more trying to fit the square wheels invented by my company to the triangular wheels invented by third-party companies!

@Scott: Its fine, helped me thinking about the pro...

2015-12-02T09:18:01.649-08:00

@Scott: It's fine, it helped me think about the problem again. Thanks for your time.
The reason I want only a constant zero is simply to enforce one simple, canonical "bit check" that has a well-defined meaning whatever the type is. x == 0 is a good candidate because it's pretty simple, "built in" for plain enums, and widely known and used.

@npl: I don't have a solution for you, I'm...

2015-12-01T20:48:50.452-08:00

@npl: I don't have a solution for you, I'm sorry. Perhaps others here do. If it makes you feel any better (and it probably won't), because your current codebase only compares bitmasks with the compile-time constant 0, if you modify the equality and inequality comparison functions for bitmasks to accept an arbitrary int, the behavior of your current code won't change, because only zeros will be passed in.
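A minimal sketch of the suggestion above, assuming a hypothetical `Bits` enum (the type and the helper `hasA` are made up for illustration). The comparison operators accept any int, yet call sites that only ever pass the literal 0 behave as before:

```cpp
#include <cassert>

// Hypothetical bitmask type, for illustration only.
enum class Bits : unsigned { None = 0, A = 1u << 0, B = 1u << 1 };

constexpr Bits operator&(Bits x, Bits y) {
    return static_cast<Bits>(static_cast<unsigned>(x) & static_cast<unsigned>(y));
}

// Equality and inequality accept an arbitrary int, not only a constant 0.
// The widening cast avoids a signed/unsigned comparison.
constexpr bool operator==(Bits x, int v) {
    return static_cast<long long>(static_cast<unsigned>(x)) == v;
}
constexpr bool operator!=(Bits x, int v) { return !(x == v); }

// A call site that compares with the literal 0, as in the existing codebase.
constexpr bool hasA(Bits x) { return (x & Bits::A) != 0; }
```

The cost is that comparisons with nonzero ints now compile too, which is exactly what the constant-0 restriction was meant to prevent.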

@Scott Meyers: You know, I am talking about an ide...

2015-12-01T15:12:37.919-08:00

@Scott Meyers:
You know, I am talking about an ideal, egg-free omelette world (which sounds somewhat implausible).
There are a lot of ways I could check for "zero" (a simple template function would do), but ideally I would want to take an existing codebase and replace a plain old enum with an enum class by simply transforming the member names (given an existing naming scheme with the enum name as prefix).

The upsides are cleaning up the namespace and defining an underlying type. The last part is more important than some might think: clang and gcc seem to have different defaults for "short-enums" (on ARM at least). The enum below would be either 1 or 4 bytes, and this has bitten me already. egg:

enum EType {
    eType_Somebit = 1 << 0,
    eType_Otherbit = 1 << 1,
};

bool foo(EType e)
{
    return (e & eType_Somebit) != 0;
}

// easily transformed (search + replace mainly) to omelette:
enum class EType : unsigned {
    Somebit = 1 << 0,
    Otherbit = 1 << 1,
};
// define & | ~ &= |= == != operators for EType, a MACRO can do this

bool foo(EType e)
{
    return (e & EType::Somebit) != 0;
}
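One way the macro mentioned in the comment could look is sketched below. The macro name `DEFINE_BITMASK_OPERATORS` and the `Underlying` alias are hypothetical, and taking `std::nullptr_t` as the right-hand side of `==` and `!=` is only one possible way to accept a literal 0 and nothing else:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <typename E>
using Underlying = typename std::underlying_type<E>::type;

// Hypothetical helper macro: generates the bitmask operators for a scoped enum E.
#define DEFINE_BITMASK_OPERATORS(E)                                        \
    constexpr E operator&(E a, E b) {                                      \
        return static_cast<E>(static_cast<Underlying<E>>(a) &              \
                              static_cast<Underlying<E>>(b));              \
    }                                                                      \
    constexpr E operator|(E a, E b) {                                      \
        return static_cast<E>(static_cast<Underlying<E>>(a) |              \
                              static_cast<Underlying<E>>(b));              \
    }                                                                      \
    constexpr E operator~(E a) {                                           \
        return static_cast<E>(~static_cast<Underlying<E>>(a));             \
    }                                                                      \
    inline E& operator&=(E& a, E b) { return a = a & b; }                  \
    inline E& operator|=(E& a, E b) { return a = a | b; }                  \
    constexpr bool operator==(E a, std::nullptr_t) { return a == E(); }    \
    constexpr bool operator!=(E a, std::nullptr_t) { return !(a == E()); }

enum class EType : unsigned {
    Somebit = 1 << 0,
    Otherbit = 1 << 1,
};
DEFINE_BITMASK_OPERATORS(EType)

// The fixed underlying type pins the size regardless of compiler defaults.
static_assert(sizeof(EType) == sizeof(unsigned), "EType has the size of unsigned");

bool foo(EType e)
{
    return (e & EType::Somebit) != 0;
}
```

With these definitions the body of `foo` is unchanged apart from the `EType::` prefix, which is the search-and-replace migration the comment describes.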

BTW, I had written some lines to explain why there can't be a "standard" way to test for a constant 0 argument, but while writing I might have found one =)

@npl: Okay, I think I see what you mean. However, ...

2015-11-30T16:46:20.898-08:00

@npl: Okay, I think I see what you mean. However, I believe that [bitmask.types]/4 is simply defining terminology, not required expressions. ("The following terms apply...") As such, I think cppreference's interpretation of that clause of the Standard is incorrect.

Even if we assume that [bitmask.types]/4 requires that the expression "(X & Y)" be testable to see if it's nonzero, I don't see any requirement that the zero value be a compile-time constant. That is, I believe this would be valid:

template<typename T>
void setToZero(T& param) { param = 0; }

int variable;
setToZero(variable);

if ((X & Y) == variable) ....

As such, if you choose to define operators to test the result of X & Y against zero, I think you have to support variables with the runtime value zero, not just zero as a compile-time value. (If somebody were to do something like test X & Y against 42, results would presumably be undefined.)

If you really want to ensure that the value passed in is a compile-time zero, I suspect you can find a way to do that using either static_assert or enable_if. That is, you can still do what you want to do without relying on 0 being interpreted as the null pointer constant.
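As a rough illustration of the static_assert route, the sketch below passes the constant as a template argument so that it is checkable at compile time. The `Bits` enum and the function name `equalsConstant` are hypothetical, and the call syntax differs from the `(X & Y) == 0` form discussed in this thread:

```cpp
#include <cassert>

// Hypothetical bitmask type, for illustration only.
enum class Bits : unsigned { None = 0, A = 1u << 0, B = 1u << 1 };

constexpr Bits operator&(Bits x, Bits y) {
    return static_cast<Bits>(static_cast<unsigned>(x) & static_cast<unsigned>(y));
}

// N is a template argument, so it is always a compile-time constant.
// Any value other than 0 is rejected by the static_assert.
template <int N>
constexpr bool equalsConstant(Bits x) {
    static_assert(N == 0, "a bitmask may only be compared with the constant 0");
    return x == Bits();
}
```

Here `equalsConstant<0>(X & Y)` compiles, while `equalsConstant<42>(X & Y)` fails at compile time, and no conversion from 0 to a null pointer is involved.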

Cause the interface of a bitmasktype requires it: ...

2015-11-30T15:35:45.316-08:00

Because the interface of a bitmask type requires it: http://en.cppreference.com/w/cpp/concept/BitmaskType
It's supposed to be identical in use to, and interchangeable with, a pre-C++11 plain enum.

@Greg Marr: Thank you for the explanation. But IM...

2015-11-29T23:14:36.825-08:00

@Greg Marr: Thank you for the explanation.
But IMHO 'virtual' should also be contextual. Then allowing override in the same position would not require either of them to be a full keyword. It depends on the compiler implementation. Or at least, for convenience, it could be a 'local' (optional) keyword (hmm, that sounds like contextual, doesn't it?). So there would be no problem with backward compatibility.

@Greg Marr: I should have remembered that; I write...

2015-11-29T17:22:26.520-08:00

@Greg Marr: I should have remembered that; I write about it in Effective Modern C++, Item 12. Thanks for reminding me!

It's a contextual keyword, which means it has ...

2015-11-29T15:55:47.932-08:00

It's a contextual keyword, which means it has to go after the function declarator. To put it at the front, it would have to be a full keyword, meaning that it's a reserved word, and all programs that used it as a type or variable name would be invalid.
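A small sketch of what "contextual" means in practice (the class and function names are made up for illustration). After a member function declarator `override` has its special meaning, but elsewhere it is still an ordinary identifier:

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int value() const { return 1; }
};

struct Derived : Base {
    // After the declarator, "override" has its special meaning.
    int value() const override { return 2; }
};

// Elsewhere "override" is an ordinary identifier, so older code
// that used it as a variable name remains valid.
inline int legacy() {
    int override = 40;
    return override + 2;
}

// Calls through a Base reference reach Derived::value.
inline int viaBase() {
    Derived d;
    const Base& b = d;
    return b.value();
}
```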

@Vincent G.: I'm not familiar with the history...

2015-11-29T10:31:11.120-08:00

@Vincent G.: I'm not familiar with the history of the placement of "override" at the end of the function declaration, sorry.

@npl: From what I can tell, your operator== functi...

2015-11-29T10:29:11.621-08:00

@npl: From what I can tell, your operator== function returns whether all bits in the bitmask are 0, so I don't see why you want a binary operator to test that. Why don't you just define a function something like this?

constexpr bool noBitsAreSet(bitmask X)
{ return X == bitmask(); }

About override why not just write: class ... { ...

2015-11-26T02:24:21.647-08:00

About override why not just write:

class ... {
...
override type fun(paramlist);
};

instead of:

virtual type fun(paramlist) override;

I have found one use for nullptr and deprecation o...

2015-11-25T01:53:29.896-08:00

I have found one use for nullptr that deprecating or removing the automatic conversion from 0 to nullptr would break.

When you want to provide enums as bitmask.types (17.5.2.1.3), you have to define a few operations on them. set, clear, xor and negate are easy and even documented in the standard.
Now, taking an enum bitmask and two instances X and Y, defined as
enum class bitmask : int_type{....}; bitmask X,Y;

you would have to support 2 additional operations (noted in the standard):
(X & Y) == 0; (X & Y) != 0;

In other words, you need operator== and operator!=, which ideally ONLY TAKE CONSTANT 0. The solution I came up with was:
constexpr bool operator ==(bitmask X, const decltype(nullptr))
{ return X == bitmask(); }

Maybe this is a bit off-topic, but if nullptr is an egg that has to be broken, what would the best solution be for bitmask types? I found using ints or other types to be more troublesome, since comparing with anything but constant 0 would be undefined and might behave differently depending on the implementation and the size of the enum.

I agree it's time to do this break those eggs,...

2015-11-23T23:59:29.312-08:00

I agree it's time to break those eggs. As for the comment on Python's 2to3, two points:
1) the change is happening, but slowly, and
2) the change has been much slowed by the fact that the 2to3 utility was sloppily implemented; it couldn't even do trivial changes like print args --> print(args). We can do better.

Hi Scott, I cannot agree more that it is time to ...

2015-11-20T08:06:32.161-08:00

Hi Scott,

I cannot agree more that it is time to break some eggs. As for the general argument that it is important to be able to compile C code as C++, I think that extern "C" solves this issue. Maybe that would be a way for the standard to evolve, e.g. define an extern "C++14" when there are breaking changes. However, I can see why a compiler vendor would not want this, because it basically leads to different compilers which have to be maintained.

I think that a different wording of the standard could help pave the way to removing redundant features, and make it easier to implement tools. As an example, I looked at typedef and type aliases. The standard defines typedef and type aliases in one section and defines a translation from type alias syntax to typedef. Let's assume that type aliases should replace typedefs in the long run. In this case, I think it would be better to define type aliases as the primary construct, and then define the semantics of typedef by referring to that definition, maybe in a different section for deprecated features. That would make it very clear that type aliases are the good feature, and typedef will vanish. If it is put in a special "deprecated features section", it would help the hidden beautiful language in C++ to eventually reach the surface.
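For reference, a short sketch of the equivalence being discussed (the names `HandlerOld`, `HandlerNew`, `Ptr` and `aliasesNameSameType` are made up for illustration). Both spellings name the same type, and only the alias syntax extends to templates:

```cpp
#include <cassert>
#include <type_traits>

// The same function-pointer type, spelled both ways.
typedef void (*HandlerOld)(int);
using HandlerNew = void (*)(int);

static_assert(std::is_same<HandlerOld, HandlerNew>::value,
              "typedef and alias declaration name the same type");

// Alias templates have no direct typedef counterpart.
template <typename T>
using Ptr = T*;

static_assert(std::is_same<Ptr<int>, int*>::value, "Ptr<int> is int*");

// Runtime-visible form of the first check.
constexpr bool aliasesNameSameType() {
    return std::is_same<HandlerOld, HandlerNew>::value;
}
```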

Another example would be uniform initialization. The standard defines all forms of initialization in one section and, e.g., defines the semantics of T a{b} and T a(b) in the same paragraph. I think it would be much more explicit if uniform initialization were defined as the preferred form, with the alternative forms defined in separate sections, possibly deprecated.
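The two forms are not always interchangeable, which is part of why their relationship matters. A well-known case, sketched here with hypothetical function names, is std::vector, where braces select the initializer-list constructor:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses: the (count, value) constructor, three elements each equal to 7.
inline std::size_t sizeWithParens() {
    std::vector<int> v(3, 7);
    return v.size();
}

// Braces: the initializer-list constructor, two elements, 3 and 7.
inline std::size_t sizeWithBraces() {
    std::vector<int> v{3, 7};
    return v.size();
}

// Braces also reject narrowing conversions:
//   int a(2.5);   // compiles, a == 2
//   int b{2.5};   // ill-formed: narrowing from double to int
```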

Cheers,
Jens

Hi Scott, i think you are rigth and it will be an ...

2015-11-19T14:41:50.489-08:00

Hi Scott, I think you are right and it would be an improvement, but I think it shouldn't be the language that enforces it. Instead, leave C++ as it is and make the IDEs and compilers do that: null pointer = error instead of a warning (if the IDE reports it at all).

This way you don't break any legacy code and you get better use of the tools. You should probably raise it in the Visual Studio community and make a proposal for it (the VS guys are making great improvements and listen to the community), or with CLion. It would make their product better and set it apart, so more people would use it and there would be fewer eggs in the basket.

P.S. I am ready for your next book ;-)

Note that /Wall is extremely noisy, as it turns on...

2015-11-17T18:00:59.521-08:00

Note that /Wall is extremely noisy, as it turns on all the warnings. It is roughly equivalent to Clang's -Weverything (GCC has no such option).

@Rein Halbersma: Thanks for the test results with ...

2015-11-17T14:05:28.589-08:00

@Rein Halbersma: Thanks for the test results with Clang and gcc. The current MSVC issues no warnings with /W4, but with /Wall (an option I didn't know about until just now), it warns about mf1-mf3. I'll make an errata list entry to update the book, because it now seems to be the case that both Clang and MSVC issue warnings for mf1-mf3 with full warnings enabled, though gcc remains silent.

@Scott Re: the compiler warnings on your Item 12. ...

2015-11-17T12:38:36.108-08:00

@Scott Re: the compiler warnings on your Item 12. I tested this with both clang 3.4 through 3.8SVN and gcc 6.0SVN. clang does warn with -Wall on mf1 through mf3 (even providing the reason for mf1 and mf2). gcc is silent even with -Wsuggest-override. I filed a bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68391
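For readers without the book at hand, here is a sketch of the kind of near-miss overrides under discussion. It is reconstructed from memory of Item 12, so the exact declarations in the book may differ, and the return values and `viaBase...` helpers are added purely for illustration. Each Derived function compiles but overrides nothing, which is what the warnings are meant to catch:

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int mf1() const { return 1; }
    virtual int mf2(int) { return 2; }
    virtual int mf3() & { return 3; }
};

struct Derived : Base {
    virtual int mf1() { return 10; }          // not const: a new function
    virtual int mf2(unsigned) { return 20; }  // different parameter type
    virtual int mf3() && { return 30; }       // different reference qualifier
};

// Calls through a Base reference still reach the Base versions.
inline int viaBaseMf1() { Derived d; Base& b = d; return b.mf1(); }
inline int viaBaseMf2() { Derived d; Base& b = d; return b.mf2(0); }
inline int viaBaseMf3() { Derived d; Base& b = d; return b.mf3(); }
```

Adding `override` to any of the three Derived declarations turns the silent mismatch into a compile error.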

Great idea! Except, there are already enough langu...

2015-11-16T22:34:49.747-08:00

Great idea! Except, there are already enough languages which are perfect candidates for this. So just use one of them instead of making a new one!

I'm inclined to agree with Nevin on default in...

2015-11-16T18:24:54.163-08:00

I'm inclined to agree with Nevin on default initialization not being 0, at least with regards to simple stack types like int.

I don't want to look at a declaration and assume that the author intended to initialise to zero, I'd rather explicitly know.

I also often find bugs thanks to the fact that the values are random each time rather than zero, as the bug then often eventually results in a crash; whereas with 0, if the bug doesn't result in a crash immediately, it results in an ongoing but subtle error: the value is consistent and often goes unseen.

One could argue it the other way, though: that one would rather auto-initialize to 0 so that, if that wasn't intended, one can better predict what will happen or has happened to the data/system.

If one prefers that latter view, then one could say that default initialization to zero is done, but the user/compiler should still not let default initialization to zero affect its compile-time analysis of errors. But it will likely affect runtime.

But overall, so far, I'm not in favour of automatic initialization, at least of simple types like int. I want the compiler and the user to be able to assume that anything not explicitly initialized is a potential source of error.

I might be more inclined to want PODs default-initialized to zero if there is no default constructor, as that is a pretty common use case (see the Windows SDK), and when debugging, if you see a POD that is all zero, it's easier to see that it hasn't been further initialized with something extra when it probably should have been, which helps show that something is wrong. But as interesting as this idea is, I'm just as inclined to think that what we have already is still workable and that this is not the most pressing problem. I hope good things will come out of it all being discussed, though, so thanks for raising the issues.

@Nevin ":-)": I'm proposing changing...

2015-11-16T16:46:16.072-08:00

@Nevin ":-)": I'm proposing changing C++ only if there is an essentially foolproof way to migrate legacy code with no change in semantics. If you have code where you want the current behavior, migrate it using the migration tool (during the decade or more where the practicality of the change is being considered), and nothing will behave differently. If you choose not to use the migration tool and the language is changed, then the semantics of your program may change.

You're essentially arguing that C++ should retain the current rule whereby some memory is implicitly uninitialized on some platforms some of the time so that the subset of users who use sanitizers can have those sanitizers diagnose problems arising from reads of uninitialized memory. That doesn't make a lot of sense to me.

My guess is that if all memory holding built-ins were zero-initialized by default, sanitizer implementers would find a way to identify zero-initialization writes and offer an option to disregard them when looking for reads of uninitialized memory. If that were to happen, sanitizers could offer the same functionality they offer now.

You are basically allowing more things to legally ...

2015-11-16T16:22:48.761-08:00

You are basically allowing more things to legally compile and run, so there is less checking that compilers and sanitizers can do.

Example #1: the following does not compile when warnings are enabled and treated as errors:

int i;
std::cout << i << std::endl;

If the behavior were changed so that i was initialized to 0, this code would have to legally compile.

Example #2: this code compiles but is caught by the sanitizer:

void foo(int& ii) { std::cout << ii << std::endl; }

int i;
foo(i);

With your proposed change, sanitizers could no longer diagnose such code, as it would be perfectly legal.

So, why make the change?

Reason #1: this bug is common in the wild. Unfortunately, if we make this change, detecting such a bug becomes harder, not easier, as we can no longer use tools like sanitizers to find it, because this previously illegal code would now be legal. It seems like a big presumption to assume that all the uninitialized values were meant to be 0.

Reason #2: this is a common mistake for beginners. Beginners (as well as experts and everyone in between) ought to be using sanitizers.

On the whole, this kind of change seems to mask bugs instead of preventing bugs. What am I missing?

Backwards compatibility is indeed necessary and de...

2015-11-16T16:02:47.179-08:00

Backwards compatibility is indeed necessary and desired, but C++ has grown too big, and for new code there are indeed lots of bits that one should be able to enable only selectively. Pretty much everything in Effective C++ (or a more modern equivalent) that can be automatically checked should be checked. And it should be done by default. It's far too easy to write incorrect code (declare a unique pointer, move it, access it and watch it dump core; this should have been checked statically). I should not have to read three books of C++ gotchas to get some basic code written.
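The unique pointer case mentioned above looks roughly like this (a sketch with a hypothetical function name). The standard guarantees that a moved-from std::unique_ptr is null, so the failure is predictable, but the compiler is not required to diagnose a later dereference:

```cpp
#include <cassert>
#include <memory>
#include <utility>

inline bool movedFromIsNull() {
    std::unique_ptr<int> p(new int(42));
    std::unique_ptr<int> q = std::move(p);  // ownership moves to q

    // p is now null; writing *p here would be undefined behavior,
    // yet it would compile without any required diagnostic.
    return p == nullptr && q != nullptr && *q == 42;
}
```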

The language doesn't need to change, but the compilers should help you pick a more sensible language subset that is suited for your task. Libraries would need to change to allow this. If you then want to go off and enable a language feature for your specific use case, then go ahead - in a scoped manner.

// disables things that make it easy for you to shoot yourself in the foot.
lang mode strict;

// modern code, sensible language subset

lang enable c_varargs;
// some code that interfaces C
lang disable c_varargs;

As an aside, it's also INSANE that -Wall on gcc doesn't really mean everything. This is again for backwards compatibility.
No! If I mean everything, I really mean it. If I'm upgrading the compiler and I want yesterday's -Wall, then it should be versioned: -Wall-gcc49.
Otherwise what's the point of all those compiler devs spending their time and effort trying to make my life easier if it's so hard to access their efforts?

At the same time, as someone who spends quite a lot of time doing C99, it's also incredible that C++ is not a superset of C. I know this is not completely possible, but a number of things are different when there clearly is no need for it.

@Nevin ":-)": In this post, my interest ...

2015-11-16T15:45:02.647-08:00

@Nevin ":-)": In this post, my interest is in a way to change the language in a way that preserves full backward compatibility, so any context in which sanitizers are currently useful would remain a context where sanitizers would be useful on programs that had been transformed. Note that due to my focus on maintaining strict backward compatibility in this post, nothing I'm proposing would cause existing code that currently has an uninitialized value to become implicitly zero-initialized.

My sense is that you'd like to get rid of implicit zero initialization entirely, and I believe that that, too, is something amenable to "magic wand" legacy program transformation: have a Clang-based tool replace all code that currently gets implicitly zero-initialized with the corresponding syntax for explicit zero initialization.
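To make that transformation concrete, here is a sketch of what such a rewrite might produce for objects with static storage duration, which are zero-initialized implicitly today. All names are made up for illustration, and no actual tool output is shown:

```cpp
#include <cassert>

struct Point { int x; int y; };

// Before: static storage duration, so these are zero-initialized implicitly.
static int implicitCounter;
static Point implicitOrigin;

// After: the same values, with the zero initialization spelled out.
static int explicitCounter = 0;
static Point explicitOrigin = {0, 0};

// The rewrite preserves semantics: both versions hold the same values.
inline bool sameValues() {
    return implicitCounter == explicitCounter &&
           implicitOrigin.x == explicitOrigin.x &&
           implicitOrigin.y == explicitOrigin.y;
}
```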