What I dislike about C++, part 1: References

I've started a new job, for those of you who didn't know. I'm now coding C++ daily. My relationship with C++ has been distant, simply because I've never really needed to use it. However, C and C# are both strong languages of mine, and C++ sits somewhere in the middle: C with classes, C# without garbage collection. (These are rough approximations, and not without caveats.)

There are bound to be things in every programming language that get in the way of productivity. In this post I'll highlight one of those things in C++: references. References are obfuscating and mostly useless. But first, why would you use a reference?

A reference is effectively a pointer, but this is hidden by the language. You use it like it's not a pointer, and the compiler turns direct accesses into indirect accesses. For example:

#include <iostream>

int main() {
    int a = 5;
    int &b = a;

    std::cout << "a is " << a << std::endl;

    b = 6;

    std::cout << "a is " << a << std::endl;
}

The output will be:

a is 5
a is 6

Note that I did not say *b = 6;. The reference is treated as though it were not a pointer. Cool... but what's the benefit?

The sole benefit I've heard from others is that references cannot be null. When you declare a function or method that takes a reference parameter, a well-formed program cannot hand it a null reference: producing one requires dereferencing a null pointer first, which is already undefined behavior.
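Here is a minimal sketch of that argument. The function names are hypothetical and exist only for illustration. The pointer version has to defend against nullptr itself, while the reference version can assume it was given an object.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pointer-taking function: callers may pass nullptr,
// so the function has to check before dereferencing.
std::size_t length_via_pointer(const std::string *s) {
    if (s == nullptr) {
        return 0;
    }
    return s->size();
}

// Hypothetical reference-taking function: a well-formed caller
// cannot pass "no object", so there is nothing to check.
std::size_t length_via_reference(const std::string &s) {
    return s.size();
}
```

A caller writes length_via_pointer(&s) or length_via_pointer(nullptr) for the first, and length_via_reference(s) for the second.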

Ok, so we trade a bit of clarity for a compile-time guarantee that we won't be dereferencing a null pointer. That's a good trade, right?

Maybe. Consider this excerpt:

class Foo;

void do_something() {
    Foo *foo = new Foo();
    use_foo(*foo);
    delete foo;
}

This is contrived, yes, and there's a better way to write this code. But I'm illustrating something here. I've told you that Foo is a class, but I haven't told you the prototype for use_foo(). That's on purpose. Now, you tell me if the Foo instance is going to be copied. I'll even tell you that Foo doesn't overload operator*.

Do you have your answer yet? If you said yes -- the logical choice -- you're wrong. If you said no, you're also wrong. Well, actually, if you said yes or no, you might be wrong. It's impossible to tell. If use_foo()'s declaration is use_foo(Foo foo) then a copy will be made using Foo's copy constructor (if possible, otherwise it will be a compile-time error). But if the function's prototype is use_foo(Foo &foo) then we are actually passing in the value stored in the foo variable -- a memory address. The object will not be copied. In other words, while it looks like we are dereferencing foo, we are actually doing no such thing.
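To make the two outcomes concrete, here is a small sketch. Everything in it is an assumption for illustration: this Foo counts how often its copy constructor runs, and the two candidate prototypes get different names (use_foo_by_value and use_foo_by_reference) only so that both can live in one file. The point is that the argument expression, *foo, is the same in both calls.

```cpp
// Hypothetical Foo that counts how many times it has been copied.
struct Foo {
    static int copies;
    Foo() = default;
    Foo(const Foo &) { ++copies; }
};
int Foo::copies = 0;

// Two hypothetical prototypes for use_foo(), named apart so both can coexist.
void use_foo_by_value(Foo foo) { (void)foo; }
void use_foo_by_reference(Foo &foo) { (void)foo; }

// Returns how many copies the by-value call made.
int copies_made_by_value() {
    Foo *foo = new Foo();
    int before = Foo::copies;
    use_foo_by_value(*foo);      // copy constructor runs here
    int made = Foo::copies - before;
    delete foo;
    return made;
}

// Returns how many copies the by-reference call made.
int copies_made_by_reference() {
    Foo *foo = new Foo();
    int before = Foo::copies;
    use_foo_by_reference(*foo);  // same argument expression, no copy
    int made = Foo::copies - before;
    delete foo;
    return made;
}
```

With these definitions, copies_made_by_value() returns 1 and copies_made_by_reference() returns 0, even though both pass *foo in exactly the same way.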

In C, you can tell pretty much everything you need to know from a call site. In C++, you must know how the function you're calling is defined too, simply because you have to know when things are references and when they are not. The treatment of what is fundamentally a pointer type as a value type (at the language level) is what causes the uncertainty. You use value-type grammar around references, even though they are not really a value type.
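A sketch of the same point with plain ints, using hypothetical helper names. The pointer version announces itself at the call site with &n, the way it would in C. The by-value and by-reference versions are called with exactly the same syntax, yet only one of them changes the caller's variable.

```cpp
// Three hypothetical ways to declare the same helper.
void bump_by_value(int n) { n += 1; }       // modifies a private copy
void bump_by_reference(int &n) { n += 1; }  // modifies the caller's variable
void bump_by_pointer(int *n) { *n += 1; }   // modifies the caller's variable

int after_bump_by_value() {
    int n = 5;
    bump_by_value(n);       // looks the same as the next call...
    return n;
}

int after_bump_by_reference() {
    int n = 5;
    bump_by_reference(n);   // ...but this one changes n
    return n;
}

int after_bump_by_pointer() {
    int n = 5;
    bump_by_pointer(&n);    // the & tells the reader n may change
    return n;
}
```

Here after_bump_by_value() returns 5, while after_bump_by_reference() and after_bump_by_pointer() both return 6.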

If it weren't for the existence of references, the code above would be perfectly clear. (Well, unless you didn't have me to tell you that Foo doesn't implement operator*...)

Don't get me wrong, C++ still has a lot of good features. It's just a bit irritating that because of a bad feature (and some other features too, such as some forms of operator overloading), I must know the details of every type and function on a line of code to be absolutely sure what it's doing.