What I dislike about C++, part 1: References

I’ve started a new job, for those of you who didn’t know. I’m now coding C++ daily. My relationship with C++ has been distant, simply because I haven’t really ever had a need to use it. However, C and C# are both strong languages of mine, and C++ sits somewhere in the middle: C with classes, C# without garbage collection. (These are rough approximations, and not without caveats.)

There are bound to be things in every programming language that get in the way of productivity. In this post I’ll highlight one of those things in C++: references. References are obfuscating and mostly useless. But first, why would you use a reference?

A reference is effectively a pointer, but this is hidden by the language. You use it like it’s not a pointer, and the compiler turns direct accesses into indirect accesses. For example:

#include <iostream>

int main() {
	int a = 5;
	int &b = a;
	
	std::cout << "a is " << a << std::endl;
	
	b = 6;
	
	std::cout << "a is " << a << std::endl;
}

The output will be:

a is 5
a is 6

Note that I did not say *b = 6;. The reference is treated as though it were not a pointer. Cool… but what’s the benefit?

The solitary benefit I’ve heard from others is that references cannot be null. When you declare a function/method that accepts an argument typed as a reference, it’s not possible to make that reference equivalent to a null pointer without invoking undefined behavior (for example, by dereferencing a null pointer at the call site).
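As a rough sketch of that guarantee, compare a pointer parameter with a reference parameter. The Foo struct and the helpers read_via_pointer(), read_via_reference() and read_both() are hypothetical names invented for this illustration, and the -1 fallback is an arbitrary choice.

```cpp
// Hypothetical type and helpers for illustration; none appear in the post.
struct Foo {
    int value;
};

// Pointer parameter: nullptr is a legal argument, so the callee must check.
int read_via_pointer(const Foo *foo) {
    if (foo == nullptr) {
        return -1; // arbitrary "no object" result chosen for this sketch
    }
    return foo->value;
}

// Reference parameter: no check to write; the caller must supply an object.
int read_via_reference(const Foo &foo) {
    return foo.value;
}

// Calls both helpers on the same object and sums the results.
int read_both(const Foo &foo) {
    // read_via_reference(nullptr);  // rejected by the compiler
    return read_via_pointer(&foo) + read_via_reference(foo);
}
```

The pointer version accepts nullptr and has to decide what to do with it at run time; the reference version pushes that question back to the caller.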

Ok, so we trade a bit of clarity for a compile-time guarantee that we won’t be dereferencing a null pointer. That’s a good trade, right?

Maybe. Consider this excerpt:

class Foo;

void do_something() {
    Foo *foo = new Foo();
    use_foo(*foo);
    delete foo;
}

This is contrived, yes, and there’s a better way to write this code. But I’m illustrating something here. I’ve told you that Foo is a class, but I haven’t told you the prototype for use_foo(). That’s on purpose. Now, you tell me if the Foo instance is going to be copied. I’ll even tell you that Foo doesn’t overload operator*.

Do you have your answer yet? If you said yes (the logical choice), you’re wrong. If you said no, you’re also wrong. Well, actually, if you said yes or no, you might be wrong. It’s impossible to tell. If use_foo()’s declaration is use_foo(Foo foo) then a copy will be made using Foo’s copy constructor (if possible, otherwise it will be a compile-time error). But if the function’s prototype is use_foo(Foo &foo) then we are actually passing in the value stored in the foo variable: a memory address. The object will not be copied. In other words, while it looks like we are dereferencing foo, we are actually doing no such thing.
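One way to see both outcomes is to count copy-constructor calls. This is only a sketch: the Foo below is a stand-in with a hypothetical static counter, and use_foo_by_value(), use_foo_by_reference(), copies_by_value() and copies_by_reference() are names made up so that both prototypes can coexist in one file.

```cpp
// Hypothetical stand-in for Foo that counts copy-constructor calls.
struct Foo {
    static int copies;
    Foo() = default;
    Foo(const Foo &) { ++copies; }
};
int Foo::copies = 0;

// The two candidate prototypes from the text, renamed so both can exist.
void use_foo_by_value(Foo foo) { (void)foo; }
void use_foo_by_reference(Foo &foo) { (void)foo; }

// Returns how many copies one by-value call makes.
int copies_by_value() {
    Foo *foo = new Foo();
    int before = Foo::copies;
    use_foo_by_value(*foo);      // same call-site syntax as below
    int made = Foo::copies - before;
    delete foo;
    return made;
}

// Returns how many copies one by-reference call makes.
int copies_by_reference() {
    Foo *foo = new Foo();
    int before = Foo::copies;
    use_foo_by_reference(*foo);  // same call-site syntax as above
    int made = Foo::copies - before;
    delete foo;
    return made;
}
```

The two call sites are character-for-character the same apart from the function name, yet copies_by_value() reports one copy and copies_by_reference() reports none.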

In C, you can tell pretty much everything you need to know from a call site. In C++, you must know how the function you’re calling is defined too, simply because you have to know when things are references and when they are not. The treatment of what is fundamentally a pointer type as a value type (at the language level) is what causes the uncertainty. You use value-type grammar around references, even though they are not really a value type.
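The same uncertainty applies to mutation, not just copying. In the sketch below, all function names are hypothetical: bump_copy() and bump_ref() are called with identical syntax, and only the pointer version advertises at the call site that the argument may change.

```cpp
// Hypothetical helpers; the names are invented for this illustration.
void bump_copy(int n) { n += 1; }   // changes a local copy only
void bump_ref(int &n) { n += 1; }   // changes the caller's variable
void bump_ptr(int *n) { *n += 1; }  // changes the caller's variable

int after_bump_copy() {
    int x = 1;
    bump_copy(x);   // looks identical to the next call site
    return x;       // x is still 1
}

int after_bump_ref() {
    int x = 1;
    bump_ref(x);    // nothing here hints that x may change
    return x;       // x is now 2
}

int after_bump_ptr() {
    int x = 1;
    bump_ptr(&x);   // the & signals that x may be modified
    return x;       // x is now 2
}
```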

If it weren’t for the existence of references, the code above would be perfectly clear. (Well, unless you didn’t have me to tell you that Foo doesn’t implement operator*…)

Don’t get me wrong, C++ still has a lot of good features. It’s just a bit irritating that because of a bad feature (and some other features too, such as some forms of operator overloading), I must know the details of every type and function on a line of code to be absolutely sure what it’s doing.

6 Replies to “What I dislike about C++, part 1: References”

  1. I guess one argument in favour is that it lets you optimize by taking parameters by reference when possible, without the caller having to know about the optimization.

    The first version of your library might have bar(Foo f). Later you find a better algorithm that can work with an immutable Foo object so you change your library function to bar(const Foo& f). Now the code runs faster (if Foo is a big object) but the caller doesn’t have to care.

    When you get into non-const references, such that a function call may sneakily modify the object you passed it, it does get a little confusing. swap(a, b) is nice syntax but that’s not enough justification.

  2. I can understand that point. My counter-argument is that this feature causes obfuscation of every function call everywhere since one can no longer be certain when things are being passed by reference.

  3. The worst thing is that the one place where a reference would really have made sense – ‘this’ – it isn’t used and we have pointer syntax instead, even though ‘this’ cannot be assigned to and is never null. (References were added to the language after instance functions, but a new keyword ‘self’ would be great.)

  4. Hi Chris,
    Sorry for reopening a topic after nearly 6 months. But I cannot stay silent.
    I think you got it wrong. Completely.
    Although a reference might behave like “some sort” of a pointer, it is *not* a pointer. Your statement: [quote]A reference is effectively a pointer, but this is hidden by the language.[/quote] is completely wrong.
    To quote the C++ standard: [quote]A reference is an alternative name for an object.[/quote] It is just a new name for something that you’ve defined elsewhere. That’s the very reason why it cannot be null –> You cannot have an alternative name for an object that you do not have yet.

    Look at this piece of code:
    [code]
    #include <iostream>
    int main()
    {
    int a = 5;
    int &b = a;
    std::cout << std::hex << &a << std::endl << &b << std::endl;
    }
    [/code]
    This code will print the address of a and b. As you see the address is the *same*, b references the *same* object as a.

    To explain the use of it, let's slightly extend your second example above:
    [code]
    #include <iostream>

    class Foo {
    public:
    Foo() {}
    };

    void use_foo(Foo foo) // use_foo() uses the object directly
    {
    std::cout << &foo << std::endl;
    }

    void bar_foo(Foo &foo) // bar_foo() uses a *reference*
    {
    std::cout << &foo << std::endl;
    }

    void footest()
    {
    Foo *foo = new Foo(); // create the foo object
    std::cout << foo << std::endl; // print the address of it
    use_foo(*foo);
    bar_foo(*foo);
    delete foo;
    }
    [/code]
    This code will print three addresses on three lines. The first is the address of the foo object we create in footest(), the second is the address of the foo object used in the use_foo() function, the third is the address of the foo object used in the bar_foo() function. You'll see that the addresses of the first and third line are the same.

    When you use an object directly as an argument to a function *it is copied*. This is not a big issue when you only have a type of int or long or float. But it will have a huge impact if you use large objects like vector or the like. All these objects will be copied on the stack!

    This means: with the help of references you can use an object within a function without having it copied to the stack. I agree that you can do effectively the same thing with pointers but with references you avoid the null pointer problem completely.

    There is another thing to note: reference arguments can be used to return values from functions. This may be positive or negative, depending on your use of it.

    regards Willi …

  5. @Willi: I don’t see anything in your code examples that conflicts with what I’ve said.

    I would like to point out that there is nothing you can do with references that you can’t also do with pointers. The converse is not true. I find no appreciable benefit to using references except for the inability to have a null reference, and syntax (references are effectively pointers that automatically dereference).
