17. Week 10 Tuesday: Pointers!
≪ 16. Week 9 Thursday: Tic-Tac-Toe | Table of Contents | 18. Week 10 Thursday: const and Pointers; C-Style Arrays ≫Suppose you are designing a struct that represents a human being. After populating it with a bunch of member variables, you decide that you want to keep track of an individual human’s spouse. So you write the following code:
Conceptually, this code is extremely suspicious.
It says to C++, “A Human is some data, plus another Human.
That Human then contains more data and another Human.
That Human then contains…”
In other words, there is an infinite
matryoshka
of nested Humans created
the moment one tries to declare a single Human.
In fact, the compiler recognises this and gives a compilation error.
A more reasonable attempt would be
to make the member variable spouse a referece to a Human instead:
No more infinite matryoshkas.
But now there’s a problem:
when a Human variable is declared,
the spouse member variable must be initialised…
as a reference to another Human.
So how do we define the first Human?
Moreover, what happens if a Human is single?
What if they want a divorce or want to marry someone else?
What if they are polyamorous?
None of these very realistic scenarios can be modeled by a reference.
Remark 1.
In fact, the compiler still recognises this scenario as conceptually impossible and will throw a build error.
The fix is to use a pointer
rather than a reference for the spouse member variable!
(You can also do vector<Human> spouses;,
but this secretly uses pointers under the hood.)
This was a brief example to illustrate the necessity of the concept of a pointer. We’ll spend the remainder of this class diving into the mechanics of how pointers work.
Revisiting the Box Model
Let’s consider the very simple program
This says to C++,
“create an integer box named i and set it equal to 7.
Then add the label r to that box.”
There’s a bit more that the compiler keeps track of under the hood.
When it creates the box i,
your operating system says to the program,
“Okay, I gave you an integer-sized box
at the address 123 Ram Street.”
These addresses are actually huge numbers such as 0x7fff0e28eab4
which are way too cumbersome for humans to handle.
Thence, every time you refer to the box i,
C++ implicitly understands that you’re referring
to the box at 123 Ram Street,
or whatever address it received from the operating system.
Likewise, the variable name is “bound”
to the address 123 Ram Street
when we say int& r = i.
This is not so different from what we’ve been doing
with the box model this whole quarter,
we’re just giving an extra name to the locations of these boxes.
Definition 2.
A pointer is a box whose contents are memory addresses.
Let’s consider the code
int* p says we’re creating a new box called p
whose contents are the address of an integer rather than an int.
The &i means “get the address of i”.
Continuing as before,
the box p may be located at 456 Hard Drive,
and its contents would be 123 Ram Street.
So the code
1cout << p << endl;
actually prints out the address 123 Ram Street.
(I’m lying, it prints out some garbage like 0x7fff0e28eab4.)
To actually make use of the address, we use the dereferencing operator:
1cout << *p << endl;
This prints out 7,
since the contents of the box at 123 Ram Street are 7.
We can also modify *p like its a variable:
With addresses as before,
the line *p = 15 says to go to 123 Ram Street,
then place the number 15 in the box that’s there.
Exercise 3.
Predict the output of the following code:
Problem 4.
Predict the output of the following code:
All The Things That Go Wrong
Let’s now consider some of the horrible things that can happen when we start using pointers.
The first problem we’ve already seen before with references. Consider the following code:
1int* p;
2
3cout << "Enter an integer: " << endl;
4int i; cin >> i;
5
6if(i < 0) {
7 int a = -1;
8 p = &a;
9} else {
10 int b = 1;
11 p = &b;
12}
13
14cout << *p << endl;
Let’s trace through the program when the user enters 5.
- An uninitialised pointer
pis created. - The user enters
5, which is stored ini. - The
ifblock is skipped, and in theelseblock…- An integer
bis created at123 Ram Street, initialised to the value1. - The address
123 Ram Streetis stored in the pointerp. - The
elseblock ends, so the boxbis destroyed.
- An integer
- The contents of the box at address
123 Ram Streetare printed.
But there’s no longer a box at 123 Ram Street
when the last step is run!
This causes undefined behaviour.
In much the same way, trying to use an unitialised pointer is undefined behaviour. Just like how unitialised integers have unpredictable garbage, uninitialised pointers contain a practically random memory address, and trying to access that address can cause a crash or unpredictable output.
Let’s revisit the motivating example:
what do we do if our Human doesn’t have a spouse?
Pointers, unlike references,
allow us to make a Human* before the spouse exists.
But how do we say “a pointer to a nonexistent human”?
There is a special memory address that represents “nowhere”,
called the nullptr.
This is undefined behaviour!
Usually nullptr is used in exactly this situation —
it represents an address that doesn’t exist —
and you therefore can’t print out the box at nowhere.
Always check for a nullptr if you can!