19. Week 10 Thursday: Pointers and C-Style Arrays
Congratulations on making it to the very end of the quarter! Today’s discussion will be focused on tying up loose ends with pointers and C-style arrays, which I haven’t really touched upon in discussion at all.
Let’s start with pointers. They’re very similar to references in how they behave and affect other variables in a program, but they’re a fundamentally different type of object.
Computers have physical memory: they’re chips full of electrons and fancy technology, and in order to access or modify a variable, the computer has to know exactly which physical location to go to. These locations are called memory addresses, and every variable has one. When a program runs the line int a = 17; it finds an integer-sized chunk of memory, and your computer remembers its location: “Okay, the variable a lives at 123 Ram Street”.
Definition 1.
A pointer is a variable that contains the memory address of another variable.
Pointers are declared like regular variables, but the variable name is preceded by an asterisk, *. For instance, in the declaration int* ptr = &a; the int* ptr should be read from right to left: the variable ptr is a pointer to (*) an integer (int). It contains the address of a, which is written as &a (the ampersand is the “address-of” operator).
Warning 2.
While we can declare multiple integers on a single line of code like int a = 7, b = 14; one must be much more careful with pointers. In a declaration such as int* p, q; the variable p is a pointer to an integer, but q is just an integer! Instead, one needs to type int *p, *q; so that both are pointers. Note that the q needs its own *!
To access the variable “pointed to” by a pointer, we use a dereferencing operator:
The assignment *p = 7; in this snippet says, “follow the address stored in the pointer p, then set the memory there equal to 7”. This overwrites the contents of a with 7, so a single number 7 is printed at the end.
The null pointer and runtime errors
Hopefully the above was all review. Let’s discuss a few ways that pointers can go wrong, and a very special pointer called the null pointer.
First of all, when you declare an integer without initialising it, nobody knows what value it will store. Likewise, when one declares a pointer without initialising it, who knows what address it will be pointing to? It may not be a valid memory address, but maybe it is, and maybe it’s even an address you’re not supposed to be touching.
Dereferencing such a pointer is undefined behaviour. Most likely, your program will crash! Hopefully your computer won’t get bricked too.
There are situations where you need to declare a pointer, but don’t necessarily have anything for it to point to. Rather than having it just point to some random address, it’s common to initialise it to the null pointer, nullptr. This is a literal pointer value and is a keyword in C++; it means, “This pointer doesn’t point to anything”. This is particularly useful because you can check if a pointer is a null pointer or not!
Attempting to dereference a null pointer is a runtime error, not a compilation error. Formally it is undefined behaviour, but in practice it almost always crashes your program.
Besides dereferencing uninitialised pointers and null pointers, there is one more way for a pointer to cause a runtime error: the dangling pointer. These are pointers that were once pointing to valid addresses of memory, but at some point in the program, stopped doing so because the program finished using said memory address.
Consider a snippet of code in which a pointer ptr is declared before an if statement, an integer a is declared inside the if statement and ptr is set to its address, and *ptr is printed after the if statement ends.
The variable a is inaccessible outside the if statement’s scope. In fact, once the if statement ends, the memory that a occupied is released for reuse: that variable can never be used again in the program, so there’s no reason to keep occupying that memory. But the pointer ptr still points to the ghostly shadow of a, and we are still accessing that address at the last line of the program!
This is undefined behaviour: the output of the program cannot be predicted because it depends on what the compiler and the rest of your program do with the memory that a used to occupy. Sometimes, this prints a just fine; other times, it prints some random garbage; other times, it crashes altogether.
Warning 3. Pointers to Vectors' Elements
Consider the following snippet of code:
vector<int> v;
v.push_back(5);
int* ptr = &(v.at(0));

for(int i = 0; i < 100000; i++) {
    v.push_back(1);
}

cout << *ptr << endl;
This is, surprisingly, undefined behaviour! This is because vectors use dynamic memory allocation: when a vector runs out of room, it allocates a bigger block of memory, moves its entries there, and frees the old block. After adding 100,000 items to the vector, it will have most likely packed its bags and moved its memory elsewhere, leaving our original pointer ptr dangling. Exercise extreme caution when making pointers to entries of a vector.
Problem 4.
Determine and classify the error(s) and/or undefined behaviour in the following code, if any. If there are no errors or undefined behaviour, predict the output of the code.
C-Style Arrays
Finally, we’ll discuss C-style arrays, which are very closely related to pointers! You can define arrays with the following syntax:
<type> <variable name>[array size] = { item1, item2, ... } ;
For instance:
int arr[5] = {1, 7, 9, 154, 4};
This declares a C-style array containing 5 integers: 1, 7, 9, 154, and 4, in that order. To access the elements of an array, we index its entries just like in a string or vector, and we use the square bracket syntax arr[index]:
cout << arr[0] << ' ' << arr[2] << endl;
The above snippet prints 1 9.
You can omit the array size if you explicitly declare the entries of the array:
int arr[] = {1, 7, 9, 154, 4};
You cannot omit the square brackets in this case. You can also just declare an array without initialising it, in which case you do have to supply the array size, e.g. int arr[5];.
C-style arrays are closely tied to pointers. When you use the variable arr from the above example in almost any expression, it is treated as a memory address: the address of its first entry, which begins a block of memory wide enough to fit 5 whole int’s in it! Its entries are stored end-to-end in your computer’s memory, one right after another. The syntax arr[2] says, “go two int’s past the memory address arr, then get the value at that address”.
To pass arrays to functions, you have several options: you can either supply it as an array, or you can supply it as a pointer.
void f(int arr[]) {
    cout << arr[0] << endl;
}

void g(int arr[5]) {
    cout << arr[4] << endl;
}

void h(int* ptr) {
    cout << ptr[2] << endl;
}

int main() {
    int arr[3] = {1, 4, 7};
    f(arr);
    g(arr);
    h(arr);

    return 0;
}
You can optionally supply the size of the array, as in the function g, but this does literally nothing and is not enforced by C++. In the above example, we passed an array of size 3 off as an array of size 5! This is doable because C++ is less intelligent than a dead pigeon and can only understand array parameters as pointers. In fact, this program has undefined behaviour, because g reads arr[4], which lies past the end of a 3-element array.
Arrays are best utilised when one has a list of data of known and unchanging size that one wants to keep track of throughout the course of a program. The easiest examples of this are in board games and video games. In chess, for instance, one will always have exactly 64 squares to keep track of through the game, so programmers use a (two-dimensional) array rather than a (two-dimensional) vector to keep track of the game state.
Warning 5.
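While the sizeof operator can be used to find the length of an array (via sizeof(arr) / sizeof(arr[0])), this only works in the scope where the array was declared. Inside a function that receives the array as a parameter, the parameter is really a pointer, so sizeof reports the size of a pointer rather than the size of the array. It’s therefore common to include a size parameter in such functions: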
It’s also common to use const global variables or macros to remember the size of certain arrays.
Problem 6.
Predict the output of the following code:
Problem 7.
You will probably have to look up how to write and use two-dimensional arrays in this problem.
Implement one of the following games in C++ without using a vector:
This last problem probably can’t be done in a single discussion, but I highly recommend sitting down and coding up one of these at least once in your life. I think it’s a great way to synthesise a lot of concepts we’ve learned in the quarter (classes, functions, arrays, input/output, etc.).