11. Week 6 Thursday: References
We introduced functions last discussion, where we described them as “smaller jobs” or “tasks” for indentured servants or contractors to do. But conceptually, there was something missing from our discussion of functions, perhaps best demonstrated allegorically.
Imagine you’re working as a secretary at a huge public university, and you need to do a bunch of data processing on the database of student records. You have a very helpful assistant to help you perform some of this data processing, but there is a dilemma. On one hand, you can give them your login to the school database and have them directly do their job. While convenient, this has some security flaws: do you really want a lowly assistant to have unfettered access to all of this sensitive information? What if your assistant screws up and irreversibly messes up decades of student records?
The alternative is to print out or digitally create a copy of the entire school’s database. This fixes the security issues we addressed above, and as a bonus, even if your assistant is a moron and screws everything up, at least you’ll have a backup. However, there are some clear drawbacks as well. Any changes made to the copy of the database must be copied over to the school’s actual database after your assistant is done. This is quite redundant! Moreover, making the copy itself might take a long time, especially if you’re low-tech and want to print it all out on paper!
Here, the mysterious data analysis performed by the assistant represents a function in C++, and the database of student records is a necessary input (i.e. a parameter/argument) that must be “passed” to the assistant. By default, C++ copies these inputs, as demonstrated by this helpful warmup:
Warmup 1.
Predict the output of the following code:
The output is 5.
When increment(n) is run,
our imaginary servant is given a copy
of the variable n in main.
Thus the n in the function and the n in main
are two distinct variables, despite sharing the same name!
The point is, the input to the function was copied by C++.
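A minimal sketch of the situation this warmup describes, assuming the program mirrors the pass-by-reference version shown next but without the ampersand. The helper `value_after_increment` is a hypothetical stand-in for the `main` function, used here so the result can be checked.

```cpp
// n is passed by value: this n is a fresh copy of the caller's variable.
void increment(int n) {
    n += 1;  // changes only the copy, which is discarded when the function ends
}

// Hypothetical stand-in for main: returns the caller's n after the call.
int value_after_increment() {
    int n = 5;
    increment(n);  // the function receives a copy of n
    return n;      // still 5
}
```

Printing `value_after_increment()` with `cout` gives 5, matching the warmup's answer.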
To instead give a function (think assistant)
“direct access” to a variable (think the database)
in the main function,
we need to instead pass by reference,
and this is indicated by an ampersand &
after the type of a parameter:
#include <iostream>
using namespace std;

void increment(int& n) {
    n += 1;
}

int main() {
    int n = 5;
    increment(n);
    cout << n << endl;

    return 0;
}
This now prints 6 —
the function increment is now granted direct access
to the n in main.
Note that the variable names need not match;
the following program still prints 6:
#include <iostream>
using namespace std;

void increment(int& n) {
    n += 1;
}

int main() {
    int number = 5;
    increment(number);
    cout << number << endl;

    return 0;
}
What’s happening under the hood
is that the box called int number in main
is given a second label int n;
the function increment refers to this box as n,
whereas the main program refers to it as number.
The int& n is called a “reference to an integer”;
it’s a label for a box that already exists.
This can be done directly in the main function as well:
This prints 7!
number and n are both names for the same box.
You should think of the line int& number = n as saying,
“attach a new label number to the box referred to as n”.
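A small sketch of such a program, assuming `n` starts at 5 and is then increased by 2 through the new label (the starting value and the amount are guesses chosen to reproduce the printed 7). The function name `relabel_demo` is hypothetical and stands in for `main`.

```cpp
// Hypothetical stand-in for main: returns the final value of n.
int relabel_demo() {
    int n = 5;
    int& number = n;  // attach a second label, number, to the box called n
    number += 2;      // modify the box through the new label
    return n;         // 7: the change made through number is visible through n
}
```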
Remark 2.
You have seen a function that receives a reference before already:
getline!
When you call getline(cin, s),
the function directly modifies the string s that it’s given,
and those changes are reflected immediately in your program.
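To make this concrete, here is a rough sketch of a function that receives a string by reference in the same way `getline` does. The name `read_line_into` is made up for illustration, and the `istream&` parameter is an assumption so that the function works with `cin` or any other input stream. The demo uses a string stream in place of keyboard input.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line from `in` directly into the caller's string.
void read_line_into(std::istream& in, std::string& s) {
    std::getline(in, s);  // getline overwrites the caller's s, not a copy
}

// Hypothetical demo: a string stream plays the role of keyboard input.
std::string first_line_demo() {
    std::istringstream fake_input("hello world\nsecond line");
    std::string s = "old contents";
    read_line_into(fake_input, s);  // s is replaced by the first line of input
    return s;
}
```

Calling `read_line_into(cin, s)` from `main` would behave like `getline(cin, s)`.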
Problem 3.
Write a C++ function called to_upper that accepts a string parameter
and converts that string into all uppercase.
For instance, the snippet
should produce the output PIC10A ROX.
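One possible solution sketch, assuming the parameter is passed by reference so that the caller's string is changed in place. It relies on `std::toupper` from `<cctype>`, which converts a single character; the cast to `unsigned char` is a precaution that function calls for when a character falls outside plain ASCII.

```cpp
#include <cctype>
#include <string>

// Converts every character of the caller's string to uppercase, in place.
void to_upper(std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        // std::toupper expects a value representable as unsigned char
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
}
```

For example, if `s` holds `pic10a rox` (an assumed input; any mix of cases works), then after `to_upper(s)` the statement `cout << s` prints `PIC10A ROX`.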
References, What Could Go Wrong?
Conceptually, there are many, many scenarios in which references can start to go sour. First and foremost, you cannot declare a reference without attaching it to an existing variable:
int& number;
We are asking C++ to make a label for a box without actually putting it on a box… this is a build error.
But beyond that, consider the following example:
Example 4.
Explain the build error in the following code:
When your computer runs a C++ program,
it sections off a little bit of RAM for the main program,
just enough space for each of the variables
that you use throughout your program.
Whenever you call a function, such as sqrt or increment,
your computer allocates another chunk of space
to hold all the variables in that function.
In addition to the main function
and any other functions that get called,
your computer also creates a little chunk of memory
to hold the program’s actual instructions!
This chunk of memory cannot be modified by the program.
This chunk of memory is used
to remember all the constants in your code,
including the 5 on line 10.
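So when we write the line increment(5),
the compiler understands that we’re asking
for direct access to something we shouldn’t ever have access to,
thereby resulting in a compilation error.
Put differently, a parameter of type int& needs an existing variable to label,
and the literal 5 is not a variable.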
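A short sketch of the usual workaround, assuming the same `increment` function as earlier in these notes: store the value in a variable first, so there is a box for the reference to label. The failing call is left as a comment because it does not compile, and `increment_five` is a hypothetical stand-in for `main`.

```cpp
void increment(int& n) {
    n += 1;
}

// Hypothetical stand-in for main: returns the variable after the call.
int increment_five() {
    // increment(5);  // build error: the literal 5 is not a variable
    int five = 5;     // give the value a box of its own
    increment(five);  // fine: n becomes a second label for five
    return five;      // 6
}
```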
Some curious minds may ask, “If parameters in a function can be references, can a function return a reference?” Indeed this is syntactically possible:
#include <iostream>
using namespace std;

int& make_reference() {
    int number = 7;
    return number;
}

int main() {
    int& reference = make_reference();
    cout << reference << endl;

    return 0;
}
This is undefined behaviour. In fact, I get a very stern warning from the compiler.
Here, the flow of permissions is reversed.
The function make_reference creates a new box number,
then gives the main function direct access to its new box.
But when the function ends,
all of its memory is cleaned up by the computer
(this happens whenever you leave a scope!),
and in particular the box number is destroyed in the process.
Thus the label reference in main
immediately becomes a label for a box that has been destroyed…
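Two safer alternatives can be sketched as follows, under the assumption that the goal is simply to get a value or a label out of a function. `make_value` returns a copy instead of a reference, and `larger` (a hypothetical example) returns a reference to one of the caller's own boxes, which still exist after the function ends.

```cpp
// Returns by value: the caller receives a copy of number before it is destroyed.
int make_value() {
    int number = 7;
    return number;
}

// Returns a label for whichever of the caller's two boxes holds the larger value.
// Both boxes live in the caller, so the returned reference stays valid.
int& larger(int& a, int& b) {
    if (a > b) {
        return a;
    }
    return b;
}
```

With `int x = 3; int y = 9;`, the statement `larger(x, y) = 0;` sets `y` to 0 and leaves `x` alone.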