Week 6 Tuesday: Functions

11. Week 6 Tuesday: Functions

≪ 10. Week 5 Thursday: More Practise with Loops | Table of Contents | 12. Week 6 Thursday: Function Scope and Passing by Reference ≫

So far, we’ve been putting some emphasis on writing pseudocode to help guide us through programming problems. Code is best when it’s very close to the pseudocode we write — it allows us to assess and understand the logic of our code in a way that’s similar to how we process natural language. But of course, there have been times when we had to deviate from pseudocode, as sometimes our English instructions are slightly more complicated that what the C++ language allows us to express in a few short lines of code.

Perhaps we are already familiar with this sort of dilemma. For instance, cooking shows and recipes often give instructions like “simmer for about four hours”, but don’t need to explain in detail how to turn on your stove and select the correct temperature setting. While this is hopefully a safe assumption for Julia Child to make, when we’re writing code in C++, it’s not so easy for us to make such assumptions with our pseudocode.

To give a clearer motivating example of what I mean, let’s consider the following problem:

Example 1. Divisor Sum

For a positive integer \(n\), we define the divisor sum of \(n\) to be the sum of all divisors of \(n\). For instance, the divisor sum of \(9\) is \(1+3+9=13\), and the divisor sum of \(8\) is \(1+2+4+8=15\).

Write a C++ program that accepts a single positive integer \(M\) from the user as input, then prints out the integer \(1 \leq n \leq M\) with the largest divisor sum.

SAMPLE RUN: 
INPUT   9
OUTPUT  8

One approach, and perhaps the most immediate approach, is to iterate through each value of \(n = 1,\ldots, M\), compute the divisor sum one at a time, and keep track of the largest divisor sum (and the corresponding value of \(n\)) as we go. In pseudocode, this might look like:

Accept an integer \(M\) as input from the user.
Initialise the integers largest_sum = 1 and largest_n = 1.
For each n = 1,..., M, repeat the following:
- Compute sum as the divisor sum of n.
- If sum > largest_sum, replace largest_sum with sum and replace largest_n with n.
Print out largest_n to the screen.

This is beautiful, pristine pseudocode that descended from the heavenly fathers above. It’s nearly in a one-to-one correspondence with actual C++ code: the steps that don’t fit into a single line can be written in at most two or three short and straightforward lines of code.

The exception is the step where we compute the divisor sum of n. This is not something that easily fits onto one line of code. We could write out the divisor sum in more detail as follows:

Accept an integer \(M\) as input from the user.
Initialise the integers largest_sum = 1 and largest_n = 1.
For each n = 1,..., M, repeat the following:
- Initialise the integer sum = 0.
- For each d = 1,..., n, repeat the following:
  - If n is divisible by d, add d to sum.
- If sum > largest_sum, replace largest_sum with sum and replace largest_n with n.
Print out largest_n to the screen.

This is quite a bit more opaque than the previous block of pseudocode, and one can imagine that pseudocode itself gets far less manageable when the complexity of our code gets larger. Imagine everything that goes on behind the scenes when you’re playing a video game, for instance!

This is exactly what functions are for. They allow us to teach C++ what it means to compute a divisor sum, for instance. On the pseudocode side, we would be able to describe separately how to compute the divisor sum! More broadly, functions represent complex actions or computations that need to be performed, often multiple times over the course of a program.

We’ve already been using functions! getline is a function that modifies the input stream we feed it and stores results in whatever string we feed it. Functions are useful for many, many, reasons, the most important of which are (arguably):

Well-written functions are modular and applicable in wide contexts. getline was only ever coded once, and other people like us were able to use the getline function and can understand exactly what it does, even though we don’t know how it was coded! This is particularly important when you are collaborating on a large codebase with other people.
They make code simpler, both to read and to debug. When something goes wrong, it’s often easy to identify exactly which function is not behaving the way it’s supposed to; we can then focus our efforts on that specific function and not worry about it again.

Function Syntax and the Return Statement

A function ought to be thought of as a command or a request to the computer. This request sometimes requires some input information, and it sometimes “returns” to us with some output information. We’ve seen several examples of these:

The sqrt function takes a double as an input and “returns” its squareroot to us.
The getline function takes cin and a string as an input, then reads a line of input from cin and stores it in the string.
The length function doesn’t take any inputs, and it returns the length of the string it’s “attached to”.

To write our own function, we need to use the following general syntax:

[return type] [function name]( [parameters] ) {
    [C++ code]
}

The return type is a variable type, like double or int, that describes what kind of information gets returned by the function. For instance, the sqrt function has a return type of double. Functions that don’t return anything have a return type of void.
The function name is…the name of the function. Like variable names, these should be concise, but specific enough to indicate what it does. sqrt is a great example of an informative function name!
The parameters are a comma-separated list of information that is “fed” to your function. Each parameter has a type and a name.
The function body is the C++ code that falls between the curly braces. These are the actual instructions that the computer identifies with the function.

If the function has a return type that’s not void, the function body must have a return statement in it. For instance, the main function always ends with return 0 — we’re telling the function to end itself and supply mysterious external forces with the number 0. Whatever value your function “returns”, it must match the function’s return type.

Let’s implement our divisor sum function. In pseudocode, it might look like the following: 0. Requires an integer n as input, and returns the divisor sum of n.

Initialise an integer sum = 0.
For each d = 1, ..., n, repeat the following:
- If d is a divisor of n, add d to sum.
Return sum.

Note that I’ve made explicit the inputs and outputs of the function in my pseudocode. This is not strictly necessary, but it’s good practise to clarify how functions work!

To convert this into an actual function, we need to know…

The return type is int, since the divisor sum should be an integer.
The function name is divisor_sum. You can name it something else if you despise my aesthetic choices.
The parameters are actually a single parameter: an int named n.

Converting this into code, we get:

 1int divisor_sum(int n) {
 2    // 1. initialise sum to 0 
 3    int sum = 0; 
 4
 5    // 2. for each d = 1, ..., n, repeat: 
 6    for(int d = 1; d <= n; d++) {
 7        // if n is divisible by d, add d to sum. 
 8        if(n % d == 0) {
 9            sum += d; 
10        } 
11    } 
12
13    // 3. return the sum! 
14    return sum; 
15}

Okay, we’re now ready to put together our full solution to the divisor sum problem. When adding the function to your code, you should put the whole function after all of the included libraries, but before the main function.

 1#include <iostream> // needed for cin, cout 
 2
 3using namespace std; 
 4
 5// divisor sum function: takes an integer n as input and returns
 6// the divisor sum of n. 
 7int divisor_sum(int n) {
 8    // 1. initialise sum to 0 
 9    int sum = 0; 
10
11    // 2. for each d = 1, ..., n, repeat: 
12    for(int d = 1; d <= n; d++) {
13        // if n is divisible by d, add d to sum. 
14        if(n % d == 0) {
15            sum += d; 
16        } 
17    } 
18
19    // 3. return the sum! 
20    return sum; 
21}
22
23int main() {
24    // 1. take an integer M as input from the user. 
25    int M; 
26    cin >> M; 
27
28    // 2. initialise largest_sum and largest_n to 1. 
29    int largest_sum = 1, largest_n = 1; 
30
31    // 3. for each n = 1, ..., M, repeat: 
32    for(int n = 1; n <= M; n++) {
33        // compute the divisor sum of n. 
34        int sum = divisor_sum(n); 
35
36        // compare this to largest_sum, and replace 
37        // largest_sum & largest_n if necessary. 
38        if(sum > largest_sum) {
39            largest_sum = sum; 
40            largest_n = n; 
41        } 
42    } 
43
44    // 4. output the value of largest_n 
45    cout << largest_n << endl; 
46
47    return 0; 
48}

Tada! Whenever C++ sees the line int sum = divisor_sum(n), it takes the current value of n in the for loop and supplies it to the divisor_sum function as an “input”. Once divisor_sum returns something, C++ replaces divisor_sum(n) with that return value, thereby copying it into the sum variable.

Common Mistake 2. Declaring Functions out of Order

While the placement of our divisor_sum function was somewhat innocuous, it is imperative that it comes before the main function. The following code does not compile:

 1#include <iostream> 
 2using namespace std; 
 3
 4int main() {
 5    for(int i = 0; i < 10; i++) {
 6        cout << square(i) << endl; 
 7    } 
 8
 9    return 0; 
10}
11
12// takes an integer n as an input and returns its square. 
13int square(int n) {
14    return n * n; 
15}

Like variables, functions must be declared before they are used. In this case, our first use of the square function came before C++ knew what it was!

Common Mistake 3. Missing Return Values

Your function needs to return something, no matter what happens within the function body. When there are lots of branching if statements, it’s possible for a return statement to be omitted. For instance:

1bool is_uppercase(char c) {
2    if('a' <= c && c <= 'z') {
3        return true; 
4    } else if('A' <= c && c <= 'Z') {
5        return false; 
6    } 
7}

This fails to compile because in some cases, neither branch of the if/else-if block is run, and C++ reaches the end of the program without seeing a return statement.

Here’s a few problems to practise writing functions with. While none of these problems strictly require functions to solve, it might be harder to organise your code and pseudocode if you don’t use them.

Problem 4.

Write a program that accepts an integer \(n\) from the user, followed by \(n\) words (strings without spaces). Then, print out the percentage of words that contain the letter e.

SAMPLE RUN: 
INPUT   5
INPUT   The car is not red
OUTPUT  40.0% 

INPUT   9
INPUT   The quick brown fox jumps over the lazy dog.
OUTPUT  33.3333%

Problem 5.

Write a program that accepts a sequence of integer inputs \(n_1, n_2, \ldots, n_k, 0\) terminating in a zero. For each nonzero input \(n_j\), compute the \(n_j\)-th prime number. Print out the sum of these primes.

SAMPLE RUN: 
INPUT   2 1 3 0
OUTPUT  10

(Here, the second prime is 3, the first prime is 2, and the third prime is 5. Their sum is 10.)

Hunter Liu's Website

11. Week 6 Tuesday: Functions

Function Syntax and the Return Statement