Hunter Liu's Website

11. Week 6 Tuesday: Functions

≪ 10. Week 5 Thursday: Vectors | Table of Contents | 12. Week 6 Thursday: Practise with Functions ≫

Let’s consider the following problem together:

Example 1.

Write a C++ program that allows the user to input an integer \(n\), followed by \(n\) string inputs (that do not include spaces). Determine what percentage of the inputs contain the letter ’e'.

Sample Run: 
INPUT   9
INPUT   The quick brown fox jumps over the lazy dog 
OUTPUT  33.33333%

We can produce some pseudocode for this problem:

  1. Take a positive integer \(n\) as user input.
  2. Initialise a counter to 0
  3. Repeat \(n\) times:
    • Take a string input from the user
    • If this input contains the letter ’e’, add one to the counter
  4. Output the counter divided by \(n\) as a percentage.

There is an issue, however, and it’s that checking if a string input contains the letter ’e’ is quite nontrivial and doesn’t have a direct translation into code. We could modify the pseudocode above to have some more detail and specificity at this step.

  1. Take a positive integer \(n\) as user input.
  2. Initialise a counter to 0
  3. Repeat \(n\) times:
    • Take a string input from the user
    • Initialise a boolean variable called has_e to false
    • For each letter in the user’s input, repeat:
      • If the letter is either ’e’ or ‘E’, set has_e to true.
    • If has_e is true, add one to the counter
  4. Output the counter divided by \(n\) as a percentage.

This is an alright solution. The pseudocode is quite lengthy, however, and the logic is significantly less clear than our first rendition of the pseudocode.

Moreover, this solution does not scale! Many programs that are written for real-world applications have significantly more complicated logic, and writing pseudocode that can be immediately and directly translated into code would result in pages upon pages of nested logic, repeating instructions, and unclear flow.

For instance, think about how you would write pseudocode for a program that takes a set of correlated data, runs a linear regression using several different regression techniques, and compares their predictions. Think about how you would write pseudocode for a game of checkers.

A better solution would be to modularise pseudocode into small, digestible units. Let’s go back to our first attempt:

  1. Take a positive integer \(n\) as user input.
  2. Initialise a counter to 0
  3. Repeat \(n\) times:
    • Take a string input from the user
    • If this input contains the letter ’e’, add one to the counter
  4. Output the counter divided by \(n\) as a percentage.

The issue is that we don’t know how to determine if a string contains the letter ’e’ or not. We may encapsulate this process in a separate bit of pseudocode: 0. Given: a string variable str

  1. Initialise a boolean variable has_e to false
  2. For each character c in str:
    • if c is equal to ’e’ or ‘E’, set has_e to true
  3. If has_e is true, str contains an e. Otherwise, it doesn’t.

This modularisation allows us to first directly implement our main program as a one-to-one translation of our original pseudocode, then supplement the main program with the “missing pieces”. These supplementary pieces are called “functions”.

Writing Our First Functions

In C++, a function can be thought of as a sub-program. The main program provides it with some inputs (possible none at all), then obtains some output or result from this subprogram that it uses later. We’ve already seen some examples of this:

We can teach C++ how to run our own custom-made subprograms, such as the has_e subprogram we want to use for the example. The syntax for this is as follows:

[return type] [function name]( [parameters] ) { 
    [C++ code] 
}

Here’s an explanation of each piece:

Most importantly, the function body must contain a return statement. These statements have the syntax return [value];, where the value can be a variable or parameter, a literal value, the output of another function or arithmetic operation, etc. This value is the end result of your function, and it must match the return type specified at the start.

Let’s implement our subfunction, which takes a single string as input and returns a boolean that indicates whether or not the provided string has the letter ’e’. Following the pseudocode for it, this is the code we would produce:

 1bool contains_e(string str) {
 2    // initialise a boolean variable to false 
 3    bool has_e = false; 
 4
 5    // check each character to see if it's an e or not. 
 6    for(int i = 0; i < str.length(); i++) {
 7        // if you found an e, you don't need to keep looking.
 8        if(str.at(i) == 'e' || str.at(i) == 'E') {
 9            has_e = true; 
10            break; 
11        }
12    }
13
14    return has_e; 
15}

We can now provide a one-to-one translation of our full program’s pseudocode as follows:

 1int main() {
 2    // take an integer n as input 
 3    int n; 
 4    cin >> n;
 5
 6    // create the counter, set to 0. 
 7    int counter = 0; 
 8
 9    // take n inputs, incrementing the counter if 
10    // the input contains an e. 
11    for(int i = 0; i < n; i++) {
12        string input;
13        cin >> input;
14        if(contains_e(input)) {
15            counter++; 
16        }
17    }
18
19    // compute the percentage of inputs containing an e,
20    // then print out the result. 
21    double percent = 100.0 * counter / n; 
22    cout << percent << "%" << endl; 
23
24    return 0; 
25}

Both of these functions belong in the .cpp file somewhere; the full file may look like

 1#include <iostream> // needed for cin, cout, endl 
 2#include <string>   // needed for strings 
 3
 4using namespace std; 
 5
 6bool contains_e(string str) {
 7    // code...
 8}
 9
10int main() {
11    // code...
12}

Our custom function goes outside the main function, but after the includes and using namespace std.

Common mistake 2. Incorrectly Ordering Function Definitions

In the above example, it is critical that the contains_e function is put before the main function and not after it. Like variables, you can only use a function after you have defined it. Otherwise, the compiler won’t be able to find the function you’re trying to use, causing a compilation error. For instance, the following (rather contrived) code does not compile:

 1#include <iostream> // needed for cin, cout, endl 
 2
 3using namespace std; 
 4
 5int main() {
 6    // prints out the first 10 square numbers 
 7    for(int i = 1; i <= 10; i++) {
 8        cout << square(i) << endl; 
 9    }
10
11    return 0; 
12}
13
14// squares an integer and returns it
15int square(int x) {
16    return x * x;
17}

Common mistake 3. Missing Return Statements

Your function almost always has to contain a return statement, with one exception coming in the following section. If your return type is an int, your program must return an integer at some point in the implementation. A common way for this to occur is when if statements come into play. For instance,

1bool is_uppercase(char c) {
2    if(c >= 'a' && c <= 'z') {
3        return false; 
4    } else if(c >= 'A' && c <= 'Z') {
5        return true; 
6    }
7}

is not a valid function, even though it does return a boolean value. The issue is that if one calls is_uppercase('#'), neither the if nor the else if block will run, and nothing gets returned.

The void Return Type

We can think of functions as little assistants that we dispatch to handle a task. In our example, we had an “assistant” help us determine whether or not a string had the letter ’e’ in it, and we used the results the assistant produced to move the program forward.

But there are situations where you want your assistant to perform some task, but don’t expect anything to get returned. Perhaps you want your assistant to print out some nicely formatted data for you, or perhaps you want your assistant to reorganise some old file cabinets. How would one provide a return type in this scenario?

The void return type exists specifically for this purpose. Functions with a void return type are no longer required to return anything; in fact, attempting to return something will cause a compilation error. The push_back function is an example of a void function: its purpose is to modify the contents of a vector, and there is no additional data or information for it to provide.

Problem 4.

Think of a situation where you would need a void function. Describe the situation in moderate detail. You do not need to write this in code.

Remark 5.

Some of the functions we have described thus far are, in some sense, “attached” to a variable. For instance, the push_back function needs to come with a vector or a string; one cannot just write push_back(5), one must write vec.push_back(5). These are called “class member functions”, since they are attached to a “class” of objects rather than free-floating. If you take PIC 10B, you will learn more about how this works, but you will not need to know this for PIC 10A.

Problem 6. Sum of Divisors

The sum of divisors of a positive integer is the sum of its positive divisors. For instance, the divisors of 6 are 1, 2, 3, and 6; their sum is 12. The sum of divisors of 6 is thus 12.

Write a C++ program that asks a user for a positive integer \(n\), then determines which integer \(0\leq j \leq n\) has the largest sum of divisors.

Sample Run 
INPUT   9
OUTPUT  8