Hunter Liu's Website

4. Week 2 Thursday: Increments, Decrements, and Strings


There are two more or less disjoint topics to cover today. First, we’ll discuss the increment and decrement operators, which are very commonly used “shortcuts” that have particularly subtle behaviour. Second, we’ll discuss the string class and some quirks with it.

Neither of these two topics is particularly interesting or powerful alone, but the increment and decrement operators can be used to great effect when we learn about control flow. Moreover, a lot of text processing techniques (e.g. converting a string to all caps) rely on this kind of control flow, and there are scenarios where mastery of both concepts will let you write profoundly slick (but moderately unreadable) code. While you all are principled students who would never write confusing and opaque code, it’s entirely possible that one day, you will encounter someone less scrupulous than you.

Increments and Decrements

I had planned to cover this last Thursday, but I grossly underestimated how long we would spend on administrative miscellanea in the PIC lab.

Adding one to or subtracting one from a number is a very common operation in C++. It’s used when counting the number of occurrences of an event, keeping track of how many times a loop has repeated, and in many other situations. If n is a numerical variable, the increments n++ and ++n add one to n, and the decrements n-- and --n subtract one from n.

Remark 1. Non-numeric variables can be incremented or decremented too.

The increment and decrement operators can be manually extended to non-numeric variables by programmers in a process called “operator overloading”. This is beyond the scope of this class, but the upshot is that some variables that are not numbers can nevertheless be incremented or decremented. We will see examples of this when we talk about pointers in the distant future.

Warning 2. Be wary of incrementing or decrementing `double`s.

As discussed last time, the double type in C++ has limited precision. In some cases (i.e., when a double is many orders of magnitude larger or smaller than 1), these operations will produce significant unexpected errors. Keep this in mind for when we begin doing control flow.

We’ll focus on the increment operators; everything we say below will have analogous statements for the corresponding decrement operators.

One can think of both n++ and ++n as shortcuts for writing the statement n += 1. When n++ and ++n stand alone, the behaviour is identical:

int n = 5;
n++;
cout << n << endl; // prints 6
++n;
cout << n << endl; // prints 7

However, there is one important difference: you can use n++ and ++n within a larger piece of code. This is where all the power of these operators comes in!

int n = 5;
cout << ++n << endl; // prints 6
cout << n++ << endl; // prints ...6??

The above code highlights a subtlety of the two operators. ++n says to first increase n by one, then substitute its value into the code. Likewise, the n++ operator says to first substitute the value of n into the code, then increment its value.

Be very careful about using multiple increment/decrement operators in the same expression. One might be tempted to think that (n++)++ will increment n twice, but this does not even compile. The code (++n)++ does increment n twice (it increments n once, substitutes that value in, then increments again). However, the code ++(n++) does not compile.

Similarly, expressions such as n++ + ++n, n + ++n, n * ++n, etc. produce undefined behaviour according to the C++ standard. The issue is one of the order of operations: C++ needs to substitute the values of n and ++n into the expression, but the order in which these substitutions occur both affects the resultant value and changes depending on the compiler. There are reasons why these operators behave like this, but they are well beyond the scope of PIC 10A.

That being said, if one has two variables n and m, expressions such as ++n * ++m are perfectly okay. Problems arise only when a single variable is incremented or decremented and is also used elsewhere in the same expression.

Example 3.

Predict the output of the following code:

 1#include <iostream> 
 2
 3using namespace std; 
 4
 5int main() {
 6    int n = 15; 
 7    int m = 17; 
 8
 9    cout << n++ + ++m << endl; 
10    cout << ++n + m-- << endl; 
11    cout << --m * n-- << endl; 
12    cout << m << endl; 
13    cout << n << endl; 
14
15    return 0; 
16} 
Solution

My preferred approach to these kinds of problems is to keep track of the values of n and m after each line of code. In the table below, the columns for n and m represent their values after the line has been executed.

Line Number    Output                  n     m
8              N/A                     15    17
9              (15 + 18 = ) 33         16    18
10             (17 + 18 = ) 35         17    17
11             (16 * 17 = ) 272        16    16
12             16                      16    16
13             16                      16    16

Thus, the output consists of the five numbers 33, 35, 272, 16, and 16, each on their own line.

Characters and Strings

Characters and the ASCII Table

A string variable represents a string of text, and that’s really just a bunch of char variables (i.e. characters) that have been grouped together. In order to learn how to manipulate strings, we ought to learn how to manipulate individual characters.

char literals, unlike string literals, are written using single quotation marks, as in char c = 'C';. Perhaps surprisingly, we can also initialise char variables using numbers. Consider the following code:

char c = 67;
int n = 'C';
cout << c << " " << n << endl;

What do you think the output will be? If you run this on your computer, you should get the output C 67. This illuminates two important facts:

  1. The character 'C' and the integer 67 are the same thing. More broadly, all characters are just numbers to the computer.
  2. C++ interprets the value 67 (or the character 'C') differently depending on what type of variable it’s stored in.

Naturally, we should ask how the computer knows which numbers correspond to which letter and vice-versa. About 3 billion years ago, the leading prokaryotic cells of the time congregated and designed the American Standard Code for Information Interchange, or ASCII. This is a table describing which characters correspond to which numbers, and essentially every C++ compiler you will encounter adheres to this code. You may look up an ASCII table online through Google, and you will not need to memorise any portion of the ASCII table for this class.

As with all numbers, we may add, subtract, and compare characters to each other. This is useful for a variety of reasons:

  1. The digits 0 through 9 occupy a contiguous block on the ASCII table. To see if a char variable c is a digit, we may use the code c >= '0' && c <= '9', as opposed to c == '0' || ... || c == '9'. We can do a similar trick for checking if c is a lower case letter or an upper case letter.
  2. To convert a character from lower case to upper case, we can serendipitously notice that 'a' has a value of 97 while 'A' has a value of 65; they differ by exactly 32. The other 25 letters obey the same rule! If c is a char variable that holds a lower case letter, then to convert it to upper case, one may perform c -= 32;. If the constant 32 is too hard to remember (I think it is), you can also do c = c - 'a' + 'A';. Try to squint at the table and see why this works!

Accessing Characters within Strings

A string is a variable that holds a bunch of characters in a sequence. In order to create and work with string variables, you need to include the string library at the start of your program (i.e., #include <string>). Everything related to strings is part of the std namespace. To declare a string, you can just write the string within double quotes:

string s = "Johnald MacDonald";

The entries of the string are labeled with “indices”, starting from 0:

J  o  h  n  a  l  d  _  M  a  c  D  o  n  a  l  d
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16

In order to access the character at the index i, we can use s.at(i):

string s = "Johnald MacDonald";
cout << s.at(0) << endl;    // prints J
cout << s.at(7) << endl;    // prints a space
cout << s.at(16) << endl;   // prints a d
cout << s.at(30) << endl;   // crashes the program.

This last line where we tried accessing an index past the end of a string is an example of a runtime error! Such errors are most commonly caused by mislabelling the indices of your string, e.g. if you start labelling with 1 instead of with 0.

Remark 4.

An alternative way to access a string’s contents is s[0] instead of s.at(0). However, this is “unsafe” — the .at function will intentionally crash the program whenever you enter an invalid index. In contrast, it’s possible that s[3991] ends up working for reasons we will explain in a few weeks, and it can result in a program changing parts of your computer it was never supposed to access. If you’re interested, learn about buffer overflow attacks.

Some Other String Operations

There are some other operations that are above the level of character-by-character manipulations, and here are the ways we would perform them:

There are some other operations one can perform with strings, but they’re far too numerous for me to list here. If you’re curious about how to search for a substring of a string, how to remove spaces from either end of a string, etc., you may check the C++ reference, which contains documentation of every single function available for use on strings! Some key functions to know are find, rfind, getline, push_back, and pop_back.

Common Mistake 5. Concatenating String Literals

One may be tempted to write the code

string name = "John Old " +
              "McDonald";

if one has a particularly narrow screen. However, this does not compile: when using + to join two strings, at least one of the two operands must be a string variable rather than a string literal, i.e. an explicit string of text enclosed in double quotation marks.

Some Practice With Strings

Problem 6.

Predict the output of the following code. You may look at an ASCII table.

#include <iostream>
#include <string>

using namespace std;

int main() {
    string s = "Hello world!";
    char c = s.at(6);

    cout << ++c << endl;
    cout << s.at(6)++ << endl;
    cout << c << endl;
    cout << s << endl;

    return 0;
}
Solution
The answer is x, w, x, and Hello xorld!, each on separate lines. The variable c is a distinct copy of the character at index 6 in the string s: changes to one are not reflected by the other.

Problem 7.

Predict the output of the following code. You may use an ASCII table if necessary.

 1#include <iostream> 
 2#include <string> 
 3
 4using namespace std; 
 5
 6int main() {
 7    string s1 = "Veni, vidi, vici."; 
 8    int n = 9; 
 9    ++s1.at(n++); 
10
11    string s2 = s1.substr(12); 
12    ++s1.at(--n); 
13
14    cout << s1 << endl; 
15    cout << s2 << endl; 
16
17    return 0; 
18} 

Note that this does not produce undefined behaviour: within each of lines 9 and 12, the two operators are applied to different objects (one to a character of s1, the other to the integer n)!