Hunter Liu's Website

6. Week 3 Tuesday: Handling Input with Strings and cin


Throughout the first two weeks, we’ve occasionally used the cin variable to extract user input and place it into a variable as a way to make our programs slightly more responsive. But you may have encountered some scenarios in which cin behaves kind of weird, and other scenarios in which cin doesn’t quite do what you want.

Here’s an example: suppose you’re writing a program that accepts the user’s name and also the name of their father, then says something nice about the user. You want your program to behave something like this:

Please enter your full name: 
George Fish
Please enter your father's first name: 
John
Hello George Fish, esteemed son of John.

The second and fourth lines are user inputs. Here’s a perfectly reasonable implementation of this:

#include <iostream>     // needed for cin, cout
#include <string>       // needed for strings

using namespace std;

int main() {
    cout << "Please enter your full name: " << endl;
    string user;
    cin >> user;

    cout << "Please enter your father's first name: " << endl;
    string father;
    cin >> father;

    cout << "Hello " << user << ", esteemed son of "
         << father << "." << endl;
    return 0;
}

But this doesn’t work; running this program yields the output

Please enter your full name: 
George Fish
Please enter your father's first name: 
Hello George, esteemed son of Fish.

You don’t even get a chance to enter your father’s first name; C++ seemingly skips over the second input command. If instead the program explicitly reads a first and a last name, this works better:

"Fixed" Code
#include <iostream>     // needed for cin, cout
#include <string>       // needed for strings

using namespace std;

int main() {
    cout << "Please enter your full name: " << endl;
    string first, last;
    cin >> first >> last;

    cout << "Please enter your father's first name: " << endl;
    string father;
    cin >> father;

    cout << "Hello " << first << " " << last
         << ", esteemed son of " << father << "." << endl;
    return 0;
}

But this time, if someone has a middle name, again the code stops working well:

Please enter your full name: 
George Ichthys Fish
Please enter your father's first name: 
Hello George Ichthys, esteemed son of Fish.

These are all consequences of how cin cuts up the input into variable-sized chunks.

The cin Input Buffer

You should think of your program as being one “node” on a digitised assembly line, with a bunch of conveyor belts supplying information to and from your program. These conveyor belts may lead to files on your computer or even other programs. They behave like the conveyor belts at checkout lines in supermarkets and such: data can be loaded onto the belts freely, but the belt only moves as quickly as the program on the receiving end can process it.

cin and cout represent two special conveyor belts in your computer. Whenever your program prints something to cout, that information is loaded onto a belt that leads to your screen, where that data is printed out one character at a time. The cin conveyor belt flows the other way: this time, the user loads the belt with text input when the program is running, and the program waits for information to arrive on the belt for processing. This belt is called the “input buffer”.

Remark 1. Input and Output Streams

What I am calling “conveyor belts” are actually known as “streams” in C++. Streams that provide information to the program, such as the input buffer, are called “input streams” and have the type istream. Streams that lead data out of the program are called “output streams” and have the type ostream.

We will only work with cin and cout in this class, but you may encounter other streams if you need to handle data in your files, from the internet, or even from other string variables.

When cin needs to put something into a variable, it performs the following steps:

  1. Discard any whitespace (spaces, tabs, and newlines) at the front of the input buffer.
  2. Read as many “valid characters” as possible. Whitespace will always be considered an invalid character.
    • For integers, “valid characters” are just digits, and possibly a + or - at the front.
    • For doubles, “valid characters” are digits, a single decimal point, a leading + or -, and even e in scientific notation (like -1.47e16).
    • For characters, “valid characters” means a single nonspace character.
    • For strings, “valid characters” is a contiguous chunk of any nonspace characters.
  3. When an invalid character is seen on the belt, leave it on the belt and stop.

This explains more or less what’s going on in the example from the beginning of this discussion: cin only reads one chunk of characters at a time, and if someone’s name is chunkier than we expect in the program, our output and the input buffer become desynchronised.

Additionally, if you try to read an integer or a double, but there’s no way to interpret the user input as such, the cin stream will get really depressed and deem itself a failure. This is useful for validating user input; more on this later.

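There are some ways to monitor and adjust the contents of the cin input stream, and they’re summarised on the C++ reference for the istream library. Two functions to pay attention to are clear, which resets the stream after a failure, and ignore, which throws away characters from the input buffer.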

Usually, clear and ignore are used together to reset cin after encountering bad user input. Again, we’ll say more about this in the future when we get to if statements and control flow.

In any case, we can fix the code we presented earlier by making use of the getline function, which, as its name suggests, reads a full line of input into a variable, regardless of how many “chunks” it has:

#include <iostream> // needed for cin, cout
#include <string>   // needed for strings, getline

using namespace std;

int main() {
    cout << "What is your full name?" << endl;
    string user_name;

    // reads a full line of input into the string user_name
    getline(cin, user_name);

    cout << "What is your father's first name?" << endl;
    string father; cin >> father;

    cout << "Hello " << user_name
         << ", esteemed son of " << father << "!" << endl;

    return 0;
}

It doesn’t matter how many middle names you have; this will always work.

However, something fishy happens when we reverse the order that these questions are asked:

 1#include <iostream> // needed for cin, cout 
 2#include <string>   // needed for strings, getline 
 3
 4using namespace std; 
 5
 6int main() {
 7    cout << "What is your father's first name?" << endl; 
 8    string father; cin >> father; 
 9
10    cout << "What is your full name?" << endl; 
11    string user_name; 
12
13    // reads a full line of input into the string user_name 
14    getline(cin, user_name); 
15
16    cout << "Hello " << user_name 
17         << ", esteemed son of " << father << "!" << endl; 
18
19    return 0;
20} 

First of all, if you enter more than one name for the father’s first name, this of course doesn’t work as anticipated. But more interestingly, even if you just enter one name for the father, the user’s name is left blank and the getline appears to be skipped!

This is because getline uses different rules than cin:

  1. Read characters from cin into the provided string variable (including spaces) until a newline character \n is reached.
  2. Delete, but do not store, the newline character.

The cin command on line 8 reads the father’s first name from the input buffer and stops when it sees the newline character. However, cin leaves that newline in the input buffer, and it is the very first thing the getline function sees a few lines later! getline stops immediately, so nothing is read into the user_name variable.

Because of these conflicting behaviours, it’s generally a good idea to stick to using only cin or only getline. However, sometimes the benefits of using cin are too appealing to resist. To fix this discrepancy, one can use the cin.ignore function, which takes up to two pieces of data: a number and a character. cin.ignore([ n ]) will delete the first n characters in the input buffer unconditionally. cin.ignore([ n ], [ ch ]) will delete up to n characters, but stops early if it sees the character ch, which it deletes too. For instance, cin.ignore(10000, '\n'); will delete up to 10,000 characters, stopping once it has deleted a newline character. We can add this line after the cin command and before the getline command to fix our program:

#include <iostream> // needed for cin, cout
#include <string>   // needed for strings, getline

using namespace std;

int main() {
    cout << "What is your father's first name?" << endl;
    string father; cin >> father;

    // skip the rest of the line!
    cin.ignore(10000, '\n');

    cout << "What is your full name?" << endl;
    string user_name;

    // reads a full line of input into the string user_name
    getline(cin, user_name);

    cout << "Hello " << user_name
         << ", esteemed son of " << father << "!" << endl;

    return 0;
}

Remark 2.

You may be wondering, “What if the user entered more than 10,000 characters before hitting enter?” Generally, there’s always a way for a pathological user input to ruin your life. However, for the ignore function, there is a special value, numeric_limits<streamsize>::max(), provided by the limits library (i.e., you have to #include <limits>) that makes cin ignore as many characters as it needs to before reaching the desired character. You can find more details in the C++ documentation.

Practise 3.

Consider the following C++ program:

#include <iostream>
#include <string>

using namespace std;

int main() {
    int i, j;
    string s1, s2;

    cin >> i >> j >> s1;
    getline(cin, s2);

    s1[i]++;
    char ch = s1[j];
    ++ch;
    double d = i / j;

    cout << d << endl;
    cout << s1 << endl;
    cout << s2 << endl;
    return 0;
}

Suppose the user enters the following inputs:

2 5
PIC10A
ROOLS

Predict the output of the code, and determine the contents of the input buffer at the end of each line of code that processes user input.