5. Week 3 Tuesday: Handling Input with Strings and cin
Throughout the first two weeks, we’ve occasionally used the cin variable to extract user input and place it into a variable as a way to make our programs slightly more responsive. But you may have encountered some scenarios in which cin behaves kind of weird, and other scenarios in which cin doesn’t quite do what you want.
Here’s an example: suppose you’re writing a program that accepts the user’s name and also the name of their father, then says something nice about the user. You want your program to behave something like this:
Please enter your full name:
George Fish
Please enter your father's first name:
John
Hello George Fish, esteemed son of John.
The second and fourth lines are user inputs. Here’s a perfectly reasonable implementation of this:
#include <iostream> // needed for cin, cout
#include <string>   // needed for strings

using namespace std;

int main() {
    cout << "Please enter your full name: " << endl;
    string user;
    cin >> user;

    cout << "Please enter your father's first name: " << endl;
    string father;
    cin >> father;

    cout << "Hello " << user << ", esteemed son of "
         << father << "." << endl;
    return 0;
}
But this doesn’t work; running this program yields the output
Please enter your full name:
George Fish
Please enter your father's first name:
Hello George, esteemed son of Fish.
You don’t even get a chance to enter your father’s first name; C++ seemingly skips over the second input command. If instead you explicitly input a first and last name, this works better:
"Fixed" Code
#include <iostream> // needed for cin, cout
#include <string>   // needed for strings

using namespace std;

int main() {
    cout << "Please enter your full name: " << endl;
    string first, last;
    cin >> first >> last;

    cout << "Please enter your father's first name: " << endl;
    string father;
    cin >> father;

    cout << "Hello " << first << " " << last
         << ", esteemed son of " << father << "." << endl;
    return 0;
}
But if someone has a middle name, the code breaks again:
Please enter your full name:
George Ichthys Fish
Please enter your father's first name:
Hello George Ichthys, esteemed son of Fish.
These are all consequences of how cin cuts up the input into variable-sized chunks.
The cin Input Buffer
You should think of your program as being one “node” on a digitised assembly line, with a bunch of conveyor belts carrying information to and from your program. These conveyor belts may lead to files on your computer or even other programs. They behave like the conveyor belts at checkout lines in supermarkets and such: data can be loaded onto the belts freely, but the belt only moves as quickly as the program on the receiving end can process it.
cin and cout represent two special conveyor belts in your computer. Whenever your program prints something to cout, that information is loaded onto a belt that leads to your screen, where that data is printed out one character at a time. The cin conveyor belt flows the other way: this time, the user loads the belt with text input when the program is running, and the program waits for information to arrive on the belt for processing. This belt is called the “input buffer”.
Remark 1. Input and Output Streams
What I am calling “conveyor belts” are actually known as “streams” in C++. Streams that provide information to the program, such as the input buffer, are called “input streams” and have the type istream. Streams that lead data out of the program are called “output streams” and have the type ostream.
We will only work with cin and cout in this class, but you may encounter other streams if you need to handle data in your files, from the internet, or even from other string variables.
When cin needs to put something into a variable, it performs the following steps:
- Discard any whitespace (spaces, tabs, and newlines) at the front of the input buffer.
- Read as many “valid characters” as possible. Whitespace will always be considered an invalid character.
  - For integers, “valid characters” are just digits, and possibly a + or - at the front.
  - For doubles, “valid characters” are digits, a single decimal point, a leading + or -, and even e in scientific notation (like -1.47e16).
  - For characters, “valid characters” means a single non-whitespace character.
  - For strings, “valid characters” means a contiguous chunk of any non-whitespace characters.
- When an invalid character is seen on the belt, leave it on the belt and stop.
There are certainly scenarios in which you’d want to save an entire line of input from the user, with an unknown number of spaces, and that can be done using a function called getline. However, this behaves subtly differently from cin, and we’ll hopefully talk more about this on Thursday.
This explains more or less what’s going on in the example from the beginning of this discussion: cin only reads one chunk of characters at a time, and if someone’s name is chunkier than we expect in the program, this causes our output and the input buffer to become desynchronised.
Additionally, if you try to read an integer or a double, but there’s no way to interpret the user input as such, the cin stream will get really depressed and deem itself a failure: it enters a failed state, and further reads do nothing until that state is reset. This is useful for validating user input; more on this later.
There are some ways to monitor and adjust the contents of the cin input stream, and they’re summarised on the C++ reference for the istream library. Some functions to pay attention to:
- peek, which determines the next character on the conveyor belt without removing it.
- get, which determines the next character on the conveyor belt and removes it.
- ignore, which deletes characters up to and including the provided “target character”. With no arguments, it deletes just one character.
- clear, which resets the status of the cin stream and readies it for input after a failure. This does not reset the stream contents.
Usually, clear and ignore are used together to reset cin after encountering bad user input. Again, we’ll say more about this in the future when we get to if statements and control flow.
Practise 2.
I adapted the following problem from Prof. Michael Andrews’ most recent PIC 10A midterm.
Consider the following code:
Code
#include <iostream> // needed for cin, cout
#include <string>   // needed for strings

using namespace std;

int main() {
    int i1, i2, i3, i4, i5;
    char c;
    string s;

    cin >> i1;
    cin >> i2;
    cin >> s;

    cin >> i3;
    cin.ignore();

    cin >> c;
    cin >> i4 >> i5;

    cout << endl;
    cout << "Line 1: " << i1 << endl;
    cout << "Line 2: " << i2 << endl; // These variables
    cout << "Line 3: " << s << endl;  // are printed in
    cout << "Line 4: " << i3 << endl; // the same order
    cout << "Line 5: " << c << endl;  // that they are
    cout << "Line 6: " << i4 << endl; // assigned to.
    cout << "Line 7: " << i5 << endl;
    return 0;
}
Suppose you entered the following four lines of input, with spaces indicated:
\n
9 8b\n
 ^
7 6543\n
 ^
2 1012 345 678 911\n
 ^    ^   ^   ^
Predict the output of the code, and determine the contents of the input buffer at the end of each line of code that processes user input.
Reformatting input
Sometimes, one wants to accept a large chunk of input using a single string and use string operations to further break it up or process it. Other times, one wants to use cin’s built-in input-splitting to do this. But in some scenarios, one is a better option than the other, especially once we learn about the capabilities of the getline function.
- Suppose you ask for a user’s phone number and need to extract the digits only from an input such as +1 (657) 114-5721.
- Suppose you are processing a table of numbers, with each row getting its own line and entries separated by commas. For instance, one row may appear as 334,447.3,-14.2,599.0,184.3.
We are not particularly well-equipped to deal with either example at the moment; we will need to learn about loops and vectors first (stick around for a few weeks and find out!). The point is that getline will give you access to an entire line of user input at a time, but you would have to do some additional work to extract numerical data from the input. Using cin for the second example may be more suitable, but the first example can be processed as a string (especially since the numerical values in the input aren’t very relevant).
To highlight how each method of processing input works, let’s work through the following example:
Example 3. Distance to Home
Write a program that allows a user to input an ordered pair of numbers \(\left( x, y \right)\) and computes its distance to the origin.
SAMPLE RUNS:
Please input an ordered pair (x,y): (3,4)
The point (3,4) is a distance of 5 from the origin.
Please input an ordered pair (x,y): (4,7)
The point (4,7) is a distance of 8.06226 from the origin.
Implement two solutions: one using only cin to obtain user input, the other using only one cin command and several string operations.
For the string manipulation solutions, the functions stod, rfind, and substr may be helpful. Make sure you can read through the reference and figure out how these functions work, even if you can’t understand all the syntax! I think it’s best to see the examples provided on the reference pages.
Try doing the problem yourself before peeking at the solutions, though I suppose I can’t stop you. Note that the user will be inputting the ordered pair including the surrounding parentheses and the comma.
Hint for cin implementation
You can read a single character from the input stream with cin >> ch;, where ch is a char variable. Why does this work?
Pseudocode with cin
Using the hint above, we may construct the following pseudocode:
1. Prompt the user for input.
2. Read a single character from the input stream. This should be the ( in the ordered pair.
3. Read a double called x from the input stream.
4. Read a single character from the input stream. This should be the , in the ordered pair.
5. Read a double called y from the input stream.
6. Compute the distance \(d=\sqrt{x^2+y^2}\) from the origin.
7. Print the results to the screen.
Most notably, we left the closing parenthesis in the input stream, as we aren’t reading any more input after we obtain y. In other applications where we may expect more input to follow, this would be a very bad idea, and it would be important to clear that out as well.
Implementation with cin
#include <iostream> // needed for cin and cout
#include <cmath>    // needed for sqrt

using namespace std;

int main() {
    // step 1
    cout << "Please input an ordered pair (x,y): ";

    // steps 2-5, compressed into a shorter line!
    char ch;
    double x, y;
    cin >> ch >> x >> ch >> y;

    // step 6
    double d = sqrt(x * x + y * y);

    // step 7 - record output
    cout << "The point (" << x << "," << y
         << ") is a distance of " << d
         << " from the origin." << endl;

    return 0;
}
Pseudocode with string operations
For this implementation, we need to determine the indices at which the parentheses and comma lie. Then, we’ll extract the substrings between these three extra characters before converting them to doubles.
1. Prompt the user for input.
2. Store the user’s full input in a string called input.
3. Create three variables, paren1, comma, paren2, which represent the indices of the opening parenthesis, comma, and closing parenthesis, respectively.
4. Extract the substring between paren1 and comma (exclusive on both sides) and convert it to a double called x.
5. Extract the substring between comma and paren2 (exclusive on both sides) and convert it to a double called y.
6. Compute the distance \(d=\sqrt{x^2+y^2}\) from the origin.
7. Print the results to the screen.
You will need to use rfind in step 3, and substr and stod in steps 4 and 5. Make sure you use size_t variables to store the indices in step 3!
Implementation with string operations
#include <iostream> // needed for cin and cout
#include <string>   // needed for rfind, substr, stod
#include <cmath>    // needed for sqrt

using namespace std;

int main() {
    // steps 1 and 2
    // note: cin >> input stops at whitespace, so this assumes the
    // pair is typed without spaces, e.g. (3,4)
    cout << "Please input an ordered pair (x,y): ";
    string input;
    cin >> input;

    // step 3
    size_t paren1 = input.rfind("(");
    size_t comma = input.rfind(",");
    size_t paren2 = input.rfind(")");

    // steps 4 and 5
    // remember that substr is inclusive in the first index,
    // hence the +1. Make sure you know how I'm counting the
    // length of these substrings!
    string x_string = input.substr(paren1 + 1, comma - paren1 - 1);
    double x = stod(x_string);

    // the y substring lies between the comma and the closing parenthesis
    string y_string = input.substr(comma + 1, paren2 - comma - 1);
    double y = stod(y_string);

    // step 6
    double d = sqrt(x * x + y * y);

    // step 7 - record output
    cout << "The point (" << x << "," << y
         << ") is a distance of " << d
         << " from the origin." << endl;

    return 0;
}