C++ Primer Plus, by Stephen Prata (part 29)

Generic Programming

Now that you have some experience using the STL, let’s look at the underlying philosophy. The STL is an example of generic programming. Object-oriented programming concentrates on the data aspect of programming, whereas generic programming concentrates on algorithms. The main things the two approaches have in common are abstraction and the creation of reusable code, but the philosophies are quite different.

A goal of generic programming is to write code that is independent of data types. Templates are the C++ tools for creating generic programs. Templates, of course, let you define a function or class in terms of a generic type. The STL goes further by providing a generic representation of algorithms. Templates make this possible, but not without the added element of careful and conscious design. To see how this mixture of templates and design works, let’s look at why iterators are needed.

Why Iterators?

Understanding iterators is perhaps the key to understanding the STL. Just as templates make algorithms independent of the type of data stored, iterators make the algorithms independent of the type of container used. Thus, they are an essential component of the STL’s generic approach.

To see why iterators are needed, let’s look at how you might implement a find function for two different data representations and then see how you could generalize the approach. First, let’s consider a function that searches an ordinary array of double for a particular value. You could write the function like this:

double * find_ar(double * ar, int n, const double & val)
{
    for (int i = 0; i < n; i++)
        if (ar[i] == val)
            return &ar[i];
    return 0; // or, in C++11, return nullptr;
}

If the function finds the value in the array, it returns the address in the array where the value is found; otherwise, it returns the null pointer. It uses subscript notation to move through the array. You could use a template to generalize to arrays of any type having an == operator. Nonetheless, this algorithm is still tied to one particular data structure—the array.
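The template generalization mentioned above might look like the following sketch. The name find_ar_t is hypothetical, and the code assumes a C++11 compiler for nullptr. Note that the element type is now a parameter, but the function still expects an array and a count:

```cpp
#include <cstddef>

// Hypothetical generalization of find_ar(): T can be any type that
// supports operator==. The data must still be an array described by
// a pointer and an element count.
template <typename T>
T * find_ar_t(T * ar, std::size_t n, const T & val)
{
    for (std::size_t i = 0; i < n; i++)
        if (ar[i] == val)
            return &ar[i];
    return nullptr;   // C++11; use 0 with older compilers
}
```

With an array of double named d holding four values, the call find_ar_t(d, 4, 3.5) deduces T as double and behaves just like find_ar().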

So let’s look at searching another kind of data structure, the linked list. (Chapter 12 uses a linked list to implement a Queue class.) The list consists of linked Node structures:

struct Node
{
    double item;
    Node * p_next;
};

Suppose you have a pointer that points to the first node in the list. The p_next pointer in each node points to the next node, and the p_next pointer for the last node in the list is set to 0. You could write a find_ll() function this way:

Node * find_ll(Node * head, const double & val)
{
    Node * start;
    for (start = head; start != 0; start = start->p_next)
        if (start->item == val)
            return start;
    return 0;
}

Again, you could use a template to generalize this to lists of any data type supporting the == operator. Nonetheless, this algorithm is still tied to one particular data structure—the linked list.

If you consider details of implementation, the two find functions use different algorithms: One uses array indexing to move through a list of items, and the other resets start to start->p_next. But broadly, the two algorithms are the same: Compare the value with each value in the container in sequence until you find a match.

The goal of generic programming in this case would be to have a single find function that would work with arrays or linked lists or any other container type. That is, not only should the function be independent of the data type stored in the container, it should be independent of the data structure of the container itself. Templates provide a generic representation for the data type stored in a container. What’s needed is a generic representation of the process of moving through the values in a container. The iterator is that generalized representation.

What properties should an iterator have in order to implement a find function? Here’s a short list:

• You should be able to dereference an iterator in order to access the value to which it refers. That is, if p is an iterator, *p should be defined.

• You should be able to assign one iterator to another. That is, if p and q are iterators, the expression p = q should be defined.

• You should be able to compare one iterator to another for equality. That is, if p and q are iterators, the expressions p == q and p != q should be defined.

• You should be able to move an iterator through all the elements of a container. This can be satisfied by defining ++p and p++ for an iterator p.

There are more things an iterator could do, but nothing more it need do—at least, not for the purposes of a find function. Actually, the STL defines several levels of iterators of increasing capabilities, and we’ll return to that matter later. Note, by the way, that an ordinary pointer meets the requirements of an iterator. Hence, you can rewrite the find_ar() function like this:

typedef double * iterator;

iterator find_ar(iterator ar, int n, const double & val)
{
    for (int i = 0; i < n; i++, ar++)
        if (*ar == val)
            return ar;
    return 0;
}

Then you can alter the function parameter list so that it takes a pointer to the beginning of the array and a pointer to one past-the-end of the array as arguments to indicate a range. (Listing 7.8 in Chapter 7, “Functions: C++’s Programming Modules,” does something similar.) And the function can return the end pointer as a sign the value was not found. The following version of find_ar() makes these changes:

typedef double * iterator;

iterator find_ar(iterator begin, iterator end, const double & val)
{
    iterator ar;
    for (ar = begin; ar != end; ar++)
        if (*ar == val)
            return ar;
    return end; // indicates val not found
}

For the find_ll() function, you can define an iterator class that defines the * and ++ operators:

struct Node
{
    double item;
    Node * p_next;
};

class iterator
{
    Node * pt;
public:
    iterator() : pt(0) {}
    iterator (Node * pn) : pt(pn) {}
    double operator*() { return pt->item;}
    iterator& operator++() // for ++it
    {
        pt = pt->p_next;
        return *this;
    }
    iterator operator++(int) // for it++
    {
        iterator tmp = *this;
        pt = pt->p_next;
        return tmp;
    }
    // ... operator==(), operator!=(), etc.
};

(To distinguish between the prefix and postfix versions of the ++ operator, C++ adopted the convention of letting operator++() be the prefix version and operator++(int) be the postfix version; the argument is never used and hence needn’t be given a name.)

The main point here is not how, in detail, to define the iterator class, but that with such a class, the second find function can be written like this:

iterator find_ll(iterator head, const double & val)
{
    iterator start;
    for (start = head; start != 0; ++start)
        if (*start == val)
            return start;
    return 0;
}

This is very nearly the same as find_ar(). The point of difference is in how the two functions determine whether they’ve reached the end of the values being searched. The find_ar() function uses an iterator to one-past-the-end, whereas find_ll() uses a null value stored in the final node. Remove that difference, and you can make the two functions identical. For example, you could require that the linked list have one additional element after the last official element. That is, you could have both the array and the linked list have a past-the-end element, and you could end the search when the iterator reaches the past-the-end position. Then find_ar() and find_ll() would have the same way of detecting the end of data and become identical algorithms. Note that requiring a past-the-end element moves from making requirements on iterators to making requirements on the container class.
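Here is a minimal sketch of where that reasoning leads. The names NodeIter and find_range are hypothetical, and the sketch takes one liberty: instead of adding an extra list element, it lets a default-constructed iterator (holding a null pointer) serve as the past-the-end value. With that convention, a single function template searches both the array and the linked list:

```cpp
struct Node
{
    double item;
    Node * p_next;
};

// Hypothetical list iterator; an iterator holding a null pointer
// plays the role of the past-the-end value.
class NodeIter
{
    Node * pt;
public:
    NodeIter(Node * pn = nullptr) : pt(pn) {}
    double & operator*() const { return pt->item; }
    NodeIter & operator++() { pt = pt->p_next; return *this; }
    bool operator==(const NodeIter & other) const { return pt == other.pt; }
    bool operator!=(const NodeIter & other) const { return pt != other.pt; }
};

// One algorithm for both data structures: it uses only *, ++, and !=.
template <typename Iter, typename T>
Iter find_range(Iter begin, Iter end, const T & val)
{
    for (; begin != end; ++begin)
        if (*begin == val)
            return begin;
    return end;   // indicates val not found
}
```

For an array arr of three double values, you would call find_range(arr, arr + 3, val); for a list whose first node is n1, you would call find_range(NodeIter(&n1), NodeIter(), val). In both cases, a return value equal to the second argument means the value was not found.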

The STL follows the approach just outlined. First, each container class (vector, list, deque, and so on) defines an iterator type appropriate to the class. For one class, the iterator might be a pointer; for another, it might be an object. Whatever the implementation, the iterator will provide the needed operations, such as * and ++. (Some classes may need more operations than others.) Next, each container class will have a past-the-end marker, which is the value assigned to an iterator when it has been incremented one past the last value in the container. Each container class will have begin() and end() methods that return iterators to the first element in a container and to the past-the-end position. And each container class will have the ++ operation take an iterator from the first element to past-the-end, visiting every container element en route.

To use a container class, you don’t need to know how its iterators are implemented nor how past-the-end is implemented. It’s enough to know that it does have iterators, that begin() returns an iterator to the first element, and that end() returns an iterator to past-the-end. For example, suppose you want to print the values in a vector object. In that case, you can use this:

vector<double>::iterator pr;
for (pr = scores.begin(); pr != scores.end(); pr++)
    cout << *pr << endl;

Here the following line identifies pr as the iterator type defined for the vector<double> class:

vector<double>::iterator pr;

If you used the list class template instead to store scores, you could use this code:

list<double>::iterator pr;
for (pr = scores.begin(); pr != scores.end(); pr++)
    cout << *pr << endl;

The only change is in the type declared for pr. Thus, by having each class define appropriate iterators and designing the classes in a uniform fashion, the STL lets you write the same code for containers that have quite dissimilar internal representations.

With C++11 automatic type deduction, you can simplify further and use the following code with either the vector or the list:

for (auto pr = scores.begin(); pr != scores.end(); pr++)
    cout << *pr << endl;

Actually, as a matter of style, it’s better to avoid using the iterators directly; instead, if possible, you should use an STL function, such as for_each(), that takes care of the details for you. Alternatively, use the C++11 range-based for loop:

for (auto x : scores) cout << x << endl;
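As a sketch of the for_each() alternative, the following hypothetical helper, lines_of(), lets the algorithm drive the iteration. It assumes a C++11 compiler (for the lambda) and writes to a string stream rather than to cout so that the result is easy to inspect:

```cpp
#include <algorithm>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the elements of any container as text,
// one per line, letting for_each() handle the iterator details.
template <typename Container>
std::string lines_of(const Container & c)
{
    typedef typename Container::value_type Value;
    std::ostringstream out;
    std::for_each(c.begin(), c.end(),
                  [&out](const Value & x) { out << x << '\n'; });
    return out.str();
}
```

The call lines_of(scores) compiles unchanged whether scores is a vector<double> or a list<double>, because the function never names an iterator type.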

So to summarize the STL approach, you start with an algorithm for processing a container. You express it in as general terms as possible, making it independent of data type and container type. To make the general algorithm work with specific cases, you define iterators that meet the needs of the algorithm and place requirements on the container design. That is, basic iterator properties and container properties stem from requirements placed on the algorithm.

Kinds of Iterators

Different algorithms have different requirements for iterators. For example, a find algorithm needs the ++ operator to be defined so the iterator can step through the entire container. It needs read access to data but not write access. (It just looks at data and doesn’t change it.) The usual sorting algorithm, on the other hand, requires random access so that it can swap two non-adjacent elements. If iter is an iterator, you can get random access by defining the + operator so that you can use expressions such as iter + 10. Also a sort algorithm needs to be able to both read and write data.

The STL defines five kinds of iterators and describes its algorithms in terms of which kinds of iterators it needs. The five kinds are the input iterator, output iterator, forward iterator, bidirectional iterator, and random access iterator. For example, the find() prototype looks like this:

template<class InputIterator, class T>
InputIterator find(InputIterator first, InputIterator last, const T& value);

This tells you that this algorithm requires an input iterator. Similarly, the following prototype tells you that the sort algorithm requires a random access iterator:

template<class RandomAccessIterator>
void sort(RandomAccessIterator first, RandomAccessIterator last);

All five kinds of iterators can be dereferenced (that is, the * operator is defined for them) and can be compared for equality (using the == operator, possibly overloaded) and inequality (using the != operator, possibly overloaded). If two iterators test as equal, then dereferencing one should produce the same value as dereferencing the second. That is, if

iter1 == iter2

is true, then the following is also true:

*iter1 == *iter2

Of course, these properties hold true for built-in operators and pointers, so these requirements are guides for what you must do when overloading these operators for an iterator class. Now let’s look at other iterator properties.

Input Iterators

The term input is used from the viewpoint of a program. That is, information going from the container to the program is considered input, just as information from a keyboard to the program is considered input. So an input iterator is one that a program can use to read values from a container. In particular, dereferencing an input iterator must allow a program to read a value from a container, but it needn’t allow a program to alter that value. So algorithms that require an input iterator are algorithms that don’t change values held in a container.

An input iterator has to allow you to access all the values in a container. It does so by supporting the ++ operator, both in prefix and postfix form. If you set an input iterator to the first element in a container and increment it until it reaches past-the-end, it will point to every container item once en route. Incidentally, there is no guarantee that traversing a container a second time with an input iterator will move through the values in the same order. Also after an input iterator has been incremented, there is no guarantee that its prior value can still be dereferenced. Any algorithm based on an input iterator, then, should be a single-pass algorithm that doesn’t rely on iterator values from a previous pass or on earlier iterator values from the same pass.

Note that an input iterator is a one-way iterator; it can increment, but it can’t back up.

Output Iterators

In STL usage, the term output indicates that the iterator is used for transferring information from a program to a container. (Thus the output for the program is input for the container.) An output iterator is similar to an input iterator, except that dereferencing is guaranteed to allow a program to alter a container value but not to read it. If the ability to write without reading seems strange, keep in mind that this property also applies to output sent to your display; cout can modify the stream of characters sent to the display, but it can’t read what’s onscreen. The STL is general enough that its containers can represent output devices, so you can run into the same situation with containers. Also if an algorithm modifies the contents of a container (for example, by generating new values to be stored) without reading the contents, there’s no reason to require that it use an iterator that can read the contents.

In short, you can use an input iterator for single-pass, read-only algorithms and an output iterator for single-pass, write-only algorithms.

Forward Iterators

Like input and output iterators, forward iterators use only the ++ operators for navigating through a container. So a forward iterator can only go forward through a container one element at a time. However, unlike input and output iterators, it necessarily goes through a sequence of values in the same order each time you use it. Also after you increment a forward iterator, you can still dereference the prior iterator value, if you’ve saved it, and get the same value. These properties make multiple-pass algorithms possible.
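To illustrate what multiple-pass means, here is a small sketch of an algorithm that needs at least a forward iterator. The function name count_of_max is hypothetical. It saves an iterator during the first pass, keeps dereferencing it, and then walks the range a second time, none of which an input iterator guarantees:

```cpp
#include <forward_list>

// Two-pass algorithm: find the largest value, then count how many
// times it occurs. Requires forward iterators because it revisits
// the range and dereferences a saved iterator.
template <typename ForwardIt>
int count_of_max(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return 0;
    ForwardIt best = first;                        // saved iterator stays usable
    for (ForwardIt it = first; it != last; ++it)   // first pass
        if (*best < *it)
            best = it;
    int count = 0;
    for (ForwardIt it = first; it != last; ++it)   // second pass
        if (*it == *best)
            ++count;
    return count;
}
```

The std::forward_list container (C++11), a singly linked list, supplies exactly this kind of iterator, so count_of_max(fl.begin(), fl.end()) works for a forward_list<int> named fl.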

A forward iterator can allow you to both read and modify data, or it can allow you just to read it:

int * pirw;      // read-write iterator
const int * pir; // read-only iterator

Bidirectional Iterators

Suppose you have an algorithm that needs to be able to traverse a container in both directions. For example, a reverse function could swap the first and last elements, increment the pointer to the first element, decrement the pointer to the last element, and repeat the process until the two pointers meet. A bidirectional iterator has all the features of a forward iterator and adds support for the two decrement operators (prefix and postfix).
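The reverse idea can be sketched as follows. The name reverse_range is hypothetical (the library’s own version is std::reverse()), and the code is only one way to arrange the loop. It relies on nothing beyond ++, --, equality comparison, and dereferencing:

```cpp
#include <algorithm>   // std::iter_swap
#include <list>

// Hypothetical reverse for any bidirectional range [first, last).
template <typename BidirIt>
void reverse_range(BidirIt first, BidirIt last)
{
    while (first != last)
    {
        --last;                       // step back onto the last unswapped element
        if (first == last)
            break;                    // odd count: the middle element stays put
        std::iter_swap(first, last);  // swap the two elements referred to
        ++first;
    }
}
```

Because a list iterator is bidirectional, reverse_range(li.begin(), li.end()) works for a list<int> named li, and it also works for pointers into an array.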

Random Access Iterators

Some algorithms, such as standard sort and binary search, require the ability to jump directly to an arbitrary element of a container. This is termed random access, and it requires a random access iterator. This type of iterator has all the features of a bidirectional iterator, plus it adds operations (such as pointer addition) that support random access and relational operators for ordering the elements. Table 16.3 lists the operations a random access iterator has beyond those of a bidirectional iterator. In this table, X represents a random iterator type, T represents the type pointed to, a and b are iterator values, n is an integer, and r is a random iterator variable or reference.

Table 16.3. Random Access Iterator Operations

Expressions such as a + n are valid only if both a and a + n lie within the range of the container (including past-the-end).
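The following sketch exercises the operations that set a random access iterator apart, using vector<int> iterators. The function name random_access_demo is hypothetical, and the list of expressions is illustrative rather than a reproduction of the table. It assumes C++11 for the initializer list:

```cpp
#include <vector>

// Exercises the extra random access operations on vector iterators.
// Returns true if every expression behaves as the comments describe.
bool random_access_demo()
{
    std::vector<int> v = {10, 20, 30, 40, 50};
    std::vector<int>::iterator a = v.begin();
    std::vector<int>::iterator b = v.end();   // past-the-end
    std::vector<int>::iterator r = a;

    bool ok = true;
    ok = ok && *(a + 2) == 30;     // a + n: nth element after a
    ok = ok && *(2 + a) == 30;     // n + a: same as a + n
    ok = ok && a[3] == 40;         // a[n]: same as *(a + n)
    ok = ok && b - a == 5;         // b - a: number of steps from a to b
    ok = ok && *(b - 1) == 50;     // a - n: nth element before a
    r += 4;                        // r now refers to 50
    ok = ok && *r == 50;
    r -= 3;                        // r now refers to 20
    ok = ok && *r == 20;
    ok = ok && a < r && r < b;     // relational operators order positions
    ok = ok && a + 5 == b;         // a + n may land on past-the-end
    return ok;
}
```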

Iterator Hierarchy

You have probably noticed that the iterator kinds form a hierarchy. A forward iterator has all the capabilities of an input iterator and of an output iterator, plus its own capabilities. A bidirectional iterator has all the capabilities of a forward iterator, plus its own capabilities. And a random access iterator has all the capabilities of a bidirectional iterator, plus its own capabilities. Table 16.4 summarizes the main iterator capabilities. In it, i is an iterator, and n is an integer.

Table 16.4. Iterator Capabilities

An algorithm written in terms of a particular kind of iterator can use that kind of iterator or any other iterator that has the required capabilities. So a container with, say, a random access iterator can use an algorithm written for an input iterator.

Why all these different kinds of iterators? The idea is to write an algorithm using the iterator with the fewest requirements possible, allowing it to be used with the largest range of containers. Thus, the find() function, by using a lowly input iterator, can be used with any container that contains readable values. The sort() function, however, by requiring a random access iterator, can be used just with containers that support that kind of iterator.

Note that the various iterator kinds are not defined types; rather, they are conceptual characterizations. As mentioned earlier, each container class defines a class scope typedef name called iterator. So the vector<int> class has iterators of type vector<int>::iterator. But the documentation for this class would tell you that vector iterators are random access iterators. That, in turn, allows you to use algorithms based on any iterator type because a random access iterator has all the iterator capabilities. Similarly, a list<int> class has iterators of type list<int>::iterator. The STL implements a doubly linked list, so it uses a bidirectional iterator. Thus, it can’t use algorithms based on random access iterators, but it can use algorithms based on less demanding iterators.
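Although the iterator kinds are requirements rather than types, the library does record each iterator type’s kind as a tag that you can query through std::iterator_traits. The following sketch uses that facility; the helper name is_random_access is hypothetical, and the code assumes C++11 for constexpr, static_assert, and the type_traits header:

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Reports at compile time whether iterator type It is tagged as a
// random access iterator.
template <typename It>
constexpr bool is_random_access()
{
    return std::is_same<
        typename std::iterator_traits<It>::iterator_category,
        std::random_access_iterator_tag>::value;
}

static_assert(is_random_access<std::vector<int>::iterator>(),
              "vector iterators are random access iterators");
static_assert(is_random_access<int *>(),
              "ordinary pointers are random access iterators");
static_assert(!is_random_access<std::list<int>::iterator>(),
              "list iterators are only bidirectional");
```

This is why sort() accepts vector iterators but not list iterators; for a list, you would use the list class’s own sort() member function instead.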

Concepts, Refinements, and Models

The STL has several features, such as kinds of iterators, that aren’t expressible in the C++ language. That is, although you can design, say, a class that has the properties of a forward iterator, you can’t have the compiler restrict an algorithm to using only that class. The reason is that the forward iterator is a set of requirements, not a type. The requirements could be satisfied by an iterator class you’ve designed, but they could also be satisfied by an ordinary pointer. An STL algorithm works with any iterator implementation that meets its requirements. STL literature uses the word concept to describe a set of requirements. Thus, there is an input iterator concept, a forward iterator concept, and so on. By the way, if you do need iterators for, say, a container class you’re designing, you can look to the STL, which includes iterator templates for the standard varieties.

Concepts can have an inheritance-like relationship. For example, a bidirectional iterator inherits the capabilities of a forward iterator. However, you can’t apply the C++ inheritance mechanism to iterators. For example, you might implement a forward iterator as a class and a bidirectional iterator as a regular pointer. So in terms of the C++ language, this particular bidirectional iterator, being a built-in type, couldn’t be derived from a class. Conceptually, however, it does inherit. Some STL literature uses the term refinement to indicate this conceptual inheritance. Thus, a bidirectional iterator is a refinement of the forward iterator concept.

A particular implementation of a concept is termed a model. Thus, an ordinary pointer-to-int is a model of the concept random access iterator. It’s also a model of a forward iterator, for it satisfies all the requirements of that concept.

The Pointer As Iterator

Iterators are generalizations of pointers, and a pointer satisfies all the iterator requirements. Iterators form the interface for STL algorithms, and pointers are iterators, so STL algorithms can use pointers to operate on non-STL containers that are based on pointers. For example, you can use STL algorithms with arrays. Suppose Receipts is an array of double values, and you would like to sort in ascending order:

const int SIZE = 100;
double Receipts[SIZE];

The STL sort() function, recall, takes as arguments an iterator pointing to the first element in a container and an iterator pointing to past-the-end. Well, &Receipts[0] (or just Receipts) is the address of the first element, and &Receipts[SIZE] (or just Receipts + SIZE) is the address of the element following the last element in the array. Thus, the following function call sorts the array:

sort(Receipts, Receipts + SIZE);

C++ guarantees that the expression Receipts + n is defined as long as the result lies in the array or one past-the-end. Thus, C++ supports the “one-past-the-end” concept for pointers into an array. Because pointers are iterators and algorithms are iterator based, you can apply STL algorithms to ordinary arrays. Similarly, you can apply STL algorithms to data forms of your own design, provided that you supply suitable iterators (which may be pointers or objects) and past-the-end indicators.
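As a small sketch of the same idea, the hypothetical function template sort_array() below sorts any built-in array of double. It assumes C++11, whose std::begin() and std::end() functions return the first-element and past-the-end pointers for an array:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Sorts a built-in array in ascending order. For an array ar of N
// elements, std::begin(ar) is ar and std::end(ar) is ar + N, so the
// call is equivalent to std::sort(ar, ar + N).
template <std::size_t N>
void sort_array(double (&ar)[N])
{
    std::sort(std::begin(ar), std::end(ar));
}
```

Passing the array by reference lets the compiler deduce N, so the call is simply sort_array(Receipts).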

copy(), ostream_iterator, and istream_iterator

The STL provides some predefined iterators. To see why, let’s establish some background. There is an algorithm called copy() for copying data from one container to another. This algorithm is expressed in terms of iterators, so it can copy from one kind of container to another or even from or to an array, because you can use pointers into an array as iterators. For example, the following copies an array into a vector:

int casts[10] = {6, 7, 2, 9, 4, 11, 8, 7, 10, 5};
vector<int> dice(10);
copy(casts, casts + 10, dice.begin()); // copy array to vector

The first two iterator arguments to copy() represent a range to be copied, and the final iterator argument represents the location to which the first item is copied. The first two arguments must be input iterators (or better), and the final argument must be an output iterator (or better). The copy() function overwrites existing data in the destination container, and the container has to be large enough to hold the copied elements. So you can’t use copy() to place data in an empty vector—at least not without resorting to a trick that is revealed later in this chapter.

Now suppose you want to copy information to the display. You could use copy() if there was an iterator representing the output stream. The STL provides such an iterator with the ostream_iterator template. Using STL terminology, this template is a model of the output iterator concept. It is also an example of an adapter—a class or function that converts some other interface to an interface used by the STL. You can create an iterator of this kind by including the iterator (formerly iterator.h) header file and making a declaration:

#include <iterator>
...
ostream_iterator<int, char> out_iter(cout, " ");

The out_iter iterator now becomes an interface that allows you to use cout to display information. The first template argument (int, in this case) indicates the data type being sent to the output stream. The second template argument (char, in this case) indicates the character type used by the output stream. (Another possible value would be wchar_t.) The first constructor argument (cout, in this case) identifies the output stream being used. It could also be a stream used for file output. The final character string argument is a separator to be displayed after each item sent to the output stream.

You could use the iterator like this:

*out_iter++ = 15; // works like cout << 15 << " ";

For a regular pointer, this would mean assigning the value 15 to the pointed-to location and then incrementing the pointer. For this ostream_iterator, however, the statement means send 15 and then a string consisting of a space to the output stream managed by cout. Then it should get ready for the next output operation. You could use the iterator with copy() as follows:

copy(dice.begin(), dice.end(), out_iter); // copy vector to output stream

This would mean to copy the entire range of the dice container to the output stream—that is, to display the contents of the container.

Or you could skip creating a named iterator and construct an anonymous iterator instead. That is, you could use the adapter like this:

copy(dice.begin(), dice.end(), ostream_iterator<int, char>(cout, " ") );
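Because the first constructor argument can be any output stream, the same technique works with streams other than cout. The following sketch attaches the iterator to a string stream; the function name join_with_spaces is hypothetical:

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats a vector<int> as text by copying it to
// an ostream_iterator attached to a string stream instead of cout.
std::string join_with_spaces(const std::vector<int> & v)
{
    std::ostringstream out;
    std::copy(v.begin(), v.end(),
              std::ostream_iterator<int, char>(out, " "));
    return out.str();
}
```

Note that the separator follows every item, including the last, so a vector holding 6, 7, and 2 produces the string "6 7 2 " with a trailing space.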

Similarly, the iterator header file defines an istream_iterator template for adapting istream input to the iterator interface. It is a model of the input iterator concept. You could use two istream_iterator objects to define an input range for copy():

copy(istream_iterator<int, char>(cin),
     istream_iterator<int, char>(), dice.begin());

Like ostream_iterator, istream_iterator uses two template arguments. The first indicates the data type to be read, and the second indicates the character type used by the input stream. Using a constructor argument of cin means to use the input stream managed by cin. Omitting the constructor argument indicates input failure, so the previous code means to read from the input stream until end-of-file, type mismatch, or some other input failure.
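To see where such a range ends, you can read from a string stream instead of cin. In the sketch below, the function name read_ints is hypothetical, and the two iterators initialize a vector directly, which avoids having to size the destination in advance:

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads ints from text until end of input or until the first token
// that cannot be read as an int.
std::vector<int> read_ints(const std::string & text)
{
    std::istringstream in(text);
    std::istream_iterator<int, char> first(in);  // reads from the stream
    std::istream_iterator<int, char> last;       // no argument: end-of-input marker
    return std::vector<int>(first, last);
}
```

Given the text "4 5 x 6", the function returns a vector holding just 4 and 5, because the type mismatch at x makes the first iterator compare equal to the end-of-input marker.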

Other Useful Iterators

The iterator header file provides some other special-purpose predefined iterator types in addition to ostream_iterator and istream_iterator. They are reverse_iterator, back_insert_iterator, front_insert_iterator, and insert_iterator.

Let’s start with seeing what a reverse iterator does. In essence, incrementing a reverse iterator causes it to decrement. Why not just decrement a regular iterator? The main reason is to simplify using existing functions. Suppose you want to display the contents of the dice container. As you just saw, you can use copy() and ostream_iterator to copy the contents to the output stream:

ostream_iterator<int, char> out_iter(cout, " ");
copy(dice.begin(), dice.end(), out_iter); // display in forward order

Now suppose you want to print the contents in reverse order. (Perhaps you are performing time-reversal studies.) There are several approaches that don’t work, but rather than wallow in them, let’s go to one that does. The vector class has a member function called rbegin() that returns a reverse iterator pointing to past-the-end and a member rend() that returns a reverse iterator pointing to the first element. Because incrementing a reverse iterator makes it decrement, you can use the following statement to display the contents backward:

copy(dice.rbegin(), dice.rend(), out_iter); // display in reverse order

You don’t even have to declare a reverse iterator.

Note

Both rbegin() and end() return the same value (past-the-end), but as a different type (reverse_iterator versus iterator). Similarly, both rend() and begin() return the same value (an iterator to the first element), but as a different type.

Reverse iterators have to make a special compensation. Suppose rp is a reverse iterator initialized to dice.rbegin(). What should *rp be? Because rbegin() returns past-the-end, you shouldn’t try to dereference that address. Similarly, if rend() is really the location of the first element, copy() stops one location earlier because the end of a range is not part of the range. Reverse iterators solve both problems by decrementing first and then dereferencing. That is, *rp dereferences the iterator value immediately preceding the current value of rp. If rp points to position six, *rp is the value of position five, and so on. Listing 16.10 illustrates using copy(), an ostream iterator, and a reverse iterator.
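Before turning to the full listing, you can check the decrement-then-dereference rule in isolation. The sketch below relies on the reverse_iterator member function base(), which returns the ordinary iterator that the reverse iterator wraps. The function name reverse_offset_holds is hypothetical, and the code assumes C++11 for the initializer list:

```cpp
#include <vector>

// Returns true if a reverse iterator rp yields the element just
// before the ordinary iterator it wraps, rp.base().
bool reverse_offset_holds()
{
    std::vector<int> dice = {6, 7, 2, 9};
    std::vector<int>::reverse_iterator rp = dice.rbegin();

    bool ok = rp.base() == dice.end();          // rbegin() wraps past-the-end
    ok = ok && *rp == 9;                        // yet *rp is the last element
    ok = ok && *rp == *(rp.base() - 1);         // decrement, then dereference
    ++rp;                                       // incrementing moves toward the front
    ok = ok && *rp == 2;
    ok = ok && dice.rend().base() == dice.begin();  // rend() wraps the first element
    return ok;
}
```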

Listing 16.10. copyit.cpp

// copyit.cpp -- copy() and iterators
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>   // for copy()

int main()
{
    using namespace std;

    int casts[10] = {6, 7, 2, 9, 4, 11, 8, 7, 10, 5};
    vector<int> dice(10);
    // copy from array to vector
    copy(casts, casts + 10, dice.begin());
    cout << "Let the dice be cast!\n";
    // create an ostream iterator
    ostream_iterator<int, char> out_iter(cout, " ");
    // copy from vector to output
    copy(dice.begin(), dice.end(), out_iter);
    cout << endl;
    cout << "Implicit use of reverse iterator.\n";
    copy(dice.rbegin(), dice.rend(), out_iter);
    cout << endl;
    cout << "Explicit use of reverse iterator.\n";
    vector<int>::reverse_iterator ri;
    for (ri = dice.rbegin(); ri != dice.rend(); ++ri)
        cout << *ri << ' ';
    cout << endl;
    return 0;
}

Here is the output of the program in Listing 16.10:

Let the dice be cast!
6 7 2 9 4 11 8 7 10 5
Implicit use of reverse iterator.
5 10 7 8 11 4 9 2 7 6
Explicit use of reverse iterator.
5 10 7 8 11 4 9 2 7 6

If you have the choice of explicitly declaring iterators or using STL functions to handle the matter internally, for example, by passing an rbegin() return value to a function, you should take the latter course. It’s one less thing to do and one less opportunity to experience human fallibility.

The other three iterators (back_insert_iterator, front_insert_iterator, and insert_iterator) also increase the generality of the STL algorithms. Many STL functions are like copy() in that they send their results to a location indicated by an output iterator. Recall that the following copies values to the location beginning at dice.begin():

copy(casts, casts + 10, dice.begin());

These values overwrite the prior contents in dice, and the function assumes that dice has enough room to hold the values. That is, copy() does not automatically adjust the size of the destination to fit the information sent to it. Listing 16.10 takes care of that situation by declaring dice to have 10 elements, but suppose you don’t know in advance how big dice should be. Or suppose you want to add elements to dice rather than overwrite existing ones.

The three insert iterators solve these problems by converting the copying process to an insertion process. Insertion adds new elements without overwriting existing data, and it uses automatic memory allocation to ensure that the new information fits. A back_insert_iterator inserts items at the end of the container, and a front_insert_iterator inserts items at the front. Finally, the insert_iterator inserts items in front of the location specified as an argument to the insert_iterator constructor. All three of these iterators are models of the output iterator concept.

There are restrictions. A back_insert_iterator can be used only with container types that allow rapid insertion at the end. (Rapid refers to a constant time algorithm; the section “Container Concepts,” later in this chapter, discusses the constant time concept further.) The vector class qualifies. A front_insert_iterator can be used only with container types that allow constant time insertion at the beginning. Here the vector class doesn’t qualify, but the deque and list classes do. The insert_iterator doesn’t have these restrictions. Thus, you can use it to insert material at the front of a vector. However, a front_insert_iterator does so faster for the container types that support it.

Tip

You can use an insert_iterator to convert an algorithm that copies data into one that inserts data.

These iterators take the container type as a template argument and the actual container identifier as a constructor argument. That is, to create a back_insert_iterator for a vector container called dice, you use this:

back_insert_iterator<vector<int> > back_iter(dice);

The reason you have to declare the container type is that the iterator has to make use of the appropriate container method. The code for the back_insert_iterator constructor will assume that a push_back() method exists for the type passed to it. The copy() function, being a standalone function, doesn’t have the access rights to resize a container. But the declaration just shown allows back_iter to use the vector<int>::push_back() method, which does have access rights.

Declaring a front_insert_iterator has the same form. An insert_iterator declaration has an additional constructor argument to identify the insertion location:

insert_iterator<vector<int> > insert_iter(dice, dice.begin());
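As a rough sketch of how the three declarations behave side by side, the following helper functions each wrap one kind of insert iterator around copy(). The function names (append_all, prepend_all, insert_at_start) are hypothetical and chosen only for illustration, and the element type int is an assumption. Note one difference in behavior: a front_insert_iterator places each new item in front of the previous one, so the copied range ends up in reverse order, whereas an insert_iterator preserves the order.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical helper: appends the values in src to the end of dest.
void append_all(const std::vector<int> & src, std::vector<int> & dest)
{
    std::back_insert_iterator<std::vector<int> > back_iter(dest);
    std::copy(src.begin(), src.end(), back_iter);    // uses dest.push_back()
}

// Hypothetical helper: inserts the values in src at the front of dest.
// Each value lands in front of the previous one, so the order is reversed.
void prepend_all(const std::vector<int> & src, std::list<int> & dest)
{
    std::front_insert_iterator<std::list<int> > front_iter(dest);
    std::copy(src.begin(), src.end(), front_iter);   // uses dest.push_front()
}

// Hypothetical helper: inserts the values in src, in their original order,
// in front of the element that was first in dest.
void insert_at_start(const std::vector<int> & src, std::vector<int> & dest)
{
    std::insert_iterator<std::vector<int> > insert_iter(dest, dest.begin());
    std::copy(src.begin(), src.end(), insert_iter);  // uses dest.insert()
}
```

For example, with src holding 1 2 and dest holding just 9, append_all() leaves dest as 9 1 2, prepend_all() leaves it as 2 1 9, and insert_at_start() leaves it as 1 2 9.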

Listing 16.11 illustrates using two of these iterators. It also uses for_each() instead of an ostream iterator for output.

Listing 16.11. inserts.cpp

// inserts.cpp -- copy() and insert iterators

#include <iostream>

#include <string>

#include <iterator>

#include <vector>

#include <algorithm>

void output(const std::string & s) {std::cout << s << " ";}

int main()

{

using namespace std;

string s1[4] = {"fine", "fish", "fashion", "fate"};

string s2[2] = {"busy", "bats"};

string s3[2] = {"silly", "singers"};

vector<string> words(4);

copy(s1, s1 + 4, words.begin());

for_each(words.begin(), words.end(), output);

cout << endl;

// construct anonymous back_insert_iterator object

copy(s2, s2 + 2, back_insert_iterator<vector<string> >(words));

for_each(words.begin(), words.end(), output);

cout << endl;

// construct anonymous insert_iterator object

copy(s3, s3 + 2, insert_iterator<vector<string> >(words,

words.begin()));

for_each(words.begin(), words.end(), output);

cout << endl;

return 0;

}

Here is the output of the program in Listing 16.11:

fine fish fashion fate

fine fish fashion fate busy bats

silly singers fine fish fashion fate busy bats

The first copy() copies the four strings from s1 into words. This works in part because words is declared to hold four strings, which equals the number of strings being copied. Then the back_insert_iterator inserts the strings from s2 at the end of words, expanding its size to six elements. Finally, the insert_iterator inserts the two strings from s3 just in front of the first element of words, expanding its size to eight elements. If the program instead passed words.end() and words.begin() to copy() as plain iterators, nothing would make room for the new data. Copying to words.end() would write past the end of the vector, which is undefined behavior and would probably cause the program to abort because of memory violations. Copying to words.begin() would overwrite the first two strings rather than insert new ones in front of them.

If you’re feeling overwhelmed by all the iterator varieties, keep in mind that using them will make them familiar. Also keep in mind that these predefined iterators expand the generality of the STL algorithms. Thus, not only can copy() copy information from one container to another, it can copy information from a container to the output stream and from the input stream to a container. And you can also use copy() to insert material into another container. So you wind up with a single function doing the work of many. And because copy() is just one of several STL functions that use an output iterator, these predefined iterators multiply the capabilities of those functions, too.
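The versatility described here can be sketched with two small functions that use copy() in both directions. This is a minimal illustration, not part of the original listings: the function names read_ints and write_ints are hypothetical, and string streams stand in for cin and cout so that the example does not depend on keyboard input.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated ints from text
// and inserts each one at the end of a vector.
std::vector<int> read_ints(const std::string & text)
{
    std::istringstream in(text);
    std::vector<int> values;
    std::copy(std::istream_iterator<int>(in), std::istream_iterator<int>(),
              std::back_insert_iterator<std::vector<int> >(values));
    return values;
}

// Hypothetical helper: copies a vector to an output stream,
// writing a space after each value, and returns the resulting text.
std::string write_ints(const std::vector<int> & values)
{
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<int>(out, " "));
    return out.str();
}
```

In both functions the algorithm is the same copy(); only the iterators passed to it determine whether it reads from a stream, writes to a stream, or inserts into a container.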

Kinds of Containers

The STL has both container concepts and container types. The concepts are general categories with names such as container, sequence container, and associative container. The container types are templates you can use to create specific container objects. The original 11 container types are deque, list, queue, priority_queue, stack, vector, map, multimap, set, multiset, and bitset. (This chapter doesn’t discuss bitset, which is a container for dealing with data at the bit level.) C++11 adds forward_list, unordered_map, unordered_multimap, unordered_set, and unordered_multiset, and it moves bitset from the container category into its own separate category. Because the concepts categorize the types, let’s start with them.

Container Concepts

No type corresponds to the basic container concept, but the concept describes elements common to all the container classes. It’s sort of a conceptual abstract base class—conceptual because the container classes don’t actually use the inheritance mechanism. Or to put it another way, the container concept lays down a set of requirements that all STL container classes must satisfy.

A container is an object that stores other objects, which are all of a single type. The stored objects may be objects in the OOP sense, or they may be values of built-in types. Data stored in a container is owned by the container. That means when a container expires, so does the data stored in the container. (However, if the data are pointers, the pointed-to data does not necessarily expire.)
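The ownership rule can be checked with a short experiment. The sketch below assumes a hypothetical class, Tracked, that counts how many of its objects currently exist; the two functions report how many objects survive after a container of objects and a container of pointers expire.

```cpp
#include <vector>

// Hypothetical class that counts how many of its objects currently exist.
struct Tracked
{
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked &) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of Tracked objects left after a vector of objects expires.
int live_after_value_container()
{
    {
        std::vector<Tracked> owned(3);      // the container owns three objects
    }                                       // container and objects expire here
    return Tracked::live;
}

// Returns the number of Tracked objects left after a vector of pointers
// expires, measured before the function cleans up after itself.
int live_after_pointer_container()
{
    Tracked * p = new Tracked;
    {
        std::vector<Tracked *> ptrs(1, p);  // the container owns only the pointer
    }                                       // the pointer expires, the object does not
    int remaining = Tracked::live;
    delete p;                               // the program must free the object itself
    return remaining;
}
```

The first function returns 0 because the stored objects expire with the container. The second returns 1 because only the stored pointer expires, leaving the pointed-to object for the program to delete.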

You can’t store just any kind of object in a container. In particular, the type has to be copy constructible and assignable. Basic types satisfy these requirements, as do class types, unless the class definition makes one or both of the copy constructor and the assignment operator private or protected. (C++11 refines the concepts, adding terms such as CopyInsertable and MoveInsertable, but we’ll take a more simplified, if less precise, overview.)

The basic container doesn’t guarantee that its elements are stored in any particular order or that the order doesn’t change, but refinements to the concept may add such guarantees. All containers provide certain features and operations. Table 16.5 summarizes several of these common features. In the table, X represents a container type (such as vector<int>), T represents the type of object stored in the container, a and b represent values of type X, r is a value of type X&, and u represents an identifier of type X (that is, if X represents vector<int>, then u is a vector<int> object).

Table 16.5. Some Basic Container Properties

The Complexity column in Table 16.5 describes the time needed to perform an operation. This table lists three possibilities, which, from fastest to slowest, are as follows:

• Compile time

• Constant time

• Linear time

If the complexity is compile time, the action is performed during compilation and uses no execution time. A constant complexity means the operation takes place during runtime but doesn’t depend on the number of elements in an object. A linear complexity means the time is proportional to the number of elements. Thus, if a and b are containers, a == b has linear complexity because the == operation may have to be applied to each element of the container. Actually, that is a worst-case scenario. If two containers have different sizes, no individual comparisons need to be made.
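One way to observe the linear behavior of a == b is to count element comparisons. The following sketch assumes a hypothetical element type, Counted, whose == operator increments a counter. The exact counts depend on how a library implements the comparison, so treat the figures as typical rather than guaranteed.

```cpp
#include <vector>

// Hypothetical element type that counts how often == is applied to it.
struct Counted
{
    int value;
    static int comparisons;
    Counted(int v = 0) : value(v) {}
};
int Counted::comparisons = 0;

bool operator==(const Counted & a, const Counted & b)
{
    ++Counted::comparisons;
    return a.value == b.value;
}

// Returns the number of element comparisons made while evaluating a == b.
int comparisons_used(const std::vector<Counted> & a,
                     const std::vector<Counted> & b)
{
    Counted::comparisons = 0;
    bool same = (a == b);
    (void) same;                 // only the count matters here
    return Counted::comparisons;
}
```

With two equal vectors of 100 elements, comparisons_used() should report 100 comparisons. With vectors of 100 and 50 elements, it should report none, because the size check settles the matter first.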

Constant-Time and Linear-Time Complexity

Imagine a long, narrow box filled with large packages arranged in a line, and suppose the box is open at just one end. Suppose your task is to unload the package at the open end. This is a constant time task. Whether there are 10 packages or 1,000 packages behind the one at the end makes no difference.

Now suppose your task is to fetch the package at the closed end of the box. This is a linear time task. If there are 10 packages altogether, you have to unload 10 packages to get the one at the closed end. If there are 100 packages, you have to unload 100 packages to reach it. Assuming that you are a tireless worker who can move only 1 package at a time, the 100-package task will take 10 times longer than the 10-package task.

Now suppose your task is to fetch an arbitrary package. It might happen that the package you are supposed to get is the first one at hand. However, on the average, the number of packages you have to move is still proportional to the number of packages in the container, so the task still has linear-time complexity.

Replacing the long, narrow box with a similar box having open sides would change the task to constant-time complexity because then you could move directly to the desired package and remove it without moving the others.

The idea of time complexity describes the effect of container size on execution time but ignores other factors. If a superhero can unload packages from a box with one open end 1,000 times faster than you can, the task as executed by her still has linear-time complexity. In this case, the superhero’s linear-time performance with a closed box (open end) would be faster than your constant-time performance with an open box, as long as the boxes didn’t have too many packages.

Complexity requirements are characteristic of the STL. Although the details of an implementation may be hidden, the performance specifications should be public so that you know the computing cost of doing a particular operation.

C++11 Additions to Container Requirements

Table 16.6 shows some additions C++11 has made to the general container requirements. The table uses the notation rv to denote a non-constant rvalue of type X (for example, the return value of a function). Also the requirement in Table 16.6 that X::iterator satisfy the requirements for a forward iterator is a change from the former requirement that it just not be an output iterator.

Table 16.6. Some Added Basic Container Requirements (C++11)

The difference between copy construction and copy assignment on the one hand and move construction and move assignment on the other hand is that a copy operation leaves the original unchanged, whereas a move operation can alter the original, perhaps transferring ownership without doing any copying. When the source object is temporary, move operations can provide more efficient code than does regular copying. Chapter 18 discusses move semantics further.
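The contrast can be sketched with a pair of hypothetical helper functions (size_after_copy and size_after_move are illustrative names, not library functions). Each constructs a new vector from source and then reports how many elements source still holds. The sketch requires C++11 and assumes the usual library behavior in which a moved-from vector is left empty.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: copy-constructs a vector from source and
// returns the number of elements source holds afterward.
std::size_t size_after_copy(std::vector<std::string> & source)
{
    std::vector<std::string> dest(source);             // copy construction
    return source.size();                              // source is unchanged
}

// Hypothetical helper: move-constructs a vector from source and
// returns the number of elements source holds afterward.
std::size_t size_after_move(std::vector<std::string> & source)
{
    std::vector<std::string> dest(std::move(source));  // move construction
    return source.size();                              // source gave up its data
}
```

For a vector holding three strings, size_after_copy() returns 3. A later call to size_after_move() returns 0 because dest took over the storage instead of duplicating it.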

Sequences

You can refine the basic container concept by adding requirements. The sequence is an important refinement because several of the STL container types—deque, forward_list (C++11), list, queue, priority_queue, stack, and vector—are sequences. (Recall that a queue allows elements to be added at the rear end and removed from the front. A double-ended queue, represented by deque, allows addition and removal at both ends.) The requirement that the iterator be at least a forward iterator guarantees that the elements are arranged in a definite order that doesn’t change from one cycle of iteration to the next. The array class also is classified as a sequence container, although it doesn’t satisfy all the requirements.

The sequence also requires that its elements be arranged in strict linear order. That is, there is a first element, there is a last element, and each element but the first and last has exactly one element immediately ahead of it and one element immediately after it. An array and a linked list are examples of sequences, whereas a branching structure (in which each node points to two daughter nodes) is not.

Because elements in a sequence have a definite order, operations such as inserting values at a particular location and erasing a particular range become possible. Table 16.7 lists these and other operations required of a sequence. The table uses the same notation as Table 16.5, with the addition of t representing a value of type T (that is, the type of value stored in the container), of n, an integer, and of p, q, i, and j, representing iterators.

Table 16.7. Sequence Requirements
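A brief sketch of the kind of operation a sequence supports follows. The function name insert_then_erase is hypothetical. It builds a vector from a range, inserts a value at a particular location, and erases a particular range, using the standard range constructor and the insert() and erase() methods.

```cpp
#include <vector>

// Hypothetical helper: demonstrates construction from a range,
// insertion at a location, and erasure of a range.
std::vector<int> insert_then_erase()
{
    int init[5] = {1, 2, 3, 4, 5};
    std::vector<int> a(init, init + 5);     // 1 2 3 4 5
    a.insert(a.begin() + 1, 9);             // 1 9 2 3 4 5
    a.erase(a.begin() + 3, a.begin() + 5);  // removes 3 and 4: 1 9 2 5
    return a;
}
```

The same three operations work with the other sequence types, although iterator arithmetic such as a.begin() + 1 requires random access iterators, which list does not provide.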

Because the deque, list, queue, priority_queue, stack, and vector template classes are all models of the sequence concept, they all support the operations in Table 16.7. In addition, there are operations that are available to some of these six models. When allowed, they have constant-time complexity. Table 16.8 lists these additional operations.

Table 16.8. Optional Sequence Requirements

Table 16.8 merits a comment or two. First, notice that a[n] and a.at(n) both return a reference to the nth element (numbering from 0) in a container. The difference between the two is that a.at(n) does bounds checking and throws an out_of_range exception if n is outside the valid range for the container. Next, you might wonder why, say, push_front() is defined for list and deque and not for vector. Suppose you want to insert a new value at the front of a vector of 100 elements. To make room, you have to move element 99 to position 100, and then you have to move element 98 to position 99, and so on. This is an operation with linear-time complexity because moving 100 elements would take 100 times as long as moving a single element. But the operations in Table 16.8 are supposed to be implemented only if they can be performed with constant-time complexity. The design for lists and double-ended queues, however, allows an element to be added to the front without moving the other elements to new locations, so they can implement push_front() with constant-time complexity. Figure 16.4 illustrates push_front() and push_back().
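The bounds checking done by at() can be sketched as follows. The function name at_throws is hypothetical. It reports whether v.at(n) throws an out_of_range exception, something that v[n] would never do because the subscript operator performs no check.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if v.at(n) throws out_of_range,
// false if n is a valid index.
bool at_throws(const std::vector<int> & v, std::size_t n)
{
    try
    {
        (void) v.at(n);          // checked access; the value is not needed
    }
    catch (const std::out_of_range &)
    {
        return true;             // n was outside the valid range
    }
    return false;
}
```

For a vector of three elements, at_throws(v, 2) returns false and at_throws(v, 3) returns true. Using v[3] instead would be undefined behavior rather than a reported error.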

Figure 16.4. push_front() and push_back().

Let’s take a closer look at the six sequence container types.

vector

You’ve already seen several examples using the vector template, which is declared in the vector header file. In brief, vector is a class representation of an array. The class provides automatic memory management that allows the size of a vector object to vary dynamically, growing and shrinking as elements are added or removed. It provides random access to elements. Elements can be added to or removed from the end in constant time, but insertion and removal from the beginning and the middle are linear-time operations.

In addition to being a sequence, a vector container is also a model of the reversible container concept. This adds two more class methods: rbegin() returns an iterator to the first element of the reversed sequence, and rend() returns a past-the-end iterator for the reversed sequence. So if dice is a vector container and Show(int) is a function that displays an integer, the following code displays the contents of dice first in forward order and then in reverse order:

for_each(dice.begin(), dice.end(), Show); // display in order

cout << endl;

for_each(dice.rbegin(), dice.rend(), Show); // display in reversed order

cout << endl;

The iterator returned by the two methods is of a class scope type reverse_iterator. Recall that incrementing such an iterator causes it to move through a reversible container in reverse order.

The vector template class is the simplest of the sequence types and is considered the type that should be used by default unless the program requirements are better satisfied by the particular virtues of the other types.

deque

The deque template class (declared in the deque header file) represents a double-ended queue, a type often called a deque (pronounced “deck”), for short. As implemented in the STL, it’s a lot like a vector container, supporting random access. The main difference is that inserting and removing items from the beginning of a deque object are constant-time operations instead of being linear-time operations the way they are for vector. So if most operations take place at the beginning and end of a sequence, you should consider using a deque data structure.

The goal of constant-time insertion and removal at both ends of a deque makes the design of a deque object more complex than that of a vector object. Thus, although both offer random access to elements and linear-time insertion and removal from the middle of a sequence, the vector container should allow faster execution of these operations.
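As a small sketch of these properties, the hypothetical function count_outward below grows a deque from both ends at once. It relies only on the standard push_front() and push_back() methods, and the caller can then reach any element with array notation.

```cpp
#include <deque>

// Hypothetical helper: builds the sequence -n ... -1 1 ... n by adding
// one value at each end of the deque on every pass through the loop.
std::deque<int> count_outward(int n)
{
    std::deque<int> d;
    for (int i = 1; i <= n; i++)
    {
        d.push_front(-i);   // constant-time insertion at the beginning
        d.push_back(i);     // constant-time insertion at the end
    }
    return d;
}
```

For example, count_outward(2) produces -2 -1 1 2, and the result supports random access, so d[1] is -1. Doing the same with a vector would require insert() at begin(), a linear-time operation on each pass.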

list

The list template class (declared in the list header file) represents a doubly linked list. Each element, other than the first and last, is linked to the item before it and the item following it, implying that a list can be traversed in both directions. The crucial difference between list and vector is that list provides for constant-time insertion and removal of elements at any location in the list. (Recall that the vector template provides linear-time insertion and removal except at the end, where it provides constant-time insertion and removal.) Thus, vector emphasizes rapid access via random access, whereas list emphasizes rapid insertion and removal of elements.

Like vector, list is a reversible container. Unlike vector, list does not support array notation and random access. Unlike a vector iterator, a list iterator remains pointing to the same element even after items are inserted into or removed from a container. For example, suppose you have an iterator pointing to the fifth element of a vector container. Then suppose you insert an element at the beginning of the container. All the other elements have to be moved to make room, so after the insertion, the fifth element now contains the value that used to be in the fourth element. Thus, the iterator at best points to the same location but to different data. If the insertion forces the vector to reallocate its storage, the iterator no longer points into the vector at all, which is why the language rules treat such an iterator as invalidated. Inserting a new element into a list, however, doesn’t move the existing elements; it just alters the link information. An iterator pointing to a certain item still points to the same item, but it may be linked to different items than before.
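The stability of list iterators can be sketched with a hypothetical function, fifth_after_front_insert. It takes an iterator to the fifth element, inserts a new value at the front of the list, and then reads through the old iterator. The function assumes that the list holds at least five elements.

```cpp
#include <iterator>
#include <list>

// Hypothetical helper: returns the value seen through an iterator to the
// fifth element after new_value is inserted at the front of the list.
// Assumes values holds at least five elements.
int fifth_after_front_insert(std::list<int> & values, int new_value)
{
    std::list<int>::iterator fifth = values.begin();
    std::advance(fifth, 4);                     // points to the fifth element
    values.insert(values.begin(), new_value);   // only link information changes
    return *fifth;                              // still the same element
}
```

For a list holding 10 20 30 40 50, the function returns 50 even though that element is now sixth in the list. The equivalent code for a vector would be an error, because the insertion invalidates the saved iterator.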

The list template class has some list-oriented member functions in addition to those that come with sequences and reversible containers. Table 16.9 lists many of them. (For a complete list of STL methods and functions, see Appendix G.) The Alloc template parameter is one you normally don’t have to worry about because it has a default value.

Table 16.9. Some list Member Functions

Listing 16.12 illustrates these methods, along with the insert() method, which comes with all STL classes that model sequences.

Listing 16.12. list.cpp

// list.cpp -- using a list

#include <iostream>

#include <list>

#include <iterator>

#include <algorithm>

void outint(int n) {std::cout << n << " ";}

int main()

{

using namespace std;

list<int> one(5, 2); // list of 5 2s

int stuff[5] = {1,2,4,8, 6};

list<int> two;

two.insert(two.begin(),stuff, stuff + 5 );

int more[6] = {6, 4, 2, 4, 6, 5};

list<int> three(two);

three.insert(three.end(), more, more + 6);

cout << "List one: ";

for_each(one.begin(),one.end(), outint);

cout << endl << "List two: ";

for_each(two.begin(), two.end(), outint);

cout << endl << "List three: ";