IdentifiantMot de passe
Loading...
Mot de passe oublié ?Je m'inscris ! (gratuit)

Thinking in C++ - Volume 2

Thinking in C++ - Volume 2

Date de publication : 25/01/2007 , Date de mise à jour : 25/01/2007


2.5. Generic Containers
2.5.1. Containers and iterators
2.5.1.1. STL reference documentation
2.5.2. A first look
2.5.2.1. Containers of strings
2.5.2.2. Inheriting from STL containers
2.5.3. A plethora of iterators
2.5.3.1. Iterators in reversible containers
2.5.3.2. Iterator categories
2.5.3.3. Predefined iterators
2.5.4. The basic sequences: vector, list, deque
2.5.4.1. Basic sequence operations
2.5.4.2. vector
2.5.4.3. deque
2.5.4.4. Converting between sequences
2.5.4.5. Checked random–access
2.5.4.6. list
2.5.4.7. Swapping sequences
2.5.5. set
2.5.5.1. A completely reusable tokenizer
2.5.6. stack
2.5.7. queue
2.5.8. Priority queues
2.5.9. Holding bits
2.5.9.1. bitset<n>
2.5.9.2. vector<bool>
2.5.10. Associative containers
2.5.10.1. Generators and fillers for associative containers
2.5.10.2. The magic of maps
2.5.10.3. Multimaps and duplicate keys
2.5.10.4. Multisets
2.5.11. Combining STL containers
2.5.12. Cleaning up containers of pointers
2.5.13. Creating your own containers
2.5.14. STL extensions
2.5.15. Non–STL containers  
2.5.16. Summary
2.5.17. Exercises


2.5. Generic Containers

Container classes are the solution to a specific kind of code reuse problem. They are building blocks used to create object–oriented programs, and they make the internals of a program much easier to construct.

A container class describes an object that holds other objects. Container classes are so important that they were considered fundamental to early object-oriented languages. In Smalltalk, for example, programmers think of the language as the program translator together with the class library, and a critical part of that library is the set of container classes. It became natural, therefore, for C++ compiler vendors to also include a container class library. You'll note that the vector is so useful that it was introduced in its simplest form early in Volume 1 of this book.

Like many other early C++ libraries, early container class libraries followed Smalltalk's object-based hierarchy, which worked well for Smalltalk, but turned out to be awkward and difficult to use in C++. Another approach was required.

The C++ approach to containers is based on templates. The containers in the Standard C++ library represent a broad range of data structures designed to work well with the standard algorithms and to meet common software development needs.


2.5.1. Containers and iterators

If you don't know how many objects you're going to need to solve a particular problem, or how long they will last, you also don't know ahead of time how to store those objects. How can you know how much space to create? The answer is you don't—until run time.

The solution to most problems in object-oriented design seems simple; you create another type of object. For the storage problem, the new type of object holds other objects or pointers to objects. This new type of object, which is typically referred to in C++ as a container (also called a collection in some languages), expands itself whenever necessary to accommodate everything you place inside it. You don't need to know ahead of time how many objects you're going to place in a container; you just create a container object and let it take care of the details.

Fortunately, a good object-oriented programming language comes with a set of containers. In C++, it's the Standard Template Library (STL). In some libraries, a generic container is considered good enough for all needs, and in others (C++ in particular) the library has different types of containers for different needs: a vector for efficient access to all elements, and a linked list for efficient insertion at all positions, and several more, so you can choose the particular type that fits your needs.

All containers have some way to put things in and get things out. The way you place something into a container is fairly obvious; there's a function called “push” or “add” or a similar name. The way you retrieve things from a container is not always as apparent; if an entity is array-like, such as a vector, you might be able to use an indexing operator or function. But in many situations this doesn't make sense. Also, a single-selection function is restrictive. What if you want to manipulate or compare a group of elements in the container?

The solution for flexible element access is the iterator, an object whose job is to select the elements within a container and present them to the user of the iterator. As a class, an iterator also provides a level of abstraction, which you can use to separate the details of the container from the code that's accessing that container. The container, via the iterator, is seen as a sequence. The iterator lets you traverse the sequence without worrying about the underlying structure—that is, whether it's a vector, a linked list, a set, or something else. This gives you the flexibility to easily change the underlying data structure without disturbing the code in your program that traverses the container. Separating iteration from the container's control also allows multiple simultaneous iterators.

From a design standpoint, all you really want is a sequence that can be manipulated to solve your problem. If a single type of sequence satisfied all your needs, there would be no reason to have different types. You need a choice of containers for two reasons. First, containers provide different types of interfaces and external behavior. A stack has an interface and a behavior that is different from that of a queue, which is different from that of a set or a list. One of these might provide a more flexible solution to your problem than the other, or it might provide a clearer abstraction that conveys your design intent. Second, different containers have different efficiencies for certain operations. Compare a vector to a list, for example. Both are simple sequences that can have nearly identical interfaces and external behaviors. But certain operations can have radically different costs. Randomly accessing elements in a vector is a constant-time operation; it takes the same amount of time regardless of the element you select. However, it is expensive to move through a linked list to randomly access an element, and it takes longer to find an element if it is farther down the list. On the other hand, if you want to insert an element in the middle of a sequence, it's cheaper with a list than with a vector. The efficiencies of these and other operations depend on the underlying structure of the sequence. In the design phase, you might start with a list and, when tuning for performance, change to a vector, or vice-versa. Because of iterators, code that merely traverses sequences is insulated from changes in the underlying sequence implementation.

Remember that a container is only a storage cabinet that holds objects. If that cabinet solves all your needs, it probably doesn't really matter how it is implemented. If you're working in a programming environment that has built-in overhead due to other factors, the cost difference between a vector and a linked list might not matter. You might need only one type of sequence. You can even imagine the “perfect” container abstraction, which can automatically change its underlying implementation according to the way it is used.(99)


2.5.1.1. STL reference documentation

As in the previous chapter, you will notice that this chapter does not contain exhaustive documentation describing each of the member functions in each STL container. Although we describe the member functions we use, we've left the full descriptions to others. We recommend the online resources available for the Dinkumware, Silicon Graphics, and STLPort STL implementations.(100)


2.5.2. A first look

Here's an example using the set class template, a container modeled after a traditional mathematical set and which does not accept duplicate values. The following set was created to work with ints:


//: C07:Intset.cpp
// Simple use of STL set.
#include <cassert>
#include <set>
using namespace std;

int main() {
  set<int> intset;
  for(int i = 0; i < 25; i++)
    for(int j = 0; j < 10; j++)
      // Try to insert duplicates:
      intset.insert(j);
  assert(intset.size() == 10);
} ///:~
The insert( ) member does all the work: it attempts to insert an element and ignores it if it's already there. Often the only activities involved in using a set are simply insertion and testing to see whether it contains the element. You can also form a union, an intersection, or a difference of sets and test to see if one set is a subset of another. In this example, the values 0–9 are inserted into the set 25 times, but only the 10 unique instances are accepted.

Now consider taking the form of Intset.cpp and modifying it to display a list of the words used in a document. The solution becomes remarkably simple.


//: C07:WordSet.cpp
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;

void wordSet(const char* fileName) {
  ifstream source(fileName);
  assure(source, fileName);
  string word;
  set<string> words;
  while(source >> word)
    words.insert(word);
  copy(words.begin(), words.end(),
    ostream_iterator<string>(cout,
"\n"));
  cout << "Number of unique words:"
       << words.size() << endl;
}

int main(int argc, char* argv[]) {
  if(argc > 1)
    wordSet(argv[1]);
  else
    wordSet("WordSet.cpp");
} ///:~
The only substantive difference here is that the set holds strings instead of integers. The words are pulled from a file, but the other operations are similar to those in Intset.cpp. Not only does the output reveal that duplicates have been ignored, but because of the way set is implemented, the words are automatically sorted.

A set is an example of an associative container, one of the three categories of containers provided by the Standard C++ library. The containers and their categories are summarized in the following table:

Category Containers
Sequence Containers vector, list, deque
Container Adaptors queue, stack, priority_queue
Associative Containers set, map, multiset, multimap
These categories represent different models that are used for different needs. The Sequence Containers simply organize their elements linearly, and are the most fundamental type of containers. For some problems, special properties need to be attached to these sequences, and that's exactly what the Container Adaptors do—they model abstractions such as a queue or stack. The associative containers organize their data based on keys, allowing for fast retrieval of that data.

All the containers in the standard library hold copies of the objects you place in them, and expand their resources as needed, so your objects must be copy-constructible (have an accessible copy constructor) and assignable (have an accessible assignment operator). The key difference between one container and another is the way the objects are stored in memory and what operations are available to the user.

A vector, as you already know, is a linear sequence that allows rapid random access to its elements. However, it's expensive to insert an element in the middle of a co-located sequence like a vector, just as it is with an array. A deque (double-ended-queue, pronounced “deck”) also allows random access that's nearly as fast as vector, but it's significantly faster when it needs to allocate new storage, and you can easily add new elements at the front as well as the back of the sequence. A list is a doubly linked list, so it's expensive to move around randomly but cheap to insert an element anywhere. Thus list, deque and vector are similar in their basic functionality (they all hold linear sequences), but different in the cost of their activities. For your first attempt at a program, you could choose any one and experiment with the others only if you're tuning for efficiency.

Many of the problems you set out to solve will only require a simple linear sequence such as a vector, deque, or list. All three have a member function push_back( ) that you use to insert a new element at the back of the sequence (deque and list also have push_front( ), which inserts elements at the beginning of the sequence).

But how do you retrieve the elements stored in a sequence container? With a vector or deque, it is possible to use the indexing operator[ ], but that doesn't work with list. You can use iterators on all three sequences to access elements. Each container provides the appropriate type of iterator for accessing its elements.

Even though the containers hold objects by value (that is, they hold copies of whole objects), sometimes you'll want to store pointers so that you can refer to objects from a hierarchy and thus take advantage of the polymorphic behavior of the classes represented. Consider the classic “shape” example where shapes have a set of common operations, and you have different types of shapes. Here's what it looks like using the STL vector to hold pointers to various Shape types created on the heap:


//: C07:Stlshape.cpp
// Simple shapes using the STL.
#include <vector>
#include <iostream>
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual ~Shape() {};
};

class Circle : public Shape {
public:
  void draw() { cout << "Circle::draw”
<< endl; }
  ~Circle() { cout << "~Circle” <<
endl; }
};

class Triangle : public Shape {
public:
  void draw() { cout << "Triangle::draw”
<< endl; }
  ~Triangle() { cout << "~Triangle” <<
endl; }
};

class Square : public Shape {
public:
  void draw() { cout << "Square::draw”
<< endl; }
  ~Square() { cout << "~Square” <<
endl; }
};

int main() {
  typedef std::vector<Shape*> Container;
  typedef Container::iterator Iter;
  Container shapes;
  shapes.push_back(new Circle);
  shapes.push_back(new Square);
  shapes.push_back(new Triangle);
  for(Iter i = shapes.begin(); i != shapes.end(); i++)
    (*i)->draw();
  // ... Sometime later:
  for(Iter j = shapes.begin(); j != shapes.end(); j++)
    delete *j;
} ///:~
The creation of Shape, Circle, Square, and Triangle should be fairly familiar. Shape is an abstract base class (because of the pure specifier =0) that defines the interface for all types of Shapes. The derived classes override the virtual function draw( ) to perform the appropriate operation. Now we'd like to create a bunch of different types of Shape objects, and the natural place to store them is in an STL container. For convenience, this typedef:


typedef std::vector<Shape*> Container;
creates an alias for a vector of Shape*, and this typedef:


typedef Container::iterator Iter;
uses that alias to create another one, for vector<Shape*>::iterator. Notice that the container type name must be used to produce the appropriate iterator, which is defined as a nested class. Although there are different types of iterators (forward, bidirectional, random, and so on), they all have the same basic interface: you can increment them with ++, you can dereference them to produce the object they're currently selecting, and you can test them to see if they're at the end of the sequence. That's what you'll want to do 90 percent of the time. And that's what is done in the previous example: after a container is created, it's filled with different types of Shape pointers. Notice that the upcast happens as the Circle, Square, or Rectangle pointer is added to the Shapes container, which doesn't know about those specific types but instead holds only Shape*. As soon as the pointer is added to the container, it loses its specific identity and becomes an anonymous Shape*. This is exactly what we want: toss them all in and let polymorphism sort it out.

The first for loop creates an iterator and sets it to the beginning of the sequence by calling the begin( ) member function for the container. All containers have begin( ) and end( ) member functions that produce an iterator selecting, respectively, the beginning of the sequence and one past the end of the sequence. To test to see if you're done, you make sure the iterator is not equal to the iterator produced by end( ); don't use < or <=. The only tests that work are != and ==, so it's common to write a loop like:


for(Iter i = shapes.begin(); i != shapes.end(); i++)
This says “take me through every element in the sequence.”

What do you do with the iterator to produce the element it's selecting? You dereference it using (what else?) the ‘*' (which is actually an overloaded operator). What you get back is whatever the container is holding. This container holds Shape*, so that's what *i produces. If you want to call a Shape member function, you must do so with the -> operator, so you write the line:


(*i)->draw();
This calls the draw( ) function for the Shape* the iterator is currently selecting. The parentheses are ugly but necessary to produce the desired operator precedence.

As they are destroyed or in other cases where the pointers are removed, the STL containers do not automatically call delete for the pointers they contain. If you create an object on the heap with new and place its pointer in a container, the container can't tell if that pointer is also placed inside another container, nor if it refers to heap memory in the first place. As always, you are responsible for managing your own heap allocations. The last lines in the program move through and delete every object in the container so that proper cleanup occurs. The easiest and safest way to handle pointers in containers is to use smart pointers. It should be noted that auto_ptr can't be used for this purpose, so you will need to look outside of the C++ Standard Library for a suitable smart pointers.(101)

You can change the type of container that this program uses with two lines. Instead of including <vector>, you include <list>, and in the first typedef you say:


typedef std::list<Shape*> Container;
instead of using a vector. Everything else goes untouched. This is possible not because of an interface enforced by inheritance (there is little inheritance in the STL), but because the interface is enforced by a convention adopted by the designers of the STL, precisely so you could perform this kind of interchange. Now you can easily switch between vector and list or any other container that supports the same interface (both syntactically and semantically) and see which one works fastest for your needs.


2.5.2.1. Containers of strings

In the previous example, at the end of main( ) it was necessary to move through the whole list and delete all the Shape pointers:


for(Iter j = shapes.begin(); j != shapes.end(); j++)
  delete *j;
STL containers make sure that each object they contain has its destructor called when the container itself is destroyed. Pointers, however, have no destructor, so we have to delete them ourselves.

This highlights what could be seen as an oversight in the STL: there's no facility in any of the STL containers to automatically delete the pointers they contain, so you must do it manually. It's as if the assumption of the STL designers was that containers of pointers weren't an interesting problem, but that's not the case.

Automatically deleting a pointer turns out to be problematic because of the multiple membership problem. If a container holds a pointer to an object, it's not unlikely that pointer could also be in another container. A pointer to an Aluminum object in a list of Trash pointers could also reside in a list of Aluminum pointers. If that happens, which list is responsible for cleaning up that object—that is, which list “owns” the object?

This question is virtually eliminated if the object rather than a pointer resides in the list. Then it seems clear that when the list is destroyed, the objects it contains must also be destroyed. Here, the STL shines, as you can see when creating a container of string objects. The following example stores each incoming line as a string in a vector<string>:


//: C07:StringVector.cpp
// A vector of strings.
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  const char* fname = "StringVector.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  vector<string> strings;
  string line;
  while(getline(in, line))
    strings.push_back(line);
  // Do something to the strings...
  int i = 1;
  vector<string>::iterator w;
  for(w = strings.begin(); w !=
strings.end(); w++) {
    ostringstream ss;
    ss << i++;
    *w = ss.str() + ": " + *w;
  }
  // Now send them out:
  copy(strings.begin(), strings.end(),
    ostream_iterator<string>(cout,
"\n"));
  // Since they aren't pointers, string
  // objects clean themselves up!
} ///:~
Once the vector<string> called strings is created, each line in the file is read into a string and put in the vector:


  while(getline(in, line))
    strings.push_back(line);
The operation that's being performed on this file is to add line numbers. A stringstream provides easy conversion from an int to a string of characters representing that int.

Assembling string objects is quite easy, since operator+ is overloaded. Sensibly enough, the iterator w can be dereferenced to produce a string that can be used as both an rvalue and an lvalue:


*w = ss.str() + ": " + *w;
You may be surprised that you can assign back into the container via the iterator, but it's a tribute to the careful design of the STL.

Because the vector<string> contains the objects, two things are worthy of note. First, as explained before, you don't need to explicitly clean up the string objects. Even if you put addresses of the string objects as pointers into other containers, it's clear that strings is the “master list” and maintains ownership of the objects.

Second, you are effectively using dynamic object creation, and yet you never use new or delete! It's all taken care of for you by the vector because it stores copies of the objects you give it. Thus your coding is significantly cleaned up.


2.5.2.2. Inheriting from STL containers

The power of instantly creating a sequence of elements is amazing, and it makes you realize how much time you may have lost in the past solving this particular problem. For example, many utility programs involve reading a file into memory, modifying the file, and writing it back out to disk. You might as well take the functionality in StringVector.cpp and package it into a class for later reuse.

Now the question is: do you create a member object of type vector, or do you inherit? A general object-oriented design guideline is to prefer composition (member objects) over inheritance, but the standard algorithms expect sequences that implement a particular interface, so inheritance is often necessary.


//: C07:FileEditor.h
// A file editor tool.
#ifndef FILEEDITOR_H
#define FILEEDITOR_H
#include <iostream>
#include <string>
#include <vector>

class FileEditor : public
std::vector<std::string> {
public:
  void open(const char* filename);
  FileEditor(const char* filename) { open(filename); }
  FileEditor() {};
  void write(std::ostream& out = std::cout);
};
#endif // FILEEDITOR_H ///:~
The constructor opens the file and reads it into the FileEditor, and write( ) puts the vector of string onto any ostream. Notice in write( ) that you can have a default argument for the reference.

The implementation is quite simple:


//: C07:FileEditor.cpp {O}
#include "FileEditor.h"
#include <fstream>
#include "../require.h"
using namespace std;

void FileEditor::open(const char* filename) {
  ifstream in(filename);
  assure(in, filename);
  string line;
  while(getline(in, line))
    push_back(line);
}

// Could also use copy() here:
void FileEditor::write(ostream& out) {
  for(iterator w = begin(); w != end(); w++)
    out << *w << endl;
} ///:~
The functions from StringVector.cpp are simply repackaged. Often this is the way classes evolve—you start by creating a program to solve a particular application and then discover some commonly used functionality within the program that can be turned into a class.

The line-numbering program can now be rewritten using FileEditor:


//: C07:FEditTest.cpp
//{L} FileEditor
// Test the FileEditor tool.
#include <sstream>
#include "FileEditor.h"
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  FileEditor file;
  if(argc > 1) {
    file.open(argv[1]);
  } else {
    file.open("FEditTest.cpp");
  }
  // Do something to the lines...
  int i = 1;
  FileEditor::iterator w = file.begin();
  while(w != file.end()) {
    ostringstream ss;
    ss << i++;
    *w = ss.str() + ": " + *w;
    ++w;
  }
  // Now send them to cout:
  file.write();
} ///:~
Now the operation of reading the file is in the constructor:


FileEditor file(argv[1]);
(or in the open( ) member function), and writing happens in the single line (which defaults to sending the output to cout):


file.write();
The bulk of the program is involved with modifying the file in memory.


2.5.3. A plethora of iterators

An iterator is an abstraction for genericity. It works with different types of containers without knowing the underlying structure of those containers. Most containers support iterators,(102) so you can say:


<ContainerType>::iterator
<ContainerType>::const_iterator
to produce the iterator types for a container. Every container has a begin( ) member function that produces an iterator indicating the beginning of the elements in the container, and an end( ) member function that produces an iterator which is the past-the-end marker of the container. If the container is const¸ begin( ) and end( ) produce const iterators, which disallow changing the elements pointed to (because the appropriate operators are const).

All iterators can advance within their sequence (via operator++) and allow == and != comparisons. Thus, to move an iterator it forward without running it off the end, you say something like:


while(it != pastEnd) {
  // Do something
  ++it;
}
where pastEnd is the past-the-end marker produced by the container's end( ) member function.

An iterator can be used to produce the container element that it is currently selecting via the dereferencing operator (operator*). This can take two forms. If it is an iterator traversing a container, and f( ) is a member function of the type of objects held in the container, you can say either:


(*it).f();
or


it->f();
Knowing this, you can create a template that works with any container. Here, the apply( ) function template calls a member function for every object in the container, using a pointer to member that is passed as an argument:


//: C07:Apply.cpp
// Using simple iteration.
#include <iostream>
#include <vector>
#include <iterator>
using namespace std;

template<class Cont, class PtrMemFun>
void apply(Cont& c, PtrMemFun f) {
  typename Cont::iterator it = c.begin();
  while(it != c.end()) {
    ((*it).*f)(); // Alternate form
    ++it;
  }
}

class Z {
  int i;
public:
  Z(int ii) : i(ii) {}
  void g() { ++i; }
  friend ostream& operator<<(ostream& os,
const Z& z) {
    return os << z.i;
  }
};

int main() {
  ostream_iterator<Z> out(cout, " ");
  vector<Z> vz;
  for(int i = 0; i < 10; i++)
    vz.push_back(Z(i));
  copy(vz.begin(), vz.end(), out);
  cout << endl;
  apply(vz, &Z::g);
  copy(vz.begin(), vz.end(), out);
} ///:~
You can't use operator-> here because the resulting statement would be:


(it->*f)();
which attempts to use the iterator's operator->*, which is not provided by the iterator classes.(103)

It is much easier to use either for_each( ) or transform( ) to apply functions to sequences, as you saw in the previous chapter.


2.5.3.1. Iterators in reversible containers

A container may also be reversible, which means that it can produce iterators that move backward from the end, as well as iterators that move forward from the beginning. All standard containers support such bidirectional iteration.

A reversible container has the member functions rbegin( ) (to produce a reverse_iterator selecting the end) and rend( ) (to produce a reverse_iterator indicating “one past the beginning”). If the container is const, rbegin( ) and rend( ) will produce const_reverse_iterators.

The following example uses vector but will work with all containers that support iteration:


//: C07:Reversible.cpp
// Using reversible containers.
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "../require.h"
using namespace std;

int main() {
  ifstream in("Reversible.cpp");
  assure(in, "Reversible.cpp");
  string line;
  vector<string> lines;
  while(getline(in, line))
    lines.push_back(line);
  for(vector<string>::reverse_iterator r =
lines.rbegin();
      r != lines.rend(); r++)
    cout << *r << endl;
} ///:~
You move backward through the container using the same syntax as you do when moving forward through a container with an ordinary iterator.


2.5.3.2. Iterator categories

The iterators in the Standard C++ library are classified into “categories” that describe their capabilities. The order in which they are generally described moves from the categories with the most restricted behavior to those with the most powerful behavior.

Input: read–only, one pass

The only predefined implementations of input iterators are istream_iterator and istreambuf_iterator, to read from an istream. As you can imagine, an input iterator can only be dereferenced once for each element that's selected, just as you can only read a particular portion of an input stream once. They can only move forward. A special constructor defines the past-the-end value. In summary, you can dereference it for reading (once only for each value) and move it forward.

Output: write–only, one pass

This is the complement of an input iterator, but for writing rather than reading. The only predefined implementations of output iterators are ostream_iterator and ostreambuf_iterator, to write to an ostream, and the less commonly used raw_storage_iterator. Again, these can only be dereferenced once for each written value, and they can only move forward. There is no concept of a terminal past-the-end value for an output iterator. Summarizing, you can dereference it for writing (once only for each value) and move it forward.

Forward: multiple read/write

The forward iterator contains all the functionality of both the input iterator and the output iterator, plus you can dereference an iterator location multiple times, so you can read and write to a value multiple times. As the name implies, you can only move forward. There are no predefined iterators that are only forward iterators.

Bidirectional: operator––

The bidirectional iterator has all the functionality of the forward iterator, and in addition it can be moved backward one location at a time using operator--. The iterators returned by the list container are bidirectional.

Random–access: like a pointer

Finally, the random-access iterator has all the functionality of the bidirectional iterator plus all the functionality of a pointer (a pointer is a random-access iterator), except that there is no “null” iterator analogue to a null pointer. Basically, anything you can do with a pointer you can do with a random-access iterator, including indexing with operator[ ], adding integral values to a pointer to move it forward or backward by a number of locations, or comparing one iterator to another with comparison operators.

Is this really important?

Why do you care about this categorization? When you're just using containers in a straightforward way (for example, just hand-coding all the operations you want to perform on the objects in the container), it usually doesn't matter. Things either work or they don't. The iterator categories become important when:

  1. You use some of the fancier built-in iterator types that will be demonstrated shortly, or you “graduate” to creating your own iterators (demonstrated later in this chapter).
  2. You use the STL algorithms (the subject of the previous chapter). Each of the algorithms places requirements on its iterators. Knowledge of the iterator categories is even more important when you create your own reusable algorithm templates, because the iterator category required by your algorithm determines how flexible the algorithm will be. If you require only the most primitive iterator category (input or output), your algorithm will work with everything (copy( ) is an example of this).
An iterator's category is identified by a hierarchy of iterator tag classes. The class names correspond to the iterator categories, and their derivation reflects the relationship between them:


struct input_iterator_tag {};
struct output_iterator_tag {};
struct forward_iterator_tag :
  public input_iterator_tag {};
struct bidirectional_iterator_tag :
  public forward_iterator_tag {};
struct random_access_iterator_tag :
  public bidirectional_iterator_tag {};
The class forward_iterator_tag derives only from input_iterator_tag, not from output_iterator_tag, because we need to have past-the-end iterator values in algorithms that use forward iterators, but algorithms that use output iterators always assume that operator* can be dereferenced. For this reason, it is important to make sure that a past-the-end value is never passed to an algorithm that expects an output iterator.

For efficiency, certain algorithms provide different implementations for different iterator types, which they infer from the iterator tag defined by the iterator. We will use some of these tag classes later in this chapter when we define our own iterator types.


2.5.3.3. Predefined iterators

The STL has a predefined set of iterators that can be quite handy. For example, you've already seen the reverse_iterator objects produced by calling rbegin( ) and rend( ) for all the basic containers.

The insertion iterators are necessary because some of the STL algorithms—copy( ), for example—use the assignment operator= to place objects in the destination container. This is a problem when you're using the algorithm to fill the container rather than to overwrite items that are already in the destination container—that is, when the space isn't already there. What the insert iterators do is change the implementation of operator= so that instead of doing an assignment, it calls a “push” or “insert” function for that container, thus causing it to allocate new space. The constructors for both back_insert_iterator and front_insert_iterator take a basic sequence container object (vector, deque or list) as their argument and produce an iterator that calls push_back( ) or push_front( ), respectively, to perform assignment. The helper functions back_inserter( ) and front_inserter( ) produce these insert-iterator objects with a little less typing. Since all the basic sequence containers support push_back( ), you will probably find yourself using back_inserter( ) with some regularity.

An insert_iterator lets you insert elements in the middle of the sequence, again replacing the meaning of operator=, but this time by automatically calling insert( ) instead of one of the “push” functions. The insert( ) member function requires an iterator indicating the place to insert before, so the insert_iterator requires this iterator in addition to the container object. The shorthand function inserter( ) produces the same object.

The following example shows the use of the different types of inserters:


//: C07:Inserters.cpp
// Different types of iterator inserters.
#include <iostream>
#include <vector>
#include <deque>
#include <list>
#include <iterator>
using namespace std;

int a[] = { 1, 3, 5, 7, 11, 13, 17, 19, 23 };

template<class Cont> void
frontInsertion(Cont& ci) {
  copy(a, a + sizeof(a)/sizeof(Cont::value_type),
    front_inserter(ci));
  copy(ci.begin(), ci.end(),
    ostream_iterator<typename Cont::value_type>(
    cout, " "));
  cout << endl;
}

template<class Cont> void backInsertion(Cont&
ci) {
  copy(a, a + sizeof(a)/sizeof(Cont::value_type),
    back_inserter(ci));
  copy(ci.begin(), ci.end(),
    ostream_iterator<typename Cont::value_type>(
    cout, " "));
  cout << endl;
}

template<class Cont> void midInsertion(Cont&
ci) {
  typename Cont::iterator it = ci.begin();
  ++it; ++it; ++it;
  copy(a, a + sizeof(a)/(sizeof(Cont::value_type) * 2),
    inserter(ci, it));
  copy(ci.begin(), ci.end(),
    ostream_iterator<typename Cont::value_type>(
    cout, " "));
  cout << endl;
}

int main() {
  deque<int> di;
  list<int>  li;
  vector<int> vi;
  // Can't use a front_inserter() with vector
  frontInsertion(di);
  frontInsertion(li);
  di.clear();
  li.clear();
  backInsertion(vi);
  backInsertion(di);
  backInsertion(li);
  midInsertion(vi);
  midInsertion(di);
  midInsertion(li);
} ///:~
Since vector does not support push_front( ), it cannot produce a front_insert_iterator. However, you can see that vector does support the other two types of insertions (even though, as you shall see later, insert( ) is not an efficient operation for vector). Note the use of the nested type Cont::value_type instead of hard-coding int.

More on stream iterators

We introduced the use of the stream iterators ostream_iterator (an output iterator) and istream_iterator (an input iterator) in conjunction with copy( ) in the previous chapter. Remember that an output stream doesn't have any concept of an “end,” since you can always just keep writing more elements. However, an input stream eventually terminates (for example, when you reach the end of a file), so you need a way to represent that. An istream_iterator has two constructors, one that takes an istream and produces the iterator you actually read from, and the other which is the default constructor and produces an object that is the past-the-end sentinel. In the following program this object is named end:


//: C07:StreamIt.cpp
// Iterators for istreams and ostreams.
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include "../require.h"
using namespace std;

int main() {
  ifstream in("StreamIt.cpp");
  assure(in, "StreamIt.cpp");
  istream_iterator<string> begin(in), end;
  ostream_iterator<string> out(cout,
"\n");
  vector<string> vs;
  copy(begin, end, back_inserter(vs));
  copy(vs.begin(), vs.end(), out);
  *out++ = vs[0];
  *out++ = "That's all, folks!";
} ///:~
When in runs out of input (in this case when the end of the file is reached), init becomes equivalent to end, and the copy( ) terminates.

Because out is an ostream_iterator<string>, you can simply assign any string object to the dereferenced iterator using operator=, and that string will be placed on the output stream, as seen in the two assignments to out. Because out is defined with a newline as its second argument, these assignments also insert a newline along with each assignment.

Although it is possible to create an istream_iterator<char> and ostream_iterator<char>, these actually parse the input and thus will, for example, automatically eat whitespace (spaces, tabs, and newlines), which is not desirable if you want to manipulate an exact representation of an istream. Instead, you can use the special iterators istreambuf_iterator and ostreambuf_iterator, which are designed strictly to move characters.(104) Although these are templates, they are meant to be used with template arguments of either char or wchar_t.(105) The following example lets you compare the behavior of the stream iterators with the streambuf iterators:


//: C07:StreambufIterator.cpp
// istreambuf_iterator & ostreambuf_iterator.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include "../require.h"
using namespace std;

int main() {
  ifstream in("StreambufIterator.cpp");
  assure(in, "StreambufIterator.cpp");
  // Exact representation of stream:
  istreambuf_iterator<char> isb(in), end;
  ostreambuf_iterator<char> osb(cout);
  while(isb != end)
    *osb++ = *isb++; // Copy 'in' to cout
  cout << endl;
  ifstream in2("StreambufIterator.cpp");
  // Strips white space:
  istream_iterator<char> is(in2), end2;
  ostream_iterator<char> os(cout);
  while(is != end2)
    *os++ = *is++;
  cout << endl;
} ///:~
The stream iterators use the parsing defined by istream::operator>>, which is probably not what you want if you are parsing characters directly—it's fairly rare that you want all the whitespace stripped out of your character stream. You'll virtually always want to use a streambuf iterator when using characters and streams, rather than a stream iterator. In addition, istream::operator>> adds significant overhead for each operation, so it is only appropriate for higher-level operations such as parsing numbers.(106)

Manipulating raw storage

The raw_storage_iterator is defined in <memory> and is an output iterator. It is provided to enable algorithms to store their results in uninitialized memory. The interface is quite simple: the constructor takes an output iterator that is pointing to the raw memory (typically a pointer), and the operator= assigns an object into that raw memory. The template parameters are the type of the output iterator pointing to the raw storage and the type of object that will be stored. Here's an example that creates Noisy objects, which print trace statements for their construction, assignment, and destruction (we'll show the Noisy class definition later):


//: C07:RawStorageIterator.cpp {-bor}
// Demonstrate the raw_storage_iterator.
//{L} Noisy
#include <iostream>
#include <iterator>
#include <algorithm>
#include "Noisy.h"
using namespace std;

int main() {
  const int QUANTITY = 10;
  // Create raw storage and cast to desired type:
  Noisy* np = reinterpret_cast<Noisy*>(
    new char[QUANTITY * sizeof(Noisy)]);
  raw_storage_iterator<Noisy*, Noisy> rsi(np);
  for(int i = 0; i < QUANTITY; i++)
    *rsi++ = Noisy(); // Place objects in storage
  cout << endl;
  copy(np, np + QUANTITY,
    ostream_iterator<Noisy>(cout, "
"));
  cout << endl;
  // Explicit destructor call for cleanup:
  for(int j = 0; j < QUANTITY; j++)
    (&np[j])->~Noisy();
  // Release raw storage:
  delete reinterpret_cast<char*>(np);
} ///:~
To make the raw_storage_iterator template happy, the raw storage must be of the same type as the objects you're creating. That's why the pointer from the new array of char is cast to a Noisy*. The assignment operator forces the objects into the raw storage using the copy-constructor. Note that the explicit destructor call must be made for proper cleanup, and this also allows the objects to be deleted one at a time during container manipulation. The expression delete np would be invalid anyway since the static type of a pointer in a delete expression must be the same as the type assigned to in the new expression.


2.5.4. The basic sequences: vector, list, deque

Sequences keep objects in whatever order you store them. They differ in the efficiency of their operations, however, so if you are going to manipulate a sequence in a particular fashion, choose the appropriate container for those types of manipulations. So far in this book we've been using vector as the container of choice. This is quite often the case in applications. When you start making more sophisticated uses of containers, however, it becomes important to know more about their underlying implementations and behavior so that you can make the right choices.


2.5.4.1. Basic sequence operations

Using a template, the following example shows the operations supported by all the basic sequences: vector, deque, and list:


//: C07:BasicSequenceOperations.cpp
// The operations available for all the
// basic sequence Containers.
#include <deque>
#include <iostream>
#include <list>
#include <vector>
using namespace std;

template<typename Container>
void print(Container& c, char* title =
"") {
  cout << title << ':' << endl;
  if(c.empty()) {
    cout << "(empty)" << endl;
    return;
  }
  typename Container::iterator it;
  for(it = c.begin(); it != c.end(); it++)
    cout << *it << " ";
  cout << endl;
  cout << "size() "      <<
c.size()
       << " max_size() " <<
c.max_size()
       << " front() "    <<
c.front()
       << " back() "     <<
c.back()
       << endl;
}

template<typename ContainerOfInt> void
basicOps(char* s) {
  cout << "------- " << s
<< " -------" << endl;
  typedef ContainerOfInt Ci;
  Ci c;
  print(c, "c after default constructor");
  Ci c2(10, 1); // 10 elements, values all 1
  print(c2, "c2 after constructor(10,1)");
  int ia[] = { 1, 3, 5, 7, 9 };
  const int IASZ = sizeof(ia)/sizeof(*ia);
  // Initialize with begin & end iterators:
  Ci c3(ia, ia + IASZ);
  print(c3, "c3 after
constructor(iter,iter)");
  Ci c4(c2); // Copy-constructor
  print(c4, "c4 after copy-constructor(c2)");
  c = c2; // Assignment operator
  print(c, "c after operator=c2");
  c.assign(10, 2); // 10 elements, values all 2
  print(c, "c after assign(10, 2)");
  // Assign with begin & end iterators:
  c.assign(ia, ia + IASZ);
  print(c, "c after assign(iter, iter)");
  cout << "c using reverse iterators:"
<< endl;
  typename Ci::reverse_iterator rit = c.rbegin();
  while(rit != c.rend())
    cout << *rit++ << " ";
  cout << endl;
  c.resize(4);
  print(c, "c after resize(4)");
  c.push_back(47);
  print(c, "c after push_back(47)");
  c.pop_back();
  print(c, "c after pop_back()");
  typename Ci::iterator it = c.begin();
  ++it; ++it;
  c.insert(it, 74);
  print(c, "c after insert(it, 74)");
  it = c.begin();
  ++it;
  c.insert(it, 3, 96);
  print(c, "c after insert(it, 3, 96)");
  it = c.begin();
  ++it;
  c.insert(it, c3.begin(), c3.end());
  print(c, "c after insert("
    "it, c3.begin(), c3.end())");
  it = c.begin();
  ++it;
  c.erase(it);
  print(c, "c after erase(it)");
  typename Ci::iterator it2 = it = c.begin();
  ++it;
  ++it2; ++it2; ++it2; ++it2; ++it2;
  c.erase(it, it2);
  print(c, "c after erase(it, it2)");
  c.swap(c2);
  print(c, "c after swap(c2)");
  c.clear();
  print(c, "c after clear()");
}

int main() {
  basicOps<vector<int>
>("vector");
  basicOps<deque<int> >("deque");
  basicOps<list<int> >("list");
} ///:~
The first function template, print( ), demonstrates the basic information you can get from any sequence container: whether it's empty, its current size, the size of the largest possible container, the element at the beginning, and the element at the end. You can also see that every container has begin( ) and end( ) member functions that return iterators.

The basicOps( ) function tests everything else (and in turn calls print( )), including a variety of constructors: default, copy-constructor, quantity and initial value, and beginning and ending iterators. There are an assignment operator= and two kinds of assign( ) member functions. One takes a quantity and an initial value, and the other takes a beginning and ending iterator.

All the basic sequence containers are reversible containers, as shown by the use of the rbegin( ) and rend( ) member functions. A sequence container can be resized, and the entire contents of the container can be removed with clear( ). When you call resize( ) to expand a sequence, the new elements use the default constructor of the type of element in the sequence, or if they are built-in types, they are zero-initialized.

Using an iterator to indicate where you want to start inserting into any sequence container, you can insert( ) a single element, a number of elements that all have the same value, and a group of elements from another container using the beginning and ending iterators of that group.

To erase( ) a single element from the middle, use an iterator; to erase( ) a range of elements, use a pair of iterators. Notice that since a list supports only bidirectional iterators, all the iterator motion must be performed with increments and decrements. (If the containers were limited to vector and deque, which produce random-access iterators, operator+ and operator- could have been used to move the iterators in bigger jumps.)

Although both list and deque support push_front( ) and pop_front( ), vector does not, but push_back( ) and pop_back( ) work with all three.

The naming of the member function swap( ) is a little confusing, since there's also a nonmember swap( ) algorithm that interchanges the values of any two objects of same type. The member swap( ) swaps everything in one container for another (if the containers hold the same type), effectively swapping the containers themselves. It does this efficiently by swapping the contents of each container, which consists mostly of pointers. The nonmember swap( ) algorithm normally uses assignment to interchange its arguments (an expensive operation for an entire container), but it is customized through template specialization to call the member swap( ) for the standard containers. There is also an iter_swap algorithm that uses iterators to interchange two elements in the same container.

The following sections discuss the particulars of each type of sequence container.


2.5.4.2. vector

The vector class template is intentionally made to look like a souped-up array, since it has array-style indexing, but also can expand dynamically. The vector class template is so fundamentally useful that it was introduced in a primitive way early in this book and was used regularly in previous examples. This section will give a more in-depth look at vector.

To achieve maximally-efficient indexing and iteration, vector maintains its storage as a single contiguous array of objects. This is a critical point to observe in understanding the behavior of vector. It means that indexing and iteration are lightning-fast, being basically the same as indexing and iterating over an array of objects. But it also means that inserting an object anywhere but at the end (that is, appending) is not really an acceptable operation for a vector. In addition, when a vector runs out of preallocated storage, to maintain its contiguous array it must allocate a whole new (larger) chunk of storage elsewhere and copy the objects to the new storage. This approach produces a number of unpleasant side-effects.

Cost of overflowing allocated storage

A vector starts by grabbing a block of storage, as if it's taking a guess at how many objects you plan to put in it. As long as you don't try to put in more objects than can be held in the initial block of storage, everything proceeds rapidly. (If you do know how many objects to expect, you can preallocate storage using reserve( ).) But eventually you will put in one too many objects, and the vector responds by:

  1. Allocating a new, bigger piece of storage.
  2. Copying all the objects from the old storage to the new (using the copy-constructor).
  3. Destroying all the old objects (the destructor is called for each one).
  4. Releasing the old memory.
For complex objects, this copy-construction and destruction can end up being expensive if you often overfill your vector, which is why vectors (and STL containers in general) are designed for value types (i.e. types that are cheap to copy). This includes pointers.

To see what happens when you're filling a vector, here is the Noisy class mentioned earlier. It prints information about its creations, destructions, assignments, and copy-constructions:


//: C07:Noisy.h
// A class to track various object activities.
#ifndef NOISY_H
#define NOISY_H
#include <iostream>
using std::endl;
using std::cout;
using std::ostream;

class Noisy {
  static long create, assign, copycons, destroy;
  long id;
public:
  Noisy() : id(create++) {
    cout << "d[" << id <<
"]" << endl;
  }
  Noisy(const Noisy& rv) : id(rv.id) {
    cout << "c[" << id <<
"]" << endl;
    ++copycons;
  }
  Noisy& operator=(const Noisy& rv) {
    cout << "(" << id <<
")=[" << rv.id << "]" << endl;
    id = rv.id;
    ++assign;
    return *this;
  }
  friend bool operator<(const Noisy& lv, const
Noisy& rv) {
    return lv.id < rv.id;
  }
  friend bool operator==(const Noisy& lv,const
Noisy& rv) {
    return lv.id == rv.id;
  }
  ~Noisy() {
    cout << "~[" << id <<
"]" << endl;
    ++destroy;
  }
  friend ostream& operator<<(ostream& os,
const Noisy& n) {
    return os << n.id;
  }
  friend class NoisyReport;
};

struct NoisyGen {
  Noisy operator()() { return Noisy(); }
};

// A Singleton. Will automatically report the
// statistics as the program terminates:
class NoisyReport {
  static NoisyReport nr;
  NoisyReport() {} // Private constructor
  NoisyReport & operator=(NoisyReport &);  //
Disallowed
  NoisyReport(const NoisyReport&);         //
Disallowed
public:
  ~NoisyReport() {
    cout << "\n-------------------\n"
         << "Noisy creations: "
<< Noisy::create
         << "\nCopy-Constructions: "
<< Noisy::copycons
         << "\nAssignments: " <<
Noisy::assign
         << "\nDestructions: " <<
Noisy::destroy << endl;
  }
};
#endif // NOISY_H ///:~


//: C07:Noisy.cpp {O}
#include "Noisy.h"
long Noisy::create = 0, Noisy::assign = 0,
  Noisy::copycons = 0, Noisy::destroy = 0;
NoisyReport NoisyReport::nr;
///:~
Each Noisy object has its own identifier, and static variables keep track of all the creations, assignments (using operator=), copy-constructions, and destructions. The id is initialized using the create counter inside the default constructor; the copy-constructor and assignment operator take their id values from the rvalue. With operator= the lvalue is already an initialized object, so the old value of id is printed before it is overwritten with the id from the rvalue.

To support certain operations such as sorting and searching (which are used implicitly by some of the containers), Noisy must have an operator< and operator==. These simply compare the id values. The ostream inserter follows the usual form and simply prints the id.

Objects of type NoisyGen are function objects (since there is an operator( )) that produce Noisy objects during testing.

NoisyReport is a Singleton object(107) because we only want one report printed at program termination. It has a private constructor so no additional NoisyReport objects can be created, it disallows assignment and copy-construction, and it has a single static instance of NoisyReport called nr. The only executable statements are in the destructor, which is called as the program exits and static destructors are called. This destructor prints the statistics captured by the static variables in Noisy.

Using Noisy.h, the following program shows a vector overflowing its allocated storage:


//: C07:VectorOverflow.cpp {-bor}
// Shows the copy-construction and destruction
// that occurs when a vector must reallocate.
//{L} Noisy
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include "Noisy.h"
using namespace std;

int main(int argc, char* argv[]) {
  int size = 1000;
  if(argc >= 2) size = atoi(argv[1]);
  vector<Noisy> vn;
  Noisy n;
  for(int i = 0; i < size; i++)
    vn.push_back(n);
  cout << "\n cleaning up  << endl;
} ///:~
You can use the default value of 1000, or you can use your own value by putting it on the command line.

When you run this program, you'll see a single default constructor call (for n), then a lot of copy-constructor calls, then some destructor calls, then some more copy-constructor calls, and so on. When the vector runs out of space in the linear array of bytes it has allocated, it must (to maintain all the objects in a linear array, which is an essential part of its job) get a bigger piece of storage and move everything over, first copying and then destroying the old objects. You can imagine that if you store a lot of large and complex objects, this process could rapidly become prohibitive.

There are two solutions to this problem. The nicest one requires that you know beforehand how many objects you're going to make. In that case, you can use reserve( ) to tell the vector how much storage to preallocate, thus eliminating all the copies and destructions and making everything very fast (especially random access to the objects with operator[ ]). Note that the use of reserve( ) is different from using the vector constructor with an integral first argument; the latter initializes a prescribed number of elements using the element type's default constructor.

Generally you won't know how many objects you'll need. If vector reallocations are slowing things down, you can change sequence containers. You could use a list, butas you'll see, the deque allows speedy insertions at either end of the sequence and never needs to copy or destroy objects as it expands its storage. The deque also allows random access with operator[ ], but it's not quite as fast as vector's operator[ ]. So if you're creating all your objects in one part of the program and randomly accessing them in another, you may find yourself filling a deque and then creating a vector from the deque and using the vector for rapid indexing. You don't want to program this way habitually—just be aware of these issues (that is, avoid premature optimization).

There is a darker side to vector's reallocation of memory, however. Because vector keeps its objects in a nice, neat array, the iterators used by vector can be simple pointers. This is good—of all the sequence containers, these pointers allow the fastest selection and manipulation. Whether they are simple pointers, or whether they are iterator objects that hold an internal pointer into their container, consider what happens when you add the one additional object that causes the vector to reallocate storage and move it elsewhere. The iterator's pointer is now pointing off into nowhere:


//: C07:VectorCoreDump.cpp
// Invalidating an iterator.
#include <iterator>
#include <iostream>
#include <vector>
using namespace std;

int main() {
  vector<int> vi(10, 0);
  ostream_iterator<int> out(cout, " ");
  vector<int>::iterator i = vi.begin();
  *i = 47;
  copy(vi.begin(), vi.end(), out);
  cout << endl;
  // Force it to move memory (could also just add
  // enough objects):
  vi.resize(vi.capacity() + 1);
  // Now i points to wrong memory:
  *i = 48// Access violation
  copy(vi.begin(), vi.end(), out); // No change to
vi[0]
} ///:~
This illustrates the concept of iterator invalidation. Certain operations cause internal changes to a container's underlying data, so any iterators in effect before such changes may no longer be valid afterward. If your program is breaking mysteriously, look for places where you hold onto an iterator while adding more objects to a vector. You'll need to get a new iterator after adding elements or use operator[ ] instead for element selections. If you combine this observation with the awareness of the potential expense of adding new objects to a vector, you may conclude that the safest way to use a vector is to fill it up all at once (ideally, knowing first how many objects you'll need) and then just use it (without adding more objects) elsewhere in the program. This is the way vector has been used in the book up to this point. The Standard C++ library documents the container operations that invalidate iterators.

You may observe that using vector as the “basic” container in the earlier chapters of this book might not be the best choice in all cases. This is a fundamental issue in containers and in data structures in general—the “best” choice varies according to the way the container is used. The reason vector has been the “best” choice up until now is that it looks a lot like an array and was thus familiar and easy for you to adopt. But from now on it's also worth thinking about other issues when choosing containers.

Inserting and erasing elements

The vector is most efficient if:

  1. You reserve( ) the correct amount of storage at the beginning so the vector never has to reallocate.
  2. You only add and remove elements from the back end.
It is possible to insert and erase elements from the middle of a vector using an iterator, but the following program demonstrates what a bad idea this is:


//: C07:VectorInsertAndErase.cpp {-bor}
// Erasing an element from a vector.
//{L} Noisy
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
#include "Noisy.h"
using namespace std;

int main() {
  vector<Noisy> v;
  v.reserve(11);
  cout << "11 spaces have been
reserved" << endl;
  generate_n(back_inserter(v), 10, NoisyGen());
  ostream_iterator<Noisy> out(cout, "
");
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << "Inserting an element:"
<< endl;
  vector<Noisy>::iterator it =
    v.begin() + v.size() / 2; // Middle
  v.insert(it, Noisy());
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << "\nErasing an element:"
<< endl;
  // Cannot use the previous value of it:
  it = v.begin() + v.size() / 2;
  v.erase(it);
  cout << endl;
  copy(v.begin(), v.end(), out);
  cout << endl;
} ///:~
When you run the program, you'll see that the call to reserve( ) really does only allocate storage—no constructors are called. The generate_n( ) call is busy: each call to NoisyGen::operator( ) results in a construction, a copy-construction (into the vector), and a destruction of the temporary. But when an object is inserted into the vector in the middle, it must shift everything down to maintain the linear array, and, since there is enough space, it does this with the assignment operator. (If the argument of reserve( ) is 10 instead of 11, it must reallocate storage.) When an object is erased from the vector, the assignment operator is once again used to move everything up to cover the place that is being erased. (Notice that this requires that the assignment operator properly clean up the lvalue.) Last, the object on the end of the array is deleted.


2.5.4.3. deque

The deque container is a basic sequence optimized for adding and removing elements from either end. It also allows for reasonably fast random access—it has an operator[ ] like vector. However, it does not have vector's constraint of keeping everything in a single sequential block of memory. Instead, a typical implementation of deque uses multiple blocks of sequential storage (keeping track of all the blocks and their order in a mapping structure). For this reason, the overhead for a deque to add or remove elements at either end is low. In addition, it never needs to copy and destroy contained objects during a new storage allocation (like vector does), so it is far more efficient than vector if you are adding an unknown quantity of objects at either end. This means that vector is the best choice only if you have a good idea of how many objects you need. In addition, many of the programs shown earlier in this book that use vector and push_back( ) might have been more efficient had we used a deque instead. The interface to deque differs only slightly from vector (deque has a push_front( ) and pop_front( ) while vector does not, for example), so converting code from using vector to using deque is trivial. Consider StringVector.cpp, which can be changed to use deque by replacing the word “vector” with “deque” everywhere. The following program adds parallel deque operations to the vector operations in StringVector.cpp and performs timing comparisons:


//: C07:StringDeque.cpp
// Converted from StringVector.cpp.
#include <cstddef>
#include <ctime>
#include <deque>
#include <fstream>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  char* fname = "StringDeque.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  vector<string> vstrings;
  deque<string> dstrings;
  string line;
  // Time reading into vector:
  clock_t ticks = clock();
  while(getline(in, line))
    vstrings.push_back(line);
  ticks = clock() - ticks;
  cout << "Read into vector: " <<
ticks << endl;
  // Repeat for deque:
  ifstream in2(fname);
  assure(in2, fname);
  ticks = clock();
  while(getline(in2, line))
    dstrings.push_back(line);
  ticks = clock() - ticks;
  cout << "Read into deque: " <<
ticks << endl;
  // Now compare indexing:
  ticks = clock();
  for(size_t i = 0; i < vstrings.size(); i++) {
    ostringstream ss;
    ss << i;
    vstrings[i] = ss.str() + ":
" + vstrings[i];
  }
  ticks = clock() - ticks;
  cout << "Indexing vector: " <<
ticks << endl;
  ticks = clock();
  for(size_t j = 0; j < dstrings.size(); j++) {
    ostringstream ss;
    ss << j;
    dstrings[j] = ss.str() + ":
" + dstrings[j];
  }
  ticks = clock() - ticks;
  cout << "Indexing deque: " <<
ticks << endl;
  // Compare iteration
  ofstream tmp1("tmp1.tmp"),
tmp2("tmp2.tmp");
  ticks = clock();
  copy(vstrings.begin(), vstrings.end(),
    ostream_iterator<string>(tmp1,
"\n"));
  ticks = clock() - ticks;
  cout << "Iterating vector: " <<
ticks << endl;
  ticks = clock();
  copy(dstrings.begin(), dstrings.end(),
    ostream_iterator<string>(tmp2,
"\n"));
  ticks = clock() - ticks;
  cout << "Iterating deque: " <<
ticks << endl;
} ///:~
Knowing now what you do about the inefficiency of adding things to vector because of storage reallocation, you might expect dramatic differences between the two. However, on a 1.7 MB text file, one compiler's program produced the following (measured in platform/compiler specific clock ticks, not seconds):


Read into vector: 8350
Read into deque: 7690
Indexing vector: 2360
Indexing deque: 2480
Iterating vector: 2470
Iterating deque: 2410
A different compiler and platform roughly agreed with this. It's not so dramatic, is it? This points out some important issues:

  1. We (programmers and authors) are typically bad at guessing where inefficiencies occur in our programs.
  2. Efficiency comes from a combination of effects. Here, reading the lines in and converting them to strings may dominate over the cost of vector vs. deque.
  3. The string class is probably fairly well designed in terms of efficiency.
This doesn't mean you shouldn't use a deque rather than a vector when you know that an uncertain number of objects will be pushed onto the end of the container. On the contrary, you should—when you're tuning for performance. But also be aware that performance issues are usually not where you think they are, and the only way to know for sure where your bottlenecks are is by testing. Later in this chapter, you'll see a more “pure” comparison of performance between vector, deque, and list.


2.5.4.4. Converting between sequences

Sometimes you need the behavior or efficiency of one kind of container for one part of your program, and you need a different container's behavior or efficiency in another part of the program. For example, you may need the efficiency of a deque when adding objects to the container but the efficiency of a vector when indexing them. Each of the basic sequence containers (vector, deque, and list) has a two-iterator constructor (indicating the beginning and ending of the sequence to read from when creating a new object) and an assign( ) member function to read into an existing container, so you can easily move objects from one sequence container to another.

The following example reads objects into a deque and then converts to a vector:


//: C07:DequeConversion.cpp {-bor}
// Reading into a Deque, converting to a vector.
//{L} Noisy
#include <algorithm>
#include <cstdlib>
#include <deque>
#include <iostream>
#include <iterator>
#include <vector>
#include "Noisy.h"
using namespace std;

int main(int argc, char* argv[]) {
  int size = 25;
  if(argc >= 2) size = atoi(argv[1]);
  deque<Noisy> d;
  generate_n(back_inserter(d), size, NoisyGen());
  cout << "\n Converting to a
vector(1)" << endl;
  vector<Noisy> v1(d.begin(), d.end());
  cout << "\n Converting to a
vector(2)" << endl;
  vector<Noisy> v2;
  v2.reserve(d.size());
  v2.assign(d.begin(), d.end());
  cout << "\n Cleanup" << endl;
} ///:~
You can try various sizes, but note that it makes no difference—the objects are simply copy-constructed into the new vectors. What's interesting is that v1 does not cause multiple allocations while building the vector, no matter how many elements you use. You might initially think that you must follow the process used for v2 and preallocate the storage to prevent messy reallocations, but this is unnecessary because the constructor used for v1 determines the memory requirement ahead of time.

Cost of overflowing allocated storage

It's illuminating to see what happens with a deque when it overflows a block of storage, in contrast with VectorOverflow.cpp:


//: C07:DequeOverflow.cpp {-bor}
// A deque is much more efficient than a vector when
// pushing back a lot of elements, since it doesn't
// require copying and destroying.
//{L} Noisy
#include <cstdlib>
#include <deque>
#include "Noisy.h"
using namespace std;

int main(int argc, char* argv[]) {
  int size = 1000;
  if(argc >= 2) size = atoi(argv[1]);
  deque<Noisy> dn;
  Noisy n;
  for(int i = 0; i < size; i++)
    dn.push_back(n);
  cout << "\n cleaning up  << endl;
} ///:~
Here you will have relatively few (if any) destructors called before the words “cleaning up” appear in the output. Since the deque allocates all its storage in blocks instead of a contiguous array like vector, it never needs to move existing storage of each of its data blocks. (Thus, no additional copy-constructions and destructions occur.) The deque simply allocates a new block. For the same reason, the deque can just as efficiently add elements to the beginning of the sequence, since if it runs out of storage, it (again) just allocates a new block for the beginning. (The index block that holds the data blocks together may need to be reallocated, however.) Insertions in the middle of a deque, however, could be even messier than for vector (but not as costly).

Because of deque'sclever storage management, an existing iterator is not invalidated after you add new things to either end of a deque, as it was demonstrated to do with vector (in VectorCoreDump.cpp). If you stick to what deque is best at—insertions and removals from either end, reasonably rapid traversals and fairly fast random-access using operator[ ]—you'll be in good shape.


2.5.4.5. Checked random–access

Both vector and deque provide two random access functions: the indexing operator (operator[ ]), which you've seen already, and at( ), which checks the boundaries of the container that's being indexed and throws an exception if you go out of bounds. It does cost more to use at( ):


//: C07:IndexingVsAt.cpp
// Comparing "at()" to operator[].
#include <ctime>
#include <deque>
#include <iostream>
#include <vector>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  long count = 1000;
  int sz = 1000;
  if(argc >= 2) count = atoi(argv[1]);
  if(argc >= 3) sz = atoi(argv[2]);
  vector<int> vi(sz);
  clock_t ticks = clock();
  for(int i1 = 0; i1 < count; i1++)
    for(int j = 0; j < sz; j++)
      vi[j];
  cout << "vector[] "
<< clock() - ticks << endl;
  ticks = clock();
  for(int i2 = 0; i2 < count; i2++)
    for(int j = 0; j < sz; j++)
      vi.at(j);
  cout << "vector::at() " <<
clock()-ticks <<endl;
  deque<int> di(sz);
  ticks = clock();
  for(int i3 = 0; i3 < count; i3++)
    for(int j = 0; j < sz; j++)
      di[j];
  cout << "deque[] " << clock() -
ticks << endl;
  ticks = clock();
  for(int i4 = 0; i4 < count; i4++)
    for(int j = 0; j < sz; j++)
      di.at(j);
  cout << "deque::at() " <<
clock()-ticks <<endl;
  // Demonstrate at() when you go out of bounds:
  try {
    di.at(vi.size() + 1);
  } catch(...) {
    cerr << "Exception thrown" <<
endl;
  }
} ///:~
As you saw in Chapter 1, different systems may handle the uncaught exception in different ways, but you'll know one way or another that something went wrong with the program when using at( ), whereas it's possible to remain ignorant when using operator[ ].


2.5.4.6. list

A list is implemented as a doubly linked list data structure and is thus designed for rapid insertion and removal of elements anywhere in the sequence, whereas for vector and deque this is a much more costly operation. A list is so slow when randomly accessing elements that it does not have an operator[ ]. It's best used when you're traversing a sequence, in order, from beginning to end (or vice-versa), rather than choosing elements randomly from the middle. Even then the traversal can be slower than with a vector, but if you aren't doing a lot of traversals, that won't be your bottleneck.

The memory overhead of each link in a list requires a forward and backward pointer on top of the storage for the actual object. Thus, a list is a better choice when you have larger objects that you'll be inserting and removing from the middle of the list.

It's better not to use a list if you think you might be traversing it a lot, looking for objects, since the amount of time it takes to get from the beginning of the list—which is the only place you can start unless you've already got an iterator to somewhere you know is closer to your destination—to the object of interest is proportional to the number of objects between the beginning and that object.

The objects in a list never move after they are created. “Moving” a list element means changing the links, but never copying or assigning the actual objects. This means that iterators aren't invalidated when items are added to the list as it was demonstrated earlier to be the case vector. Here's an example using a list of Noisy objects:


//: C07:ListStability.cpp {-bor}
// Things don't move around in lists.
//{L} Noisy
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>
#include "Noisy.h"
using namespace std;

int main() {
  list<Noisy> l;
  ostream_iterator<Noisy> out(cout, "
");
  generate_n(back_inserter(l), 25, NoisyGen());
  cout << "\n Printing the list:"
<< endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Reversing the list:"
<< endl;
  l.reverse();
  copy(l.begin(), l.end(), out);
  cout << "\n Sorting the list:"
<< endl;
  l.sort();
  copy(l.begin(), l.end(), out);
  cout << "\n Swapping two elements:"
<< endl;
  list<Noisy>::iterator it1, it2;
  it1 = it2 = l.begin();
  ++it2;
  swap(*it1, *it2);
  cout << endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Using generic reverse():
" << endl;
  reverse(l.begin(), l.end());
  cout << endl;
  copy(l.begin(), l.end(), out);
  cout << "\n Cleanup" << endl;
} ///:~
Operations as seemingly radical as reversing and sorting the list require no copying of objects because, instead of moving the objects, the links are simply changed. However, notice that sort( ) and reverse( ) are member functions of list, so they have special knowledge of the internals of list and can rearrange the elements instead of copying them. On the other hand, the swap( ) function is a generic algorithm and doesn't know about list in particular, so it uses the copying approach for swapping two elements. In general, use the member version of an algorithm if that is supplied instead of its generic algorithm equivalent. In particular, use the generic sort( ) and reverse( ) algorithms only with arrays, vectors, and deques.

If you have large, complex objects, you might want to choose a list first, especially if construction, destruction, copy-construction, and assignment are expensive and if you are doing things like sorting the objects or otherwise reordering them a lot.

Special list operations

The list has some special built-in operations to make the best use of the structure of the list. You've already seen reverse( ) and sort( ). Here are some of the others:


//: C07:ListSpecialFunctions.cpp
//{L} Noisy
#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>
#include "Noisy.h"
#include "PrintContainer.h"
using namespace std;

int main() {
  typedef list<Noisy> LN;
  LN l1, l2, l3, l4;
  generate_n(back_inserter(l1), 6, NoisyGen());
  generate_n(back_inserter(l2), 6, NoisyGen());
  generate_n(back_inserter(l3), 6, NoisyGen());
  generate_n(back_inserter(l4), 6, NoisyGen());
  print(l1, "l1", " "); print(l2,
"l2", " ");
  print(l3, "l3", " "); print(l4,
"l4", " ");
  LN::iterator it1 = l1.begin();
  ++it1; ++it1; ++it1;
  l1.splice(it1, l2);
  print(l1, "l1 after splice(it1, l2)",
" ");
  print(l2, "l2 after splice(it1, l2)",
" ");
  LN::iterator it2 = l3.begin();
  ++it2; ++it2; ++it2;
  l1.splice(it1, l3, it2);
  print(l1, "l1 after splice(it1, l3, it2)",
" ");
  LN::iterator it3 = l4.begin(), it4 = l4.end();
  ++it3; --it4;
  l1.splice(it1, l4, it3, it4);
  print(l1, "l1 after
splice(it1,l4,it3,it4)", " ");
  Noisy n;
  LN l5(3, n);
  generate_n(back_inserter(l5), 4, NoisyGen());
  l5.push_back(n);
  print(l5, "l5 before remove()", "
");
  l5.remove(l5.front());
  print(l5, "l5 after remove()", "
");
  l1.sort(); l5.sort();
  l5.merge(l1);
  print(l5, "l5 after
l5.merge(l1)", " ");
  cout << "\n Cleanup" << endl;
} ///:~
After filling four lists with Noisy objects, one list is spliced into another in three ways. In the first, the entire list l2 is spliced into l1 at the iterator it1. Notice that after the splice, l2 is empty—splicing means removing the elements from the source list. The second splice inserts elements from l3 starting at it2 into l1 starting at it1. The third splice starts at it1 and uses elements from l4 starting at it3 and ending at it4. The seemingly redundant mention of the source list is because the elements must be erased from the source list as part of the transfer to the destination list.

The output from the code that demonstrates remove( ) shows that the list does not have to be sorted in order for all the elements of a particular value to be removed.

Finally, if you merge( ) one list with another, the merge only works sensibly if the lists have been sorted. What you end up with in that case is a sorted list containing all the elements from both lists (the source list is erased—that is, the elements are moved to the destination list).

A unique( ) member function removes all duplicates, but only if you sort the list first:


//: C07:UniqueList.cpp
// Testing list's unique() function.
#include <iostream>
#include <iterator>
#include <list>
using namespace std;

int a[] = { 1, 3, 1, 4, 1, 5, 1, 6, 1 };
const int ASZ = sizeof a / sizeof *a;

int main() {
  // For output:
  ostream_iterator<int> out(cout, " ");
  list<int> li(a, a + ASZ);
  li.unique();
  // Oops! No duplicates removed:
  copy(li.begin(), li.end(), out);
  cout << endl;
  // Must sort it first:
  li.sort();
  copy(li.begin(), li.end(), out);
  cout << endl;
  // Now unique() will have an effect:
  li.unique();
  copy(li.begin(), li.end(), out);
  cout << endl;
} ///:~
The list constructor used here takes the starting and past-the-end iterator from another container and copies all the elements from that container into itself. Here, the “container” is just an array, and the “iterators” are pointers into that array, but because of the design of the STL, the list constructor works with arrays just as easily as with any other container.

The unique( ) function will remove only adjacent duplicate elements, and thus sorting is typically necessary before calling unique( ). The exception is when the problem you're trying to solve includes eliminating adjacent duplicates according to the current ordering.

Four additional list member functions are not demonstrated here: a remove_if( ) that takes a predicate, which decides whether an object should be removed; a unique( ) that takes a binary predicate to perform uniqueness comparisons; a merge( ) that takes an additional argument which performs comparisons; and a sort( ) that takes a comparator (to provide a comparison or override the existing one).

list vs. set

Looking at the previous example, you might note that if you want a sorted sequence with no duplicates, you could get that result with a set. It's interesting to compare the performance of the two containers:


//: C07:ListVsSet.cpp
// Comparing list and set performance.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <list>
#include <set>
#include "PrintContainer.h"
using namespace std;

class Obj {
  int a[20]; // To take up extra space
  int val;
public:
  Obj() : val(rand() % 500) {}
  friend bool
  operator<(const Obj& a, const Obj& b) {
    return a.val < b.val;
  }
  friend bool
  operator==(const Obj& a, const Obj& b) {
    return a.val == b.val;
  }
  friend ostream&
  operator<<(ostream& os, const Obj& a) {
    return os << a.val;
  }
};

struct ObjGen {
  Obj operator()() { return Obj(); }
};

int main() {
  const int SZ = 5000;
  srand(time(0));
  list<Obj> lo;
  clock_t ticks = clock();
  generate_n(back_inserter(lo), SZ, ObjGen());
  lo.sort();
  lo.unique();
  cout << "list:" << clock() -
ticks << endl;
  set<Obj> so;
  ticks = clock();
  generate_n(inserter(so, so.begin()),
    SZ, ObjGen());
  cout << "set:" << clock() -
ticks << endl;
  print(lo);
  print(so);
} ///:~
When you run the program, you should discover that set is much faster than list. This is reassuring—after all, it is set's primary job description to hold only unique elements in sorted order!

This example uses the header PrintContainer.h, which contains a function template that prints any sequence container to an output stream. PrintContainer.h is defined as follows:


//:
C07:PrintContainer.h
// Prints a
sequence container
#ifndef
PRINT_CONTAINER_H
#define
PRINT_CONTAINER_H
#include
"../C06/PrintSequence.h"

template<class
Cont>
void
print(Cont& c, const char* nm = "",
          
const char* sep = "\n",
           std::ostream&
os = std::cout) {
 
print(c.begin(), c.end(), nm, sep, os);
}
#endif ///:~
The print( ) template defined here just calls the print( ) function template we defined in the previous chapter in PrintSequence.h.


2.5.4.7. Swapping sequences

We mentioned earlier that all basic sequences have a member function swap( ) that's designed to switch one sequence with another (but only for sequences of the same type). The member swap( ) makes use of its knowledge of the internal structure of the particular container in order to be efficient:


//: C07:Swapping.cpp {-bor}
// All basic sequence containers can be swapped.
//{L} Noisy
#include <algorithm>
#include <deque>
#include <iostream>
#include <iterator>
#include <list>
#include <vector>
#include "Noisy.h"
#include "PrintContainer.h"
using namespace std;
ostream_iterator<Noisy> out(cout, " ");

template<class Cont> void testSwap(char* cname) {
  Cont c1, c2;
  generate_n(back_inserter(c1), 10, NoisyGen());
  generate_n(back_inserter(c2), 5, NoisyGen());
  cout << endl << cname <<
":" << endl;
  print(c1, "c1"); print(c2, "c2");
  cout << "\n Swapping the " <<
cname << ":" << endl;
  c1.swap(c2);
  print(c1, "c1"); print(c2, "c2");
}

int main() {
  testSwap<vector<Noisy> >("vector");
  testSwap<deque<Noisy>
>("deque");
  testSwap<list<Noisy> >("list");
} ///:~
When you run this, you'll discover that each type of sequence container can swap one sequence for another without any copying or assignments, even if the sequences are of different sizes. In effect, you're completely swapping the resources of one object for another.

The STL algorithms also contain a swap( ), and when this function is applied to two containers of the same type, it uses the member swap( ) to achieve fast performance. Consequently, if you apply the sort( ) algorithm to a container of containers, you will find that the performance is very fast—it turns out that fast sorting of a container of containers was a design goal of the STL.


2.5.5. set

The set container accepts only one copy of each element. It also sorts the elements. (Sorting isn't intrinsic to the conceptual definition of a set, but the STL set stores its elements in a balanced tree data structure to provide rapid lookups, thus producing sorted results when you traverse it.) The first two examples in this chapter used sets.

Consider the problem of creating an index for a book. You might like to start with all the words in the book, but you only want one instance of each word, and you want them sorted. A set is perfect for this and solves the problem effortlessly. However, there's also the problem of punctuation and any other nonalpha characters, which must be stripped off to generate proper words. One solution to this problem is to use the Standard C library functions isalpha( ) and isspace( ) to extract only the characters you want. You can replace all unwanted characters with spaces so that you can easily extract valid words from each line you read:


//: C07:WordList.cpp
// Display a list of words used in a document.
#include <algorithm>
#include <cctype>
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include "../require.h"
using namespace std;

char replaceJunk(char c) {
  // Only keep alphas, space (as a delimiter), and '
  return (isalpha(c) || c == '\'') ? c : ' ';
}

int main(int argc, char* argv[]) {
  char* fname = "WordList.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  set<string> wordlist;
  string line;
  while(getline(in, line)) {
    transform(line.begin(), line.end(), line.begin(),
              replaceJunk);
    istringstream is(line);
    string word;
    while(is >> word)
      wordlist.insert(word);
  }
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
       ostream_iterator<string>(cout,
"\n"));
} ///:~
The call to transform( ) replaces each character to be ignored with a space. The set container not only ignores duplicate words, but compares the words it keeps according to the function object less<string> (the default second template argument for the set container), which in turn uses string::operator<( ), so the words emerge in alphabetical order.

You don't need to use a set just to get a sorted sequence. You can use the sort( ) function (along with a multitude of other functions in the STL) on different STL containers. However, it's likely that set will be faster here. Using a set is particularly handy when you just want to do lookup, since its find( ) member function has logarithmic complexity and so is much faster than the generic find( ) algorithm. As you recall, the generic find( ) algorithm needs to traverse the whole range until it finds the search element (resulting in a worst-case complexity of N, and an average complexity of N/2). However, if you have a sequence container that is already sorted, use equal_range( ) for logarithmic complexity when finding elements.

The following version shows how to build the list of words with an istreambuf_iterator that moves the characters from one place (the input stream) to another (a string object), depending on whether the Standard C library function isalpha( ) returns true:


//: C07:WordList2.cpp
// Illustrates istreambuf_iterator and insert
iterators.
#include <cstring>
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  char* fname = "WordList2.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  istreambuf_iterator<char> p(in), end;
  set<string> wordlist;
  while(p != end) {
    string word;
    insert_iterator<string> ii(word, word.begin());
    // Find the first alpha character:
    while(p != end && !isalpha(*p))
      ++p;
    // Copy until the first non-alpha character:
    while(p != end && isalpha(*p))
      *ii++ = *p++;
    if(word.size() != 0)
      wordlist.insert(word);
  }
  // Output results:
  copy(wordlist.begin(), wordlist.end(),
    ostream_iterator<string>(cout,
"\n"));
} ///:~
This example was suggested by Nathan Myers, who invented the istreambuf_iterator and its relatives. This iterator extracts information character by character from a stream. Although the istreambuf_iterator template argument might imply that you could extract, for example, ints instead of char, that's not the case. The argument must be of some character type—a regular char or a wide character.

After the file is open, an istreambuf_iterator called p is attached to the istream so characters can be extracted from it. The set<string> called wordlist will hold the resulting words.

The while loop reads words until it finds the end of the input stream. This is detected using the default constructor for istreambuf_iterator, which produces the past-the-end iterator object end. Thus, if you want to test to make sure you're not at the end of the stream, you simply say p != end.

The second type of iterator that's used here is the insert_iterator, which you saw previously. This inserts objects into a container. Here, the “container” is the string called word, which, for the purposes of insert_iterator, behaves like a container. The constructor for insert_iterator requires the container and an iterator indicating where it should start inserting the characters. You could also use a back_insert_iterator, which requires that the container have a push_back( ) (string does).

After the while loop sets everything up, it begins by looking for the first alpha character, incrementing start until that character is found. It then copies characters from one iterator to the other, stopping when a nonalpha character is found. Each word, assuming it is nonempty, is added to wordlist.


2.5.5.1. A completely reusable tokenizer

The word list examples use different approaches to extract tokens from a stream, neither of which is very flexible. Since the STL containers and algorithms all revolve around iterators, the most flexible solution will itself use an iterator. You could think of the TokenIterator as an iterator that wraps itself around any other iterator that can produce characters. Because it is certainly a type of input iterator (the most primitive type of iterator), it can provide input to any STL algorithm. Not only is it a useful tool in itself, the following TokenIterator is also a good example of how you can design your own iterators.(108)

The TokenIterator class is doubly flexible. First, you can choose the type of iterator that will produce the char input. Second, instead of just saying what characters represent the delimiters, TokenIterator will use a predicate that is a function object whose operator( ) takes a char and decides whether it should be in the token. Although the two examples given here have a static concept of what characters belong in a token, you could easily design your own function object to change its state as the characters are read, producing a more sophisticated parser.

The following header file contains two basic predicates, Isalpha and Delimiters, along with the template for TokenIterator:


//: C07:TokenIterator.h
#ifndef TOKENITERATOR_H
#define TOKENITERATOR_H
#include <algorithm>
#include <cctype>
#include <functional>
#include <iterator>
#include <string>

struct Isalpha : std::unary_function<char, bool>
{
  bool operator()(char c) { return std::isalpha(c); }
};

class Delimiters : std::unary_function<char,
bool> {
  std::string exclude;
public:
  Delimiters() {}
  Delimiters(const std::string& excl) :
exclude(excl) {}
  bool operator()(char c) {
    return exclude.find(c) == std::string::npos;
  }
};

template<class InputIter, class Pred = Isalpha>
class TokenIterator : public std::iterator<
    std::input_iterator_tag, std::string,
std::ptrdiff_t> {
  InputIter first;
  InputIter last;
  std::string word;
  Pred predicate;
public:
  TokenIterator(InputIter begin, InputIter end,
    Pred pred = Pred())
    : first(begin), last(end), predicate(pred) {
      ++*this;
  }
  TokenIterator() {} // End sentinel
  // Prefix increment:
  TokenIterator& operator++() {
    word.resize(0);
    first = std::find_if(first, last, predicate);
    while(first != last && predicate(*first))
      word += *first++;
    return *this;
  }
  // Postfix increment
  class CaptureState {
    std::string word;
  public:
    CaptureState(const std::string& w) : word(w) {}
    std::string operator*() { return word; }
  };
  CaptureState operator++(int) {
    CaptureState d(word);
    ++*this;
    return d;
  }
  // Produce the actual value:
  std::string operator*() const { return word; }
  const std::string* operator->() const { return
&word; }
  // Compare iterators:
  bool operator==(const TokenIterator&) {
    return word.size() == 0 && first == last;
  }
  bool operator!=(const TokenIterator& rv) {
    return !(*this == rv);
  }
};
#endif // TOKENITERATOR_H ///:~
The TokenIterator class derives from the std::iterator template. It might appear that some kind of functionality comes with std::iterator, but it is purely a way of tagging an iterator, to tell a container that uses it what it can do. Here, you can see input_iterator_tag as the iterator_category template argument—this tells anyone who asks that a TokenIterator only has the capabilities of an input iterator and cannot be used with algorithms requiring more sophisticated iterators. Apart from the tagging, std::iterator doesn't do anything beyond providing several useful type definitions. You must implement all other functionality yourself.

The TokenIterator class may look a little strange at first, because the first constructor requires both a “begin” and an “end” iterator as arguments, along with the predicate. Remember, this is a “wrapper” iterator that has no idea how to tell when it's at the end of its input, so the ending iterator is necessary in the first constructor. The reason for the second (default) constructor is that the STL algorithms (and any algorithms you write) need a TokenIterator sentinel to be the past-the-end value. Since all the information necessary to see if the TokenIterator has reached the end of its input is collected in the first constructor, this second constructor creates a TokenIterator that is merely used as a placeholder in algorithms.

The core of the behavior happens in operator++. This erases the current value of word using string::resize( ) and then finds the first character that satisfies the predicate (thus discovering the beginning of the new token) using find_if( ). The resulting iterator is assigned to first, thus moving first forward to the beginning of the token. Then, as long as the end of the input is not reached and the predicate is satisfied, input characters are copied into word. Finally, the TokenIterator object is returned and must be dereferenced to access the new token.

The postfix increment requires an object of type CaptureState to hold the value before the increment, so it can be returned. Producing the actual value is a straightforward operator*. The only other functions to define for an output iterator are the operator== and operator!= to indicate whether the TokenIterator has reached the end of its input. You can see that the argument for operator== is ignored—it only cares about whether it has reached its internal last iterator. Notice that operator!= is defined in terms of operator==.

A good test of TokenIterator includes a number of different sources of input characters, including a streambuf_iterator, a char*, and a deque<char>::iterator. Finally, the original word list problem is solved:


//: C07:TokenIteratorTest.cpp {-g++}
#include <fstream>
#include <iostream>
#include <vector>
#include <deque>
#include <set>
#include "TokenIterator.h"
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  char* fname = "TokenIteratorTest.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  ostream_iterator<string> out(cout,
"\n");
  typedef istreambuf_iterator<char> IsbIt;
  IsbIt begin(in), isbEnd;
  Delimiters delimiters("
\t\n~;()\"<>:{}[]+-=&*#.,/\\");
  TokenIterator<IsbIt, Delimiters>
    wordIter(begin, isbEnd, delimiters), end;
  vector<string> wordlist;
  copy(wordIter, end, back_inserter(wordlist));
  // Output results:
  copy(wordlist.begin(), wordlist.end(), out);
  *out++ =
"-----------------------------------";
  // Use a char array as the source:
  char* cp = "typedef
std::istreambuf_iterator<char> It";
  TokenIterator<char*, Delimiters>
    charIter(cp, cp + strlen(cp), delimiters), end2;
  vector<string> wordlist2;
  copy(charIter, end2, back_inserter(wordlist2));
  copy(wordlist2.begin(), wordlist2.end(), out);
  *out++ =
"-----------------------------------";
  // Use a deque<char> as the source:
  ifstream in2("TokenIteratorTest.cpp");
  deque<char> dc;
  copy(IsbIt(in2), IsbIt(), back_inserter(dc));
 
TokenIterator<deque<char>::iterator,Delimiters>
    dcIter(dc.begin(), dc.end(), delimiters), end3;
  vector<string> wordlist3;
  copy(dcIter, end3, back_inserter(wordlist3));
  copy(wordlist3.begin(), wordlist3.end(), out);
  *out++ =
"-----------------------------------";
  // Reproduce the Wordlist.cpp example:
  ifstream in3("TokenIteratorTest.cpp");
  TokenIterator<IsbIt, Delimiters>
    wordIter2(IsbIt(in3), isbEnd, delimiters);
  set<string> wordlist4;
  while(wordIter2 != end)
    wordlist4.insert(*wordIter2++);
  copy(wordlist4.begin(), wordlist4.end(), out);
} ///:~
When using an istreambuf_iterator, you create one to attach to the istream object and one with the default constructor as the past-the-end marker. Both are used to create the TokenIterator that will produce the tokens; the default constructor produces the faux TokenIterator past-the-end sentinel. (This is just a placeholder and is ignored.) The TokenIterator produces strings that are inserted into a container of string—here a vector<string> is used in all cases except the last. (You could also concatenate the results onto a string.) Other than that, a TokenIterator works like any other input iterator.

When defining a bidirectional (and therefore also a random access) iterator, you can get reverse iterators “for free” by using the std::reverse_iterator adaptor. If you have already defined an iterator for a container with bidirectional capabilities, you can get a reverse iterator from your forward-traversing iterator with lines like the following inside your container class:


// Assume "iterator" is your nested iterator type
typedef std::reverse_iterator<iterator>
reverse_iterator;
reverse_iterator rbegin() {return
reverse_iterator(end());
reverse_iterator rend() {return
reverse_iterator(begin());
The std::reverse_iterator adaptor does all the work for you. For example, if you use the * operator to dereference your reverse iterator, it automatically decrements a temporary copy of the forward iterator it is holding in order to return the correct element, since reverse iterators logically point one position past the element they refer to.


2.5.6. stack

The stack container, along with queue and priority_queue, are classified as adaptors, which means they adapt one of the basic sequence containers to store their data. This is an unfortunate case of confusing what something does with the details of its underlying implementation—the fact that these are called “adaptors” is of primary value only to the creator of the library. When you use them, you generally don't care that they're adaptors, but instead that they solve your problem. Admittedly it's useful at times to know that you can choose an alternate implementation or build an adaptor from an existing container object, but that's generally one level removed from the adaptor's behavior. So, while you may see it emphasized elsewhere that a particular container is an adaptor, we'll only point out that fact when it's useful. Note that each type of adaptor has a default container that it's built upon, and this default is the most sensible implementation. In most cases you won't need to concern yourself with the underlying implementation.

The following example shows stack<string> implemented in the three ways: the default (which uses deque), then with a vector, and finally with a list:


//: C07:Stack1.cpp
// Demonstrates the STL stack.
#include <fstream>
#include <iostream>
#include <list>
#include <stack>
#include <string>
#include <vector>
using namespace std;

// Rearrange comments below to use different versions.
typedef stack<string> Stack1; // Default:
deque<string>
// typedef stack<string, vector<string> >
Stack2;
// typedef stack<string, list<string> >
Stack3;

int main() {
  ifstream in("Stack1.cpp");
  Stack1 textlines; // Try the different versions
  // Read file and store lines in the stack:
  string line;
  while(getline(in, line))
    textlines.push(line + "\n");
  // Print lines from the stack and pop them:
  while(!textlines.empty()) {
    cout << textlines.top();
    textlines.pop();
  }
} ///:~
The top( ) and pop( ) operations will probably seem non-intuitive if you've used other stack classes. When you call pop( ), it returns void rather than the top element that you might have expected. If you want the top element, you get a reference to it with top( ). It turns out this is more efficient, since a traditional pop( ) must return a value rather than a reference and thus invokes the copy-constructor. More important, it is exception safe, as we discussed in Chapter 1. If pop( ) both changed the state of the stack and attempted to return the top element, an exception in the element's copy-constructor could cause the element to be lost. When you're using a stack (or a priority_queue, described later), you can efficiently refer to top( ) as many times as you want and then discard the top element explicitly using pop( ). (Perhaps if some term other than the familiar “pop” had been used, this would have been a bit clearer.)

The stack template has a simple interface—essentially the member functions you saw earlier. Since it only makes sense to access a stack at its top, no iterators are available for traversing it. Nor are there sophisticated forms of initialization, but if you need that, you can use the underlying container upon which the stack is implemented. For example, suppose you have a function that expects a stack interface, but in the rest of your program you need the objects stored in a list. The following program stores each line of a file along with the leading number of spaces in that line. (You might imagine it as a starting point for performing some kind of source-code reformatting.)


//: C07:Stack2.cpp
// Converting a list to a stack.
#include <iostream>
#include <fstream>
#include <stack>
#include <list>
#include <string>
#include <cstddef>
using namespace std;

// Expects a stack:
template<class Stk>
void stackOut(Stk& s, ostream& os = cout) {
  while(!s.empty()) {
    os << s.top() << "\n";
    s.pop();
  }
}

class Line {
  string line; // Without leading spaces
  size_t lspaces; // Number of leading spaces
public:
  Line(string s) : line(s) {
    lspaces = line.find_first_not_of(' ');
    if(lspaces == string::npos)
      lspaces = 0;
    line = line.substr(lspaces);
  }
  friend ostream& operator<<(ostream& os,
const Line& l) {
    for(size_t i = 0; i < l.lspaces; i++)
      os << ' ';
    return os << l.line;
  }
  // Other functions here...
};

int main() {
  ifstream in("Stack2.cpp");
  list<Line> lines;
  // Read file and store lines in the list:
  string s;
  while(getline(in, s))
    lines.push_front(s);
  // Turn the list into a stack for printing:
  stack<Line, list<Line> > stk(lines);
  stackOut(stk);
} ///:~
The function that requires the stack interface just sends each top( ) object to an ostream and then removes it by calling pop( ). The Line class determines the number of leading spaces and then stores the contents of the line without the leading spaces. The ostream operator<< re-inserts the leading spaces so the line prints properly, but you can easily change the number of spaces by changing the value of lspaces. (The member functions to do this are not shown here.) In main( ), the input file is read into a list<Line>, and then each line in the list is copied into a stack that is sent to stackOut( ).

You cannot iterate through a stack; this emphasizes that you only want to perform stack operations when you create a stack. You can get equivalent “stack” functionality using a vector and its back( ), push_back( ), and pop_back( ) member functions, and then you have all the additional functionality of the vector. The program Stack1.cpp can be rewritten to show this:


//: C07:Stack3.cpp
// Using a vector as a stack; modified Stack1.cpp.
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main() {
  ifstream in("Stack3.cpp");
  vector<string> textlines;
  string line;
  while(getline(in, line))
    textlines.push_back(line + "\n");
  while(!textlines.empty()) {
    cout << textlines.back();
    textlines.pop_back();
  }
} ///:~
This produces the same output as Stack1.cpp, but you can now perform vector operations as well. A list can also push things at the front, but it's generally less efficient than using push_back( ) with vector. (In addition, deque is usually more efficient than list for pushing things at the front.)


2.5.7. queue

The queue container is a restricted form of a deque—you can only enter elements at one end and pull them off the other end. Functionally, you could use a deque anywhere you need a queue, and you would then also have the additional functionality of the deque. The only reason you need to use a queue rather than a deque, then, is when you want to emphasize that you will only be performing queue-like behavior.

The queue class is an adaptor like stack, in that it is built on top of another sequence container. As you might guess, the ideal implementation for a queue is a deque, and that is the default template argument for the queue; you'll rarely need a different implementation.

Queues are often used if you want to model a system where some elements are waiting to be served by other elements in the system. A classic example of this is the “bank-teller problem.” Customers arrive at random intervals, get into a line, and then are served by a set of tellers. Since the customers arrive randomly and each takes a random amount of time to be served, there's no way to deterministically know how long the line will be at any time. However, it's possible to simulate the situation and see what happens.

In a realistic simulation each customer and teller should be run by a separate thread. What we'd like is a multithreaded environment so that each customer or teller would have his own thread. However, Standard C++ has no support for multithreading. On the other hand, with a little adjustment to the code, it's possible to simulate enough multithreading to provide a satisfactory solution.(109)

In multithreading, multiple threads of control run simultaneously, sharing the same address space. Quite often you have fewer CPUs than you do threads (and often only one CPU). To give the illusion that each thread has its own CPU, a time-slicing mechanism says “OK, current thread, you've had enough time. I'm going to stop you and give time to some other thread.” This automatic stopping and starting of threads is called preemptive, and it means you (the programmer) don't need to manage the threading process.

An alternative approach has each thread voluntarily yield the CPU to the scheduler, which then finds another thread that needs running. Instead, we'll build the “time-slicing” into the classes in the system. Here, it will be the tellers that represent the “threads,” (the customers will be passive). Each teller will have an infinite-looping run( ) member function that will execute for a certain number of “time units” and then simply return. By using the ordinary return mechanism, we eliminate the need for any swapping. The resulting program, although small, provides a remarkably reasonable simulation:


//: C07:BankTeller.cpp {RunByHand}
// Using a queue and simulated multithreading
// to model a bank teller system.
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <list>
#include <queue>
using namespace std;

class Customer {
  int serviceTime;
public:
  Customer() : serviceTime(0) {}
  Customer(int tm) : serviceTime(tm) {}
  int getTime() { return serviceTime; }
  void setTime(int newtime) { serviceTime = newtime; }
  friend ostream&
  operator<<(ostream& os, const Customer&
c) {
    return os << '[' << c.serviceTime
<< ']';
  }
};

class Teller {
  queue<Customer>& customers;
  Customer current;
  enum { SLICE = 5 };
  int ttime; // Time left in slice
  bool busy; // Is teller serving a customer?
public:
  Teller(queue<Customer>& cq)
  : customers(cq), ttime(0), busy(false) {}
  Teller& operator=(const Teller& rv) {
    customers = rv.customers;
    current = rv.current;
    ttime = rv.ttime;
    busy = rv.busy;
    return *this;
  }
  bool isBusy() { return busy; }
  void run(bool recursion = false) {
    if(!recursion)
      ttime = SLICE;
    int servtime = current.getTime();
    if(servtime > ttime) {
      servtime -= ttime;
      current.setTime(servtime);
      busy = true; // Still working on current
      return;
    }
    if(servtime < ttime) {
      ttime -= servtime;
      if(!customers.empty()) {
        current = customers.front();
        customers.pop(); // Remove it
        busy = true;
        run(true); // Recurse
      }
      return;
    }
    if(servtime == ttime) {
      // Done with current, set to empty:
      current = Customer(0);
      busy = false;
      return; // No more time in this slice
    }
  }
};

// Inherit to access protected implementation:
class CustomerQ : public queue<Customer> {
public:
  friend ostream&
  operator<<(ostream& os, const
CustomerQ& cd) {
    copy(cd.c.begin(), cd.c.end(),
      ostream_iterator<Customer>(os,
""));
    return os;
  }
};

int main() {
  CustomerQ customers;
  list<Teller> tellers;
  typedef list<Teller>::iterator TellIt;
  tellers.push_back(Teller(customers));
  srand(time(0)); // Seed the random number generator
  clock_t ticks = clock();
  // Run simulation for at least 5 seconds:
  while(clock() < ticks + 5 * CLOCKS_PER_SEC) {
    // Add a random number of customers to the
    // queue, with random service times:
    for(int i = 0; i < rand() % 5; i++)
      customers.push(Customer(rand() % 15 + 1));
    cout << '{' << tellers.size() <<
'}'
         << customers << endl;
    // Have the tellers service the queue:
    for(TellIt i = tellers.begin();
      i != tellers.end(); i++)
      (*i).run();
    cout << '{' << tellers.size() <<
'}'
         << customers << endl;
    // If line is too long, add another teller:
    if(customers.size() / tellers.size() > 2)
      tellers.push_back(Teller(customers));
    // If line is short enough, remove a teller:
    if(tellers.size() > 1 &&
      customers.size() / tellers.size() < 2)
      for(TellIt i = tellers.begin();
        i != tellers.end(); i++)
        if(!(*i).isBusy()) {
          tellers.erase(i);
          break; // Out of for loop
        }
  }
} ///:~
Each customer requires a certain amount of service time, which is the number of time units that a teller must spend on the customer to serve that customer's needs. The amount of service time will be different for each customer and will be determined randomly. In addition, you won't know how many customers will be arriving in each interval, so this will also be determined randomly.

The Customer objects are kept in a queue<Customer>, and each Teller object keeps a reference to that queue.When a Teller object is finished with its current Customer object, that Teller will get another Customer from the queue and begin working on the new Customer, reducing the Customer's service time during each time slice that the Teller is allotted. All this logic is in the run( ) member function, which is basically a three-way if statement based on whether the amount of time necessary to serve the customer is less than, greater than, or equal to the amount of time left in the teller's current time slice. Notice that if the Teller has more time after finishing with a Customer, it gets a new customer and recurses into itself.

Just as with a stack, when you use a queue, it's only a queue and doesn't have any of the other functionality of the basic sequence containers. This includes the ability to get an iterator in order to step through the stack. However, the underlying sequence container (that the queue is built upon) is held as a protected member inside the queue, and the identifier for this member is specified in the C++ Standard as ‘c', which means that you can derive from queue to access the underlying implementation. The CustomerQ class does exactly that, for the sole purpose of defining an ostream operator<< that can iterate through the queue and display its members.

The driver for the simulation is the while loop in main( ), which uses processor ticks (defined in <ctime>) to determine if the simulation has run for at least 5 seconds. At the beginning of each pass through the loop, a random number of customers is added, with random service times. Both the number of tellers and the queue contents are displayed so you can see the state of the system. After running each teller, the display is repeated. At this point, the system adapts by comparing the number of customers and the number of tellers. If the line is too long, another teller is added, and if it is short enough, a teller can be removed. In this adaptation section of the program you can experiment with policies regarding the optimal addition and removal of tellers. If this is the only section that you're modifying, you might want to encapsulate policies inside different objects.

We'll revisit this example in a multithreaded exercise in Chapter 11.


2.5.8. Priority queues

When you push( ) an object onto a priority_queue, that object is sorted into the queue according to a comparison function or function object. (You can allow the default less template to supply this, or you can provide one of your own.) The priority_queue ensures that when you look at the top( ) element, it will be the one with the highest priority. When you're done with it, you call pop( ) to remove it and bring the next one into place. Thus, the priority_queue has nearly the same interface as a stack, but it behaves differently.

Like stack and queue, priority_queue is an adaptor that is built on top of one of the basic sequences—the default sequence being vector.

It's trivial to make a priority_queue that works with ints:


//: C07:PriorityQueue1.cpp
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <queue>
using namespace std;

int main() {
  priority_queue<int> pqi;
  srand(time(0)); // Seed the random number generator
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
This pushes into the priority_queue 100 random values from 0 to 24. When you run this program you'll see that duplicates are allowed, and the highest values appear first. To show how you can change the ordering by providing your own function or function object, the following program gives lower-valued numbers the highest priority:


//: C07:PriorityQueue2.cpp
// Changing the priority.
#include <cstdlib>
#include <ctime>
#include <functional>
#include <iostream>
#include <queue>
using namespace std;

int main() {
  priority_queue<int, vector<int>,
greater<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
A more interesting problem is a to-do list, where each object contains a string and a primary and secondary priority value:


//: C07:PriorityQueue3.cpp
// A more complex use of priority_queue.
#include <iostream>
#include <queue>
#include <string>
using namespace std;

class ToDoItem {
  char primary;
  int secondary;
  string item;
public:
  ToDoItem(string td, char pri = 'A', int sec = 1)
  : primary(pri), secondary(sec), item(td) {}
  friend bool operator<(
    const ToDoItem& x, const ToDoItem& y) {
    if(x.primary > y.primary)
      return true;
    if(x.primary == y.primary)
      if(x.secondary > y.secondary)
        return true;
    return false;
  }
  friend ostream&
  operator<<(ostream& os, const ToDoItem&
td) {
    return os << td.primary << td.secondary
      << ": " << td.item;
  }
};

int main() {
  priority_queue<ToDoItem> toDoList;
  toDoList.push(ToDoItem("Empty trash", 'C',
4));
  toDoList.push(ToDoItem("Feed dog", 'A',
2));
  toDoList.push(ToDoItem("Feed bird", 'B',
7));
  toDoList.push(ToDoItem("Mow lawn", 'C',
3));
  toDoList.push(ToDoItem("Water lawn", 'A',
1));
  toDoList.push(ToDoItem("Feed cat", 'B',
1));
  while(!toDoList.empty()) {
    cout << toDoList.top() << endl;
    toDoList.pop();
  }
} ///:~
The ToDoItem's operator< must be a nonmember function for it to work with less< >. Other than that, everything happens automatically. The output is


A1: Water lawn
A2: Feed dog
B1: Feed cat
B7: Feed bird
C3: Mow lawn
C4: Empty trash
You cannot iterate through a priority_queue, but it's possible to simulate the behavior of a priority_queue using a vector, thus allowing you access to that vector. You can do this by looking at the implementation of priority_queue, which uses make_heap( ), push_heap( ), and pop_heap( ). (These are the soul of the priority_queue—in fact you could say that the heap is the priority queue and that priority_queue is just a wrapper around it.) This turns out to be reasonably straightforward, but you might think that a shortcut is possible. Since the container used by priority_queue is protected (and has the identifier, according to the Standard C++ specification, named c), you can inherit a new class that provides access to the underlying implementation:


//: C07:PriorityQueue4.cpp
// Manipulating the underlying implementation.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <queue>
using namespace std;

class PQI : public priority_queue<int> {
public:
  vector<int>& impl() { return c; }
};

int main() {
  PQI pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  copy(pqi.impl().begin(), pqi.impl().end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
However, if you run this program, you'll discover that the vector doesn't contain the items in the descending order that you get when you call pop( ), the order that you want from the priority queue. It would seem that if you want to create a vector that is a priority queue, you have to do it by hand, like this:


//: C07:PriorityQueue5.cpp
// Building your own priority queue.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <queue>
using namespace std;

template<class T, class Compare>
class PQV : public vector<T> {
  Compare comp;
public:
  PQV(Compare cmp = Compare()) : comp(cmp) {
    make_heap(this->begin(),this->end(), comp);
  }
  const T& top() { return this->front(); }
  void push(const T& x) {
    this->push_back(x);
    push_heap(this->begin(),this->end(), comp);
  }
  void pop() {
    pop_heap(this->begin(),this->end(), comp);
    this->pop_back();
  }
};

int main() {
  PQV< int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  copy(pqi.begin(), pqi.end(),
    ostream_iterator<int>(cout, " "));
  cout << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
But this program behaves in the same way as the previous one! What you are seeing in the underlying vector is called a heap. This heap data structure represents the tree of the priority queue (stored in the linear structure of the vector), but when you iterate through it, you do not get a linear priority-queue order. You might think that you can simply call sort_heap( ), but that only works once, and then you don't have a heap anymore, but instead a sorted list. This means that to go back to using it as a heap, the user must remember to call make_heap( ) first. This can be encapsulated into your custom priority queue:


//: C07:PriorityQueue6.cpp
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <queue>
using namespace std;

template<class T, class Compare>
class PQV : public vector<T> {
  Compare comp;
  bool sorted;
  void assureHeap() {
    if(sorted) {
      // Turn it back into a heap:
      make_heap(this->begin(),this->end(), comp);
      sorted = false;
    }
  }
public:
  PQV(Compare cmp = Compare()) : comp(cmp) {
    make_heap(this->begin(),this->end(), comp);
    sorted = false;
  }
  const T& top() {
    assureHeap();
    return this->front();
  }
  void push(const T& x) {
    assureHeap();
    this->push_back(x); // Put it at the end
    // Re-adjust the heap:
    push_heap(this->begin(),this->end(), comp);
  }
  void pop() {
    assureHeap();
    // Move the top element to the last position:
    pop_heap(this->begin(),this->end(), comp);
    this->pop_back();// Remove that element
  }
  void sort() {
    if(!sorted) {
      sort_heap(this->begin(),this->end(), comp);
      reverse(this->begin(),this->end());
      sorted = true;
    }
  }
};

int main() {
  PQV< int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++) {
    pqi.push(rand() % 25);
    copy(pqi.begin(), pqi.end(),
      ostream_iterator<int>(cout, "
"));
    cout << "\n-----” << endl;
  }
  pqi.sort();
  copy(pqi.begin(), pqi.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----” << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
If sorted is true, the vector is not organized as a heap but instead as a sorted sequence. The assureHeap( ) function guarantees that it's put back into heap form before performing any heap operations on it. The first for loop in main( ) now has the additional quality that it displays the heap as it is being built.

In the previous two programs we had to introduce a seemingly extraneous usage of the “this->” prefix. Although some compilers do not require it, the standard definition of C++ does. Note that the class PQV derives from vector<T>, therefore begin( ) and end( ), inherited from vector<T>, are dependent names.(110) Compilers can't look up names from dependent base classes in the definition of a template (vector, in this case) because for a given instantiation an explicitly specialized version of the template might be used that does not have a given member. The special naming requirement guarantees that you won't end up calling a base class member in some cases and possibly a function from an enclosing scope (such as a global one) in other cases. The compiler has no way of knowing that a call to begin( ) is dependent, so we must give it a clue with a “this‑>” qualification.(111) This tells the compiler that begin( ) is in the scope of PQV, so it waits until an instance of PQV is fully instantiated. If this qualifying prefix is left out, the compiler will attempt an early lookup for the names begin and end (at template definition time, and will fail to find them because there are no such names declared in enclosing lexical scopes in this example). In the code above, however, the compiler waits until the point of instantiation of pqi, and then finds the correct specializations of begin( ) and end( ) in vector<int>.

The only drawback to this solution is that the user must remember to call sort( ) before viewing it as a sorted sequence (although one could conceivably redefine all the member functions that produce iterators so that they guarantee sorting). Another solution is to create a priority queue that is not a vector, but will build you a vector whenever you want one:


//: C07:PriorityQueue7.cpp
// A priority queue that will hand you a vector.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <queue>
#include <vector>
using namespace std;

template<class T, class Compare> class PQV {
  vector<T> v;
  Compare comp;
public:
  // Don't need to call make_heap(); it's empty:
  PQV(Compare cmp = Compare()) : comp(cmp) {}
  void push(const T& x) {
    v.push_back(x); // Put it at the end
    // Re-adjust the heap:
    push_heap(v.begin(), v.end(), comp);
  }
  void pop() {
    // Move the top element to the last position:
    pop_heap(v.begin(), v.end(), comp);
    v.pop_back(); // Remove that element
  }
  const T& top() { return v.front(); }
  bool empty() const { return v.empty(); }
  int size() const { return v.size(); }
  typedef vector<T> TVec;
  TVec getVector() {
    TVec r(v.begin(), v.end());
    // It's already a heap
    sort_heap(r.begin(), r.end(), comp);
    // Put it into priority-queue order:
    reverse(r.begin(), r.end());
    return r;
  }
};

int main() {
  PQV<int, less<int> > pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  const vector<int>& v = pqi.getVector();
  copy(v.begin(), v.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----------” << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
The PQV class template follows the same form as the STL's priority_queue, but has the additional member getVector( ), which creates a new vector that's a copy of the one in PQV (which means that it's already a heap). It then sorts that copy (leaving PQV's vector untouched), and reverses the order so that traversing the new vector produces the same effect as popping the elements from the priority queue.

You may observe that the approach of deriving from priority_queue used in PriorityQueue4.cpp could be used with the above technique to produce more succinct code:


//: C07:PriorityQueue8.cpp
// A more compact version of PriorityQueue7.cpp.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <queue>
using namespace std;

template<class T> class PQV : public
priority_queue<T> {
public:
  typedef vector<T> TVec;
  TVec getVector() {
    TVec r(this->c.begin(),this->c.end());
    // c is already a heap
    sort_heap(r.begin(), r.end(), this->comp);
    // Put it into priority-queue order:
    reverse(r.begin(), r.end());
    return r;
  }
};

int main() {
  PQV<int> pqi;
  srand(time(0));
  for(int i = 0; i < 100; i++)
    pqi.push(rand() % 25);
  const vector<int>& v = pqi.getVector();
  copy(v.begin(), v.end(),
    ostream_iterator<int>(cout, " "));
  cout << "\n-----------” << endl;
  while(!pqi.empty()) {
    cout << pqi.top() << ' ';
    pqi.pop();
  }
} ///:~
The brevity of this solution makes it the simplest and most desirable, plus it's guaranteed that the user will not have a vector in the unsorted state. The only potential problem is that the getVector( ) member function returns the vector<T> by value, which might cause some overhead issues with complex values of the parameter type T.


2.5.9. Holding bits

Because C is a language that purports to be “close to the hardware,” many have found it dismaying that there is no native binary representation for numbers. Decimal, of course, and hexadecimal (tolerable only because it's easier to group the bits in your mind), but octal? Ugh. Whenever you read specs for chips you're trying to program, they don't describe the chip registers in octal or even hexadecimal—they use binary. And yet C won't let you say 0b0101101, which is the obvious solution for a language close to the hardware.

Although there's still no native binary representation in C++, things have improved with the addition of two classes: bitset and vector<bool>, both of which are designed to manipulate a group of on-off values.(112) The primary differences between these types are:

  • Each bitset holds a fixed number of bits. You establish the quantity of bits in the bitset template argument. The vector<bool> can, like a regular vector, expand dynamically to hold any number of bool values.
  • The bitset template is explicitly designed for performance when manipulating bits, and is not a “regular” STL container. As such, it has no iterators. The number of bits, being a template parameter, is known at compile time and allows the underlying integral array to be stored on the runtime stack. The vector<bool> container, on the other hand, is a specialization of a vector and so has all the operations of a normal vector—the specialization is just designed to be space efficient for bool.
There is no trivial conversion between a bitset and a vector<bool>, which implies that the two are for very different purposes. Furthermore, neither is a traditional “STL container.” The bitset template class has an interface for bit-level operations and in no way resembles the STL containers we've discussed up to this point. The vector<bool> specialization of vector is similar to an STL-like container, but it differs as discussed below.


2.5.9.1. bitset<n>

The template for bitset accepts an unsigned integral template argument that is the number of bits to represent. Thus, bitset<10> is a different type than bitset<20>, and you cannot perform comparisons, assignments, and so on between the two.

A bitset provides the most commonly used bitwise operations in an efficient form. However, each bitset is implemented by logically packing bits in an array of integral types (typically unsigned longs, which contain at least 32 bits). In addition, the only conversion from a bitset to a numerical value is to an unsigned long (via the function to_ulong( )).

The following example tests almost all the functionality of the bitset (the missing operations are redundant or trivial).You'll see the description of each of the bitset outputs to the right of the output so that the bits all line up and you can compare them to the source values. If you still don't understand bitwise operations, running this program should help.


//: C07:BitSet.cpp {-bor}
// Exercising the bitset class.
#include <bitset>
#include <climits>
#include <cstdlib>
#include <ctime>
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

const int SZ = 32;
typedef bitset<SZ> BS;

template<int bits> bitset<bits>
randBitset() {
  bitset<bits> r(rand());
  for(int i = 0; i < bits/16 - 1; i++) {
    r <<= 16;
    // "OR" together with a new lower 16
bits:
    r |= bitset<bits>(rand());
  }
  return r;
}

int main() {
  srand(time(0));
  cout << "sizeof(bitset<16>) = "
       << sizeof(bitset<16>) << endl;
  cout << "sizeof(bitset<32>) = "
       << sizeof(bitset<32>) << endl;
  cout << "sizeof(bitset<48>) = "
       << sizeof(bitset<48>) << endl;
  cout << "sizeof(bitset<64>) = "
       << sizeof(bitset<64>) << endl;
  cout << "sizeof(bitset<65>) = "
       << sizeof(bitset<65>) << endl;
  BS a(randBitset<SZ>()), b(randBitset<SZ>());
  // Converting from a bitset:
  unsigned long ul = a.to_ulong();
  cout << a << endl;
  // Converting a string to a bitset:
  string cbits("111011010110111");
  cout << "as a string = " <<
cbits <<endl;
  cout << BS(cbits) << "
[BS(cbits)]" << endl;
  cout << BS(cbits, 2) << " [BS(cbits,
2)]" << endl;
  cout << BS(cbits, 2, 11) << " [BS(cbits,
2, 11)]"<< endl;
  cout << a << " [a]" <<
endl;
  cout << b << " [b]" <<
endl;
  // Bitwise AND:
  cout << (a & b) << " [a &
b]" << endl;
  cout << (BS(a) &= b) << " [a
&= b]" << endl;
  // Bitwise OR:
  cout << (a | b) << " [a | b]"
<< endl;
  cout << (BS(a) |= b) << " [a |=
b]" << endl;
  // Exclusive OR:
  cout << (a ^ b) << " [a ^ b]"
<< endl;
  cout << (BS(a) ^= b) << " [a ^=
b]" << endl;
  cout << a << " [a]" <<
endl; // For reference
  // Logical left shift (fill with zeros):
  cout << (BS(a) <<= SZ/2) << "
[a <<= (SZ/2)]" << endl;
  cout << (a << SZ/2) << endl;
  cout << a << " [a]" <<
endl; // For reference
  // Logical right shift (fill with zeros):
  cout << (BS(a) >>= SZ/2) << "
[a >>= (SZ/2)]" << endl;
  cout << (a >> SZ/2) << endl;
  cout << a << " [a]" <<
endl; // For reference
  cout << BS(a).set() << "
[a.set()]" << endl;
  for(int i = 0; i < SZ; i++)
    if(!a.test(i)) {
      cout << BS(a).set(i)
           << " [a.set(" << i
<<")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).reset() << "
[a.reset()]"<< endl;
  for(int j = 0; j < SZ; j++)
    if(a.test(j)) {
      cout << BS(a).reset(j)
           << " [a.reset(" << j
<<")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).flip() << "
[a.flip()]" << endl;
  cout << ~a << " [~a]" <<
endl;
  cout << a << " [a]" <<
endl; // For reference
  cout << BS(a).flip(1) << "
[a.flip(1)]"<< endl;
  BS c;
  cout << c << " [c]" <<
endl;
  cout << "c.count() = " <<
c.count() << endl;
  cout << "c.any() = "
       << (c.any() ? "true" :
"false") << endl;
  cout << "c.none() = "
       << (c.none() ? "true" :
"false") << endl;
  c[1].flip(); c[2].flip();
  cout << c << " [c]" <<
endl;
  cout << "c.count() = " <<
c.count() << endl;
  cout << "c.any() = "
       << (c.any() ? "true" :
"false") << endl;
  cout << "c.none() = "
       << (c.none() ? "true" :
"false") << endl;
  // Array indexing operations:
  c.reset();
  for(size_t k = 0; k < c.size(); k++)
    if(k % 2 == 0)
      c[k].flip();
  cout << c << " [c]" <<
endl;
  c.reset();
  // Assignment to bool:
  for(size_t ii = 0; ii < c.size(); ii++)
    c[ii] = (rand() % 100) < 25;
  cout << c << " [c]" <<
endl;
  // bool test:
  if(c[1])
    cout << "c[1] == true";
  else
    cout << "c[1] == false" <<
endl;
} ///:~
To generate interesting random bitsets, the randBitset( ) function is created. This function demonstrates operator<<= by shifting each 16 random bits to the left until the bitset (which is templatized in this function for size) is full. The generated number and each new 16 bits are combined using the operator|=.

The main( ) function first shows the unit size of a bitset. If it is less than 32 bits, sizeof produces 4 (4 bytes = 32 bits), which is the size of a single long on most implementations. If it's between 32 and 64, it requires two longs, greater than 64 requires 3 longs, and so on. Thus, you make the best use of space if you use a bit quantity that fits in an integral number of longs. However, notice there's no extra overhead for the object—it's as if you were hand-coding for a long.

Although there are no other numerical conversions from bitset besides to_ulong( ), there is a stream inserter that produces a string containing ones and zeros, and this can be as long as the actual bitset.

There's still no primitive format for binary values, but bitset supports the next best thing: a string of ones and zeros with the least-significant bit (lsb) on the right. The three constructors demonstrated take the entire string, the string starting at character 2, and the string from character 2 through 11. You can write to an ostream from a bitset using operator<<,and it comes out as ones and zeros. You can also read from an istream using operator>> (not shown here).

You'll notice that bitset only has three nonmember operators: and (&), or (|), and exclusive-or (^). Each of these creates a new bitset as its return value. All the member operators opt for the more efficient &=, |=, and so on, where a temporary is not created. However, these forms change the bitset's value (which is a in most of the tests in the above example). To prevent this, we created a temporary to be used as the lvalue by invoking the copy-constructor on a; this is why you see the form BS(a). The result of each test is displayed, and occasionally a is reprinted so you can easily look at it for reference.

The rest of the example should be self-explanatory when you run it; if not you can find the details in your compiler's documentation or in the other documentation mentioned earlier in this chapter.


2.5.9.2. vector<bool>

The vector<bool> container is a specialization of the vector template. A normal bool variable requires at least one byte, but since a bool only has two states, the ideal implementation of vector<bool> is such that each bool value only requires one bit. Since typical library implementations pack the bits into integral arrays, the iterator must be specially defined and cannot be a pointer to bool.

The bit-manipulation functions for vector<bool> are much more limited than those of bitset. The only member function that was added to those already in vector is flip( ), to invert all the bits. There is no set( ) or reset( ) as in bitset. When you use operator[ ], you get back an object of type vector<bool>::reference, which also has a flip( ) to invert that individual bit.


//: C07:VectorOfBool.cpp
// Demonstrate the vector<bool> specialization.
#include <bitset>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>
using namespace std;

int main() {
  vector<bool> vb(10, true);
  vector<bool>::iterator it;
  for(it = vb.begin(); it != vb.end(); it++)
    cout << *it;
  cout << endl;
  vb.push_back(false);
  ostream_iterator<bool> out(cout, "");
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  bool ab[] = { true, false, false, true, true,
    true, true, false, false, true };
  // There's a similar constructor:
  vb.assign(ab, ab + sizeof(ab)/sizeof(bool));
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  vb.flip(); // Flip all bits
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  for(size_t i = 0; i < vb.size(); i++)
    vb[i] = 0; // (Equivalent to "false")
  vb[4] = true;
  vb[5] = 1;
  vb[7].flip(); // Invert one bit
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  // Convert to a bitset:
  ostringstream os;
  copy(vb.begin(), vb.end(),
    ostream_iterator<bool>(os, ""));
  bitset<10> bs(os.str());
  cout << "Bitset:” << endl <<
bs << endl;
} ///:~
The last part of this example takes a vector<bool> and converts it to a bitset by first turning it into a string of ones and zeros. Here, you must know the size of the bitset at compile time. You can see that this conversion is not the kind of operation you'll want to do on a regular basis.

The vector<bool> specialization is a “crippled” STL container in the sense that certain guarantees that other containers provide are missing. For example, with the other containers the following relationships hold:


// Let c be an STL container other than
vector<bool>:
T& r = c.front();
T* p = &*c.begin();
For all other containers, the front( ) function yields an lvalue (something you can get a non-const reference to), and begin( ) must yield something you can dereference and then take the address of. Neither is possible because bits are not addressable. Both vector<bool> and bitset use a proxy class (the nested reference class, mentioned earlier) to read and set bits as necessary.


2.5.10. Associative containers

The set, map, multiset, and multimap are called associative containers because they associate keys with values. Well, at least maps and multimaps associate keys with values, but you can look at a set as a map that has no values, only keys (and they can in fact be implemented this way), and the same for the relationship between multiset and multimap. So, because of the structural similarity, sets and multisets are lumped in with associative containers.

The most important basic operations with associative containers are putting things in and, in the case of a set, seeing if something is in the set. In the case of a map, you want to first see if a key is in the map, and if it exists, you want the associated value for that key. There are many variations on this theme, but that's the fundamental concept. The following example shows these basics:


//: C07:AssociativeBasics.cpp {-bor}
// Basic operations with sets and maps.
//{L} Noisy
#include <cstddef>
#include <iostream>
#include <iterator>
#include <map>
#include <set>
#include "Noisy.h"
using namespace std;

int main() {
  Noisy na[7];
  // Add elements via constructor:
  set<Noisy> ns(na, na + sizeof
na/sizeof(Noisy));
  Noisy n;
  ns.insert(n); // Ordinary insertion
  cout << endl;
  // Check for set membership:
  cout << "ns.count(n)= " <<
ns.count(n) << endl;
  if(ns.find(n) != ns.end())
    cout << "n(" << n <<
") found in ns" << endl;
  // Print elements:
  copy(ns.begin(), ns.end(),
    ostream_iterator<Noisy>(cout, "
"));
  cout << endl;
  cout << "\n-----------” << endl;
  map<int, Noisy> nm;
  for(int i = 0; i < 10; i++)
    nm[i]; // Automatically makes pairs
  cout << "\n-----------” << endl;
  for(size_t j = 0; j < nm.size(); j++)
    cout << "nm[" << j
<<"] = " << nm[j] << endl;
  cout << "\n-----------” << endl;
  nm[10] = n;
  cout << "\n-----------” << endl;
  nm.insert(make_pair(47, n));
  cout << "\n-----------” << endl;
  cout << "\n nm.count(10)= " <<
nm.count(10) << endl;
  cout << "nm.count(11)= " <<
nm.count(11) << endl;
  map<int, Noisy>::iterator it = nm.find(6);
  if(it != nm.end())
    cout << "value:" <<
(*it).second
         << " found in nm at location
6" << endl;
  for(it = nm.begin(); it != nm.end(); it++)
    cout << (*it).first << ":"
<< (*it).second << ", ";
  cout << "\n-----------” << endl;
} ///:~
The set<Noisy> object ns is created using two iterators into an array of Noisy objects, but there is also a default constructor and a copy-constructor, and you can pass in an object that provides an alternate scheme for doing comparisons. Both sets and maps have an insert( ) member function to put things in, and you can check to see if an object is already in an associative container in two ways. The count( ) member function, when given a key, will tell you how many times that key occurs. (This can only be zero or one in a set or map, but it can be more than one with a multiset or multimap.) The find( ) member function will produce an iterator indicating the first occurrence (with set and map, the only occurrence) of the key that you give it or will produce the past-the-end iterator if it can't find the key. The count( ) and find( ) member functions exist for all the associative containers, which makes sense. The associative containers also have member functions lower_bound( ), upper_bound( ), and equal_range( ), which only make sense for multiset and multimap, as you will see. (But don't try to figure out how they would be useful for set and map, since they are designed for dealing with a range of duplicate keys, which those containers don't allow.)

Designing an operator[ ] always presents a bit of a dilemma. Because it's intended to be treated as an array-indexing operation, people don't tend to think about performing a test before they use it. But what happens if you decide to index out of the bounds of the array? One option is to throw an exception, but with a map, “indexing out of the array” could mean that you want to create a new entry at that location, and that's the way the STL map treats it. The first for loop after the creation of the map<int, Noisy> nm just “looks up” objects using the operator[ ], but this is actually creating new Noisy objects! The map creates a new key-value pair (using the default constructor for the value) if you look up a value with operator[ ] and it isn't there. This means that if you really just want to look something up and not create a new entry, you must use the member functions count( ) (to see if it's there) or find( ) (to get an iterator to it).

A number of problems are associated with the for loop that prints the values of the container using operator[ ]. First, it requires integral keys (which we happen to have here). Next and worse, if all the keys are not sequential, you'll end up counting from zero to the size of the container, and if some spots don't have key-value pairs, you'll automatically create them and miss some of the higher values of the keys. Finally, if you look at the output from the for loop, you'll see that things are very busy, and it's quite puzzling at first why there are so many constructions and destructions for what appears to be a simple lookup. The answer only becomes clear when you look at the code in the map template for operator[ ], which will be something like this:


mapped_type& operator[] (const key_type& k) {
  value_type tmp(k,T());
  return (*((insert(tmp)).first)).second;
}
The map::insert( ) function takes a key-value pair and does nothing if there is already an entry in the map with the given key—otherwise it inserts an entry for the key. In either case, it returns a new key-value pair holding an iterator to the inserted pair as its first element and holding true as the second element if an insertion took place. The members first and second give the key and value, respectively, because map::value_type is really just a typedef for a std::pair:


typedef pair<const Key, T> value_type;
You've seen the std::pair template before. It's a simple holder for two values of independent types, as you can see by its definition:


template<class T1, class T2> struct pair {
  typedef T1 first_type;
  typedef T2 second_type;
  T1 first;
  T2 second;
  pair();
  pair(const T1& x, const T2& y) : first(x),
second(y) {}
  // Templatized copy-constructor:
  template<class U, class V> pair(const
pair<U, V> &p);
};
The pair template class is very useful, especially when you want to return two objects from a function (since a return statement only takes one object). There's even a shorthand for creating a pair called make_pair( ), which is used in AssociativeBasics.cpp.

So to retrace the steps, map::value_type is a pair of the key and the value of the map—actually, it's a single entry for the map. But notice that pair packages its objects by value, which means that copy-constructions are necessary to get the objects into the pair. Thus, the creation of tmp in map::operator[ ] will involve at least a copy-constructor call and destructor call for each object in the pair. Here, we're getting off easy because the key is an int. But if you want to really see what kind of activity can result from map::operator[ ], try running this:


//: C07:NoisyMap.cpp
// Mapping Noisy to Noisy.
//{L} Noisy
#include <map>
#include "Noisy.h"
using namespace std;

int main() {
  map<Noisy, Noisy> mnn;
  Noisy n1, n2;
  cout << "\n--------” << endl;
  mnn[n1] = n2;
  cout << "\n--------” << endl;
  cout << mnn[n1] << endl;
  cout << "\n--------” << endl;
} ///:~
You'll see that both the insertion and lookup generate a lot of extra objects, and that's because of the creation of the tmp object. If you look back up at map::operator[ ], you'll see that the second line calls insert( ), passing it tmp—that is, operator[ ] does an insertion every time. The return value of insert( ) is a different kind of pair, where first is an iterator pointing to the key-value pair that was just inserted, and second is a bool indicating whether the insertion took place. You can see that operator[ ] grabs first (the iterator), dereferences it to produce the pair, and then returns the second, which is the value at that location.

So on the upside, map has this fancy “make a new entry if one isn't there” behavior, but the downside is that you always get a lot of extra object creations and destructions when you use map::operator[ ]. Fortunately, AssociativeBasics.cpp also demonstrates how to reduce the overhead of insertions and deletions, by avoiding operator[ ] if you don't need it. The insert( ) member function is slightly more efficient than operator[ ]. With a set, you hold only one object, but with a map, you hold key-value pairs; so insert( ) requires a pair as its argument. Here's where make_pair( ) comes in handy, as you can see.

For looking objects up in a map, you can use count( ) to see whether a key is in the map, or you can use find( ) to produce an iterator pointing directly at the key-value pair. Again, since the map contains pairs, that's what the iterator produces when you dereference it, so you have to select first and second. When you run AssociativeBasics.cpp, you'll notice that the iterator approach involves no extra object creations or destructions. It's not as easy to write or read, though.


2.5.10.1. Generators and fillers for associative containers

You've seen how useful the fill( ), fill_n( ), generate( ), and generate_n( ) function templates in <algorithm> have been for filling the sequential containers (vector, list, and deque) with data. However, these are implemented by using operator= to assign values into the sequential containers, and the way that you add objects to associative containers is with their respective insert( ) member functions. Thus, the default “assignment” behavior causes a problem when trying to use the “fill” and “generate” functions with associative containers.

One solution is to duplicate the “fill” and “generate” functions, creating new ones that can be used with associative containers. It turns out that only the fill_n( ) and generate_n( ) functions can be duplicated (fill( ) and generate( ) copy sequences, which doesn't make sense with associative containers), but the job is fairly easy, since you have the <algorithm> header file to work from:


//: C07:assocGen.h
// The fill_n() and generate_n() equivalents
// for associative containers.
#ifndef ASSOCGEN_H
#define ASSOCGEN_H

template<class Assoc, class Count, class T>
void assocFill_n(Assoc& a, Count n, const T&
val) {
  while(n-- > 0)
    a.insert(val);
}

template<class Assoc, class Count, class Gen>
void assocGen_n(Assoc& a, Count n, Gen g) {
  while(n-- > 0)
    a.insert(g());
}
#endif // ASSOCGEN_H ///:~
You can see that instead of using iterators, the container class itself is passed (by reference, of course).

This code demonstrates two valuable lessons. The first is that if the algorithms don't do what you want, copy the nearest thing and modify it. You have the example at hand in the STL header, so most of the work has already been done.

The second lesson is more pointed: if you look long enough, there's probably a way to do it in the STL without inventing anything new. The present problem can instead be solved by using an insert_iterator (produced by a call to inserter( )), which calls insert( ) to place items in the container instead of operator=. This is not simply a variation of front_insert_iterator or back_insert_iterator because those iterators use push_front( ) and push_back( ), respectively. Each of the insert iterators is different by virtue of the member function it uses for insertion, and insert( ) is the one we need. Here's a demonstration that shows filling and generating both a map and a set. (It can also be used with multimap and multiset.) First, some templatized generators are created. (This may seem like overkill, but you never know when you'll need them. For that reason they're placed in a header file.)


//: C07:SimpleGenerators.h
// Generic generators, including one that creates pairs.
#include <iostream>
#include <utility>

// A generator that increments its value:
template<typename T> class IncrGen {
  T i;
public:
  IncrGen(T ii) : i(ii) {}
  T operator()() { return i++; }
};

// A generator that produces an STL pair<>:
template<typename T1, typename T2> class PairGen
{
  T1 i;
  T2 j;
public:
  PairGen(T1 ii, T2 jj) : i(ii), j(jj) {}
  std::pair<T1,T2> operator()() {
    return std::pair<T1,T2>(i++, j++);
  }
};

namespace std {
// A generic global operator<< for printing any
STL pair<>:
template<typename F, typename S> ostream&
operator<<(ostream& os, const
pair<F,S>& p) {
  return os << p.first << "\t"
<< p.second << endl;
}
} ///:~
Both generators expect that T can be incremented, and they simply use operator++ to generate new values from whatever you used for initialization. PairGen creates an STL pair object as its return value, and that's what can be placed into a map or multimap using insert( ).

The last function is a generalization of operator<< for ostreams, so that any pair can be printed, assuming each element of the pair supports a stream operator<<. (It is in namespace std for the strange name lookup reasons discussed in Chapter 5, and explained once again after Thesaurus.cpp later on in this chapter.) As you can see in the following, this allows the use of copy( ) to output the map:


//: C07:AssocInserter.cpp
// Using an insert_iterator so fill_n() and generate_n()
// can be used with associative containers.
#include <iterator>
#include <iostream>
#include <algorithm>
#include <set>
#include <map>
#include "SimpleGenerators.h"
using namespace std;

int main() {
  set<int> s;
  fill_n(inserter(s, s.begin()), 10, 47);
  generate_n(inserter(s, s.begin()), 10,
    IncrGen<int>(12));
  copy(s.begin(), s.end(),
    ostream_iterator<int>(cout, "\n"));
  map<int, int> m;
  fill_n(inserter(m, m.begin()), 10,
make_pair(90,120));
  generate_n(inserter(m, m.begin()), 10,
    PairGen<int, int>(3, 9));
  copy(m.begin(), m.end(),
    ostream_iterator<pair<int,int>
>(cout,"\n"));
} ///:~
The second argument to inserter is an iterator, which is an optimization hint to help the insertion go faster (instead of always starting the search at the root of the underlying tree). Since an insert_iterator can be used with many different types of containers, with non-associative containers it is more than a hint—it is required.

Note how the ostream_iterator is created to output a pair. This won't work if the operator<< isn't created. Since it's a template, it is automatically instantiated for pair<int, int>.


2.5.10.2. The magic of maps

An ordinary array uses an integral value to index into a sequential set of elements of some type. A map is an associative array, which means you associate one object with another in an array-like fashion. Instead of selecting an array element with a number as you do with an ordinary array, you look it up with an object! The example that follows counts the words in a text file, so the index is the string object representing the word, and the value being looked up is the object that keeps count of the strings.

In a single-item container such as a vector or a list, only one thing is being held. But in a map, you've got two things: the key (what you look up by, as in mapname[key]) and the value that results from the lookup with the key. If you simply want to move through the entire map and list each key-value pair, you use an iterator, which when dereferenced produces a pair object containing both the key and the value. You access the members of a pair by selecting first or second.

This same philosophy of packaging two items together is also used to insert elements into the map, but the pair is created as part of the instantiated map and is called value_type, containing the key and the value. So one option for inserting a new element is to create a value_type object, loading it with the appropriate objects and then calling the insert( ) member function for the map. Instead, the following example uses the aforementioned special feature of map: if you're trying to find an object by passing in a key to operator[ ] and that object doesn't exist, operator[ ] will automatically insert a new key-value pair for you, using the default constructor for the value object. With that in mind, consider an implementation of a word-counting program:


//: C07:WordCount.cpp
// Count occurrences of words using a map.
#include <iostream>
#include <fstream>
#include <map>
#include <string>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  typedef map<string, int> WordMap;
  typedef WordMap::iterator WMIter;
  const char* fname = "WordCount.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  WordMap wordmap;
  string word;
  while(in >> word)
    wordmap[word]++;
  for(WMIter w = wordmap.begin(); w != wordmap.end();
w++)
    cout << w->first << ": "
<< w->second << endl;
} ///:~
This example shows the power of zero-initialization. Consider this line of code from the program:


wordmap[word]++;
This expression increments the int associated with word. If there isn't such a word in the map, a key-value pair for the word is automatically inserted, with the value initialized to zero by a call to the pseudo-constructor int( ), which returns a 0.

Printing the entire list requires traversing it with an iterator. (There's no copy( ) shortcut for a map unless you want to write an operator<< for the pair in the map.) As previously mentioned, dereferencing this iterator produces a pair object, with the first member the key and the second member the value.

If you want to find the count for a particular word, you can use the array index operator, like this:


cout << "the: " << wordmap["the"] <<
endl;
You can see that one of the great advantages of the map is the clarity of the syntax; an associative array makes intuitive sense to the reader. (Note, however, that if “the” isn't already in the wordmap, a new entry will be created!)


2.5.10.3. Multimaps and duplicate keys

A multimap is a map that can contain duplicate keys. At first this may seem like a strange idea, but it can occur surprisingly often. A phone book, for example, can have many entries with the same name.

Suppose you are monitoring wildlife, and you want to keep track of where and when each type of animal is spotted. Thus, you may see many animals of the same kind, all in different locations and at different times. So if the type of animal is the key, you'll need a multimap. Here's what it looks like:


//: C07:WildLifeMonitor.cpp
#include <algorithm>
#include <cstdlib>
#include <cstddef>
#include <ctime>
#include <iostream>
#include <iterator>
#include <map>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

class DataPoint {
  int x, y; // Location coordinates
  time_t time; // Time of Sighting
public:
  DataPoint() : x(0), y(0), time(0) {}
  DataPoint(int xx, int yy, time_t tm) :
    x(xx), y(yy), time(tm) {}
  // Synthesized operator=, copy-constructor OK
  int getX() const { return x; }
  int getY() const { return y; }
  const time_t* getTime() const { return &time; }
};

string animal[] = {
  "chipmunk", "beaver",
"marmot", "weasel",
  "squirrel", "ptarmigan",
"bear", "eagle",
  "hawk", "vole", "deer",
"otter", "hummingbird",
};
const int ASZ = sizeof animal/sizeof *animal;
vector<string> animals(animal, animal + ASZ);

// All the information is contained in a
// "Sighting," which can be sent to an
ostream:
typedef pair<string, DataPoint> Sighting;

ostream&
operator<<(ostream& os, const Sighting&
s) {
  return os << s.first << " sighted at
x= "
    << s.second.getX() << ", y= "
<< s.second.getY()
    << ", time = " <<
ctime(s.second.getTime());
}

// A generator for Sightings:
class SightingGen {
  vector<string>& animals;
  enum { D = 100 };
public:
  SightingGen(vector<string>& an) :
animals(an) {}
  Sighting operator()() {
    Sighting result;
    int select = rand() % animals.size();
    result.first = animals[select];
    result.second = DataPoint(
      rand() % D, rand() % D, time(0));
    return result;
  }
};

// Display a menu of animals, allow the user to
// select one, return the index value:
int menu() {
  cout << "select an animal or 'q' to quit:
";
  for(size_t i = 0; i < animals.size(); i++)
    cout <<'['<< i <<']'<<
animals[i] << ' ';
  cout << endl;
  string reply;
  cin >> reply;
  if(reply.at(0) == 'q') return 0;
  istringstream r(reply);
  int i;
  r >> i; // Converts to int
  i %= animals.size();
  return i;
}

int main() {
  typedef multimap<string, DataPoint> DataMap;
  typedef DataMap::iterator DMIter;
  DataMap sightings;
  srand(time(0));  // Randomize
  generate_n(inserter(sightings, sightings.begin()),
    50, SightingGen(animals));
  // Print everything:
  copy(sightings.begin(), sightings.end(),
    ostream_iterator<Sighting>(cout,
""));
  // Print sightings for selected animal:
  for(int count = 1; count < 10; count++) {
    // Use menu to get selection:
    // int i = menu();
    // Generate randomly (for automated testing):
    int i = rand() % animals.size();
    // Iterators in "range" denote begin, one
    // past end of matching range:
    pair<DMIter, DMIter> range =
      sightings.equal_range(animals[i]);
    copy(range.first, range.second,
      ostream_iterator<Sighting>(cout,
""));
  }
} ///:~
All the data about a sighting is encapsulated into the class DataPoint, which is simple enough that it can rely on the synthesized assignment and copy-constructor. It uses the Standard C library time functions to record the time of the sighting.

In the array of string, animal, notice that the char* constructor is automatically used during initialization, which makes initializing an array of string quite convenient. Since it's easier to use the animal names in a vector, the length of the array is calculated, and a vector<string> is initialized using the vector(iterator, iterator) constructor.

The key-value pairs that make up a Sighting are the string, which names the type of animal, and the DataPoint, which says where and when it was sighted. The standard pair template combines these two types and is typedefed to produce the Sighting type. Then an ostream operator<< is created for Sighting; this will allow you to iterate through a map or multimap of Sightingsand display it.

SightingGen generates random sightings at random data points to use for testing. It has the usual operator( ) necessary for a function object, but it also has a constructor to capture and store a reference to a vector<string>, which is where the aforementioned animal names are stored.

A DataMap is a multimap of string-DataPoint pairs, which means it stores Sightings. It is filled with 50 Sightings using generate_n( ) and displayed. (Notice that because there is an operator<< that takes a Sighting, an ostream_iterator can be created.) At this point the user is asked to select the animal for which they want to see all the sightings. If you press q, the program will quit, but if you select an animal number, the equal_range( ) member function is invoked. This returns an iterator (DMIter) to the beginning of the set of matching pairs and an iterator indicating past-the-end of the set. Since only one object can be returned from a function, equal_range( ) makes use of pair. Since the range pair has the beginning and ending iterators of the matching set, those iterators can be used in copy( ) to print all the sightings for a particular type of animal.


2.5.10.4. Multisets

You've seen the set, which allows only one object of each value to be inserted. The multiset is odd by comparison since it allows more than one object of each value to be inserted. This seems to go against the whole idea of “setness,” where you can ask, “Is ‘it' in this set?” If there can be more than one “it,” what does that question mean?

With some thought, you can see that it makes little sense to have more than one object of the same value in a set if those duplicate objects are exactly the same (with the possible exception of counting occurrences of objects, but as seen earlier in this chapter that can be handled in an alternative, more elegant fashion). Thus, each duplicate object will have something that makes it “different” from the other duplicates—most likely different state information that is not used in the calculation of the key during the comparison. That is, to the comparison operation, the objects look the same, but they contain some differing internal state.

Like any STL container that must order its elements, the multiset template uses the less function object by default to determine element ordering. This uses the contained class's operator<, but you can always substitute your own comparison function.

Consider a simple class that contains one element that is used in the comparison and another that is not:


//: C07:MultiSet1.cpp
// Demonstration of multiset behavior.
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>
using namespace std;

class X {
  char c; // Used in comparison
  int i; // Not used in comparison
  // Don't need default constructor and operator=
  X();
  X& operator=(const X&);
  // Usually need a copy-constructor (but the
  // synthesized version works here)
public:
  X(char cc, int ii) : c(cc), i(ii) {}
  // Notice no operator== is
required
  friend bool operator<(const X& x, const X&
y) {
    return x.c < y.c;
  }
  friend ostream& operator<<(ostream& os,
X x) {
    return os << x.c << ":"
<< x.i;
  }
};

class Xgen {
  static int i;
  // Number of characters to select from:
  enum { SPAN = 6 };
public:
  X operator()() {
    char c = 'A' + rand() % SPAN;
    return X(c, i++);
  }
};

int Xgen::i = 0;

typedef multiset<X> Xmset;
typedef Xmset::const_iterator Xmit;

int main() {
  Xmset mset;
  // Fill it with X's:
  srand(time(0));  // Randomize
  generate_n(inserter(mset, mset.begin()), 25, Xgen());
  // Initialize a regular set from mset:
  set<X> unique(mset.begin(), mset.end());
  copy(unique.begin(), unique.end(),
    ostream_iterator<X>(cout, " "));
  cout << "\n----” << endl;
  // Iterate over the unique values:
  for(set<X>::iterator i = unique.begin();
      i != unique.end(); i++) {
    pair<Xmit, Xmit> p = mset.equal_range(*i);
    copy(p.first,p.second,
ostream_iterator<X>(cout, " "));
    cout << endl;
  }
} ///:~
In X, all the comparisons are made with the char c. The comparison is performed with operator<, which is all that is necessary for the multiset, since in this example the default less comparison object is used. The class Xgen randomly generates X objects, but the comparison value is restricted to the span from ‘A' to ‘E'. In main( ), a multiset<X> is created and filled with 25 X objects using Xgen, guaranteeing that there will be duplicate keys. So that we know what the unique values are, a regular set<X> is created from the multiset (using the iterator, iterator constructor). These values are displayed, and then each one produces the equal_range( ) in the multiset (equal_range( ) has the same meaning here as it does with multimap: all the elements with matching keys). Each set of matching keys is then printed.

As a second example, a (possibly) more elegant version of WordCount.cpp can be created using multiset:


//: C07:MultiSetWordCount.cpp
// Count occurrences of words using a multiset.
#include <fstream>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include "../require.h"
using namespace std;

int main(int argc, char* argv[]) {
  const char* fname =
"MultiSetWordCount.cpp";
  if(argc > 1) fname = argv[1];
  ifstream in(fname);
  assure(in, fname);
  multiset<string> wordmset;
  string word;
  while(in >> word)
    wordmset.insert(word);
  typedef multiset<string>::iterator MSit;
  MSit it = wordmset.begin();
  while(it != wordmset.end()) {
    pair<MSit, MSit> p = wordmset.equal_range(*it);
    int count = distance(p.first, p.second);
    cout << *it << ": " <<
count << endl;
    it = p.second; // Move to the next word
  }
} ///:~
The setup in main( ) is identical to WordCount.cpp, but then each word is simply inserted into the multiset<string>. An iterator is created and initialized to the beginning of the multiset; dereferencing this iterator produces the current word. The equal_range( ) member function (not generic algorithm) produces the starting and ending iterators of the word that's currently selected, and the algorithm distance( ) (defined in <iterator>) counts the number of elements in that range. The iterator it is then moved forward to the end of the range, which puts it at the next word. If you're unfamiliar with the multiset, this code can seem more complex. The density of it and the lack of need for supporting classes such as Count has a lot of appeal.

In the end, is this really a “set,” or should it be called something else? An alternative is the generic “bag” that is defined in some container libraries, since a bag holds anything, without discrimination—including duplicate objects. This is close, but it doesn't quite fit since a bag has no specification about how elements should be ordered. A multiset (which requires that all duplicate elements be adjacent to each other) is even more restrictive than the concept of a set. A set implementation might use a hashing function to order its elements, which would not put them in sorted order. Besides, if you want to store a bunch of objects without any special criteria, you will probably just use a vector, deque, or list.


2.5.11. Combining STL containers

When using a thesaurus, you want to know all the words that are similar to a particular word. When you look up a word, then, you want a list of words as the result. Here, the “multi” containers (multimap or multiset) are not appropriate. The solution is to combine containers, which is easily done using the STL. Here, we need a tool that turns out to be a powerful general concept, which is a map that associates a string with a vector:


//: C07:Thesaurus.cpp
// A map of vectors.
#include <map>
#include <vector>
#include <string>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
#include <cstdlib>
using namespace std;

typedef map<string, vector<string> >
Thesaurus;
typedef pair<string, vector<string> >
TEntry;
typedef Thesaurus::iterator TIter;

// Name lookup work-around:
namespace std {
ostream& operator<<(ostream& os,const
TEntry& t) {
  os << t.first << ": ";
  copy(t.second.begin(), t.second.end(),
    ostream_iterator<string>(os, " "));
  return os;
}
}

// A generator for thesaurus test entries:
class ThesaurusGen {
  static const string letters;
  static int count;
public:
  int maxSize() { return letters.size(); }
  TEntry operator()() {
    TEntry result;
    if(count >= maxSize()) count = 0;
    result.first = letters[count++];
    int entries = (rand() % 5) + 2;
    for(int i = 0; i < entries; i++) {
      int choice = rand() % maxSize();
      char cbuf[2] = { 0 };
      cbuf[0] = letters[choice];
      result.second.push_back(cbuf);
    }
    return result;
  }
};

int ThesaurusGen::count = 0;
const string ThesaurusGen::letters("ABCDEFGHIJKL"
 
"MNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");

// Ask for a "word" to look up:
string menu(Thesaurus& thesaurus) {
  while(true) {
    cout << "Select a \"word\", 0
to quit: ";
    for(TIter it = thesaurus.begin();
        it != thesaurus.end(); it++)
      cout << (*it).first << ' ';
    cout << endl;
    string reply;
    cin >> reply;
    if(reply.at(0) == '0') exit(0); // Quit
    if(thesaurus.find(reply) == thesaurus.end())
      continue; // Not in list, try again
    return reply;
  }
}

int main() {
  srand(time(0)); // Seed the random number generator
  Thesaurus thesaurus;
  // Fill with 10 entries:
  generate_n(inserter(thesaurus, thesaurus.begin()),
    10, ThesaurusGen());
  // Print everything:
  copy(thesaurus.begin(), thesaurus.end(),
    ostream_iterator<TEntry>(cout,
"\n"));
  // Create a list of the keys:
  string keys[10];
  int i = 0;
  for(TIter it = thesaurus.begin();
    it != thesaurus.end(); it++)
    keys[i++] = (*it).first;
  for(int count = 0; count < 10; count++) {
    // Enter from the console:
    // string reply = menu(thesaurus);
    // Generate randomly
    string reply = keys[rand() % 10];
    vector<string>& v = thesaurus[reply];
    copy(v.begin(), v.end(),
      ostream_iterator<string>(cout, "
"));
    cout << endl;
  }
} ///:~
A Thesaurus maps a string (the word) to a vector<string> (the synonyms). A TEntry is a single entry in a Thesaurus. By creating an ostream operator<< for a TEntry, a single entry from the Thesaurus can easily be printed (and the whole Thesaurus can easily be printed with copy( )). Notice the very strange placement of the stream inserter: we put it inside the std namespace!(113) This operator<<( ) function is used by ostream_iterator in the first call to copy( ) in main( ) above. When the compiler instantiates the needed ostream_iterator specialization, according to the rules of argument-dependent lookup (ADL) it only looks in std because that is where all the arguments to copy( ) are declared. If we declared our inserter in the global namespace (by removing the namespace block around it), then it would not be found. By placing it in std we enable ADL to find it.

The ThesaurusGen creates “words” (which are just single letters) and “synonyms” for those words (which are just other randomly chosen single letters) to be used as thesaurus entries. It randomly chooses the number of synonym entries to make, but there must be at least two. All the letters are chosen by indexing into a static string that is part of ThesaurusGen.

In main( ), a Thesaurus is created, filled with 10 entries and printed using the copy( ) algorithm. The menu( ) function asks the user to choose a “word” to look up by typing the letter of that word. The find( ) member function discovers whether the entry exists in the map. (Remember, you don't want to use operator[ ], which will automatically make a new entry if it doesn't find a match!) If so, operator[ ] fetches out the vector<string> that is displayed. The selection of the reply string is generated randomly, to allow automated testing.

Because templates make the expression of powerful concepts easy, you can take this concept much further, creating a map of vectors containing maps, and so on. For that matter, you can combine any of the STL containers this way.


2.5.12. Cleaning up containers of pointers

In Stlshape.cpp, the pointers did not clean themselves up automatically. It would be convenient to be able to do this easily, rather than writing out the code each time. Here is a function template that will clean up the pointers in any sequence container. Note that it is placed in the book's root directory for easy access:


//: :purge.h
// Delete pointers in an STL sequence container.
#ifndef PURGE_H
#define PURGE_H
#include <algorithm>

template<class Seq> void purge(Seq& c) {
  typename Seq::iterator i;
  for(i = c.begin(); i != c.end(); ++i) {
    delete *i;
    *i = 0;
  }
}

// Iterator version:
template<class InpIt> void purge(InpIt begin,
InpIt end) {
  while(begin != end) {
    delete *begin;
    *begin = 0;
    ++begin;
  }
}
#endif // PURGE_H ///:~
In the first version of purge( ), note that typename is absolutely necessary. This is exactly the case that keyword was designed to solve: Seq is a template argument, and iterator is something that is nested within that template. So what does Seq::iterator refer to? The typename keyword specifies that it refers to a type, and not something else.

Although the container version of purge( ) must work with an STL-style container, the iterator version of purge( ) will work with any range, including an array.

Here is a rewrite of Stlshape.cpp, modified to use the purge( ) function:


//: C07:Stlshape2.cpp
// Stlshape.cpp with the purge() function.
#include <iostream>
#include <vector>
#include "../purge.h"
using namespace std;

class Shape {
public:
  virtual void draw() = 0;
  virtual ~Shape() {};
};

class Circle : public Shape {
public:
  void draw() { cout << "Circle::draw”
<< endl; }
  ~Circle() { cout << "~Circle” <<
endl; }
};

class Triangle : public Shape {
public:
  void draw() { cout << "Triangle::draw”
<< endl; }
  ~Triangle() { cout << "~Triangle” <<
endl; }
};

class Square : public Shape {
public:
  void draw() { cout << "Square::draw”
<< endl; }
  ~Square() { cout << "~Square” <<
endl; }
};

int main() {
  typedef std::vector<Shape*> Container;
  typedef Container::iterator Iter;
  Container shapes;
  shapes.push_back(new Circle);
  shapes.push_back(new Square);
  shapes.push_back(new Triangle);
  for(Iter i = shapes.begin(); i != shapes.end(); i++)
    (*i)->draw();
  purge(shapes);
} ///:~
When using purge( ), carefully consider ownership issues. If an object pointer is held in more than one container, be sure not to delete it twice, and you don't want to destroy the object in the first container before the second one is finished with it. Purging the same container twice is not a problem because purge( ) sets the pointer to zero once it deletes that pointer, and calling delete for a zero pointer is a safe operation.


2.5.13. Creating your own containers

With the STL as a foundation, you can create your own containers. Assuming you follow the same model of providing iterators, your new container will behave as if it were a built-in STL container.

Consider the “ring” data structure, which is a circular sequence container. If you reach the end, it just wraps around to the beginning. This can be implemented on top of a list as follows:


//: C07:Ring.cpp
// Making a "ring" data structure from the
STL.
#include <iostream>
#include <iterator>
#include <list>
#include <string>
using namespace std;

template<class T> class Ring {
  list<T> lst;
public:
  // Declaration necessary so the following
  // 'friend' statement sees this 'iterator'
  // instead of std::iterator:
  class iterator;
  friend class iterator;
  class iterator : public std::iterator<
    std::bidirectional_iterator_tag,T,ptrdiff_t>{
    typename list<T>::iterator it;
    list<T>* r;
  public:
    iterator(list<T>& lst,
      const typename list<T>::iterator& i)
    : it(i), r(&lst) {}
    bool operator==(const iterator& x) const {
      return it == x.it;
    }
    bool operator!=(const iterator& x) const {
      return !(*this == x);
    }
    typename list<T>::reference operator*() const
{
      return *it;
    }
    iterator& operator++() {
      ++it;
      if(it == r->end())
        it = r->begin();
      return *this;
    }
    iterator operator++(int) {
      iterator tmp = *this;
      ++*this;
      return tmp;
    }
    iterator& operator--() {
      if(it == r->begin())
        it = r->end();
      --it;
      return *this;
    }
    iterator operator--(int) {
      iterator tmp = *this;
      --*this;
      return tmp;
    }
    iterator insert(const T& x) {
      return iterator(*r, r->insert(it, x));
    }
    iterator erase() {
      return iterator(*r, r->erase(it));
    }
  };
  void push_back(const T& x) { lst.push_back(x); }
  iterator begin() { return iterator(lst, lst.begin());
}
  int size() { return lst.size(); }
};

int main() {
  Ring<string> rs;
  rs.push_back("one");
  rs.push_back("two");
  rs.push_back("three");
  rs.push_back("four");
  rs.push_back("five");
  Ring<string>::iterator it = rs.begin();
  ++it; ++it;
  it.insert("six");
  it = rs.begin();
  // Twice around the ring:
  for(int i = 0; i < rs.size() * 2; i++)
    cout << *it++ << endl;
} ///:~
You can see that most of the coding is in the iterator. The Ring iterator must know how to loop back to the beginning, so it must keep a reference to the list ofits “parent” Ring object in order to know if it's at the end and how to get back to the beginning.

You'll notice that the interface for Ring is quite limited; in particular, there is no end( ), since a ring just keeps looping. This means that you won't be able to use a Ring in any STL algorithms that require a past-the-end iterator, which are many. (It turns out that adding this feature is a nontrivial exercise.) Although this can seem limiting, consider stack, queue, and priority_queue, which don't produce any iterators at all!


2.5.14. STL extensions

Although the STL containers may provide all the functionality you'll ever need, they are not complete. For example, the standard implementations of set and map use trees, and although these are reasonably fast, they may not be fast enough for your needs. In the C++ Standards Committee it was generally agreed that hashed implementations of set and map should have been included in Standard C++, however, there was not enough time to add these components, and thus they were left out.(114)

Fortunately, alternatives are freely available. One of the nice things about the STL is that it establishes a basic model for creating STL-like classes, so anything built using the same model is easy to understand if you are already familiar with the STL.

The SGI STL from Silicon Graphics(115) is one of the most robust implementations of the STL and can be used to replace your compiler's STL if that is found wanting. In addition, SGI has added a number of extensions including hash_set, hash_multiset, hash_map, hash_multimap, slist (a singly linked list), and rope (a variant of string optimized for very large strings and fast concatenation and substring operations).

Let's consider a performance comparison between a tree-based map and the SGI hash_map. To keep things simple, the mappings will be from int to int:


//: C07:MapVsHashMap.cpp
// The hash_map header is not part of the Standard C++
STL.
// It is an extension that is only available as part of
the
// SGI STL (Included with the dmc distribution).
// You can add the header by hand for all of these:
//{-bor}{-msc}{-g++}{-mwcc}
#include <hash_map>
#include <iostream>
#include <map>
#include <ctime>
using namespace std;

int main() {
  hash_map<int, int> hm;
  map<int, int> m;
  clock_t ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.insert(make_pair(j,j));
  cout << "map insertions: " <<
clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.insert(make_pair(j,j));
  cout << "hash_map insertions: "
       << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m[j];
  cout << "map::operator[] lookups: "
       << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm[j];
  cout << "hash_map::operator[] lookups:
"
       << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      m.find(j);
  cout << "map::find() lookups: "
       << clock() - ticks << endl;
  ticks = clock();
  for(int i = 0; i < 100; i++)
    for(int j = 0; j < 1000; j++)
      hm.find(j);
  cout << "hash_map::find() lookups: "
       << clock() - ticks << endl;
} ///:~
The performance test we ran showed a speed improvement of roughly 4: 1 for the hash_map over the map in all operations (and as expected, find( ) is slightly faster than operator[ ] for lookups for both types of map). If a profiler shows a bottleneck in your map, consider a hash_map.


2.5.15. Non–STL containers

There are two “non-STL” containers in the standard library: bitset and valarray.(116) We say “non-STL” because neither of these containers fulfills all the requirements of STL containers. The bitset container, which we covered earlier in this chapter, packs bits into integers and does not allow direct addressing of its members. The valarray template class is a vector-like container that is optimized for efficient numeric computation. Neither container provides iterators. Although you can instantiate a valarray with nonnumeric types, it has mathematical functions that are intended to operate with numeric data, such as sin, cos, tan, and so on.

Here's a tool to print elements in a valarray:


//: C07:PrintValarray.h
#ifndef PRINTVALARRAY_H
#define PRINTVALARRAY_H
#include <valarray>
#include <iostream>
#include <cstddef>

template<class T>
void print(const char* lbl, const
std::valarray<T>& a) {
  std::cout << lbl << ": ";
  for(std::size_t i = 0; i < a.size(); ++i)
    std::cout << a[i] << ' ';
  std::cout << std::endl;
}
#endif // PRINTVALARRAY_H ///:~
Most of valarray's functions and operators operate on a valarray as a whole, as the following example illustrates:


//: C07:Valarray1.cpp {-bor}
// Illustrates basic valarray functionality.
#include "PrintValarray.h"
using namespace std;

double f(double x) { return 2.0 * x - 1.0; }

int main() {
  double n[] = { 1.0, 2.0, 3.0, 4.0 };
  valarray<double> v(n, sizeof n / sizeof n[0]);
  print("v", v);
  valarray<double> sh(v.shift(1));
  print("shift 1", sh);
  valarray<double> acc(v + sh);
  print("sum", acc);
  valarray<double> trig(sin(v) + cos(acc));
  print("trig", trig);
  valarray<double> p(pow(v, 3.0));
  print("3rd power", p);
  valarray<double> app(v.apply(f));
  print("f(v)", app);
  valarray<bool> eq(v == app);
  print("v == app?", eq);
  double x = v.min();
  double y = v.max();
  double z = v.sum();
  cout << "x = " << x <<
", y = " << y
       << ", z = " <<<<
endl;
} ///:~
The valarray class provides a constructor that takes an array of the target type and the count of elements in the array to initialize the new valarray. The shift( ) member function shifts each valarray element one position to the left (or to the right, if its argument is negative) and fills in holes with the default value for the type (zero in this case). There is also a cshift( ) member function that does a circular shift (or “rotate”). All mathematical operators and functions are overloaded to operate on valarrays, and binary operators require valarray arguments of the same type and size. The apply( ) member function, like the transform( ) algorithm, applies a function to each element, but the result is collected into a result valarray. The relational operators return suitably-sized instances of valarray<bool> that indicate the result of element-by-element comparisons, such as with eq above. Most operations return a new result array, but a few, such as min( ), max( ), and sum( ), return a single scalar value, for obvious reasons.

The most interesting thing you can do with valarrays is reference subsets of their elements, not only for extracting information, but also for updating it. A subset of a valarray is called a slice, and certain operators use slices to do their work. The following sample program uses slices:


//: C07:Valarray2.cpp {-bor}{-dmc}
// Illustrates slices and masks.
#include "PrintValarray.h"
using namespace std;

int main() {
  int data[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
};
  valarray<int> v(data, 12);
  valarray<int> r1(v[slice(0, 4, 3)]);
  print("slice(0,4,3)", r1);
  // Extract conditionally
  valarray<int> r2(v[v > 6]);
  print("elements > 6", r2);
  // Square first column
  v[slice(0, 4, 3)] *= valarray<int>(v[slice(0,
4, 3)]);
  print("after squaring first column", v);
  // Restore it
  int idx[] = { 1, 4, 7, 10 };
  valarray<int> save(idx, 4);
  v[slice(0, 4, 3)] = save;
  print("v restored", v);
  // Extract a 2-d subset: { { 1, 3, 5 }, { 7, 9, 11 }
}
  valarray<size_t> siz(2);
  siz[0] = 2;
  siz[1] = 3;
  valarray<size_t> gap(2);
  gap[0] = 6;
  gap[1] = 2;
  valarray<int> r3(v[gslice(0, siz, gap)]);
  print("2-d slice", r3);
  // Extract a subset via a boolean mask (bool
elements)
  valarray<bool> mask(false, 5);
  mask[1] = mask[2] = mask[4] = true;
  valarray<int> r4(v[mask]);
  print("v[mask]", r4);
  // Extract a subset via an index mask (size_t
elements)
  size_t idx2[] = { 2, 2, 3, 6 };
  valarray<size_t> mask2(idx2, 4);
  valarray<int> r5(v[mask2]);
  print("v[mask2]", r5);
  // Use an index mask in assignment
  valarray<char> text("now is the
time", 15);
  valarray<char> caps("NITT", 4);
  valarray<size_t> idx3(4);
  idx3[0] = 0;
  idx3[1] = 4;
  idx3[2] = 7;
  idx3[3] = 11;
  text[idx3] = caps;
  print("capitalized", text);
} ///:~
A slice object takes three arguments: the starting index, the number of elements to extract, and the “stride,” which is the gap between elements of interest. Slices can be used as indexes into an existing valarray, and a new valarray containing the extracted elements is returned. A valarray of bool, such as is returned by the expression v > 6, can be used as an index into another valarray; the elements corresponding to the true slots are extracted. As you can see, you can also use slices and masks as indexes on the left side of an assignment. A gslice object (for “generalized slice”) is like a slice, except that the counts and strides are themselves arrays, which means you can interpret a valarray as a multidimensional array. The example above extracts a 2 by 3 array from v, where the numbers start at zero and the numbers for the first dimension are found six slots apart in v, and the others two apart, which effectively extracts the matrix


1 3 5
7 9 11
Here is the complete output for this program:


slice(0,4,3): 1 4 7 10
elements > 6: 7 8 9 10
after squaring v: 1 2 3 16 5 6 49 8 9 100 11 12
v restored: 1 2 3 4 5 6 7 8 9 10 11 12
2-d slice: 1 3 5 7 9 11
v[mask]: 2 3 5
v[mask2]: 3 3 4 7
capitalized: N o w   I s   T h e   T i m e
 
A practical example of slices is found in matrix multiplication. Consider how you would write a function to multiply two matrices of integers with arrays.


void matmult(const int a[][MAXCOLS], size_t m, size_t
n,
             const int b[][MAXCOLS], size_t p, size_t
q,
             int result[][MAXCOLS);
This function multiplies the m-by-n matrix a by the p-by-q matrix b, where n and p are expected to be equal. As you can see, without something like valarray, you need to fix the maximum value for the second dimension of each matrix, since locations in arrays are statically determined. It is also difficult to return a result array by value, so the caller usually passes the result array as an argument.

Using valarray, you can not only pass any size matrix, but you can also easily process matrices of any type, and return the result by value. Here's how:


//: C07:MatrixMultiply.cpp
// Uses valarray to multiply matrices
#include <cassert>
#include <cstddef>
#include <cmath>
#include <iostream>
#include <iomanip>
#include <valarray>
using namespace std;

// Prints a valarray as a square matrix
template<class T>
void printMatrix(const valarray<T>& a, size_t
n) {
  size_t siz = n*n;
  assert(siz <= a.size());
  for(size_t i = 0; i < siz; ++i) {
    cout << setw(5) << a[i];
    cout << ((i+1)%n ? ' ' : '\n');
  }
  cout << endl;
}

// Multiplies compatible matrices in valarrays
template<class T>
valarray<T>
matmult(const valarray<T>& a, size_t arows,
size_t acols,
        const valarray<T>& b, size_t brows,
size_t bcols) {
  assert(acols == brows);
  valarray<T> result(arows * bcols);
  for(size_t i = 0; i < arows; ++i)
    for(size_t j = 0; j < bcols; ++j) {
      // Take dot product of row a[i] and col b[j]
      valarray<T> row = a[slice(acols*i, acols,
1)];
      valarray<T> col = b[slice(j, brows,
bcols)];
      result[i*bcols + j] = (row * col).sum();
    }
  return result;
}

int main() {
  const int n = 3;
  int adata[n*n] = {1,0,-1,2,2,-3,3,4,0};
  int bdata[n*n] = {3,4,-1,1,-3,0,-1,1,2};
  valarray<int> a(adata, n*n);
  valarray<int> b(bdata, n*n);
  valarray<int> c(matmult(a, n, n, b, n, n));
  printMatrix(c, n);
} ///:~
Each entry in the result matrix c is the dot product of a row in a with a column in b. By taking slices, you can extract these rows and columns as valarrays and use the global * operator and sum( ) function provided by valarray to do the work succinctly. The result valarray is computed at runtime; there's no need to worry about the static limitations of array dimensions. You do have to compute linear offsets of the position [i][j] yourself (see the formula i*bcols + j above), but the size and type freedom is worth it.


2.5.16. Summary

The goal of this chapter was not just to introduce the STL containers in some considerable depth. Although every detail could not be covered here, you now know enough that you can look up further information in the other resources. Our hope is that this chapter has helped you grasp the power available in the STL and shown you how much faster and more efficient your programming activities can be by understanding and using the STL.


2.5.17. Exercises

Solutions to selected exercises can be found in the electronic document The Thinking in C++ Volume 2 Annotated Solution Guide, available for a small fee from www.MindView.net.

  1. Create a set<char>, open a file (whose name is provided on the command line), and read that file in a char at a time, placing each char in the set. Print the results, and observe the organization. Are there any letters in the alphabet that are not used in that particular file?
  2. Create three sequences of Noisy objects, a vector, deque, and list. Sort them. Now write a function template to receive the vector and deque sequences as a parameter to sort them and record the sorting time. Write a specialized template function to do the same for list (ensure to call its member sort( ) instead of the generic algorithm). Compare the performance of the different sequence types.
  3. Write a program to compare the speed of sorting a list using list::sort( ) vs. using std::sort( ) (the STL algorithm version of sort( )).
  4. Create a generator that produces random int values between 0 and 20 inclusive, and use it to fill a multiset<int>. Count the occurrences of each value, following the example given in MultiSetWordCount.cpp.
  5. Change StlShape.cpp so that it uses a deque instead of a vector.
  6. Modify Reversible.cpp so it works with deque and list instead of vector.
  7. Use a stack<int> and populate it with a Fibonacci sequence. The program's command line should take the number of Fibonacci elements desired, and you should have a loop that looks at the last two elements on the stack and pushes a new one for every pass through the loop.
  8. Using only three stacks (source, sorted, and losers), sort a random sequence of numbers by first placing the numbers on the source stack. Assume the number on the top of the source is the largest, and push it on the sorted stack. Continue to pop the source stack comparing it with the top of the sorted stack. Whichever number is the smallest, pop it from its stack and push it onto the on the losers' stack. Once the source stack is empty, repeat the process using the losers' stack as the source stack, and use the source stack as the losers' stack. The algorithm completes when all the numbers have been placed into the winners' stack.
  9. Open a text file whose name is provided on the command line. Read the file a word at a time, and use a multiset<string> to create a word count for each word.
  10. Modify WordCount.cpp so that it uses insert( ) instead of operator[ ] to insert elements in the map.
  11. Create a class that has an operator< and an ostream& operator<<. The class should contain a priority number. Create a generator for your class that makes a random priority number. Fill a priority_queue using your generator, and then pull the elements out to show they are in the proper order.
  12. Rewrite Ring.cpp so it uses a deque instead of a list for its underlying implementation.
  13. Modify Ring.cpp so that the underlying implementation can be chosen using a template argument. (Let that template argument default to list.)
  14. Create an iterator class called BitBucket that just absorbs whatever you send to it without writing it anywhere.
  15. Create a kind of “hangman” game. Create a class that contains a char and a bool to indicate whether that char has been guessed yet. Randomly select a word from a file, and read it into a vector of your new type. Repeatedly ask the user for a character guess, and after each guess, display the characters in the word that have been guessed, and display underscores for the characters that haven't. Allow a way for the user to guess the whole word. Decrement a value for each guess, and if the user can get the whole word before the value goes to zero, they win.
  16. Open a file and read it into a single string. Turn the string into a stringstream. Read tokens from the stringstream into a list<string> using a TokenIterator.
  17. Compare the performance of stack based on whether it is implemented with vector, deque, or list.
  18. Create a template that implements a singly-linked list called SList. Provide a default constructor and begin( ) and end( ) functions (via an appropriate nested iterator), insert( ), erase( ) and a destructor.
  19. Generate a sequence of random integers, storing them into an array of int. Initialize a valarray<int> with its contents. Compute the sum, minimum value, maximum value, average, and median of the integers using valarray operations.
  20. Create a valarray<int> with 12 random values. Create another valarray<int> with 20 random values. You will interpret the first valarray as a 3 x 4 matrix of ints and the second as a 4 x 5 matrix of ints, and multiply them by the rules of matrix multiplication. Store the result in a valarray<int> of size 15, representing the 3 x 5 result matrix. Use slices to multiply the rows of the first matrix time the columns of the second. Print the result in rectangular matrix form.
 

 
(99) This would be an example of the State pattern, described in Chapter 10.
(100) Visit http://www.dinkumware.com, http://www.sgi.com/tech/stl, or http://www.stlport.org.
(101) This is about to change, as more smart pointer types are about to be added to the next version of the Standard. For a preliminary look at them, see the smart pointers available at www.boost.org.
(102) The container adaptors, stack, queue, and priority_queue do not support iterators, since they do not behave as sequences from the user's point of view.
(103) It will only work for implementations of vector that use a pointer (a T*) as the iterator type, like STLPort does.
(104) These were actually created to abstract the locale facets away from iostreams so that locale facets could operate on any sequence of characters, not just iostreams. Locales allow iostreams to easily handle culturally–different formatting (such as the representation of money).
(105) You will need to provide a char_traits specialization for any other argument type.
(106) We are indebted to Nathan Myers for explaining this.
(107) Singleton is a well–known design pattern and is discussed in depth in Chapter 10.
(108) This is another example coached by Nathan Myers.
(109) We revisit multithreading issues in Chapter 11.
(110) This means they depend in some way on a template parameter. See Chapter 5 in the section entitled “Name Lookup Issues.”
(111) As we explained in Chapter 5, any valid qualification, such as PQV::, will do.
(112) Chuck designed and provided the original reference implementations for bitset and also bitstring, the precursor to vector<bool>, while an active member of the C++ Standards Committee in the early 1990s.
(113) Technically, it is not legal for users to add to the standard namespace, but it is the easiest way to avoid this obscure name lookup problem, and is supported by all the compilers we use.
(114) They will likely appear in the next revision of Standard C++.
(115) Available at http://www.sgi.com/tech/stl.
(116) As we explained earlier, the vector<bool> specialization is also a non–STL container to some degree.
Ce document est issu de http://www.developpez.com et reste la propriété exclusive de son auteur. La copie, modification et/ou distribution par quelque moyen que ce soit est soumise à l'obtention préalable de l'autorisation de l'auteur.