Back to TOC Columns


Standard C/C++

P. J. Plauger

The Header <iterator>, Part 1

STL's iterators come in a variety flavors and colors. Categorizing them helps you choose just the right iterator for the task at hand.


Introduction

Thirteen headers constitute the Standard Template Library (or STL) in the draft Standard C++ library. I discussed <utility> as the first of these headers last month. (See "Standard C/C++: The Header <utility>," CUJ, March 1996.) The header <iterator> is another that supplies "glue" — facilities used throughout STL.

As the name implies, it provides lots of machinery used by iterators. Nothing in this header is very complicated, but it contains lots of details that need explaining. Moreover, iterators come in five basic flavors, or categories, not counting object pointers, which can serve as random-access iterators. So you will find that many of the things presented here also come in as many as half a dozen different versions.

Because the header is so large, I present it in pieces. Many of the pieces serve disjoint needs, anyway. The first set of pieces provide machinery used widely to categorize and manipulate all sorts of iterators. Next month, I present an assortment of specialized iterators and iterator adaptors also defined in <iterator>.

Bases and Tags

The first order of business is to impose some order on all these categories of iterators. In writing algorithms, it is often useful to know several things about a given iterator X:

Remember that STL consists almost exclusively of templates. All a template can know is the type or types for which it is specialized — and a few types derived therefrom. If you want to convey this sort of extra information, you have two basic choices:

Clearly, the second approach is more elegant, if you can pull it off. It turns out that you can. The trick is to derive any iterator classes you define from a handful of base classes. Listing 1 shows the first part of the header <iterator>, which defines those base classes: Thus, for example, you build a forward iterator on the basic structure:


class My_it
    : public forward_iterator<char,long>
    {..... }

Such an iterator can march through a binary file. Arithmetic on My_it iterators, being of type long, can follow much the same rules as for the ancient C library functions fseek and ftell.

Note, by the way, that output_iterator differs from the other base classes. It is just a structure, not a template, because it has no need to deliver up the parameter types T or Dist. You can only store an element through an output iterator — you can't read it back. And you can't do arithmetic on such iterators. So all that matters is the type of the base class, which determines the iterator category.

Categorizing Iterators

Now we can explore how to answer the first of the three questions posed above, given just this small hook. Determining the iterator category typically matters when you have more than one way to implement an algorithm. Here's a simple example from the STL header <algorithm>. The template function reverse reverses the elements in a sequence by swapping pairs of elements from the outside in. It can do so with two bidirectional iterators, using the template function iter_swap to do the actual swaps:


template<class BidIt> inline
    void reverse(BidItfirst, BidItlast)
    {for (; first != last &&
            first != --last; ++first)
        iter_swap(first, last); }


But you can implement the code a tad faster if you know the iterators are actually random access:


template<class RanIt> inline
    void reverse(RanIt first, RanIt last)
    {for (; first < last; ++first)
        iter_swap (first, --last); }


If all iterators are based on the classes shown above, you can indulge in a small bit of trickery to choose wisely between these two alternatives. Just repeat one of the arguments and use it to select among overloaded definitions of the secret template function _Reverse, as shown in Listing 2.

The specialization of reverse now translates to a call to some flavor of _Reverse. The flavor chosen depends on the base type of the iterator, which supplies the type of the template parameter. The net result is that the template specialization machinery is commandeered to act as a kind of translation-time switch statement. A moderately smart translator should even be able to optimize away the extra function call and dummy argument setup. Neat, huh?

Well, it's not quite neat enough. STL prides itself on blending well with existing coding practice. In particular, it permits the use of object pointers practically anywhere an iterator is permitted. The scheme shown in Listing 2 doesn't accommodate iterators that are simply pointers. Moreover, there's no simple way to extend the definition of _Reverse to handle an open-ended set of object pointer types. A bit more machinery is called for.

The solution is to introduce another set of category types, and a template function to deliver them. The new types are called iterator tags, with names such as input_iterator_tag, etc. They are simply empty structures used to convey unique types, with no accompanying values to construct or destroy.

The template function iterator_category returns a nebbish object of one of these tag types. What type it returns depends on the type of its single argument. Our five iterator base classes are permissible argument types, along with one important addition. Listing 3 shows another chunk of the header <iterator>, with these template definitions. The important addition comes at the end. iterator_category is also overloaded on the parameter type const T *, where T is the template parameter. This is one of the permissible recipes for determining a template parameter type from a function parameter type. Hence, this particular overload is selected for any call with an object pointer argument. Naturally, it politely reports, via its return value, that its parameter has type random_access_iterator_tag. So iterator_category lumps all object pointers in with iterators based on the template class random_access_iterator<T, Dist>, the desired behavior.

You can probably guess how template function reverse should properly be written. Listing 4 shows the improved form, which also works with iterators that are object pointers. It is a verbatim rendition of the actual code in the header <algorithm>. Selecting among alternate algorithms in this fashion occurs throughout STL.

Value and Distance Types

The remaining two questions, posed much earlier, can now be answered. Listing 5 shows another chunk of the header <iterator> with the necessary machinery. If you want to determine the element type T associated with an iterator, call the template function value_type. Similarly, if you want to determine the type used to measure the distance between iterators, call the template function distance_type. Note, in both cases, the special handling of object pointers as iterators.

These functions do not return objects of the type of interest (T or Dist). That could cause all sorts of problems with types that have very large objects, or that have "interesting" side effects during construction and destruction, or that just plain take a long time to manipulate and copy about. So instead, the functions each return a pointer to the desired type. As I showed in Listing 3, with the last overload for iterator_category, that's enough of a hook for the templates to work their magic.

Here, for example, is how the template function iter_swap, used by reverse in earlier examples, is implemented:


template<class FwdIt1, class FwdIt2> inline
    void iter_swap(FwdIt1x, FwdIt2y)
    {_Iter_swap(x, y, value_type(x)); }
template<class FwdIt1, class FwdIt2, classT > inline
    void_Iter_swap(FwdIt1 x, FwdIt2 y, T*)
    {T Tmp = *x;
    *x = *y, *y = Tmp; }


To swap two values, you need a temporary object to hold one value for a time. So iter_swap calls a secret template function _Iter_swap, adding an extra argument. The T * return type of value_type matches the T * parameter type of _Iter_swap. Thus, the element type T is smuggled from the iterator base (or extracted from the object pointer used as an iterator) and delivered to where it is needed in the working function.

You do much the same thing when you need to declare an object of some Dist type. I omit an example here, for want of space.

Iterator Arithmetic

Two other operations are also common across nearly all categories of iterators:

These operations are undefined for output iterators, of course. For input iterators they are barely defined — doing either computation often leaves an input iterator useless for anything else. STL includes them more for completeness than for general utility.

Operator notation like this is convenient, but historically STL has made little use of it. The forms X += N and X2 - X1 are required only for random-access iterators. (See "Standard C/C++," CUJ, February 1996.) Since they cannot be fully supported by the weaker iterator categories, it is arguably misleading to define them. Instead, the algorithms in STL make use of two template functions:


template<class InIt, class Dist> inline
    void advance(InIt& p, Dist n);
template<class InIt, class Dist> inline
    void distance(InIt first, InIt last, Dist& n);


The first essentially computes p += n, the second n = last - first, but only when it makes sense to do so.

Listing 6 shows the chunk of the header <iterator> that defines all the necessary overloads for these two template functions. By now you should be familiar with the various tricks that they quietly embody to do the right thing for each category of iterator, including pointers. Again, I omit any examples of usage, for want of space. But you will have ample opportunity to see these template functions at work in the coming months.o

P.J. Plauger is senior editor of C/C++ Users Journal. He is convener of the ISO C standards committee, WG14, and active on the C++ committee, WG21. He developed Lib Suite ++, the Plum-Hall validation suite for the Standard C++ library. His latest books are The Draft Standard C++ Library, Programming on Purpose (three volumes), and Standard C (with Jim Brodie), all published by Prentice-Hall. You can reach him at pjp@plauger.com.