<chapter xmlns="http://docbook.org/ns/docbook" version="5.0"
xml:id="std.iterators" xreflabel="Iterators">
<?dbhtml filename="iterators.html"?>
<indexterm><primary>Iterators</primary></indexterm>
<!-- Sect1 01 : Predefined -->
<section xml:id="std.iterators.predefined" xreflabel="Predefined"><info><title>Predefined</title></info>
<section xml:id="iterators.predefined.vs_pointers" xreflabel="Versus Pointers"><info><title>Iterators vs. Pointers</title></info>
<para>
The FAQ <link linkend="faq.iterator_as_pod">entry</link> points out that
iterators are not implemented as pointers. They are a generalization
of pointers, but they are implemented in libstdc++ as separate
classes.
</para>
<para>
Keeping that simple fact in mind as you design your code will
prevent a whole lot of difficult-to-understand bugs.
</para>
<para>
You can think of it the other way 'round, even. Since iterators
are a generalization, that means
that <emphasis>pointers</emphasis> are
<emphasis>iterators</emphasis>, and that pointers can be used
wherever an iterator would be. All those functions in the
Algorithms section of the Standard will work just as well on plain
arrays and their pointers.
</para>
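<para>
A minimal sketch of that claim, assuming nothing beyond the standard
<code>std::sort</code> and <code>std::binary_search</code> algorithms.
The helper name <code>sort_and_contains</code> is hypothetical and only
serves the illustration; the two pointers play the role of the iterator pair.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sorts a plain array in place and reports
// whether `value` is present afterwards.
bool sort_and_contains(int* data, std::size_t count, int value)
{
    assert(data != nullptr || count == 0);
    int* first = data;          // pointer to the first element
    int* last = data + count;   // pointer one past the last element

    // Plain pointers are accepted wherever random-access iterators are.
    std::sort(first, last);
    // binary_search relies on the range having been sorted above.
    return std::binary_search(first, last, value);
}
```
]]></programlisting>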
<para>
That doesn't mean that when you pass in a pointer, it gets
wrapped into some special delegating iterator-to-pointer class
with a layer of overhead. (If you think that's the case
anywhere, you don't understand templates to begin with...) Oh,
no; if you pass in a pointer, then the compiler will instantiate
that template using <code>T*</code> as a type, and good old high-speed
pointer arithmetic as its operations, so the resulting code will
be doing exactly the same things as it would be doing if you had
hand-coded it yourself (for the 273rd time).
</para>
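<para>
The following sketch shows one way to convince yourself of this. The
function template <code>sum_range</code> and the function
<code>sum_array</code> are hypothetical names made up for this example;
the <code>static_assert</code> checks that <code>std::begin</code>
applied to a built-in array hands back a bare <code>int*</code> rather
than a wrapper class.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <iterator>
#include <numeric>
#include <type_traits>

// Hypothetical generic helper: sums a range of ints given as two iterators.
template <typename Iterator>
int sum_range(Iterator first, Iterator last)
{
    return std::accumulate(first, last, 0);
}

int sum_array()
{
    int values[] = {1, 2, 3, 4};

    // For a built-in array, std::begin() yields a plain pointer.
    static_assert(std::is_same<decltype(std::begin(values)), int*>::value,
                  "std::begin on an int array returns int*");

    // Subtracting two pointers into the same array gives the element count.
    assert(std::end(values) - std::begin(values) == 4);

    // Instantiates sum_range<int*>; no adapter class is involved.
    return sum_range(std::begin(values), std::end(values));
}
```
]]></programlisting>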
<para>
How much overhead <emphasis>is</emphasis> there when using an
iterator class? Very little. Most of the layering classes
contain nothing but typedefs, and typedefs are
"meta-information" that simply tell the compiler some
nicknames; they don't create code. That information gets passed
down through inheritance, so while the compiler has to do work
looking up all the names, your runtime code does not. (This has
been a prime concern from the beginning.)
</para>
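<para>
As a rough illustration, the sketch below looks up those nicknames through
the standard <code>std::iterator_traits</code> template. It is not a
description of how libstdc++ lays out its iterator classes internally;
<code>first_value</code> is a hypothetical helper. The
<code>static_assert</code> lines are evaluated entirely at compile time.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// The nested typedefs are compile-time names only; these checks emit no code.
static_assert(std::is_same<std::iterator_traits<int*>::value_type, int>::value,
              "a pointer's value_type is the pointee type");
static_assert(std::is_same<std::iterator_traits<int*>::iterator_category,
                           std::random_access_iterator_tag>::value,
              "pointers are random-access iterators");
static_assert(std::is_same<std::iterator_traits<std::vector<int>::iterator>::value_type,
                           int>::value,
              "class-type iterators expose the same names");

// Hypothetical helper: the return type is picked from the iterator's typedefs.
template <typename Iterator>
typename std::iterator_traits<Iterator>::value_type
first_value(Iterator first, Iterator last)
{
    assert(first != last);  // the range must not be empty
    return *first;
}
```
]]></programlisting>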
</section>
<section xml:id="iterators.predefined.end" xreflabel="end() Is One Past the End"><info><title>One Past the End</title></info>
<para>This starts off sounding complicated, but is actually very easy,
especially towards the end. Trust me.
</para>
<para>Beginners usually have a little trouble understanding the whole
'past-the-end' thing, until they remember their early algebra classes
(see, they <emphasis>told</emphasis> you that stuff would come in handy!) and
the concept of half-open ranges.
</para>
<para>First, some history, and a reminder of some of the funkier rules in
C and C++ for builtin arrays. The following rules have always been
true for both languages:
</para>
<orderedlist inheritnum="ignore" continuation="restarts">
<listitem>
<para>You can point anywhere in the array, <emphasis>or to the first element
past the end of the array</emphasis>. A pointer that points to one
past the end of the array is guaranteed to be as unique as a
pointer to somewhere inside the array, so that you can compare
such pointers safely.
</para>
</listitem>
<listitem>
<para>You can only dereference a pointer that points into an array.
If your array pointer points outside the array, even to just
one past the end, and you dereference it, Bad Things happen.
</para>
</listitem>
<listitem>
<para>Strictly speaking, simply pointing anywhere else invokes
undefined behavior. Most programs won't puke until such a
pointer is actually dereferenced, but the standards leave that
up to the platform.
</para>
</listitem>
</orderedlist>
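<para>
The rules above can be sketched in a few lines. The helper
<code>count_nonzero</code> is hypothetical; the point is that the
past-the-end pointer is formed and compared against, but never
dereferenced.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the non-zero elements between `first`
// and the one-past-the-end position `last`.
std::size_t count_nonzero(const int* first, const int* last)
{
    assert(first <= last);      // comparing with a past-the-end pointer is valid
    std::size_t n = 0;
    for (const int* p = first; p != last; ++p) {
        if (*p != 0) {          // p always points into the array here
            ++n;
        }
    }
    return n;                   // `last` itself was never dereferenced
}
```
]]></programlisting>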
<para>The reason this past-the-end addressing was allowed is to make it
easy to write a loop to go over an entire array, e.g.,
<code>while (*d++ = *s++);</code>.
</para>
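<para>
For the curious, here is that idiom spelled out as a small function.
<code>copy_string</code> is a hypothetical name, and the sketch assumes
the caller supplies a destination buffer large enough for the source
string and its terminator.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies a NUL-terminated string and returns the
// number of characters copied, not counting the terminator.
std::size_t copy_string(char* d, const char* s)
{
    assert(d != nullptr && s != nullptr);
    const char* start = s;
    while ((*d++ = *s++) != '\0') {
        // Each pass copies one character; the loop ends once the
        // terminator itself has been copied.
    }
    // s now points one past the terminator, which may well be one past
    // the end of the source array. That is fine as long as nobody reads it.
    return static_cast<std::size_t>(s - start) - 1;
}
```
]]></programlisting>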
<para>So, when you think of two pointers delimiting an array, don't think
of them as indexing 0 through N-1. Think of them as <emphasis>boundary
markers</emphasis>:
</para>
<programlisting>
beginning            end
  |                   |      This is bad. Always having to
  |                   |      remember to add or subtract one.
  |                   |      Off-by-one bugs very common here.
  v                   v
|---|---|--...--|---|---|
| 0 | 1 |  ...  |N-2|N-1|
|---|---|--...--|---|---|
^                       ^
|                       |    This is good. This is safe. This
|                       |    is guaranteed to work. Just don't
|                       |    dereference 'end'.
beginning              end
</programlisting>
<para>See? Everything between the boundary markers is part of the array.
</para>
<para>Now think back to your junior-high school algebra course, when you
were learning how to draw graphs. Remember that a graph terminating
with a solid dot meant, "Everything up through this point,"
and a graph terminating with an open dot meant, "Everything up
to, but not including, this point," respectively called closed
and open ranges? Remember how closed ranges were written with
brackets, <emphasis>[a,b]</emphasis>, and open ranges were written with parentheses,
<emphasis>(a,b)</emphasis>?
</para>
<para>The boundary markers for arrays describe a <emphasis>half-open range</emphasis>,
starting with (and including) the first element, and ending with (but
not including) the position one past the last element:
<emphasis>[beginning,end)</emphasis>. See, I
told you it would be simple in the end.
</para>
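<para>
One payoff of the half-open convention is that lengths need no
plus-or-minus-one correction. The sketch below assumes plain
<code>int</code> arrays; <code>range_length</code> and
<code>split_lengths_add_up</code> are hypothetical helpers.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: element count of the half-open range [first, last).
// No "+ 1" is needed because `last` itself is excluded.
std::size_t range_length(const int* first, const int* last)
{
    assert(first <= last);
    return static_cast<std::size_t>(last - first);
}

// Hypothetical helper: splitting [first, last) at `middle` gives the
// adjacent ranges [first, middle) and [middle, last) with no overlap
// and no gap, so their lengths add up to the whole.
bool split_lengths_add_up(const int* first, const int* middle, const int* last)
{
    return range_length(first, middle) + range_length(middle, last)
           == range_length(first, last);
}
```
]]></programlisting>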
<para>Iterators, and everything working with iterators, follow this same
time-honored tradition. A container's <code>begin()</code> method returns
an iterator referring to the first element, and its <code>end()</code>
method returns a past-the-end iterator, which is guaranteed to be
unique and comparable against any other iterator pointing into the
middle of the container.
</para>
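<para>
A small sketch of the same pattern on a container, assuming a
<code>std::vector</code> of <code>int</code>. The function
<code>largest</code> is a hypothetical helper; note that
<code>end()</code> is only ever compared against.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the largest element of a non-empty vector.
int largest(const std::vector<int>& v)
{
    assert(v.begin() != v.end());       // non-empty: begin() differs from end()
    std::vector<int>::const_iterator it = v.begin();
    int best = *it;                     // begin() refers to the first element
    for (++it; it != v.end(); ++it) {   // end() is compared, never dereferenced
        if (*it > best) {
            best = *it;
        }
    }
    return best;
}
```
]]></programlisting>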
<para>Container constructors, container methods, and algorithms all take
pairs of iterators describing a range of values on which to operate.
All of these ranges are half-open ranges, so you pass the beginning
iterator as the starting parameter, and the one-past-the-end iterator
as the finishing parameter.
</para>
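<para>
To make that concrete, the sketch below uses the same iterator-pair
convention in a constructor, a member function, and an algorithm.
<code>build_from_list</code> is a hypothetical helper written only for
this example.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

// Hypothetical helper: copies a list into a vector twice over, then sorts.
std::vector<int> build_from_list(const std::list<int>& source)
{
    // Container constructor taking a [first, last) pair.
    std::vector<int> result(source.begin(), source.end());

    // Container method taking a [first, last) pair: append the values again.
    result.insert(result.end(), source.begin(), source.end());
    assert(result.size() == 2 * source.size());

    // Algorithm taking a [first, last) pair: sort the whole vector.
    std::sort(result.begin(), result.end());
    return result;
}
```
]]></programlisting>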
<para>This generalizes very well. You can operate on sub-ranges quite
easily this way; functions accepting a <emphasis>[first,last)</emphasis> range
don't know or care whether those are the boundaries of an entire {array,
sequence, container, whatever}, or whether they only enclose a few
elements from the center. This approach also makes zero-length
sequences very simple to recognize: if the two endpoints compare
equal, then the {array, sequence, container, whatever} is empty.
</para>
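<para>
Both points can be sketched together: an algorithm applied to a slice
from the middle of a container, plus the equal-endpoints test for an
empty range. <code>reverse_middle</code> is a hypothetical helper, and
its index parameters are an assumption of this example rather than
anything the library requires.
</para>
<programlisting language="cpp"><![CDATA[
```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reverses only the elements at indices [from, to).
void reverse_middle(std::vector<int>& v, std::size_t from, std::size_t to)
{
    assert(from <= to && to <= v.size());
    typedef std::vector<int>::difference_type diff;
    std::vector<int>::iterator first = v.begin() + static_cast<diff>(from);
    std::vector<int>::iterator last = v.begin() + static_cast<diff>(to);

    if (first == last) {
        return;     // equal endpoints: the sub-range is empty
    }
    // std::reverse neither knows nor cares that this is only a sub-range.
    std::reverse(first, last);
}
```
]]></programlisting>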
<para>Just don't dereference <code>end()</code>.
</para>
</section>
</section>
<!-- Sect1 02 : Stream -->