1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
4 <title>Data-Structure Genericity</title>
5 <meta name="GENERATOR" content="Microsoft Visual Studio .NET 7.1">
6 <meta name="vs_targetSchema" content="http://schemas.microsoft.com/intellisense/ie5">
8 <body bgcolor = "white">
9 <h1>Data-Structure Genericity</h1>
12 This section describes genericity over different underlying data-structures. It is organized as follows.
15 <li><a href = "#problem">The Basic Problem</a></li>
16 <li><a href = "#ds_hierarchy">Container Hierarchy</a></li>
17 <li><a href = "#ds_traits">Data-Structure Tags and Traits</a></li>
18 <li><a href = "#find_range">Find-Type and Range-Type Methods and Iterators</a></li>
21 <h2><a name = "problem">The Basic Problem</a></h2>
24 The design attempts to address the following problem. When writing a function manipulating a generic container object, what is the behaviour of the object? <i>E.g.</i>, suppose one writes
28 <b>class</b> Cntnr>
29 <b>void</b> some_op_sequence
35 then one needs to address the following questions in the body
36 of <tt>some_op_sequence</tt>:
38 <li> Which types and methods does <tt>Cntnr</tt> support? Containers based on hash tables can be queries for the hash-functor type and object; this is meaningless for tree-based containers. Containers based on trees can be split, joined, or can erase iterators and return the following iterator; this cannot be done by hash-based containers. </li>
40 What are the guarantees of <tt>Cntnr</tt>? A container based on a probing hash-table invalidates all iterators when it is modified; this is not the case for containers based on node-based trees. Containers based on a node-based tree can be split or joined without exceptions; this is not the case for containers based on vector-based trees.
42 <li> How does the container maintain its elements? containers based on splay trees or lists with update policies "cache" "frequently accessed" elements; containers based on most other underlying data-structures do not.</li>
45 <h2><a name = "ds_hierarchy">Container Hierarchy</a></h2>
49 <a href = "#cd">Class hierarchy</a>
50 shows the container hierarchy.
54 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>
55 contains types and methods shared by all associative containers, <i>e.g.</i>, the type <tt>allocator</tt> and the method <tt>find</tt>.
57 <li><a href = "basic_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a> subclasses
58 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>, and contains
59 types and methods shared by all hash-based containers, <i>e.g.</i>, the type <tt>hash_fn</tt>.
63 <a href = "cc_hash_assoc_cntnr.html"><tt>cc_hash_assoc_cntnr</tt></a>
65 <a href = "gp_hash_assoc_cntnr.html"><tt>gp_hash_assoc_cntnr</tt></a>
67 <a href = "basic_hash_assoc_cntnr.html"><tt>basic_hash_assoc_cntnr</tt></a>, and encapsulate collision-chaining and (general) probing hash tables, respectively. These two types of hash tables have somewhat different policies and methods (<i>i.e.</i>, constructors and policy-access methods).
71 <a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a>
73 <a href = "basic_tree_assoc_cntnr.html"><tt>basic_tree_assoc_cntnr</tt></a> which
75 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>.
76 <a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a>
77 encapsulates a tree-based container, and is parameterized by which underlying data-structure to use (<i>e.g.</i>, a red-black tree);
78 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>.
79 is specialized to the capabilities of the underlying structure.
80 <a href = "tree_assoc_cntnr.html"><tt>tree_assoc_cntnr</tt></a> contains some additional methods over
81 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>,
82 <i>e.g.</i>, split and join methods.
85 <a href = "lu_assoc_cntnr.html"><tt>lu_assoc_cntnr</tt></a>
87 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a>,
88 and encapsulates a list with update policies.
93 The hierarchy is composed naturally, such that each container inherits
94 all types and methods from its base. <a href = "#ds_traits">Data-Structure Tags and Traits</a> discusses how to query which types and methods each container supports.
99 <h2><a name = "ds_traits">Data-Structure Tags and Traits</a></h2>
102 <tt>pb_assoc</tt> contains a tag and traits mechanism similar to that of the STL's iterators.
106 <tt>pb_assoc</tt> contains a tag hierarchy corresponding to the hierarchy
108 <a href = "#cd">Class hierarchy</a>.
109 The tag hierarchy is shown in Figure
110 <a href = "#ds_tag_cd">Data-structure tag class hierarchy</a>.
113 <h6 align = "center">
115 <img src = "ds_tag_cd.jpg" width = "70%" alt = "no image">
118 <h6 align = "center">
119 Data-structure tag class hierarchy.
123 <a href = "basic_assoc_cntnr.html"><tt>basic_assoc_cntnr</tt></a> publicly defines
124 <tt>ds_category</tt> as one of the classes in Figure
126 Given any container <tt>Cntnr</tt>, the tag of the underlying data-structure can be found via <tt><b>typename</b> Cntnr::ds_category</tt>.
130 Additionally, a traits mechanism can be used to query a container type for its attributes. Given any container <tt>Cntnr</tt>, then
131 <tt><a href = "ds_traits.html">ds_traits</a><Cntnr></tt>
132 is a traits class identifying the properties of the container.
136 To find if a container can throw when a key is erased (which is true for vector-based trees, for example), one can use
138 <a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::erase_can_throw</tt>,
142 Some of the definitions in
143 <a href = "ds_traits.html"><tt>ds_traits</tt></a>
144 are dependent on other definitions. <i>E.g.</i>, if
145 <a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join</tt>
146 is <tt><b>true</b></tt> (which is the case for containers based on trees),
148 <a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt>
149 indicates whether splits or joins can throw exceptions (which is true for vector-based trees); otherwise
150 <a href = "ds_traits.html"><tt>ds_traits</tt></a><tt><Cntnr>::split_join_can_throw</tt>
151 will yield a compilation error. (This is somewhat similar to a compile-time
152 version of the COM model
153 [<a href = "references.html#mscom">mscom</a>]).
156 <h2><a name = "find_range">Find-Type and Range-Type Methods and Iterators</a></h2>
159 <tt>pb_assoc</tt> differentiates between two types of methods: find-type methods, and range-type methods. For example, <tt>find</tt> is a find-type method, since a container object searches for an element with a given key; <tt>insert</tt> is a find-type method, since, by STL convention, a container object returns an iterator corresponding to an element with a given key; <tt>begin</tt> and <tt>end</tt> are range-type methods, since they are not used to find a specific element, but rather to go over all elements in a container object.
163 Correspondingly, containers in <tt>pb_assoc</tt> define two families of iterators. <tt>const_find_iterator</tt> and <tt>find_iterator</tt> are the iterator types returned by find-type methods; <tt>const_iterator</tt> and <tt>iterator</tt> are the iterator types returned by range-type methods.
167 The relationship between these iterator types varies between container types. In a tree-based container, for example, <tt>const_find_iterator</tt> and <tt>const_iterator</tt> are synonymous, and <tt>find_iterator</tt> and <tt>iterator</tt> are synonymous; in a hash-based container, for example, this is not the case. Futhermore, find-type iterators in a hash-based container lack movement operators, such as
168 <tt><b>operator++</b></tt>.
169 All containers, however, maintain the invariants shown in Figure
176 This distinction between find-type and range-type iterators and methods, while complicating the interface, has several advantages:
179 <h3>Iterators in unordered container types</h3>
182 Given an unordered container type, <i>e.g.</i>, a hash-based container, it makes no sense to move an iterator returned by a find-type method.
183 Let <tt>cntnr</tt> be an associative-container object, and
188 std::for_each(m.find(1), m.find(5), foo);
192 which applies <tt>foo</tt> to all elements in <tt>m</tt>
193 between <tt>1</tt> and <tt>5</tt>.
196 <p>If <tt>cntnr</tt> is a
197 tree-based container object, then an in-order walk will apply <tt>foo</tt>
198 to the relevant elements, <i>e.g.</i>, as in Figure
199 <a href = "#range_it_in_hts">Range iteration in different data-structures</a>
200 -A. If <tt>m</tt> is a
201 hash-based container, then the order of elements between any two
202 elements is undefined (and probably time-varying); there is no
203 guarantee that the elements traversed will coincide with the
204 <i>logical</i> elements between 1 and 5, <i>e.g.</i>, as in
205 Figure <a href = "#range_it_in_hts">Range iteration in different data-structures</a>-B.
210 range function <i>e.g.</i>, <tt>for_each</tt>, to a
211 pair of hash-based container's iterators is possibly sensical only
212 if the iterators are those returned by <tt>begin</tt> and <tt>end</tt>,
213 respectively. Therefore, the iterator returned by
214 <tt>m</tt>'s <tt>find</tt> method should be immovable.
218 Another point also indicates that hash-based containers'
219 find-type iterators and range-type iterators should be distinct.
221 <a href = "#find_its_in_hash_tables">
222 Find-type iterators in hash tables</a>-A.
224 (immovable) find-type iterator, designed only to access an
225 element, requires at most a single pointer to the element's link.
226 Conversely, an iterator designed for range operations
227 requires some more information <i>e.g.</i>, the bucket number),
228 since a cross-list traversal might be necessary. Alternatively,
229 the lists might be linked, forming a monolithic total-element
231 <a href = "#find_its_in_hash_tables">
232 Find-type iterators in hash tables</a>-B (this seems
233 similar to the Dinkumware design
234 [<a href = "references.html#dinkumware_stl">dinkumware_stl</a>]). This,
235 however, complicates the hash-table's operations.
237 <h6 align = "center">
238 <a name = "range_it_in_hts">
239 <img src = "find_iterators_range_ops_1.jpg" width = "70%" alt = "no image">
242 <h6 align = "center">
243 Range iteration in different data-structures.
247 <h6 align = "center">
248 <a name = "find_its_in_hash_tables">
249 <img src = "find_iterators_range_ops_2.jpg" width = "70%" alt = "no image">
252 <h6 align = "center">
253 Find-type iterators in hash tables.
257 As a consequence of this design,
261 std::for_each(m.find(1), m.find(5), foo);
265 will compile for tree-based containers, but will not compile
266 for hash-tables or other types. The returned type of <tt>find</tt>
267 is a find-type iterator. For tree-based containers, this is synonymous
268 with a range-type iterator, and therefore supports <tt><b>operator</b>++</tt>;
269 for other types of containers, a find-type iterator lacks <tt><b>operator</b>++</tt>.
272 <h3>Invalidation Guarantees</h3>
275 Consider the following snippet:
285 Following the call to <tt>erase</tt>, what is the validity
286 of <tt>it</tt>: can it be dereferenced? can it be incremented?
290 The answer depends on the underlying data-structure of the container.
292 <a href = "#invalidation_guarantee_erase">Effect of erase in different underlying data-structures</a>
293 shows three cases: A1 and A2 show a red-black tree;
294 B1 and B2 show an ordered-vector tree; C1 and C2
295 show a collision-chaining hash table.
298 <h6 align = "center">
299 <a name = "invalidation_guarantee_erase">
300 <img src = "invalidation_guarantee_erase.jpg" width = "70%" alt = "no image">
303 <h6 align = "center">
304 Effect of erase in different underlying data-structures.
310 `Erasing 5 from A1 yields A2. Clearly, an iterator to 3
311 can be dereferenced and incremented.
314 Erasing 5 from B1 yields B2. Clearly, an iterator to 3 is
318 Erasing 5 from C1 yields C2. Here the situation is more complicated.
319 On the one hand, incrementing <tt>it</tt> can be undefined. On the other
320 hand, there is no problem in dereferencing <tt>it</tt>. In
321 classic STL, it is not possible to express whether <tt>it</tt>
327 Thus again, the iterator concept seems overloaded. Distinguishing
328 between find and range types allows fine-grained invalidation guarantees.
329 <a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a>
330 shows tags corresponding to different types of invalidation guarantees.
333 <h6 align = "center">
334 <a name = "invalidation_guarantee_cd">
335 <img src = "invalidation_guarantee_cd.jpg" width = "70%" alt = "no image">
338 <h6 align = "center">
339 Invalidation guarantees class hierarchy.
343 <li> <a href = "basic_invalidation_guarantee.html"><tt>basic_invalidation_guarantee</tt></a> corresponds to a basic guarantee that a find-type iterator, a found pointer, or a found reference, remains valid as long as the container object is not modified.
345 <li> <a href = "find_invalidation_guarantee.html"><tt>find_invalidation_guarantee</tt></a> corresponds to a guarantee that a find-type iterator, a found pointer, or a found reference, remains valid even if the containter object is modified.
347 <li> <a href = "range_invalidation_guarantee.html"><tt>range_invalidation_guarantee</tt></a> corresponds to a guarantee that a range-type iterator remains valid even if the containter object is modified.
353 To find the invalidation guarantee of a container, one can use
356 <b>typename</b> <a href = "ds_traits.html">ds_traits</a><Cntnr>::invalidation_guarantee
360 which is one of the classes in Figure
361 <a href = #invalidation_guarantee_cd">Invalidation guarantees class hierarchy</a>.