l4/pkg/libstdc++-v3/contrib/libstdc++-v3-4.4/doc/html/manual/bk01pt05ch13s04.html

   1 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
   2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   3 <html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Tokenizing</title><meta name="generator" content="DocBook XSL Stylesheets V1.74.0" /><meta name="keywords" content="&#10;      ISO C++&#10;    , &#10;      library&#10;    " /><link rel="home" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="bk01pt05ch13.html" title="Chapter 13. String Classes" /><link rel="prev" href="bk01pt05ch13s03.html" title="Arbitrary Character Types" /><link rel="next" href="bk01pt05ch13s05.html" title="Shrink to Fit" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Tokenizing</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="bk01pt05ch13s03.html">Prev</a> </td><th width="60%" align="center">Chapter 13. String Classes</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt05ch13s05.html">Next</a></td></tr></table><hr /></div><div class="sect1" lang="en" xml:lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="strings.string.token"></a>Tokenizing</h2></div></div></div><p>
   4     </p><p>The Standard C (and C++) function <code class="code">strtok()</code> leaves a lot to
   5       be desired in terms of user-friendliness.  It's unintuitive, it
   6       destroys the character string on which it operates, and it requires
   7       you to handle all the memory problems.  But it does let the client
   8       code decide what to use to break the string into pieces; it allows
   9       you to choose the "whitespace," so to speak.
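</p><p>As a rough illustration of those complaints, the sketch below wraps
<code class="code">strtok()</code> in a small helper.  The name
<code class="code">split_with_strtok</code> is invented for this example and is not
part of any library; the point is only to show the writable buffer, the
loop, and the changing first argument that <code class="code">strtok()</code> needs.
</p><pre class="programlisting">
#include &lt;cstring&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper (not a library function): tokenize 'in' with strtok.
std::vector&lt;std::string&gt;
split_with_strtok(const std::string &amp;in, const char *delimiters)
{
    // strtok writes into its argument, so we must supply and manage
    // a private, writable, null-terminated copy of the input.
    std::vector&lt;char&gt; buf(in.begin(), in.end());
    buf.push_back('\0');

    std::vector&lt;std::string&gt; tokens;
    // The first call passes the buffer; every later call passes NULL
    // and relies on state that strtok keeps between calls.
    for (char *tok = std::strtok(&amp;buf[0], delimiters);
         tok != NULL;
         tok = std::strtok(NULL, delimiters))
    {
        tokens.push_back(tok);
    }
    // By now strtok has written '\0' over delimiters inside buf.
    return tokens;
}
</pre><p>The copy into <code class="code">buf</code> is what protects the caller's string
here; calling <code class="code">strtok()</code> directly on the caller's own buffer
would modify it in place.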
  10    </p><p>A C++ implementation lets us keep the good things and fix those
  11       annoyances.  The implementation here is more intuitive (you only
  12       call it once, not in a loop with a varying argument), it does not
  13       affect the original string at all, and all the memory allocation
  14       is handled for you.
  15    </p><p>It's called <code class="code">stringtok</code>, and it's a function template.  The source
  16    below is less portable than it could be, to keep the
  17    example simple (for instance, it accepts only a narrow-character
  18    <code class="code">std::string</code> as input).
  19    </p><pre class="programlisting">
  20 #include &lt;string&gt;
  21 template &lt;typename Container&gt;
  22 void
  23 stringtok(Container &amp;container, std::string const &amp;in,   // narrow std::string input only
  24           const char * const delimiters = " \t\n")             // any character listed here separates tokens
  25 {
  26     const std::string::size_type len = in.length();
  27           std::string::size_type i = 0;
  28
  29     while (i &lt; len)
  30     {
  31         // Eat leading whitespace
  32         i = in.find_first_not_of(delimiters, i);
  33         if (i == std::string::npos)
  34           return;   // Nothing left but white space
  35
  36         // Find the end of the token
  37         std::string::size_type j = in.find_first_of(delimiters, i);
  38
  39         // Push token
  40         if (j == std::string::npos)
  41         {
  42           container.push_back(in.substr(i));
  43           return;
  44         }
  45         else
  46           container.push_back(in.substr(i, j-i));
  47
  48         // Set up for next loop
  49         i = j + 1;
  50     }
  51 }
  52 </pre><p>
  53      The author uses a more general (but less readable) form of it for
  54      parsing command strings and the like.  If you compiled and ran this
  55      code using it:
  56    </p><pre class="programlisting">
  57    std::list&lt;std::string&gt;  ls;
  58    stringtok (ls, " this  \t is\t\n  a test  ");
  59    for (std::list&lt;std::string&gt;::const_iterator i = ls.begin();
  60         i != ls.end(); ++i)
  61    {
  62        std::cerr &lt;&lt; ':' &lt;&lt; (*i) &lt;&lt; ":\n";
  63    } </pre><p>You would see this as output:
  64    </p><pre class="programlisting">
  65    :this:
  66    :is:
  67    :a:
  68    :test: </pre><p>with all the whitespace removed.  The input string is left
  69       unmodified, <code class="code">ls</code> will clean up after itself, and
  70       <code class="code">ls.size()</code> will return how many tokens there were.
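</p><p>Because the container is a template parameter and the delimiters are an
argument, the same function can fill other containers and split on other
characters.  The following is only a sketch: the
<code class="code">stringtok</code> body is a condensed restatement of the
listing above, repeated so the block compiles on its own, and
<code class="code">split_fields</code> is a hypothetical helper invented for this
example.
</p><pre class="programlisting">
#include &lt;string&gt;
#include &lt;vector&gt;

// Same logic as the stringtok listing above, condensed.
template &lt;typename Container&gt;
void
stringtok(Container &amp;container, std::string const &amp;in,
          const char * const delimiters = " \t\n")
{
    std::string::size_type i = 0;
    // Skip delimiters; stop when nothing but delimiters is left.
    while ((i = in.find_first_not_of(delimiters, i)) != std::string::npos)
    {
        std::string::size_type j = in.find_first_of(delimiters, i);
        if (j == std::string::npos)
        {
            container.push_back(in.substr(i));   // last token
            return;
        }
        container.push_back(in.substr(i, j - i));
        i = j + 1;
    }
}

// Hypothetical helper: split a record on commas and semicolons.
std::vector&lt;std::string&gt;
split_fields(std::string const &amp;record)
{
    std::vector&lt;std::string&gt; fields;
    stringtok(fields, record, ",;");
    return fields;
}
</pre><p>With this sketch, <code class="code">split_fields("a,,b;c")</code> yields the
three tokens <code class="code">a</code>, <code class="code">b</code> and
<code class="code">c</code>: adjacent delimiters are skipped together, so no empty
token is ever produced.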
  71    </p><p>As always, there is a price to pay: stringtok is not as fast as
  72       strtok, because every token is copied into a new string.  The other benefits usually outweigh that, however.
  73       <a class="ulink" href="stringtok_std_h.txt" target="_top">Another version of stringtok is given
  74       here</a>, suggested by Chris King and tweaked by Petr Prikryl,
  75       and this one uses the
  76       transformation functions mentioned below.  If you are comfortable
  77       with reading the new function names, this version is recommended
  78       as an example.
  79    </p><p><span class="emphasis"><em>Added February 2001:</em></span>  Mark Wilden pointed out that the
  80       standard <code class="code">std::getline()</code> function can be used with standard
  81       <a class="ulink" href="../27_io/howto.html" target="_top">istringstreams</a> to perform
  82       tokenizing as well.  Build an istringstream from the input text,
  83       and then use std::getline with varying delimiters (the three-argument
  84       signature) to extract tokens into a string.
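</p><p>A minimal sketch of that approach follows.  The helper name
<code class="code">split_with_getline</code> is made up for this example, and it
uses one delimiter for the whole input, although the delimiter passed to
<code class="code">std::getline</code> could just as well change from call to call.
</p><pre class="programlisting">
#include &lt;sstream&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical helper: split 'in' on a single delimiter character.
std::vector&lt;std::string&gt;
split_with_getline(const std::string &amp;in, char delimiter)
{
    std::istringstream stream(in);
    std::vector&lt;std::string&gt; tokens;
    std::string token;
    // The three-argument std::getline reads up to 'delimiter',
    // stores the text before it in 'token', and discards the delimiter.
    while (std::getline(stream, token, delimiter))
        tokens.push_back(token);
    return tokens;
}
</pre><p>Note the difference from <code class="code">stringtok</code>:
<code class="code">std::getline</code> takes a single delimiter character rather
than a set, and adjacent delimiters produce empty tokens, so
<code class="code">split_with_getline("a,,b", ',')</code> gives
<code class="code">a</code>, an empty string, and <code class="code">b</code>.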
  85    </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="bk01pt05ch13s03.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="bk01pt05ch13.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="bk01pt05ch13s05.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Arbitrary Character Types </td><td width="20%" align="center"><a accesskey="h" href="../spine.html">Home</a></td><td width="40%" align="right" valign="top"> Shrink to Fit</td></tr></table></div></body></html>