2 <!DOCTYPE part PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <part id="manual.io" xreflabel="Input and Output">
7 <?dbhtml filename="io.html"?>
20 <title>Input and Output</title>
22 <!-- Chapter 01 : Iostream Objects -->
23 <chapter id="manual.io.objects" xreflabel="IO Objects">
24 <title>Iostream Objects</title>
26 <para>To minimize the time you have to wait on the compiler, it's good to
27 only include the headers you really need. Many people simply include
28 <iostream> when they don't need to -- and that can <emphasis>penalize
29 your runtime as well.</emphasis> Here are some tips on which header to use
30 for which situations, starting with the simplest.
<para><emphasis><iosfwd></emphasis> should be included whenever you simply
need the <emphasis>name</emphasis> of an I/O-related class, such as
"ofstream" or "basic_streambuf". As the name
implies, these are forward declarations. (A word to all you fellow
old-school programmers: trying to forward declare classes like
"class istream;" won't work. Look in the iosfwd header if
you'd like to know why.) For example,
<programlisting>
#include <iosfwd>
class MyClass
{
    std::ifstream& input_file;   // a reference needs only the name
};
extern std::ostream& operator<< (std::ostream&, const MyClass&);
</programlisting>
51 <para><emphasis><ios></emphasis> declares the base classes for the entire
52 I/O stream hierarchy, std::ios_base and std::basic_ios<charT>, the
53 counting types std::streamoff and std::streamsize, the file
54 positioning type std::fpos, and the various manipulators like
55 std::hex, std::fixed, std::noshowbase, and so forth.
<para>The ios_base class is what holds the format flags, the state flags,
and the functions which change them (setf(), width(), precision(),
etc). You can also store extra data and register callback functions
through ios_base, but that has been historically underused. Anything
which doesn't depend on the type of characters stored is consolidated
in ios_base.
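<para>A minimal sketch of the extra-data facility mentioned above, assuming you
want to attach one <code>long</code> to each stream. It relies on the standard
<code>ios_base::xalloc()</code> and <code>iword()</code> members; the names
<code>indent_index</code>, <code>set_indent</code>, <code>get_indent</code>, and
<code>demo_independent_slots</code> are hypothetical and exist only for this
illustration.</para>

```cpp
#include <ios>
#include <sstream>

// Hypothetical helper: one slot index shared by all streams, allocated once.
inline int indent_index()
{
    static const int idx = std::ios_base::xalloc();
    return idx;
}

// Store a per-stream indentation level in the stream's private long array.
inline void set_indent(std::ios_base& s, long level)
{
    s.iword(indent_index()) = level;
}

// Read it back; a slot that was never written reads as 0.
inline long get_indent(std::ios_base& s)
{
    return s.iword(indent_index());
}

// Usage sketch: two streams keep independent values in the same slot.
inline bool demo_independent_slots()
{
    std::ostringstream a, b;
    set_indent(a, 4);
    return get_indent(a) == 4 && get_indent(b) == 0;
}
```

<para>Because the slot lives in ios_base, the same helpers work for any stream,
whatever its character type.</para>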
64 <para>The template class basic_ios is the highest template class in the
65 hierarchy; it is the first one depending on the character type, and
66 holds all general state associated with that type: the pointer to the
67 polymorphic stream buffer, the facet information, etc.
69 <para><emphasis><streambuf></emphasis> declares the template class
70 basic_streambuf, and two standard instantiations, streambuf and
71 wstreambuf. If you need to work with the vastly useful and capable
72 stream buffer classes, e.g., to create a new form of storage
73 transport, this header is the one to include.
75 <para><emphasis><istream></emphasis>/<emphasis><ostream></emphasis> are
76 the headers to include when you are using the >>/<<
77 interface, or any of the other abstract stream formatting functions.
<programlisting>
#include <ostream>

std::ostream& operator<< (std::ostream& os, const MyClass& c)
{
    return os << c.data1() << c.data2();
}
</programlisting>
88 <para>The std::istream and std::ostream classes are the abstract parents of
89 the various concrete implementations. If you are only using the
90 interfaces, then you only need to use the appropriate interface header.
92 <para><emphasis><iomanip></emphasis> provides "extractors and inserters
93 that alter information maintained by class ios_base and its derived
94 classes," such as std::setprecision and std::setw. If you need
95 to write expressions like <code>os << setw(3);</code> or
96 <code>is >> setbase(8);</code>, you must include <iomanip>.
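<para>For instance, the two expressions above might be put to work like this.
This is only a sketch: <code>hex_field</code> and <code>parse_octal</code> are
made-up names, and <code>std::setfill</code> is a further
<code>iomanip</code> manipulator that the text above does not mention.</para>

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a value in hexadecimal, right-aligned and zero-padded to `width`.
std::string hex_field(int value, int width)
{
    std::ostringstream os;
    os << std::hex << std::setfill('0') << std::setw(width) << value;
    return os.str();
}

// Parse a string of octal digits using the setbase manipulator.
int parse_octal(const std::string& text)
{
    std::istringstream is(text);
    int value = 0;
    is >> std::setbase(8) >> value;
    return value;
}
```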
98 <para><emphasis><sstream></emphasis>/<emphasis><fstream></emphasis>
99 declare the six stringstream and fstream classes. As they are the
100 standard concrete descendants of istream and ostream, you will already
103 <para>Finally, <emphasis><iostream></emphasis> provides the eight standard
104 global objects (cin, cout, etc). To do this correctly, this header
105 also provides the contents of the <istream> and <ostream>
106 headers, but nothing else. The contents of this header look like
<programlisting>
#include <ostream>
#include <istream>
namespace std
{
    extern istream cin;      // ... and the other seven objects
    // this is explained below
    <emphasis>static ios_base::Init __foo;</emphasis>    // not its real name
}
</programlisting>
122 <para>Now, the runtime penalty mentioned previously: the global objects
123 must be initialized before any of your own code uses them; this is
124 guaranteed by the standard. Like any other global object, they must
125 be initialized once and only once. This is typically done with a
126 construct like the one above, and the nested class ios_base::Init is
127 specified in the standard for just this reason.
129 <para>How does it work? Because the header is included before any of your
130 code, the <emphasis>__foo</emphasis> object is constructed before any of
131 your objects. (Global objects are built in the order in which they
132 are declared, and destroyed in reverse order.) The first time the
133 constructor runs, the eight stream objects are set up.
<para>The <code>static</code> keyword means that each object file compiled
from a source file containing <iostream> will have its own
private copy of <emphasis>__foo</emphasis>. The order of construction
across object files is unspecified, so one copy in each object
file means that the stream objects are guaranteed to be set up before
any of your code which uses them could run, thereby meeting the
requirements of the standard.
144 <para>The penalty, of course, is that after the first copy of
145 <emphasis>__foo</emphasis> is constructed, all the others are just wasted
146 processor time. The time spent is merely for an increment-and-test
147 inside a function call, but over several dozen or hundreds of object
148 files, that time can add up. (It's not in a tight loop, either.)
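<para>The increment-and-test can be modeled in a few lines. This is a simplified
sketch of the idiom, not the libstdc++ source; the namespace
<code>demo</code> and the counters <code>init_count</code> and
<code>setups_done</code> are invented for the illustration.</para>

```cpp
// A simplified model of the idiom, not the real library code.
namespace demo
{
    int init_count = 0;     // how many Init objects currently exist
    int setups_done = 0;    // how many times the expensive setup really ran

    struct Init
    {
        Init()
        {
            // The increment-and-test: only the first object does the work.
            if (init_count++ == 0)
                ++setups_done;   // stands in for constructing cin, cout, ...
        }
        ~Init()
        {
            // A real implementation would flush the streams when the
            // last object goes away; here we only keep count.
            --init_count;
        }
    };
}
```

<para>Every object after the first pays for the function call and the test, and
gets nothing for it.</para>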
150 <para>The lesson? Only include <iostream> when you need to use one of
151 the standard objects in that source file; you'll pay less startup
152 time. Only include the header files you need to in general; your
153 compile times will go down when there's less parsing work to do.
158 <!-- Chapter 02 : Stream Buffers -->
159 <chapter id="manual.io.streambufs" xreflabel="Stream Buffers">
160 <title>Stream Buffers</title>
162 <sect1 id="io.streambuf.derived" xreflabel="Derived streambuf Classes">
163 <title>Derived streambuf Classes</title>
<para>Creating your own stream buffers for I/O can be remarkably easy.
If you are interested in doing so, we highly recommend two books:
170 <ulink url="http://www.langer.camelot.de/iostreams.html">Standard C++
171 IOStreams and Locales</ulink> by Langer and Kreft, ISBN 0-201-18395-1, and
172 <ulink url="http://www.josuttis.com/libbook/">The C++ Standard Library</ulink>
173 by Nicolai Josuttis, ISBN 0-201-37926-0. Both are published by
174 Addison-Wesley, who isn't paying us a cent for saying that, honest.
176 <para>Here is a simple example, io/outbuf1, from the Josuttis text. It
177 transforms everything sent through it to uppercase. This version
178 assumes many things about the nature of the character type being
179 used (for more information, read the books or the newsgroups):
<programlisting>
#include <iostream>
#include <streambuf>
#include <locale>
#include <cstdio>

class outbuf : public std::streambuf
{
  protected:
    /* central output function
     * - print characters in uppercase mode
     */
    virtual int_type overflow (int_type c) {
        if (c != EOF) {
            // convert lowercase to uppercase
            c = std::toupper(static_cast<char>(c),getloc());
            // and write the character to the standard output
            if (std::putchar(c) == EOF) {
                return EOF;
            }
        }
        return c;
    }
};

int main()
{
    // create special output buffer
    outbuf ob;
    // initialize output stream with that output buffer
    std::ostream out(&ob);

    out << "31 hexadecimal: "
        << std::hex << 31 << std::endl;
}
</programlisting>
219 <para>Try it yourself! More examples can be found in 3.1.x code, in
220 <code>include/ext/*_filebuf.h</code>, and on
221 <ulink url="http://www.informatik.uni-konstanz.de/~kuehl/c++/iostream/">Dietmar
222 Kühl's IOStreams page</ulink>.
227 <sect1 id="io.streambuf.buffering" xreflabel="Buffering">
228 <title>Buffering</title>
229 <para>First, are you sure that you understand buffering? Particularly
230 the fact that C++ may not, in fact, have anything to do with it?
232 <para>The rules for buffering can be a little odd, but they aren't any
233 different from those of C. (Maybe that's why they can be a bit
234 odd.) Many people think that writing a newline to an output
235 stream automatically flushes the output buffer. This is true only
236 when the output stream is, in fact, a terminal and not a file
237 or some other device -- and <emphasis>that</emphasis> may not even be true
238 since C++ says nothing about files nor terminals. All of that is
239 system-dependent. (The "newline-buffer-flushing only occurring
240 on terminals" thing is mostly true on Unix systems, though.)
242 <para>Some people also believe that sending <code>endl</code> down an
243 output stream only writes a newline. This is incorrect; after a
244 newline is written, the buffer is also flushed. Perhaps this
245 is the effect you want when writing to a screen -- get the text
246 out as soon as possible, etc -- but the buffering is largely
247 wasted when doing this to a file:
<programlisting>
output << "a line of text" << endl;
output << some_data_variable << endl;
output << "another line of text" << endl; </programlisting>
<para>The proper thing to do in this case is to just write the data out
and let the libraries and the system worry about the buffering.
If you need a newline, just write a newline:
<programlisting>
output << "a line of text\n"
       << some_data_variable << '\n'
       << "another line of text\n"; </programlisting>
261 <para>I have also joined the output statements into a single statement.
262 You could make the code prettier by moving the single newline to
263 the start of the quoted text on the last line, for example.
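<para>If you would rather see the difference between a newline and
<code>endl</code> than take our word for it, a stream buffer that counts flush
requests will show it. This is a sketch only; <code>CountingBuf</code> is a
hypothetical class that throws the characters away and counts calls to
<code>sync()</code>.</para>

```cpp
#include <ostream>
#include <streambuf>

// Hypothetical stream buffer: discards characters, counts flush requests.
class CountingBuf : public std::streambuf
{
public:
    CountingBuf() : flushes(0) {}
    int flushes;

protected:
    virtual int_type overflow(int_type c)
    {
        // Pretend every character was written successfully.
        return traits_type::not_eof(c);
    }
    virtual int sync()
    {
        ++flushes;      // reached through flush(), and therefore through endl
        return 0;
    }
};
```

<para>Attach it to a <code>std::ostream</code>, write a few lines ending in
<code>'\n'</code>, and the counter stays at zero; each <code>endl</code> or
<code>flush</code> bumps it by one.</para>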
<para>If you do need to flush the buffer above, you can send an
<code>endl</code> if you also need a newline, or just flush the buffer
itself:
<programlisting>
output << ...... << flush;    // can use std::flush manipulator
output.flush();               // or call a member fn </programlisting>
272 <para>On the other hand, there are times when writing to a file should
273 be like writing to standard error; no buffering should be done
274 because the data needs to appear quickly (a prime example is a
275 log file for security-related information). The way to do this is
276 just to turn off the buffering <emphasis>before any I/O operations at
277 all</emphasis> have been done (note that opening counts as an I/O operation):
<programlisting>
#include <fstream>
std::ofstream    os;
std::ifstream    is;
int   i;

os.rdbuf()->pubsetbuf(0,0);
is.rdbuf()->pubsetbuf(0,0);
os.open("/foo/bar/baz");
is.open("/qux/quux/quuux");

os << "this data is written immediately\n";
is >> i;   // and this will probably cause a disk read </programlisting>
292 <para>Since all aspects of buffering are handled by a streambuf-derived
293 member, it is necessary to get at that member with <code>rdbuf()</code>.
294 Then the public version of <code>setbuf</code> can be called. The
295 arguments are the same as those for the Standard C I/O Library
296 function (a buffer area followed by its size).
298 <para>A great deal of this is implementation-dependent. For example,
299 <code>streambuf</code> does not specify any actions for its own
300 <code>setbuf()</code>-ish functions; the classes derived from
301 <code>streambuf</code> each define behavior that "makes
302 sense" for that class: an argument of (0,0) turns off buffering
303 for <code>filebuf</code> but does nothing at all for its siblings
304 <code>stringbuf</code> and <code>strstreambuf</code>, and specifying
305 anything other than (0,0) has varying effects.
306 User-defined classes derived from <code>streambuf</code> can
307 do whatever they want. (For <code>filebuf</code> and arguments for
308 <code>(p,s)</code> other than zeros, libstdc++ does what you'd expect:
309 the first <code>s</code> bytes of <code>p</code> are used as a buffer,
310 which you must allocate and deallocate.)
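<para>Handing a <code>filebuf</code> your own storage might look like the
following. Treat it as a sketch under the libstdc++ behavior described above:
<code>write_with_buffer</code> is a made-up name, and other implementations are
free to treat a non-zero <code>(p,s)</code> differently.</para>

```cpp
#include <fstream>
#include <string>
#include <vector>

// Write `text` to `path` through a filebuf that uses caller-supplied storage.
bool write_with_buffer(const std::string& path, const std::string& text,
                       std::vector<char>& buffer)
{
    std::ofstream os;
    // Must happen before open(): hand the filebuf our own storage.
    os.rdbuf()->pubsetbuf(buffer.data(),
                          static_cast<std::streamsize>(buffer.size()));
    os.open(path.c_str());
    if (!os)
        return false;
    os << text;
    os.close();          // flushes while `buffer` is still alive
    return !os.fail();
}
```

<para>The buffer belongs to the caller, so it has to outlive every use the
stream makes of it; here the stream is closed before the function returns.</para>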
312 <para>A last reminder: there are usually more buffers involved than
313 just those at the language/library level. Kernel buffers, disk
314 buffers, and the like will also have an effect. Inspecting and
315 changing those are system-dependent.
321 <!-- Chapter 03 : Memory-based Streams -->
322 <chapter id="manual.io.memstreams" xreflabel="Memory Streams">
323 <title>Memory Based Streams</title>
324 <sect1 id="manual.io.memstreams.compat" xreflabel="Compatibility strstream">
325 <title>Compatibility With strstream</title>
328 <para>Stringstreams (defined in the header <code><sstream></code>)
329 are in this author's opinion one of the coolest things since
330 sliced time. An example of their use is in the Received Wisdom
331 section for Chapter 21 (Strings),
332 <ulink url="../21_strings/howto.html#1.1internal"> describing how to
333 format strings</ulink>.
335 <para>The quick definition is: they are siblings of ifstream and ofstream,
336 and they do for <code>std::string</code> what their siblings do for
337 files. All that work you put into writing <code><<</code> and
338 <code>>></code> functions for your classes now pays off
339 <emphasis>again!</emphasis> Need to format a string before passing the string
340 to a function? Send your stuff via <code><<</code> to an
341 ostringstream. You've read a string as input and need to parse it?
342 Initialize an istringstream with that string, and then pull pieces
out of it with <code>>></code>. Have a stringstream and need to
get a copy of the string inside? Just call the <code>str()</code>
member function.
347 <para>This only works if you've written your
348 <code><<</code>/<code>>></code> functions correctly, though,
349 and correctly means that they take istreams and ostreams as
350 parameters, not i<emphasis>f</emphasis>streams and o<emphasis>f</emphasis>streams. If they
351 take the latter, then your I/O operators will work fine with
352 file streams, but with nothing else -- including stringstreams.
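<para>Here is roughly what "written correctly" buys you. The type
<code>Point</code> and the helpers <code>to_text</code> and
<code>from_text</code> are hypothetical; the point of the sketch is only that
the operators take <code>std::istream</code> and <code>std::ostream</code>, so
they work unchanged with stringstreams.</para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user type with operators written against the
// abstract istream/ostream interfaces.
struct Point
{
    int x;
    int y;
};

std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << p.x << ' ' << p.y;
}

std::istream& operator>>(std::istream& is, Point& p)
{
    return is >> p.x >> p.y;
}

// Format into a string: send the object to an ostringstream, then str().
std::string to_text(const Point& p)
{
    std::ostringstream os;
    os << p;
    return os.str();
}

// Parse from a string: initialize an istringstream and extract.
bool from_text(const std::string& text, Point& p)
{
    std::istringstream is(text);
    return static_cast<bool>(is >> p);
}
```

<para>Had the operators been declared to take <code>std::ofstream&</code> and
<code>std::ifstream&</code>, neither helper would compile.</para>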
<para>If you are a user of the strstream classes, you need to update
your code. You don't have to explicitly append <code>ends</code> to
terminate the C-style character array, you don't have to mess with
"freezing" functions, and you don't have to manage the
memory yourself. The strstreams have been officially deprecated,
which means that 1) a future revision of the C++ Standard may remove
them entirely, and 2) if you use them, people will laugh at you.
367 <!-- Chapter 04 : File-based Streams -->
368 <chapter id="manual.io.filestreams" xreflabel="File Streams">
369 <title>File Based Streams</title>
371 <sect1 id="manual.io.filestreams.copying_a_file" xreflabel="Copying a File">
372 <title>Copying a File</title>
376 <para>So you want to copy a file quickly and easily, and most important,
377 completely portably. And since this is C++, you have an open
378 ifstream (call it IN) and an open ofstream (call it OUT):
<programlisting>
#include <fstream>

std::ifstream  IN ("input_file");
std::ofstream  OUT ("output_file"); </programlisting>
385 <para>Here's the easiest way to get it completely wrong:
<programlisting>
OUT << IN;</programlisting>
389 <para>For those of you who don't already know why this doesn't work
390 (probably from having done it before), I invite you to quickly
391 create a simple text file called "input_file" containing
<programlisting>
The quick brown fox jumped over the lazy dog.</programlisting>
396 <para>surrounded by blank lines. Code it up and try it. The contents
397 of "output_file" may surprise you.
399 <para>Seriously, go do it. Get surprised, then come back. It's worth it.
401 <para>The thing to remember is that the <code>basic_[io]stream</code> classes
402 handle formatting, nothing else. In particular, they break up on
403 whitespace. The actual reading, writing, and storing of data is
404 handled by the <code>basic_streambuf</code> family. Fortunately, the
405 <code>operator<<</code> is overloaded to take an ostream and
406 a pointer-to-streambuf, in order to help with just this kind of
407 "dump the data verbatim" situation.
409 <para>Why a <emphasis>pointer</emphasis> to streambuf and not just a streambuf? Well,
410 the [io]streams hold pointers (or references, depending on the
411 implementation) to their buffers, not the actual
412 buffers. This allows polymorphic behavior on the part of the buffers
413 as well as the streams themselves. The pointer is easily retrieved
414 using the <code>rdbuf()</code> member function. Therefore, the easiest
415 way to copy the file is:
<programlisting>
OUT << IN.rdbuf();</programlisting>
<para>So what <emphasis>was</emphasis> happening with OUT<<IN? Nothing
you can rely on, since that particular << isn't defined by the Standard.
I have seen instances where it is implemented, but the character
extraction process removes all the whitespace, leaving you with no
blank lines and only "Thequickbrownfox...". With
libraries that do not define that operator, IN (or one of IN's
member pointers) sometimes gets converted to a void*, and the output
file then contains a perfect text representation of a hexadecimal
address (quite a big surprise). Others don't compile at all, which is
what a C++11 or later library should do, because there the stream's
conversion to bool is explicit.
<para>Also note that none of this is specific to o<emphasis>f</emphasis>streams.
The operators shown above are all defined in the parent
basic_ostream class and are therefore available with all possible
descendants.
438 <title>Binary Input and Output</title>
441 <para>The first and most important thing to remember about binary I/O is
442 that opening a file with <code>ios::binary</code> is not, repeat
443 <emphasis>not</emphasis>, the only thing you have to do. It is not a silver
444 bullet, and will not allow you to use the <code><</>></code>
445 operators of the normal fstreams to do binary I/O.
447 <para>Sorry. Them's the breaks.
449 <para>This isn't going to try and be a complete tutorial on reading and
450 writing binary files (because "binary"
451 <ulink url="#7">covers a lot of ground)</ulink>, but we will try and clear
452 up a couple of misconceptions and common errors.
454 <para>First, <code>ios::binary</code> has exactly one defined effect, no more
455 and no less. Normal text mode has to be concerned with the newline
456 characters, and the runtime system will translate between (for
457 example) '\n' and the appropriate end-of-line sequence (LF on Unix,
458 CRLF on DOS, CR on Macintosh, etc). (There are other things that
459 normal mode does, but that's the most obvious.) Opening a file in
460 binary mode disables this conversion, so reading a CRLF sequence
461 under Windows won't accidentally get mapped to a '\n' character, etc.
462 Binary mode is not supposed to suddenly give you a bitstream, and
463 if it is doing so in your program then you've discovered a bug in
464 your vendor's compiler (or some other part of the C++ implementation,
465 possibly the runtime system).
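<para>A small way to check the one defined effect for yourself, assuming you
can write a scratch file in the current directory.
<code>write_binary_and_measure</code> is a hypothetical helper: it writes some
bytes with the newline translation switched off and then reports how many
bytes landed on disk.</para>

```cpp
#include <fstream>
#include <string>

// Write `bytes` with newline translation disabled; return the size on disk,
// or -1 if the file could not be reopened.
std::streamoff write_binary_and_measure(const std::string& path,
                                        const std::string& bytes)
{
    {
        std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // closed and flushed here

    // Open positioned at the end, so tellg() reports the file size.
    std::ifstream in(path.c_str(),
                     std::ios::in | std::ios::binary | std::ios::ate);
    return in ? std::streamoff(in.tellg()) : std::streamoff(-1);
}
```

<para>In binary mode the count matches the number of characters written, newlines
included. Drop <code>ios::binary</code> from the output stream on a system with
a two-character end-of-line sequence and the file grows.</para>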
<para>Second, using <code><<</code> to write and <code>>></code> to
read isn't going to work with the standard file stream classes, even
if you use <code>noskipws</code> during reading. Why not? Because
ifstream and ofstream exist for the purpose of <emphasis>formatting</emphasis>,
not reading and writing. Their job is to interpret the data into
text characters, and that's exactly what you don't want to happen
during binary I/O.
<para>Third, using the <code>get()</code> and <code>put()/write()</code> member
functions still isn't guaranteed to help you. These are
"unformatted" I/O functions, but still character-based.
(This may or may not be what you want; see below.)
480 <para>Notice how all the problems here are due to the inappropriate use
481 of <emphasis>formatting</emphasis> functions and classes to perform something
482 which <emphasis>requires</emphasis> that formatting not be done? There are a
483 seemingly infinite number of solutions, and a few are listed here:
487 <para><quote>Derive your own fstream-type classes and write your own
488 <</>> operators to do binary I/O on whatever data
489 types you're using.</quote>
492 This is a Bad Thing, because while
493 the compiler would probably be just fine with it, other humans
494 are going to be confused. The overloaded bitshift operators
495 have a well-defined meaning (formatting), and this breaks it.
500 <quote>Build the file structure in memory, then
501 <code>mmap()</code> the file and copy the
506 Well, this is easy to make work, and easy to break, and is
507 pretty equivalent to using <code>::read()</code> and
508 <code>::write()</code> directly, and makes no use of the
509 iostream library at all...
514 <quote>Use streambufs, that's what they're there for.</quote>
517 While not trivial for the beginner, this is the best of all
518 solutions. The streambuf/filebuf layer is the layer that is
519 responsible for actual I/O. If you want to use the C++
520 library for binary I/O, this is where you start.
524 <para>How to go about using streambufs is a bit beyond the scope of this
525 document (at least for now), but while streambufs go a long way,
526 they still leave a couple of things up to you, the programmer.
527 As an example, byte ordering is completely between you and the
528 operating system, and you have to handle it yourself.
530 <para>Deriving a streambuf or filebuf
531 class from the standard ones, one that is specific to your data
532 types (or an abstraction thereof) is probably a good idea, and
533 lots of examples exist in journals and on Usenet. Using the
534 standard filebufs directly (either by declaring your own or by
535 using the pointer returned from an fstream's <code>rdbuf()</code>)
536 is certainly feasible as well.
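<para>Using a standard <code>filebuf</code> directly could go something like
this. It is a bare-bones sketch with invented names (<code>put_bytes</code>,
<code>get_bytes</code>) and it moves raw bytes only; what those bytes mean is
still your problem.</para>

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Write raw bytes through a filebuf, bypassing the formatting layer.
bool put_bytes(const std::string& path, const std::string& bytes)
{
    std::filebuf fb;
    if (!fb.open(path.c_str(),
                 std::ios::out | std::ios::binary | std::ios::trunc))
        return false;
    const std::streamsize n = static_cast<std::streamsize>(bytes.size());
    const bool ok = fb.sputn(bytes.data(), n) == n;
    return fb.close() != 0 && ok;   // close() returns null on failure
}

// Read up to `count` raw bytes back through a filebuf.
std::string get_bytes(const std::string& path, std::size_t count)
{
    std::filebuf fb;
    std::string bytes(count, '\0');
    if (!fb.open(path.c_str(), std::ios::in | std::ios::binary))
        return std::string();
    const std::streamsize got =
        fb.sgetn(&bytes[0], static_cast<std::streamsize>(count));
    bytes.resize(static_cast<std::size_t>(got));
    return bytes;
}
```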
538 <para>One area that causes problems is trying to do bit-by-bit operations
539 with filebufs. C++ is no different from C in this respect: I/O
540 must be done at the byte level. If you're trying to read or write
541 a few bits at a time, you're going about it the wrong way. You
542 must read/write an integral number of bytes and then process the
543 bytes. (For example, the streambuf functions take and return
544 variables of type <code>int_type</code>.)
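<para>If your format really is bit-oriented, the usual answer is to collect the
bits yourself and hand whole bytes to the stream buffer. The following is one
possible sketch, assuming 8-bit bytes and most-significant-bit-first packing;
<code>BitWriter</code> and <code>pack_bits</code> are hypothetical names.</para>

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: collects single bits and emits whole bytes.
class BitWriter
{
public:
    explicit BitWriter(std::streambuf& sink) : sink_(sink), acc_(0), used_(0) {}

    // Append one bit, most significant bit of each byte first.
    void put_bit(bool bit)
    {
        acc_ = static_cast<unsigned char>((acc_ << 1) | (bit ? 1u : 0u));
        if (++used_ == 8)
            emit();
    }

    // Pad a final partial byte with zero bits and write it out.
    void finish()
    {
        if (used_ == 0)
            return;
        acc_ = static_cast<unsigned char>(acc_ << (8 - used_));
        emit();
    }

private:
    void emit()
    {
        sink_.sputc(static_cast<char>(acc_));
        acc_ = 0;
        used_ = 0;
    }

    std::streambuf& sink_;
    unsigned char acc_;
    int used_;
};

// Usage sketch: pack a string of '0'/'1' characters into bytes.
std::string pack_bits(const std::string& bits)
{
    std::stringbuf sink;
    BitWriter writer(sink);
    for (std::string::size_type i = 0; i < bits.size(); ++i)
        writer.put_bit(bits[i] == '1');
    writer.finish();
    return sink.str();
}
```

<para>The stream buffer never sees anything smaller than a byte; the bit
bookkeeping stays entirely on your side of the interface.</para>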
546 <para>Another area of problems is opening text files in binary mode.
547 Generally, binary mode is intended for binary files, and opening
548 text files in binary mode means that you now have to deal with all of
549 those end-of-line and end-of-file problems that we mentioned before.
550 An instructive thread from comp.lang.c++.moderated delved off into
551 this topic starting more or less at
552 <ulink url="http://groups.google.com/groups?oi=djq&selm=an_436187505">this</ulink>
553 article and continuing to the end of the thread. (You'll have to
554 sort through some flames every couple of paragraphs, but the points
560 <sect1 id="manual.io.filestreams.binary2" xreflabel="Binary Input and Output">
561 <title>More Binary Input and Output</title>
562 <para>Towards the beginning of February 2001, the subject of
563 "binary" I/O was brought up in a couple of places at the
564 same time. One notable place was Usenet, where James Kanze and
565 Dietmar Kühl separately posted articles on why attempting
566 generic binary I/O was not a good idea. (Here are copies of
567 <ulink url="binary_iostreams_kanze.txt">Kanze's article</ulink> and
568 <ulink url="binary_iostreams_kuehl.txt">Kühl's article</ulink>.)
570 <para>Briefly, the problems of byte ordering and type sizes mean that
571 the unformatted functions like <code>ostream::put()</code> and
572 <code>istream::get()</code> cannot safely be used to communicate
573 between arbitrary programs, or across a network, or from one
574 invocation of a program to another invocation of the same program
575 on a different platform, etc.
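<para>The usual cure is to pick a byte order and a width and spell them out in
code, instead of dumping whatever the machine has in memory. A sketch, assuming
a 32-bit unsigned value stored most significant byte first; the function names
are made up for the example.</para>

```cpp
#include <cstdint>
#include <sstream>
#include <streambuf>
#include <string>

// Write a 32-bit value as four bytes, most significant byte first.
void put_u32_be(std::streambuf& sb, std::uint32_t value)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        sb.sputc(static_cast<char>((value >> shift) & 0xFFu));
}

// Read four bytes written by put_u32_be; returns false on a short read.
bool get_u32_be(std::streambuf& sb, std::uint32_t& value)
{
    typedef std::streambuf::traits_type traits;
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
    {
        const traits::int_type c = sb.sbumpc();
        if (traits::eq_int_type(c, traits::eof()))
            return false;
        result = (result << 8)
               | static_cast<unsigned char>(traits::to_char_type(c));
    }
    value = result;
    return true;
}

// Usage sketch: the encoded form is the same four bytes on every platform.
std::string encode_u32(std::uint32_t value)
{
    std::stringbuf sb;
    put_u32_be(sb, value);
    return sb.str();
}
```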
577 <para>The entire Usenet thread is instructive, and took place under the
578 subject heading "binary iostreams" on both comp.std.c++
579 and comp.lang.c++.moderated in parallel. Also in that thread,
580 Dietmar Kühl mentioned that he had written a pair of stream
581 classes that would read and write XDR, which is a good step towards
582 a portable binary format.
<!-- Chapter 05 : Interacting with C -->
590 <chapter id="manual.io.c" xreflabel="Interacting with C">
591 <title>Interacting with C</title>
594 <sect1 id="manual.io.c.FILE" xreflabel="Using FILE* and file descriptors">
595 <title>Using FILE* and file descriptors</title>
597 See the <link linkend="manual.ext.io">extensions</link> for using
598 <type>FILE</type> and <type>file descriptors</type> with
599 <classname>ofstream</classname> and
600 <classname>ifstream</classname>.
604 <sect1 id="manual.io.c.sync" xreflabel="Performance Issues">
605 <title>Performance</title>
607 Pathetic Performance? Ditch C.
609 <para>It sounds like a flame on C, but it isn't. Really. Calm down.
610 I'm just saying it to get your attention.
612 <para>Because the C++ library includes the C library, both C-style and
613 C++-style I/O have to work at the same time. For example:
<programlisting>
#include <iostream>
#include <cstdio>

std::cout << "Hel";
std::printf ("lo, worl");
std::cout << "d!\n";
</programlisting>
623 <para>This must do what you think it does.
625 <para>Alert members of the audience will immediately notice that buffering
626 is going to make a hash of the output unless special steps are taken.
628 <para>The special steps taken by libstdc++, at least for version 3.0,
629 involve doing very little buffering for the standard streams, leaving
630 most of the buffering to the underlying C library. (This kind of
631 thing is tricky to get right.)
632 The upside is that correctness is ensured. The downside is that
633 writing through <code>cout</code> can quite easily lead to awful
634 performance when the C++ I/O library is layered on top of the C I/O
635 library (as it is for 3.0 by default). Some patches have been applied
636 which improve the situation for 3.1.
638 <para>However, the C and C++ standard streams only need to be kept in sync
639 when both libraries' facilities are in use. If your program only uses
640 C++ I/O, then there's no need to sync with the C streams. The right
641 thing to do in this case is to call
<programlisting>
#include <emphasis>any of the I/O headers such as ios, iostream, etc</emphasis>

std::ios::sync_with_stdio(false);
</programlisting>
648 <para>You must do this before performing any I/O via the C++ stream objects.
649 Once you call this, the C++ streams will operate independently of the
650 (unused) C streams. For GCC 3.x, this means that <code>cout</code> and
651 company will become fully buffered on their own.
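<para>The call also reports the previous setting, which can be handy if a
library wants to know whether somebody got there first. A tiny sketch;
<code>use_independent_cxx_streams</code> is a hypothetical wrapper, not a
library function.</para>

```cpp
#include <iostream>

// Switch the standard C++ streams to independent buffering.
// Returns the previous setting as reported by the library:
// true if the streams were still synchronized with stdio.
bool use_independent_cxx_streams()
{
    return std::ios_base::sync_with_stdio(false);
}
```

<para>Call it at the top of <code>main()</code>, before anything has been read
from or written to the standard streams.</para>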
653 <para>Note, by the way, that the synchronization requirement only applies to
654 the standard streams (<code>cin</code>, <code>cout</code>,
656 <code>clog</code>, and their wide-character counterparts). File stream
657 objects that you declare yourself have no such requirement and are fully