2 <!DOCTYPE part PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <part id="manual.io" xreflabel="Input and Output">
7 <?dbhtml filename="io.html"?>
22 <indexterm><primary>Input and Output</primary></indexterm>
25 <!-- Chapter 01 : Iostream Objects -->
26 <chapter id="manual.io.objects" xreflabel="IO Objects">
27 <?dbhtml filename="iostream_objects.html"?>
28 <title>Iostream Objects</title>
30 <para>To minimize the time you have to wait on the compiler, it's good to
31 only include the headers you really need. Many people simply include
32 <iostream> when they don't need to -- and that can <emphasis>penalize
33 your runtime as well.</emphasis> Here are some tips on which header to use
34 for which situations, starting with the simplest.
36 <para><emphasis><iosfwd></emphasis> should be included whenever you simply
37 need the <emphasis>name</emphasis> of an I/O-related class, such as
38 "ofstream" or "basic_streambuf". Like the name
39 implies, these are forward declarations. (A word to all you fellow
40 old school programmers: trying to forward declare classes like
41 "class istream;" won't work. Look in the iosfwd header if
42 you'd like to know why.) For example,
#include <iosfwd>

class MyClass
{
    std::ifstream& input_file;
};

extern std::ostream& operator<< (std::ostream&, MyClass&);
55 <para><emphasis><ios></emphasis> declares the base classes for the entire
56 I/O stream hierarchy, std::ios_base and std::basic_ios<charT>, the
57 counting types std::streamoff and std::streamsize, the file
58 positioning type std::fpos, and the various manipulators like
59 std::hex, std::fixed, std::noshowbase, and so forth.
61 <para>The ios_base class is what holds the format flags, the state flags,
62 and the functions which change them (setf(), width(), precision(),
63 etc). You can also store extra data and register callback functions
64 through ios_base, but that has been historically underused. Anything
which doesn't depend on the type of characters stored is consolidated
here.
68 <para>The template class basic_ios is the highest template class in the
69 hierarchy; it is the first one depending on the character type, and
70 holds all general state associated with that type: the pointer to the
71 polymorphic stream buffer, the facet information, etc.
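<para>As a minimal sketch of the <code>ios_base</code> formatting state
described above, the following helper sets a few flags on a string stream
before writing a value. The function name <code>format_price</code> is
invented for this illustration; <code>setf()</code>,
<code>precision()</code>, and <code>width()</code> are the standard members
mentioned in the text.</para>

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value with two decimals in an 8-wide field.
std::string format_price(double value)
{
    std::ostringstream os;

    // All three settings live in ios_base and do not depend on the
    // character type of the stream.
    os.setf(std::ios_base::fixed, std::ios_base::floatfield);
    os.precision(2);
    os.width(8);   // applies to the next formatted insertion only

    os << value;
    return os.str();
}
```

<para>With these settings, <code>format_price(3.14159)</code> should yield
<code>3.14</code> right-aligned in a field of eight characters. The flags and
precision persist on the stream, while the width is reset after each
insertion.</para>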
73 <para><emphasis><streambuf></emphasis> declares the template class
74 basic_streambuf, and two standard instantiations, streambuf and
75 wstreambuf. If you need to work with the vastly useful and capable
76 stream buffer classes, e.g., to create a new form of storage
77 transport, this header is the one to include.
79 <para><emphasis><istream></emphasis>/<emphasis><ostream></emphasis> are
80 the headers to include when you are using the >>/<<
81 interface, or any of the other abstract stream formatting functions.
#include <ostream>

// MyClass and its data1()/data2() accessors are assumed to be declared elsewhere.
std::ostream& operator<< (std::ostream& os, MyClass& c)
{
    return os << c.data1() << c.data2();
}
92 <para>The std::istream and std::ostream classes are the abstract parents of
93 the various concrete implementations. If you are only using the
94 interfaces, then you only need to use the appropriate interface header.
96 <para><emphasis><iomanip></emphasis> provides "extractors and inserters
97 that alter information maintained by class ios_base and its derived
98 classes," such as std::setprecision and std::setw. If you need
99 to write expressions like <code>os << setw(3);</code> or
100 <code>is >> setbase(8);</code>, you must include <iomanip>.
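<para>A short sketch of the two expressions above in use, with string
streams standing in for <code>os</code> and <code>is</code>. The helper names
<code>pad_number</code> and <code>parse_octal</code> are made up for this
example; <code>setw</code>, <code>setfill</code>, and <code>setbase</code>
come from the <code>iomanip</code> header.</para>

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: writes n in a field at least three characters wide,
// padded on the left with zeros.
std::string pad_number(int n)
{
    std::ostringstream os;
    os << std::setw(3) << std::setfill('0') << n;
    return os.str();
}

// Hypothetical helper: reads text as an octal number.
int parse_octal(const std::string& text)
{
    std::istringstream is(text);
    int value = 0;
    is >> std::setbase(8) >> value;
    return value;
}
```

<para>For instance, <code>pad_number(7)</code> should produce
<code>007</code>, and <code>parse_octal("17")</code> should produce
fifteen.</para>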
102 <para><emphasis><sstream></emphasis>/<emphasis><fstream></emphasis>
103 declare the six stringstream and fstream classes. As they are the
standard concrete descendants of istream and ostream, you will already
know about them.
107 <para>Finally, <emphasis><iostream></emphasis> provides the eight standard
108 global objects (cin, cout, etc). To do this correctly, this header
109 also provides the contents of the <istream> and <ostream>
110 headers, but nothing else. The contents of this header look like
113 #include <ostream>
114 #include <istream>
122 // this is explained below
123 <emphasis>static ios_base::Init __foo;</emphasis> // not its real name
126 <para>Now, the runtime penalty mentioned previously: the global objects
127 must be initialized before any of your own code uses them; this is
128 guaranteed by the standard. Like any other global object, they must
129 be initialized once and only once. This is typically done with a
130 construct like the one above, and the nested class ios_base::Init is
131 specified in the standard for just this reason.
133 <para>How does it work? Because the header is included before any of your
134 code, the <emphasis>__foo</emphasis> object is constructed before any of
your objects. (Within a single translation unit, global objects are built
in the order in which they are declared, and destroyed in reverse order.)
The first time the
137 constructor runs, the eight stream objects are set up.
139 <para>The <code>static</code> keyword means that each object file compiled
140 from a source file containing <iostream> will have its own
141 private copy of <emphasis>__foo</emphasis>. There is no specified order
142 of construction across object files (it's one of those pesky NP
143 problems that make life so interesting), so one copy in each object
144 file means that the stream objects are guaranteed to be set up before
145 any of your code which uses them could run, thereby meeting the
146 requirements of the standard.
148 <para>The penalty, of course, is that after the first copy of
149 <emphasis>__foo</emphasis> is constructed, all the others are just wasted
150 processor time. The time spent is merely for an increment-and-test
151 inside a function call, but over several dozen or hundreds of object
152 files, that time can add up. (It's not in a tight loop, either.)
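<para>The increment-and-test mentioned above can be sketched roughly as
follows. This is an illustration of the general counting idiom only, not the
actual libstdc++ implementation; the namespace <code>sketch</code> and the
variables <code>init_count</code> and <code>setup_runs</code> are invented
for the example.</para>

```cpp
namespace sketch
{
    // Shared by every Init object in the program.
    int init_count = 0;
    // Counts how often the "expensive" setup actually ran.
    int setup_runs = 0;

    struct Init
    {
        Init()
        {
            // Increment-and-test: only the first object does real work.
            if (init_count++ == 0)
                ++setup_runs;   // stands in for setting up cin, cout, ...
        }

        ~Init()
        {
            // A real implementation would flush the streams when the
            // count drops back to zero.
            --init_count;
        }
    };
}
```

<para>Constructing any number of <code>sketch::Init</code> objects runs the
setup once; every later constructor pays only for the counter
update.</para>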
154 <para>The lesson? Only include <iostream> when you need to use one of
155 the standard objects in that source file; you'll pay less startup
156 time. Only include the header files you need to in general; your
157 compile times will go down when there's less parsing work to do.
162 <!-- Chapter 02 : Stream Buffers -->
163 <chapter id="manual.io.streambufs" xreflabel="Stream Buffers">
164 <?dbhtml filename="streambufs.html"?>
165 <title>Stream Buffers</title>
167 <sect1 id="io.streambuf.derived" xreflabel="Derived streambuf Classes">
168 <title>Derived streambuf Classes</title>
172 <para>Creating your own stream buffers for I/O can be remarkably easy.
If you are interested in doing so, we highly recommend two very good books:
175 <ulink url="http://www.langer.camelot.de/iostreams.html">Standard C++
176 IOStreams and Locales</ulink> by Langer and Kreft, ISBN 0-201-18395-1, and
177 <ulink url="http://www.josuttis.com/libbook/">The C++ Standard Library</ulink>
178 by Nicolai Josuttis, ISBN 0-201-37926-0. Both are published by
179 Addison-Wesley, who isn't paying us a cent for saying that, honest.
181 <para>Here is a simple example, io/outbuf1, from the Josuttis text. It
182 transforms everything sent through it to uppercase. This version
183 assumes many things about the nature of the character type being
184 used (for more information, read the books or the newsgroups):
#include <iostream>
#include <streambuf>
#include <locale>
#include <cstdio>

class outbuf : public std::streambuf
{
  protected:
    /* central output function
     * - print characters in uppercase mode
     */
    virtual int_type overflow (int_type c) {
        if (c != EOF) {
            // convert lowercase to uppercase
            c = std::toupper(static_cast<char>(c),getloc());

            // and write the character to the standard output
            if (putchar(c) == EOF) {
                return EOF;
            }
        }
        return c;
    }
};

int main()
{
    // create special output buffer
    outbuf ob;
    // initialize output stream with that output buffer
    std::ostream out(&ob);

    out << "31 hexadecimal: "
        << std::hex << 31 << std::endl;
}
224 <para>Try it yourself! More examples can be found in 3.1.x code, in
225 <code>include/ext/*_filebuf.h</code>, and on
226 <ulink url="http://www.informatik.uni-konstanz.de/~kuehl/c++/iostream/">Dietmar
227 Kühl's IOStreams page</ulink>.
232 <sect1 id="io.streambuf.buffering" xreflabel="Buffering">
233 <title>Buffering</title>
234 <para>First, are you sure that you understand buffering? Particularly
235 the fact that C++ may not, in fact, have anything to do with it?
237 <para>The rules for buffering can be a little odd, but they aren't any
238 different from those of C. (Maybe that's why they can be a bit
239 odd.) Many people think that writing a newline to an output
240 stream automatically flushes the output buffer. This is true only
241 when the output stream is, in fact, a terminal and not a file
242 or some other device -- and <emphasis>that</emphasis> may not even be true
since C++ says nothing about files or terminals. All of that is
244 system-dependent. (The "newline-buffer-flushing only occurring
245 on terminals" thing is mostly true on Unix systems, though.)
247 <para>Some people also believe that sending <code>endl</code> down an
248 output stream only writes a newline. This is incorrect; after a
249 newline is written, the buffer is also flushed. Perhaps this
250 is the effect you want when writing to a screen -- get the text
251 out as soon as possible, etc -- but the buffering is largely
252 wasted when doing this to a file:
255 output << "a line of text" << endl;
256 output << some_data_variable << endl;
257 output << "another line of text" << endl; </programlisting>
<para>The proper thing to do in this case is to just write the data out
259 and let the libraries and the system worry about the buffering.
260 If you need a newline, just write a newline:
263 output << "a line of text\n"
264 << some_data_variable << '\n'
265 << "another line of text\n"; </programlisting>
266 <para>I have also joined the output statements into a single statement.
267 You could make the code prettier by moving the single newline to
268 the start of the quoted text on the last line, for example.
270 <para>If you do need to flush the buffer above, you can send an
<code>endl</code> if you also need a newline, or just flush the buffer
itself:
275 output << ...... << flush; // can use std::flush manipulator
276 output.flush(); // or call a member fn </programlisting>
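<para>To make the difference between a plain newline and
<code>endl</code> visible, here is a small sketch that counts flush requests.
The class <code>counting_buf</code> and the two functions are hypothetical
names for this example; the buffer simply discards its input and records
every call to <code>sync()</code>, which is what a flush ends up
invoking.</para>

```cpp
#include <ostream>
#include <streambuf>

// Hypothetical buffer that discards its input and counts flush requests.
class counting_buf : public std::streambuf
{
  public:
    counting_buf() : syncs(0) {}
    int syncs;

  protected:
    // Accept and discard every character.
    virtual int_type overflow(int_type c) { return traits_type::not_eof(c); }

    // Called whenever the owning stream is flushed.
    virtual int sync() { ++syncs; return 0; }
};

// Writes two lines terminated by '\n' and reports the number of flushes.
int flushes_with_newline()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text\n" << "another line of text\n";
    return buf.syncs;
}

// Writes the same two lines terminated by endl.
int flushes_with_endl()
{
    counting_buf buf;
    std::ostream output(&buf);
    output << "a line of text" << std::endl
           << "another line of text" << std::endl;
    return buf.syncs;
}
```

<para>The first function should report no flushes at all, and the second
should report one per <code>endl</code>.</para>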
277 <para>On the other hand, there are times when writing to a file should
278 be like writing to standard error; no buffering should be done
279 because the data needs to appear quickly (a prime example is a
280 log file for security-related information). The way to do this is
281 just to turn off the buffering <emphasis>before any I/O operations at
282 all</emphasis> have been done (note that opening counts as an I/O operation):
#include <fstream>

std::ofstream os;
std::ifstream is;
int i;

os.rdbuf()->pubsetbuf(0,0);
is.rdbuf()->pubsetbuf(0,0);
292 os.open("/foo/bar/baz");
293 is.open("/qux/quux/quuux");
295 os << "this data is written immediately\n";
296 is >> i; // and this will probably cause a disk read </programlisting>
297 <para>Since all aspects of buffering are handled by a streambuf-derived
298 member, it is necessary to get at that member with <code>rdbuf()</code>.
299 Then the public version of <code>setbuf</code> can be called. The
300 arguments are the same as those for the Standard C I/O Library
301 function (a buffer area followed by its size).
303 <para>A great deal of this is implementation-dependent. For example,
304 <code>streambuf</code> does not specify any actions for its own
305 <code>setbuf()</code>-ish functions; the classes derived from
306 <code>streambuf</code> each define behavior that "makes
307 sense" for that class: an argument of (0,0) turns off buffering
308 for <code>filebuf</code> but does nothing at all for its siblings
309 <code>stringbuf</code> and <code>strstreambuf</code>, and specifying
310 anything other than (0,0) has varying effects.
311 User-defined classes derived from <code>streambuf</code> can
312 do whatever they want. (For <code>filebuf</code> and arguments for
313 <code>(p,s)</code> other than zeros, libstdc++ does what you'd expect:
314 the first <code>s</code> bytes of <code>p</code> are used as a buffer,
315 which you must allocate and deallocate.)
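<para>A rough way to observe the unbuffered behavior is sketched below. It
assumes an implementation that honors the (0,0) request when made before
<code>open()</code>, as described above for libstdc++; the function name
<code>write_then_peek</code> and the scratch file path passed to it are
placeholders for this example.</para>

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes one line through an unbuffered ofstream and, without flushing or
// closing that stream, reads the file back through a second stream.
std::string write_then_peek(const char* path)
{
    std::ofstream os;
    os.rdbuf()->pubsetbuf(0, 0);   // must come before open()
    os.open(path);
    os << "written immediately\n";

    std::string line;
    std::ifstream is(path);
    std::getline(is, line);        // os is still open and was never flushed

    is.close();
    os.close();
    std::remove(path);             // tidy up the scratch file
    return line;
}
```

<para>If buffering really is off, the returned string holds the line just
written even though the output stream was never flushed. With the default
buffering the second stream would most likely see an empty file.</para>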
317 <para>A last reminder: there are usually more buffers involved than
318 just those at the language/library level. Kernel buffers, disk
319 buffers, and the like will also have an effect. Inspecting and
320 changing those are system-dependent.
326 <!-- Chapter 03 : Memory-based Streams -->
327 <chapter id="manual.io.memstreams" xreflabel="Memory Streams">
328 <?dbhtml filename="stringstreams.html"?>
329 <title>Memory Based Streams</title>
330 <sect1 id="manual.io.memstreams.compat" xreflabel="Compatibility strstream">
331 <title>Compatibility With strstream</title>
334 <para>Stringstreams (defined in the header <code><sstream></code>)
335 are in this author's opinion one of the coolest things since
336 sliced time. An example of their use is in the Received Wisdom
337 section for Chapter 21 (Strings),
338 <ulink url="../21_strings/howto.html#1.1internal"> describing how to
339 format strings</ulink>.
341 <para>The quick definition is: they are siblings of ifstream and ofstream,
342 and they do for <code>std::string</code> what their siblings do for
343 files. All that work you put into writing <code><<</code> and
344 <code>>></code> functions for your classes now pays off
345 <emphasis>again!</emphasis> Need to format a string before passing the string
346 to a function? Send your stuff via <code><<</code> to an
347 ostringstream. You've read a string as input and need to parse it?
348 Initialize an istringstream with that string, and then pull pieces
349 out of it with <code>>></code>. Have a stringstream and need to
get a copy of the string inside? Just call the <code>str()</code>
member function.
353 <para>This only works if you've written your
354 <code><<</code>/<code>>></code> functions correctly, though,
355 and correctly means that they take istreams and ostreams as
356 parameters, not i<emphasis>f</emphasis>streams and o<emphasis>f</emphasis>streams. If they
357 take the latter, then your I/O operators will work fine with
358 file streams, but with nothing else -- including stringstreams.
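<para>A small sketch of operators written the recommended way, taking
<code>std::ostream</code> and <code>std::istream</code>, and then reused with
string streams. The type <code>Point</code> and the helpers
<code>to_text</code> and <code>from_text</code> are invented for this
example.</para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user type with formatted I/O operators.
struct Point
{
    int x;
    int y;
};

// Takes the abstract ostream, so it works with files, strings, and cout.
std::ostream& operator<<(std::ostream& os, const Point& p)
{
    return os << p.x << ' ' << p.y;
}

// Takes the abstract istream for the same reason.
std::istream& operator>>(std::istream& is, Point& p)
{
    return is >> p.x >> p.y;
}

// Formats a Point into a string by way of an ostringstream.
std::string to_text(const Point& p)
{
    std::ostringstream os;
    os << p;
    return os.str();
}

// Parses a Point out of a string by way of an istringstream.
Point from_text(const std::string& text)
{
    std::istringstream is(text);
    Point p = {0, 0};
    is >> p;
    return p;
}
```

<para>Had the operators been declared against <code>std::ofstream</code>
and <code>std::ifstream</code>, neither helper would compile.</para>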
360 <para>If you are a user of the strstream classes, you need to update
361 your code. You don't have to explicitly append <code>ends</code> to
362 terminate the C-style character array, you don't have to mess with
363 "freezing" functions, and you don't have to manage the
364 memory yourself. The strstreams have been officially deprecated,
365 which means that 1) future revisions of the C++ Standard won't
366 support them, and 2) if you use them, people will laugh at you.
373 <!-- Chapter 04 : File-based Streams -->
374 <chapter id="manual.io.filestreams" xreflabel="File Streams">
375 <?dbhtml filename="fstreams.html"?>
376 <title>File Based Streams</title>
378 <sect1 id="manual.io.filestreams.copying_a_file" xreflabel="Copying a File">
379 <title>Copying a File</title>
383 <para>So you want to copy a file quickly and easily, and most important,
384 completely portably. And since this is C++, you have an open
385 ifstream (call it IN) and an open ofstream (call it OUT):
388 #include <fstream>
390 std::ifstream IN ("input_file");
391 std::ofstream OUT ("output_file"); </programlisting>
392 <para>Here's the easiest way to get it completely wrong:
395 OUT << IN;</programlisting>
396 <para>For those of you who don't already know why this doesn't work
397 (probably from having done it before), I invite you to quickly
398 create a simple text file called "input_file" containing
402 The quick brown fox jumped over the lazy dog.</programlisting>
403 <para>surrounded by blank lines. Code it up and try it. The contents
404 of "output_file" may surprise you.
406 <para>Seriously, go do it. Get surprised, then come back. It's worth it.
408 <para>The thing to remember is that the <code>basic_[io]stream</code> classes
409 handle formatting, nothing else. In particular, they break up on
410 whitespace. The actual reading, writing, and storing of data is
411 handled by the <code>basic_streambuf</code> family. Fortunately, the
412 <code>operator<<</code> is overloaded to take an ostream and
413 a pointer-to-streambuf, in order to help with just this kind of
414 "dump the data verbatim" situation.
416 <para>Why a <emphasis>pointer</emphasis> to streambuf and not just a streambuf? Well,
417 the [io]streams hold pointers (or references, depending on the
418 implementation) to their buffers, not the actual
419 buffers. This allows polymorphic behavior on the part of the buffers
420 as well as the streams themselves. The pointer is easily retrieved
421 using the <code>rdbuf()</code> member function. Therefore, the easiest
422 way to copy the file is:
425 OUT << IN.rdbuf();</programlisting>
426 <para>So what <emphasis>was</emphasis> happening with OUT<<IN? Undefined
427 behavior, since that particular << isn't defined by the Standard.
428 I have seen instances where it is implemented, but the character
429 extraction process removes all the whitespace, leaving you with no
430 blank lines and only "Thequickbrownfox...". With
431 libraries that do not define that operator, IN (or one of IN's
432 member pointers) sometimes gets converted to a void*, and the output
433 file then contains a perfect text representation of a hexadecimal
434 address (quite a big surprise). Others don't compile at all.
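<para>The <code>rdbuf()</code> technique can be wrapped in a function that
takes the abstract stream types, sketched here with string streams so that
it runs without any files. The names <code>copy_stream</code> and
<code>copy_text</code> are made up for this illustration.</para>

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies everything remaining in 'in' to 'out', byte for byte.
void copy_stream(std::istream& in, std::ostream& out)
{
    out << in.rdbuf();
}

// Round-trips a string through copy_stream using string streams.
std::string copy_text(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    copy_stream(in, out);
    return out.str();
}
```

<para>Feeding it the fox sentence surrounded by blank lines should return
exactly the same text, blank lines and spaces included. Passing an open
<code>ifstream</code> and <code>ofstream</code> to
<code>copy_stream</code> copies a file in the same way.</para>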
<para>Also note that none of this is specific to o<emphasis>f</emphasis>streams.
The operators shown above are all defined in the parent
basic_ostream class and are therefore available with all possible
descendants.
444 <sect1 id="manual.io.filestreams.binary" xreflabel="Binary Input and Output">
445 <title>Binary Input and Output</title>
448 <para>The first and most important thing to remember about binary I/O is
449 that opening a file with <code>ios::binary</code> is not, repeat
450 <emphasis>not</emphasis>, the only thing you have to do. It is not a silver
451 bullet, and will not allow you to use the <code><</>></code>
452 operators of the normal fstreams to do binary I/O.
454 <para>Sorry. Them's the breaks.
<para>This isn't going to try to be a complete tutorial on reading and
writing binary files (because "binary"
<ulink url="#7">covers a lot of ground</ulink>), but we will try to clear
up a couple of misconceptions and common errors.
461 <para>First, <code>ios::binary</code> has exactly one defined effect, no more
462 and no less. Normal text mode has to be concerned with the newline
463 characters, and the runtime system will translate between (for
464 example) '\n' and the appropriate end-of-line sequence (LF on Unix,
465 CRLF on DOS, CR on Macintosh, etc). (There are other things that
466 normal mode does, but that's the most obvious.) Opening a file in
467 binary mode disables this conversion, so reading a CRLF sequence
468 under Windows won't accidentally get mapped to a '\n' character, etc.
469 Binary mode is not supposed to suddenly give you a bitstream, and
470 if it is doing so in your program then you've discovered a bug in
471 your vendor's compiler (or some other part of the C++ implementation,
472 possibly the runtime system).
474 <para>Second, using <code><<</code> to write and <code>>></code> to
475 read isn't going to work with the standard file stream classes, even
if you use <code>noskipws</code> during reading. Why not? Because
477 ifstream and ofstream exist for the purpose of <emphasis>formatting</emphasis>,
478 not reading and writing. Their job is to interpret the data into
text characters, and that's exactly what you don't want to happen
during binary I/O.
482 <para>Third, using the <code>get()</code> and <code>put()/write()</code> member
functions still isn't guaranteed to help you. These are
484 "unformatted" I/O functions, but still character-based.
485 (This may or may not be what you want, see below.)
487 <para>Notice how all the problems here are due to the inappropriate use
488 of <emphasis>formatting</emphasis> functions and classes to perform something
489 which <emphasis>requires</emphasis> that formatting not be done? There are a
490 seemingly infinite number of solutions, and a few are listed here:
494 <para><quote>Derive your own fstream-type classes and write your own
495 <</>> operators to do binary I/O on whatever data
496 types you're using.</quote>
499 This is a Bad Thing, because while
500 the compiler would probably be just fine with it, other humans
501 are going to be confused. The overloaded bitshift operators
502 have a well-defined meaning (formatting), and this breaks it.
<quote>Build the file structure in memory, then
<code>mmap()</code> the file and copy the
structure.</quote>
513 Well, this is easy to make work, and easy to break, and is
514 pretty equivalent to using <code>::read()</code> and
515 <code>::write()</code> directly, and makes no use of the
516 iostream library at all...
521 <quote>Use streambufs, that's what they're there for.</quote>
524 While not trivial for the beginner, this is the best of all
525 solutions. The streambuf/filebuf layer is the layer that is
526 responsible for actual I/O. If you want to use the C++
527 library for binary I/O, this is where you start.
531 <para>How to go about using streambufs is a bit beyond the scope of this
532 document (at least for now), but while streambufs go a long way,
533 they still leave a couple of things up to you, the programmer.
534 As an example, byte ordering is completely between you and the
535 operating system, and you have to handle it yourself.
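<para>One way to take byte ordering into your own hands is to pick an order
and write the bytes individually through the stream buffer. The sketch below
assumes a big-endian layout for a 32-bit value; the function names
<code>put_u32_be</code> and <code>get_u32_be</code> are hypothetical, and a
<code>stringbuf</code> or <code>filebuf</code> can be passed equally
well.</para>

```cpp
#include <cstdint>
#include <sstream>
#include <streambuf>

// Writes v as four bytes, most significant byte first.
bool put_u32_be(std::streambuf& sb, std::uint32_t v)
{
    const char bytes[4] = {
        static_cast<char>((v >> 24) & 0xFF),
        static_cast<char>((v >> 16) & 0xFF),
        static_cast<char>((v >> 8) & 0xFF),
        static_cast<char>(v & 0xFF)
    };
    return sb.sputn(bytes, 4) == 4;
}

// Reads four bytes written by put_u32_be and reassembles the value.
bool get_u32_be(std::streambuf& sb, std::uint32_t& v)
{
    char bytes[4];
    if (sb.sgetn(bytes, 4) != 4)
        return false;

    // Go through unsigned char so that negative char values do not
    // sign-extend into the result.
    v = (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[0])) << 24)
      | (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[1])) << 16)
      | (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[2])) << 8)
      |  static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[3]));
    return true;
}
```

<para>Because the byte order is fixed by the code and not by the host, the
same four bytes come out on any platform.</para>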
537 <para>Deriving a streambuf or filebuf
538 class from the standard ones, one that is specific to your data
539 types (or an abstraction thereof) is probably a good idea, and
540 lots of examples exist in journals and on Usenet. Using the
541 standard filebufs directly (either by declaring your own or by
542 using the pointer returned from an fstream's <code>rdbuf()</code>)
543 is certainly feasible as well.
545 <para>One area that causes problems is trying to do bit-by-bit operations
546 with filebufs. C++ is no different from C in this respect: I/O
547 must be done at the byte level. If you're trying to read or write
548 a few bits at a time, you're going about it the wrong way. You
549 must read/write an integral number of bytes and then process the
550 bytes. (For example, the streambuf functions take and return
551 variables of type <code>int_type</code>.)
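<para>As a sketch of the "whole bytes first, bits afterwards" approach,
the following packs eight flags into one byte before handing it to the
stream buffer, and unpacks them after reading the byte back. The function
names and the bit layout (flag <code>i</code> in bit <code>i</code>) are
assumptions made for this example.</para>

```cpp
#include <sstream>
#include <streambuf>

// Packs eight flags into one byte (flag i goes into bit i) and writes it.
bool put_flags(std::streambuf& sb, const bool (&flags)[8])
{
    typedef std::streambuf::traits_type traits;

    unsigned char byte = 0;
    for (int i = 0; i < 8; ++i)
        if (flags[i])
            byte |= static_cast<unsigned char>(1u << i);

    // sputc returns an int_type so that it can also signal end-of-file.
    return !traits::eq_int_type(sb.sputc(static_cast<char>(byte)),
                                traits::eof());
}

// Reads one byte and unpacks it into eight flags.
bool get_flags(std::streambuf& sb, bool (&flags)[8])
{
    typedef std::streambuf::traits_type traits;

    const std::streambuf::int_type c = sb.sbumpc();
    if (traits::eq_int_type(c, traits::eof()))
        return false;

    const unsigned char byte =
        static_cast<unsigned char>(traits::to_char_type(c));
    for (int i = 0; i < 8; ++i)
        flags[i] = ((byte >> i) & 1u) != 0;
    return true;
}
```

<para>All bit manipulation happens in memory; the stream buffer only ever
sees complete bytes.</para>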
553 <para>Another area of problems is opening text files in binary mode.
554 Generally, binary mode is intended for binary files, and opening
555 text files in binary mode means that you now have to deal with all of
556 those end-of-line and end-of-file problems that we mentioned before.
557 An instructive thread from comp.lang.c++.moderated delved off into
558 this topic starting more or less at
559 <ulink url="http://groups.google.com/groups?oi=djq&selm=an_436187505">this</ulink>
560 article and continuing to the end of the thread. (You'll have to
561 sort through some flames every couple of paragraphs, but the points
567 <sect1 id="manual.io.filestreams.binary2" xreflabel="Binary Input and Output">
568 <title>More Binary Input and Output</title>
569 <para>Towards the beginning of February 2001, the subject of
570 "binary" I/O was brought up in a couple of places at the
571 same time. One notable place was Usenet, where James Kanze and
572 Dietmar Kühl separately posted articles on why attempting
573 generic binary I/O was not a good idea. (Here are copies of
574 <ulink url="binary_iostreams_kanze.txt">Kanze's article</ulink> and
575 <ulink url="binary_iostreams_kuehl.txt">Kühl's article</ulink>.)
577 <para>Briefly, the problems of byte ordering and type sizes mean that
578 the unformatted functions like <code>ostream::put()</code> and
579 <code>istream::get()</code> cannot safely be used to communicate
580 between arbitrary programs, or across a network, or from one
581 invocation of a program to another invocation of the same program
582 on a different platform, etc.
584 <para>The entire Usenet thread is instructive, and took place under the
585 subject heading "binary iostreams" on both comp.std.c++
586 and comp.lang.c++.moderated in parallel. Also in that thread,
587 Dietmar Kühl mentioned that he had written a pair of stream
588 classes that would read and write XDR, which is a good step towards
589 a portable binary format.
<!-- Chapter 05 : Interacting with C -->
597 <chapter id="manual.io.c" xreflabel="Interacting with C">
598 <?dbhtml filename="io_and_c.html"?>
599 <title>Interacting with C</title>
602 <sect1 id="manual.io.c.FILE" xreflabel="Using FILE* and file descriptors">
603 <title>Using FILE* and file descriptors</title>
605 See the <link linkend="manual.ext.io">extensions</link> for using
606 <type>FILE</type> and <type>file descriptors</type> with
607 <classname>ofstream</classname> and
608 <classname>ifstream</classname>.
612 <sect1 id="manual.io.c.sync" xreflabel="Performance Issues">
613 <title>Performance</title>
615 Pathetic Performance? Ditch C.
617 <para>It sounds like a flame on C, but it isn't. Really. Calm down.
618 I'm just saying it to get your attention.
620 <para>Because the C++ library includes the C library, both C-style and
621 C++-style I/O have to work at the same time. For example:
624 #include <iostream>
625 #include <cstdio>
627 std::cout << "Hel";
628 std::printf ("lo, worl");
629 std::cout << "d!\n";
631 <para>This must do what you think it does.
633 <para>Alert members of the audience will immediately notice that buffering
634 is going to make a hash of the output unless special steps are taken.
636 <para>The special steps taken by libstdc++, at least for version 3.0,
637 involve doing very little buffering for the standard streams, leaving
638 most of the buffering to the underlying C library. (This kind of
639 thing is tricky to get right.)
640 The upside is that correctness is ensured. The downside is that
641 writing through <code>cout</code> can quite easily lead to awful
642 performance when the C++ I/O library is layered on top of the C I/O
643 library (as it is for 3.0 by default). Some patches have been applied
644 which improve the situation for 3.1.
646 <para>However, the C and C++ standard streams only need to be kept in sync
647 when both libraries' facilities are in use. If your program only uses
648 C++ I/O, then there's no need to sync with the C streams. The right
649 thing to do in this case is to call
652 #include <emphasis>any of the I/O headers such as ios, iostream, etc</emphasis>
654 std::ios::sync_with_stdio(false);
656 <para>You must do this before performing any I/O via the C++ stream objects.
657 Once you call this, the C++ streams will operate independently of the
658 (unused) C streams. For GCC 3.x, this means that <code>cout</code> and
659 company will become fully buffered on their own.
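<para>A minimal sketch of making that call in one place. The wrapper name
<code>detach_from_c_stdio</code> is hypothetical; it is meant to be called
once at the top of <code>main()</code>, before any output, in a program that
does not use the C stdio functions on the standard streams.</para>

```cpp
#include <iostream>

// Hypothetical helper: turns off synchronization with the C streams.
// Returns the previous setting, which is true (synchronized) until the
// program changes it.
bool detach_from_c_stdio()
{
    return std::ios::sync_with_stdio(false);
}
```

<para>The return value reports the setting that was in effect before the
call, so a second call should report <code>false</code>. Mixing
<code>printf</code> and <code>cout</code> after this point no longer has a
guaranteed output order.</para>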
661 <para>Note, by the way, that the synchronization requirement only applies to
the standard streams (<code>cin</code>, <code>cout</code>,
<code>cerr</code>,
<code>clog</code>, and their wide-character counterparts). File stream
objects that you declare yourself have no such requirement and are fully
buffered.