C++11 Stream Iterators feel like Python

The C++11 standard has plenty of new goodies that make a programmer’s life easier without sacrificing rigor.

Here is a neat example from the book “The C++ Programming Language” by Bjarne Stroustrup,
whose 4th edition covers the C++11 standard.

It is found on page 107, Section 4.5.3, Stream Iterators.

Let’s consider the tasks of:

  • Reading a list of words from an input file
  • Sorting them 
  • Eliminating duplicates and 
  • Writing them down into an output file

This can be done with the classic Unix shell commands:

   cat inputfile | sort | uniq > outputfile    

where we expect “inputfile” to contain a list of words separated by newlines.

How do we write the equivalent C++ program using C++11?

Let the code (and Stroustrup) speak:

#include <iterator>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>

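int main() {

  std::string inputfilename, outputfilename;

  // Read the two file names from standard input
  std::cin >> inputfilename >> outputfilename;

  std::ifstream inputfile { inputfilename };
  std::ofstream outputfile { outputfilename };

  std::istream_iterator< std::string > isitr { inputfile };
  std::ostream_iterator< std::string > ositr { outputfile, "\n" };

  // A default-constructed istream_iterator marks end-of-stream
  std::istream_iterator< std::string > eos {};

  // Constructing the vector reads every word from the input file
  std::vector< std::string > str { isitr, eos };

  std::sort( str.begin(), str.end() );

  // Skips adjacent duplicates, hence the sort above
  std::unique_copy( str.begin(), str.end(), ositr );

  // Exit status 0 only if all input was read and the output is still good
  return !inputfile.eof() || !outputfile;

}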

This code

  • Reads two filenames from the standard input
  • Opens the first file for input
  • Opens the second file for output
  • Associates an iterator with the input file
  • Associates an iterator with the output file, along with a separator “\n”
  • Creates a vector of strings from the input iterator and the end-of-stream iterator
    • At this point the full file is read into memory, one whitespace-separated word per string
  • Calls std::sort to sort the vector in place (the read has already happened when the vector was constructed)
  • Once sorted, copies unique entries to the output file, using the output iterator
  • Returns zero (success) only if the input was read to the end of the file and the output stream is still in a good state
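
To try the steps above without touching any files, here is a minimal sketch that runs the same read, sort and unique_copy sequence on in-memory streams. The helper names write_sorted_unique and sorted_unique_text are made up for this illustration and do not come from the book; the use of std::istringstream and std::ostringstream in place of files is likewise an assumption made to keep the example self-contained.

```cpp
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the book): reads whitespace-separated words
// from `in`, sorts them, and writes each distinct word on its own line to `out`.
void write_sorted_unique(std::istream& in, std::ostream& out) {
    using in_iter = std::istream_iterator<std::string>;

    // The range constructor consumes the whole stream here,
    // calling operator>> once per word until extraction fails.
    std::vector<std::string> words(in_iter{in}, in_iter{});

    std::sort(words.begin(), words.end());

    // unique_copy only drops adjacent duplicates, so it relies on the sort above.
    std::unique_copy(words.begin(), words.end(),
                     std::ostream_iterator<std::string>{out, "\n"});
}

// Hypothetical convenience wrapper for experimenting with in-memory text.
std::string sorted_unique_text(const std::string& text) {
    std::istringstream in{text};
    std::ostringstream out;
    write_sorted_unique(in, out);
    return out.str();
}
```

Calling sorted_unique_text("pear apple\npear fig apple") should give back the three words apple, fig and pear, one per line. Note that the words are split on any whitespace, not only on newlines, which is a small difference from the line-based sort and uniq commands.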

This can be written a bit more compactly (from the same book, page 108) as:

#include <iterator>
#include <string>
#include <fstream>
#include <iostream>
#include <set>
#include <algorithm>

int main(int argc, const char * argv [] ) {

  std::ifstream inputfile  { argv[1] };
  std::ofstream outputfile { argv[2] };

  using istritr = std::istream_iterator< std::string >;
  using ostritr = std::ostream_iterator< std::string >;

  std::set< std::string > words { istritr { inputfile }, istritr {} };

  std::copy( words.begin(), words.end(), ostritr { outputfile, "\n" } );

  return !inputfile.eof() || !outputfile;
}

 

By using a std::set instead of a std::vector, we get both the uniqueness property and the sorted property at once.
In particular, both of these properties are enforced as each new element is inserted into the set.
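
Here is a small sketch of that behavior, assuming we feed the set one word at a time instead of using its range constructor. The function name count_rejected_duplicates is hypothetical and only serves this illustration; std::set::insert itself returns a pair whose bool member tells whether the element was actually inserted.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: inserts every whitespace-separated word of `text`
// into `words` and returns how many inserts were rejected as duplicates.
std::size_t count_rejected_duplicates(const std::string& text,
                                      std::set<std::string>& words) {
    std::istringstream in{text};
    std::size_t rejected = 0;
    std::string word;
    while (in >> word) {
        // insert() returns {iterator, bool}; the bool is false
        // when an equal word is already in the set.
        if (!words.insert(word).second) {
            ++rejected;
        }
    }
    return rejected;
}
```

After feeding it "pear apple pear fig apple", the set should hold three words, and iterating from words.begin() visits them in sorted order, with no separate call to std::sort or std::unique_copy.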

 

…and for the offended Pythonists out there…
OK, you are right,
it is not quite as short as it could be in Python.
Here is an attempt to write the same in a Python script:

import sys

infile = open(str(sys.argv[1]))
outfile = open(str(sys.argv[2]),'w')

outfile.writelines(sorted(list(set( infile.readlines() ))))

infile.close()
outfile.close()

 

Somehow it seems relevant to cite here the output of the Python command

   import this

which prints the Zen of Python, by Tim Peters:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let’s do more of those!

There is a lot in here that C++ and Python developers can agree upon.

5 Responses to C++11 Stream Iterators feel like Python

  1. Jean-Christophe Fillion-Robin says:

    Very nice.

    An extra line can be removed :p See https://github.com/luisibanez/Cxx11/pull/1

  2. Luis Ibanez says:

    You are quite right !

    Your improvement has now been merged:
    https://github.com/luisibanez/Cxx11/commit/680cdfe72d65314af5f240aa058523fed26af20e

    Thanks !

  3. Jean-Christophe Fillion-Robin says:

    And ‘string’ header can also be removed. See https://github.com/luisibanez/Cxx11/pull/2

  4. Gert Wollny says:

    I really love C++11, but the above examples barely show anything new. Stream iterators existed before (see, e.g., Josuttis, “The C++ Standard Library”, Addison-Wesley, 1999).

    Replace the “using” by “typedef” and the “{}” by (), i.e. the initializer lists by the constructors, and you get a well-formed C++98 program that doesn’t look very different, i.e.

    #include <iterator>
    #include <string>
    #include <fstream>
    #include <iostream>
    #include <set>
    #include <algorithm>

    int main(int argc, const char * argv [] )
    {
    std::ifstream inputfile ( argv[1] );
    std::ofstream outputfile ( argv[2] );

    typedef std::istream_iterator< std::string > istritr;
    typedef std::ostream_iterator< std::string > ostritr;

    std::set< std::string >
    words( (istritr(inputfile)), istritr() );

    std::copy( words.begin(), words.end(),
    (ostritr( outputfile, "\n" )));

    return !inputfile.eof() || !outputfile;
    }

    regards,

  5. Luis Ibanez says:

    Gert,

    Thanks for the clarification and your C++98 vs C++11 correction.

    Your point is well taken.
