Monday, May 23, 2011

C++ slower than Java and Python!

... when badly programmed

I was still trying to squeeze the most performance out of my solution to the hangman problem from round 1A of Google Code Jam 2011 and so I decided to give C++ a try. C++ is like a long forgotten friend to me. I started my paid carreer with it, almost twenty years ago. I learned some tricks with the STL, template methods and other fun things. I already had a humble background in C, so pointers, memory manipulation and the like were still fresh in my head.

Today I simply tried to port my Python algorithm to C++, as straightforwardly as possible - I don't know if "straightforwardly" is a very usual adverb, but I'm French, so please excuse me if it isn't. With some vague memories about memory allocation I tried to use reference types as much as possible, to avoid unnecessary copying. I used string, vector<>, map<> quite easily. I wrote very simple input and output with cin and cout. I had a valid C++ program in no time. I ran it on the small input set. It was immediate and gave the right result. "Everything's under control" I thought.

Then I ran it over the large set, and it took 3'30": one minute longer than the Python version! Can you imagine?!

gprof to the rescue. I tweaked one heavily used string manipulation function. 3'. Still not fantastic. I realized that map<> was based on ordering and offered log(n) access time: I switched to unordered_map<> to get a hashmap implementation. Around 2'30". Now it was as fast as Python (but still five times slower than the Java version). And now gprof did not give my anything interesting. A few mangled symbols appeared on top, especially with -O3 compilation. I tried to reduce the number of string allocations, using string const & as much as possible, but to no avail. I started to think that C++ was definitely not good at string handling. I thought that maybe using Cygwin g++ was suboptimal. I tried g++ on Linux. Still the same order of performance.

And then I saw the light

In the middle of my main algorithm loop, I was swapping big maps. Real map objects, not references. So each iteration of the loop would do a complete copy of the previous map. Arghh! C++ memory allocation is quirky, coming from managed languages like Java, Python or JavaScript. Replacing those maps by references gave the top rank back to C++: 17" for the single-threaded version, running on a single core and not even compiled with optimization. L'honneur est sauf.

Lessons learned

  • It's far less difficult to work with C++ and the STL than I expected, for problems like those from Code Jam.
  • However, manual memory management is something you always have to keep in mind.
  • My newbie usage of gprof did not allow me to track superfluous data structure copying. If you have any suggestions in that domain, I'll be glad to hear them.

No comments:

Post a Comment