cplusplus.co.il

Different ways to iterate over a vector

Posted on: 28/08/2009

Iterating over a vector is a pretty simple task we get to do pretty often. It can be achieved in quite a few ways:

  • Using normal random access (operator[] with index).
  • Using std::iterator.
  • Using std::for_each algorithm.

I set out to check the runtime differences between all these options, and the results turned out to be a little surprising (or not).

Let me present the setup:

#include <algorithm>
#include <vector>

typedef std::vector<int> vec;

struct func {
    void operator () (int &n) const {
        n *= 2;
    }
};

void idx (vec &v, func &f) {
    for (size_t i=0;i<v.size();++i)
        f(v[i]);
}

void iterate (vec &v, func &f) {
    for (vec::iterator i=v.begin();i!=v.end();++i)
        f(*i);
}

void foreach (vec &v, func &f) {
    std::for_each(v.begin(), v.end(), f);
}

I used the Timer class (presented here) to time all the different options, in the following manner:

#include "timer.hpp"

int main () {
    Timer t("iterate");

    vec v(30000, 2);
    func f;

    for (unsigned i=30000;i;--i)
        iterate(v, f);

    return 0;
}

The tests were run on my own system; Windows Vista, Visual Studio 2008 (release config), Intel core2 quad q9450, 4gb ram. The observed results were the following:

idx: 1.188secs.

iterate: 6.813secs.

foreach: 0.709secs.

Looks like using the foreach algorithm is the fastest way, followed by normal random-access (using operator[]), while the usage of iterators seems to be the slowest by far.

Edit: as it turns out, the usage of checked iterators is enabled in visual studio by default, both in debug and in release modes. As you could guess, this fact has a great effect on the results. When the feature is disabled (#define _SECURE_SCL 0), the results are totally different; std::for_each still wins, but the margins are significantly smaller:

idx: 0.686secs.

iterate: 0.701secs.

foreach: 0.682secs.

I would like to thank Halbling for raising the question.

Advertisements

10 Responses to "Different ways to iterate over a vector"

Ew, calculating .end() on every iteration? No.

void iterate (vec &v, func &f) {
vec::iterator it = v.begin(), end = v.end();
for ( ; it != end; ++it)
f(*it);
}

Try again!
Also, is there something wrong with your SHIFT-key?

Regards

Er, and:

void idx (vec &v, func &f) {
size_t n = v.size();
for (size_t i = 0;i < n; ++i)
f(v[i]);
}

These can't be cached as the compiler does not know if f() has side-effects on the vector. Consequently, using my functions you have to ensure that it doesn't, but if it did then it wouldn't be a very good iteration benchmark would it. 🙂

The presented methods are all used in the most straightforward way; no optimization attempts have been made to any of them, and that was on purpose – to show the most common usage scenario.
Moreover, the optimizations you suggested are basically what is done inside std::for_each.

thanks alot for commenting!

There’s a difference between writing non-optimized code and writing silly code. Anyone using the functions you demonstrated should not be employed to write C++, so it’s moot. 🙂

On linux (Athlon XP 1600+, g++ -O3), I get the following times for the original versions:

> idx: 1.584 s
> iterate: 1.715 s
> foreach: 1.560 s

Seems like gcc does some decent optimizations.
Also make sure that Visual Studio does not use the range-checking iterators in your benchmark…

You are absolutly correct!
Reading http://msdn.microsoft.com/en-us/library/aa985965(VS.80).aspx it turns out that checked iterators are enabled by default (both in debug & release configurations).
Switching checked iterators off generates far better results, as updated in the above post.

I imagine that writing non-silly code — as I mentioned in my comment above — would produce all but identical results for all three options… if not identical instructions. So this raises the question: just what are you trying to say here? 🙂

I was just experimenting with different ways of achieving a pretty common goal, not to make any point. I posted my results, and after discussing them and solving an issue – it turned out that all the different approaches generate pretty much the same results.
What did I learn? Firstly, that the compiler is able to optimize common pieces of code as well as a human being can. And secondly, that Microsoft’s Visual Studio has a pretty odd configuration that should be checked and verified (why would I want checked pointers in release mode?).

All in all, I hope this article will come in handy for the readers.

Another way is Boost.Foreach : http://www.boost.org/doc/libs/1_39_0/doc/html/foreach.html .

vector myvector;
BOOST_FOREACH( int i , myvector )
cout << i;

The code is optimal as for with iterator and caching the end.
(Checked on MSVC 2008 )

First off, thanks for a very relevant comment.

Given the described situation, std::for_each is just as optimal. In a general case though, BOOST_FOREACH is indeed perfect: it provides all the benefits of a regular loop (you’re able to write normal code within it, etc) with all the caching and other issues being totally taken care of by boost.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: