Monday, March 7, 2011

Find() vs. enumeration on lists

Hi all,

I'm working with a code base where lists need to be frequently searched for a single element.

Is it faster to use a Predicate and Find() than to manually do an enumeration on the List?

for example:

string needle = "example";
FooObj result = _list.Find(delegate(FooObj foo) {
    return foo.Name == needle;
});

vs.

string needle = "example";
foreach (FooObj foo in _list)
{
    if (foo.Name == needle)
        return foo;
}

While they are equivalent in functionality, are they equivalent in performance as well?

From stackoverflow
  • They are not equivalent in performance. The Find() method requires a method (in this case delegate) invocation for every item in the list. Method invocation is not free and is relatively expensive as compared to an inline comparison. The foreach version requires no extra method invocation per object.

    That being said, I wouldn't pick one or the other based on performance until I actually profiled my code and found this to be a problem. I haven't yet found the overhead of this scenario to every be a "hot path" problem for code I've written and I use this pattern a lot with Find and other similar methods.

    Konrad Rudolph : Do you have data to back up your claim? As dalle has mentioned, benchmarks show the opposite for very similar code. I see no reason why the compiler shouldn't be able to inline a lambda in principle, thus making this method at least as efficient as the manual iteration.
    JaredPar : @Konrad, I don't have any data off hand. But we regularly profile our code internally and I profile several external projects. In a sufficiently complex project this hasn't ever been close to the bottleneck. It is slower but once you add a real application on top it's not noticable.
    JaredPar : @Kronrad (cont) The compiler could inline this code but currently the C# compiler doesn't do much rewriting of your code for optimizations purposes. AFAIK, it doesn't ever inline a lambda. This is certainly possible though. Look at F#. They do heavy rewriting of code.
  • As Jared pointed out, there are differences.

    But, as always, don't worry unless you know it's a bottleneck. And if it is a bottleneck, that's probably because the lists are big, in which case you should consider using a faster find - a hash table or binary tree, or even just sorting the list and doing binary search will give you log(n) performance which will have far more impact than tweaking your linear case.

  • If searching your list is too slow as-is, you can probably do better than a linear search. If you can keep the list sorted, you can use a binary search to find the element in O(lg n) time.

    If you're searching a whole lot, consider replacing that list with a Dictionary to index your objects by name.

  • Technically, the runtime performance of the delegate version will be slightly worse than the other version - but in most cases you'd be hard pressed to perceive any difference.

    Of more importance (IHMO) is the code time performance of being able to write what you want, rather than how you want it. This makes a big difference in maintainability.

    This original code:

    string needle = "example";
    foreach (FooObj foo in _list)
    {
        if (foo.Name == needle)        
            return foo;
    }
    

    requires any maintainer to read the code and understand that you're looking for a particular item.

    This code

    string needle = "example";
    return _list.Find(
        delegate(FooObj foo) 
        {
            return foo.Name == needle;
        });
    

    makes it clear that you're looking for a particular item - quicker to understand.

    Finally, this code, using features from C# 3.0:

    string needle = "example";
    return _list.Find( foo => foo.Name == needle);
    

    does exactly the same thing, but in one line that's even faster to read and understand (well, once you understand lambda expressions, anyway).

    In summary, given that the performance of the alternatives is nearly equal, choose the one that makes the code easier to read and maintain.

  • "I'm working with a code base where lists need to be frequently searched for a single element"

    It is better to change your data structure to be Dictionary instead of List to get better performance

  • Similar question was asked for List.ForEach vs. foreach-iteration (http://stackoverflow.com/questions/225937/foreach-vs-somelist-foreach).

    In that case List.ForEach was a bit faster.

0 comments:

Post a Comment