Monday, April 11, 2011

How can I increase the value in a HashTable?

I have a HashTable which I am keep track of colors (which are the keys) and count of colors which is the key.

I am trying to figure out how to increment the key when it the HashTable already contains the color. Here is a code snippet:

Hashtable htColors = new Hashtable();

if (htColors.Contains(color))
{
    // Want to increase the "value" of the key here.       
}
else
{
    htColors.Add(color, 1); //Found color for first time
}
From stackoverflow
  • Try the following

    if (htColors.Contains(color))
    {
       int old = (int)htColors[color];
       htColor[color] = old + 1;
    }
    

    EDIT Response to comments

    IMHO, the Dictionary approach is much better because it is 1) type safe and 2) eliminates the boxing involved in this solution.

    Having the line be the following won't affect the key, just the value

    htColor[color] = (int)htColor[color] + 1;
    
    Xaisoft : Ok, your solution appears to work. What do you think about the Dictionary alternative? Also, what if I just did (int)htColors[color] + 1, would that change the key or value?
    Xaisoft : Thanks for the help with the HashTable. It looks like I'll be going with a Dictionary after all.
    Michael Meadows : Someone can correct me if I'm wrong here, but I believe that auto boxing/unboxing occurs for any value type that is moved to the heap, even (in this case) ints in a typed dictionary (Dictionary). What Dictionary does elimintate is clutter from casting.
    JaredPar : @Michael, you're confusing lifetime with boxing. Boxing only occurs when a ValueType is converted to an object. Moving a ValueType into the heap in a strongly typed container incurs no boxing but does alter the variable lifetime. Eventually though the ValueType must be rooted by some reference.
  • I'm posting this to be pedantic. I don't like the interfacing to Dictionary because there is a cost to this very common kind of access - if your most common case is touching an element that already exists, you have to hash and look up your value 3 times. Don't believe me? I wrote DK's solution here:

    static void AddInc(Dictionary<string, int> dict, string s)
    {
        if (dict.ContainsKey(s))
        {
            dict[s]++;
        }
        else
        {
            dict.Add(s, 1);
        }
    }
    

    When put into IL - you get this:

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldarg.1 
    L_0003: callvirt instance bool [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::ContainsKey(!0)
    L_0008: ldc.i4.0 
    L_0009: ceq 
    L_000b: stloc.0 
    L_000c: ldloc.0 
    L_000d: brtrue.s L_0028
    L_000f: nop 
    L_0010: ldarg.0 
    L_0011: dup 
    L_0012: stloc.1 
    L_0013: ldarg.1 
    L_0014: dup 
    L_0015: stloc.2 
    L_0016: ldloc.1 
    L_0017: ldloc.2 
    L_0018: callvirt instance !1 [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::get_Item(!0)
    L_001d: ldc.i4.1 
    L_001e: add 
    L_001f: callvirt instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::set_Item(!0, !1)
    L_0024: nop 
    L_0025: nop 
    L_0026: br.s L_0033
    L_0028: nop 
    L_0029: ldarg.0 
    L_002a: ldarg.1 
    L_002b: ldc.i4.1 
    L_002c: callvirt instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)
    L_0031: nop 
    L_0032: nop 
    L_0033: ret
    

    which calls to ContainsKey, get_item, and set_item, all of which hash and look up.

    I wrote something less pretty which uses a class that holds an int and the class lets you side-effect it (you can't really use a struct without incurring the same penalty because of struct copying semantics).

    class IntegerHolder {
        public IntegerHolder(int x) { i = x; }
        public int i;
    }
    static void AddInc2(Dictionary<string, IntegerHolder> dict, string s)
    {
        IntegerHolder holder = dict[s];
        if (holder != null)
        {
            holder.i++;
        }
        else
        {
            dict.Add(s, new IntegerHolder(1));
        }
    }
    

    This gives you the following IL:

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldarg.1 
    L_0003: callvirt instance !1 [mscorlib]System.Collections.Generic.Dictionary`2<string, class AddableDictionary.IntegerHolder>::get_Item(!0)
    L_0008: stloc.0 
    L_0009: ldloc.0 
    L_000a: ldnull 
    L_000b: ceq 
    L_000d: stloc.1 
    L_000e: ldloc.1 
    L_000f: brtrue.s L_0023
    L_0011: nop 
    L_0012: ldloc.0 
    L_0013: dup 
    L_0014: ldfld int32 AddableDictionary.IntegerHolder::i
    L_0019: ldc.i4.1 
    L_001a: add 
    L_001b: stfld int32 AddableDictionary.IntegerHolder::i
    L_0020: nop 
    L_0021: br.s L_0033
    L_0023: nop 
    L_0024: ldarg.0 
    L_0025: ldarg.1 
    L_0026: ldc.i4.1 
    L_0027: newobj instance void AddableDictionary.IntegerHolder::.ctor(int32)
    L_002c: callvirt instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, class AddableDictionary.IntegerHolder>::Add(!0, !1)
    L_0031: nop 
    L_0032: nop 
    L_0033: ret
    

    Which calls get_item once - there is no additional hashing in the case of an object present. I got a little sleazy and made the field public to avoid the method calls for property access.

    If it were me, I would wrap this overall functionality into its own class and hide the IntegerHolder class from public view - here's a limited version:

    public class CountableItem<T>
    {
        private class IntegerHolder
        {
            public int i;
            public IntegerHolder() { i = 1; }
        }
        Dictionary<T, IntegerHolder> dict = new Dictionary<T, IntegerHolder>();
    
        public void Add(T key)
        {
            IntegerHolder val = dict[key];
            if (val != null)
                val.i++;
            else
                dict.Add(key, new IntegerHolder());
        }
    
        public void Clear()
        {
            dict.Clear();
        }
    
        public int Count(T key)
        {
            IntegerHolder val = dict[key];
            if (val != null)
                return val.i;
            return 0;
        }
    
        // TODO - write the IEnumerable accessor.
    }
    
    DK : +1 for a very good IL insight. This demonstrates why/when using a value type for dictionary bucket might be a bad idea.

0 comments:

Post a Comment