Boxing/Unboxing In .NET

In the .NET Framework there is a feature called boxing, which goes hand in hand with unboxing. Boxing is an implicit conversion of a value-type to the object type. While this is a very nice feature when programming normal applications, this overhead becomes a big performance hit when working with algorithms that need to do lots of operations, such as path finding.

When you create a value-type, it’s originally placed on the stack. If you box this value-type, the required memory is allocated on the heap, the value gets copied and an object is placed on the stack, pointing to the boxed value on the heap.

Boxing an int
Boxing an int

Unboxing the int again
Unboxing



In a pathfinder I created long ago, in the .NET 1.1 era, a list of points had to be maintained. While this was possible using an ArrayList, it proved faster to write a typed list. The main reason behind this was because the ArrayList’s method signatures all work with objects, causing implicit boxing when using them. Retrieving items from an ArrayList also required unboxing, because that had to be cast back into their Point type.

I wrote a small application to demonstrate the boxing and unboxing taking place, and the performance impact. The test data were 10 million random Point values.

C#:
  1. // Adding to an ArrayList
  2. for (int i = 0; i <itemsToGenerate; i++) {
  3.   arrayList.Add(rndCosts[i]);
  4. }

When we take a look at the IL code this piece generates, we would see the following:

CODE:
  1. ldobj     [System.Drawing]System.Drawing.Point
  2. box       [System.Drawing]System.Drawing.Point
  3. callvirt  instance int32 [mscorlib]System.Collections.ArrayList::Add(object)

On the second line, you can see it uses the box operation to box our value-type Point (stored in the typed Point array rndCosts) before calling the Add method.

The solution to this is using generics, available from .NET 2.0 onwards, or writing your own typed list object. As .NET 1.1 had to be used, I chose the second solution. To do this, I used Reflector to look at the ArrayList code, and use that code to create a new list, replacing all instances of object with our required type, Point in this example.

Now we use the same piece of test code with our PointList.

C#:
  1. // Adding to a typed PointList
  2. for (int i = 0; i <itemsToGenerate; i++) {
  3.   pointList.Add(rndCosts[i]);
  4. }

If we look at the IL this piece generates, we notice the following:

CODE:
  1. ldobj     [System.Drawing]System.Drawing.Point
  2. callvirt instance int32 [PointList]CumpsD.Collections.PointList::Add(
  3.           valuetype [System.Drawing]System.Drawing.Point)

The Framework does not use the box operation anymore, effectively getting rid of the overhead we had with our first solution. When we take a look at the test results, we can clearly see the difference between the first solution and the second.

CODE:
  1. ArrayList: added 10000000 items.
  2. 00:00:03.1845792 seconds
  3. PointList: added 10000000 items.
  4. 00:00:00.2804032 seconds

By using the strong typed PointList it was possible to get just a bit more performance out of the code, making the project operate better in the long run. Just for fun, I revisited this test using .NET 3.5, with generics and created a List to store the items. I ended up having similar performance (00:00:00.2915599 seconds) as the typed list I created in .NET 1.1, pretty good considering I wrote it two years ago ;)

The main conclusion here is to avoid any non-typed collection when performance is an issue. Some browny points for our Bits & Bytes Religion when it comes to the happy feeling you get when working with generics :)

7 reacties op “Boxing/Unboxing In .NET”

  1. zuraff zegt:

    This is a good explanation of boxing/unboxing issue, but I don’t like two things:
    –> Are you allowed to look with reflector at ArrayList implementation ?? I have doubts.
    –> Your benchmarking is not worth much if it ends up with time: 00:00:00.2915599, because in my opinion this is not enough long run and the result can be easily disturbed by other factors and system events.

  2. David Cumps zegt:

    If you are multi threading and gain a 3second benefit those quickly add up, especially in AI scenarios where your calculations have to be done in specific time slices.

    And in the places I used this, this list isn’t the only performance benefit of course, otherwise I’d agree, if you only used this and then would create some major inefficient algorithm, then it’d be pointless.

    But for places where you are tweaking everything, by using binary heaps for example instead of regular sorted lists, then a 3 sec (in my case this was running in 25 threads, which means a 75seconds gain) is quite significant.

    (7seconds vs 79seconds, not bad ;p)

  3. SamehGaber zegt:

    Dear Mr.David
    This program display 50
    I thing it Should be 51
    am I reghit ?
    Please help

    class Program
    {
    static void Main(string[] args)
    {

    object o = 50;
    AddThrowBox(o);
    Console.WriteLine(o);
    }
    public static void AddThrowBox(object o)
    {
    o=((int)o)+1;
    }
    }

  4. David Cumps zegt:

    The following will display 51:

    static void Main(string[] args)
    {
    object o = 50;
    AddThrowBox(ref o);
    Console.WriteLine(o);
    }
    public static void AddThrowBox(ref object o)
    {
    o = ((int)o) + 1;
    }

    Note the use of passing it by reference.

  5. samehgaber zegt:

    Dear Mr.David
    then
    waht is the added value
    I can send the variable by ref from the begining
    I used boxing to let value type act as refernce type without passing it by ref
    thanks a lot

  6. David Cumps zegt:

    I guess it clones the object when you pass it along without the ref. I wouldn’t exactly know though, maybe it’s done like that to prevent implicit changes from happening.

    The only thing why I wanted to know about boxing/unboxing is not to somehow ‘abuse’ it making my code less readable, but to understand how the old (non-generic) .NET classes worked internally.

  7. [...] cuando utilizamos funciones o tipos que reciben object en vez del tipo específico. En el siguiente post se muestra cómo detectar y minimizar este patrón de [...]

Reageer