Simple microbenchmarking in C#

Introduction and scope

Often in the newsgroups, people ask for the fastest way of doing something. The first thing to realise is that unless you know the task is causing a bottleneck in your real application, it's probably not worth working out the truly fastest way of doing it. Instead, write the clearest code which does the job - it's much better to take a while to get the right answer than to get the wrong answer very quickly.

Having said that, I've always enjoyed a bit of microbenchmarking. For proper application-wide benchmarking, you should be using a decent profiler etc, worrying about all kinds of different aspects of performance. A server-side app, for instance, should worry about latency as well as total throughput, and of course scalability and reliability. This page doesn't attempt to cover any of that.

This page provides a very simple microbenchmarking framework which I now use when testing different ways of attacking a problem. It's rough and ready, but it does well enough for newsgroup posts and testing for fun.

Obtaining the code and compiling benchmarks

The code has grown slightly since its first incarnation, and it no longer makes sense to have it inline in this page. However, it's still just a single file to download: Benchmark.cs. Once you've downloaded it, using it is simple: just compile it into the same assembly as the code you wish to benchmark, and run it. It will find all the types with benchmarking code in, and run that code, reporting the timings. It's probably best to demonstrate this with a small sample benchmark:

using System;
using System.Collections;

public class FillingArrayList
{
    [Benchmark]
    public static void Test()
    {
        ArrayList al = new ArrayList();
        for (int i=0; i < 10000000; i++)
            al.Add ("hello");
    }
}

The only "special" thing about this class is that the Test method is marked with the Benchmark attribute (a simple attribute, declared in Benchmark.cs, which can only be applied to methods). The method must be public, static, and take no parameters, otherwise the framework ignores it. All the methods which obey these rules are executed and timed, one at a time.
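The selection rules can be sketched in isolation. The snippet below redeclares BenchmarkAttribute purely so that it compiles on its own (don't combine it with Benchmark.cs, or the two declarations will clash) and applies the same filter the framework uses; EligibilityDemo and CountEligible are illustrative names, not part of the framework:

```csharp
using System;
using System.Reflection;

// Mirrors the attribute declared in Benchmark.cs, repeated here
// only so this snippet stands alone.
[AttributeUsage(AttributeTargets.Method)]
public class BenchmarkAttribute : Attribute
{
}

public class EligibilityDemo
{
    [Benchmark]
    public static void RunMe() {}          // public, static, parameterless: run

    [Benchmark]
    public static void Ignored(int x) {}   // takes a parameter: ignored

    [Benchmark]
    public void AlsoIgnored() {}           // instance method: ignored

    public static void NoAttribute() {}    // no [Benchmark]: ignored

    // Applies the same test as the framework: public static methods,
    // no parameters, carrying the Benchmark attribute.
    public static int CountEligible()
    {
        int count = 0;
        foreach (MethodInfo method in typeof(EligibilityDemo).GetMethods(
                     BindingFlags.Public | BindingFlags.Static))
        {
            if (method.GetParameters().Length == 0 &&
                method.GetCustomAttributes(typeof(BenchmarkAttribute), false).Length != 0)
            {
                count++;
            }
        }
        return count;
    }
}
```

Of the four attributed-or-not methods above, only RunMe survives the filter.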

I almost always compile simple test programs from the command line - if you wish to use Visual Studio, it's just a case of having Benchmark.cs in the project. If you have other classes with a Main method, set Benchmark as the startup class. The project should be set to be a console application project. When compiling from the command line, I simply use something like: csc Benchmark.cs FillingArrayList.cs (using the above code as an example). This produces Benchmark.exe, which I can then run directly.

Running the code produces results like this:

Benchmarking type FillingArrayList
  Test                 00:00:02.1044640

Hopefully the meaning is self-explanatory!

Going beyond the very basics

Some of the time, a test like the above is all that's needed. Often you'll want a bit more though. The framework allows you to use command-line parameters to control the tests, along with ways of resetting the class between tests and checking the results after each test. Here's an example showing all of these features, as well as demonstrating what happens when you have more than one benchmark method in a class. Note that you can use any of these features individually, or indeed none of them at all.

// See http://www.pobox.com/~skeet/csharp/benchmark.html for how to run this code
using System;
using System.Collections;

public class FillingArrayList
{
    static int iterations=10000000; // Give it a reasonable default
    
    static ArrayList list; // This is what we'll check
    
    public static void Init(string[] args)
    {
        if (args.Length>0)
            iterations = Int32.Parse(args[0]);
    }
    
    public static void Reset()
    {
        // Make sure that tests which don't produce
        // anything don't end up using the previous result
        list = null;
    }
   
    public static void Check()
    {
        if (list.Count != iterations)
            throw new Exception ("List doesn't have the right size.");
    }
    
    [Benchmark]
    public static void Simple()
    {
        int count=iterations;
        
        ArrayList al = new ArrayList();
        for (int i=0; i < count; i++)
            al.Add ("hello");
        // Set the result
        list=al;
    }

    [Benchmark]
    public static void RightSizing()
    {
        int count=iterations;
        
        ArrayList al = new ArrayList(count);
        for (int i=0; i < count; i++)
            al.Add ("hello");
        // Set the result
        list=al;
    }
}

The first thing to notice is that the number of iterations is no longer hardcoded. There's a static variable, iterations, which the tests use to determine how many iterations to run for. This has a default value (which is the same as the hardcoded value we used before), but it can also be set in the new Init method. This method must be public and static, and take an array of strings as its only parameter. The value passed in will never be null.
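A slightly more defensive Init is possible if you don't want a mistyped argument to abort the whole run. This is only a sketch - it keeps the era-appropriate Int32.Parse and catches its exceptions rather than assuming Int32.TryParse (which only arrived in .NET 2.0); InitDemo and its Iterations property are illustrative names, not part of the framework:

```csharp
using System;

public class InitDemo
{
    static int iterations = 10000000; // Give it a reasonable default

    // Same contract as the framework's Init: public, static,
    // taking a string[] which is guaranteed to be non-null.
    public static void Init(string[] args)
    {
        if (args.Length > 0)
        {
            try
            {
                iterations = Int32.Parse(args[0]);
            }
            catch (FormatException)
            {
                // Not a number - keep the default
            }
            catch (OverflowException)
            {
                // Too big for an int - keep the default
            }
        }
    }

    // Exposed so the effect of Init can be observed
    public static int Iterations
    {
        get { return iterations; }
    }
}
```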

Before each test is run, the Reset method is called (if it is present). Like benchmark methods, it must be public, static and parameterless to be noticed by the framework. This can be used to clear any information which might be left from the previous test, for instance. In this example, we make sure that the list variable is null before the start of the test.

After each test is run, the Check method is called (if it is present). Again, it must be public, static and parameterless. In practice, you can often get away with resetting state at the end of this method rather than having a separate Reset method, but you may prefer to use Reset for clarity. The aim of the Check method is to verify that the results of the test are correct, and throw an exception if they're not. If the check fails, the message of the exception is printed out instead of the time taken. In this example, we check that the size of the resultant list is correct. Note that this check is carried out after the time has been calculated, so you can afford to put time-consuming checks in here without affecting the results.

Here are the results of two runs of the new class - once with a parameter on the command line, and once without.

c:\test>benchmark 1000000
Benchmarking type FillingArrayList
  Simple               00:00:00.1001440
  RightSizing          00:00:00.0801152

c:\test>benchmark
Benchmarking type FillingArrayList
  Simple               00:00:02.1731248
  RightSizing          00:00:00.7610944

Command-line options

There are now two command-line options available, which must come before any parameters you wish to be passed to the benchmarked types themselves. The types only receive the arguments from the first option the framework doesn't understand onwards, or everything after the pseudo-option -endoptions. (This last allows you to write tests which use the same option names as the benchmark framework understands, if you really want to.) The options are:

-version
This just displays the operating system and CLR versions.
-runtwice
This changes the behaviour so that each method in each benchmarked type is run twice. This allows you to see roughly how long was spent JITting the method being tested, and to see the performance after JITting (ie the second result).

Any other features?

That's pretty much all there is to it. If you feel that there's a feature missing, please mail me at skeet@pobox.com to let me know. I suppose the most obvious "feature" to add would be a GUI interface instead of the current command-line one, but that will have to come at a later date, if at all.

Microbenchmark best practices

A few guidelines when it comes to writing benchmarks which will give some useful information:

Run tests on an idle machine.
The framework makes no attempt to only measure the time spent by the processor executing the benchmark code - it just compares the time at the start of the test with the time at the end of the test. If your machine is heavily loaded, that will affect things significantly.
Make your tests take a long time, where possible.
Even when you try to make your machine as idle as possible, things can crop up to make benchmarks less dependable. Running benchmarks for a long time (minutes rather than seconds) helps to average these "blips" out, giving more repeatable results. It also makes "one-time" effects - the inaccuracy of the system timer, and the time taken to invoke the benchmark methods by reflection - less significant as a proportion of the final result. Sometimes it's very difficult to get benchmarks to run for a long time without incurring other penalties such as garbage collection and swapping, unfortunately. The example code used here is a case in point - if you increase the size of the ArrayList too far, you'll end up eating a lot of memory and the system will start swapping, which will dwarf any other considerations.
Use local variables where possible.
The CLR can do more optimisations on code which (for the most part) only works with local variables that don't "escape" the method. For instance, it doesn't need to worry about other threads tampering with the variables. That's the reason the second example copies the number of iterations into a local variable before running the loop, and only copies the result out of a local variable into a class variable right at the very end. This may or may not make a significant difference to your test (on the current CLR), but I believe it's good practice anyway - although you need to bear this in mind when considering using the results in a real application!
Avoid the compiler optimising your code away entirely.
Unless you use the results of an operation which the JIT compiler knows can't affect anything else, it may well decide not to bother performing the operation at all. It's very important that you check the results in some form or other, making sure that the storage of results doesn't overwhelm the rest of the test, of course! For instance, I often keep a running total of results when testing something that produces a numeric result. I then test the total in the Check method. This will only work while adding to the total doesn't take long compared with the time taken to calculate what I need to add though.
When posting code, say how to run it.
At the time of writing, I'm the only one who's used this benchmarking framework. Of course, I hope it will be used more widely in the long run - but unless you know someone else already has the framework, they won't have a clue what to do with the benchmark code you give them unless you tell them how to run it. I suggest using a comment at the top of the source code, such as the one used in the second piece of sample code above.
Avoid garbage collection if possible, unless that's the aim of the test.
Garbage collection can slow things down in unpredictable ways. If you can avoid creating too much garbage (particularly long-lived garbage) in your tests, so much the better. The framework calls the garbage collector before each test is run in order to give it the best possible chance, but there's not a lot it can do to help if you eat up vast amounts of memory in your tests themselves.
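Putting the last few guidelines together: the sketch below keeps a running total in a local variable, publishes it to a static field only at the very end, and verifies it in a Check-style method so the JIT can't remove the loop as dead code. DeadCodeDemo is an illustrative name; the pattern, not the arithmetic, is the point:

```csharp
using System;

public class DeadCodeDemo
{
    const int Iterations = 1000000;

    // Consumed by Check, so the compiler can't treat
    // the benchmark loop as having no observable effect
    static long total;

    public static long Total
    {
        get { return total; }
    }

    public static void Run()
    {
        long sum = 0; // local accumulator - cheap compared with real work
        for (int i = 0; i < Iterations; i++)
            sum += i;
        total = sum;  // publish the result right at the end
    }

    public static void Check()
    {
        // Sum of 0..n-1 is n*(n-1)/2
        long expected = (long) Iterations * (Iterations - 1) / 2;
        if (total != expected)
            throw new Exception("Wrong total: " + total);
    }
}
```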

The framework code

Just in case you want to see the code but don't actually want to download it separately, here's the full code for the framework.

using System;
using System.Reflection;
using System.Collections;

/// <summary>
/// The attribute to use to mark methods as being
/// the targets of benchmarking.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
public class BenchmarkAttribute : Attribute
{
}

/// <summary>
/// Very simple benchmarking framework. Looks for all types
/// in the current assembly which have public static parameterless
/// methods marked with the [Benchmark] attribute, runs those
/// methods, and reports how long each one took.
/// </summary>
public class Benchmark
{
    public static void Main(string[] args)
    {
        // Save all the benchmark classes from doing a nullity test
        if (args==null)
            args = new string[0];
        
        // We're only ever interested in public static methods. This variable
        // just makes it easier to read the code...
        BindingFlags publicStatic = BindingFlags.Public | BindingFlags.Static;

        foreach (Type type in Assembly.GetCallingAssembly().GetTypes())
        {
            // Find an Init method taking string[], if any
            MethodInfo initMethod=type.GetMethod ("Init", publicStatic, null, 
                                                  new Type[]{typeof(string[])},
                                                  null);
            
            // Find a parameterless Reset method, if any
            MethodInfo resetMethod=type.GetMethod ("Reset", publicStatic, 
                                                   null, new Type[0],
                                                   null);
            
            // Find a parameterless Check method, if any
            MethodInfo checkMethod=type.GetMethod("Check", publicStatic, 
                                                  null, new Type[0],
                                                  null);

            // Find all parameterless methods with the [Benchmark] attribute
            ArrayList benchmarkMethods=new ArrayList();
            foreach (MethodInfo method in type.GetMethods(publicStatic))
            {
                ParameterInfo[] parameters = method.GetParameters();
                if (parameters!=null && parameters.Length != 0)
                    continue;
                
                if (method.GetCustomAttributes
                    (typeof(BenchmarkAttribute), false).Length != 0)
                {
                    benchmarkMethods.Add (method);
                }
            }

            // Ignore types with no appropriate methods to benchmark
            if (benchmarkMethods.Count==0)
                continue;
            
            Console.WriteLine ("Benchmarking type {0}", type.Name);

            // If we've got an Init method, call it once
            try
            {
                if (initMethod!=null)
                    initMethod.Invoke (null, new object[]{args});
            }
            catch (TargetInvocationException e)
            {
                Exception inner = e.InnerException;
                string message = (inner==null ? null : inner.Message);
                if (message==null)
                    message = "(No message)";
                Console.WriteLine ("Init failed ({0})", message);
                continue; // Next type
            }
                
            foreach (MethodInfo method in benchmarkMethods)
            {
                try
                {
                    // Reset (if appropriate)
                    if (resetMethod!=null)
                        resetMethod.Invoke(null, null);

                    // Give the test as good a chance as possible
                    // of avoiding garbage collection
                    GC.Collect();
                    GC.WaitForPendingFinalizers();
                    GC.Collect();
                    
                    // Now run the test itself
                    DateTime start = DateTime.Now;
                    method.Invoke (null, null);
                    DateTime end = DateTime.Now;

                    // Check the results (if appropriate)
                    // Note that this doesn't affect the timing
                    if (checkMethod!=null)
                        checkMethod.Invoke(null, null);
                    
                    // If everything's worked, report the time taken, 
                    // nicely lined up (assuming no very long method names!)
                    Console.WriteLine ("  {0,-20} {1}", method.Name, end-start);
                }
                catch (TargetInvocationException e)
                {
                    Exception inner = e.InnerException;
                    string message = (inner==null ? null : inner.Message);
                    if (message==null)
                        message = "(No message)";
                    Console.WriteLine ("  {0}: Failed ({1})", method.Name, message);
                }
            }
        }
    }
}

