References and Values

I've been avoiding writing this article for a long time. It was always meant to be part of a longer article covering the whole .NET type system - and that article is likely never to be written, given time constraints. This is probably the most important part though, because it's so fundamental to developing in .NET.

In some ways, the topic of "what is a reference" or "what is a reference type" is the elephant in the corner of the world of .NET literature. Everyone knows it's important, and it's barely written about. There are probably more articles about unsafe code than about the fundamentals of references, even though many developers will never need to write unsafe code in their careers. I believe this is because although the concepts are actually quite simple, they're hard to describe to people who don't already understand them.

All of this is by way of me saying that I'll do my best to write a useful explanation, but please don't hurt me if I fail. Instead, mail me at skeet@pobox.com with suggestions for improvements.

Scope and disclaimers

I'm going to try to keep this as simple as possible. I won't talk much about variables, about whether things live on the stack or the heap, how parameters are passed, encapsulation, or good practice. I'm not going to talk about why mutable value types are generally a bad idea. In fact, all the examples here will use public fields rather than properties. Don't do this in real code. It's just for demonstration purposes.

I'm going to try to use real-world analogies where I think they can be helpful. Analogies are always dangerous, because people overinterpret them. In general, assume that if I haven't pointed out some aspect of an analogy, it may not be valid.

Shortcut for those who know pointers

If you don't know C/C++ or another language which uses pointers, skip this paragraph. Reading it is likely to be more confusing than helpful. If you're still with me, you can by and large think of references as pointers, with one important difference: there's no such thing as "reference arithmetic". A reference is either a reference to an object or it's null; it can't be a reference to "part way through an array", as a pointer often would be in C/C++.
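
As a rough sketch of that difference, the snippet below treats an array variable the way a C programmer might treat a pointer. The class and variable names are made up for illustration. The reference can be copied, and it can be set to null, but there is no way to nudge it along to the second element.

```csharp
using System;

class PointerShortcutSketch
{
    static void Main()
    {
        int[] numbers = { 10, 20, 30 };

        // Copies the reference, not the elements: both variables
        // now refer to the same array object.
        int[] alias = numbers;
        alias[1] = 99;
        Console.WriteLine(numbers[1]);      // 99

        // A reference is either a reference to a whole object or null.
        // There is no "alias + 1" to reach part way through the array.
        alias = null;
        Console.WriteLine(alias == null);   // True
        Console.WriteLine(numbers.Length);  // 3 (the array itself is untouched)
    }
}
```

Setting alias to null only changes that one variable. The array is still there, and numbers still refers to it.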

Analogy 1: Printed pages, web pages, and URLs

Suppose you are interested in a list of countries in the world. That information is available both in printed form and on the web. The information would be the same, but the two media types behave in different ways. Printed pages have value semantics and web pages have reference semantics. In other words, printed pages act like value types and web pages act like reference types.

Firstly, when you're looking at the printed page, the information is right there. The page contains the information directly. When you access a web page, you use a URL to get to it - in our example, the URL is analogous to a reference used to access a reference type object. The URL doesn't contain the information itself - it doesn't have the list of countries in it. Instead, it provides a way of "getting to" the information.

Now consider what happens if the information is changed. Suppose two people each have a copy of the printed form of the list. If one person crosses off one of the countries, the other person doesn't see the change. The copies are independent of each other. Now suppose two people each have the URL to the same web page. If the owner of the page changes the list of countries, then both people will see the changes - the two references (URLs) are effectively accessing the same data.

What happens when we want to make a copy of the information? If I have a printed page and want to give you a copy, I photocopy it. All the data is copied. After the copy has been made, the "new" copy is distinct from the "old" copy. If, on the other hand, I have the URL to the web page, I'll just give you that. All that is copied is the URL, the reference - not the information itself. To reinforce the previous point - if you take the photocopy I've made you and scribble all over it, that doesn't change my printed page at all. If the web page is editable (e.g. it's a wiki page) and you scribble all over the contents of that, I'll see those changes when I next use my copy of the URL.

Now, let's make things slightly more difficult, and consider the URL itself. Ignore the fact that it's text made up of individual characters, and that usually you'd have URLs as bookmarks in a browser - let's think of them as being written on paper. Giving you a copy of the URL would involve giving you a piece of paper with the URL written on it. If you choose to change that somehow (or if I change the piece of paper I've got the URL written on) that doesn't change the web page in any way. The two copies of the URL are independent, even though both originally access the same page.

The analogy in code

As ever, analogies become imperfect (or at least have wrinkles) when transferred to code. The code below uses a single string to represent the contents of a web page or a printed page. Ignore the details of "what does a reference type within a value type do" and suchlike - concentrate just on the points mentioned above. The comments say what will be written to the console.

using System;

// PrintedPage is a value type
struct PrintedPage
{
    public string Text;
}

// WebPage is a reference type
class WebPage
{
    public string Text;
}

class Demonstration
{
    static void Main()
    {
        // First look at value type behaviour
        PrintedPage originalPrintedPage = new PrintedPage();
        originalPrintedPage.Text = "Original printed text";
        
        // Copy all the information
        PrintedPage copyOfPrintedPage = originalPrintedPage;
        
        // Change the new copy
        copyOfPrintedPage.Text = "Changed printed text";
        
        // Write out the contents of the original page.
        // Output: originalPrintedPage=Original printed text
        Console.WriteLine ("originalPrintedPage={0}",
                           originalPrintedPage.Text);
        
        
        // Now look at reference type behaviour
        WebPage originalWebPage = new WebPage();
        originalWebPage.Text = "Original web text";
        
        // Copy just the URL
        WebPage copyOfWebPage = originalWebPage;
        
        // Change the page via the new copy of the URL
        copyOfWebPage.Text = "Changed web text";
        
        // Write out the contents of the page
        // Output: originalWebPage=Changed web text
        Console.WriteLine ("originalWebPage={0}",
                           originalWebPage.Text);
        
        // Now change the copied URL variable to look at
        // a different web page completely
        copyOfWebPage = new WebPage();
        copyOfWebPage.Text = "Changed web page again";
        
        // Write out the contents of the first page
        // Output: originalWebPage=Changed web text
        Console.WriteLine ("originalWebPage={0}",
                           originalWebPage.Text);
        
    }
}

The last section of the code mirrors the last point made in the text: there's no lasting relationship between the originalWebPage and copyOfWebPage variables. The copy has the same value (URL) at the start as the original, but changing copyOfWebPage itself is equivalent to changing the URL that's written on a piece of paper - it's not the same as changing the contents of the web page that URL refers to. This is a crucially important point, and one which is often the cause of miscommunication. Just to draw it out:

If x is a variable and its type is a reference type, then changing the value of x is not the same as changing the data in the object which the value of x refers to.

If that doesn't make sense to you, reread it a few times, then reread more of the page up to this point, until it does make sense. If that doesn't happen after ten minutes or so, mail me and I'll try to make the page clearer.
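
If it helps, here is a minimal sketch that puts the two kinds of change side by side. The Box class and its Number field are made up for this illustration (and, as before, the public field is for demonstration only). object.ReferenceEquals reports whether two references refer to the same object.

```csharp
using System;

// Box is a reference type (hypothetical, for illustration only)
class Box
{
    public int Number;
}

class ReferenceVersusObject
{
    static void Main()
    {
        Box x = new Box();
        x.Number = 1;

        // y gets a copy of the value of x: a reference to the same object
        Box y = x;

        // Change the data in the object which the value of x refers to.
        // y refers to the same object, so the change is visible via y.
        x.Number = 2;
        Console.WriteLine(y.Number);                      // 2
        Console.WriteLine(object.ReferenceEquals(x, y));  // True

        // Change the value of x itself: it now refers to a different object.
        // The first object, and y, are unaffected.
        x = new Box();
        x.Number = 3;
        Console.WriteLine(y.Number);                      // 2
        Console.WriteLine(object.ReferenceEquals(x, y));  // False
    }
}
```

The first change is like editing the web page. The second is like crossing out the URL on my piece of paper and writing a different one.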


