Beware operator precedence

In That’s just wrong!, I described some … unexpected results that occur due to the string concatenation optimization that the C# compiler performs. It seems a bit wonky to me, but perhaps people who do a lot of string concatenation find it useful.

Today I ran across a new one, thanks to Eric Lippert’s blog post, A string Concatenation puzzle.

Based on my previous blog post, you know that this code:

    string s = "";
    s = s + 5 + 5;

will result in the string s containing the value "55". After the compiler’s optimizations, that assignment becomes s = string.Concat(s, 5, 5).

So what would you expect as the result of this code?

    string s = "";
    s += 5 + 5;

How about "10"? Ouch. Operator precedence.

You see, writing s += expression is the same as writing s = s + (expression), so what you end up with is s + (5 + 5). After the compiler is done with its optimization magic, you have s = string.Concat(s, 10).

That seems wonky because in ordinary integer arithmetic (a + b) + c is the same as a + (b + c). That’s true for strings, as well, but not true when you combine numerical operations and string concatenation, because + means addition when both operands are numbers and concatenation when either operand is a string. Remember, when you see s + (5 + 5), the compiler generates code to add the two numbers in parentheses and passes that value to string.Concat.
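If you want to see all three forms side by side, here is a small sketch. The class and method names (ConcatDemo, Chained, Compound, Parenthesized) are my own inventions for illustration; only the expressions inside them come from the discussion above.

```csharp
using System;

public static class ConcatDemo
{
    // s + 5 + 5 parses left to right as (s + 5) + 5,
    // so both operations are string concatenations.
    public static string Chained()
    {
        string s = "";
        s = s + 5 + 5;
        return s;
    }

    // s += 5 + 5 means s = s + (5 + 5),
    // so the two integers are added before the concatenation.
    public static string Compound()
    {
        string s = "";
        s += 5 + 5;
        return s;
    }

    // Writing the parentheses yourself gives the same result as +=
    // and makes the grouping visible to the reader.
    public static string Parenthesized()
    {
        string s = "";
        s = s + (5 + 5);
        return s;
    }

    public static void Main()
    {
        Console.WriteLine(Chained());        // prints 55
        Console.WriteLine(Compound());       // prints 10
        Console.WriteLine(Parenthesized());  // prints 10
    }
}
```

When in doubt, I'd write the parentheses explicitly, as in the last method, so nobody has to work out the grouping.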

This kind of thing isn’t limited to strings, by the way. You can get yourself in trouble with numeric calculations, too. Consider:

    int a = 2;
    int b = 3;
    b = b + a << 1;  // result = 10

    // or, replace the above with
    b += a << 1;     // result = 7

Operator precedence again. Operator << has lower precedence than operator +. So b = b + a << 1 is evaluated as (b + a) << 1. The other is evaluated as b + (a << 1). Oddly enough, I saw this one trip somebody up the other day.

I think the key here is to understand that a += expression is equivalent to a = a + (expression). The entire expression is evaluated before the addition. Failure to understand that can lead to some rather mystifying results.
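The same rule applies to the other compound assignment operators, not just +=. Here is a minimal sketch. The class and method names are hypothetical, and the *= case is my own extra example rather than something from the posts mentioned above.

```csharp
using System;

public static class CompoundDemo
{
    // b + a << 1 groups as (b + a) << 1, because + binds tighter than <<.
    public static int ShiftInline(int a, int b)
    {
        return b + a << 1;
    }

    // b += a << 1 means b = b + (a << 1): the right side is evaluated first.
    public static int ShiftCompound(int a, int b)
    {
        b += a << 1;
        return b;
    }

    // x * 2 + 3 groups as (x * 2) + 3.
    public static int MultiplyInline(int x)
    {
        return x * 2 + 3;
    }

    // x *= 2 + 3 means x = x * (2 + 3).
    public static int MultiplyCompound(int x)
    {
        x *= 2 + 3;
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(ShiftInline(2, 3));     // prints 10
        Console.WriteLine(ShiftCompound(2, 3));   // prints 7
        Console.WriteLine(MultiplyInline(4));     // prints 11
        Console.WriteLine(MultiplyCompound(4));   // prints 20
    }
}
```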

2 comments to Beware operator precedence

  • RH in CT

    I’ve long believed that the ideal programming language would not have operator precedence. Any expression that relied on precedence would raise an error at compile time as being ambiguous. Require (parentheses) to disambiguate as needed. Lacking that as a feature in any language I have used I simply include parentheses as though it was.

    • RH: Indeed this is why LISP uses only prefix notation. The parsing rules are insanely simple. FORTH uses postfix for the same reason, and parentheses aren’t used at all.


