What's obvious to the eye isn't always obvious to a compiler. In particular, adding parentheses may not resolve order-of-evaluation ambiguities. Pete explains why, and corrects a few misconceptions.
Q
Would the expression
i = (i++);
differ from
i = i++;
James E. King, III

A
No. There is only one sequence point in each expression, and that is at the end. The first expression still attempts to modify i twice without an intervening sequence point, so its effect is undefined.
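If the goal is simply to increase i by one, either of the following is well defined (a minimal sketch of alternatives, not something the questioner asked about):

i++;          /* one modification of i, so no ambiguity */
i = i + 1;    /* equivalent: one read of i, one write to i */

The parentheses in i = (i++) change only the grouping; they do not introduce a sequence point, so they cannot make the original expression well defined.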
More interesting, though, is why you thought of adding parentheses to make the expression work right. In many cases you can make expressions do as you expect by adding parentheses, but you should do this only if you understand why the expression behaves the way it does and what effect adding the parentheses will have. Don't just add parentheses out of a general sense that adding parentheses is good. I must confess, though, that I too sometimes add parentheses instead of thinking about the appropriate rules. Nevertheless, it's better to work toward understanding C syntax to the point where you can tell whether parentheses are appropriate, and not simply add them because they are sometimes helpful.
I suspect that our tendency to use parentheses indiscriminately arises from two sources: the need to protect ourselves from the preprocessor, and our failure to fully understand how the rules of precedence and grouping affect the meaning of an expression. The good news is that each of these problems is treatable. The bad news is that these problems stem from deeper problems, which can only be cured by a fairly careful study of the C language definition. I can give you some of the tools needed for this study, but if you suffer from this parenthesis disease you must be prepared to do some hard work to get rid of the bad habits that you have developed over the years.
The C preprocessor is quite powerful, but it is undisciplined. It can quickly turn healthy code into unmaintainable gibberish. For example, here's an error that you've probably seen far too many times:
#define a = 3

printf( "%d\n", a );
Of course, the = sign does not belong there. It got in because the programmer was thinking of initialization instead of macro substitution, and simply typed the wrong thing. The result is a program that does not compile.
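To see why, expand the macro by hand, just as the preprocessor would. The statement becomes

printf( "%d\n", = 3 );

and the stray = sign is a syntax error.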
A more subtle problem occurs in the following:

#define x 1 + 2

printf( "%d%d\n", x, 2*x );
The programmer almost certainly expected the printf statement to display the values 3 and 6, but instead it displays 3 and 4. The reason is that the preprocessor replaces each x in the printf statement with the replacement text of the macro, which is 1+2. The C compiler sees a printf statement that looks like this:

printf( "%d%d\n", 1+2, 2*1+2 );
Since 2*1+2 is 4, that is what printf displays for its second value. That's why every C programmer learns this programming rule: always enclose a macro definition in parentheses. Applying this rule prevents the identifier x from seeming to change value mysteriously. In this snippet:

#define x (1+2)

printf( "%d%d\n", x, 2*x );

printf will display the values 3 and 6, which makes much better sense.
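For comparison, expand this version by hand as well:

printf( "%d%d\n", (1+2), 2*(1+2) );

Since 2*(1+2) is 6, the parentheses preserve the grouping the programmer had in mind.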
A similar problem arises with function-like macros:
#define dbl(n) (2*n)

printf( "%d%d\n", 3, dbl(3) );

Here we've put parentheses around the definition of our macro text so that we can avoid the problem we saw earlier. This code fragment correctly displays the values 3 and 6. However, there's still a problem lurking here, which shows up if we use our macro in a slightly more complicated expression:
#define dbl(n) (2*n)

printf( "%d%d\n", 1+2, dbl(1+2) );
Now the expression displays the values 3 and 4, again not at all what we wanted it to do. The culprit, once again, is the interaction of the preprocessor and the C compiler. If we expand the macro in place we get a better view of what's happening. The printf expression becomes

printf( "%d%d\n", 1+2, (2*1+2) );
That's why we have the second rule of macro writing: always enclose uses of macro parameters in parentheses. The correct way to write the dbl macro is like this:

#define dbl(n) (2*(n))
Now our printf expression displays the correct output. That's because the preprocessor turned it into this:

printf( "%d%d\n", 1+2, (2*(1+2)) );
This proliferation of parentheses is appropriate and necessary when you're dealing with the preprocessor, because preprocessor macros simply turn into text at the point where they are used. The compiler does not distinguish the resulting text in any way from the surrounding text; what we expect to happen because we see only a single identifier doesn't necessarily happen when the preprocessor turns that identifier into something more complicated.

However, when you move out of the realm of the preprocessor and into statements that mean exactly what they say, you no longer need this extra protection. Seeing unnecessary parentheses makes people nervous about your code because it suggests that you don't know what you are doing. If that is in fact the case, then you need to learn about the basics of grouping and precedence. Those are the things that can be changed by adding parentheses.
The precedence of operators tells the compiler how to group terms in an expression that contains a mix of operations. For example, the simple expression
1+2*3

evaluates to 7, because multiplication has higher precedence than addition. The compiler generates code to multiply 2 by 3 and add 1 to the result. We can change that behavior by adding parentheses:
(1+2)*3

This expression evaluates to 9, because the parentheses tell the compiler to add 1 and 2 and multiply the result by 3.
Applying precedence in this sort of context is usually fairly straightforward. But C complicates things in its notion of an expression: almost anything can go inside an expression, including an assignment. So, for example,
foo = 2*3

is an expression. Multiplication has higher precedence than assignment, so the compiler multiplies 2 by 3 and assigns the result to foo. The expression has the value 6. We can change that behavior by adding parentheses:
(foo=2)*3
This expression assigns 2 to foo, and multiplies the result of that assignment by 3, giving the value 6 for this expression. The value of the expression stayed the same in this case, but the value assigned to foo changed because of the parentheses. In fact, precedence is what makes assignment statements work the way we expect them to: assignment has lower precedence than every other operator except the comma operator, and its interaction with ?: is worth an essay of its own.

Precedence tells you how to combine terms when you have distinct operators in an expression. When the operators are the same you must turn to grouping. For example, the addition operator groups from left to right. That means that in an expression with more than one addition operator and no intervening operators, the right-hand addition operation gets the result of the left-hand addition.
For example:
1+2+3

In this expression, the compiler should add 1 and 2, producing 3, and add that result to 3, producing 6. Now, that may not seem important, and when you are dealing with an expression involving identical types, it doesn't make much difference. But when the operands are of different types, grouping affects how the various operands get promoted, which can affect the validity and meaning of the expression. For example:
long add1( long a, int b, int c )
{
    return a+b+c;
}
In the return statement we have the expression a+b+c. Since addition groups from left to right, the compiler adds a and b, then adds the result to c. Since a is of type long, the compiler must promote b to long before performing the left-hand addition. The result of that addition is also a long. Since that result becomes an operand of the right-hand addition, the compiler must then promote c to a long before adding it to the sum of a and b. So what's wrong with that? Nothing, so far, but look what happens if we change the return statement slightly:

long add2( long a, int b, int c )
{
    return b+c+a;
}
Now we're in trouble. Addition groups from left to right, so the left-hand addition operator says to add b and c. b and c are both integers, so the result is an integer. The right-hand addition operator adds this result to a. Since a is long, the result of the left-hand addition is promoted to long and then added to a. Do you see the difference?

If you work on systems in which int and long are the same size, you might not see it. Picture, though, a system in which an int is smaller than a long. Let's say, for example, that an int is 16 bits and a long is 32 bits. If you have such a system available, try this:
#include <stdio.h>

/* add1 and add2 defined as above */
int main()
{
    long res1 = add1( 1L, 32000, 32000 );
    long res2 = add2( 1L, 32000, 32000 );
    printf( "%ld %ld\n", res1, res2 );
    return 0;
}
Borland C++ 5.0 compiling to a 16-bit application gives the following result:

64001 -1535
No, you can't blame this on the compiler. The difference is real, and is due to the language definition; the difference is a direct consequence of C's requirement that the + operator group from left to right.

Now, if you like to add parentheses, you can rewrite add2 to give the expected result:
long add2( long a, int b, int c )
{
    return b+(c+a);
}
By imposing your own grouping rule you tell the compiler to add c and a, which results in a long, and to add that result to b.

However, this really isn't a good solution. It's not at all clear on a casual reading of this code that the parentheses are there to solve a grouping problem, so a future maintenance programmer could easily become confused. It's better to explicitly eliminate the mix of sizes in the expression. We can do that by making everything into a long.
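One way to do that, sketched here, is to cast b and c in the return expression:

long add2( long a, int b, int c )
{
    return (long)b + (long)c + a;   /* forces both additions to be done as longs */
}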
But wait, I'm not quite through yet. Did you by chance make everything into a long by adding casts to convert b and c into longs in the return expression? Then you haven't been paying attention to my earlier columns. Casts are a sign of trouble. You've gotten yourself out of grouping trouble by getting yourself into cast trouble. Better to avoid both kinds of trouble by changing the interface to add2 so that it takes three longs:
long add2( long a, long b, long c )
{
    return b+c+a;
}
Now add2 contains nothing confusing or order-dependent. If someone calls add2 with a long and two integers, the appropriate promotions will take place at the point of call and the result will be as expected. There's no need for parentheses, and no need for casts. The code tells you what it does, and tells you clearly. That's the way to write maintainable code.

Q
I am concerned about code-bloat. It seems executable sizes just keep growing. I try to do what I can to reasonably minimize the implemented size of the programs I write. In pursuit of that goal I'm hoping you will educate me in the impact of using C++ templates.
Suppose I define COL as a template, and then instantiate the template for USER1, USER2, int, unsigned int, and double. Note that I am collecting pointers to my USER classes.
template <class T> class COL { /* template definition omitted */ };

typedef COL< USER1 * >      USER1_COL;
typedef COL< USER2 * >      USER2_COL;
typedef COL< int >          int_COL;
typedef COL< unsigned int > UINT_COL;
typedef COL< double >       double_COL;
Now it is likely (depending on platform) that USER1*, USER2*, int, and unsigned int are all the same size. Therefore, these classes and primitives could clearly share common translated output for the collection's Add() member. But do they? Is that what the compiler does, or does it simply perform source pre-processing (similar to C macros), which would render templates as only a notationally convenient form of cut and paste?

Can you shed some light on what is really going on with compilers and templates, and on what the draft standard and ARM say in this regard?
Steve Hamilton
A
Before I gave in to the dark side and became a manager I was the architect and primary developer of the BIDS container library that ships with Borland C++. One of the primary goals of that library was to be as space efficient as possible without compromising execution speed. I've given the issues you raise a lot of thought, and I hope I can shed some light on the conflicting concerns that they present.
A template is a shorthand notation for a family of classes. Each time you instantiate the template for a different type you create a new class. So, for example, in your sample code you have created five classes: COL<USER1*>, COL<USER2*>, COL<int>, COL<unsigned int>, and COL<double>. These five classes look a lot alike: they all have the same set of members with the same names. Be careful, though, about drawing further conclusions about similarities. It's a little more complicated than it looks.
For example, let's say that COL has a member function named size that returns the number of items in the container. The container probably stores this value internally in a member variable, so the function size simply returns the stored value, like this:
template <class T>
unsigned COL<T>::size() const
{
    return item_count;
}

This particular piece of source code does not depend on the type T used to instantiate the template. This suggests that when the compiler generates object code for the size member function it can use the same code for every template instantiation. It's fairly easy, though, to come up with an example where the compiler can't use the same code. Suppose, for instance, that the template contained an object of type T:
template <class T> class COL
{
private:
    T data;
    unsigned item_count;
public:
    unsigned size() const;
};
In this case, if you instantiate COL for two types that have different sizes, the offset of item_count from the beginning of the class will differ in the two instantiations, because the compiler has to leave room for the member named data. Since the offset of item_count is different, the object code for size will also differ between versions; size's executable code has to add the appropriate offset to the this pointer in order to find item_count. The compiler can't avoid this problem by changing the order of item_count and data in memory, because the language definition requires that these two data members appear in the order in which they are declared.

You're right, of course, that instantiating the template for two types that are the same size ought to produce two classes with identical data layouts. Let's work through some of the details of your template COL, simplified by using a fixed-size array:
template <class T> class COL
{
    static const int array_size = 10;
    T data[array_size];
    unsigned next;
public:
    COL() : next(0) {}
    void Add( T item );
    T& Get( unsigned index );
};

template <class T>
void COL<T>::Add( T item )
{
    if( next == array_size )
        throw bounds_error();   // bounds_error: an exception class whose definition is not shown
    data[next++] = item;
}

template <class T>
T& COL<T>::Get( unsigned index )
{
    if( index >= next )
        throw bounds_error();
    return data[index];
}

template <class T> const int COL<T>::array_size;

Let's create two instances of this template, and use the resulting classes in some code:
int main()
{
    COL<int> ci;
    COL<unsigned> cu;
    return 0;
}
These are two of the classes you mentioned in your note. And, as you feared, compilers today will not recognize that the code we have written results in identical object code for both of these template instantiations. There's no inherent reason that compilers can't do this sort of optimization, but for now compiler vendors are concentrating on getting the semantics of templates right. Fine-tuning the details will come later.

Sometimes it is a straightforward matter to eliminate this sort of duplication. If none of your template functions does anything other than copy data around, you can create a base template that handles data of the same size. The base template would look something like this:
#include <string.h>   // for memcpy

template <unsigned n> class BASE_COL
{
    static const unsigned array_size = 10*n;
    unsigned char data[array_size];
    unsigned next;
public:
    BASE_COL() : next(0) {}
    void Add( void *item );
    void *Get( unsigned index );
};

template <unsigned n>
void BASE_COL<n>::Add( void *item )
{
    if( next == array_size )
        throw bounds_error();
    memcpy( &data[next], item, n );
    next += n;
}

template <unsigned n>
void *BASE_COL<n>::Get( unsigned index )
{
    if( index*n >= next )
        throw bounds_error();
    return &data[index*n];
}

template <unsigned n> const unsigned BASE_COL<n>::array_size;
Using this base you can write a template that takes actual types:

template <class T> class COL
{
    BASE_COL< sizeof(T) > data;
public:
    void Add( T item )       { data.Add( &item ); }
    T& Get( unsigned index ) { return *static_cast<T*>( data.Get( index ) ); }
};
If your compiler uses four bytes to store integers, when you instantiate COL<int> the compiler will generate BASE_COL<4> so that COL<int> can use it for its internal data storage. When you instantiate COL<unsigned> the compiler also generates BASE_COL<4>. You'll only get one copy of the code for BASE_COL<4> in your executable code, so you've avoided the code duplication that results from the more obvious form of this template.

This convenience comes at a cost, however: this version of COL is dangerous. If you try to use it to hold objects with non-trivial copy semantics you'll be in trouble: memcpy won't handle these objects correctly, because it bypasses your assignment operator.
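To make the danger concrete, here is a minimal sketch (assuming the memcpy-based COL above) of what goes wrong with a type that manages its own memory, such as std::string:

#include <string>

void danger()
{
    COL<std::string> names;              // the memcpy-based COL sketched above
    names.Add( std::string("hello") );   // memcpy copies the string's raw bytes without
                                         // calling its copy constructor; any heap storage
                                         // the string owns is freed when the temporary
                                         // dies, leaving the stored copy dangling
}

Because the bitwise copy never invokes the copy constructor or assignment operator, the container can end up holding an object whose internal pointers are no longer valid, and the stored copy's destructor is never run. This version of COL is safe only for types that can be copied byte for byte.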
Because of the dangers in hand-tuning the space usage of templates, I don't advise trying to do it as a standard procedure. If you find that space usage is, in fact, a problem, that's the time to see what you can do about it. Like all other hand optimizations, though, if you don't need it, don't do it. And someday we'll have compilers that can do these things for us.
Q
How can you possibly be qualified to write a column about C++ when you don't even know the most basic things about the language? In your previous answer you tried to initialize a static data member inside a class definition. The code was something like this:
class COL { static const int array_size = 10; };
Everyone knows you can't do that. That's why we use the enum hack. You could at least run your code through a compiler before you put it in your column. You'd save your readers a lot of trouble, and yourself a lot of embarrassment.
Mortimer J. Farquar

A
I did run it through a compiler, and it compiled just fine. You're right, though, that until recently this was not valid, and we did indeed have to use the enum hack to create class members that are compile-time constants:
class COL { enum { array_size = 10 }; };
However, the ANSI/ISO working groups added initialization of static const integral members to the C++ language, and the syntax set out in your note is now valid. You must still provide a definition of the data member, just like before, but you can either put the initializer in the declaration, as I did, or put it in the definition. That is, there are now two forms of initialization for static class members:

class COL
{
    static const int internal = 10;
    static const int external;
};

const int COL::internal;
const int COL::external = 10;
Both internal and external are static const data members. internal can be used as a compile-time constant, because it is initialized in the class definition. external cannot. With this new mechanism, we no longer need the enum hack.

Pete Becker is Senior Development Manager for C++ Quality Assurance at Borland International. He has been involved with C++ as a developer and manager at Borland for the past six years, and is Borland's principal representative to the ANSI/ISO C++ standardization committee.