Pete reflects on the eight years of hard work that led to the C++ Standard.
To ask Pete a question about C or C++, send e-mail to petebecker@acm.org, use subject line: Questions and Answers, or write to Pete Becker, C/C++ Users Journal, 1601 W. 23rd St., Ste. 200, Lawrence, KS 66046.
Last week [1] brought with it two major events: the arrival of winter in New England, and the approval of the Final Draft International Standard (FDIS) for the programming language C++. Winter is an exciting time of year: the snow makes the world look different, and the cold makes sounds crisper. It's also a time of transition for people, especially drivers, who have to adjust their habits to fit with the changed environment. The approval of the FDIS will have a similar effect in the C++ programming industry: the agreement on what the language definition will contain makes C++ look different, and the details that the FDIS contains will give programmers a common vocabulary and set of concepts to make discussions about C++ programming better focused and crisper. The new language features are no longer a subject for speculation. Programmers will begin to change their programming habits, demand better conformance from compiler vendors, and become less tolerant of non-standard practices in their own code. This month, instead of my usual Questions and Answers column, I'm going to talk about the C++ Standard. Not the technical details; there's plenty of time for those in the years to come. I'm going to talk about how the Standard came about: a little of the alphabet soup that's inevitable in discussions about standards, but mostly about the people who did this and the immense effort involved in getting several hundred very bright and often stubborn people to agree unanimously on what the C++ language should be.
We spent eight years working on the Standard. The first technical meeting was held in November, 1989, in New Jersey. We approved the FDIS at the November, 1997 meeting, also in New Jersey. At that meeting we also did a little statistical analysis. Of the 60 people who attended, about 12 of us had been at the first technical session. One person had attended every session; two had attended all but one. The record for the most companies that one individual has represented is seven, held by the peripatetic Jerry Schwarz, also known as the father of the iostream library. We had one marriage, a couple who met at the standards meetings, and several births. One of these newcomers (age 1 yr.) attended the November session; apparently his father wanted to get him started early.
The Problem
Broadly speaking, the reason we need language standards is to help ensure uniformity in the definition and use of a programming language. Uniformity makes it possible to write code carefully, expecting that such code can be ported to a different target platform fairly easily. Limiting the variations among dialects of a language leaves only programmer error and deliberate non-portability as barriers to porting. Uniformity also makes programmers themselves more portable; that is, it's easier for programmers to move to new development environments, because the amount of relearning they have to do in the new environment is reduced.
Uniformity also has a less obvious benefit: it makes it easier for programmers to talk about what they are doing and how they do it. If you have to define your terms at the beginning of every discussion, it takes much longer to begin to make headway on technical details. Having a uniform set of language concepts and a uniform vocabulary for talking about those concepts makes it easier to learn a programming language, easier to use it, and easier to understand advanced techniques. As C++ programmers become more familiar with the concepts and vocabulary of the Standard, we'll see an overall increase in the capabilities of C++ programmers, and we'll all learn to use the language more effectively [2].
Of course, we get those benefits only if the language definition stays stable long enough for programmers to understand it. That's been one of the biggest criticisms of C++ during the standardization process: that it changed too often. That's inevitable during standardization, but now that it's done, ISO rules prohibit making technical changes, other than correcting errors, for five years.
It was clear from the beginning of the standardization process that uniformity alone was not sufficient. The Annotated C++ Reference Manual (ARM) [3] described not just the C++ language as implemented in cfront and a handful of other compilers, but also two significant extensions to the language: templates and exceptions. There had been discussions and papers on how to add these features to the language for a couple of years, but no one really had much practical experience with either of these language features in C++. Nobody questioned the need for them, but we all underestimated the amount of work that would be needed to fit them into C++ properly [4]. We eventually also added namespaces and new-style casts, which weren't in the ARM at all.
The other thing clearly needed was a standard library. It was pretty much assumed that you could use the Standard C Library from within C++, and most implementations provided the iostream library, but this fell well short of what we felt C++ programmers would expect and need. Things like operator new and operator delete needed more rigorous definitions. Another thing we obviously needed was a string class. Containers seemed like a good candidate for inclusion in the Standard Library, but there were several container libraries available for C++ at the time, and they all took somewhat different approaches. We didn't feel that we knew enough at first to be able to come up with a uniform and sufficiently powerful set of containers. So the task of developing a standard C++ library required making fundamental decisions about what that library should contain, as well as working out the details of specifying those contents. Specifying the contents of the Standard C++ Library was a more open-ended task than specifying the language itself and the handful of extensions that eventually made it into the Standard.
So those were our goals: to come up with a precise definition of the C++ programming language, filling in the details of the language itself and its supporting library, and adding to the language and the library as needed.
The Process
There were two standards bodies directly involved in the technical work on the C++ Standard: the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO). When the C language was standardized, these same two organizations were involved. ANSI produced the American standard for C, and ISO subsequently adopted the American standard with a few modifications as the international standard. C++ standardization started out with the same plan, but early on ANSI and ISO put together a procedure for joint development of standards, and we switched to what is known as a Type I process in order to simplify the development of the international standard. What that meant was that every meeting we held was a joint meeting of the ANSI working group and the ISO working group. Officially, the ANSI working group served as technical advisor to the ISO working group, so we took two votes on every technical issue: an ANSI vote, to decide what ANSI would recommend and what position the American representative to ISO should take; and an ISO vote, to actually make the decision. We usually had from thirty to fifty voting members of ANSI at meetings, and from five to nine voting members of ISO. This dual voting always seemed a bit silly, but it really did streamline things. The result is that we will soon have an international standard and an American standard that are identical.
ISO actually has nearly thirty national bodies taking part in the decisions about the C++ Standard. Most of these members didn't attend the regular meetings, but took part in discussions and ballots concerning the working paper from time to time as the ISO rules required. In particular, we produced two documents known as a Committee Draft (CD). Each of those documents was sent to the ISO members for their ballots and comments. The first CD garnered quite a few disapproving votes, and many comments. The second CD, approved by the working groups in November, 1996, got five 'no' votes. Our main job in 1997 was to respond to the comments that accompanied those 'no' votes, to try to turn them into 'yes' votes.
We met three times a year, for a week at a time, in various parts of the world. Some observers of this process think that all that travel is a great job perk. One person on the Internet suggested that at the Monterey meeting we'd all be out playing golf. Trust me, those meetings were hard work. Most of us spent about ten hours each day discussing standards issues. Some spent more. After the meeting in Santa Cruz, California, the staff at the hotel where the meeting was held commented that we had had the highest per capita coffee consumption of any group that had met there, and very little of that was decaf. I have to admit, though, that things tended to slow down as the week wore on. I don't know of anyone who played golf in Monterey, but I sure saw a lot of people I knew when I went to the Monterey Aquarium on Thursday afternoon.
One of the first decisions we made was to adopt the Annotated Reference Manual as one of the base documents for our work. The ARM, as it quickly came to be known, provided a reasonably clear and easily accessible starting point for our work. Turning its ideas into a standard meant, in many cases, filling in details that were only sketched into the ARM. The One Definition Rule was one of the hardest. Roughly speaking, it says that when you define the same thing in several places, the definitions must be identical. That's fairly hard to put into precise words [5], and it took several years to get to the wording that's in the Standard today.
We settled in fairly early on our working procedure. We divided into six working groups: core language, to handle core language issues; extensions, to handle additions to the language; formal syntax, to make sure we didn't lose our way and produce something that couldn't be compiled; environment, to deal with the interactions between compilation units [6] and the interaction with the system that the program is running on; C compatibility, to make sure that we understood what we were doing to the C language as we enhanced it; and libraries, to define the scope and contents of the Standard Library. Each meeting began with the full committee in general session, to review what was going to be done and get some formalities out of the way. That usually took up Monday morning. Then we'd split up into working groups for technical discussions, which lasted through the middle of Wednesday. On Wednesday afternoon and Thursday we'd listen to reports from the working groups of what they had done and what proposals they intended to make. We'd take a straw vote on each proposal, and any proposal that seemed to still be controversial would be taken back to the working group for further discussion. The idea was to have formal votes only on proposals that we were fairly certain would pass. The formal votes took place on Friday morning; we'd run through a list of sixty or seventy amendments to the current working paper (all the things we'd discussed informally on Wednesday and Thursday) and approve them. Then the project editor got the thankless task of rewriting the working paper to incorporate all the changes that we had made.
Now, that requirement that decisions be non-controversial might seem a bit odd, but it went a long way toward ensuring that the Standard eventually received the widespread support that it did. We tried to emphasize reaching a consensus on technical issues, rather than resolving them simply on a majority vote. Consensus doesn't mean that everybody agrees. It means that those who disagree with the decision feel that their views have been adequately explored and understood. As a rule of thumb, if a proposal got less than two-thirds approval, we'd send it back to the working group.
Often reaching consensus meant continuing heated discussion of a topic until someone came up with a compelling technical argument for one position. Such a compelling technical argument, of course, immediately dispels any opposition. That's what eventually happened on the question of resumption from exceptions. The question was whether an exception simply said, "I can't go on from here, I give up," or whether it said, "I can't go on, but if somebody who called me can fix the problem, I can continue." There were strong advocates on both sides, and the debate continued through several meetings with no resolution in sight. The breakthrough came when Jim Mitchell, from Sun Microsystems, attended one of the meetings. He had worked with operating systems and programming languages that supported resumable exceptions for many years. He had seen that programs are often written with resumable exceptions, but that as they go through maintenance cycles the maintenance programmers remove them, finding the control flow too hard to follow. That was the killer argument against resumable exceptions: practical experience had shown that they were too hard to use. After Jim's presentation, the vote was twenty-two to two in favor of the termination model for exceptions.
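To make the termination model concrete, here is a minimal sketch of what it means in code. The function names (read_record, run, demo) and the error message are hypothetical, chosen only for illustration. Once the exception is thrown, the stack unwinds to the handler, and execution continues after the try block; nothing in the language lets the handler send control back to the point of the throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical operation that cannot complete and reports it by throwing.
void read_record(std::vector<std::string>& log)
{
    log.push_back("read_record: start");
    throw std::runtime_error("disk not ready");
}

// Calls read_record and records the order in which things happen.
std::vector<std::string> run()
{
    std::vector<std::string> log;
    try {
        read_record(log);
        // Skipped: control never comes back here after the throw.
        log.push_back("run: after read_record");
    } catch (const std::runtime_error& e) {
        log.push_back(std::string("handler: ") + e.what());
    }
    // Execution picks up after the try block, not at the throw point.
    log.push_back("run: continues after the try block");
    return log;
}

// Small self-check of the recorded control flow.
void demo()
{
    const std::vector<std::string> log = run();
    assert(log.size() == 3);
    assert(log[1] == "handler: disk not ready");
}
```

A caller that wants to "fix the problem and continue" has to do so explicitly, typically by looping and calling the failing operation again from the top.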
The Product
So what's the result of all this? A C++ standard that's a little less than 700 pages long [7], containing 27 chapters and five appendices. The first 16 chapters are the definition of the language itself; the next 11 are the library. If you compare this with the C Standard, it's obvious that the C++ Standard is quite a bit larger. The C Standard is just over 200 pages long. It has four chapters, with the first three containing the language definition and the last one the library. That's a bit misleading, though, because the last chapter is a long one: 100 pages. So the library accounts for a little less than half of the pages in the C Standard. In the C++ Standard the library is a little less than 400 pages, so it's somewhat more than half of the Standard.
The length of the C++ Standard could probably have been reduced if we'd spent more time on it. But, as Blaise Pascal put it, "I have made this letter longer than usual, because I lack the time to make it short." [8] We took on a large job, and it turned out to be larger than we had thought. In developing the C Standard, the participants managed to get the new work out of the way fairly early, and spend two years refining the wording and getting the commas in the right places. We were still struggling with some fairly complex technical issues at the last meeting. I think we've got them right, but I still feel like we were flying two thousand feet above the airport and landed rather quickly. It would have been nice to have had more time to polish what we did.
Still, what's there is a much more powerful language than C++ was when we started. The addition of STL, in particular, although it was a significant disruption and probably added a year to the time we spent on the Standard, gives us a powerful programming model that most of us are just beginning to appreciate. Most C++ programmers today grasp the notion of inheritance as a reuse mechanism. We understand how to use abstract base classes to define interfaces, and to specialize those interfaces for particular situations in derived classes. At the very least, we've learned that from the iostream library. Reuse through templates is completely different, and I look forward to understanding it better in the years to come.
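The contrast between the two kinds of reuse can be sketched in a few lines. The names here (Shape, Square, count_equal, demo) are hypothetical and are not taken from the Standard Library; the template is only written in the general style of the STL algorithms. The inheritance half reuses an interface through a common base class. The template half reuses an algorithm with any types that support the operations it needs, with no base class in sight.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reuse through inheritance: an abstract base class defines the
// interface, and a derived class specializes it.
class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const { return side_ * side_; }
private:
    double side_;
};

// Reuse through templates: works with any iterator type that supports
// !=, ++ and *, and any value type comparable with ==.
template <class Iter, class T>
std::size_t count_equal(Iter first, Iter last, const T& value)
{
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first == value)
            ++n;
    return n;
}

// Small self-check exercising both styles.
void demo()
{
    Square sq(3.0);
    const Shape& s = sq;               // used through the base interface
    assert(s.area() == 9.0);

    int raw[] = { 1, 2, 2, 3 };        // plain array, pointer iterators
    assert(count_equal(raw, raw + 4, 2) == 2);

    std::vector<std::string> words;    // container, class-type iterators
    words.push_back("a");
    words.push_back("b");
    words.push_back("a");
    assert(count_equal(words.begin(), words.end(), std::string("a")) == 2);
}
```

Note that the array and the vector share no common ancestor; the template is reused purely on the basis of the operations the types happen to provide.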
I mentioned earlier that the Standard will stay the same for five years. That's not quite true: there are procedures for making technical corrections to a standard. People who find problems can file Defect Reports (DRs) for consideration by the working groups. In response to a DR, the working group can simply say that it's not a defect, that it was intended to be that way. They can also decide that it's a serious problem, and needs to be fixed. That's not a broad license to rewrite the Standard. It's only permission to fix things that clearly need fixing. It shouldn't result in confusion about what the Standard says, because it only applies to parts of the Standard that aren't clear to begin with. Everyone understands that making drastic changes to the Standard itself would quickly result in self-immolation.
Now that we have a stable language definition for C++, compilers will start implementing the language definition more faithfully. You'll see fewer and fewer of those niggling little complaints about syntactic quirks, which will make it easier to port code. You'll also see more and more implementations that try to handle the more ambitious parts of the Standard. There are a few compilers today that try to implement member templates, but they generally don't do it very well. That will improve. There are new requirements for exception safety in the standard containers. Today, if you insert elements into a vector and run out of memory, you can't predict what will happen. You'll get a bad_alloc exception, but you'll have no idea what you can safely do with your container. The FDIS provides solid guarantees here, so you're in a much better position to design the behavior of your program in the presence of errors. That's not in the library implementations that you have now, but I expect that you'll see it in all the implementations of the Standard Library within a year.
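Here is a minimal sketch of the kind of guarantee involved, assuming a conforming library in which a push_back that throws leaves the vector as it was. Running out of memory on demand is awkward to demonstrate, so the hypothetical Fragile type below fails in its copy constructor instead of relying on a real bad_alloc; the type and the function names are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type whose copy constructor can be made to fail,
// standing in for an allocation failure during insertion.
struct Fragile {
    static bool fail_on_copy;
    int value;
    explicit Fragile(int v) : value(v) {}
    Fragile(const Fragile& other) : value(other.value)
    {
        if (fail_on_copy)
            throw std::runtime_error("copy failed");
    }
};
bool Fragile::fail_on_copy = false;

// Returns true if a failed push_back left the vector exactly as it was.
bool push_back_is_all_or_nothing()
{
    std::vector<Fragile> v;
    v.push_back(Fragile(1));
    v.push_back(Fragile(2));
    const std::size_t size_before = v.size();

    Fragile extra(3);
    Fragile::fail_on_copy = true;      // every copy now throws
    bool threw = false;
    try {
        v.push_back(extra);            // fails while copying
    } catch (const std::runtime_error&) {
        threw = true;
    }
    Fragile::fail_on_copy = false;

    // The container is still usable and holds its original contents.
    return threw
        && v.size() == size_before
        && v[0].value == 1
        && v[1].value == 2;
}

// Small self-check.
void demo()
{
    assert(push_back_is_all_or_nothing());
}
```

With a guarantee like this, the handler can keep using the container after the failure, which is what makes it practical to design a program's behavior in the presence of errors.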
During the next five years there will be proposals to change the C++ Standard. There's already some serious discussion of adding garbage collection. Also, there are a couple of containers that really should be in the library, but aren't there because they were proposed fairly recently. There wasn't time to integrate them into the language definition. There will undoubtedly be more such proposals. In fact, if there weren't pressure to change the language it would indicate that it wasn't being used much. I expect C++ to be heavily used in the coming years. It's a very powerful language, and having an international standard removes many of the obstacles to its use. Now it's time to use the language, and understand what it can and can't do. In five years we can apply the knowledge that this will give us, and perhaps improve the language further. In the meantime, though, in the immortal words of Josee LaJoie, the ANSI representative from IBM and the ISO head of delegation for Canada, "Hey, we're done!!!"
Notes
[1] I'm writing this on November 18, 1997.
[2] See T.S. Kuhn. The Structure of Scientific Revolutions, Second Edition (University of Chicago Press, 1970) for a discussion of the effect of shared concepts and vocabulary on intellectual progress in the sciences. Kuhn introduced the word paradigm to describe a shared model in a field of science, and the term paradigm shift to describe a revolutionary change in the shared model. He is not, however, responsible for the degradation of the latter term into today's usual meaning of "I'm doing things differently from the way I used to do them."
[3] Margaret A. Ellis and Bjarne Stroustrup. The Annotated C++ Reference Manual (Addison-Wesley, 1990).
[4] John Spicer, from Edison Design Group, did a lot of the grunt work needed to integrate templates into C++. He recently mentioned to me that if he had known at the start how hard it was going to be, he might have picked something easier to work on.
[5] For example, you can define a class named C containing a data member of type int in one source file, and a class named C containing a data member of type INTEGER in another source file without violating the One Definition Rule if INTEGER is a macro or typedef for int. That's just one of the many possibilities that the One Definition Rule must account for.
[6] The environment working group did the work on the One Definition Rule, for example.
[7] These numbers are actually based on the working paper that preceded the FDIS, that is, the paper that we amended at the November, 1997 meeting. There may be a couple of pages difference here and there in the FDIS, but these numbers are good enough.
[8] Blaise Pascal. Lettres Provinciales (1656-1657), no. 4.
Pete Becker is Technical Project Leader for Dinkumware, Ltd. He spent eight years in the C++ group at Borland International, both as a developer and manager. He is a member of the ANSI/ISO C++ standardization committee. He can be reached by email at petebecker@acm.org.