Sunday, June 7, 2009

Standard C++

I have vague recollections of trying to track down a copy of the C++ standard several years ago. I went to the ISO's web site, went into sticker shock at the several hundred dollars they wanted for a CD or PDF, and soon forgot about it.

Until I came across this discussion on Stack Overflow, I didn't realize that the C++ standard was so easy to acquire. PDF versions are available for $30 from ANSI's web site and from Techstreet. If even $30 is too much, or if you want to read about cutting-edge C++ stuff, then you can download the C++0x draft standard from the ISO committee's web site for free, with the caveat that it is "incomplet and incorrekt." Copies of the C++98 and/or C++03 draft standards are also archived in various places around the web.

With the wealth of C++ information available on the web and in print, there are often more convenient resources than the standard, but topics such as the order in which integral types are promoted seem difficult to come by on the web, and for particularly arcane issues of C++ style, it's nice to have a more disciplined approach than "munging syntax until it works" and a more authoritative source than "whatever my compiler accepts." The C++ standard acts as a grammar textbook for the C++ language, and although the standard isn't exactly thrilling reading, its dialect of standardese isn't nearly as difficult to read as you might think.

Linguists will tell you that a natural language's grammar is descriptive, not prescriptive. In other words, English grammar isn't the list of rules that you find in a high school textbook, it's a description of how people actually talk and write, complete with colloquialisms, various "ungrammatical" speech, and, in the 21st century, emoticons and online acronyms. There's some merit to taking a similar approach with programming languages. Using a programming language effectively requires more than just knowing the language's formal constructs; it involves using the language's community's idioms (using and combining constructs to get things done, as in C++'s RAII) and taboos (aspects of the language that are too obscure or problematic to rely upon, like C++'s exception specifications or the relative precedence of || and &&). Moreover, what ultimately matters is how a language can be used in the field rather than how the standard says a language can be used; it's a small comfort to read that your template metaprogramming technique or C99 construct is permitted by the standard if your target platform rejects it.

The rules of grammar (in the prescriptive high school sense rather than the descriptive linguistic sense) are generally worth following, to promote clarity of communication, but skilled writers know that it's sometimes effective to break these rules, whether to create a particular effect (such as a conversational, informal tone) or because the rules are outdated or never made much sense in the first place (such as the prohibition against split infinitives). Similarly, there are times where it may be beneficial to break the C++ standard. Imperfect C++ gives two examples of this (with caveats as to their use): the "friendly template" technique (declaring a template parameter to be a friend of the template class; technically illegal but very widely supported) and nested or local functors (potentially very useful but rejected by almost all compilers). C99 features such as variable length arrays (supported by GCC) and variadic macros also fall in this category. Obviously, breaking the standard should only be done with full awareness of the consequences (primarily damage to future portability) and is usually a bad idea, but not always. For example, code using variadic macros completely violates the C++ standard, but it's likely much more portable than some of the more advanced uses of templates.

C++ might deserve mention for even having a (well-discussed, reasonably well-implemented) standard. Perl, for example, lacks any official standard; instead, its implementation and test suite serve as the de facto standard, and the need to remedy this may be part of the reason for Perl 6's delays. JavaScript has an international standard and yet continues to suffer from browser compatibility issues, not just in the DOM and event models, but in the language core:

var cmd, valve;
// Perl- or Python-style assignment to an array: legal in Firefox, but 
// rejected by Chrome
[ , cmd, valve ] = /(\w+)_valve_(\d+)/.exec(this.id);

// A C-style trailing comma in an initializer list: creates an array 
// of length 3 in Firefox, but IE creates an array of length 4 with 
// undefined as its last element
var fields = [ 'street', 'city', 'zip', ]

Fred Brooks has more to say about the risks of using an implementation as a standard in chapter 6 of The Mythical Man-Month, but by this point we're pretty far afield of the C++ standard. So, if you're a C++ developer, go download or buy yourself a copy and give it a look.

No comments: