Sunday, February 22, 2009

Things I Didn't Know about C++: Local Classes

C++ is a complex language, and although I thought I knew it pretty well, I keep finding areas of the language that I either didn't know or didn't understand well enough. So, on the (possibly narcissistic) assumption that others may not know them well enough either, here's a brief series of postings about them.

First up are local classes, which are not the same thing as nested classes.

#include <vector>

class imitation_string {
public:
  // This is a nested class.  Other code can then use imitation_string::iterator.
  class iterator {
  public:
    iterator& operator++();
    iterator& operator--();
    char& operator*();
  };
};

void string_tokenizer(const imitation_string& in, std::vector<imitation_string>& out)
{
  // This is a local class.  It can only be used from within this function.
  class token {
    // Insert class definition...
  };
}

Local classes are closely related to nested (local) functions, which are not permitted in C++.

// Not valid C++.
int f()
{
  int i = 0;
  void g()
  {
    // A nested function has access to its enclosing function's variables.
    i++;
  }
  g();
  return i;
}

The position of local classes within the design of C++ seems very awkward to me. "True" nested functions require special support from the compiler, because they must be able to access their enclosing functions' stack frames, and calling a nested function (via a function pointer) after its enclosing function has exited is a quick way to crash a program. This might be reason enough to omit them from the C++ language, but it didn't stop GCC from implementing them as an extension to C. Stroustrup dismisses local functions by saying that "most often, the use of a local function is a sign that a function is too large," but that didn't stop local classes from being permitted (with the same "too large" caveat). Nested functions occupy a much more comfortable position in other languages: Pascal (and therefore Object Pascal and Delphi) has supported them for ages, they're a key development technique in JavaScript (used for everything from closures to jQuery event handlers), lambda expressions offer equivalent functionality in languages which support them, and so on.

Even ignoring their awkward position, local classes in C++ have some odd traits. First, under the C++03 rules they cannot be used as template arguments (so no smart pointers to local classes). Local class templates and member templates of local classes are likewise prohibited. Second, although local class methods cannot access the enclosing function's automatic variables (unlike a nested function), they can access static local variables within the enclosing function.

// Valid C++!
int f()
{
  static int i;
  class g {
  public:
    static void Execute() { i++; }
  };
  g::Execute();
  return i;
}
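For contrast, here is a minimal sketch of what happens with a non-static local variable, and of one simple way around the restriction: handing the local class a reference to the variable. The names count_to and incrementer are made up for illustration.

```cpp
#include <cassert>

// Hypothetical function: counts up to `limit` using a local class that
// holds a reference to one of the enclosing function's variables.
int count_to(int limit)
{
  assert(limit >= 0);
  int i = 0;  // automatic (non-static) local variable

  class incrementer {
  public:
    explicit incrementer(int& target) : target_(target) {}
    // void bad() { i++; }  // would not compile: a local class cannot use
    //                      // the enclosing function's automatic variables
    void operator()() { ++target_; }  // fine: goes through the reference
  private:
    int& target_;
  };

  incrementer g(i);
  while (i < limit) {
    g();
  }
  return i;
}
```

The cost is boilerplate: every variable the local class needs has to be passed in and stored explicitly.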

Similarly, local classes within a class method can access static members (even protected and private static members) of the enclosing class, and such access is implicitly scoped. Because a local class has the same access rights as its enclosing function, it can also access protected and private members of any class that has declared the enclosing class a friend.

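class F {
public:
  int DoStuff();
private:
  static int i;
};

// Static data members need a definition outside the class.
int F::i = 0;

int F::DoStuff()
{
  class G {
  public:
    // i resolves to F::i without qualification, even though it is private.
    static void Execute() { i++; }
  };
  G::Execute();
  return i;
}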

Various workarounds have been proposed for local classes' inability to access local variables of their enclosing functions. Herb Sutter gives a rundown of the various options in GotW #58. His final solution is worth mentioning: in a bit of C++ judo, he turns the enclosing function into a class whose constructor contains the function body and which is implicitly convertible to the desired return value. The function's local variables become member variables, and the would-be local functions become member functions that can access these member variables.
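A rough sketch of that last idea follows. It is my own illustration of the technique as described above, not Sutter's code; the names f, g, and demo_f are assumptions, and the "function" here just increments a starting value twice.

```cpp
#include <cassert>

// Hypothetical example: what would have been "int f(int start)" with a
// nested function g() becomes a class with the same name.
class f {
public:
  // The constructor holds the former function body.
  explicit f(int start) : i_(start), result_(0)
  {
    g();
    g();
    result_ = i_;
  }

  // Implicit conversion lets callers write "int x = f(5);".
  operator int() const { return result_; }

private:
  // The former nested function: an ordinary member function that can
  // see the former local variables.
  void g() { ++i_; }

  int i_;       // former local variable
  int result_;  // value the function would have returned
};

// Usage: the call site reads like a normal function call.
int demo_f()
{
  int x = f(5);
  assert(x == 7);
  return x;
}
```

The call site stays unchanged, but f is no longer a real function, so its address can't be taken as a function pointer.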

Local classes have two main uses. First, they can be used to augment the regular control flow of a function. For example, Boost's new ScopeExit library lets you write arbitrary code to be executed whenever a code block exits (whether it exits by finishing, by an explicit return statement, or by throwing an exception). It implements this by defining a local class whose destructor executes the code RAII-fashion.

The Google C++ Testing Framework offers another example of augmenting control flow with local classes. A unit testing framework needs both a way to abort the current test if a test assertion fails and (ideally) a way to signify that certain assertions are expected to fail. The standard way to do this is to have failed assertions throw exceptions, so that expected failures can be caught and handled. However, a design goal of Google Test is to avoid requiring exceptions, for maximum portability, so failed assertions are handled by a simple return statement. Assertions that are expected to fail are wrapped in a static method of a local class so that the return statement doesn't abort the entire function.

static int i = 1;
// This assertion:
ASSERT_EQ(1, i);
// expands to code vaguely resembling this:
if (1 != i) {
  ReportFailedAssertion(1, i, "ASSERT_EQ(1, i)", __LINE__);
  return;
}

// This assertion:
EXPECT_FATAL_FAILURE(ASSERT_EQ(2, i));
// expands to code vaguely resembling this.  Note the local class scoped to
// a dummy do/while block.
do {
  class GTestExpectFatalFailureHelper {
  public:
    static void Execute() {
      if (2 != i) {
        ReportFailedAssertion(2, i, "ASSERT_EQ(2, i)", __LINE__);
        return;
      }
    }
  };
  GTestExpectFatalFailureHelper::Execute();
} while(false);
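The ScopeExit-style use mentioned earlier can be sketched by hand in much the same spirit. This is only an illustration of the idea of a local class whose destructor is the cleanup code, not Boost's actual macro-based implementation; open_handles, use_resource, cleanup_guard, and demo_cleanup are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter standing in for a real resource, so the effect of
// the cleanup code can be observed.
static int open_handles = 0;

void use_resource(bool fail)
{
  ++open_handles;  // "acquire" the resource

  // Local class whose destructor is the cleanup code. It can reach
  // open_handles because that is a namespace-scope variable, not an
  // automatic local of this function.
  class cleanup_guard {
  public:
    ~cleanup_guard() { --open_handles; }
  } guard;

  if (fail) {
    throw std::runtime_error("something went wrong");
  }
  // Normal exit: guard's destructor runs here as well.
}

// Both exit paths leave no handle open.
void demo_cleanup()
{
  use_resource(false);
  assert(open_handles == 0);
  try {
    use_resource(true);
  } catch (const std::runtime_error&) {
    // The guard has already run by the time the exception arrives here.
  }
  assert(open_handles == 0);
}
```

The cleanup code sits right next to the acquisition, which is the convenience ScopeExit is after. A real version would also need a way to reach the function's own local variables.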

Second, local classes can be used like any other class or function, as a mechanism for reusing and refactoring code. If you have code that appears in more than one place within a function, but is too specific to be used outside of that function, then following the principles of information hiding (if you prefer the dry CS term) or Spartan programming (if you prefer classical historical allusions or gratuitous Frank Miller references), you can use a local class to keep the code scoped as narrowly as possible. Personally, I've very rarely found code that's repeated within a function but is so specific that it will never be used outside of that function. However, this could simply be a case of my available tools determining how I solve problems. For example, Delphi permits local functions, and Delphi developers seem to find them moderately useful; the latest incarnation of Delphi's Visual Component Library, consisting of roughly 11,000 methods across 220,000 lines of code, uses about 190 local functions. Walter Bright, the creator of the D programming language, gives several examples of how nested functions can be used. He concludes,

"Lack of nested functions is a significant deficit in the expressive power of C and C++, necessitating lengthy and error-prone workarounds. Nested functions are applicable to a wide variety of common programming situations, providing a symbolic, pointer-free, and type-safe solution.

"They could be added to C and C++, but to my knowledge they are not being considered for inclusion. Nested functions and delegates are available now with D. As I get used to them, I find more and more uses for them in my own code and find it increasingly difficult to do without them in C++."
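In C++ as it stands, the closest approximation to a local function for this second use is a static method of a local class. Here is a minimal sketch, with format_name, local, and trim as made-up names; a real candidate would be more specific to its enclosing function than a whitespace trimmer.

```cpp
#include <cassert>
#include <string>

// Hypothetical function: formats a name as "Last, First", trimming both
// parts with the same helper.
std::string format_name(const std::string& first, const std::string& last)
{
  // Local class used only as a container for a helper function.
  class local {
  public:
    static std::string trim(const std::string& s)
    {
      const std::string::size_type begin = s.find_first_not_of(" \t");
      if (begin == std::string::npos) {
        return std::string();  // empty or all whitespace
      }
      const std::string::size_type end = s.find_last_not_of(" \t");
      return s.substr(begin, end - begin + 1);
    }
  };

  return local::trim(last) + ", " + local::trim(first);
}

// Usage: the helper is invisible outside format_name.
void demo_format_name()
{
  assert(format_name("  Ada ", "Lovelace\t") == "Lovelace, Ada");
}
```

Nothing outside format_name can see or call local::trim, so the surrounding namespace stays uncluttered.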

2 comments:

bitdeveloper said...

I came to understand how local classes work by referring to questionscompiled.com. But do you have a practical example of using a local class?

Josh Kelley said...

Sure. As I mentioned, they're useful as workarounds for local functions - if you have code that needs to be called from more than one place within a function, but is only needed within that function, then you can pull it into a static method of a local class, to avoid cluttering the namespace of anything outside the function.

There are other uses. As I mentioned, Boost.ScopeExit uses it to let you write arbitrary cleanup code. You can place the code wherever it's convenient (for example, code to clean up a resource can be written immediately after the resource is acquired), and it's used as the destructor of an instance of a local class so that it will automatically be called whenever the function exits.

I've also used local classes to implement event handlers in Embarcadero C++Builder. If I want to write a function that configures a visual control to behave in a certain way, then that function needs to be able to attach event handlers to that visual control, and using a local class instance to handle the various events often ends up being the best way to handle this. (The details here involve some of Embarcadero C++Builder's extensions to C++, but the concepts apply to standard C++.)