C++ and Java communities' fanaticism hurting real-world programmers

Outgrowing Macrophobia
by Conrad Weisert, March, 2000

© 2000, Information Disciplines, Inc., Chicago. This article may be circulated freely as long as this copyright notice is included.

Background

For nearly 15 years articles and textbooks on C++ have been discouraging the use of C's preprocessor facilities. In 1986 the principal designer of C++ advised:

"The first rule about . . [macros] is: don't use them if you do not have to. . . almost every macro demonstrates a flaw in either the programming language or in the program." 1

No argument there. Why use any feature of any programming language unless you have to? And, yes, it is indeed C++'s shortcomings (not "flaws") that often encourage enlightened macro adherents to resort to that tool.

For a while respected writers echoed Stroustrup's sentiments in sensible moderation. For example:

"Prefer the compiler to the preprocessor."2
". . . this device is needed only occasionally" 3

Later, however, the tone moved from moderate discouragement to vigorous condemnation. Today the lack of a preprocessor is cited as a strong advantage for Java, and one popular C++ periodical doesn't accept articles containing code examples that use macros.

I've had to remove macro coding not only from freeware giveaways but also from my advanced C++ course handouts, even at the expense of expanding a 1-page example to 3 or 4 pages. I had found myself spending too much class time in defending the macro coding against an occasional shocked student who had been advised by an earlier mentor that all macro coding is bad.

On the other hand, I've found the non-C++/Java OOP audience (i.e., experienced Smalltalk or CLOS programmers) receptive to and appreciative of macro coding in C++ examples used as presentation handouts.

Use, misuse, and blind spots

What the macrophobes are really complaining about is not the use of macro coding but the misuse, specifically for two narrow purposes:

  1. Defining constants, such as
    #define PI 3.1415926

  2. In-line generic functions, such as
    #define ABS(X) ((X) >= 0 ? (X) : -(X))

There's no disagreement there, especially with respect to the latter, where C's heavy use of expression side effects often leads to results that astonish the naive programmer. Both C++ and Java support better and safer ways of doing both.
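A minimal sketch of the astonishment in question, and of the safer alternatives. The names absoluteValue, pi, circleArea, macroSurprise, and functionResult are invented here for illustration; only the ABS macro comes from the text above.

```cpp
#include <cassert>

// The macro from the text: its argument is pasted in wherever X appears.
#define ABS(X) ((X) >= 0 ? (X) : -(X))

// Safer replacements: a typed constant and an inline function template.
const double pi = 3.1415926;

template <class T>
inline T absoluteValue(T x) { return x >= 0 ? x : -x; }

inline double circleArea(double radius) { return pi * radius * radius; }

// Returns what the macro yields for ABS(i++) starting from i == -5.
int macroSurprise() {
    int i = -5;
    int result = ABS(i++);   // expands to ((i++) >= 0 ? (i++) : -(i++))
    assert(i == -3);         // i was incremented twice, not once
    return result;           // 4, not the 5 a naive reader expects
}

// The same call through the function template evaluates i++ exactly once.
int functionResult() {
    int i = -5;
    int result = absoluteValue(i++);
    assert(i == -4);
    return result;           // 5
}
```

The macro version is still well defined, because the conditional operator sequences its first operand before the one it selects, but the answer is off by one and the variable ends up incremented twice.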

But those of us who have enjoyed the experience of exploiting a powerful macro processor know that macro programming has dozens of other uses, most of them far more valuable than defining constants or trivial generic functions.

The gap in understanding arises from the very weakness of the C/C++ macro facility. When it's hard or impossible to do something using a given tool, then programmers whose programming experience has been almost entirely with that tool often don't perceive the value of the thing they can't easily do. Remember COBOL programmers of the 1960s asserting, "But I've never needed to pass a file to a subroutine."

So, we find ourselves in a circular situation:

  1. Macros don't get used much because the preprocessor is weak (C++) or non-existent (Java), and when they are used they're most often misused.

  2. But the preprocessors will never get improved, because most insiders hate their misuse and will oppose any proposal to make them more flexible or more powerful.

Benefits and purposes of macro coding

We exploit macros in order:

Those benefits themselves ought to be beyond controversy. If so, then shouldn't we view any macro coding that provides the most practical way of realizing them as not only acceptable but highly desirable?

General readability and language-extending keywords

Let's face it. C's syntax is a lot harder to read than that of most other programming languages. Readability wasn't a major criterion in the original design of C. Indeed, Kernighan and Ritchie even cited 5 the now-notorious Algol-like keyword macros then, begin, and end as a useful possibility.

Consider a line of code that starts this way:

          unsigned int  aRatherLongName . . . .

Is that a data item declaration or the start of a function? In PL/I or Pascal the first thing the reader sees is a distinctive keyword answering that question. In C the reader's eye must scan the line searching for the telltale but inconspicuous left parenthesis that indicates a function.

Consider a line of code that starts this way:

          aRatherLongName[aSubscriptExpression] . . .

That's probably the start of an assignment to an array element, but it could also be an invocation of a void function through a table of function pointers. For the latter a call verb at the start of the line would have clarified the intent instantly.

Defining language keywords is viewed by macrophobes not only as an insult to the C language but also as an impairment to readability for anyone who isn't acquainted with the macros. Our experience sharply contradicts their opinion for two reasons:

  1. The major audience for whom source code must be readable is the programmers within the organization responsible for maintaining it. If the use of certain language-extending macros is standard in that organization, how can they have trouble?

  2. If language-extending macros are sensibly designed, then even readers who haven't seen them before ought to understand them intuitively. Who could possibly get confused by a simple call in front of a void function invocation?
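A minimal sketch of such a keyword, assuming the simplest possible definition: a call macro that expands to nothing and serves purely as a signpost for the reader. The ledger functions, the actionTable, and runAllActions are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical keyword macro: expands to nothing and exists only to tell
// the reader that the statement is a procedure invocation.
#define call

static std::string journal;                  // records what was invoked

void openLedger()  { journal += "open;";  }
void closeLedger() { journal += "close;"; }

typedef void (*Action)();
static const Action actionTable[] = { openLedger, closeLedger };

// Invokes every entry in the table and returns the resulting journal text.
std::string runAllActions() {
    journal.clear();
    for (int i = 0; i < 2; ++i)
        call actionTable[i]();   // plainly an invocation, not an assignment
    assert(!journal.empty());    // sanity check: the actions really ran
    return journal;
}
```

Because the macro generates no code at all, the compiler sees exactly the statement it would have seen without it; the only effect is on the human reader.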

Class definition

Macros are especially helpful in defining a C++ class, most often a messy, tedious, and error-prone undertaking. A complete class example is too long for this article, so let's just look at a few individual statements or statement fragments. The left column below contains macro coding that can appear inside a class definition, and the right column contains the equivalent (or generated) raw C++ code.

    Macro coding                     Equivalent raw C++ code

    ASSIGNMENT_OPERATOR(rstype)      public: ClassName& operator=(rstype rs)

    ACCESSOR(name, type, expr)       public: type name() const {return expr;}

    CONVENTIONAL_SUBSCRIPT(T,expr)   public:
                                     T  operator[](const int rs) const {return expr;}
                                     T& operator[](const int rs)       {return expr;}

    OSTREAM_OPERATOR                 inline ostream& operator<<
                                         (ostream& ls, const ClassName& rs)
                                         {rs.display(ls); return ls;}

    CLASS_CONSTANT(name,value)       enum {name = value};

    MEMBER_DATA                      protected:

    CONSTRUCTOR                      public: ClassName

Even hardcore macrophobes admit that the code on the left side is not only easier to read for the C++ novice but also much easier to spot on a cluttered listing for the experienced C++ programmer. But the main benefit probably isn't readability. Under which approach is the programmer more likely to make a subtle error, such as forgetting a const or a reference, or defining only one of the two subscript operators? Reusable component libraries are full of such class definitions that worked fine for the application that developed them, but cause such unpleasant surprises for others who try to use them that re-use has gotten a bad reputation.

Another benefit of the macro approach, of course, is ease of change, a fundamental goal of OOP. Whether such changes reflect error correction, new insights, or a project-specific override, it's a great benefit to be able to make them in one place rather than in dozens of scattered and unknown class definitions. In reviewing the above CONVENTIONAL_SUBSCRIPT macro we may note that we could gain some efficiency for large objects if the first of the two operators returned const T& instead of T. If we make the change to the macro definition then every class definition that used that macro will switch to the improved version the next time it is compiled.
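One way a few of those macros might be defined and used is sketched below. The #define lines are guesses reconstructed from the expansions shown in the table, not the actual package, and the Triple class and demoTriple function are invented for illustration.

```cpp
#include <cassert>

// Hypothetical definitions; the article shows the expansions, not the macros.
#define CLASS_CONSTANT(name, value)  enum {name = value};
#define MEMBER_DATA                  protected:
#define ACCESSOR(name, type, expr)   public: type name() const {return expr;}
#define CONVENTIONAL_SUBSCRIPT(T, expr)                  \
    public:                                              \
    T  operator[](const int rs) const {return expr;}     \
    T& operator[](const int rs)       {return expr;}

// Illustrative class (not from the article) built with those macros.
class Triple {
public:
    CLASS_CONSTANT(size, 3)
    Triple() { for (int i = 0; i < size; ++i) element[i] = 0.0; }
    CONVENTIONAL_SUBSCRIPT(double, element[rs])
    ACCESSOR(total, double, element[0] + element[1] + element[2])
    MEMBER_DATA
    double element[size];
};

// Exercises both subscript operators and the generated accessor.
double demoTriple() {
    Triple t;
    t[0] = 1.5;  t[1] = 2.0;  t[2] = 0.5;   // non-const operator[] yields T&
    const Triple& frozen = t;
    assert(frozen[1] == 2.0);               // const operator[] yields a copy
    return frozen.total();
}
```

Under these assumed definitions, switching the first subscript operator to return const T& means editing one line of CONVENTIONAL_SUBSCRIPT; Triple and every other class that uses the macro pick up the change on the next compile.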

Even more powerful are these class-definition macros that would appear outside the class definition:

    DERIVED_RELATIONAL_OPS
    (Assumes ordering (<) and equality (==) are already defined.)

        inline bool operator!=(const ClassName& ls, const ClassName& rs)
                              {return !(ls == rs);}
        inline bool operator> (const ClassName& ls, const ClassName& rs)
                              {return  rs < ls;}
        inline bool operator<=(const ClassName& ls, const ClassName& rs)
                              {return !(ls > rs);}
        inline bool operator>=(const ClassName& ls, const ClassName& rs)
                              {return rs <= ls;}

    ADDITION_OPS
    (Assumes compound assignment operators are already defined, as per Scott Meyers 6.)

        inline
        ClassName operator+(const ClassName& ls, const ClassName& rs)
                                         {return ClassName(ls) += rs;}
        inline
        ClassName operator-(const ClassName& ls, const ClassName& rs)
                                         {return ClassName(ls) -= rs;}
        template<class PureNumber> inline
        ClassName operator*(const ClassName& ls, PureNumber rs)
                                         {return ClassName(ls) *= rs;}
        template<class PureNumber> inline
        ClassName operator*(PureNumber ls, const ClassName& rs)
                                         {return ClassName(rs) *= ls;}
        template<class PureNumber> inline
        ClassName operator/(const ClassName& ls, PureNumber rs)
                                         {return ClassName(ls) /= rs;}
        template<class PureNumber> inline
        PureNumber operator/(const ClassName& ls, const ClassName& rs)
                                     {/* How can we do this one? */}

The above macros enforce the discipline that whenever we define a particular operator we're obliged, for the sake of the client programmer's reasonable expectations, to define certain other operators as well.
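A minimal sketch of how such macros could be packaged, assuming a variant that takes the class name as a macro argument instead of relying on a ClassName symbol. Only the relational operators and the two additive operators are shown; the Days class and derivedOperatorsAgree function are invented for illustration.

```cpp
#include <cassert>

// Hypothetical parameterized forms of the two macros described above.
#define DERIVED_RELATIONAL_OPS(ClassName)                                \
    inline bool operator!=(const ClassName& ls, const ClassName& rs)     \
                          {return !(ls == rs);}                          \
    inline bool operator> (const ClassName& ls, const ClassName& rs)     \
                          {return  rs < ls;}                             \
    inline bool operator<=(const ClassName& ls, const ClassName& rs)     \
                          {return !(ls > rs);}                           \
    inline bool operator>=(const ClassName& ls, const ClassName& rs)     \
                          {return rs <= ls;}

#define ADDITION_OPS(ClassName)                                          \
    inline ClassName operator+(const ClassName& ls, const ClassName& rs) \
                          {return ClassName(ls) += rs;}                  \
    inline ClassName operator-(const ClassName& ls, const ClassName& rs) \
                          {return ClassName(ls) -= rs;}

// The class author writes only <, ==, += and -= by hand.
class Days {
    long count;
public:
    explicit Days(long n) : count(n) {}
    long value() const { return count; }
    bool operator< (const Days& rs) const { return count <  rs.count; }
    bool operator==(const Days& rs) const { return count == rs.count; }
    Days& operator+=(const Days& rs) { count += rs.count; return *this; }
    Days& operator-=(const Days& rs) { count -= rs.count; return *this; }
};

DERIVED_RELATIONAL_OPS(Days)   // supplies !=, >, <=, >=
ADDITION_OPS(Days)             // supplies binary + and -

// Checks that the generated operators behave consistently.
bool derivedOperatorsAgree() {
    const Days two(2), five(5);
    assert((two + five).value() == 7);
    assert((five - two).value() == 3);
    return two != five && five > two && two <= five && five >= five;
}
```

The client programmer gets six operators from two lines, and none of them can drift out of step with the four hand-written ones.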

Nearly a decade ago I proposed7 a collection of class-definition macros. Since we've learned a lot since then, I wouldn't recommend those exact class-definition macros today, but I continue to use and strongly recommend some macro package for that purpose. (We give out our current set to participants in IDI's half-day seminar Macros to Simplify and Standardize C++ Class Definition).

Breaking the bonds

We've seen so far that even C's primitive preprocessor offers a valuable tool for organizing software. A less restrictive tool would expand those benefits many times over.

As an example of a more powerful macro preprocessor look at PL/I. Some organizations that fully exploited the PL/I preprocessor's capabilities reported order-of-magnitude productivity improvements in coding, testing, and maintenance compared with Fortran or COBOL programming, satisfying James Martin's criterion for a fourth-generation language!

Taking the PL/I preprocessor as our inspiration, we can imagine C++ examples like the following:

  1. Testing a class-template parameter:

             template<class T>
             class  {
                T value;
                .
                T operator%=(T& rs) {value =
                    #IntegerType(T) #? value % rs #: fmodl(value,rs.value);
                    return *this;};
                .
             };

  2. Saving and retrieving deferred source code
    
             #ReportDef(inventory, 54)
                #Heading("Warehouse Inventory as of " + #Today)
                #Column(15, "Product Number"   , item.ID())
                #Column(40, "Description"      , item.description())
                #Column(8,  "Quantity on hand" , item.onHandQ())
                #Column(8,  "Quantity on order", item.onOrdQ())
                #Footing(#Pageno)
             #EndReportDef
                .
                .
             invFile >> item;
             while (!invFile.eof())
                {#ReportPrintDetail
                 invFile >> item;
                }
           
    could generate:
      invFile >> item;
      while (!invFile.eof())
         {const  int pageLen = 54;
          // static, so that the counters survive from one iteration to the next
          static int pageNo  = 0,
                     lineNo  = pageLen;
          if (++lineNo > pageLen)
            {if (pageNo > 0)
                 inventoryRpt << inventoryRpt.footing() << endpage;
             ++pageNo; lineNo = 1;
             inventoryRpt << inventoryRpt.heading();
             inventoryRpt << setw(15) << "  Product "
                          << setw(40) << ""
                          << setw(8)  << "Quantity"
                          << setw(8)  << "Quantity" << endl
                          << setw(15) << "  Number  "
                          << setw(40) << "               Description"
                          << setw(8)  << " on Hand"
                          << setw(8)  << " on order"<< endl;
            }
          inventoryRpt << setw(15) << item.ID()
                       << setw(40) << item.description()
                       << setw(8)  << item.onHandQ()
                       << setw(8)  << item.onOrdQ() << endl;
          invFile >> item;
         }

Note that we don't claim that the above are the preferred ways or even good ways of accomplishing those results. They simply illustrate techniques of program organization that good programmers have been using for decades in languages that support a powerful macro facility.

Where to stick it

Since a preprocessor has no access to symbol tables or other information collected by the compiler, there's no reason to bind it to the compiler. If the preprocessor were part of an editor then the compiler wouldn't need to know anything about it. We could then enjoy full macro capabilities in Java programming without the blessing of the JDK, any JVM, the Java community, or any compiler vendor.

The main drawback to packaging a macro processor in an editor arises with compiler error diagnostics. The source code the programmer is looking at may not explicitly contain the offending code, and the message may therefore be out-of-context and cryptic. (Actually it's hard to imagine C++ error diagnostics getting much more cryptic than they already are from one leading compiler. The JDK does better.) Still, that's a minor annoyance at worst, especially if the editor can display expanded code. As in the case of all generators, the maintenance programmer should resist the temptation to modify generated code, and should work exclusively with the original source text.

Two Conclusions

  1. Weak as they are, the C/C++ preprocessor facilities deserve to be used more often as a tool for organizing flexible and highly maintainable programs.

  2. We desperately need a more powerful macro processor for both C++ and Java, perhaps implemented in an editor separate from the compilers and other development tools. Note that a Java preprocessor could also expand operators into function calls, as suggested in Conventions for Arithmetic Operations in Java.

I would be pleased to hear from readers interested in contributing to any aspect of the C++/Java macro issue.

Conrad Weisert


1 Bjarne Stroustrup: The C++ Programming Language, 1986, Addison-Wesley, ISBN 0-201-12078-X, p. 129.
2 Scott Meyers: Effective C++, 1992, Addison-Wesley, ISBN 0-201-56364-9, p. 10.
3 Frank Friedman and Elliot Koffman: Problem Solving and Abstraction Using C++, 1997, Addison-Wesley, p. 328.
4 Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides: Design Patterns, 1995, Addison-Wesley, ISBN 0-201-63361-2.
5 Brian Kernighan and Dennis Ritchie: The C Programming Language, 1978, Prentice Hall, ISBN 0-13-110164-3, p. 87.
6 Scott Meyers: More Effective C++, 1996, Addison-Wesley, ISBN 0-201-63371-X, item #22.
7 Conrad Weisert: "Macros for Defining C++ Classes", ACM SIGPLAN Notices, November, 1992.
