Pet programming peeves . . .

Stamp Out Multiple Object Representations
© Conrad Weisert, Information Disciplines, Inc., Chicago
1 August 2009

NOTE: This document may be circulated or quoted from freely, as long as the copyright credit is included.


Here's another strange antipattern that keeps turning up, not only in students' exercises and application programs but also in articles and course handouts from people we expect to know better. I haven't seen it in a textbook yet, but it wouldn't surprise me to find it there.

A representation switch

Consider this class definition fragment (see note 1) with two private member data items:

 class Angle {
     double value;	
     short  unit;       // 0:  value is in radians; 
                        // 1:  value is in degrees
  public:
    .
    .
  };

The designer/programmer probably felt that he was providing additional flexibility for users of his Angle class. He may also have wanted to avoid an argument over whether the internal representation of an Angle object should be in radians or in degrees. "This will please everyone," he may have congratulated himself.

Nonsense! Such an atrocious practice will enormously complicate both the methods of the Angle class and most of the programs that use it, and will invite many kinds of bugs.

One of the major advantages of object-oriented programming (OOP) is that it simplifies programs. But just imagine what a method has to do in order to add two Angles or to compare them. Imagine the confusion that can result from setting the wrong unit.
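To make that concrete, here is a minimal sketch of what addition and comparison might look like with the unit switch. The class name SwitchedAngle, the constant PI, the helper asRadians, the raw accessors, and the demo function are my own assumptions for illustration. None of them appear in the fragment above.

```cpp
#include <cassert>
#include <cmath>

const double PI = 3.14159265358979323846;   // assumed constant, not in the fragment

class SwitchedAngle {                       // hypothetical completion of the fragment
    double value;
    short  unit;                            // 0: radians, 1: degrees

    // Every method needs this branch, or an inline copy of it.
    double asRadians() const {
        return unit == 0 ? value : value * PI / 180.0;
    }
  public:
    SwitchedAngle(double v, short u) : value(v), unit(u) {}
    double rawValue() const { return value; }
    short  rawUnit()  const { return unit; }

    // Addition must normalize both operands, then pick a unit for the result
    // (here, arbitrarily, the left operand's unit).
    SwitchedAngle operator+(const SwitchedAngle& rhs) const {
        double sum = asRadians() + rhs.asRadians();
        return unit == 0 ? SwitchedAngle(sum, 0)
                         : SwitchedAngle(sum * 180.0 / PI, 1);
    }
    // Comparing the raw fields would be wrong: 180 degrees equals pi radians.
    bool operator==(const SwitchedAngle& rhs) const {
        return std::fabs(asRadians() - rhs.asRadians()) < 1e-12;
    }
};

// One wrong unit code and the object silently means something else.
void switchedAngleDemo() {
    SwitchedAngle halfTurn(180.0, 1);
    SwitchedAngle mistake(180.0, 0);        // intended degrees, tagged as radians
    assert(!(halfTurn == mistake));
}
```

Even in this small sketch, two of the three methods branch on the unit, and the arbitrary choice of result unit in operator+ is exactly the kind of decision that users will later trip over.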

But wait! It gets worse.

Delegating representation

Having suffered rejection in a design review, our class designer next falls back on an alternative approach, which he claims is really object oriented: inheritance!

 class AngleBase { // Abstract class
     double value;	

  public:
     virtual  . . .  
      .
      .
  };
  class RadianAngle : public AngleBase {
      .
      .
  };
  class DegreeAngle : public AngleBase {
      .
      .
  };

Well, this "solution" is even worse. The two concrete classes have identical behavior and differ only in the internal representation, which is supposed to be hidden from users.

What's the point of encapsulating knowledge of the internal representation, if every program that instantiates a RadianAngle object knows that the representation is in radians? They might as well be coding in Fortran.

"Not so," our zealous designer counters. "The user can still declare an AngleBase pointer or reference, and then exploit polymorphism to operate on objects of either kind."

At this point, we need to send our designer/programmer back to school. Polymorphic function invocation is intended to simplify doing different things to an object depending on its type, not doing identical things to objects that are distinguished only by their internal representation.
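A rough sketch of the hierarchy filled in shows the problem. The virtual accessors radians() and degrees(), the protected access to value, the constant DEG_PER_RAD, and the functions sumInDegrees and hierarchyDemo are all assumptions of mine. The fragment above elides those details.

```cpp
#include <cassert>
#include <cmath>

const double DEG_PER_RAD = 180.0 / 3.14159265358979323846;  // assumed constant

class AngleBase {                           // abstract class, as in the fragment
  protected:
    double value;                           // assumed protected so subclasses can read it
    explicit AngleBase(double v) : value(v) {}
  public:
    virtual ~AngleBase() {}
    virtual double radians() const = 0;     // hypothetical virtual accessors
    virtual double degrees() const = 0;
};

class RadianAngle : public AngleBase {
  public:
    explicit RadianAngle(double r) : AngleBase(r) {}
    double radians() const { return value; }
    double degrees() const { return value * DEG_PER_RAD; }
};

class DegreeAngle : public AngleBase {
  public:
    explicit DegreeAngle(double d) : AngleBase(d) {}
    double radians() const { return value / DEG_PER_RAD; }
    double degrees() const { return value; }
};

// Client code written against the base class gets the same answer either way,
// so the virtual dispatch distinguishes nothing the caller cares about.
double sumInDegrees(const AngleBase& a, const AngleBase& b) {
    return a.degrees() + b.degrees();
}

void hierarchyDemo() {
    RadianAngle r(3.14159265358979323846);
    DegreeAngle d(180.0);
    assert(std::fabs(sumInDegrees(r, d) - 360.0) < 1e-9);
}
```

The two subclasses answer every question identically. All the hierarchy has added is a virtual call per access and a unit name in every program that constructs an angle.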

The sensible solution

Most readers of this web site already know this.

We teach students to pick one internal representation for any object, preferably the simplest one. If some users need an alternative representation, we can provide accessor methods for them:

 class Angle {
     double value;	//  (Radians)

  public:
     double radians() const 
            {return value;}
     double degrees() const 
            {return value * RADIANS_TO_DEGREES;}
      .
      .
    };

One of the inline accessor functions returns the member data item; the other one returns that item multiplied by a constant. Users of the class should have no idea which. We might switch from one internal representation to another, e.g. for performance tuning, confident that the user programs will function exactly as they did before.
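Filled out a little further, the single-representation class might look like the sketch below. The definition of RADIANS_TO_DEGREES, the named constructors fromRadians and fromDegrees, the two operators, and the demo function are my assumptions, since the fragment leaves them out.

```cpp
#include <cassert>
#include <cmath>

const double RADIANS_TO_DEGREES = 180.0 / 3.14159265358979323846;  // assumed definition

class Angle {
    double value;                           // (Radians)
    explicit Angle(double radians) : value(radians) {}
  public:
    // Hypothetical named constructors: the caller states the unit once, at creation.
    static Angle fromRadians(double r) { return Angle(r); }
    static Angle fromDegrees(double d) { return Angle(d / RADIANS_TO_DEGREES); }

    double radians() const { return value; }
    double degrees() const { return value * RADIANS_TO_DEGREES; }

    // No unit switch to test: arithmetic and comparison use the one field directly.
    Angle operator+(const Angle& rhs) const { return Angle(value + rhs.value); }
    bool  operator<(const Angle& rhs) const { return value < rhs.value; }
};

// User code names a unit only at the boundary, never in the arithmetic.
void angleDemo() {
    Angle right = Angle::fromDegrees(90.0);
    Angle sum   = right + Angle::fromRadians(right.radians());
    assert(std::fabs(sum.degrees() - 180.0) < 1e-9);
    assert(right < sum);
}
```

If we later decided to store degrees instead, only the private member, the two named constructors, and the two accessors would change. angleDemo would compile and behave as before.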

The rules

  1. Never support more than one internal representation (private member data) for objects of a class.

  2. Never derive subclasses based solely on different internal representations, where the rule for converting one representation to another is fixed (see note 2).

Clarifying addendum, August 6

A couple of colleagues, including Konstantin Läufer at Loyola, pointed out that one may want a specialized version of a container (Java collection) class that's identical in behavior to a general version but uses a different internal data representation, e.g. for a sparse array. Instead of using a separate class and associated iterator, one might design a single class with an internal switch to choose the representation.

I know of no reasonable exceptions to either rule. If you think of one, let me know.


Note 1: The examples are in C++; Java and C# equivalents are obvious.
Note 2: It would be reasonable, for example, to define derived Money classes for ChineseMoney and USMoney, since conversion between them varies according to a volatile exchange rate.


Last modified July 29, 2009