Secondary issues of string handling can distract from object-oriented design

Objects before Strings
Part two of a three-part discussion

Conrad Weisert, August 15, 2003
©2003 Information Disciplines, Inc.

This article may be freely circulated, as long as the copyright credit is included.


Background

We're continuing to follow up on Koenig & Moo's excellent advice on character-string representation and manipulation in this month's C/C++ Users Journal. In part one we examined how a programmer can use old-fashioned C character strings (arrays of char) in a disciplined and relatively safe way.

An example

In pointing out the superiority of a C++ string class over C's crude character manipulation facilities, the authors showed this example:
Pure C version:
     struct Person {
       char name[30];
       char address[50];
     };

C++ / STL version:
     struct Person {
       string  name;
       string  address;
     };

Which of those is "better"? Which is appropriate in an object-oriented program?

Of course, neither is appropriate. The pure C version may be worse than the C++ / STL version, but both versions seriously violate the letter and spirit of OOP (see note 1).

In the first version an address is a string of maximum length 49.
In the second version an address is a string of arbitrary length.
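To make the length difference concrete, here is a minimal sketch. The names CPerson, CppPerson, and copy_field are ours, introduced only so the two versions can coexist in one file; they don't appear in Koenig & Moo's example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The two versions of the example, renamed (hypothetically) so both can coexist.
struct CPerson {
    char name[30];
    char address[50];   // room for 49 characters plus the terminating '\0'
};

struct CppPerson {
    std::string name;
    std::string address;  // grows as needed
};

// Hypothetical helper: copy src into a fixed-size field, truncating
// silently if it doesn't fit and always leaving the field terminated.
template <std::size_t N>
void copy_field(char (&dest)[N], const char* src) {
    assert(src != nullptr);
    std::strncpy(dest, src, N - 1);  // copies at most N-1 characters
    dest[N - 1] = '\0';              // strncpy doesn't terminate on truncation
}
```

Copying a 60-character address into CPerson::address with copy_field keeps only the first 49 characters, while CppPerson::address holds all 60. Neither version knows anything about what an address is.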
Some obvious questions:

Every systems analyst, programmer, and data administrator knows that an address has structure. That structure is an obvious candidate for standardization within an organization, whether or not the organization follows an object-oriented approach to software design and development. Designing a class or struct for Address is not only desirable, it's mandatory. Failure to do so is a serious design flaw.

So, we might replace the earlier examples by this:
    struct Person {         
        string  name;
        Address address;
     }; 

but that's not all.

Designing the Address class or structure

We're so well acquainted with mailing addresses that it's tempting to code without much reflection something like this:
    struct Address {         
        string  street;
        string  city;
        string  state;
        string  zip;
     }; 

but unfortunately, that structure is loaded with problems. For one thing it's limited to one particular kind of mailing address, a U.S. address that isn't a post-office box. Isn't this a perfect example of an is-a hierarchy?
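One way such an is-a hierarchy might look is sketched below. It's an illustration under our own assumptions, not a recommended design: the class names Address, USA_AddressBase, USA_StreetAddress, and USA_PostOfficeBox, the mailingLabel function, and the two-letter state check are all hypothetical. The fields remain plain strings only to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical abstract base: what any kind of mailing address can do.
class Address {
public:
    virtual ~Address() = default;
    // Text suitable for printing on an envelope.
    virtual std::string mailingLabel() const = 0;
};

// Hypothetical intermediate class: what all U.S. addresses share.
class USA_AddressBase : public Address {
public:
    std::string mailingLabel() const override {
        return deliveryLine() + "\n" + city_ + ", " + state_ + " " + zip_;
    }
protected:
    USA_AddressBase(std::string city, std::string state, std::string zip)
        : city_(std::move(city)), state_(std::move(state)), zip_(std::move(zip)) {
        assert(state_.size() == 2);  // assumption: two-letter state abbreviation
    }
private:
    // Each kind of U.S. address supplies its own first line.
    virtual std::string deliveryLine() const = 0;
    std::string city_;
    std::string state_;
    std::string zip_;
};

// A U.S. address delivered to a street location.
class USA_StreetAddress : public USA_AddressBase {
public:
    USA_StreetAddress(std::string street, std::string city,
                      std::string state, std::string zip)
        : USA_AddressBase(std::move(city), std::move(state), std::move(zip)),
          street_(std::move(street)) {}
private:
    std::string deliveryLine() const override { return street_; }
    std::string street_;
};

// A U.S. address delivered to a post-office box.
class USA_PostOfficeBox : public USA_AddressBase {
public:
    USA_PostOfficeBox(std::string box, std::string city,
                      std::string state, std::string zip)
        : USA_AddressBase(std::move(city), std::move(state), std::move(zip)),
          box_(std::move(box)) {}
private:
    std::string deliveryLine() const override { return "P.O. Box " + box_; }
    std::string box_;
};
```

Code that only needs to print a label can take a const Address& and never care which kind of address it has. A foreign address would be another class derived from the hypothetical Address base.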

In addition, we see these problems:

We'll refrain from suggesting specific solutions here, since there are many valid ones. But we have to do something to resolve those issues.

Is that all?

No. We agree that addresses have structure, but so do people's names. Representing a person's name by a character string (or by two or three character strings) is naive, inflexible, inconsistent, and error-prone. Taking a top-down design approach, we can rewrite the original example:
    struct Person {         
        PersonName    name;      
        USA_Address   address;     
     }; 

You can work out the definitions of the two member classes as a not-so-trivial exercise. Once that's done, your organization will find them useful in many applications.
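As a possible starting point for the PersonName half of that exercise, here is a deliberately incomplete sketch. The member functions displayOrder and sortOrder, the three-part layout, and the non-empty family name check are our assumptions; they reflect one Western naming convention and ignore titles, suffixes, and names that don't fit that pattern.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical sketch: a person's name as structured data, not one string.
class PersonName {
public:
    PersonName(std::string given, std::string family, std::string middle = "")
        : given_(std::move(given)),
          middle_(std::move(middle)),
          family_(std::move(family)) {
        assert(!family_.empty());  // simplifying assumption, not a universal rule
    }

    // "Given Middle Family", omitting the middle part when there is none.
    std::string displayOrder() const {
        return middle_.empty() ? given_ + " " + family_
                               : given_ + " " + middle_ + " " + family_;
    }

    // "Family, Given", the form used in alphabetical listings.
    std::string sortOrder() const { return family_ + ", " + given_; }

    // Orders by family name, then by given name.
    friend bool operator<(const PersonName& a, const PersonName& b) {
        if (a.family_ != b.family_) return a.family_ < b.family_;
        return a.given_ < b.given_;
    }

private:
    std::string given_;
    std::string middle_;
    std::string family_;
};
```

Even this small class puts the formatting and ordering rules in one place instead of scattering them through every program that handles a name.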

There remain serious questions about the anemic Person structure itself. Those issues, however, are well beyond the scope of either this article or Koenig & Moo's original, and we won't go into them here.

Coming soon

In the next article, we'll look at some shortcomings of the standard C++ string class that render it unsuitable for representing some text fields, and discuss practical alternatives.


1 -- Of course Koenig & Moo's purpose was limited to the details of string manipulation, so it wouldn't be fair to condemn the authors for violating higher-level design criteria. Still, since the example may mislead less experienced readers, it shouldn't stand unchallenged.

2 -- IDI's reusable component library contains the tables relating zip-code ranges to state names and state abbreviations, as well as BASIC subroutines to search them. Call or E-mail us if you're interested.


Last modified August 16, 2003