Avoiding messy traditional numeric representations . . .

Mixed-Unit Numerics Invite Errors
(and lots of extra work)

Conrad Weisert
May 3, 2014

NOTE: This article may be reproduced and circulated freely, as long as the copyright credit is included.


Mixed-unit number traditions

The English system of measurement, still used in the United States, often uses strange mixed units. A few mixed-unit systems for time are also still in widespread use. This tradition complicates arithmetic and drives schoolchildren crazy. Here are some examples:

Numeric type                        Common mixed unit representation
Distance (also Area and Volume)     miles, feet, inches
Weight                              pounds, ounces
Plane angle                         degrees, minutes, seconds
Liquid volume                       gallons, quarts, pints
British money (obs.)                pounds, shillings, pence
Time of day                         hours, minutes, seconds

We also have compound units derived from the above, such as speed and momentum.

Messy as all those are, an even clumsier example, used worldwide, is Date, specified in years, months, and days. Different calendar months contain different numbers of days, determined by ancient tradition.

Obviously it's awkward and error-prone to perform arithmetic on such quantities. The answer not only has to be valid but often must also be normalized, so that, for example, a number of minutes [1] is always between 0 and 59. As with simple (non-mixed) units, we may also want negative fractional results to have positive denominators and all fractions to be reduced so that they have no common "cancelable" factor.

Yes, it's messy and error-prone, but in computing it's also rarely if ever necessary.
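To make the contrast concrete, here is a minimal C++ sketch, not taken from any real library. The names MixedDuration, isNormalized, addMixed, and addSimple are hypothetical, and the sketch assumes non-negative durations that are kept normalized. It adds two durations first in hours-minutes-seconds form and then in a single simple unit (seconds).

```cpp
#include <cassert>

// Hypothetical mixed-unit duration; assumed non-negative and normalized.
struct MixedDuration {
    long hours;
    int  minutes;   // 0..59
    int  seconds;   // 0..59
};

// True if the minutes and seconds fields are within their normalized ranges.
bool isNormalized(const MixedDuration& t) {
    return t.hours >= 0 && t.minutes >= 0 && t.minutes < 60
                        && t.seconds >= 0 && t.seconds < 60;
}

// Mixed-unit addition: every field needs its own carry logic.
MixedDuration addMixed(const MixedDuration& a, const MixedDuration& b) {
    assert(isNormalized(a) && isNormalized(b));
    MixedDuration r;
    r.seconds = a.seconds + b.seconds;
    r.minutes = a.minutes + b.minutes + r.seconds / 60;  // carry into minutes
    r.seconds %= 60;
    r.hours   = a.hours + b.hours + r.minutes / 60;      // carry into hours
    r.minutes %= 60;
    return r;
}

// Simple-unit addition: nothing to carry, nothing to normalize.
long addSimple(long aSeconds, long bSeconds) {
    return aSeconds + bSeconds;
}
```

Even this small case needs two carries and a precondition check. Subtraction, negative values, and units with irregular ratios would each add more special cases to the mixed-unit version and none to the simple one.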

Internal and external numeric representations

In object-oriented technology, the internal (private) representation of numeric data should always be in simple units. (That was true of internal representations before we had object-oriented technology, too, but harder to enforce.) The metric system suggests possible preferred representations, but we can use anything that works. Of course, if an international or industry standard exists for a type of data, we should try to follow it.

The result of this practice is simplicity and far fewer opportunities for error. A recent well-reviewed book on testing strategy showed a subroutine to add 1 to a date. The date was in year-month-day form, beloved by old-fashioned COBOL programmers.

Now that subroutine was almost two pages long! It understood and implemented the rules for February in leap-years. It tested for and reported a number of error conditions.

The subject was: "How do we effectively test that subroutine?" The answer should have been: "We don't! We throw it in the trash and start over."

So how do we add one to a date d? With a reasonable date representation, such as the one used internally by most spreadsheet processors, those two pages of mixed-unit program logic reduce in C++ to just this:

     ++d;

There are no validity tests or error messages, because every date has a successor. Computing the number of days between two dates is similarly trivial (d1 - d2). And once we've created a date, we don't have to keep re-checking it for validity every time it's involved in any process.
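One possible shape for such a class is sketched below. It is an illustration only, not the code of any particular library or spreadsheet. The class layout, the member name dayNumber_, the helper daysFromCivil, and the choice of January 1, 1970 as day zero are all assumptions made for the example. The helper uses a well-known days-from-civil computation for the Gregorian calendar, and the constructor's range check is deliberately rough.

```cpp
#include <cassert>

// Hypothetical Date class: the only data member is a count of days
// from an arbitrary epoch (assumed here to be January 1, 1970).
class Date {
public:
    explicit Date(long dayNumber) : dayNumber_(dayNumber) {}

    // The calendar rules live here, in one place, and run once per object.
    Date(int year, int month, int day)
        : dayNumber_(daysFromCivil(year, month, day)) {
        // Rough range check only; a real class would validate fully.
        assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    }

    // Every date has a successor, so there is nothing to test or report.
    Date& operator++() { ++dayNumber_; return *this; }

    // Number of days between two dates.
    friend long operator-(const Date& a, const Date& b) {
        return a.dayNumber_ - b.dayNumber_;
    }
    friend bool operator==(const Date& a, const Date& b) {
        return a.dayNumber_ == b.dayNumber_;
    }

private:
    // Gregorian year-month-day to days since January 1, 1970.
    static long daysFromCivil(int y, int m, int d) {
        y -= (m <= 2);                                   // year starts in March
        const long era = (y >= 0 ? y : y - 399) / 400;   // 400-year cycle
        const long yoe = y - era * 400;                  // year of era, 0..399
        const long doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
        const long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
        return era * 146097 + doe - 719468;
    }

    long dayNumber_;
};
```

With this sketch, incrementing Date(2012, 2, 28) yields a value equal to Date(2012, 2, 29), and the leap-year rule appears nowhere in operator++.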

Localizing the messy part

Of course, users of our applications, especially in America, will want to continue using their familiar traditional numeric units. That's all right. Only a few parts of an application system know about and manipulate those representations:

The latter is more challenging, since there's no built-in way of distinguishing different units in a constructor. Sometimes the number and type of parameters is sufficient:

     Distance (double meters);  
     Distance (long feet, short inches); 

But we can live without every possible set of constructor parameters. The class definition can provide global conversion constants, and leave it up to the user programmer to understand what the standard constructor expects.
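A minimal sketch of that arrangement follows, assuming meters as the internal unit. The constant names kMetersPerInch, kMetersPerFoot, and kMetersPerMile, the accessor meters(), and the sign check in the second constructor are hypothetical choices made for illustration; only the two constructor signatures come from the declarations above.

```cpp
#include <cassert>

// Global conversion constants (exact by international definition).
constexpr double kMetersPerInch = 0.0254;
constexpr double kMetersPerFoot = 0.3048;
constexpr double kMetersPerMile = 1609.344;

// Hypothetical Distance class with a single simple internal unit.
class Distance {
public:
    // The standard constructor expects meters.
    explicit Distance(double meters) : meters_(meters) {}

    // Overload distinguished by the number and type of its parameters.
    Distance(long feet, short inches)
        : meters_(feet * kMetersPerFoot + inches * kMetersPerInch) {
        // Assumption: mixed signs are almost certainly a caller error.
        assert((feet >= 0 && inches >= 0) || (feet <= 0 && inches <= 0));
    }

    double meters() const { return meters_; }

    // Arithmetic needs no carrying or normalizing.
    friend Distance operator+(const Distance& a, const Distance& b) {
        return Distance(a.meters_ + b.meters_);
    }

private:
    double meters_;
};
```

A user programmer with a distance in miles needs no special constructor; Distance(2.5 * kMetersPerMile) does the job, at the cost of having to know that the standard constructor takes meters.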


[1] In a TimeOfDay but not in an elapsedTime. See the Point-Extent pattern for further explanation.

Last modified June 23, 2014
