Last updated April, 2017s

Errata in Collins: Data Structures and
the Java Collections Framework
(3rd edition)

TO: Students

This page lists issues that may trouble you when you read the assigned chapters in the Collins textbook. They include:

They do not, however, include some other judgmental issues where the textbook may disagree with expert opinion or common practice. Those can be discussed in class or cited in the separate comments document.

Students who discover further errors in the textbook or in any other class material are encouraged to bring them to the instructor's attention. The textbook author's own errata list is also helpful.


Chapter 0

Page 1:   "Basically a class consists of varliables, called fields, together with functions, called methods, that operate on those fields."
Actually, we shall see that a Java class can be either:
  1. a template for instantiating objects, which are the things that consist of fields and methods, or
  2. an arbitrary collection of independent functions (not usually called methods); This kind of class, which does not fit the so-called "object paradigm" is sometimes called a "pseudo class".
Page 3:   The table entries for short and byte specify an incorrect range. This is a Java standard.

Page 5:   "The string class has a method that returns a copy of a specified substring . . ."

It actually returns a reference to a copy . . . The statement in the text would be understood in casual discussion among experienced programmers, but since one of Java's special difficulties is understanding reference semantics, we should try to be rigorous in the early part of our course.

The repetition in the @return specification on the next page (6) is acceptable because references are the only things the subroutine could return.

Page 12:   "In this program and in all subsequent programs in this book . . . a new instance of the class is created with a call to the class's default constructor . . . and this new instance invokes its run method."

The test-drivers use that peculiar technique here and throughout the book. The main function instantiates an instance (object) of its own pseudo-class! Instantiating a pseudo-class (having no state or member data) is sufficiently unusual that it requires an explanation. Here it serves no purpose except to allow a call to the run() function, which is really a class (static) function, not an instance function. The self-contained function, run(), neither uses nor affects anything in its object, which obviously serves no purpose at all.

The programmer, who may or may not be the author of the book, might have tried to invoke run() from main, getting a compiler diagnostic about the need for an object in order to invoke an instance method. The simpler and more conceptually sound response would have been simply to make run() a class method by specifying the omitted keyword static. Alternatively, if the test driver program is short, you can forget about run() and just embed its logic in main.

Page 22:   Using floating-point (double) representation for amounts of money introduces unnecessary complications and constitutes a rejection of the object paradigm. See the entry for page 32 for a better approach. Meanwhile, for this course, you can ignore the messy DecimalFormat class.

Chapter 1

Page 29   Strictly speaking the interface defines a closedFigure. Otherwise it would have no center.

Page 32:   There are three issues with the so-called MONEY object:

  1. It's misnamed. A reasonable experienced programmer reading source code would first assume that it's an amount of money. MONEY_FORMAT would be a more appropriate name.
  2. It's misplaced. The way we choose to represent amounts of money externally is hardly an attribute of an Employee. This constant would surely also apply to Product price, Account balance, and dozens of other amounts of money having nothing to do with an employee.
  3. The issue wouldn't arise here if the example had used a Money class instead of double floating-point for representing grossPay. Then the formatting constants could be associated with the Money class, and therefore available to all users of that class. (Floating-point dollars may lead to rounding and truncation problems that upset accountants when used with amounts of money in this way.)

Don't worry if all that isn't absolutely clear at first. We'll see throughout the course how the object paradigm can help to simplify our program designs.

Page 33:   Defining name as a general String with neither constraints nor comments specifying its format is naïve and risky. The author might claim that the choice of representation for a name is secondary in a book about data structures, but then we wonder why he was so concerned with specific formatting of Money objects. This problem arises in other examples throughout the book.

Page 35:   When you name a class with a noun denoting an object in the application domain, such as Company, readers of your program will expect to be able to instantiate objects of that type. Here, however, Company is just a test-driver pseudo-class for validating the Employee class, and it has no relationship at all to a Company.

Page 38:   Defining HourlyEmployee as derived from FullTimeEmployee seriously violates the notion of an is-a class hierarchy, central to the object paradigm. (See the Liskov sustitution principle, which the textbook quotes at the bottom of page 44).

A more natural and more manageable hierarchy would define Employee as a base class instead of an interface (page 32), and then derive both the HourlyEmployee and FullTimeEmployee classes from it.

Page 44:   The HourlyCompany class makes little, if any, sense. What, exactly, is an HourlyCompany? Why does this class know about the format of an input file that has little relationship to either employees or companies?

Chapter 2

Page 60: protected final static int SPEED_LIMIT = 65.0;

Collins is right, of course, in defining this constant once instated of spreading a hard coded value through the program. However, the actual example raises more issues than it settles:

  1. What are the units? An application that might be used outside the United States, one of a tiny and shrinking minority of nations clinging to old-fashioned English units, should either stick to international standards or provide clear commentary explaining the deviation.
  2. The extraneous (for an int) ".0" makes the reader wonder whether the programmer had intended the constant to be floating-point.
  3. Ideally, the object-oriented program should be using a Speed or Velocity class instead of a primitive type for a quantity that has a unit of measure and exhibits behavior. But since that's beyond the scope of this course, we won't insist.

At a minimum, a thoughtful programmer would drop the zero and append a clarifying comment:
  protected final static int SPEED_LIMIT = 65; // Miles per hour

Page 63 & 65   These tests of the toString() method are ill-advised and may actually impede original development or later maintenance. The exact format of the returned string (shown on page 34) is not part of the program specification. If we write the test before developing the method, as the author recommends, we unnecessarily constrain the programmer who develops the function. In any case, if a maintenance programmer should later make an "improvement" to the toString() function, the whole set of tests will fail and have to be recoded.

Unit tests are for validating the specified behavior of a program component, not incidental details of the way it may be implemented.

Page 71:   In the separate comments document we noted issues with module cohesion, and cited two examples. This example (isLeapYear) is so extreme, however, that we have to put it in the solid error category. Obviously the leap-year-checking logic is all mixed up with the input scanning, but that's not the worst:

Page 91   "In an object-oriented environment bottom-up testing is the norm." That's a little strong. The strategy we use for a complete application may be different from what we use to develop class infrastructure. We shall discuss testing strategies in the fourth week of the course.

Page 92:   "[initializes] char fields to the character at position 0 in the Unicode collating sequence":
A cryptic way of specifying the null character.

Page 94:   "return true if the two objects are the same," The same is ambiguous in Java: the same object or the same value? The latter is meant here.

Page 94:   "An accessor method returns a copy of a field." or a simple function of one or more fields. For example: the user-programmer shouldn't know whether a fahrenheit() accessor in a Temperature object returns the member data item or the result of a conversion formula from, say, degrees Kelvin.

Page 95:   This statement:

  return name.equals (full.name) ""
         MONEY.format (grossPay).equals (MONEY.format (full.grossPay));
has problems.

First, "" should be &&. This error is repeated on pages 200 and 201.

Second, and more serious, having to convert values to external representation in order to compare them for equality is a sure sign that something's very wrong with the choice of internal representation. This is another consequence of the problem noted on page 32 (#3)

Chapter 3

Page 106—The function aboveMeanCount is misnamed and misdocumented. It returns the number of array elements greater than a parameter, which need have no relationship to the actual mean of the array elements.

We shall see on the next page that we're not primarily interested in the number of statements executed. Statements in a high level language span a huge range of complexity and time. The main contributor to execution time is the number of times around the loop, which is n.

Page 128—In exercise 3.4 the finite series c and the infinite series h both have closed-form solutions, familiar from elementary algebra. So the time to compute them is constant. (Was this a trick question?)

Page 129:   Exercise 3.6-b asks you to estimate times for
  for (int i=0; Math.sqrt(i) < n; i++)   <statements>
Calling a square root routine inside a loop, especially for an integer, is a needless complication, and in a real program would be poor practice. If that's really what was meant, most programmers would code:
  for (int i=0; i < n*n; i++)   <statements>
But it's unlikely that the example is what any programmer would really want. It would probably make more sense to apply the square root to the loop limit. What was meant for this example is more likely:
  for (int i=0; i*i < n; i++)   <statements>
That might apply to decomposing n into its prime factors.

Chapter 4

Page 133:   "A collection is an object that is composed of elements."
No. It's an object that can store or contain data items (primitive or references). A composite object that has a field or multiple fields (member data items) is permanently "composed of" those data items. The elements stored in a collection are not members of the object in the OOP sense, and may be removed, inserted, or replaced under program control. (This is repeated on page 150.)

Page 143:   "object in a class" is ambiguous: a member data item of the class or an instantiation of the class? Given the four exercises, we can assume the latter is meant; some people would just say "a collection of students", etc.

It's easy for programmers new to the object paradigm to get confused here, because three distinct kinds of object are mentioned on this page:

  1. The object that is the instantiation of the whole collection.
  2. The objects that are conceptually stored in the collection.
  3. The iterator object (two paragraphs below) that provides access to (2) within (1).
A good test of your mastery of this example is your ability to explain it clearly to someone else.

Chapter 5

Page 158: The part about "for the sake of efficiency" is not relevant, since no sensible programmer would use a recursive factorial function in a real-life program.

Page 162:   The third subfield of the for statement should be --i or i--. This error, which appears multiple times in the book, is probably just a typographical misprint, combining two hyphens into one dash.

Page 162-167 (section 5.2)   This function is not "Decimal to Binary" conversion, but "Integer to Binary Character String".

Page 162:   "For a large int value such as one billion, the binary equivalent will have about 30 bits. But 30 digits of zeros and ones are too big for an int or even a long."—Obviously the value of an int can't be too big for an int, so it's unclear what point the author was trying to make. Just ignore the paragraph. (You may still find the code example interesting if you can figure a way of revising the documentation to clear up the confusion.)

The example is repeated on page 334 and again on page 337.

Page 163:   * param n the non-negative integer in decimal notation.—No, there's no decimal notation involved in this example.

Page 184-186:   The binary search algorithm is so staightforward and well known, that a recursive implementation is just silly. Instead of invoking itself, the code just needs a simple assignment statement to set either first or last. Furthermore, calling the test driver the BinarySearchUser class is misleading.

Page 200-201:   In the isOK and isGOAL methods "" should be &&. This error also occurs on page 96.

Chapter 6

Page 244-246:   This long and disorganized dictionary-maintenance example is seriously flawed in multiple ways:

Page 248:   " . . each element would be saved, but not the entire array." a puzzling statement. Just what is it that is not saved and why is that desirable?

Page 251 - 257:   The lengthy VeryLongInt class is intended more to illustrate the use of ArrayList rather than as a serious example of high-precision arithmetic. Nevertheless, we must note some serious design issues:

Chapter 7

Page 272   A page of code to test one result (the state of a freshly created list after three items have been inserted) will discourage potential users of JUnit. Note that the first test repeats the problem noted for page 63.

Chapter 8

Page 334 & 337:   See entry for page 162. Also the duplicate version here has a misaligned return statement.

Furthermore, invoking Integer.toString() to obtain a single character which can be only 0 or 1 is ridiculous.

Page 339:   It's unclear exactly what this means:

"In early (pre 1960) compilers, the tranlsation was performed directly. But direct translation is awkward because the machine-level code for an operator cannot be generated until both of its operands are known."
Since the discussion is about converting from infix expressions to postfix, it may mean that early compilers didn't do that. However, the advantages of postfix expression evaluation were well known in 1960, and some early compilers exploited them.

Page 351 - 363:   There's so much wrong with this "car wash" example that it would take too much space to analyze it here in detail. It violates many of the criteria we've noted for modular organization—both cohesion and coupling. Functions are misplaced, and the scope is unnecessarily narrow. It may form the basis for one or two homework assignments.

For example, note the four functions involved in printing and formatting lines for the report:

The only one of the four that needs to know the format of the report is printResults, which is the only one of the four that does not know the format of the report!

I tried to find something good to say about this example, but it's simply an atrocious program that we hope students at this level know better than to imitate. If you're unsure, then please raise questions in class.

Chapter 11

Page 418:   The repetition of 8 almost identical lines of code is regrettable, but the misalignment of the final else is erroneous and may confuse the reader.

Pages 461, 474, & 487:   The occurrences of "hard coded" constants may not be considered an error, since the program gets the right results, but many programming managers view it as a serious gaffe, especially when the same constant (7) is repeated. See the entry in the accompanying comments documents.

Page 481-482:   In the med3 method the comment:
    * @return the median of x [a], x[b], and x [c].
is incorrect. The code actually returns the index in x of the median The distinction is important because the user code sort1 on the previous page contains

     m = med3 (x, 1, m,n);  //  median of 3
     int v = x [m];         //  v is the pivot
confirming what med3 actually does.

However, sort1 never uses m again, but refers to v multiple times. Therefore, it would have made more sense for med3 to do what the above comment said and return the median, not an index to it. Such a function would have broader use, because it wouldn't require any array parameter.

NOTE: This demonstration program would have been less confusing if the author had chosen a primitive type other than int for the array elements. It would then have been obvious to him and to the reader whether the return value was an index (subscript) or an array element.

Page 487:   The use of med3 in this program does need the indices, raising doubt about our conclusion on page 481. Whichever way we resolve the design issue, of course, the comments must match what the function actually does.

Page 488:   The vecswap function has a serious flaw. If the two subarrays overlap, the results will surely not be what any user program would want. It needs to assure that the subarrays don't overlap, and do something sensible if they do.

We've criticized the author's tendency to overuse exceptions, but this is a place where an exception (ArrayIndexOutOfBounds) is the simplest action.

Chapter 12

Page 506:   The documentation for containsKey states:

 *  @return true - if there is at least one mapping for the specified key in
 *  this map object; otherwise return false.
The phrase "at least one" is nonsense. We know that a Map doesn't allow duplicate keys.

Page 516:   In the middle of the page the statement size- should be --size or size--. (The error would have been easier to spot and correct if the line of code had included a comment explaining its purpose. Also see the entry for page 162.)

Chapter 13

Page 555-557:   The Student class is an example of what we call an ad hoc class. That is, it isn't what anyone would seriously propose for instantiating Student objects, but was thrown together solely in order to demonstrate something else (PriorityQueue) Although the object paradigm is supposed to make our programs simpler and more flexible than procedural programming, this class has the unfortunate opposite effect.

Here are some serious problems with the Student class:

  1. The documentation, including the @param line refer to "the String object" used to initialize the Student object, but fail to provide the essential information about the structure and content of that string. By examining the constructor code below, we see that it's supposed to contain both the Student's name (in an unspecified form) and his or her grade-point average, a very strange and error-prone choice, since one could just define a two-parameter constructor.

    The actual input originates in the driver program in the PriorityQueueExample pseudo-class, which should handle the data editing. Various unhandled exceptions can be raised in the constructor of a class that knows nothing about the input source.

    Note that the use of a PersonName class could avoid the issue.

  2. The compareTo function is both bizarre (Should we really say that one student is greater than another student if he has a higher gpa?) and unnecssary (Just provide a gpa() or getGpa() accessor function and the user can make the comparison directly and naturally. Also, the last return statement is misaligned.

  3. In any reasonable academic-support system the gpa would be computed by the program, not input by a user.

Chapter 14

Page 608:   The constant 1023 is not as clear to all readers as 01777 or 0x3FF. Better yet, define it as a named constant INDEX_MASK.

Because the bitwise and operator is infrequently used, the programmer should explicitly parenthesize the expression
  hash (key.hashcode()) & (table.length -1);
if we really have to depend on the exact power-of-2 table size.

Page 609:   Using grossPay as part of a key is unacceptable on two counts:

  1. Since grossPay include overtime and other frequently varying components, it's likely to change from week to week and therefore be useless as a look-up key.
  2. Floating-point numbers are rarely if ever useful as key fields. (Yes, we know that the Java library supports hashing floating-point numbers, and some experts call for doing so, but we have yet to see a sensible example of that strange practice.

Page 623-624:   If this program worked it would still be almost indecipherable. It tries to do much too much in one function, and it desperately needs commentary. Who could understand the purpose and use of start, finish, or skip from the declarations at the bottom of page 623?

But it doesn't work, because the programmer forgot that:

This line of perfectly legal input text:   // Just one " on this line
will cause the first charAt to refer to the non-existent position -2!

In this case the sloppy programming may well have led to the bug. If the programmer had divided the messy logic into two or more separate functions and if he had clarified his thoughts with block comments, he would very likely have seen the problem and avoided the error.

Chapter 15

Page 656   The substitution of Tennessee for Tallahassee is such a trivial and obvious error that it's hardly worth mentioning. However, it may confirm that the book was published without being reviewed by proofreaders, by editors, or even by the author's friends and family. Any reviewer would have easily noticed the error and called it to the author's attention.

Page 678:   Like the earlier car wash and readSourceCode examples, this try block is grotesquely overextended. The only statement that can throw a FileNotFoundException is the very first statement in the page-long block! And as usual the catch block doesn't do anything useful.

Return to IDI home page.