An ancient problem remains surprisingly controversial . . .

Just say NO to computing factorials

by Conrad Weisert
© 2010 Information Disciplines, Inc., Chicago

NOTE: This article may be circulated freely as long as the copyright notice is included.

Background

The factorial of a positive integer N, written N! in conventional mathematical notation, is defined as the product of the integers from 1 through N. For example

    7! = 7 X 6 X 5 X 4 X 3 X 2 X 1 = 5040
The factorial function is formally defined recursively by:
    0! = 1
    (N+1)! = (N+1) × N!
It commonly occurs in calculations involving permutations and combinations and in the terms of series expansions, and should therefore be in every programmer's comfortable repertoire.

Controversial advice?

In a couple of book reviews I've pointed out naive advice from well-respected writers about computing factorials:

I've been getting inquiries, mostly from students, who wonder why I was so critical of that advice. They sometimes cite yet another programming textbook or web article that supports one or both of them.

Actually any computational implementation of

            double factorial(int N)
is naive and wasteful.

Floating point range

On modern computers the built-in data-type with the widest range is double float. Its maximum value is 1.7976931348623157 x 10308. That's a gigantic number, but so is the value of the factorial function for large N. Indeed factorial(171) is about 1.2410180702 X 10309.

It follows that the function declared above will work only for 0 <= N <= 170. Our factorial function has only 171 posible values. A table containing all of them will occupy 8 X 171 = 1368 bytes, an easily affordable chunk of space on a modern machine.

Execution time

Fetching a value from a table takes constant time. Obtaining the factorials of 3 and 85 will take exactly the same time. But any recomputation, recursive or iterative, has to recompute the function values for every smaller N. Computing 85 factorial, if you have to do it often, can slow up a program noticeably as it recomputes previously computed values.

It gets worse—February, 2013

If you thought computing factorials was a silly idea, the grand prize goes to a recent textbook1 that offers this recursive function for summing consecutive ingegers:

  public int Sum (int num)
  {int result;
   if (num == 1) 
         result = 1;
   else  result = num + Sum(num-1);
   return result;
  }

Never fear. The author concedes two pages later that the recursive version is inefficient, failing to mention that it loses control on a zero or negative value. He proposes a more efficient iterative version :

  sum = 0;
  for (int number = 1; number <= num; number++)
     sum += number;
  return sum;

Of course, schoolchildren know the simple closed-form expression:

  return (num * (num+1)) /2;

1—John Lewis: C# Software Solutions, 2007, Pearson Education, ISBN 0-321-26716-8, p. 596.

Accuracy and initialization

The fraction portion of a standard IEEE double number contains 52 bits or just under 16 decimal digits. That loses significance for factorials above 23. For most computational purposes, however, unless you're doing theoretical number theory research, 15-digit accuracy is sufficient.

In C++ we could start defining the table of factorials like this:

  static const double  factorial[171] 
   = {1, 1, 2, 6, 24, 120, 720, 5040, 40320 . . .
but how do we set the rest of the values?
  1. We certainly don't want to hand compute those huge values.
  2. We could write a one-time program to generate the comma delimited list, but would we then trust a C++ compiler to convert them to internal IEEE double representation without loss of precision?
  3. We could initialize the values by a one-time computation in each program that uses them, possibly using lazy evaluation to store only the values up to the current requested one.
  4. We could write a one-time program to generate the table and write it, in internal IEEE form to a small file, from which user programs can read it to initialize their copy of the table.

Either c or d is a safe choice, whichever takes less time to initialize the table in a user program. One might think that 170 multiplications would always beat a disk read, but high- speed non-volatile storage media may have changed that trade-off.

What if that's not enough range or accuracy?

If you need factorials above 170 or if you need more than 15-digit accuracy, you'll need a number representation beyond built-in double float. Packages, including object-oriented classes, are available for integer arithmetic up to whatever fits on your machine. Some scripting languages and spreadsheet processors support unlimited integer size. But we cringe at the likely run times for doing anything non-trivial with such numbers.

Sample code

Here's a C++ function implementing option c above:

//  Factorial function (copyright 2010, Information Disciplines, Inc.)
//  ------------------

//  (Demonstration version; uses no library items)

double                          //  Returns N! for N <= 170
 factorial(const unsigned int N)//  Infinity or NaN for N > 170
 {
  const int factorialMax = 170; //  Largest N for which the factorial
                                //    fits in a standard double float
  static int currentMax = 18;   //  Largest N for which we trust the
                                //    compiler to convert the decimal
                                //      constant value accurately
                                //    (See table below)    

//  Table of factorials (with one extra cell for overflow value)
//  -------------------

 static double factorialTable[factorialMax+2] = {
 1, 1, 2, 6, 24,  120 ,  720 ,   5040,  40320 ,  362880 , 3628800,
 39916800, 479001600, 6227020800, 87178291200 ,     1307674368000,
   20922789888000 ,  355687428096000          ,  6402373705728000};
//  Remaining values will be generated as needed.



    int n = (N  > factorialMax) ?  //  Guard against
             factorialMax + 1 : N; //    overflow


 
// If the table doesn't already contain the value, compute and save it
    
    while (currentMax < n)
        {++currentMax;
         factorialTable[currentMax] 
         = currentMax * factorialTable[currentMax-1];}
 
   return factorialTable[n];     //  (Result) 
}

Notes on the above C++ function:

  1. We didn't put the function in this web site's freeware collection, because we don't consider it an ideal solution, despite its being vastly superior to the recomputing versions found in popular textbooks.

  2. During a given execution of the using program, the function will never recompute the same result.

  3. It's hard to imagine how the so-called test-driven design fad methodology ("Start with the simplest thing that could possibly work for one value of the function argument . . . ") could produce a satisfactory solution to this simple problem.

  4. The effort required to produce this solution, while greater than that for the recursive or iterative solutions, was far from huge. The function was working for all argument values within an hour.

  5. The Java or C# verions are nearly identical, except for the obvious keywords, such as const, the two static items, and the handling of exceptions. Any programming language implementation that supports IEEE double-precision floating point data can use this approach.

  6. Because of the static data item currentMax the function is non-reentrant, and should not be used unprotected in a multi-thread environment.

Return to IDI home page

Last modified December 2, 2010