Central documentation technique of Structured Systems Analysis still indispensable . . .

Preparing Dataflow Diagrams
© Conrad Weisert, Information Disciplines, Inc., Chicago
24 August 2003


Background -- Why dataflow diagrams?

A snippy Amazon customer review of David Hay's recent Requirements book [1] complained:

". . . a book that is 90% data modeling and steeped in analysis techniques of the pre-OO era, such as dataflow diagrams (people still use these?). This looks like a book I had in school 10 years ago."

If the Amazon reviewer owned such a book 10 years ago, he obviously didn't understand it. He dismissively wonders whether people still use dataflow diagrams (DFD). Well, do they?

Indeed, they do. Competent systems analysts prepare dataflow diagrams. System designers, as well as responsible user representatives, read and understand them. If we expect professionals to abandon a documentation technique that worked well for them 10 years ago, we had better furnish them with a practical alternative form of documentation. Such alternative documents would have to satisfy the criteria that DFDs satisfied so well, including these:

Now, if you know of another form of system documentation that satisfies those criteria, then by all means feel free to substitute it for DFDs in your system specifications. If it has other advantages, then urge your colleagues to use it, too, and tell me about it. The Amazon reviewer didn't suggest such an alternative.

For now, then, we shall offer some comments and advice on preparing dataflow diagrams. This short article is not a self-contained DFD tutorial. It offers some guidelines for readers who have some experience with them. If you need more information, see any of the excellent textbooks on Structured Systems Analysis.

Do DFDs and OOA fit together?

Because many of us learned about dataflow diagrams several years before we learned about object-oriented analysis (OOA), there's a widespread misconception that DFDs are somehow un-object-oriented or even anti-object-oriented. On the contrary, DFDs support objects more smoothly and more directly than some less rigorous newer kinds of diagram that come with some popular OOA tools.

The elements of a DFD

A DFD contains four kinds of symbol:
  1. Processes -- The only active elements. Processes cause something to happen. They have embedded descriptions, often in verb-object form. (Sometimes informally called "bubbles" because of their shape in an early version of SA.)
  2. Terminators -- Represent users or other systems, i.e. entities outside the boundary of the computer system being described.
  3. Dataflows -- Composite data items (or objects) that pass either between two processes, between a process and a terminator, or between a process and a data store.
  4. Data stores -- Holding places for dataflows; often implemented by databases.
Each symbol is labelled with a description in English or another natural language.
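
The four kinds of element can also be written down as a small data model, which some readers find easier to check than a drawing. What follows is only a minimal sketch in Python: the class names (Kind, Element, Dataflow, Diagram) and the order-entry example are assumptions invented for illustration, not part of any SA standard or tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """The three kinds of node; dataflows are the arrows between them."""
    PROCESS = "process"
    TERMINATOR = "terminator"
    DATA_STORE = "data store"


@dataclass(frozen=True)
class Element:
    """A process, terminator, or data store with its natural-language label."""
    name: str
    kind: Kind


@dataclass(frozen=True)
class Dataflow:
    """A labelled arrow carrying a composite data item from source to target."""
    label: str
    source: Element
    target: Element


@dataclass
class Diagram:
    """One dataflow diagram: its elements plus the arrows that connect them."""
    title: str
    elements: list[Element] = field(default_factory=list)
    dataflows: list[Dataflow] = field(default_factory=list)

    def add(self, name: str, kind: Kind) -> Element:
        element = Element(name, kind)
        self.elements.append(element)
        return element

    def connect(self, label: str, source: Element, target: Element) -> Dataflow:
        flow = Dataflow(label, source, target)
        self.dataflows.append(flow)
        return flow


# Hypothetical order-entry fragment, invented for illustration only.
dfd = Diagram("Order entry")
customer = dfd.add("Customer", Kind.TERMINATOR)
validate = dfd.add("Validate order", Kind.PROCESS)   # verb-object label
orders = dfd.add("Orders", Kind.DATA_STORE)
dfd.connect("order", customer, validate)
dfd.connect("valid order", validate, orders)
```

Nothing here says how the symbols are drawn; the sketch records only which elements exist and which labelled dataflows connect them.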

Extensions to Classic SA

The original treatments of Structured Analysis by DeMarco and by Gane & Sarson don't mention two of the most important components of a system specification: system output specifications and system input (transaction) specifications. Instead of calling for system output specifications, they simply suggest that a dataflow arrow to a user-terminator box implies a system output. Similarly, a dataflow from a user-terminator represents a system input (transaction) specification.

On the contrary, experience shows that highlighting those centrally important and highly visible system components:

  • contributes greatly to the user audience's grasp of how the proposed system will function,
  • focuses the systems analyst / author's attention (and later the designer's) on the most crucial aspects of the system,
  • facilitates developing computerized input-output prototypes.

Graphic symbols

This may be the first article you've seen on Structured Analysis that doesn't show the symbols or graphic shapes that represent the four kinds of element. A dataflow is always represented by a labeled arrow, but there are alternatives for the other elements. In the early days of SA, partisans of different conventions debated the competing symbols with vigor bordering on acrimony. Even the simple dataflow arrow generated controversy between those who preferred it straight and those who preferred it curved. I have in my files a blistering memo from a young man who was outraged that I had told a roomful of his colleagues that the choice didn't make much difference.

Well, it still makes very little difference. Your organization should choose a set of symbols and, to avoid confusing the readers, stick with it. Actually, the choice may be made for you if you use a C.A.S.E. [2] tool or even a general diagramming tool, such as Visio®.

Some common-sense rules

Systems analysts apply this checklist to look for errors in their DFDs:

  1. Every process must have at least one input dataflow. (Violators are called "magic" processes, since they claim to do something based on no input, not even a trigger.)
  2. Every process must have at least one output dataflow. (Violators are called "black hole" processes, since their inputs are swallowed up for no reason.)
  3. Every dataflow must connect two elements. One of them must be a process; the other can be a terminator, a data store or another process.
  4. Each dataflow diagram should contain no more than six or seven processes and no more than six or seven data stores, and all the processes should be conceptually at the same level of detail. If a part of the system is too big or too complicated to describe in an easily grasped diagram, break it down into two or three lower-level diagrams. (We sometimes see hanging on an office wall a huge tour de force DFD that tries to describe an entire large system at a low level of detail with several dozen processes and convoluted intersecting dataflow arrows. That's not something to be proud of. It doesn't communicate to any audience.)
  5. For every process, one of the following must eventually be true:
    1. The description label is so simple and unambiguous that every reader will understand it in exactly the same way.
    2. It is expanded or decomposed into a separate lower-level dataflow diagram that preserves exactly the same net inputs and outputs, but shows internal detail, such as data stores and internal processes.
    3. It is rigorously described by a separate process specification (business rule, decision rule, function definition, algorithm, etc.).
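
The mechanical parts of this checklist (rules 1 to 3 and the size limits in rule 4) are easy to automate. The following is a rough sketch in Python, assuming a diagram is given as a dictionary of element kinds plus a list of (label, source, target) dataflows; the function name check_dfd, the kind strings, and the sample diagram are all hypothetical. Whether the processes sit at the same level of detail, and everything in rule 5, still calls for a human reviewer.

```python
def check_dfd(kinds, flows, max_per_kind=7):
    """Return a list of rule violations found in one diagram.

    kinds: dict mapping element name -> "process", "terminator" or "store"
    flows: list of (label, source, target) tuples
    """
    problems = []
    processes = [name for name, kind in kinds.items() if kind == "process"]
    stores = [name for name, kind in kinds.items() if kind == "store"]

    for name in processes:
        # Rule 1: a process with no input dataflow is a "magic" process.
        if not any(target == name for _, _, target in flows):
            problems.append(f"magic process: {name}")
        # Rule 2: a process with no output dataflow is a "black hole".
        if not any(source == name for _, source, _ in flows):
            problems.append(f"black hole process: {name}")

    for label, source, target in flows:
        # Rule 3: both ends must exist, and at least one must be a process.
        if source not in kinds or target not in kinds:
            problems.append(f"dangling dataflow: {label}")
        elif "process" not in (kinds[source], kinds[target]):
            problems.append(f"dataflow bypasses processes: {label}")

    # Rule 4 (size limits only): no more than six or seven of each.
    if len(processes) > max_per_kind:
        problems.append(f"too many processes: {len(processes)}")
    if len(stores) > max_per_kind:
        problems.append(f"too many data stores: {len(stores)}")
    return problems


# Hypothetical diagram containing deliberate mistakes.
kinds = {
    "Customer": "terminator",
    "Validate order": "process",
    "Orders": "store",
    "Print report": "process",      # no dataflows at all
}
flows = [
    ("order", "Customer", "Validate order"),
    ("valid order", "Validate order", "Orders"),
    ("order copy", "Customer", "Orders"),   # terminator straight to store
]
problems = check_dfd(kinds, flows)
```

In this example "Print report" is reported as both a magic process and a black hole, and "order copy" is flagged because neither of its ends is a process.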

The starting point: Context (level-0) diagram

The systems analyst begins by preparing the top-level DFD. This "context diagram" shows the entire system as a single process. Interactions with users and other external entities are shown as dataflows.

The context diagram, although often almost trivially simple, serves two essential purposes:

The system diagram (level-1 DFD)

After everyone agrees that the context diagram is correct and complete, the systems analyst examines the first-level breakdown of major functions. Most systems can be decomposed into between two and seven major areas.

The result is called the "system diagram". It gives a clear overview of the system and serves as a base for further decomposition.
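
Because a lower-level diagram must preserve exactly the same net inputs and outputs as the process it expands, the system diagram can be checked against the context diagram mechanically. Here is one possible sketch in Python, assuming each diagram is a list of (label, source, target) dataflows and that the terminators carry the same names in both diagrams; net_flows, balance_report, and the order-system example are hypothetical names chosen for illustration.

```python
def net_flows(flows, terminators):
    """Collect the dataflows that cross the system boundary.

    flows: iterable of (label, source, target) tuples
    terminators: set of names of external entities
    Returns a set of (label, terminator, direction) triples.
    """
    boundary = set()
    for label, source, target in flows:
        if source in terminators and target not in terminators:
            boundary.add((label, source, "in"))
        elif target in terminators and source not in terminators:
            boundary.add((label, target, "out"))
    return boundary


def balance_report(parent_flows, child_flows, terminators):
    """Compare the net inputs and outputs of a diagram and its expansion."""
    parent = net_flows(parent_flows, terminators)
    child = net_flows(child_flows, terminators)
    return {
        "missing_in_child": parent - child,
        "extra_in_child": child - parent,
    }


# Hypothetical context diagram: the whole system is one process.
terminators = {"Customer", "Warehouse"}
context = [
    ("order", "Customer", "Order system"),
    ("invoice", "Order system", "Customer"),
    ("picking list", "Order system", "Warehouse"),
]
# Hypothetical system diagram that forgot the picking list.
level1 = [
    ("order", "Customer", "Validate order"),
    ("valid order", "Validate order", "Orders"),
    ("valid order", "Orders", "Prepare invoice"),
    ("invoice", "Prepare invoice", "Customer"),
]
report = balance_report(context, level1, terminators)
```

Here the report shows the "picking list" output to the Warehouse missing from the system diagram. Internal dataflows and data stores are ignored, since they are detail that the lower-level diagram is supposed to add. The same comparison could be applied at any level of decomposition by treating the parent process's neighbours as the external names.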

The end

The dataflow diagrams are complete when:

There's more to come, but the remaining components of the system specification (or detailed user requirements documentation) have little or no effect on the functionality of the proposed system. Note that the information contained in these documents is essential not only as a foundation for building a custom application but also as a basis for evaluating and choosing a packaged application software product.


1 -- A comparative review of that book and two other requirements books will be published on this web site in early 2004.
2 -- Computer Assisted Software Engineering


Last modified (minor editing) July, 2015