2.5.11  Analysis Techniques and Deliverables

2.5.11.1  (04-01-2003)
Introduction

  1. Structured analysis is a technique that involves analysis, description, specification, and decomposition of business processes and data to derive a result that is graphic, concise, non-redundant, top-down partitioned, and logical instead of physical. This technique uses logical models to enhance communication by emphasizing what (logically) needs to be designed, not how (physically) to design it.

  2. Data flow diagrams, data definitions, and process specifications are the tools used in structured analysis. The deliverable that results from applying structured analysis is the functional specification package. This deliverable comprises data flow diagrams, data definitions, and process specifications.

2.5.11.1.1  (04-01-2003)
Purpose

  1. This manual establishes standards, guidelines, and other controls for analyzing business processes and data. This manual describes techniques for modeling processes and the data flows among them, defining data, and specifying processes. This manual is distributed to promote the development of business models that are easy to understand, change, and maintain.

2.5.11.1.2  (04-01-2003)
Scope

  1. The guidelines, standards, techniques, and other controls established in this manual apply to all software developed for the Internal Revenue Service. This development includes that performed by government employees as well as contractors. For system development purposes, the controls established in this manual may be used with any Agency-approved life cycle (e.g., SDLC, eSDLC, or ELC).

2.5.11.2  (04-01-2003)
Developing Data Flow Diagrams

  1. A data flow diagram is a graphic tool for depicting the partitioning of a system into a network of activities and their interfaces, together with their origins, destinations, and stores of data. The system being partitioned can be automated, manual, or a combination of both. A data flow diagram pictures the system as a continuous stream of ongoing data but does not address physical concerns (i.e., decisions or loops) as does the traditional flow chart. A data flow diagram emphasizes the flow of data and de-emphasizes the flow of control.

  2. Data flow diagrams present a logical view of the system, unlike flowcharts, which introduce many physical constraints too early in system development.

  3. At a very early stage, data flow diagrams provide a graphic model of the system being developed which can be easily understood by the customer. Areas of misunderstanding are resolved early in system development rather than in a later development stage where changes have much more impact.

  4. Data flow diagrams break up a system into functional subcomponents. This partitioning aids in identifying and isolating the various functions of a system.

  5. Data flow diagrams graphically depict the boundaries between the system itself and the externals that interact with the system. In addition to providing this macro view, data flow diagrams can be decomposed into levels of increasing detail to provide the analyst with a very flexible graphic representation. At their higher levels, data flow diagrams present system overviews suitable for management briefings; at their most detailed levels, data flow diagrams readily communicate with the system designer.

  6. Data flow diagrams are used to graphically describe the transformation of data through the system. Data flow diagrams are developed by studying the data from the user's point of view and then creating different logical and physical system models.

2.5.11.2.1  (04-01-2003)
Types of Data Flow Diagrams

  1. In applying structured analysis, develop and use the following types of data flow diagrams:

    1. Current physical data flow diagram;

    2. Current logical data flow diagram;

    3. New logical data flow diagram;

    4. New physical data flow diagram.

2.5.11.2.1.1  (04-01-2003)
Current Physical Data Flow Diagram

  1. Use this type of data flow diagram to model the current physical environment. This type of data flow diagram models the physical characteristics of an existing system, such as department names, physical locations, organizations, people's names, and mechanical or operational devices.

  2. Use this type of data flow diagram when modeling a system for the first time. Since the user is more familiar with the physical terminology, get the user's approval of the accuracy of the model of the existing system before continuing analysis.

2.5.11.2.1.2  (04-01-2003)
Current Logical Data Flow Diagram

  1. Make a current logical data flow diagram by removing physical considerations and constraints from the current physical data flow diagram. For instance, replace department names with the actual processing functions within that department. The logical model must depict how the data is being transformed, not who or what is transforming it.

2.5.11.2.1.3  (04-01-2003)
New Logical Data Flow Diagram

  1. To accommodate required changes to a system, examine and modify the current logical data flow diagram. Reexamine the rationale behind why processes are done and the way they are done. This model is still logical and becomes the candidate for the final aspect of data flow diagram development.

2.5.11.2.1.4  (04-01-2003)
New Physical Data Flow Diagram

  1. The data flow diagrams that result from this technique are the actual maintainable documentation required within the functional specification package.

  2. In the final aspect of data flow diagram development, balance the implementation of the ideal system (as represented by the new logical data flow diagrams) against the realities of time and cost constraints. Consider feasibility and impact studies, cost/benefit analysis, and other variables until an appropriate compromise physical model is selected.

  3. Make physical decisions (such as which data stores will be data bases, as opposed to sequential files) and consider "packaging."

  4. Do not allow the data flow diagrams to become too physical, as this will defeat their purpose and unnecessarily limit the choices available to the designer. Depict functional processes as opposed to organizational entities affecting the system, such as departments or divisions.

2.5.11.2.2  (04-01-2003)
Elements of a Data Flow Diagram

  1. This section discusses the elements that appear on data flow diagrams.

2.5.11.2.2.1  (04-01-2003)
Data Stream

  1. A data stream is one or more elements of data. A data stream is used to indicate a sharing of data. A data stream is graphically represented by an arrow that shows the direction in which data is being shared. Figure 2.5.11-1 depicts a data stream.

    Figure 2.5.11-1

    Data Stream

2.5.11.2.2.1.1  (04-01-2003)
Naming Data Streams/Modifiers

  1. Label all data streams with meaningful names that follow applicable naming standards. When a data stream has been logically transformed and the transformed version needs to be distinguished, do not create a new data name. Instead, use a modifier to qualify the name and place it in parentheses after the data stream name. Figure 2.5.11-2 depicts data streams with modified names.

    Figure 2.5.11-2

    Data Stream and Data Streams with Modified Names

2.5.11.2.2.1.2  (04-01-2003)
Routers and Collectors

  1. A router is used to subdivide a data stream or decompose data. A router is graphically represented by a right half circle. Figure 2.5.11-3 depicts a router.

    Figure 2.5.11-3

    Router

  2. A collector is used to rebuild data streams or recompose data. A collector is graphically represented by a left half circle. Figure 2.5.11-4 depicts a collector.

    Figure 2.5.11-4

    Collector

2.5.11.2.2.1.3  (04-01-2003)
Split Data Stream

  1. A split data stream divides the routing of data. Unlike the case with the logical router and logical collector, no decomposing or recomposing of the data stream takes place. A split arrow is used to show the routing of a data stream to two or more destinations. A split data stream is graphically represented by a multi-prong arrow. Figure 2.5.11-5 depicts a split data stream.

    Figure 2.5.11-5

    Split Data Stream
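
  The difference between a router, a collector, and a split data stream can be sketched in code. The following is a minimal illustration only, not part of the diagramming standard; the function names and the sample record fields are hypothetical.

```python
# Hypothetical sketch: a data stream is modeled as a dict of data elements.

def route(stream: dict, groups: list[list[str]]) -> list[dict]:
    """Router: decompose one data stream into several smaller streams."""
    return [{name: stream[name] for name in group} for group in groups]


def collect(streams: list[dict]) -> dict:
    """Collector: recompose several data streams into a single stream."""
    merged: dict = {}
    for stream in streams:
        merged.update(stream)
    return merged


def split(stream: dict, destinations: int) -> list[dict]:
    """Split data stream: the same, undivided data goes to each destination."""
    return [dict(stream) for _ in range(destinations)]


# Sample data (assumed for illustration).
taxpayer_record = {"name": "J. Doe", "address": "1 Main St", "amount_due": 125}
parts = route(taxpayer_record, [["name", "address"], ["amount_due"]])
rebuilt = collect(parts)
copies = split(taxpayer_record, 2)
```

  In this sketch the router changes the composition of the data and the collector restores it, while the split only duplicates the routing and leaves the composition unchanged.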

2.5.11.2.2.2  (04-01-2003)
Process

  1. A process represents a logical transformation of an incoming data stream(s) into an outgoing data stream(s). A process is a type of object that represents activity and constitutes a data flow diagram. A process name should consist of a transitive verb followed by an object. Show a process by an ellipse or circle with a process name inside. Figure 2.5.11-6 shows these conventions.

    Figure 2.5.11-6

    Two Conventions Used to Depict a Process

  2. For decomposition purposes, three types of processes are acknowledged:

    1. context process

    2. parent process

    3. elementary process

  3. A context process represents the scope of activity being analyzed and modeled. A context process represents the first level of decomposition for a related set of data flow diagrams. A context process is a process that comprises other processes and does not constitute another process. As a rule, a context process may not constitute another process.

  4. A parent process is a process that comprises other processes. As a rule, a parent process must constitute another process and comprise other processes.

  5. An elementary process is a process that constitutes another process and does not comprise other processes. As a rule, an elementary process must not comprise other processes.
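
  The three process types can be told apart by two questions: does the process comprise other processes, and does it constitute another process? The following is a minimal sketch of that rule, assuming the decomposition is recorded as a mapping from each process to the processes it comprises; the process names are hypothetical.

```python
# Hypothetical decomposition: process name -> processes it comprises.
hierarchy = {
    "Process Returns": ["Validate Return", "Post Payment"],
    "Validate Return": ["Check Signature"],
}


def classify(process: str, comprises: dict[str, list[str]]) -> str:
    """Classify a process as context, parent, or elementary."""
    has_children = bool(comprises.get(process))
    has_parent = any(process in kids for kids in comprises.values())
    if has_children and not has_parent:
        return "context"      # comprises others, constitutes none
    if has_children and has_parent:
        return "parent"       # constitutes another and comprises others
    if has_parent:
        return "elementary"   # constitutes another, comprises none
    raise ValueError(f"{process!r} is not part of the decomposition")
```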

2.5.11.2.2.3  (04-01-2003)
Data Store

  1. A Data Store (Data Base, File, Table), represented by parallel lines, is a data stream that is at rest (i.e., a temporary repository of data). Place the data store name between the parallel lines. Figure 2.5.11-7 illustrates a data store.

    Figure 2.5.11-7

    Data Store

2.5.11.2.2.4  (04-01-2003)
Access Key

  1. An access key is an optional figure, represented by a dashed line with a name, that is used only to represent a key accessing a random access disk file. Figure 2.5.11-8 illustrates an access key to a data store.

    Figure 2.5.11-8

    Access Key to a Data Store

2.5.11.2.2.5  (04-01-2003)
Source/Sink

  1. Represent a Source/Sink (e.g., External Entity, External Input and Output) by a rectangle. A source or sink is a person or organization, external to the context of a system, that is a net originator or receiver of system data. Place the source/sink name inside the rectangle. Figure 2.5.11-9 illustrates a source/sink.

    Figure 2.5.11-9

    Source/Sink

2.5.11.2.3  (04-01-2003)
Using Special Conventions to Model Data Flows

  1. Some situations will require other diagramming conventions to graphically express the situation.

2.5.11.2.3.1  (04-01-2003)
Accessing/Updating a Data Store

  1. Use arrows to represent reads, writes, or other accesses to a data store. The arrows may be used in any appropriate combination. Figure 2.5.11-10 illustrates the accessing of or reading from a data store.

    Figure 2.5.11-10

    Convention used to depict Reading a Data Store

  2. Figure 2.5.11-11 illustrates the updating of or writing to a data store.

    Figure 2.5.11-11

    Convention used to depict Writing to a Data Store

  3. When diagramming a file key accessing a data store, the key is optional. Only show the key on a physical data flow diagram (i.e., after it is decided that the file media will be direct access). Figure 2.5.11-12 illustrates a file key accessing a data store.

    Figure 2.5.11-12

    Convention used to depict Key Access to a Data Store

2.5.11.2.3.2  (04-01-2003)
Updating Re-circulating Data Stores

  1. When developing logical or physical data flow diagrams, some situations will require the modeling of re-circulating data stores.

2.5.11.2.3.2.1  (04-01-2003)
Depicting Files on Logical Data Flow Diagrams

  1. When designating files on logical data flow diagrams in which data from an old version of a file is being transformed into data for a new version of that same file, show the old and new versions of the file. Figure 2.5.11-13 illustrates a data transformation.

    Figure 2.5.11-13

    Data Transformation

2.5.11.2.3.2.2  (04-01-2003)
Depicting Files on Physical Data Flow Diagrams

  1. If a decision is made during creation of the physical data flow diagram to use a single random access file, designate it on the new physical data flow diagram (but not on the logical data flow diagram). Figure 2.5.11-14 illustrates random file access depicted on a data flow diagram.

    Figure 2.5.11-14

    Random File Access

  2. If the file is to be sequentially processed, or it has not yet been decided how it will be processed, depict the file on the new physical data flow diagram as Figure 2.5.11-15 illustrates.

    Figure 2.5.11-15

    File Processing

2.5.11.2.3.3  (04-01-2003)
Sorts

  1. Introduce a sort as a process bubble only when it is logically required. If the sort process is shown as a bubble that sorts an input file and puts out a sorted file, then show these files on the data flow diagram. Name the data stores only, not the data streams.

  2. Figure 2.5.11-16 provides an example (of a sequential file update or an update where the decision as to whether it will be random or sequential has not been made) that illustrates a sort.

    Figure 2.5.11-16

    Sort

2.5.11.2.3.4  (04-01-2003)
Off-Page Connector

  1. An off-page connector is represented by a circle and arrow. Avoid continuation pages because they make the data flow diagram less readable. When a data flow diagram must be continued onto another page and the diagram remains at the same level of decomposition, the off-page connector may be used. Write the sending and receiving page numbers within the respective circles. To avoid confusion, make the circles smaller than the process bubbles on your data flow diagram. Figure 2.5.11-17 illustrates the conventions used to depict off-page connectors.

    Figure 2.5.11-17

    Conventions used to depict Off-Page Connectors

2.5.11.2.4  (04-01-2003)
Leveling Data Flow Diagrams

  1. Leveling is the partitioning of a large system into manageable units, resulting in system documentation that is easier to comprehend. Top-down analysis and reanalysis of processes and data (partitioning and re-partitioning) produce a high level overview for management and lower, more detailed levels for the designer and users. A leveled data flow diagram set comprises:

    1. The top-level diagram, called the context diagram, which defines the boundary of the system and consists of only one bubble that is labeled with an overall system descriptor. The system sources, sinks, inputs, and outputs are depicted; and the input and output data streams are shown to define the domain of the system.

    2. Middle-level data flow diagrams are used when it is necessary to represent the system processes within the context diagram broken down into a more detailed level. They are the intermediate level between a context diagram and the functional primitives.

    3. The lowest-level data flow diagram, called a functional primitive, represents a process that cannot be further decomposed. A functional primitive has no internal data streams and usually only a single input and single output.

  2. Exhibit 2.5.11-1 illustrates the format for a leveled data flow diagram.

2.5.11.2.4.1  (04-01-2003)
Parent/Child Process Relationships

  1. A diagram for which there is a lower level diagram(s) is termed a "parent" diagram. For instance, in Exhibit 2.5.11-1, the context diagram is parent to Diagram 0, which is termed a "child" diagram. Diagram 0 also assumes the role of a parent to Diagram 2, which is the child of Diagram 0. Therefore, a diagram can be both a child of a higher-level data flow diagram and a parent to a lower level data flow diagram. However, a lowest level (Functional Primitive) data flow diagram can only be a child diagram because it cannot be further decomposed.

2.5.11.2.4.2  (04-01-2003)
Leveling Conventions/Standards

  1. Each level of the data flow diagram is to reside on a separate page. The reader can follow the diagram leveling using the diagram and bubble numbering system as a guide.

  2. There is no set number of levels. However, there is always at least a context diagram level and an associated lowest level. The number of middle level diagrams is dependent upon the complexity of the system being defined.

  3. In the interest of readability, partition levels into about seven bubbles (plus or minus two bubbles).
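
  The "seven, plus or minus two" guideline can be applied as a simple review check across a diagram set. The following is a minimal sketch, assuming the bubble count of each diagram is already known; the function name and the sample counts are hypothetical, and the result is a list of candidates for review rather than a list of violations.

```python
def diagrams_to_review(bubble_counts: dict[str, int],
                       target: int = 7, tolerance: int = 2) -> list[str]:
    """Return the diagram numbers whose bubble count falls outside
    the range target - tolerance through target + tolerance."""
    low, high = target - tolerance, target + tolerance
    return [diagram for diagram, count in bubble_counts.items()
            if not low <= count <= high]


# Sample counts (assumed): diagram number -> number of bubbles.
counts = {"0": 6, "2": 12, "2.4": 3}
flagged = diagrams_to_review(counts)
```

  A diagram flagged for too many bubbles is a candidate for an additional level; a diagram flagged for too few may simply describe a small function.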

2.5.11.2.5  (04-01-2003)
Identifying a Data Flow Diagram

  1. Data flow diagrams are identified through naming and numbering.

2.5.11.2.5.1  (04-01-2003)
Naming a Data Flow Diagram

  1. Title each data flow diagram with the name of its "parent" bubble. The context diagram within a data flow diagram set has no "parent" diagram; it is the highest-level diagram and identifies the system name, inputs, and outputs.

2.5.11.2.5.2  (04-01-2003)
Numbering a Data Flow Diagram

  1. Except for the context diagram, each data flow diagram is labeled with the diagram number of its parent bubble. This diagram number is carried over into the numbering of the individual bubbles by taking the diagram number, placing a decimal point after it, and then placing a sequential number after the decimal point to give each bubble a unique identifier. The diagram number retains the bulk of the numbering and the bubbles are numbered with only the last decimal point number. Figure 2.5.11-18 illustrates the numbering; for example, the actual process reference numbers of diagram 2.4.5 are 2.4.5.1, 2.4.5.2, and 2.4.5.3.

    Figure 2.5.11-18

    Processes that constitute Process 2.4.5

  2. Exhibit 2.5.11-1 illustrates a properly numbered data flow diagram.
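
  The numbering rule can be sketched as follows. This is an illustration only; the function names are hypothetical. The sketch assumes that the bubbles of Diagram 0 are numbered 1, 2, 3, and so on without a "0." prefix, which is consistent with Diagram 2 being a child of Diagram 0. The exact way the short label is drawn on a bubble is also an assumption.

```python
def bubble_reference_numbers(diagram_number: str, bubble_count: int) -> list[str]:
    """Full process reference numbers for the bubbles on one diagram."""
    if diagram_number == "0":
        # Assumed convention: Diagram 0 bubbles carry no "0." prefix.
        return [str(n) for n in range(1, bubble_count + 1)]
    return [f"{diagram_number}.{n}" for n in range(1, bubble_count + 1)]


def short_label(reference: str) -> str:
    """Last component only, as carried by the bubble on the diagram."""
    return reference.rsplit(".", 1)[-1]


references = bubble_reference_numbers("2.4.5", 3)
```

  For diagram 2.4.5 with three bubbles, the sketch produces the reference numbers 2.4.5.1, 2.4.5.2, and 2.4.5.3 cited in the text.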

2.5.11.2.5.3  (04-01-2003)
Sequencing Data Flow Diagrams

  1. Sequence the data flow diagrams in the functional specification package in ascending numeric order (the data flow diagram name is unimportant and not used in this sequencing). Use the data flow diagram numbers, which appear in the page heading of each diagram, for sequencing. Follow one particular sequencing order to maintain uniformity between various functional specification packages.

  2. Figure 2.5.11-19 illustrates proper numeric sequence.

    Figure 2.5.11-19

    Proper Numeric Sequence
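
  When diagram numbers are sorted by a tool, the comparison should be numeric on each component rather than alphabetic, otherwise 2.10 would precede 2.4. The following is a minimal sketch of one possible ordering, in which a parent diagram precedes its children; this interpretation of "ascending numeric order" and the sample diagram numbers are assumptions, not a statement of the order shown in Figure 2.5.11-19.

```python
def sequence_key(diagram_number: str) -> tuple[int, ...]:
    """Compare diagram numbers component by component as integers."""
    return tuple(int(part) for part in diagram_number.split("."))


# Sample diagram numbers (assumed), deliberately out of order.
diagrams = ["2.10", "0", "2", "2.4.5", "2.4", "1", "2.9"]
ordered = sorted(diagrams, key=sequence_key)
```

  The context diagram carries no diagram number and is not handled by this sketch.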

2.5.11.2.6  (04-01-2003)
Balancing Data Flow Diagrams

  1. Keep data flow diagrams in balance. Represent all data streams shown entering and exiting a parent diagram in the associated bubbles of the child diagram. There are exceptions to the balance rule: minor error paths and trivial inputs (e.g., error messages, system date) need not be in balance.

  2. Show a data store (file) on the first data flow diagram level where all system references to it are shown. Apply this concept at all levels. If a file is used primarily by the system represented in the context diagram, there is no need to show the file at the context diagram level. However, if the file is external to the system, show it on the context diagram.

  3. As an example, Figure 2.5.11-20 illustrates a data store or file not shown in Diagram 0. This is because the file is internal to the processing in process 3 (as though it is concealed inside the bubble). The file and all its data streams are shown when process 3 is diagrammed.

    Figure 2.5.11-20

    Diagram 0 (on the left) and Diagram 3 (on the right)

  4. Figure 2.5.11-21 is for Diagram 0 and illustrates a file being used by processes 2 and 3.

    Figure 2.5.11-21

    Diagram 0
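
  A balance check compares the data streams entering and exiting a parent bubble with the net data streams entering and exiting its child diagram, ignoring the exempt minor error paths and trivial inputs. The following is a minimal sketch under those assumptions; the function name and the sample stream names are hypothetical.

```python
from typing import Iterable


def balance_report(parent_streams: set[str], child_streams: set[str],
                   exempt: Iterable[str] = ()) -> dict[str, set[str]]:
    """List the data streams that break the balance rule.

    `exempt` holds minor error paths and trivial inputs that
    need not be in balance.
    """
    exempt_set = set(exempt)
    return {
        "missing_from_child": parent_streams - child_streams - exempt_set,
        "missing_from_parent": child_streams - parent_streams - exempt_set,
    }


# Sample data (assumed): streams on a parent bubble and on its child diagram.
parent = {"Tax Return", "Refund Notice"}
child = {"Tax Return", "Refund Notice", "Error Message"}
report = balance_report(parent, child, exempt={"Error Message"})
```

  Both sets in the report are empty when the diagrams are in balance.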

2.5.11.2.7  (04-01-2003)
Additional Guidelines for Developing Data Flow Diagrams

  1. Identify all major inputs and outputs to the system on the context diagram.

  2. Show minor inputs and reject data flows at an appropriate lower level. These data streams need not be balanced between parent and child.

  3. Do not show trivial error paths, such as screen messages, on the data flow diagram. Instead, note the processing for the message and the actual message in the appropriate Process Specification.

  4. Label each data stream, data process, and data store with a meaningful name developed in accordance with applicable naming standards.

  5. Use a descriptive and strong action verb to name a process bubble. Try to use a singular object to complement the verb.

  6. On a data flow diagram, ensure that each process has at least one input data stream and one output data stream.

  7. Format each level of the data flow diagrams in a left to right flow, a convention with which most readers are familiar. Try to aid readability by not crossing data streams.

  8. Present the system from the viewpoint of the data and show the processes transforming the data.

  9. Do not represent the flow of control or control information (i.e., timing considerations).

  10. Do not show initialization and termination, such as job control language, control decisions (beginning or end-of-file), and file initialization.
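
  Guideline 6 above lends itself to a mechanical check. The following is a minimal sketch, assuming each data stream on a diagram is recorded as a (source, destination, name) tuple; the function name and the sample processes and streams are hypothetical.

```python
def flow_problems(processes: list[str],
                  streams: list[tuple[str, str, str]]) -> dict[str, list[str]]:
    """Report processes that lack an input or an output data stream.

    Each stream is a (source, destination, stream name) tuple.
    """
    receives = {destination for _, destination, _ in streams}
    sends = {source for source, _, _ in streams}
    problems: dict[str, list[str]] = {}
    for process in processes:
        missing = []
        if process not in receives:
            missing.append("no input data stream")
        if process not in sends:
            missing.append("no output data stream")
        if missing:
            problems[process] = missing
    return problems


# Sample diagram (assumed).
processes = ["Validate Return", "Post Payment", "Print Notice"]
streams = [
    ("Taxpayer", "Validate Return", "Tax Return"),
    ("Validate Return", "Post Payment", "Tax Return (Validated)"),
    ("Post Payment", "Accounts", "Payment"),
]
problems = flow_problems(processes, streams)
```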

2.5.11.2.8  (04-01-2005)
Transition to Design

  1. This section provides guidance on using the results from analysis as the basis for software design.

2.5.11.2.8.1  (04-01-2005)
Partitioning the Data Flow Diagrams

  1. If a data flow diagram has not been evenly partitioned, the diagram will combine some detail and some higher levels of abstraction. In this case, perform a top-down partitioning of the data flow diagram by:

    1. Replacing any problem bubble by its "child" network and then connecting the data flows;

    2. Grouping into sets to minimize interfaces;

    3. Allocating one top-level bubble per set;

    4. Renumbering and renaming everything.

2.5.11.2.8.2  (04-01-2003)
Packaging the Data Flow Diagrams

  1. As the new physical data flow diagrams are being developed, both the analyst and the designer must consider certain physical details. Unless a data flow diagram is small and limited in function, it will need to be "packaged". Packaging is the process of subdividing the data flow diagram processes into related groups of processes. Each of these related groups evolves into a separate structure chart that will be created during software design. The following physical boundaries and constraints have a bearing on the packaging of a data flow diagram set:

    • Man/machine boundary - separates manual processes from those performed on ADP equipment.

    • Hardware boundary - separates processes that must be performed on different types of ADP equipment.

    • Batch/on-line/real time boundary - various functions of a system may be on-line, real time, or batch mode depending on the speed requirements for data retrieval, display, availability, etc.

    • Cycle or timing boundary - some processes must be run daily, while others only need to be run once a week, month, or year.

    • Commercial software - some processes may be accomplished using vendor-supplied software.

    • Security/safety needs - security and safety requirements may cause the addition of otherwise unnecessary boundaries and intermediate data stores. Other needs of this type include audit, back-up, recovery, and checkpoint/restart requirements.

    • Resources - some processes may not be able to be run at the same time because of limited resources (e.g., the job is too large for computer capacity).
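
  As a first pass at packaging, processes that fall on the same side of every boundary can be grouped together. The following is a minimal sketch only, assuming each process has been tagged with its processing mode and cycle; the attribute names, values, and process names are hypothetical, and real packaging decisions also weigh the other constraints listed above.

```python
from collections import defaultdict


def package(processes: dict[str, dict[str, str]],
            boundaries: tuple[str, ...] = ("mode", "cycle")) -> dict[tuple, list[str]]:
    """Group processes that share the same value for every boundary."""
    packages: dict[tuple, list[str]] = defaultdict(list)
    for name, traits in processes.items():
        key = tuple(traits[boundary] for boundary in boundaries)
        packages[key].append(name)
    return dict(packages)


# Sample tagging (assumed).
tagged = {
    "Validate Return": {"mode": "batch", "cycle": "daily"},
    "Post Payment": {"mode": "batch", "cycle": "daily"},
    "Answer Inquiry": {"mode": "on-line", "cycle": "daily"},
    "Print Annual Notice": {"mode": "batch", "cycle": "yearly"},
}
packages = package(tagged)
```

  Each resulting group is a candidate for its own structure chart during software design.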

2.5.11.3  (04-01-2003)
Developing Data Definitions

  1. Data flow diagrams provide a general picture of the data transformations (processes) and their interfaces (data streams) in a system. To make the data flow diagrams more precise, define both the data and the processing. Data definitions add precision to a system by capturing the details of the data streams and data stores. Since the means to catalogue these definitions will vary by site, the exact form they take will vary. Whether this method is manual, automated, or a combination of the two, standards dictate that project managers ensure consistency within their systems.

  2. Develop a definition for each data stream and data store on the data flow diagram, and maintain it in a system glossary. Define all data elements and groups of data elements contained in these data streams and data stores, and all components referenced in the process specifications. Ensure that all names are in accordance with established naming standards. If an enterprise data dictionary is used to maintain data definitions, then follow any agency guidelines on use of the dictionary.

2.5.11.3.1  (04-01-2003)
Types of Data Definitions

  1. Define data in terms of its components and their relationship in the hierarchy. Define the following four types:

    • Data element - the smallest piece of data that is not further decomposed.

    • Data group - a data structure that consists of other data groups and/or data elements.

    • Data stream (also called data flow) - data in motion; a pipeline along which information of known composition is passed. Note that data streams are not defined as separate entities in the system glossary; each data stream is a flow consisting of either a data group or a data element.

    • Data store - data at rest (i.e., a temporary repository of data, a file).

  2. External entities (sources and sinks) are not data but should also be described in the system glossary or data dictionary.
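
  The hierarchy of data groups and data elements can be sketched as a small glossary structure. The following is a minimal illustration, not a prescribed glossary format; the entry names are hypothetical. It assumes that a data element is recorded with an empty component list and that no data group contains itself, directly or indirectly.

```python
# Hypothetical system glossary: name -> components.
# An empty list marks a data element; a non-empty list marks a data group.
glossary = {
    "Taxpayer Record": ["Taxpayer Name", "Taxpayer Address"],
    "Taxpayer Address": ["Street", "City"],
    "Taxpayer Name": [],
    "Street": [],
    "City": [],
}


def elements_of(name: str, entries: dict[str, list[str]]) -> list[str]:
    """Decompose a data group down to its data elements."""
    components = entries[name]
    if not components:
        return [name]
    result: list[str] = []
    for component in components:
        result.extend(elements_of(component, entries))
    return result


def undefined_components(entries: dict[str, list[str]]) -> list[str]:
    """Components that are referenced by a data group but never defined."""
    return sorted({component
                   for components in entries.values()
                   for component in components
                   if component not in entries})
```

  A check such as undefined_components supports the requirement that every data element and data group contained in a data stream or data store be defined.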

