2.5.3  Programming and Source Code Standards

2.5.3.1  (07-01-2006)
Introduction

  1. This Internal Revenue Manual (IRM) is organized into the following subsections:

    1. Introduction

    2. General Programming

    3. COBOL Programming

    4. C Language Programming

    5. C++ Programming

    6. Java Programming

  2. The subsection, "Introduction" introduces this Internal Revenue Manual (IRM). This subsection states the purpose of this manual, defines terms, identifies personnel affected by this manual, describes the roles mentioned in the manual, describes how this manual is organized, and identifies other documents addressed in this manual. This subsection also provides background and history about this manual.

  3. The subsection, "General Programming" addresses non-language specific general programming topics.

  4. The subsection, "COBOL Programming" addresses topics specific to COBOL programming.

  5. The subsection, "C Language Programming" addresses topics specific to C language programming.

  6. The subsection, "C++ Programming" addresses topics specific to C++ programming.

  7. The subsection, "Java Programming" addresses topics specific to Java programming.

2.5.3.1.1  (01-01-2004)
Purpose

  1. This Internal Revenue Manual (IRM) establishes standards and guidelines to promote the development of maintainable, portable, reliable software applications in all Service used/approved languages as outlined in this IRM.

2.5.3.1.2  (01-01-2004)
Definitions

  1. Exhibit 2.5.3-1 defines terms used in this Internal Revenue Manual.

2.5.3.1.3  (01-01-2004)
Affected Personnel

  1. The controls established in this Internal Revenue Manual (IRM) apply to Service personnel responsible for developing or maintaining the Service's application systems or software applications, identified in the IRS Enterprise Architecture. Service personnel who contract for development or maintenance of these systems/software applications shall ensure contracts comply with these controls.

  2. These controls apply to all organizations, organizational units, and projects responsible for developing or maintaining the Service's application systems or software applications, identified in the IRS Enterprise Architecture.

2.5.3.1.4  (07-01-2006)
References

  1. As supplemental references on the development of maintainable, portable, reliable software applications, the following documents are recommended:

    • The Elements of Programming Style, ISBN: 0070342075, Brian W. Kernighan and P. J. Plauger

    • IRM 2.5.12 - Design Techniques and Deliverables, provides comprehensive standards and guidelines regarding structure charts and module specifications

    • Assembler Language Programming, ISBN: 0-471-88657-2, Nancy Stern, Alden Sager and Robert A. Stein

    • Structured COBOL Programming, ISBN: 0-471-29987-1, Nancy Stern and Robert A. Stern

    • The Elements of C Programming Style, ISBN 0070512787, Jay Ranade and Alan Nash

    • IRM 2.5.2 Software System Testing

    • IRS Enterprise Architecture at http://irsprime.web.irs.gov/IRSEA/default.htm

    • IRS Document 12384, C++ Programming Standards

    • Code Conventions for the Java Programming Language, Sun Microsystems, Inc.

2.5.3.1.5  (06-01-2002)
Waivers

  1. IRM 2.5.1 Systems Development documents the waiver process.

2.5.3.2  (06-30-2004)
General Programming

  1. This section of the IRM presents a variety of standards and guidelines to be applied to application program development and documentation efforts. Weigh these standards and guidelines against platform-specific and language-specific idiosyncrasies.

  2. The objective of this section is to promote the development of programs that are reliable, modular, easily maintainable, and as portable as possible.

  3. New software tools for application development and decision support may supplement and/or replace traditional design and programming techniques. Commercially acquired software packages may reduce development time by eliminating "detailed" design and programming activities. Off-the-shelf software packages should be carefully considered before the decision is made to develop software.

  4. The scope of this directive is Servicewide. This includes software developed by contractors. Where guidelines are specific to Assembler Language, COBOL, C Language, C++, or Java programming, they shall be followed for programs written in that language.

  5. This directive is designed for use by application software developers and programmers/project developers and contractors who are responsible for development of source code and who must ensure that the proper documentation is sent to the implementation/test site.

2.5.3.2.1  (06-01-2002)
Goals

  1. The primary goal of structured programming is to produce working programs that are modular, accurate, and self-documenting, so that they are easy to read and maintain by someone other than the original author.

  2. Structured programming includes the following activities:

    • Developing specifications for the logic of each module;

    • Writing structured code to implement the logic of the module; and

    • Using a structured testing methodology that gradually creates a working program as each module is introduced into the application system.

2.5.3.2.2  (01-01-2004)
Basic Principles

  1. Structured programming employs the use of limited syntax (constructs) for source code, single-entry/single-exit modules, and top-down development.

  2. Base the logic of each module on various combinations of control structures. The three basic constructs are Sequence, Selection (If-Then-Else), and Repetition (Do-While)/(Test-First). Two optional constructs include Repetition (Do-Until)/(Test-Last) and Selection (Case).

  3. Exhibit 2.5.3-4 depicts a flowchart and Structure diagram for each construct. The actual implementation of these structures will vary according to the requirements of the particular language being used.

  4. Ensure that each module has only one entry point to and one exit point from the module.

  5. Partition and organize each module, program, and application system into a hierarchical structure. Structure charts, and therefore module specifications and structured code, are part of the design of a system.
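    As a minimal sketch of the principles above, the hypothetical Python function below combines the three basic constructs in a module with one entry point and one exit point. The function name, its parameters, and the data are illustrative assumptions, not part of this IRM.

```python
def total_large_amounts(amounts, threshold):
    """Sum the amounts that exceed threshold (hypothetical example module)."""
    total = 0                       # Sequence: statements run one after another
    index = 0
    while index < len(amounts):     # Repetition (Do-While): the test comes first
        amount = amounts[index]
        if amount > threshold:      # Selection (If-Then-Else)
            total += amount
        else:
            pass                    # nothing to add for small amounts
        index += 1
    return total                    # the single exit point of the module
```

    The explicit index and the empty else branch are kept only to make each construct visible; the actual form of each construct varies by language, as noted above.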

2.5.3.2.3  (01-01-2004)
Design Specifications

  1. Various tools are commonly used to communicate and transition design specifications to source code. These tools are:

    1. structure charts

    2. module specifications

    IRM 2.5.12 - Design Techniques and Deliverables, provides comprehensive standards and guidelines regarding structure charts and module specifications during design.

2.5.3.2.4  (06-01-2002)
Documenting, Testing, and Debugging Source Code

  1. This section addresses services that should be performed regardless of the language or platform selected.

2.5.3.2.4.1  (01-01-2004)
Documenting Code

  1. Document each module and paragraph for future modifications at the time of writing or review/use/modification of the code. Document the changed module and the changes.

  2. Ensure that all source code is well documented, clear, understandable, and easy to modify and maintain.

  3. Make each module a small block of source code that does not exceed one page of printed output (exclusive of comments).

  4. Indent source code statements.

2.5.3.2.4.2  (01-01-2004)
Testing and Debugging Code

  1. Review, analyze and test the code for consistency, correctness, clarity and completeness according to IRS coding standards.

  2. Test the software according to IRM 2.5.2 Software Systems Testing.

2.5.3.2.5  (01-01-2004)
Selecting Programming Languages

  1. For new projects, select the programming language based on IRS Enterprise Architecture requirements.

2.5.3.2.6  (01-01-2004)
Data Controls

  1. Data control means knowing the purpose for which a variable has been defined (i.e., its type) and which functions of the program access or modify that variable.

  2. This subsection provides general guidelines for developing data controls and examples of data control types that are often used in system development. This is not an all-inclusive list of controls, but rather a general framework for control development.

  3. Each variable should have one and only one purpose.

  4. Variable scope (i.e., the set of program functions which can access the variable) should be apparent and limited.

  5. Variables should only be "global" as necessary. Only functions which require a variable should have access to it.

  6. Data controls permit an operating entity to verify that the correct operations have been performed, in the correct manner, with the correct data.

  7. Ensure that data control considerations form an integral part of the design process.
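    The guidance above on single-purpose variables and limited scope can be illustrated with a short sketch. The function names and the record layout (a dictionary with a "status" key) are hypothetical.

```python
def count_valid_records(records):
    """Count records whose 'status' field is 'VALID' (hypothetical layout)."""
    valid_count = 0          # one purpose only: the number of valid records
    for record in records:
        if record["status"] == "VALID":
            valid_count += 1
    return valid_count       # the caller receives the value, not the variable


def summarize(records):
    """Build a summary without touching the counter used above."""
    # valid_count is local to count_valid_records, so it cannot be read or
    # modified here; only the returned result is visible.
    return {"total": len(records), "valid": count_valid_records(records)}
```

    Because the counter is local rather than global, only the one function that requires it can change it.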

2.5.3.2.6.1  (06-01-2002)
Basic Principles of Data Controls

  1. Controls refer to the manual and automated measures employed to:

    • Preserve the accuracy of data by detecting and/or preventing operator errors.

    • Ensure that no data is lost or added, by monitoring balances between processes.

    • Ensure data integrity so programs do not inadvertently change the values of data.

    • Permit the proper recovery/reconstruction of file data after a system failure or abnormal termination.

    • Safeguard sensitive data to prevent unauthorized access, embezzlement, and other breaches of security.

2.5.3.2.6.2  (06-01-2002)
Programming Considerations for Data Controls

  1. Integrate controls into the development effort. The types of controls and the amount of detail are dependent upon the size and complexity of the application system.

  2. Weigh each development effort based on the following operational considerations:

    • the amount of operator intervention;

    • multi-file/multi-cartridge processing;

    • checkpoint/restart capability;

    • the file ID on all internal reports;

    • back-up of control file;

    • initialization of working storage and output buffers with spaces and zeros; and

    • run to run balancing

2.5.3.2.6.3  (06-01-2002)
Data Controls

  1. Place controls as close as possible to the source of the data (e.g., verification of data immediately after it is entered; block balancing before data is released to update modules, etc.).

  2. Automate controls whenever possible.

  3. Keep controls simple to read and balance, and easy to maintain.

  4. Explain the purpose and use of controls. Describe how the totals were derived.

  5. Record counts must be provided and broken down into logical records for each run.

2.5.3.2.6.4  (06-01-2002)
Internal Data Controls

  1. Internal controls are balancing procedures developed to verify the validity of the processing within a run. Internal controls are usually a response to user requirements for accuracy, completeness and security within an information system. Segment these controls into three classes:

    1. controls over input,

    2. controls over processing, and

    3. controls over output.

2.5.3.2.6.4.1  (06-01-2002)
Input Controls

  1. Input controls are the most important and the most numerous. Most errors are generated during input processing. Some common techniques are:

    • Check digit verification--Use check digits to review the accuracy of specific fields. For example, a check digit can help determine whether an account number is valid.

    • Consistency tests--If the application permits it, verify accuracy by comparing the values of various fields to determine whether the combinations make sense. For example, if the "Country" field indicates that the record concerns an organization in Canada, the "Postal Code" field should have a specific alphanumeric format.

    • Validity tests--In some cases, fields can take only a limited range of values, or must have a predetermined format. Matching the actual value to the allowable values will detect errors. For example, if a field is supposed to contain a valid U.S. postal abbreviation for a state, "AZ" would be valid but "A2" would not.

    • Batch numbering--This technique ensures that transactions are not lost. Processor checks can be made to assure that all transactions are accounted for and processed in a logical order.

    • Control totals--These totals help avoid errors during data entry. Various input fields (e.g., check amount or quantity received) are added both manually and automatically for comparison. In some cases, these totals are developed for fields that would normally not be added (e.g., account numbers or social security numbers). These are called hash totals. In either case, both the expected totals and the individual transactions are passed to the application system. The application system then recalculates the totals from the individual records received and compares them to the expected totals. If they don't match, an error has been detected.

    • Transaction counts--Use this method to keep track of the number of transactions that should have been processed by the application system.
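    Two of the techniques above can be sketched as follows. The check digit function uses the Luhn algorithm, which is an assumption chosen for illustration; this IRM does not name a check digit scheme. The transaction layout (a dictionary with "account" and "amount" keys) is also hypothetical.

```python
from decimal import Decimal


def luhn_check_digit_ok(number):
    """Return True if the digit string passes the Luhn check (assumed scheme)."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def batch_in_balance(transactions, expected_count, expected_amount, expected_hash):
    """Recalculate the batch totals and compare them to the expected totals."""
    actual_count = len(transactions)                       # transaction count
    actual_amount = sum((t["amount"] for t in transactions), Decimal("0"))
    # Hash total: a sum of account numbers, meaningful only for comparison.
    actual_hash = sum(int(t["account"]) for t in transactions)
    return (actual_count == expected_count
            and actual_amount == expected_amount
            and actual_hash == expected_hash)
```

    Decimal is used for the money amounts so that the recalculated control total compares exactly with the expected total.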

2.5.3.2.6.4.2  (06-01-2002)
Processing Controls

  1. There are two major types of processing controls:

    1. Run to Run

    2. File and Operator

  2. Run to run controls consist of data generation controls and verification controls.

    • Use data generation controls to ensure that the correct version of the file is being used.

    • Verification controls ensure that the totals or record counts for the prior run match the opening totals for the current run (e.g., header/trailer counts).
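    A verification control of this kind can be sketched as a comparison between the trailer count written by the prior run and the number of records actually read by the current run. The function name and the choice to raise an exception on imbalance are illustrative assumptions.

```python
def verify_run_to_run(prior_trailer_count, records):
    """Compare the prior run's trailer count with the records read in this run."""
    accumulated = 0
    for _ in records:
        accumulated += 1
    if accumulated != prior_trailer_count:
        # An imbalance means records were lost or added between runs.
        raise ValueError(
            f"run-to-run imbalance: trailer={prior_trailer_count}, "
            f"read={accumulated}"
        )
    return accumulated
```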

  3. File and operator controls are actions that the operator can take to ensure that the application system is processing the right files and data. The controls can be as simple as checking a cartridge. Operator intervention should be kept to a minimum. Operator controls should be very specific and should be accompanied by sufficient operator instruction. For example, the operator might receive the following message on the console: CARTRIDGE LABEL ERROR Enter "R" to retry, "N" to abort, "A" to accept. The operator should not be able to override this message.

2.5.3.2.6.4.3  (06-01-2002)
Output Controls

  1. There are three types of output controls:

    1. Control Totals

    2. Verification Controls

    3. Distribution Controls

  2. Use control totals to verify the correctness of the outputs. For example, if an accounts payable application system generates 236 checks with an expected value of $395,000.12, the check amounts could be added manually to verify that the checks actually generated total the expected value.

  3. Use verification controls to coordinate internal and external processes. For example, to avoid unauthorized loss of blank checks, have the computer keep track of the expected serial numbers of the preprinted checks and print the expected number on the check. If the two numbers differ, something is wrong.

  4. Use distribution controls to ensure that once an output is printed, it is delivered to the authorized recipients. This includes having users sign for reports on-site, as well as controls between sites.

2.5.3.2.6.5  (06-01-2002)
External Data Controls

  1. External controls consist of that information necessary for operations personnel to perform balancing between and within runs. These controls are manual in nature and should include precise instructions as to:

    • Which output listing/file contains the control data;

    • What type of control data is being generated (e.g., transaction counts, hash totals, etc.);

    • How to balance the various elements of control data (e.g., ITEM 1 + ITEM 2 = ITEM 3).

  2. Accumulate and print controls at the end of processing to include, at a minimum:

    • Counts of total inputs and outputs;

    • Balancing counts;

    • Information counts;

    • Run to run counts; and

    • Counts of generated, dropped, and error records.
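    The end-of-run counts listed above could be accumulated and printed along the following lines. The field names, the report layout, and the balancing rule (records read plus records generated must equal records written, dropped, or rejected) are assumptions made for this sketch; each application defines its own balancing instructions.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """End-of-run control counts (field names are hypothetical)."""
    inputs: int = 0
    generated: int = 0
    outputs: int = 0
    dropped: int = 0
    errors: int = 0

    def in_balance(self):
        # Assumed balancing rule: every record read or generated must be
        # written, dropped, or rejected as an error.
        accounted_for = self.outputs + self.dropped + self.errors
        return self.inputs + self.generated == accounted_for

    def report_lines(self):
        # Totals are reported on every run, even when they balance.
        status = "IN BALANCE" if self.in_balance() else "OUT OF BALANCE"
        return [
            f"INPUT RECORDS     {self.inputs:>8}",
            f"GENERATED RECORDS {self.generated:>8}",
            f"OUTPUT RECORDS    {self.outputs:>8}",
            f"DROPPED RECORDS   {self.dropped:>8}",
            f"ERROR RECORDS     {self.errors:>8}",
            f"STATUS            {status}",
        ]
```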

2.5.3.2.6.5.1  (01-01-2004)
Control Totals

  1. Keep a record of the data as it moves through an application system and is subjected to a series of manual and automated processes. This can be accomplished in two ways:

    1. Control Totals

    2. Control File

  2. Control totals can be embedded in the process itself. This is not the best approach since these totals are easily modified.

  3. A separate, highly controlled (limited user access) "control file" is very effective in that it is not as accessible as the data files. This file should include the following:

    • Block and/or record counts, hash totals, and total counts;

    • Logical record counts, if possible, when they differ from tape-record counts;

    • Controls on money amount fields (cumulative arithmetic totals);

    • Adequate controls to account for all records, including those dropped, by-passed or combined during processing.

2.5.3.2.6.5.2  (06-01-2002)
Intra-Run Controls

  1. Intra-run controls generate and/or present control information to operations personnel during the execution of the run.

  2. When designing a program, limit the amount of intervention required by operations personnel. As this is not always possible, consider the following ideas when developing intra-run controls:

    • Enable the run to print all operationally controlled parameters used for the run.

    • Stack all control data to a separate tape/disk file and print at the end of the job. Don't clutter the console with control information during processing.

    • Print totals for each run, every time, even when the totals are in balance.

2.5.3.2.6.6  (06-01-2002)
Including Data Controls

  1. Include computer generated control lists with record counts by file, file number and name, money amounts and tape/disk numbers.

  2. Make sure that programs generate identifying information on all internally used output (e.g., reports). The project/run/file ID will be printed on each page of printed output. Do not print this information on transcripts, taxpayer letters and notices, and externally distributed reports.

  3. Include instructions for manually processing the control list.

  4. Include computer generated cartridge numbers on all control lists:

    • Print cartridge file ID on controls page (from job number next to the corresponding count).

  5. Provide computer generated hard copy control output for all runs.

  6. List all control features in the user handbook and/or the Computer Operators Handbook (COH), explaining:

    • The purpose and use of each control;

    • How they were derived and their meaning; and

    • The cause and meaning of all programmed halts.

  7. Assign a unique identifier to each cartridge file.

  8. When processing a multi-reel program that also has multi-file input, use halts at the end of each file if the accumulated counts are not equal to the record count in a trailer record.

  9. Institute checkpoint/restart capabilities for any application with estimated or actual run times that exceed one hour normal processing time as well as for large programs that process extensive amounts of data.
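    A minimal sketch of checkpoint/restart follows. It assumes the records are held in a list, that the checkpoint is a small JSON file holding the position of the next record to process, and that reprocessing the few records handled after the last checkpoint is acceptable. The function name, file format, and default interval are all hypothetical.

```python
import json
import os


def process_with_checkpoints(records, checkpoint_path, handle, interval=1000):
    """Process records, saving the next record position every `interval` records."""
    start = 0
    if os.path.exists(checkpoint_path):
        # Restart: resume from the position saved by the interrupted run.
        with open(checkpoint_path, "r", encoding="ascii") as f:
            start = json.load(f)["next_record"]
    for position in range(start, len(records)):
        handle(records[position])
        if (position + 1) % interval == 0:
            with open(checkpoint_path, "w", encoding="ascii") as f:
                json.dump({"next_record": position + 1}, f)
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)   # normal end of job: no restart needed
    return len(records) - start      # number of records handled in this run
```

    After an abnormal termination, running the same job again picks up at the last checkpoint instead of at the first record.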

2.5.3.2.7  (01-01-2004)
File Design and Cartridge Interface Formats

  1. This subsection addresses file design and cartridge interface format considerations.

2.5.3.2.7.1  (06-01-2002)
File Design Formats

  1. The following sections include the design of the sequential file and logical data record formats. They are concerned with the association or grouping of the data elements into groups and records.

2.5.3.2.7.1.1  (06-01-2002)
Record Format Design

  1. Fixed length records--a file composed of records that are all the same length.

  2. Variable-length records/multiple fixed formats--a file composed of a finite number of fixed length record sets, where the record lengths within any set are equal, but the record lengths between sets differ.

  3. Variable-length records/variable subscripted format--a file composed of one or more sets of records whose format consists of a fixed portion followed by a variable number of repeating groups. These groups must either be fixed in length, or composed of a fixed portion plus a subgroup whose entries are fixed in length.

  4. Variable-length records/variable string format--a file composed of records consisting of character strings of unspecified lengths.
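    To make the variable subscripted format concrete, the sketch below splits one record into its fixed portion and its repeating groups. The layout (an 8-character key, a 2-digit group count, then that many 6-character groups) is invented for illustration and does not describe any Service file.

```python
def parse_subscripted_record(record):
    """Split a record into its key and its fixed-length repeating groups.

    Hypothetical layout: 8-character key, 2-digit group count, then that
    many 6-character groups.
    """
    key = record[0:8]
    group_count = int(record[8:10])
    body = record[10:]
    if len(body) != group_count * 6:
        raise ValueError("record length does not match its group count")
    groups = [body[i:i + 6] for i in range(0, len(body), 6)]
    return key, groups
```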

2.5.3.2.7.1.2  (06-01-2002)
Defining Data Fields

  1. When defining the data fields that will compose a file, do not assign more than one meaning to a field (e.g., if a field is labeled DATE, the values carried by that field should be date information in all cases).

  2. Specify all the search key fields, and if possible, place them at the beginning of the record.

  3. Reduce redundant data fields to the minimum.

  4. Specify sensitivity levels for files. Classify all the sensitive data fields that require authorization for access.

  5. Restrict data fields to one and only one data item. This is a very important standard to enforce.

  6. The name should comply with IRM 2.5.7 Data Naming Standards, including the following characteristics: a) the item should be easily defined; b) the name should reflect and be specific to what is in the field (e.g., IRS-Mailing-Dt); and c) data names should end in a class word indicating the data type.

  7. Data should not be intermixed (the same field should never be used for multiple types of data).

2.5.3.2.7.1.3  (06-01-2002)
File Design

  1. Use "Fixed" and "variable multiple fixed" formats when possible.

  2. Avoid variable length records/variable subscripted format (i.e., Nth dimensional groups, where N is greater than 2).

  3. Do not use variable string formats.

2.5.3.2.7.2  (01-01-2004)
Tape Interface

  1. Tape interface standards reduce the difficulty of sharing data between different users and different application systems. They allow the users to consider only the logical structure of files, and simplify the transporting and maintenance of data.

  2. All files created on an application system to be processed on another will:

    • Contain only ASCII character data;

    • Be in either Fixed or Variable format; and

    • Carry signs (+ or -) as a separate, leading ASCII character for signed numeric data fields. The reason for carrying signs separately is that ANSI otherwise leaves the method of signing as an implementor option; therefore, consistency of embedded signing between application systems should never be assumed.

  3. All files that are passed between application systems will be limited to 9995 characters per record.

  4. Record lengths (for variable records) consist of four decimal (ASCII) characters in the Record Control Word (RCW). The RCW is automatically generated by the application system and precedes each logical record.
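    The interface rules above can be sketched as follows. In practice the RCW is generated by the application system; the code only illustrates the layout. It assumes that the RCW value counts its own four characters as well as the data, as in ANSI variable-length tape records; this IRM does not state that detail. The function names are hypothetical.

```python
MAX_RECORD_CHARS = 9995   # limit stated above for files passed between systems
RCW_LENGTH = 4            # four decimal ASCII characters


def format_signed(value, digits):
    """Render an integer with a separate leading sign and zero-filled digits."""
    sign = "-" if value < 0 else "+"
    text = f"{abs(value):0{digits}d}"
    if len(text) > digits:
        raise ValueError("value does not fit in the field")
    return sign + text


def add_rcw(record):
    """Prefix a logical record with its Record Control Word.

    Assumption: the RCW value includes its own four characters.
    """
    record.encode("ascii")    # raises UnicodeEncodeError for non-ASCII data
    if len(record) > MAX_RECORD_CHARS:
        raise ValueError("record exceeds 9995 characters")
    return f"{len(record) + RCW_LENGTH:04d}" + record


def split_records(block):
    """Split a string of RCW-prefixed records back into logical records."""
    records = []
    position = 0
    while position < len(block):
        length = int(block[position:position + RCW_LENGTH])
        if length < RCW_LENGTH:
            raise ValueError("invalid RCW")
        records.append(block[position + RCW_LENGTH:position + length])
        position += length
    return records
```

    Under this assumption a 9995-character record produces an RCW of 9999, the largest value that fits in four decimal characters.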

2.5.3.2.8  (04-15-2004)
Date Fields

  1. This subsection pertains to date fields and addresses the following topics:

    1. Year

    2. Date

    3. Gregorian Dates

    4. Exceptions

2.5.3.2.8.1  (04-15-2004)
Year

  1. Format, output and represent all year fields as YYYY.

2.5.3.2.8.2  (04-15-2004)
Date

  1. Do not store non-date values in DATE fields (i.e., indicators, freeze codes).

  2. Do not use any DATE field to store non-date information, as in the case of moving all 9's to a field as an indicator of a particular status.

  3. Do not store special characters in any DATE fields.

  4. Make DATE field names meaningful and accurately descriptive of the date stored in the fields (e.g., BIRTH-DATE).

  5. Add validity checks for DATE fields entered on screens or at their initial entry point into Service Application Systems. This includes External Trading Partners Processing.

  6. Externalize literal usage of dates wherever possible. For example, interest rates that apply to certain date ranges would be established as a data file or database table rather than being hard-coded in the program. If at all possible, eliminate hard-coded dates.

  7. Use system-wide standard DATE routines (either IRS-developed or COTS) in source code, wherever possible.
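    The interest rate example above can be sketched as a lookup against an external table instead of dates hard-coded in program logic. The embedded text stands in for a data file or database table; its column names, date ranges, and rates are made-up placeholders, not actual interest rates.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Stand-in for an external data file or database table. The date ranges
# and rates are made-up placeholders, not actual interest rates.
RATE_TABLE_CSV = """start_date,end_date,rate
20040101,20040331,0.04
20040401,20040630,0.05
"""


def load_rate_table(text):
    """Read (start, end, rate) rows whose dates are in YYYYMMDD format."""
    table = []
    for row in csv.DictReader(io.StringIO(text)):
        start = datetime.strptime(row["start_date"], "%Y%m%d").date()
        end = datetime.strptime(row["end_date"], "%Y%m%d").date()
        table.append((start, end, Decimal(row["rate"])))
    return table


def rate_for(table, when):
    """Return the rate whose date range contains `when`."""
    for start, end, rate in table:
        if start <= when <= end:
            return rate
    raise LookupError(f"no rate on file for {when:%Y%m%d}")
```

    With this arrangement a new rate period is added by changing the table, not the program.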

2.5.3.2.8.3  (04-15-2004)
Gregorian Dates

  1. All Gregorian dates must be in YYYYMMDD format.
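    A validity check for a date field at its point of entry might look like the sketch below. The function name is hypothetical, and the check relies on the Python standard library rather than on any Service-wide DATE routine.

```python
from datetime import datetime


def is_valid_yyyymmdd(text):
    """Return True if text is a real Gregorian date in YYYYMMDD format."""
    # Require exactly eight ASCII digits, which rejects blanks, special
    # characters, and short or long values before any parsing is done.
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        # Impossible dates, including all-9 filler values, end up here.
        return False
    return True
```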

2.5.3.2.8.4  (04-15-2004)
Exceptions

  1. Archive data no longer included in regularly scheduled processing need not be converted.

  2. Transmittal numbers and data set names (including File Names) containing dates need not be converted.

2.5.3.3  (01-01-2004)
COBOL Programming

  1. This subsection establishes controls to ensure COBOL programs are reliable, maintainable, and portable.

2.5.3.3.1  (01-01-2004)
Scope

  1. The controls prescribed are applicable to all IRS COBOL programs whether they are developed by the IRS or outside vendors for the IRS.

2.5.3.3.2  (11-26-2001)
Basic Principles

  1. The development of structured COBOL programs in accordance with this section is dependent on structured design.

  2. Structured COBOL code is the implementation of the logic depicted in module specifications. Module specifications directly correspond to the modules shown on the structure chart.

  3. Structure charts, and therefore module specifications and structured code, are based on a top-down design of the application system. Each of the modules that constitute a structure chart should have a single entry point and a single exit point. The logic of each of the modules is based on various combinations of the three control structures: sequence, selection, and iteration.

  4. These principles have been established with the understanding that COBOL programs are not always maintained by the original author. All structured programs will have the same visual format. Only the most common formats are discussed.

2.5.3.3.3  (11-26-2001)
Structured Programming

  1. Structured programming is comprised of three logical structures:

    1. sequence;

    2. selection; and

    3. iteration

  2. Sequence structure--In a sequential structure, the commands are executed in sequence. The flow of the program is to complete one instruction and then drop down and execute the next instruction and then the next until something terminates the sequence such as the end of a paragraph.

  3. Selection structure--In a selection structure the processing is dependent on a condition that is being tested. In COBOL, the selection structure is usually accomplished with an IF or an EVALUATE (the implementation of the case structure in COBOL) or with an implied IF such as the AT END clause in the READ statement.

  4. Iteration structure (loop structure)--The iteration structure causes something to be executed over and over again until some condition terminates the repetition.

  5. This structure is essentially the looping structure that has been used in all of the programs.

  6. When defining iteration, there are two basic structures that a language may implement: Do-While and Do-Until.

  7. The difference between the two structures is when the condition is tested. In the Do-While structure the condition is tested before the loop is executed while in the Do-Until structure the condition is tested after the loop has been executed. This means that with the Do-While structure there is a possibility that the loop will never be executed.

  8. The PERFORM...UNTIL used in the sample programs is an example of the Do-While structure because the condition is tested before the loop is executed.

  9. Use meaningful names. Ensure names conform to IRM 2.5.7 Data Naming Standards.
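    The difference between the two iteration structures described above shows up when the condition is already satisfied on entry. The sketch below uses Python only for illustration; Python has no built-in test-last loop, so the second function emulates one. Both function names are hypothetical.

```python
def count_passes_test_first(limit):
    """Do-While: the condition is checked before each pass through the loop."""
    passes = 0
    while passes < limit:
        passes += 1
    return passes


def count_passes_test_last(limit):
    """Do-Until: the body runs once before the condition is first checked."""
    passes = 0
    while True:
        passes += 1
        if passes >= limit:
            break
    return passes
```

    With a limit of zero the test-first loop never executes its body, while the test-last loop executes it once.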

2.5.3.3.3.1  (11-26-2001)
General Programming

  1. This section applies to all divisions of a COBOL program.

  2. COBOL programs shall be written in accordance with the American National Standards Institute (ANSI) COBOL standard. Where a standard is not specified in this manual, the relevant ANSI standard will be considered the established standard.

  3. Insert a comment line at the beginning of each of the following areas:

    1. IDENTIFICATION DIVISION

    2. DATA DIVISION

    3. WORKING-STORAGE SECTION

    4. LINKAGE SECTION

    5. PROCEDURE DIVISION

    6. Any section within the PROCEDURE DIVISION.

    7. Any paragraph/section that represents a Structure Chart module within the PROCEDURE DIVISION.

  4. Place division, section, and paragraph names on a line by themselves and start in column 8. This also applies to the module names corresponding to structure chart modules.

  5. Insert a blank line between each Division name and the first statement of the Division.

  6. Insert a blank line between each Section name and the first statement of the Section.

  7. Do not split names or words between lines. If possible, avoid splitting literals between lines.

  8. Only one statement per line is allowed.

  9. With the exception of nested IF constructs, end each statement with a period.

  10. Indent statements that are continued on another line at least two spaces from the starting position of the initial line.

  11. Use blank lines and page ejects effectively.

  12. Use meaningful names. Ensure names conform to IRM 2.5.7 Data Naming Standards.

2.5.3.3.3.2  (11-26-2001)
Identification Division

  1. Include the following paragraphs in the IDENTIFICATION DIVISION of all programs: AUTHOR, INSTALLATION, SECURITY, and REMARKS. When necessary, they will be annotated as COBOL comments.

  2. The AUTHOR paragraph will include:

    • The name and office symbols of the section(s) responsible for the maintenance of the program.

    • At a minimum, the name of the last programmer/analyst to write or modify any of the code of the program.

    • It is a good practice to retain the names of the last few authors to allow quicker access to originators of code if problems arise.

  3. The INSTALLATION paragraph will contain "INTERNAL REVENUE SERVICE".

  4. The SECURITY paragraph will contain "FOR OFFICIAL USE ONLY".

  5. The REMARKS paragraph will describe the function of the program, the subprograms that are called, the files that are used by the program, and the effective date. At the developer's option, this paragraph may also list modified modules and reasons for modifications after the program has been in production. (This often leads to quicker resolution of problems.)

2.5.3.3.3.3  (11-26-2001)
Environment Division

  1. Start each main clause (for example, the SELECT clause) in column 12.

  2. Start each sub-clause in column 16.

2.5.3.3.3.4  (11-26-2001)
Data Division

  1. Start FD and 01 entries in column 8. Start clauses of FD entries in column 12, one clause per line.

  2. Put level numbers in sequential order to reflect the logical levels of the structure (i.e., 01, 02, 03 rather than 01, 05, 10, 15, etc.).

  3. Indent level numbers 4 positions for each subordinate level.

  4. Indent data names (including condition names) 2 columns to the right of the level number. For example, see the following figure.

    Figure 2.5.3-1

    02 NO-MORE-MASTERS-FLAG PIC X(5).
      88 NO-MORE-MASTERS   VALUE "TRUE".
      88 MORE-MASTERS   VALUE "FALSE".

  5. Do not use 77 level entries in the DATA DIVISION.

  6. Start all PIC, VALUE, USAGE, OCCURS, and REDEFINES clauses in the same column, where possible.

  7. PIC clauses must not contain sequences of more than two identical symbols (except for edited fields). For example, use PIC X(4) rather than PIC XXXX. An edited field such as PIC ZZ,ZZZ.99 will be allowed.

  8. Do not group program flags, indexes, constants, etc., by class, under one 01 level. For example, in the following figure, the code would be prohibited.

    Figure 2.5.3-2

    01 PROGRAM-FLAGS
      02 NO-MORE-MASTERS-FLAG
      02 MORE-UPDATES-FLAG

  9. Ensure that local variables and constants associated with one module immediately follow each other in the DATA DIVISION. If a variable or constant is associated with more than one module, it should usually be defined with the highest level module that references it.

  10. Do not give flags and indexes multiple uses.

  11. Initialize constants, variables and output record areas by using the INITIALIZE statement or as follows:

    • Initialize constants in working storage, including FILLER fields, with a VALUE clause. Use VALUE SPACES or ZEROS, not "b" or "0".

    • Initialize the variables in working storage (i.e., those fields that are changed during execution of the program) by using specific statements in the PROCEDURE DIVISION.

    • Initialize output record areas to clear buffers that are not overlaid during program execution. One way to initialize an output record area is to move SPACES to the record as a group item, and then move ZEROS to the numeric fields.

  12. Use meaningful data names derived from the problem being solved. Where applicable, data names should be consistent with those used in the structure charts. COBOL allows names of up to 30 characters. Ensure that names conform with IRM 2.5.7 Data Naming Standards.

  13. Avoid data names that convey little meaning. For example, see the following figure.

    Figure 2.5.3-3

      INDEX, I, K, TR127, COUNTER, EOT.  

  14. Use data names that convey meaning. For example, see the following figure.

    Figure 2.5.3-4

      MESSAGE-INDEX  
      TRANSACTION-COUNT  
      NO-MORE-TRANSACTIONS-FLAG  

  15. Apply the PIC 9 versus PIC X standard to date fields in the following manner:

    • Use PIC 9 for date fields in situations where the particular date value in question will be used for numeric functions (e.g., calculations, computations, estimations, etc.) rather than for accepting input or direct display. Assign four positions to the year field (YYYY) and do not store non-date values or special characters in the date field.

    • Define the data as necessary (PIC X, PIC 9, PIC S9, or another format) in order to accommodate input that may be blank, coming from electronic files, External Trading Partners (e.g., SSA), taxpayer submitted files, DB2 special formats, unique database machine formats, or other formats. It is not necessary to use PIC 9 when defining fields that are accepting input.

    • Define the data as necessary (PIC X, PIC 9, PIC S9, or another format) in order to display data (e.g., reports, screens) or format/unformat display dates that contain special characters in edit fields (e.g., slashes, commas, dashes, etc., depending on the function requested). However, note that the Year Field must be four positions (YYYY). It is not necessary to use PIC 9 when displaying data.

2.5.3.3.3.5  (11-26-2001)
Procedure Division

  1. A functional module should be limited to 50 lines of executable code as a general rule.

  2. Each module (not each paragraph within a module) must start on a new page with comment lines indicating the module number of the Structure Chart that is represented by the code and the function of the module as described on the Module Specification. For example, see the following figure.

    Figure 2.5.3-5

    /**   MODULE 2.4.3.7
    *
    * (Description of module)
    *
    GET-VALID-TRANSACTION.

      READ TRANSACTION-FILE
        AT END
          MOVE "TRUE" TO NO-MORE-TRANSACTIONS-FLAG.

          --rest of code--

  3. Module names in a COBOL listing must correspond to Structure Chart module names.

  4. Name a paragraph that is an implementation of an N-S control structure (e.g., PERFORM-UNTIL, ELSE, CASE, nested IF-THEN-DO-UNTIL, etc.) in a way that explains its purpose. These paragraphs are not separate modules; they are paragraphs within the module.

  5. Arrange modules in a program listing in either a horizontal or a vertical sequence corresponding to the Structure Chart level numbers. For example, see the following figure.

    Figure 2.5.3-6

    Horizontal Module Arrangement   Vertical Module Arrangement
      0.0                             0.0
      1.0                             1.0
      2.0                             1.1
      3.0                             1.2
      1.1                             2.0
      1.2                             2.1
      2.1                             2.1.1
      2.2                             2.1.2
      2.3                             2.1.2.1
      2.1.1                           2.1.2.2
      2.1.2                           2.1.3
      2.1.3                           2.2
      2.2.1                           2.2.1
      2.2.2                           2.2.2
      2.1.2.1                         2.3
      2.1.2.2                         3.0
      etc.                            etc.

  6. Code the READ statement and WRITE statement options one per line, indented 2 columns. See the following figure.

    Figure 2.5.3-7

    READ file-name                WRITE record-name
      AT END                        AFTER ADVANCING identifier LINES
        statements.                   statement.

    or,                           or,

    READ file-name                WRITE record-name
      INVALID KEY                   INVALID KEY
        statements.                   statement.

  7. Place phrases such as AT END, WHEN, and VARYING on the next line indented two columns.

  8. Begin any statement not covered by other indentation rules in the same column as the statement above it.

  9. Never use the ALTER verb (or any other method of dynamically altering the PROCEDURE DIVISION).

  10. Do not use the GO TO verb, except in the implementation of the CASE construct or in the use of internal SORT exits.

  11. Handle all file openings and closings in any given module with one OPEN or CLOSE statement. The following figure depicts these formats.

    Figure 2.5.3-8

      OPEN INPUT   file-name-1
                   file-name-2
           OUTPUT  file-name-3
                   file-name-4

      CLOSE        file-name-1
                   file-name-2

  12. Immediately follow the MOVE CORRESPONDING statement with a comment documenting all data items involved. This ensures thorough documentation. For example, see the following figure.

    Figure 2.5.3-9

      MOVE CORRESPONDING RECORD-A TO RECORD-B
       
      *FIELD-1, FIELD-3, FIELD-5.

  13. Use the COMPUTE verb for arithmetic operations, with the following exceptions:

    • Use the DIVIDE statement to compute remainders.

    • ADD X TO (counter) and SUBTRACT X FROM (counter) are allowed.

  14. STOP RUN must only occur once as the last logical statement in the main procedure of a program. EXIT PROGRAM may only occur as the last logical statement of the main procedure of a subprogram. EXCEPTION: In some cases, it may be justifiable to use a STOP RUN in a low-level module of a very large run.

  15. Make the logic of the program independent of the physical sequence of paragraphs. A PERFORM statement should not perform more than one paragraph. Explicitly identify paragraphs. For example, two figures follow. The first figure illustrates a statement that would not satisfy this standard. The second figure illustrates a statement that satisfies this standard.

    Figure 2.5.3-10

      PERFORM Paragraph-A THRU Paragraph-B.

    Figure 2.5.3-11

      PERFORM Paragraph-A.
       
      PERFORM Paragraph-B.

  16. The following figure illustrates a statement that is exempt from the preceding standard, because the THRU option is used to implement the SELECT-CASE structure.

    Figure 2.5.3-12

      PERFORM Case-Paragraph
       
      THRU Case-End-Paragraph.

  17. When a module is invoked via a PERFORM statement, represent the parameter table shown on the Structure Chart with comment lines. In the following figure, Parm-3 is both input to and output from Module-X.

    Figure 2.5.3-13

        PERFORM MODULE-X.
      * ** USING: Parm-1,Parm-2,Parm-3
      * ** GIVING: Parm-3,Parm-4,Parm-5

  18. When a module is invoked via a CALL statement, do not list the USING phrase as a comment line, because it is part of the syntax. Group all parameters shown on the Structure Chart Diagram that are passed between the main program and the called module so that all of the input parameters precede the output parameters. Represent the GIVING phrase as a comment line that identifies where the output parameters begin (since COBOL does not make the distinction between input and output parameters). In the following figure, Parm-1 through Parm-5 are all part of the USING phrase, which is code rather than a comment. Any output parameter that is also input to the module, such as Parm-3 in the following figure, appears in the GIVING line as a comment (so that it is not listed twice in the program code).

    Figure 2.5.3-14

        CALL Module-X
          USING Parm-1,Parm-2,Parm-3
      * ** GIVING Parm-3,
            Parm-4,Parm-5.

  19. The PERFORM verb has 5 acceptable formats:

    • PERFORM Paragraph-Name.--This format is used with USING and GIVING comment statements to implement a module call, or without the comments to PERFORM paragraphs within a module (e.g., nested IFs, the body of the PERFORM-UNTIL structure, etc.).

    • PERFORM Select-Case THRU Select-Case-End.--The THRU option of the PERFORM should only be used for implementation of the SELECT-CASE structure.

    • PERFORM Paragraph-Name UNTIL Terminating-Condition.

    • PERFORM Line-Spacing-Paragraph Line-Count TIMES.--This option executes a procedure a set number of times.

    • The PERFORM-UNTIL may also be used to vary a subscript or index as in a table-search routine. See the following figure.

    Figure 2.5.3-15

      PERFORM Table-Search
        VARYING Table-Index
          FROM 1 BY 1
        UNTIL Match-Found
          OR Table-Index GREATER THAN Max-Entries.

  20. Implement the DO-UNTIL structure in one of two ways. The first way is a PERFORM/PERFORM-UNTIL combination. See the following figure.

    Figure 2.5.3-16

      PERFORM Paragraph-Name.
      PERFORM Paragraph-Name
        UNTIL Terminating-Condition.

  21. The second way to implement a DO-UNTIL structure is to use a switch to terminate the loop. See the following figure.

    Figure 2.5.3-17

        MOVE True TO Loop-Predicate.
        PERFORM Paragraph-Name
          UNTIL Loop-Predicate = False.
        ...
        Paragraph-Name.
        . . . Statements . . .
          IF Terminating-Condition
      *     THEN
              MOVE False TO Loop-Predicate.
      *     END-IF

  22. The following figure illustrates the format of the IF-THEN-ELSE statement.

    Figure 2.5.3-18

        IF Condition
      *   THEN
            True-Procedures
          ELSE
            False-Procedures.
      * END-IF

  23. The ELSE part of the IF statement is optional when there are no actions to be taken (i.e., "ELSE NEXT SENTENCE" is not required). The THEN and END-IF comments are required. The True-Procedure and False-Procedure statements are indented 2 spaces from their corresponding THEN or ELSE. The IF and the corresponding END-IF keywords start in the same column. The THEN and ELSE keywords are indented 2 spaces in from the IF. As a guideline, the positive condition (rather than the negative) should be tested in a conditional statement. In a compound conditional statement, negative and positive tests should not be mixed.

  24. When there are compound conditions associated with an IF statement, ensure that the statement is as readable as possible. The best method of doing this depends on the particular condition. The following figures illustrate the two formats.

    Figure 2.5.3-19

      Format 1 – Putting each condition on a separate line:
             
        IF Condition-1
          OR Condition-2
      *   THEN
            True-Procedures
          ELSE
            False-Procedures.
      * END-IF

    Figure 2.5.3-20

      Format 2 – Using parentheses to specify the order of evaluation for the individual conditions of more complex conditions:
             
        IF ((Condition-1) OR (Condition-2))
          AND Condition-3
      *   THEN
            True-Procedures
          ELSE
            False-Procedures.
      * END-IF

  25. Do not nest IF statements more than 3 levels deep. See the following figure.

    Figure 2.5.3-21

        IF Condition-1
      *   THEN
            IF Condition-2
              AND Condition-3
      *       THEN
                True-Procedure-1
              ELSE  
                False-Procedure-1
      *     END-IF
          ELSE
            False-Procedure-2.
      * END-IF

  26. If it appears that the nesting has to be more than 3 levels deep or even if the statement looks "cluttered" at 2 or 3 levels then PERFORM the inner test conditions. See the following figure.

    Figure 2.5.3-22

        IF Condition-1
      *   THEN
            PERFORM Inner-Test
          ELSE
            False-Procedure-2.
      * END-IF
              .
              .
        Inner-Test.
          IF Condition-2
            AND Condition-3
      *     THEN
              True-Procedure-1
            ELSE
              False-Procedure-1.
      *   END-IF

  27. Implement the SELECT-CASE construct in one of two ways:

    1. Nested IFs

    2. GO TO DEPENDING ON

  28. Implement the "Nested IF" of the SELECT-CASE construct as prescribed in the following figure. Note that this format is different from a normal IF-THEN-ELSE statement.

    Figure 2.5.3-23

      * SELECT CASE.
        IF Condition-1
      * CASE-1:
          Case-1-Statements
        ELSE
        IF Condition-2
      * CASE-2:
          Case-2-Statements
        ELSE
        IF Condition-3
      * CASE-3:
          Case-3-Statements
        ELSE
      * Error-CASE:
          Error-Case-Statements.
      * ENDCASE

  29. Implement the GO TO DEPENDING ON of the SELECT-CASE as prescribed in the following figure. Note that the ERROR-CASE statement comes before the valid case paragraphs. Each case has a separate paragraph with a GO TO the EXIT paragraph. This structure is normally executed via a PERFORM Select-Case THRU Select-Case-End statement.

    Figure 2.5.3-24

      SELECT-CASE.
        GO TO
          CASE-1
          CASE-2
            .
            .
            .
          CASE-N
        DEPENDING ON Select-Code.
      ERROR-CASE.
        Error-Statements.
        GO TO SELECT-CASE-END.
      CASE-1.
        Case-1-Statements.
        GO TO SELECT-CASE-END.
      CASE-2.
        Case-2-Statements.
        GO TO SELECT-CASE-END.
      ...
      CASE-N.
        Case-n-Statements.
        GO TO SELECT-CASE-END.
      SELECT-CASE-END.
        EXIT.

2.5.3.4  (06-01-2004)
C Programming

  1. This section of the IRM provides guidelines for coding C programs and naming C program components.

2.5.3.4.1  (06-01-2004)
File Naming

  1. Create the file names from a base name and an optional period and suffix.

  2. Store very large files by date (for archive or delete). Make the date a part of the name (e.g., log files).

  3. Make the first character of the name a letter.

  4. Assign a file name that is unique in as large a context as possible.

  5. Use uppercase and lowercase letters to name source code files.

  6. Include comments in the module so other programmers will understand the module's purpose (i.e., Title Section).

  7. Use System Name followed by file name. For example, the application system is Telefile (or EMS, TEPS, EFDS, EFTPS, etc.) and the file name is ReturnData.

  8. Include comments on the name.

  9. Generally, programs that rely heavily on external libraries, such as GUI programs written using X/Motif, are constrained by the naming conventions employed by these libraries.

2.5.3.4.2  (01-01-2004)
Source Code Files

  1. Size Considerations:

    1. Limit the size of a source code file to 1000 lines as large source code files can be very cumbersome to deal with.

    2. Limit each line in a source code file to 163 or fewer characters.

    3. Decompose long lines into smaller pieces, such that when the file is printed, all portions of the code will print out legibly.

    4. Indent subsequent sections of a longer line so that it is clear that these are continuations of the line above.

    5. In the length of a line, include any commentary that follows the code on the line.

    6. Where a function exceeds two pages, reexamine the design of the function.

    7. Especially consider if more than one function is involved or if sub-functions would be better in separate modules.

    8. If functions are short and related to each other, then place them in the same source code file.

  2. Composition:

    1. Prologue

    2. Includes

    3. Defines and Typedefs

    4. Global Definitions

    5. Function Placement
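The line-length guidance under Size Considerations can be illustrated with a minimal sketch. The function name, parameters, and range limits below are hypothetical and chosen only for illustration; the point is that a long condition is broken at operators and each continuation line is indented so it reads as part of the statement above it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when every field is within range, else 0.
 * The range limits are assumptions made for this sketch only. */
int record_fields_in_range(int tax_year, int month, int day, const char *name)
{
    size_t name_length;
    int in_range;

    assert(name != NULL);               /* precondition: caller passes a string */
    name_length = strlen(name);

    /* One long condition, split at the operators; each continuation line
     * is indented to show that it continues the assignment above. */
    in_range = (tax_year >= 1900 && tax_year <= 9999)
               && (month >= 1 && month <= 12)
               && (day >= 1 && day <= 31)
               && (name_length > 0 && name_length <= 35);

    return in_range;
}
```

Breaking before each `&&` keeps every sub-condition on its own line, so the statement stays legible when printed.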

2.5.3.4.2.1  (01-01-2004)
Prologue

  1. Make the prologue first in the file as it indicates what is in that file.

  2. Use a description of the purpose of the objects in the files (whether they be functions, external data declarations or definitions or something else) rather than a list of the object names. A description of the method(s) used is helpful for any complex function.

  3. Avoid making descriptions so detailed that maintenance of the header takes more effort than is gained by increased understanding of the code itself.

2.5.3.4.2.2  (01-01-2004)
Includes

  1. Header files are files that are included in other files prior to compilation by the C preprocessor.

  2. Place the header file includes after the prologue. If the include is for a non-obvious reason, comment the reason. In most cases, system include files like stdio.h should be included before user include files.

2.5.3.4.2.2.1  (01-01-2004)
Header File Organization

  1. Relate all functions in a given header file to the same general function, i.e., declarations for separate sub-systems should be in separate header files. Example: all functions in the header file stdio.h either perform or assist in the performance of input and output.

  2. Do not place function implementations, other than macros, in header files.

  3. Within a header file, group functions that perform related tasks in the same section. Example: within the file stdio.h, all functions in the scanf family must be placed together.

  4. Certain header files, such as stdio.h, are defined at the system level and must be included by any program using the standard I/O library.

  5. Use Header files to contain data declarations and defines that are needed by more than one program.

  6. Organize header files functionally, i.e., declarations for separate subsystems should be in separate header files.

  7. If a set of declarations is likely to change when code is ported from one machine to another, place those declarations in a separate header file.

  8. Use angle brackets (<stdio.h>) for system include files and double quotes ("user.h") for user include files.

2.5.3.4.2.2.2  (01-01-2004)
Header File Inclusion in the File that defines the Function

  1. Include Header files that declare functions or external variables in the file that defines the function or variable. This allows the compiler to do type checking and the external declaration will always agree with the definition.

  2. To prevent accidental double-inclusion, in each .h file, use code like the following.

    #ifndef EXAMPLE
    #define EXAMPLE
    ... /* body of example.h file */
    #endif /* EXAMPLE */

  3. Use a general-purpose header file for commonly used symbolic constants.
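As a hedged sketch of the two rules above, the following single file simulates a header being included twice and then defines the function the header declares. The header name example.h, the guard name EXAMPLE_H, the structure, and the function are all hypothetical. In a real project the guarded text would live in its own .h file and be brought in with #include.

```c
#include <assert.h>
#include <stddef.h>

/* ---- contents of a hypothetical header, example.h ---- */
#ifndef EXAMPLE_H
#define EXAMPLE_H

struct example_record {
    int id;
};

/* Declaration seen by callers and by the defining file. */
int example_record_id(const struct example_record *record);

#endif /* EXAMPLE_H */

/* ---- the same header text again, as if it were included a second time ---- */
#ifndef EXAMPLE_H               /* EXAMPLE_H is already defined: this copy is skipped */
#define EXAMPLE_H

struct example_record {         /* would be a redefinition error without the guard */
    int id;
};

int example_record_id(const struct example_record *record);

#endif /* EXAMPLE_H */

/* ---- defining file: the compiler checks this definition against the
 *      declaration above, so the two cannot silently disagree ---- */
int example_record_id(const struct example_record *record)
{
    assert(record != NULL);
    return record->id;
}
```

Because the defining file sees the declaration, changing the return type or parameter list in only one place produces a compile-time diagnostic instead of a mismatch at link time or run time.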

2.5.3.4.2.2.3  (01-01-2004)
Nested Header Files

  1. Do not nest Header files. The prologue for a header file must describe what other headers need to be included for the header to be functional.

  2. Where a large number of header files are to be included in several different source files, put all common include statements in one include file.

2.5.3.4.2.2.4  (01-01-2004)
Header File Names

  1. Avoid private header filenames that are the same as public header filenames. The statement #include "math.h" will include the standard library math header file if the intended one is not found in the current directory. If this is what you want to happen, comment this fact.

  2. Do not use absolute pathnames for header files. Use the <name> construction for getting them from a standard place, or define them relative to the current directory. Use the "include-path" option of the C compiler (-I on many systems) in the Makefile to handle extensive private libraries of header files; it permits reorganizing the directory structure without having to alter source files.

2.5.3.4.2.3  (01-01-2004)
Defines and Typedefs

  1. Place the defines and typedefs that apply to the file as a whole after the includes.

  2. It is sometimes useful to place a Define before the header files so that they will apply to the header files.

  3. Place "constant" macros first, then "function" macros, then typedefs and enums.
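The file composition described in this section (prologue, includes, defines and typedefs, global definitions, functions) can be sketched as follows. This is an illustrative layout only: the file name, macros, type names, and functions are hypothetical, and the user header mentioned in the comment is not actually included so that the sketch compiles on its own.

```c
/*
 * Prologue (hypothetical file TelefileReturnData.c)
 * Purpose: keeps a running count and total of return amounts.
 */

/* Includes: system headers first; user headers such as "user.h" would follow. */
#include <assert.h>
#include <stddef.h>

/* "Constant" macros */
#define MAX_RETURNS 100

/* "Function" macros */
#define IS_VALID_AMOUNT(a) ((a) >= 0)

/* Typedefs */
typedef long Amount;

/* Enums */
enum add_status { ADD_OK, ADD_INVALID, ADD_FULL };

/* Global definitions (file scope only) */
static Amount return_total = 0;
static size_t return_count = 0;

/* Functions */
enum add_status add_return_amount(Amount amount)
{
    if (!IS_VALID_AMOUNT(amount)) {
        return ADD_INVALID;
    }
    if (return_count >= MAX_RETURNS) {
        return ADD_FULL;
    }
    return_total += amount;
    return_count++;
    return ADD_OK;
}

Amount get_return_total(void)
{
    assert(return_count <= MAX_RETURNS);    /* invariant kept by add_return_amount */
    return return_total;
}
```

Keeping the sections in this fixed order means a maintainer can find a macro, type, or global in the same place in every source file.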

