If Nested IF-THENIELSE Statements Give You Headaches, Then Try ARRAY 3-D - or Else Take Aspirin Judy Alicante, Pacific Gas & Electric Company Glenn Ironhat, Pacific Gas & Electric Company ABSTRACT

Customer number

= 1234

Billing period = 4/11/93-5/10/93 (starting date and ending date of bill)

Have you ever spent more time indenting and aligning nested IF-THEN/ELSE code than writing it? After writing two pages of the same DO-END block or starting your lines past column forty, do you wonder if there isn't a better way?

Energy used Bill amount

=500 kwh = $30.00

The transaction record is used to update the master file record with the same customer number.

There is a better way with SAS version 6.07. You can replace your Stone Age code with a New Age threedimensional array. ARRAY 3-D shows how we replaced 75% of IF-THEN/ELSE logic with a decision matrix. The decision matrix was then converted into SAS code as a three-dimensional array. The threedimensional array construct increases program readability and maintainability, in addition to relieving your IF-THEN/ELSE headaches.

THE ELECFIX MASTER FILE

The master file is organized so that each record holds a customer's monthly billing data for the past 13 months. This monthly data is stored as an array. The master file is a rolling file because each month the oldest month's data is discarded, making room forthe new transaction billing data. Using the four variables above, our master file looks like this:

INTRODUCTION

Customer number = 1234 Billing period 1 = 4110/92-5/10/92 Energy used 1. = 430 kwh Bill amount 1 = $32.50

The Pacific Gas and Electric Company has over 4.5 million customers. It has a big Information Systems department and huge customer information files. Since the Customer Information System (CIS) was written decades ago specifically for billing customers and reporting revenues, the files are not in a format that facilitates research and analysis.

Billing period 2 = 5110/92-6111/92 Energy used 2 = 495 kwh Bill amount 2 = $35.45 . Billing period 3 = 6111/92-7112192 Energy used 3 = 650 kwh Bill amount 3 = $45.10

The PG&E Rates Department researches, designs and proposes new rates for the company's gas and electric service. To perform these functions, we access customers' historical billing data. This data is collected and processed monthly. Our ELECFIX system reads a transaction file of customers' bills for the previous month and then updates the ELECFIX master file. The ELECFIX master file holds the previous thirteen months of customer billing data and was designed to facilitate research and analysis work.

Billing period 11 = 2/10/93-3110/93 Enetgy used 11 = 335 kwh Bill amount 11 = $28.30 Billing period 12 = 3/10/93-4/11/93 Energy used 12 = 550 kwh Bill amount 12 = $35.00

With the introduction of SAS version 6.07, we decided to rewrite the ELECFIX system to use many of the new SAS features. This paper shows how we used a three-dimensional array as a decision table and made our ELECFIX system more readable, easier to maintain, and more efficient.

Billing period 13 = blank In this example, billing period 1 is located in array element 1, which we have nicknamed "Bucket 1." Billing period 2 is in Bucket 2, and so on. Bucket 13 in this example does not have any data. This example shows Customer Record 1234 after rolling the master file, leaving Bucket 13 empty. Befcire rolling the file all the buckets were full (see Figure 1, next page).

THE TRANSACTION FILE.

Every month, the ELECFIX master file is updated with transaction records, which correspond to customer bills. A transaction record contains many billing variables. Forthis paper, we will showonlyfourvariables:

690

Before Rolling:

Full or partial month Any bucket that contains data for a billing period of more than twenty days is said to be a full-month bucket. A partial-month bucket contains data for a billing period offewer than twenty-one days.

8888-------888 319192-

"'(1'92

""0/92-

5I'Iom:

51'10/92-

SfttJ92

61111927/12/92

lNi3· 2110193

211M33/10J93

311M341'11193

Contiguous The transaction record is said to be contiguous with the first-choice bucket if the starting date of the transaction record is the same as the ending date of the first-choice bucket If the ending date of the transaction record is the same as the starting date of the second-choice bucket, the ttansaction record is said to be contiguous with the secondchoice bucket.

RoUing:

-----~ INn211M3

3110/5133/10J93

3/1om_ '-'lIm

After rolling:

8888-------888 4'10192S/10192

5/1019251'11192

61111927/'l2192

Based on the values of these parameters, five possible actions will be used to update the master file record. We have associated one action code to each action .. These codes tell the system which bucketto update and how to update the bucket. Here are the definitions of the action codes:

7112192""192

Figure 1: How the master file rolls THE PROBLEM If every customer were billed once a month, and each billing period had 30 or 31 days, it would be simple to assign each transaction or bill to a bucket. Life at a utility is not that simple. The complicating factors are:

transaction data replaces data in first-choice bucket transaction data replaces data in secondchoice bucket

1 2

transaction data is added to data in firstchoice bucket transaction data is added to data in secondchoice bucket

3

• some bills have a short number of days • some bills are several months old

4

• some billing dates do not match the starting and lor ending dates of any billing period bucket in the master file record. So the problem in simple terms is, which bucket in the master file record should be updated by the transaction data? Before we present our solution, let's go over some rules:

5

= transaction data is added to bucket with fewer days

We will now show a simple example of the update process. Following that we will show the Stone Age code that will do the update. After that we will explain

First-choice bucket There are two possible buckets in the master file that are selected as candidates for the update. These buckets are referred to as the firstchoice and second-choice buckets. The first-choice bucket is chosen based on the starting and ending dates of the transaction record. The second-choice bucket is always the bucket one greater than the firstchoice bucket.

~.~~~

LJ

~

eJ

""M12-

S11G'92611W2

6'lvn-

511G'92

111m

7112/1/211'1192

During Update;

Final choice bucket Once the first- and secondchoice buckets are determined, other variables determine the final choice of the bucket to be updated. These other variables include the starting and ending dates in the transaction record, the starting and ending dates of both the first and second-choice buckets of the master record, whether the transaction record and the two chosen buckets have a full or partial month's data and whether the transaction record is contiguous with the first-choice bucket orthe secondchoice bucket.

E.7E.78"

eJ-----~ ~ 2110\133/1a93

3/1Q'93-

""V93

-

New transaction fQcord

... ,(Wt-

Sl1M12-

6/1t192-

11121112-

2110'93-

6'1Vl12

71~

1111/12

5I1G'93 '

3/1&lG-

Sl1Q'92

-

2110'933/10'93

:vt1W'3-

""t19:l-

..., V93

After Update:

""0/92-

SllM12-

611 tI92-

1112192-

5110'92

61tV92

111m

1111192

.,tV93

Figure 2: Updating the master file

691

511Q'93

else if days{choice2) ge 21 and startdt(choice2) ne enddate then action=2 ; ** full-m:mth, noncontiguous **i end i else if days (choice1) ge 21 and enddt(choice1) ne startdte then do ; ** full-roonth, noncontiguous **i if startdt{choice2) · then action=2 ; ** no data in choice2 tucket **; else if days (choice2) It 21 and startdt(choice2)=enddate then ** partial-month, contiguous **; action::4 ; else if days(choice2) It 21 and startdt(choice2) ne enddate ** partial-roonth, noncontiguous **; then action=2 i else if days (choice2) ge 21 and startdt(choice2)=enddate then H full-month, contiguous **; action=1 ; else if days(choice2) ge 21 and startdt(choice2) ne enddate ** full-roonth, noncontiguous **; then action=1 ; en:! en:! ; *'**** end of if ge 21 (full-month transaction) ***** ;

our improved code design using a three-dimensional array. Using the rules outlined previously, we will see that this record is relatively easy to place in a bucket (ref. Figure 2). • The first-choice bucket for this transaction record is bucket twelve. • The transaction record is greater than twenty days, so it is a full month transaction. • The first-choice bucket has no data in tt. Solution: In this example, the transaction record will replace the first-choice bucket.

else if billdays It 21 then /* partial-rronth transaction */ do if startdt (choice1) = . or startdt (ci)oice1) =startdte or enddt(choice1)=enddate or (prewkt gt 0 and errldt (prewkt) =startdte_) then H no data in cMicel bucket action=1 ; ** or sarre billing dates else if days(choice1) It 21 and enddt(choice1)=startdte then do ; U partial-month, contiguous **; if startdt(choice2) . then action=3 i h no data in choice2 bucket , else if days(choice2) It 21 and startdt(choice2)=enddate then action=5 ; ** partial-rronth, contiguous , else if days(choice2) It 21 and startdt(choice2) ne enddate then H partial-month, noncontiguous Hi action=3 ; else if days(choice2) ge 21 and startdt(choice2)=enddate then action=3 ; ** full-roonth, contiguous **i else if days(choice2) ge 21 and startdt{dlOice2} ne enddate then action=3 ; ** full-month, noncontiguous **; en:! ;

"THE; STONE AGE CODE Below is a fragment of the actual IF-THEN/ELSE code that we used to make the necessary decisions and"determine the appropriate action to take place. • • • • • • • • • • • • • • ,. t •• t •• t.,. t • • •_•••• Itt • • • • • • • • • • ,. • • ,. . . . . . . . . . . . . . . .

Detemi.ne the choicel and

~ice2

buckets.

..;

t ............................................................ ;

-%detl:uckt i billdays = enddate - startdte ; ..................... t •••••••••••

*.,. •••••••••

fI . . . . . t

"' •• _ . . . . . . . ;

• Determine which bucket to update and how to update the rucket.

•;

• • • • • • • • • • • • • • • • • • ,. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . t • • • • • ;

if billdays ge 21 then

/'

full-roonth transaction

'/

do

else if days(choice1) It 21 and enddt(choice1) ne startdte then do ; ** partial-roonth, noncontiguous H; if startdt (choice2) • then it' no data in choice2 bUcket , action=1 ; else if days (choice2) It 21 and startdt(choice2)=enddate then ** partial-month, contiguous **i action=4 ; else if days(choice2) It 21 and startdt (choice2) ne enddate ** partial-month, noncontiguous **i then action=1 ; else if days (choice2) ge 21 and startdt(choice2)=enddate then action=4 ; ** full-m:mth, contiguous H; else if days (choice2,) ge 21 and startdt(choice2) ne enddate then action=1 ; ** full-month, noncontiguous *' *i en:! ; else if days(choicel) ge 21 and enddt(choice1)=sta~dte then do ; ** full-month, contiguous **i if startdt(choice2) · then action=2; . ** no data in ehoice2 tucket , else if days(choice2) It 21 am startdt{choice2):enddate then ** partial-month, contiguous Hi action=4 ; else if days (chOice2) It 21 and startdt (choice2) ne enddate ** partial-month, noncontiguous **; then action=3 ; else if days (choice2) ge 21 and startdt(choice2)=enddate then ** full-roonth, contiguous H; action=5 i else if days (choice2) ge 21 and startdt (choice2) ne enddate then action=3 ; ** full-roonth, noncontiguous H; end ; else if days(choice11 ge 21 and endcit(choice1) ne startdte then do i ** full-month, noncontiguous **i if startdt(choice2) · then ** no data in choice2 bucket , action=2 i else if days(choice2) it 2! and startdt(choice2)=enddate then ** partial-month, contiguous **; action=4 ; else if days (choice2) It 21 and startdt(choice2) ne enddate then action=2 ; ** partial-month, noncontiguous H; else if days(choice2) ge 2! am startclt(choic;e2)=enddate then ** full-rrontb, con'tiguous **; action=4 ; else if days (choice2) ge 21 and startdt(choice2j ne end.date U full-m:mth, noncontiguous Hi then action=1; en:! en:!; end of partial-month processing

if startdt(choicel)= . or startdt(choicel)::startdte or enddt(choicel) = enddate or (prevbkt gt 0 and enddt (prevbkt) = startdte) then action=1 ; tt no data in choice! bucket **; ** or same billing dates or H; H data is -a nonnal bill and is , ** contiguous' to previous bucket'**; else if days (choice1) It 21 and enddt(choice1)=startdte then do; '** partial-rronth, contiguous '**; if startdt (cho'ice2) . then action:;3 ; '** no data in choice2 bucket , else if days (choice2 ) It 21 and startdt(choice2)=enddate then '** partial-month, contiguous '**i action=5 i else if days(choice2) It 21 and startdt(choice2) ne enddate ** partial-month, noncontiguous '**; then action=3 ; else if days(choice2) ge 21 and startdt(choice2)=enddate then action=3 i '** full-rronth, contiguous ** i else if days(choice2) ge 21 and startdt (choice2) ne enddate ** full-rronth, noncontiguous '**; then action=3 ; end;

else if days(choice1) It 21 and enddt(choice1) ne'startdte'then do ; ** partial-rronth, noncontiguous Hi if startdt (choice2) . then action:;l i H no data in choice2 bucket , else if days(choice2) It 21 and startdt(choice2)':='enddate then H partial-rronth, contiguous H; action=4 i else if days(ciioice2) It 21 arid startdt(choice2) ne enddate then'action=l; ** partial-month, noncontiguous **; else if days(choice2) ge 2! and startdt(choice2)=enddate then action:;! i _" ** full-month, contiguous , else if days(choice2} ge 21 and startdt(choice2) ne enddate then action:;! ; H full-month, noncontiguous **; en:! ; else if days (ch.oice1) ge 21 and enddt(choice1)=startdte then do ; H full-rronth, contiguous if startdt(choice2) = . then '**, no data: in choice2 bucket H; action=2 i else if days{choJce2) It ,21 and startdtCchoice2)=ermate then action:-4 i . ** partial-month, contiguous **i else if days (choice2) It 21 and startdt(choice2) ne endclate then action=2 i ** partia!-rrcnth, noncontiguous **; else if days(choice2) ge 21 and startdt(choice2)=endclate then ** full-m:mth, con_tiguous action=l i

******************************************************** Perfom the update as dictated by the action code. ******************************************************** %ACrrON;

692

A person" might be tempted to combine many of the THEN-ELSE clauses that generate the same action code into one IF condition statement. This would be a mistake. That is exactly the situation that we are trying to avoid. Once that is done, the logic becomes so interwoven that program maintenance becomes unmanageable. That was the situation in the old system that we were rewriting.

THE DECISION MATRIX The following two matrices show the decision tables used to determine how the master file bucket will be updated. Matrix 1 is the decision table for a transaction record wfth a partial month's data and Matrix 2 is for a transaction record with a full month's data. In both matrices, the rows describe the data in the first-choice bucket of the master file. The columns describe the data in the second-choice bucket. The data in the first-choice and second-choice master file buckets can have full-month .or partial-month data, and can be contiguous or noncontiguous with the transaction data. Each cell contains the action code assigned to that condftion. The numbers in the cells are the action codes.

As we studied this IF-THEN/ELSE code, we noticed a pattern in the logic. From that pattern we extracted the action codes and placed them into a table. We called thattable the decision matrix. This leads us into the design of our new code.

SECOND-CHOICE BUCKET PARTIAL MONTH CONTIGUOUS NONCONTIGUOUS

NO DATA NO DATA, OR SAME DATES, OR CONTIGUOUS TO PREVIOUS BUCKET

tu

'"u=> w '" u (5

:J:

~

a:

il:

FULL MONTH CONTIGUOUS NONCONTIGUOUS

1

1

1

1

1

PARTIAL MONTH CONTIGUOUS

3

5

3

3

3

PARTIAL MONTH NONCONTIGUOUS

1

4

1

4

1

FULL MONTH CONTIGUOUS

2

4

3

5

3

FULL MONTH NONCONTIGUOUS

2

4

2

4

1

Matrix.1: Transaction record has partial-month data (Numbers in cells are action codes) "

SECOND-CHOICE BUCKET NO DATA NO DATA, OR SAME DATES, OR CONTIGUOUS TO PREVIOUS BUCKET

tu

PARTIAL MONTH CONTIGUOUS NONCONTIGUOUS

.

FULL MONTH CONTIGUOUS " "NONCONTIGUOUS

"

1

1

1

1

1

'"u=>

PARTIAL MONTH CONTIGUOUS

3

5

3

3

3.

~u,..,.

PARTIAL-MONTH NONCONTIGUOUS

1

4

1

1

1

FULL MONTH CONTIGUOUS

2

4

2

1

2

FULL MONTH NONCONTIGUOUS

2

4

2

'"w

en

55

u..

"

-

1 .

Matnx 2. Transaction record has full month data (Numbers in cells are action cOdes)

693

1 .

Each of these two matrices can be structured as a two-dimensionaI5-by-5 array. When the first matrix is placed on top of the second matrix. it forms a threedimensional 2-by-5-by-5 array. Below is a picture of the resulting three-dimensional array.

is used to set the indexes into the array. We will explain why this is better than the original IF-THEN/ ELSE code when we discuss the benefits of our new code. THE NEW AGE CODE By using a three-dimensional array as a decision table. we can alleviate the maintenance problems that are inherent with heavily nested IF-THEN/ELSE code. Althe same time we can create a data structure thaI closely resembles the decision matrix. Below is the three-dimensional array code: Retain f* partial rronth transaction *I act! 1 act2 1 act3 1 act4 act6 3 act7 5 act8 3 act9 actll 1 act12 4 act13 1 act14 act16 2 act1? 4 actlB 3 act19

PM =Partial Marth Data FM::: FIJI Mcnlh Data CT" Continguous Data NCT" Nonconti uous Data

act21 2

act22 4

/* full act26 1 act31 3 act36 1 act41 2 act46 "2

month act27 act32 act)?

act23 2

transaction */ 1 act28 1 5 act33 3 4 act38 1

act42 4 act47 4

act43 3 act48 2

1 3 4 5 act24 4

actS 1

act29 act34 act39 act44 act99

act30 act35 act40 act45 actSO

1 3 4 5 4

actIO 3

actlS 1 ad20 3 act25 1 1 3 1 3 1

** •••••••••••••••••••••••••• ** ••••••••••••••••••••••••••••••••••• ; dete:mine the choicel and choice2 buckets

"i

** ••••••••••• *••••••••••• *******************************ttt*tt*tt; %DRI'BOCKT ;

Figure 3: Three-Dimensional Array

t t** ** t ******** **** *** t** * *** t ** ********t ** t* t **** t* tt t tt ***t tit t ; detemine' the x, Y and Z coordinates of the 3"'Clime:!a. array t 1 ******t******tt****t*ttt*ttttt**************:*ttt.****t*t*******t*t j billdays ::: endda te - startdte ;

Using the same example as before and following the same rules. we can see how the X. Y. and Z coordinates of the array indexes are set. The X-coordinate points to the transaction record. the Y-coordinate to the first-choice bucket and the Z-coordinate to the second-choice bucket.

.

****t* * t** **t **t * t ***t **n **t t t**t** *** ** * t** ******.tt t ...... _t***. t ;

dete%mine the X coordinate

:

(transaction record).

t*****t**** .. *.. *t*****t*t* .. tt*tt*tt .. ttt*t .. ** .... *... ****tt.*t* i

if billdays It 21 then x ::: 1 ; else x ::: 2 ;

• The first-choice bucket for this transaction record is bucket 12.

tt *** *.t* *.. **.*** ••••,••• *~ ** •• *••••• t. ** •• tt ••• *. *t ••••• ***•••• * ; determine the Y COordinate (choice1 bucket) *; *.* .....****t**'.*,*.***t*t •• *.**.***.***************.t****.t****t.;

• The transaction record is greater than 21 days so tt is a full month bucket and therefore. the X coordinate is set equal to 2.

.if startdt (¢1oice1), :::. or (billdays ge 21 and

then

(startdt (choice1) :::startdte or enddt (choicel) :enddate)) y ~ 1 ; else if days(choicel) It 21 and endcl.t(choke1)=startdte then

• The first-choice bucket is empty so there is 110 data in tt. therefore. the Y coordinate is set equal to 1. .

y

~

2 ;

else if days(choice1) It 21 and enddt(choice1) ne startdte then y ::: 3 ; else if days (choice1) ge 21 and,enddt(chokel)=startdte then y ::: 4 ; else if days(choicel) ge 21 and en?dt(choicell ne startdte then

• The second-choice bucket has no data therefore. the Z coordinate is set equal to 1. To retrieve the action code from the three-dimensional array. refer to the diagram and locate the cell where X=2. Y=1 and Z=I. The contents of the array cell is the action code used to determine how the bucket will be updated.

y

~

5 ;

*t ** * ** .*t*t _**.* *t .** ..... .,.,* *.* ••• **** .. *.. ** .* •• t .......... **.,.; detezmine the Z coordinate (cboice2 bucket) .: ............ t ......... *.*t ••••••• *•••• *••• *••••••• t.t •• t •••••••• ;

if startdt (choice2) :::. z =1 j else if days(choice2) It z ::: 2 ; else if days(choice2) It z ::: 3 ; else if days(choice2) ge z ::: 4 j else if days(choice2) ge

Our new code uses a three-dimensional array to retrieve the action code that determines how to update the record. The RETAIN statement was wrttten to resemble the two-dimensional decision matrix. This was done as a program maintenance aid. You will also notice some IF-THEN/ELSE code. This code

z ::: 5 ;

694

then 21 and startdt(choice2)=enddate then 21 and startdt(choice2) ne enddate then 21 and startdt(choice2):::endcl.ate then 21 and startdt(choice2) ne enddate then

Stone Age code were eliminateq. The same results are obt"ined in four lines of our New Age code.

***************************************************************'* get the resulting action frem the cell of the 3-d.i.meMicna1* t anay aDd perfom. the action. * *t*****t*tt**t*t*tt**ttttttt*t********t*****t****t******tttt*t** array "t;fz (1:2, l:S, 1:5) action", y¥z (x,Y,z)

The most noteworthy benefit for us is the ability to make changes to the program and be reasonably sure that the changes will have the effect that we intend. The biggest problem with our old system, which was written as many IF-THEN/ELSE code groups, was that no one could understand enough of the program to feel comfortable making a change that would work.

act1 - actSO ;

select (action);

\..hen (1) do ; %OVElU.AY (choice

= choice1)

end ,

"men (2) do i %OVElU.AY (choice

= choice21

;

end , "men (3) do ; %SUM (choice ::: choice1)

end ;

CONCLUSION As was mentioned earlier, the 2-by-S-by-S matrix that was transformed into a three"dimensional array is a real application that is currently being used at the Pacific Gas and Electric Company. We know for a fact, that the three-dimensional array made our department's ELECFIX system better than the one it replaced.

.men (4) do ; %SUM (choice = choice2)

end , Irklen IS) do ; %DETFEWST ; %OVElU.AY (choice::: choice3) ;

end , otherwise output

ERRORXYZ;

end ,

BENEFITS

The following are the benefits we experienced with our 3-D transformation: .

Is this any better than the IF-THEN/ELSE code that we are replacing? The answer, of course, depends on what you are trying to accomplish and the size of your application.

• readability • maintainability • efficiency And don't forget the non-data processing related benefits like:

By size of the application, we mean the size of the array itself. Anything smaller than our 2xSxS array may not be worth doing as an array. However, asthe number of rows and/or columns in the decision matrix increases, the benefits grow geometrically (see table below). Each row in the matrix represents one line of code. Each column represents a number of lines of code equal to the number of rows. In our matrix, each column represented five lines of code. The number of lines of code needed for each group along the third axis is the product of the number of rows times columns. In our project, for example, S8 lines of the

Array Dimensi~ns

• keeping your sanity • less or no headaches • less aspirin to buy .• savings on other medical costs ACKNOWLEDGMENTS

Number of IF-THENIELSE clauses Number With Without saved With of aspirin II.RRAY-3D ARRAY-3D ARRAY-3D saved SO

38

2

60

47

4.

13

7S

62

8

2x8x8

18

128

110

16

3x8x8

19

192

173

32

2xSxS

12

2xSx6

13

3xSxS

-

The authors thank all those involved. in ~ritingthe orig.inal ELECFIX system which we rewrote and transformed into a three-dimensional array ",pplication. Special thanks to Steve Marino, whoi~ also on the ELECFIX rewrite team, for his contributions, directly and indirectly, to this paper. Finally, abigth,mkyouto our personal desktop publisher, Danine CozzElns, without whom this paper would not be possible and to Eric Zirbel for letting us present it at SUGI. TRADEMARKS SAS is a registered trademark of the SAS Institute Inc., Cary, N.C., USA

Three-Dimensional Array Compared To IF-THENIELSE Statements.

695