Creating User-Defined Functions

Transcript

Creating User-Defined Functions Transcript was developed by Linda Mitterling and Jim Simon. Additional contributions were made by Cynthia Johnson, Warren Repole, and Jason Secosky. Editing and production support was provided by the Curriculum Development and Support Department. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Creating User-Defined Functions Transcript Copyright © 2009 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc. Book code E1586, course code RLSPFCMC, prepared date 18Dec2009.

RLSPFCMC_001

ISBN 978-1-60764-394-4

For Your Information

Table of Contents

Lecture Description
Prerequisites
Accessibility Tips

Creating User-Defined Functions
1. Lecture Scenario
2. Investigating End Use and Development Factors
3. Investigating Availability and Efficiency Factors
4. Conclusion

Appendix A: Demonstration Programs
1. Creating a DATA Step to Solve a Business Need
2. Creating and Calling a SAS Macro and a PROC FCMP Function
3. Creating a DATA Step to Solve Another Business Task

Lecture Description This lecture compares and contrasts the FCMP procedure and SAS macros for encapsulating code into user-defined functions.

To learn more… For information on other courses in the curriculum, contact the SAS Education Division at 1-800-333-7660, or send e-mail to [email protected]. You can also find this information on the Web at support.sas.com/training/ as well as in the Training Course Catalog.

For a list of other SAS books that relate to the topics covered in this Course Notes, USA customers can contact our SAS Publishing Department at 1-800-727-3228 or send e-mail to [email protected]. Customers outside the USA, please contact your local SAS office. Also, see the Publications Catalog on the Web at support.sas.com/pubs for a complete list of books and a convenient order form.

Prerequisites Before listening to this lecture, you should be familiar with DATA step syntax and basic SAS macro syntax. You can gain this knowledge by completing the SAS® Programming 1: Essentials course and the SAS® Macro Language 1: Essentials course.

Accessibility Tips If you are using a screen reader, such as Freedom Scientific’s JAWS, you may want to configure your punctuation settings so that characters used in code samples (comma, ampersand, semicolon, percent) are announced. Typically, the screen reader default for the character & is to read “and.” For clarity in code samples, you may want to configure your screen reader to read & as “ampersand.” In addition, depending on your verbosity options, the character & might be omitted. The same is true for some commas before a code variable. To confirm code lines, you may choose to read some lines character by character. When testing this scenario with Adobe Acrobat Reader 9.1 and JAWS 10, ampersands before SAS macro names were announced only when in character-reading mode.


Creating User-Defined Functions

Welcome to this e-lecture on Creating User-Defined Functions. My name is Betsy, and I’ll be working with my colleague Linda to guide you through this session. Both Linda and I work for the Education Division at SAS. A quick note of thanks to our colleague Jim for all of his hard work on this lecture material.


Reference Materials
To access the transcript for this lecture:
1. Go to the table of contents on the left side of the viewer.
2. Select Reference.
3. Select Transcript.

Before we begin the lecture, let me mention that we have included a transcript so that you can print all of the technical information provided in this lecture. To access the transcript, select Reference and then Transcript in the table of contents on the left side of the viewer, as in the example shown here. You can print this transcript now for use when viewing the lecture or print it later to keep as a reference. Also, note that Appendix A in the transcript contains copies of the programs used for the demonstrations in this lecture.

Navigation Help
For information on how to navigate this lecture:
1. Go to the upper-right corner of the browser.
2. Select Help.

If you need help with the navigation of this lecture, please select Help in the upper-right corner of the browser.

Creating User-Defined Functions

1. Lecture Scenario

2. Investigating End Use and Development Factors

3. Investigating Availability and Efficiency Factors

4. Conclusion

These are the topics that we will cover in this lecture. First, we will take a look at the scope and approach to be used for covering the lecture topics. Then, in Sections 2 and 3, we will discuss factors that need to be considered when comparing the two methods that we will use to solve the business task at hand. And then in Section 4, we will review the topics covered and summarize our findings from throughout the lecture.

1. Lecture Scenario


So, let’s get started with a quick discussion of the lecture scenario.


Objectives
• Define the business need for a date function.
• Identify the two methods to solve the business task.
• Outline the approach for comparing the two methods.

In the first section, we will discuss our business need to create a user-defined function or macro to solve a business task. We will then outline our approach for comparing the two methods.


Scope of the Lecture
Macros and user-defined functions can simplify common programming tasks.

data age;
   set orion.staff;
   YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
   TextBirth=put(Birth_Date,mmddyy4.);
   TextHire =put(Emp_Hire_Date,mmddyy4.);
   TextNext =put(Emp_Hire_Date+1,mmddyy4.);
   if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then YearsDiff=YearsDiff+1;
   drop TextBirth TextHire TextNext;
run;

Macro %years

Function years( )

Note that the contents of this program will be discussed later in this lecture.

User-defined macros can encapsulate lengthy DATA step code to simplify common programming tasks. However, beginning with SAS 9.2, the SAS Function Compiler procedure (PROC FCMP) can alternatively be used in many cases to encapsulate that same code into user-defined functions. So, why does SAS have two methods to accomplish what seems to be the same task? Is there a difference between these two methods? Is one method easier to use than the other? Is one method more efficient than the other? These are the types of issues that we will address in this lecture.


Business Tasks
• What is the age of each employee as of his or her hire date?
• What is the age of each employee as of the current date?

The calculation must account for leap years and partial interval results.

The tasks that we will accomplish in our examples and demonstrations involve calculating the difference between the birth date of employees and their hire date. In other words, we want to calculate how old a person was when he or she was hired. A second task is to calculate the difference between today’s date and an employee’s birth date. In other words, how old is the employee based on the current date? Note that this calculation must deal with two issues: accounting for leap years and deciding what to do with partial interval results. For example, if an employee was hired when he was 30.25 years old, we want the age to be recorded as 30 years old.


Creating a DATA Step to Solve a Business Need
This demonstration illustrates DATA step code that calculates the difference between two date values in number of years.

Let’s switch to a demonstration. In this demonstration, Linda will illustrate a DATA step to solve our business need.

(fcmc1.sas)

1. Hi, my name is Linda and I will be presenting the demos in this lecture. Let’s take a look at this DATA step code that I typed in earlier.

• We are reading from a permanent SAS data set named orion.staff.

• We want to calculate the age of employees when they were hired. We will use the INTCK function and the YEAR interval to do this.

• We tell SAS to take the difference between an employee’s Birth_Date value and his or her Emp_Hire_Date value and return the number of years for the difference.

• In the next portion of the program, we use the PUT function to extract the month and day portions of our date values. The width of 4 on the MMDDYY4. format specifies that only a two-digit month and a two-digit day are written. The variable TextBirth will contain the month and day that an employee was born, and the variable TextHire will contain the month and day that an employee was hired. In the next statement, one day is added to the hire date value. The PUT function converts that new date value into a month and day value in a variable named TextNext. All three of these variables are character because the PUT function returns character values.

• Next we have two IF statements. The first IF statement addresses how the INTCK function calculates the difference in years between two date values. For year intervals, INTCK counts the number of times that January 1st is encountered when going from the start date to the end date. When the end date falls earlier in the calendar year than the start date, this count is too high by 1. To deal with this issue, we added the IF statement that subtracts one year from the difference whenever the starting month/day string is greater than the ending month/day string. For example, going from December 31st to January 1st of the following year yields 1, but our adjustment occurs because the text value 1231 is greater than the text value 0101, producing the correct value of zero. Also, the INTCK function does not count partial intervals. For example, going from January 1st to December 31st of the same year yields a correct result of 0 because January 1st is never passed. Our program makes no adjustment because the text value 0101 is less than or equal to the text value 1231.

• Are there any situations where subtracting 1 from YearsDiff might cause a problem? Yes, there is one leap year situation. The second IF statement, here, is checking to see if a person was born on February 29 of a leap year and hired on February 28 of a non-leap year (that is, the day after hiring was March 1). In this case, we will consider the employee’s birthday to be February 28 during that non-leap year, so we need to add back the year that we took away in the first IF statement.

• Finally, we don’t need to include our three “text” variables (TextBirth, TextHire, and TextNext) in the resulting data set, so we will drop those here.

• Let’s go ahead and submit this program and take a look at our log messages. We see that everything looks good in the log. There are no error messages or warnings. Our DATA step ran just fine.

So, we now have the DATA step that will solve our first business task. Next, we want to take this code and create a SAS macro and PROC FCMP function from it. Then we can reuse our macro and user-defined function in other tasks that may require us to calculate the difference between two date values in number of years.
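The boundary-counting behavior described in this demonstration can be checked with a short DATA _NULL_ step. This is only an illustrative sketch: the date literals and the variable names CrossesBoundary, SameYear, and MonthDay are made up for the example and do not come from orion.staff.

```sas
data _null_;
   /* Dec 31 to Jan 1 of the next year crosses one January 1 boundary */
   CrossesBoundary=intck("year",'31DEC2008'd,'01JAN2009'd);
   /* Jan 1 to Dec 31 of the same year crosses no January 1 boundary */
   SameYear=intck("year",'01JAN2009'd,'31DEC2009'd);
   /* MMDDYY4. writes only the month and day as a four-character string */
   MonthDay=put('29FEB2008'd,mmddyy4.);
   /* Write the three values to the log */
   put CrossesBoundary= SameYear= MonthDay=;
run;
```

The log should show 1 for the first value, 0 for the second, and 0229 for the third, which matches the behavior that the two IF statements are written to correct.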


PROC FCMP or SAS Macro?
Which method you use is dependent on several factors.

Factors to be considered:
• End use
• Development
• Availability
• Efficiency

For the remainder of this lecture, we will investigate how to place the DATA step code that you saw in the demonstration into both a PROC FCMP function and a macro, and we will compare the two methods using the four factors listed here. We’ll look at End Use: How easy do you find it to use each method? Development: How long does it take you to develop and maintain what you need using each method? Availability: Where in SAS can you apply user-defined functions or macros? Efficiency: You might need to consider the kinds of resources each method uses. We will consider CPU time. Any of these factors can become a constraint that causes you to select one method over the other.


Learning More about the SAS Macro Language
For further information on the SAS macro language, please see the following SAS Education course and SAS documentation.

Please note that this e-lecture is not an attempt to teach the macro language or PROC FCMP. We will show and discuss code for both methods, but if you want to learn more on either topic, there are resources that you might want to investigate. To learn more about the SAS macro facility, there is a course entitled SAS® Macro Language 1: Essentials offered in a traditional classroom, Live Web, or self-paced e-course setting. You can also visit the SAS Publications Web site for further documentation, including the SAS® 9.2 Macro Language Reference.


Learning More about the FCMP Procedure
For further information on the FCMP procedure, please see the following SAS Education course and SAS documentation.

To learn more about PROC FCMP, please see our course entitled SAS® Programming 3: Advanced Techniques and Efficiencies or visit the SAS Publications Web site for further documentation, including the Base SAS® 9.2 Procedures Guide.

2. Investigating End Use and Development Factors

Let’s start comparing macros with PROC FCMP user-defined functions.

Objectives
• Develop a PROC FCMP function and then a macro to encapsulate the DATA step code created in the last demo.
• Compare the ease of use for each method.
• Use the PROC FCMP function and macro in a business scenario.
• Compare the development efforts for each method.

Specifically, we will look at the effort required to use and develop PROC FCMP user-defined functions versus macros. We’ll start with end use so that you can see why you would want to develop your own user-defined function. Then we’ll discuss the code used to develop the functions that we used.


End Use: PROC FCMP
Consider the effort to call the PROC FCMP function.

options cmplib=orion.funcs;
data age;
   set orion.staff;
   YearsDiff=years(Birth_Date,Emp_Hire_Date);
run;

The OPTIONS statement is required to identify the location of the function.

As our first dimension, let’s consider end use. Suppose you developed your own YEARS function (again, we’ll see how to do this later). How would you actually call that function in a DATA step? Let’s look at the code that calls the PROC FCMP function. The PROC FCMP-defined function is called with standard SAS syntax, exactly as a standard SAS function would be called in an assignment statement. No special syntax is required. Note that in this example, users just need to be aware that the newly available YEARS function accepts two arguments: one for a start date and one for an end date. Here we are using Birth_Date as the start date and Emp_Hire_Date as the end date. Also, note the caution on the slide: the location of a user-defined function must be known to SAS when it is called. This OPTIONS statement identifies the funcs data set in the orion library as the location of our function.
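The second business task, the age of each employee as of the current date, could be handled with the same function call. The following is a minimal sketch that assumes the YEARS function is already stored in orion.funcs as described in this lecture. The data set name current_age and the variable name CurrentAge are hypothetical names chosen for illustration.

```sas
/* Tell SAS where the user-defined function is stored */
options cmplib=orion.funcs;

data current_age;               /* hypothetical output data set name */
   set orion.staff;
   /* TODAY() returns the current date as a SAS date value, */
   /* so it can be passed as the end date argument          */
   CurrentAge=years(Birth_Date,today());
run;
```

Because the function takes any two SAS date values, only the second argument changes between the two business tasks.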


End Use: SAS Macro
Consider the effort to call the macro definition.

data age;
   set orion.staff;
   %years(YearsDiff,Birth_Date,Emp_Hire_Date)
run;

A call to the SAS macro appears as follows. We simply specify a percent sign and follow it with the name of the macro, which is YEARS, and its three parameters: YearsDiff, Birth_Date, and Emp_Hire_Date. That’s pretty easy. Macro programmers will instantly recognize the macro call and parameters. However, those not familiar with macro programming might not understand the purpose of the percent sign, and they will be puzzled by the three parameters. Non-macro programmers may also be puzzled by the lack of a semicolon. This code does not resemble standard DATA step code.


End Use: PROC FCMP or SAS Macro?
For the end-use factor, the advantage goes to PROC FCMP.

[Slide scorecard: columns Macro and FCMP; rows End Use, Development, Availability, and Efficiency]

The advantage for end-use effort required, then, goes to PROC FCMP.


Development: PROC FCMP Function
Consider the effort to convert the DATA step into a PROC FCMP function.

proc fcmp outlib=orion.funcs.temppkg;
   function years(Birth_Date,Emp_Hire_Date);
      /* Count the January 1 boundaries between the two dates */
      YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
      /* Extract month and day as mmdd text values */
      TextBirth=put(Birth_Date,mmddyy4.);
      TextHire=put(Emp_Hire_Date,mmddyy4.);
      TextNext=put(Emp_Hire_Date+1,mmddyy4.);
      /* Subtract a year if the birthday has not yet been reached */
      if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
      /* Add the year back for a February 29 birthday in a non-leap year */
      if TextBirth="0229" and TextHire="0228" and TextNext="0301"
         then YearsDiff=YearsDiff+1;
      return(YearsDiff);
   endsub;
run;

The DATA, SET, DROP, and RUN statements from the original DATA step are not part of the function.

You’ve already seen how you would call a user-defined function in a DATA step. Let’s see how you would actually develop that function. Here, we placed the DATA step code into a PROC FCMP step. We start with the PROC FCMP statement. In this statement, I have included the OUTLIB= option because I want to store this function for use later. PROC FCMP functions are stored in a package. A package is a group or collection of routines that have unique names. A package is stored in a data set. So, the required syntax for the OUTLIB= option is libref.data-set-name.package-name. We’ll store our function in the orion library in the funcs data set as a package named temppkg. Note that none of these are reserved names. They are names that have meaning to you. The FUNCTION statement declares the function name and arguments. The next set of statements, up to the RETURN statement, reflects our DATA step code. The RETURN statement is used to return the value of the function. Then, the ENDSUB statement ends the function definition. PROC FCMP variables are local. Therefore, no DROP statement is needed, and there is no conflict with existing variable names. In general, the procedure accepts slight variations of DATA step statements, and you can use most features of the SAS programming language in functions and CALL routines that are created by PROC FCMP. Also, many Microsoft Excel functions, not typically available in SAS, are implemented in PROC FCMP. You can find these functions in the sashelp.slkwxl data set. Now let’s take a look at the macro definition.
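One way to try the function without a permanent library is to compile it into the Work library and call it with a few literal dates. This is a sketch under stated assumptions: the package location work.funcs.temppkg, the test dates, and the variable names LeapCase and BeforeBirthday are hypothetical choices for illustration, and the function body simply repeats the lecture's code.

```sas
/* Assumption: store the package in WORK so no LIBNAME statement is needed */
proc fcmp outlib=work.funcs.temppkg;
   function years(Birth_Date,Emp_Hire_Date);
      YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
      TextBirth=put(Birth_Date,mmddyy4.);
      TextHire=put(Emp_Hire_Date,mmddyy4.);
      TextNext=put(Emp_Hire_Date+1,mmddyy4.);
      if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
      if TextBirth="0229" and TextHire="0228" and TextNext="0301"
         then YearsDiff=YearsDiff+1;
      return(YearsDiff);
   endsub;
run;

/* Point SAS at the data set that holds the package */
options cmplib=work.funcs;

data _null_;
   /* Born on February 29, hired on February 28 of a non-leap year */
   LeapCase=years('29FEB1980'd,'28FEB2009'd);
   /* Hired one day before the 29th birthday */
   BeforeBirthday=years('15JUN1980'd,'14JUN2009'd);
   put LeapCase= BeforeBirthday=;
run;
```

If the function behaves as the lecture describes, the log should report 29 for the leap-year case and 28 for the day-before-birthday case.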


Development: SAS Macro Definition
Consider the effort to convert the DATA step into a macro.

%macro years(YearsDiff,Birth_Date,Emp_Hire_Date);
   /* Each parameter is resolved by text substitution when the macro is called */
   &YearsDiff=intck("year",&Birth_Date,&Emp_Hire_Date);
   TextBirth=put(&Birth_Date,mmddyy4.);
   TextHire =put(&Emp_Hire_Date,mmddyy4.);
   TextNext =put(&Emp_Hire_Date+1,mmddyy4.);
   if TextBirth gt TextHire then &YearsDiff=&YearsDiff-1;
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then &YearsDiff=&YearsDiff+1;
   drop TextBirth TextHire TextNext;
%mend years;

The DATA, SET, and RUN statements from the original DATA step stay in the program that calls the macro.

Here, we placed the DATA step code into a macro definition named YEARS. The %MACRO statement begins the macro definition, and the %MEND statement ends the macro definition. As you can see in the %MACRO statement, the macro contains three parameters: YearsDiff, Birth_Date, and Emp_Hire_Date. These parameters are used in the rest of our macro code in place of the original DATA step variables. Note that if you are developing this macro for use by others, the names that you select for variable names – such as TextHire – cannot conflict with variables from the input data set being used later. Therefore, you might want to precede each name with an underscore to ensure this. The effort to convert the DATA step code into a macro definition was not very involved. However, those not familiar with macro programming might be a little puzzled by the percent signs, the ampersands, and the overall concept of creating macros. Also, consider that if an ampersand is left off of a macro variable during development, it will cause the macro to fail or not perform as expected. Maintenance of complex macros can be an involved process as well.
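When a macro does not perform as expected, it can help to see the statements that it generates. The sketch below uses the MPRINT system option for that purpose. It assumes that the orion libref is assigned and reuses the macro_age data set name from the demonstration; it is an illustration, not part of the lecture's demo programs.

```sas
%macro years(YearsDiff,Birth_Date,Emp_Hire_Date);
   &YearsDiff=intck("year",&Birth_Date,&Emp_Hire_Date);
   TextBirth=put(&Birth_Date,mmddyy4.);
   TextHire =put(&Emp_Hire_Date,mmddyy4.);
   TextNext =put(&Emp_Hire_Date+1,mmddyy4.);
   if TextBirth gt TextHire then &YearsDiff=&YearsDiff-1;
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then &YearsDiff=&YearsDiff+1;
   drop TextBirth TextHire TextNext;
%mend years;

/* MPRINT writes the SAS statements generated by the macro to the log */
options mprint;

data macro_age;
   set orion.staff;
   %years(YearsDiff,Birth_Date,Emp_Hire_Date)
run;

/* Turn the option off again once the generated code has been reviewed */
options nomprint;
```

In the log, the generated statements show the parameter names replaced by YearsDiff, Birth_Date, and Emp_Hire_Date, which makes a missing ampersand easier to spot.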


Development: PROC FCMP or SAS Macro?
For the development factor, the advantage goes to PROC FCMP.

[Slide scorecard: columns Macro and FCMP; rows End Use, Development, Availability, and Efficiency]

As you have seen, the effort to convert our DATA step code into a PROC FCMP function is less involved than the macro definition that we created on the last slide. The advantage for development effort required, then, goes to PROC FCMP.


Creating and Calling a SAS Macro and a PROC FCMP Function
This demonstration illustrates how to create and call a SAS macro and a PROC FCMP function from the DATA step code presented in the last demonstration.

Now let’s switch to a demonstration where we will write the code to create the macros and functions just discussed. Linda, once again I’ll turn things over to you.

(fcmc2.sas)

1. We’ll start our demo looking at the code to create our PROC FCMP function. We have our four required statements: the PROC FCMP statement, the FUNCTION statement, the RETURN statement, and the ENDSUB statement. Remember that if we want to save this function, and we do, we must save it in a package in a data set. So, I have also included the OUTLIB= option here. We have started a new SAS session for this demo, so we’ll need to resubmit a LIBNAME statement along with the PROC FCMP code. Let’s go ahead and submit this code and take a look in the log. We see a message telling us that our PROC FCMP code ran just fine and our function was created successfully.

2. Now let’s go back to our editor and take a look at the code to create our macro, and here it is. We have added the necessary %MACRO statement and the %MEND statement, as well as the ampersands (&) that are needed for our macro variables. I am going to purposely, for the sake of this demo, take the percent sign off of my %MEND statement and pretend that I never had it there to start. I will go ahead and submit this code with the error in it and check my log. I don’t see anything in my log indicating a problem. When creating macros, unless I turn on certain options to get special messages, I don’t see anything in the log to indicate problems. At this point, I’ll just assume that everything is fine with my macro.

3. Now that we have created our macro and user-defined function, next, we want to use them. We’ll start with our PROC FCMP function. The function is called like any other function in SAS, except that I have to tell SAS where the function is stored before I can call it. Otherwise, SAS will look in the traditional function location and will not find our function. This OPTIONS statement specifies the CMPLIB system option, which tells SAS to search each of the libraries or data sets that are listed for a package that contains our function. In this example, SAS will search the orion.funcs data set for our function. Notice that we only have to specify a two-level name here. In the DATA step, the program calls the function, passing the variable values for Birth_Date and Emp_Hire_Date, and returns the result in the variable YearsDiff. Before I submit the program, let me point out that I did make one change to my program, here, from what you saw in the program shown in the lecture section. In order to be able to distinguish between the data sets created by the FCMP call and, later, our macro call, I changed the name of the data set being created to fcmp_age instead of just age. I will submit this OPTIONS statement and this DATA step. When I go to my log, I don’t see any messages at all. What happened? Well, the problem goes back to that missing percent sign in my %MEND statement in the previous program that I submitted. SAS sees the %MACRO statement, and everything it encounters after that is treated as part of the macro until it encounters a %MEND statement. Without the percent (%) sign on the statement, SAS has no %MEND statement to indicate the end of the macro definition. This is one of the disadvantages to creating macros. We have to be very careful to include all of the necessary percent signs and ampersands.

4. To correct this problem, I’m going to go back to my editor and add a percent (%) sign to my %MEND statement. Then, I’m going to submit just that one statement to close the incorrect macro that is currently being defined. Then, I’ll resubmit my macro to re-create it, and I’ll also resubmit the program that calls the FCMP function. I’ll go to my log, and I now see messages for my FCMP function.

5. Up to this point, we have created our FCMP function and macro, and we successfully used our FCMP function. To complete this demo, we need to call our macro in a DATA step program and submit it. I’ll go back to my editor and highlight that code and submit it. Then I’ll go to my log and check it. It looks fine.

6. Let’s go over to the Explorer window and take a look in the Work library. I see the two data sets that were created. Let me open the fcmp_age data set first. I’ll scroll over and look at that YearsDiff variable. It looks great. Then let’s go over to the macro_age data set and take a look at the YearsDiff variable in that data set. It looks great as well.

So, you have seen that either method, a SAS macro or a PROC FCMP function, can create reusable subroutines for complex or commonly used code. The method that is the best for you to use will depend on how familiar you are with the required syntax. Let’s go back to the lecture now and investigate a couple more factors.

3. Investigating Availability and Efficiency Factors

In this third section, we will take a look at two more factors to consider when comparing PROC FCMP user-defined functions and macros.

Objectives
• Discuss the locations where PROC FCMP functions and macros can be used in SAS code.
• Compare the availability for each method.
• Describe the resources used with each method.
• Compare the resources required for each method.

We’ll start our discussion by investigating where PROC FCMP functions and macros can be used in SAS code – which is our availability factor. Then we will compare which method is more flexible from an availability standpoint. Next, we will look at the efficiency factor and compare the two methods using this factor.


Availability: PROC FCMP
PROC FCMP functions can be called from
• the DATA step
• the WHERE statement
• PROC SQL (functions with array arguments are not supported)
• a PROC REPORT COMPUTE block
• selected SAS/STAT procedures, including NLIN
• selected SAS/ETS procedures, including MODEL
• selected SAS/OR procedures, including NLP
• the Graph Template Language

Starting with availability… You can use the PROC FCMP functions and subroutines with the DATA step, the WHERE statement, and the following procedures:
• PROC SQL (functions with array arguments are not supported)
• PROC REPORT COMPUTE blocks
• selected SAS/STAT procedures, such as PROC GENMOD, PROC MCMC, and PROC NLMIXED
• selected SAS/ETS procedures, such as PROC COMPUTAB and PROC SIMILARITY
• selected SAS/OR procedures

You can also use PROC FCMP functions and subroutines with the Graph Template Language, which is new in SAS 9.2. For a complete list of where PROC FCMP functions are available, please see the Base SAS® 9.2 Procedures Guide mentioned earlier.


Availability: PROC FCMP
Example:

options cmplib=orion.funcs;
proc means data=orion.staff n;
   where years(Birth_Date,Emp_Hire_Date) between 20 and 30;   /* user-defined function */
   var Employee_ID;
run;

The OPTIONS statement is required to identify the location of the function.

Here is an example where we have called the PROC FCMP function from a WHERE statement within a procedure step – specifically PROC MEANS. Even though this function contains DATA step code, we can still call it from within a procedure. This was not the case with our macro. So, an advantage to using PROC FCMP functions to encapsulate DATA step code is that the code can then be used both in DATA steps and any procedures that support WHERE statements.
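PROC SQL is also on the list of places where a PROC FCMP function can be called. The following is a minimal sketch of the same age filter written as a query. It assumes that the YEARS function is stored in orion.funcs as described in this lecture; the column alias AgeAtHire is a hypothetical name chosen for illustration.

```sas
/* Tell SAS where the user-defined function is stored */
options cmplib=orion.funcs;

proc sql;
   select Employee_ID,
          years(Birth_Date,Emp_Hire_Date) as AgeAtHire
      from orion.staff
      /* CALCULATED lets the WHERE clause refer to the computed column */
      where calculated AgeAtHire between 20 and 30;
quit;
```

The function call looks the same as it does in the DATA step and in the WHERE statement, which is the availability point being made here.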


Availability: SAS Macro
A macro can be called anywhere, depending on what the macro contains.

%years(YearsDiff,Birth_Date,Emp_Hire_Date)

Macros can be called from anywhere in SAS, unless they contain DATA step code. In other words, where a macro can be used depends on what the macro contains. In our example, the macro named YEARS generates partial DATA step code. Therefore, it can only be called from within a DATA step.


Availability: PROC FCMP or SAS Macro?

[Slide scorecard: columns Macro and FCMP; rows End Use, Development, Availability, and Efficiency]

Because our pseudo-function macro generates DATA step statements, it can be called only from within a DATA step. A function created with PROC FCMP, on the other hand, can be used in procedures as well. Based on our comparison, the advantage for availability goes to PROC FCMP.


Efficiency: PROC FCMP

The FCMP function used 20% more CPU time.

proc fcmp outlib=orion.funcs.temppkg;
   function years(Birth_Date,Emp_Hire_Date);
      YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
      TextBirth=put(Birth_Date, mmddyy4.);
      TextHire=put(Emp_Hire_Date, mmddyy4.);
      TextNext=put(Emp_Hire_Date+1,mmddyy4.);
      if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
      if TextBirth="0229" and TextHire="0228" and TextNext="0301"
         then YearsDiff=YearsDiff+1;
      return(YearsDiff);
   endsub;
run;

PROC FCMP function = More CPU time

28

Our fourth factor considers the machine efficiency or resources used to execute each method. In this example, the FCMP function used 20% more CPU time. Increased CPU time is due to the internal overhead of a PROC FCMP-defined function. This statistic is based on a benchmark using a data set with two million observations. Other testing that we performed on different data sets and different machines yielded different statistics. However, each experiment yielded the same bottom line: that the FCMP function uses more CPU time. Benchmarking your programs is strongly recommended.
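A minimal sketch of such a benchmark follows. It assumes that both the YEARS function (in orion.funcs) and the YEARS macro from the lecture are already defined. The output data set names are made up for this sketch. The FULLSTIMER system option asks SAS to write detailed resource statistics, including CPU time, to the log for each step.

```sas
/* Assumptions: YEARS function in orion.funcs and %YEARS macro already defined */
options fullstimer cmplib=orion.funcs;

/* Step 1: the PROC FCMP function */
data work.bench_fcmp;            /* hypothetical output name */
   set orion.staff;
   YearsDiff=years(Birth_Date,Emp_Hire_Date);
run;

/* Step 2: the macro, which generates plain DATA step code */
data work.bench_macro;           /* hypothetical output name */
   set orion.staff;
   %years(YearsDiff,Birth_Date,Emp_Hire_Date)
run;

options nofullstimer;
```

Comparing the CPU time notes for the two steps in the log gives one data point. Repeating the runs, ideally on a data set of realistic size, makes the comparison more trustworthy.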


Efficiency: SAS Macro

Macro efficiency equals DATA step efficiency (for this example).

%macro years(YearsDiff,Birth_Date,Emp_Hire_Date);
   &YearsDiff=intck("year",&Birth_Date,&Emp_Hire_Date);
   TextBirth=put(&Birth_Date, mmddyy4.);
   TextHire =put(&Emp_Hire_Date, mmddyy4.);
   TextNext =put(&Emp_Hire_Date+1,mmddyy4.);
   if TextBirth gt TextHire then &YearsDiff=&YearsDiff-1;
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then &YearsDiff=&YearsDiff+1;
   drop TextBirth TextHire TextNext;
%mend years;

Macro = DATA step resources

29

In the case of our macro, the code generated by the macro determines the resources that it uses. Benchmarking is the best tool for determining resource utilization. In our example, since the macro that we created only generates DATA step code, with some text substitution, the resources used by our macro equal the resources used by the DATA step code that the macro generates.


Efficiency: PROC FCMP or SAS Macro?

[Slide graphic: scorecard rating Macro and FCMP on End Use, Development, Availability, and Efficiency]

30

Based on our discussion, the advantage for efficiency goes to the SAS macro facility in this case.


Creating a DATA Step to Solve Another Business Task

This demonstration illustrates how to calculate the current age of an employee.

31

Now, it’s time for another demonstration. In this demonstration, Linda will illustrate how to calculate the current age of an employee.

(fcmc3.sas) This example is similar to the previous one. The previous example calculated an employee’s age at the time that he or she was hired. This new example calculates the employee’s age on the current date. We are retrieving the current date from the TODAY function, rather than through a hardcoded value. This makes the program dynamic: we can reuse it without having to modify the code each time. Let’s quickly go through the program.
• The DROP statement gets rid of variables that we don’t want included in our resulting data set.
• The RETAIN statement is needed to hold the values of the Today, TextTerm, and TextNext variables through each iteration of the DATA step. You’ll see why this is needed in just a minute.
• This DO group retrieves today’s date with the TODAY function and places the date value into a variable named Today. It also creates the variable TextTerm, which contains only the month and day portion of today’s date. The variable TextNext contains only the month and day portion of today’s date plus 1. These three variables are created during the first iteration of the DATA step, and then their values are retained through the RETAIN statement that we saw earlier. Calculating these variables only once will save resources, especially if you are dealing with a large data set.
• Next, the SET statement reads an observation.
• The YearsDiff value is calculated, with the variable Today substituted as the end of the date interval.
• The month and day of the employee’s birth date are captured with this PUT function. We did not include this statement in the DO group, because it must execute multiple times, once for each employee.
• The rest of the statements in the program remain the same as the previous DATA step example.
I’ll go ahead and submit this DATA step program and check my log. Everything looks fine. Now we want to take this code and create a PROC FCMP function and a SAS macro from it. Then we’ll compare the two methods using our previous factors to consider.
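The adjustment logic in this program can be checked against a few hand-picked dates. The sketch below is not part of the lecture demonstration. It replaces TODAY with a hypothetical End_Date variable read from in-stream data so that the results are repeatable, and the data set name check_age is made up.

```sas
/* Sketch only: End_Date stands in for today's date so results are repeatable */
data check_age;
   input Birth_Date :date9. End_Date :date9.;
   format Birth_Date End_Date date9.;
   /* year boundaries crossed between the two dates */
   YearsDiff=intck("year",Birth_Date,End_Date);
   /* month and day (mmdd) of each date, as text */
   TextStart=put(Birth_Date,mmddyy4.);
   TextTerm =put(End_Date,mmddyy4.);
   TextNext =put(End_Date+1,mmddyy4.);
   /* birthday not yet reached in the end year */
   if TextStart gt TextTerm then YearsDiff=YearsDiff-1;
   /* Feb 29 birthday is counted on Feb 28 of a non-leap year */
   if TextStart="0229" and TextTerm="0228" and TextNext="0301"
      then YearsDiff=YearsDiff+1;
   drop TextStart TextTerm TextNext;
   datalines;
15JUN1970 14JUN2000
15JUN1970 15JUN2000
29FEB1960 28FEB1980
29FEB1960 28FEB1981
;

proc print data=check_age;
run;
```

Working through the statements by hand, the four rows should give YearsDiff values of 29, 30, 19, and 21. The last two rows show the leap-day rule: on February 28 of a leap year the birthday has not yet arrived, but on February 28 of a non-leap year it is treated as reached.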


End Use: Comparison

PROC FCMP is easier to use.

PROC FCMP

data yearsemployed;
   set orion.staff;
   CurrentAge=years(Birth_Date,today());
run;

SAS Macro

data yearsemployed;
   set orion.staff;
   %years(CurrentAge,Birth_Date,currdate)
run;

32

Using the new example from the demonstration, let's revisit the dimensions of end use, development, and efficiency for a PROC FCMP user-defined function and a SAS macro. We’ll start again with end use. Both methods are easy to use, but the PROC FCMP function call is more flexible, cleaner, and more elegant than the macro version. PROC FCMP has no percent sign, one argument fewer, and ends with a semicolon. It is a standard function call. It looks like SAS. It looks like a DATA step. The PROC FCMP function is flexible enough to accept the TODAY function in place of a date variable for either argument. The macro requires only the substitution of a new date parameter, but it is just not as flexible or clean as the PROC FCMP function call. The original macro offers the same flexibility. However, for reasons of efficiency, this macro has been rewritten to recognize CURRDATE as a special parameter value that signals the macro to use the current date. We will see this on the next few slides. We did not modify the PROC FCMP code because it cannot be made more efficient. Even if we had compared the flexibility of the original macro program with our PROC FCMP code, hands down, PROC FCMP would win.
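To illustrate the flexibility described here, the following sketch calls the same function with three different kinds of second argument. It assumes that the YEARS function is available in orion.funcs. The output variable names and the date constant are chosen only for illustration.

```sas
/* Assumption: the YEARS function already exists in orion.funcs */
options cmplib=orion.funcs;

data flexible_calls;                              /* hypothetical data set name */
   set orion.staff;
   AgeAtHire =years(Birth_Date,Emp_Hire_Date);    /* date variable              */
   CurrentAge=years(Birth_Date,today());          /* function call              */
   AgeOnDate =years(Birth_Date,'01JAN2010'd);     /* date constant (made up)    */
run;
```

No change to the function definition is needed for any of these calls, because each argument is simply a numeric expression.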


Development: PROC FCMP

NO CHANGE to code needed.

33

Next, we’ll look at the effort required to develop the PROC FCMP function and SAS macro from our DATA step program. The previously defined PROC FCMP function can be used without modification. Therefore, I will give it a big smiley face.


Development: SAS Macro

34

%macro years(outvar,Birth_Date,TermDate);
   drop TextStart TextTerm TextNext;
   %if %upcase(&TermDate)=CURRDATE %then %do;
      %let TermDate=%sysfunc(today());
      retain TextTerm TextNext;
      if _N_=1 then do;
         TextTerm=put(&TermDate, mmddyy4.);
         TextNext=put(&TermDate+1, mmddyy4.);
      end;
   %end;
   %else %do;
      TextTerm =put(&TermDate,mmddyy4.);
      TextNext =put(&TermDate+1,mmddyy4.);
   %end;
   TextStart=put(&Birth_Date,mmddyy4.);
   &outvar=intck("year",&Birth_Date,&TermDate);
   if TextStart gt TextTerm then &outvar=&outvar-1;
   if TextStart="0229" and TextTerm="0228" and TextNext="0301"
      then &outvar=&outvar+1;
%mend years;

For our macro, here are the changes that need to be made to the code. The %IF-%THEN/%DO block detects the special parameter value CURRDATE and generates DATA step code based on the TODAY function. The %ELSE/%DO block generates alternative DATA step code when a date variable is supplied in place of CURRDATE. While this program is written very efficiently, it requires a significant amount of macro knowledge.
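The two branches of the rewritten macro correspond to two styles of call, sketched below. The sketch assumes that the rewritten YEARS macro shown on this slide has been compiled. The data set and output variable names are made up for illustration.

```sas
/* Assumption: the rewritten %YEARS macro from this slide is already defined */

/* CURRDATE triggers the %IF branch: today's date is resolved once */
data current_age;                 /* hypothetical data set name */
   set orion.staff;
   %years(CurrentAge,Birth_Date,currdate)
run;

/* Any other value triggers the %ELSE branch: a per-row date variable */
data hire_age;                    /* hypothetical data set name */
   set orion.staff;
   %years(HireAge,Birth_Date,Emp_Hire_Date)
run;
```

In the first call, %SYSFUNC(TODAY()) runs while the macro executes, so the generated DATA step contains a numeric constant rather than a call to TODAY for each observation.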


Efficiency: PROC FCMP and SAS Macro

PROC FCMP = 100% more CPU time

Macro = DATA step

Note: Macro efficiency equals DATA step efficiency for this example.

35

Now let’s look at efficiency. In our particular testing, the FCMP function used 100% more CPU time, but keep in mind that these statistics can vary depending on a number of factors. In the case of our macro, keep in mind that the original version of the macro would accept the TODAY() function but would be far less efficient. Therefore, the macro was rewritten. In this example then, since the macro only generates DATA step code, with some text substitution and one-time-only macro logic, the macro's efficiency equals DATA step efficiency. Why does the FCMP function use so much more CPU time?


Efficiency: PROC FCMP

For each observation in the input data set,
(1) the TODAY function executes
(2) the two PUT functions execute.

Note: Each time that a PROC FCMP function is called, every statement encapsulated by the function executes.

36

To understand why the PROC FCMP function uses so much more CPU time, let’s talk about the code contained within the function. Two issues contribute to increased resource usage: 1) The TODAY function executes once per row – in other words, for each observation in the input data set, as we saw earlier. 2) The two PUT functions execute once per row, as well, as you see here. Keep in mind that the larger your input data set is, the more significant the usage of this resource becomes. Also, keep in mind that, in general, you need to consider that each time a PROC FCMP function is called, every statement executes. Recall that in the DATA step macro program, the TODAY function executes only one time, because _N_=1 was used. Note that this functionality is not available with PROC FCMP.
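One partial mitigation, which is not covered in the lecture and was not part of its benchmarks, is to move the TODAY call out of the function call and into the calling DATA step. The sketch below assumes the YEARS function in orion.funcs and uses a retained variable named Today, as the earlier DATA step demonstration did.

```sas
/* Sketch only: not from the lecture and not benchmarked */
options cmplib=orion.funcs;

data yearsemployed;
   set orion.staff;
   retain Today;
   drop Today;
   if _n_=1 then Today=today();          /* TODAY executes once, not per row */
   CurrentAge=years(Birth_Date,Today);   /* every statement in YEARS still runs per row */
run;
```

This removes only the first of the two costs. The PUT functions inside the function still execute on every call, so benchmarking would be needed to see whether the change makes a measurable difference.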


Efficiency: PROC FCMP or SAS Macro?

[Slide graphic: scorecard rating Macro and FCMP on End Use, Development, Availability, and Efficiency]

37

Based on our discussion, for this example, this is how the two techniques compared. I also added the availability factor from our previous discussion.

4. Conclusion


Creating User-Defined Functions

1. Lecture Scenario

2. Investigating End Use and Development Factors

3. Investigating Availability and Efficiency Factors

4. Conclusion

38

This concludes our comparison of the SAS macro facility and PROC FCMP. However, before we end this lecture, I would like to take you through a quick review of the topics that we discussed along the way.


Objectives
• Review the definition of PROC FCMP.
• Review the definition of the SAS macro facility.
• Review the comparisons made in this lecture between the two methods.

39

First, we will review the capabilities and purpose for each method in SAS. Then we will review our comparisons of the methods based on our four factors: end use, development, availability, and efficiency.


Definitions: PROC FCMP

The FCMP procedure
• is part of Base SAS
• creates SAS functions and CALL routines that contain DATA step syntax stored in a SAS data set
• allows the use of most of the SAS programming language
• is used the same way that other SAS functions or CALL routines are used in SAS
• creates SAS functions and CALL routines that are easily written and maintained
• produces functions and CALL routines that are independent from their underlying code
• generates reusable subroutines.

[Slide graphic: years( )]

40

Let’s start with PROC FCMP.
• The SAS Function Compiler procedure (PROC FCMP) is part of Base SAS.
• The procedure enables you to create and store SAS functions and CALL routines to be used in SAS procedures or DATA steps.
• PROC FCMP functions and CALL routines contain DATA step syntax that is stored in a data set.
• Most features of the SAS programming language can be used in PROC FCMP functions and CALL routines.
• Because PROC FCMP functions and CALL routines are used just as any other SAS functions or CALL routines, programmers can easily write and maintain complex code.
• PROC FCMP functions and CALL routines are independent of their underlying code.
• Best of all, the functions and CALL routines created by PROC FCMP become reusable subroutines, meaning that they can be used in any DATA step or supported SAS procedure that has access to their storage location.


Definitions: SAS Macro Facility

The SAS macro facility
• is part of Base SAS
• provides the ability to substitute text in a program
• reduces the amount of text for common and complex tasks
• creates macros that are referenced by a name rather than by the underlying text
• allows the creation of macro programs that are dynamic and self-modifiable.

[Slide graphic: %years]

41

• The SAS macro facility is part of Base SAS.
• It is a tool for substituting text in a program and reducing the amount of text that you must enter to do common or complex tasks by assigning a name to character strings or groups of SAS programming statements.
• When you want to use these programming statements later, you reference the macro names rather than the text within the macro.
• The use of macros enables you to write SAS programs that are dynamic and capable of self-modification.
Note that the SAS macro language is text-based. To create and use macros, you must be familiar with the macro language. Also, keep in mind that the comparisons for the macro facility versus PROC FCMP in this lecture were for the purpose of creating the equivalent of DATA step functions. In this area, PROC FCMP has clear advantages. However, this is a narrow application of the macro facility as a whole. The macro facility is much broader in its functionality than what we have shown here.
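As a small illustration of text substitution beyond the pseudo-function use case, the sketch below names two character strings with %LET and a group of statements with %MACRO. The macro variable names dsn and idvar and the macro name firstobs are made up for this sketch.

```sas
/* Hypothetical names: dsn, idvar, and firstobs are not from the lecture */
%let dsn=orion.staff;          /* a name assigned to a character string */
%let idvar=Employee_ID;

%macro firstobs(n);            /* a name assigned to a group of statements */
   proc print data=&dsn(obs=&n);
      var &idvar Birth_Date Emp_Hire_Date;
   run;
%mend firstobs;

%firstobs(5)                   /* generates a complete PROC PRINT step */
```

Unlike the YEARS macro, this macro generates a complete procedure step, so it is called outside any DATA step. This is an example of how the place where a macro can be used depends on the text that it generates.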


Topic Review
• Both methods organize complex or commonly used code into reusable units.
• Development and the use of PROC FCMP functions fared better when compared to SAS macros. SAS macros require extensive macro knowledge.
• Availability is a strong point for PROC FCMP while SAS macros tend to fare better on efficiency.
• Benchmarking is suggested to determine the best method to use.

42

Both PROC FCMP and SAS macros are powerful tools available to you to turn complex or commonly used code into subroutines or reusable units. In this lecture, we compared the two techniques using four factors. First, we looked at the effort required to use and develop each method. We found that, in general, PROC FCMP code is easier to use and develop than SAS macro code. The main reason is that SAS macros require users to learn a new language, the macro language. PROC FCMP requires that four additional statements be added to your standard SAS code to convert the code into a PROC FCMP function. Also, SAS macros can be dependent on their underlying code. We used the example earlier where a macro contained partial DATA step code and therefore could only be used later in the DATA step. On the other hand, once PROC FCMP functions are created, they become independent, reusable subroutines. Next, we compared the two techniques by looking at availability and efficiency factors. In general, PROC FCMP functions are available in more places, and SAS macros tend to be more efficient. To adequately compare the two techniques on these factors, you should consider benchmarking.
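As a reminder of what those additional statements look like, here is a minimal sketch. It assumes that the four statements meant are PROC FCMP, FUNCTION, RETURN, and ENDSUB, as seen in the YEARS example. The function name dayspan, its argument names, and the work.funcs.demo storage location are made up for this sketch.

```sas
/* Hypothetical function; numbering reflects an assumed reading of "four statements" */
proc fcmp outlib=work.funcs.demo;          /* 1. PROC FCMP statement */
   function dayspan(Start_Date,Stop_Date); /* 2. FUNCTION statement  */
      Span=Stop_Date-Start_Date;           /*    ordinary DATA step code */
      return(Span);                        /* 3. RETURN statement    */
   endsub;                                 /* 4. ENDSUB statement    */
run;

/* Calling the function requires CMPLIB to point at its storage location */
options cmplib=work.funcs;

data _null_;
   Span=dayspan('01JAN2009'd,'18DEC2009'd);
   put Span=;                              /* writes the result to the log */
run;
```

Everything between the FUNCTION and RETURN statements is the same kind of code that would appear in a DATA step, which is the point the review makes about development effort.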


Additional Training Topics

For a complete list of available e-lectures and other SAS training products, visit

support.sas.com/training

43

As mentioned earlier, there is a SAS macro course offered by SAS Education, and the FCMP procedure is taught in our SAS Programming 3 course. For a complete list of available courses, e-lectures, and other SAS training products, please visit the SAS Web site at support.sas.com/training.


Credits

Creating User-Defined Functions was developed by Linda Mitterling and Jim Simon. Additional contributions were made by Cynthia Johnson, Warren Repole, and Jason Secosky.

44

This concludes our lecture. Many thanks go to all of those who contributed to the creation of this e-lecture. We hope that you found the material in the lecture helpful.


Comments?

We would like to hear what you think.
• Do you have any comments about this lecture?
• Did you find the information in this lecture useful?
• What other e-lectures would you like SAS to develop in the future?

Please e-mail your comments to

[email protected]

Or you can fill out the short evaluation form at the end of this lecture.

45

SAS Education would like to know what you think about this e-lecture or e-lectures in general. If you have any comments, we would greatly appreciate receiving your input. You can use the e-mail address listed here to provide that feedback, or you can complete the short evaluation form.


Copyright

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2009 by SAS Institute Inc., Cary, NC 27513, USA. All rights reserved.

46

Thank you for your time.


Appendix A Demonstration Programs

1. Creating a DATA Step to Solve a Business Need
2. Creating and Calling a SAS Macro and a PROC FCMP Function
3. Creating a DATA Step to Solve Another Business Task

1. Creating a DATA Step to Solve a Business Need

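Section 1, Slide 9

libname orion 's:\workshop';

data age;
   set orion.staff;
   /* count the year boundaries between the two dates */
   YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
   /* month and day (mmdd) of each date, as text */
   TextBirth=put(Birth_Date, mmddyy4.);
   TextHire =put(Emp_Hire_Date, mmddyy4.);
   TextNext =put(Emp_Hire_Date+1, mmddyy4.);
   /* birthday not yet reached in the hire year */
   if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
   /* Feb 29 birthday is counted on Feb 28 of a non-leap year */
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then YearsDiff=YearsDiff+1;
   drop TextBirth TextHire TextNext;
run;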


2. Creating and Calling a SAS Macro and a PROC FCMP Function

Section 2, Slide 21

libname orion 's:\workshop';

***** Create FCMP function *****;
proc fcmp outlib=orion.funcs.temppkg;
   function years(Birth_Date,Emp_Hire_Date);
      YearsDiff=intck("year",Birth_Date,Emp_Hire_Date);
      TextBirth=put(Birth_Date, mmddyy4.);
      TextHire=put(Emp_Hire_Date, mmddyy4.);
      TextNext=put(Emp_Hire_Date+1, mmddyy4.);
      if TextBirth gt TextHire then YearsDiff=YearsDiff-1;
      if TextBirth="0229" and TextHire="0228" and TextNext="0301"
         then YearsDiff=YearsDiff+1;
      return(YearsDiff);
   endsub;
run;

****** Create the macro *****;
%macro years(YearsDiff,Birth_Date,Emp_Hire_Date);
   &YearsDiff=intck("year",&Birth_Date,&Emp_Hire_Date);
   TextBirth=put(&Birth_Date, mmddyy4.);
   TextHire =put(&Emp_Hire_Date, mmddyy4.);
   TextNext =put(&Emp_Hire_Date+1,mmddyy4.);
   if TextBirth gt TextHire then &YearsDiff=&YearsDiff-1;
   if TextBirth="0229" and TextHire="0228" and TextNext="0301"
      then &YearsDiff=&YearsDiff+1;
   drop TextBirth TextHire TextNext;
%mend years;

***** Referencing the YEARS FCMP Function *****;
options cmplib=orion.funcs;
data fcmp_age;
   set orion.staff;
   YearsDiff=years(Birth_Date,Emp_Hire_Date);
run;

***** Referencing the YEARS Macro *****;
data macro_age;
   set orion.staff;
   %years(YearsDiff,Birth_Date,Emp_Hire_Date)
run;

3. Creating a DATA Step to Solve Another Business Task

Section 3, Slide 31

data age;
   drop Today TextTerm TextNext TextStart;
   retain Today TextTerm TextNext;
   /* compute the values based on today's date once, on the first iteration */
   if _n_=1 then do;
      Today=today();
      TextTerm=put(Today, mmddyy4.);
      TextNext=put(Today+1, mmddyy4.);
   end;
   set orion.staff;
   YearsDiff=intck("year",Birth_Date,Today);
   /* month and day (mmdd) of the birth date, computed for each employee */
   TextStart=put(Birth_Date,mmddyy4.);
   if TextStart gt TextTerm then YearsDiff=YearsDiff-1;
   if TextStart="0229" and TextTerm="0228" and TextNext="0301"
      then YearsDiff=YearsDiff+1;
run;
