Monday 24 February 2014

Starting a project in SAS

You are about to start a brand new project and you have the basic SAS white screen staring at you.  Early in my career my first move was to write code and get the project started.  Over time I learned that was the wrong thing to do. What you should do is spend time thinking about the project and its outcome, then document that information at the top of the code.  There are two good reasons to start with proper documentation.

  1. Documenting forces you to think about the whole project and what the final result will be. If you start coding right away you tend to only think about the next step, which is a good way to get off course very quickly.
  2. Well-documented SAS code is easier to read and understand when you go back to it months later because your client has asked you to "do the same thing that was done last quarter".  I typically work on three or four projects during a week.  Over a period of four months that can add up to more than 40 projects that I have touched.  When I get a request to re-create a project, it is difficult and time consuming to open the code I think I used and read through it to confirm that I am looking at the correct program.

Below is an example of the first thing I type in before starting any project of any size:


/********************************************************************************** 
Program: CampaignRetention20140224.sas 
Programmer: Patrick Booth
Creation Date: 2014-02-24 
Purpose: Select all customers who have a contract that is coming to an end and create a campaign list with their contact information
Input: A list of customers who have a contract that will end from March 1, 2014 to March 31, 2014.  This request came from Darla in marketing
Output: Campaign list with all contact information including store number
Program Dependency: Make sure the fall campaign is complete before selecting these customers
History: This is a campaign that came from a pilot that was run in 2013
************************************************************************/
This header gives you a lot of information at a glance, which means you don't have to scroll through hundreds of lines of code to confirm that you are looking at the correct file.
This method can be as detailed as you want and it only takes a few minutes to type in the information.

Below is a summary of each line and why it is important:

Program:  This may seem redundant because the name of the program is the name of the file, so why do you need to type it in here?  I found it helps to write down the name of the program because it forces you to think about what a logical name is.
Programmer:  This is useful if you work with a team of programmers and you can immediately see who created the file.  If you work alone and nobody else programs in your company, then don't bother with this line.
Creation Date:  Important to know when you started the project because most of your clients will refer to projects by a name and when you created it.
Purpose:  Try to be as specific as possible, because that will help many months down the road.  Think about this section as a note to yourself in the future and what information you will need when you are trying to understand what this code is about.  I find that I go back and add to this section as the project changes over time.
Input:  Sometimes your client will give you a file to work with.  Name the file or describe that file here.  Other times the request will come from an individual.  Put the requester's name here because that is usually the first indication of whether you have the correct code.
Output:  This is more about forcing you to think about the outcome of the project.  If you think about the end goal then you tend to be more efficient with your coding.  Take a few minutes to think about this; it will save you coding time.
Program Dependency:  Include any events that have to happen before this program starts.  It is a good way to ensure everything is ready before you start the project.
History:  Most projects happen as a result of events, previous projects or a change of policy at the company. This is an opportunity to add context to the project and help the "future you" to understand the code better.
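
As an optional companion to the header, the same identifying details can also be written to the SAS log each time the program runs, so that a saved log can be matched back to its program.  This is only a minimal sketch and not part of the template above: the macro variable names program_name and requested_by are hypothetical, and their values are copied from the example header.  &sysuserid is an automatic SAS macro variable and %sysfunc(today(), yymmdd10.) returns the run date in YYYY-MM-DD form.

```sas
/* Sketch only: echo the header details to the SAS log.              */
/* program_name and requested_by are hypothetical macro variables;   */
/* their values are taken from the example header in this post.      */
%let program_name = CampaignRetention20140224.sas;
%let requested_by = Darla in marketing;

/* &sysuserid is an automatic macro variable holding the user ID of  */
/* the current SAS session.                                          */
%put NOTE: Program &program_name requested by &requested_by;
%put NOTE: Run by &sysuserid on %sysfunc(today(), yymmdd10.);
```

The header comment remains the main record; the log lines simply repeat the key facts in the place you look after a run.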

Executing this code

I know what you are thinking.  "I can't be bothered to write all of this text at the beginning of every project.  It will affect my efficiency".  That is a legitimate concern, but I have a trick that will make this process a lot faster and easier.  
SAS has an abbreviation macro that expands a short piece of text into a much larger block of text.  I use the macro to replace the text "startup" with the text:

/********************************************************************************** 
Program: Codename.sas 
Programmer: Patrick Booth
Creation Date: YYYY-MM-DD 
Purpose: 
Input: 
Output: 
Program Dependency: 
History: 
************************************************************************/

Here is how you do it:

  • Open a SAS program window
  • Select Program -> Add Abbreviation macro
  • Under Abbreviation type a brief name for your macro.  In my case I used the text "startup"
  • Under "Text to insert" type in the template for the beginning of your program.
  • Hit OK.
Now every time you want your startup template to appear, type "startup", hit Tab and start filling in your information.  Note:  The abbreviation macros are case sensitive, so "Startup" would not work.

The abbreviation macro is a great tool for storing code that you use most often and I will talk about how to use it in future blogs.

I would love to hear about how you like to start your SAS coding projects.

