1
PharmaSUG2011 - Paper CC10
A Recursive SAS Macro to Automate Importing Multiple Excel
Worksheets into SAS Data Sets
Wenyu Hu, Merck Sharp & Dohme Corp., Upper Gwynedd, PA
Liping Zhang, Merck Sharp & Dohme Corp., Upper Gwynedd, PA
ABSTRACT
In order to import data from Microsoft Excel into SAS
®
, there are various methods such as PROC IMPORT,
IMPORT WIZARD, ODBC and Excel LIBNAME engine. The optimal way often depends on the project needs
and programmer's preference. The need for importing multiple worksheets adds more complexity to an
already challenging job. This paper introduces a SAS macro that can automatically handle all Excel files with
various numbers of worksheets in different subfolders and generate separate SAS datasets for each Excel
file. As Excel files often contain multiple worksheets, we believe this macro is a flexible and user-friendly tool
that has wide applicability for importing Excel files to SAS.
Keywords: proc import, Excel, recursive
INTRODUCTION
In the pharmaceutical industry, SDTM and ADaM Metadata information are usually stored in Excel
spreadsheets. The current version of Excel files often need to be compared to previous versions to see what
changes were made to the data definition file. Since there are no good tools to compare two files and
identify the exact differences, we decided to convert the Excel files to SAS datasets and then compare the
SAS datasets. Normally for different studies, the number of sheets and sheet names are different, thus using
hard coding is very time consuming and error prone. An automated SAS macro would be a desirable utility
tool to accomplish this task.
This paper introduces a macro which reads multiple Excel files that have multi-sheets under different folders
and subfolders by using a recursive macro technique and the Excel LIBNAME engine.
The Excel LIBNAME engine was introduced in SAS version 9 and constructed a new way to connect SAS
files and Microsoft Excel files. By using this new LIBNAME engine, along with other data transfer tools such
as PROC EXPORT, PROC IMPORT, PROC SQL or DATA step, transferring data between Excel and SAS
can be greatly simplified.
THE MACRO DESIGN
The fundamental design of this macro (%xls2sas) is to use a recursive approach. The macro flow is as
follows: it first opens the root folder using DOPEN. If DOPEN returns a directory id value of greater than 0,
it then uses DNUM to return the number of members in this directory. Once it gets the total number of
members in the folder, it uses DREAD to get the member name and the scan function to extract the second
word of the member name with the period as the delimiter. If the second word is XLS, then the Excel
spreadsheet is read in using Excel LIBNAME. Once the LIBNAME statement is successfully executed, the
Excel workbook, the sheets, and named ranges defined in the workbook will be available in the SAS
Explorer window. If the second word resolves to null, it means the first word is considered as sub-folder.
%xls2sas calls itself upon reaching sub-folders not listed in the exclusion folders list. When %xls2sas is
called, it repeats the same process as the above. Upon finishing reading any file or folder, it goes up one
level to its parent folder to read the next XLS file or folder. The process finishes when all Excel files and
folders under the root folder are read. The key to the recursive design is to localize the macro variables so
that every call to %xls2sas will have its local instance of variables. This prevents subsequent %xls2sas calls
from overwriting the macro variable values of previous macro calls.