In a previous post, I shared an example of using ODS PACKAGE to create ZIP files. But what if you need to read a ZIP file within your SAS program? In SAS 9.4, you can use the FILENAME ZIP access method to do the job.
In this example, let's pretend that I need to analyze data that a government agency published (maybe by using SAS!) into a ZIP file. I've selected an exciting data source (found via data.gov) about Large Truck Crash Causation.
First, I need to download the latest version of the data file. I'll use PROC HTTP to do that job:
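A minimal sketch of that download step might look like the following. The URL is a placeholder rather than the agency's real address, and the `ltccs` fileref name and the choice of the WORK folder as the download location are assumptions made for illustration.

```sas
/* Placeholder URL: substitute the real address of the ZIP file */
%let zipurl = https://example.com/path/to/data.zip;

/* Store the download in the WORK folder so it goes away with the session */
filename ltccs "%sysfunc(getoption(work))/ltccs.zip";

/* Fetch the ZIP file and write the response body to the fileref */
proc http
  method="GET"
  url="&zipurl"
  out=ltccs;
run;
```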
Next, I need to discover what files are within the ZIP file. I'll assign a fileref using the new FILENAME ZIP method. FILENAME ZIP is a directory-based access method, similar to the CATALOG access method or to using FILENAME to map to a folder. You can use functions such as DOPEN and DREAD to treat the ZIP file as if it's a file directory (since that's what it is, in concept).
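The directory-style approach can be sketched roughly as shown here. It assumes the archive was saved as ltccs.zip in the WORK folder; that path and the CONTENTS data set name are illustrative choices, not requirements.

```sas
/* Assign a ZIP fileref to the downloaded archive (path is an assumption) */
filename inzip zip "%sysfunc(getoption(work))/ltccs.zip";

/* Read the names of the "members" (files) in the ZIP file */
data contents(keep=memname);
  length memname $200;
  fid = dopen("inzip");      /* open the ZIP as if it were a directory */
  if fid = 0 then stop;      /* could not open: nothing to list */
  memcount = dnum(fid);      /* number of members in the archive */
  do i = 1 to memcount;
    memname = dread(fid, i); /* name of the i-th member */
    output;
  end;
  rc = dclose(fid);
run;

/* Create a simple report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs n;
run;
title;
```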
Here's the report of files within the ZIP archive:
I've identified the HAZMAT.TXT file as the one that I want to analyze. I peeked at the first couple of records and was able to scratch out a simple DATA step to read the data. Notice how I don't need to explicitly extract the HAZMAT.TXT file -- I can simply reference it as a "member" of the INZIP fileref. The ZIP access method does the rest.
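To illustrate the member syntax without guessing at the file layout, here is a small sketch that only echoes the first few records of HAZMAT.TXT to the SAS log. The archive path is an assumption, and the member name should match what appears in the listing of the ZIP contents. A real DATA step would replace the bare INPUT statement with one that matches the actual columns.

```sas
/* ZIP fileref for the archive (path is an assumption) */
filename inzip zip "%sysfunc(getoption(work))/ltccs.zip";

/* Peek at the first 5 records of one member; no extraction needed */
data _null_;
  infile inzip(HAZMAT.TXT) obs=5;
  input;           /* read the record into the input buffer */
  put _infile_;    /* write the raw record to the log */
run;
```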
SAS reads my data file successfully, and yields this interesting box plot from the SGPLOT step:
(It looks like most "hazardous materials" accidents involved just 2 or 3 vehicles, except for one messy outlier that had nearly 30. Imagine the cleanup effort on that one!)
As an alternative, if I know exactly which file I need, I can assign a direct fileref by using the MEMBER= syntax:
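A short sketch of the MEMBER= form, assuming the same archive location as before; the `hazfile` fileref name is just an illustrative choice. Once assigned, the fileref behaves like a reference to a single file.

```sas
/* Point the fileref directly at one member of the archive */
filename hazfile zip "%sysfunc(getoption(work))/ltccs.zip" member="HAZMAT.TXT";

/* Echo the first 5 records to the log to confirm the fileref works */
data _null_;
  infile hazfile obs=5;
  input;
  put _infile_;
run;
```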
The ZIP access method isn't just for reading. I can also use it to create and update ZIP files. For creating ZIP files, I prefer to use ODS PACKAGE. But it's very handy to be able to update ZIP files from a SAS program without using an external tool. For example, here's a program that deletes an extraneous file from an existing ZIP file:
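One way to sketch such a deletion is to combine the MEMBER= syntax with the FEXIST and FDELETE functions. The archive path and the member name below are hypothetical stand-ins; substitute your own.

```sas
/* Hypothetical archive path and member name, for illustration only */
filename newzip zip "/path/to/archive.zip" member="extra.txt";

data _null_;
  /* Delete the member only if it exists in the archive */
  if fexist("newzip") then rc = fdelete("newzip");
  put rc=;   /* 0 means the delete succeeded; missing means nothing was attempted */
run;

filename newzip clear;
```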
Note: Like ODS PACKAGE, the FILENAME ZIP method does not support encrypted (password-protected) ZIP archives.
Download the complete SAS 9.4 program: filenameZipHttpExample.sas
Thanks to the growing size of data files, ZIP files are created and consumed by SAS users everywhere. Between ODS PACKAGE and FILENAME ZIP, you can teach your SAS programs to build and read the files without having to rely on external tools. The more that you can use native SAS methods for this work, the more portable your SAS programs will be.