Data Use Tutorial
What You Will Get From the ICPSR Archive
Most of the datasets in the ICPSR archive are raw data from surveys, censuses, and administrative records. They were originally gathered in research projects and for administrative purposes. ICPSR preserves them because they have value for "secondary analysis" - that is, reexamining old data to address new questions or to employ new analytic methods.
You will not get reports, charts, publications, or other studies from the ICPSR Archive. Rather, you will get raw numerical data which can be analyzed. Some datasets in the Archive have been used in published studies. You can see a bibliography of these studies by clicking on the "view related literature" link on each study's home page.
Finding the Data You Need
Use the Search Box if you know exactly what you want. For example, enter a survey title or the name of a principal investigator in the search box. Otherwise, find the data you need by using the browsing options on the Find & Analyze Data page, which are also summarized below:
Browsing Options
View all studies. ICPSR has over 7500 studies. Faceted searching makes it easy to quickly narrow your results.
View all studies for which online analysis is available. ICPSR offers more than 500 studies in Survey Documentation and Analysis (SDA), a Web tool for analyzing data. Analyzing data online means you don't have to download data for use with statistical software.
Browse by topic. You can browse through studies arranged by subject. Most categories have been further subdivided into narrower subject areas.
Browse by geography. Click on a map to find studies arranged by country and U.S. states.
Browse by investigator. Consult an alphabetical list of investigators for all ICPSR data.
Browse by series. Scan this list of more than 250 ongoing series and data collection projects.
Browse recent updates & additions. This is a list of all ICPSR studies, sorted so that the newest releases appear at the top of the results page.
ICPSR makes use of faceted searching. You may find it helpful to read over a brief post on Faceted Searching Using SOLR.
Additional Finding Aids
Consult the Thematic Collections. Fourteen special archives are devoted to research in particular disciplines. These areas include aging [gerontology], criminology, demography, economics, early education, education, history, international studies, law, mental health, racial and ethnic minorities [underrepresented groups], political science, psychology, public health, substance abuse, sociology, and terrorism.
Search the Bibliography of Data-Related Literature. This database contains over 48,000 published and unpublished works that have used data in the ICPSR collection. It is an excellent way to explore what has already been done in an area of study.
Search the Variables Database. The variables database currently includes variable-level documentation for approximately 1300 studies, which represent about 20 percent of ICPSR's holdings excluding US Census data. This amounts to roughly 1.2 million variables.
Terminology
In your searching you may encounter unfamiliar acronyms and specialized terms used in the social sciences. These resources can provide clear definitions:
- Glossary of Social Science Terms
- Index of Frequently Used Acronyms
- Social Sciences Research and Instructional Glossary

Obtaining Access to the Data
Documentation
All of the documentation files associated with ICPSR datasets are available to the general public. This documentation includes the study metadata--that is, the data describing the study. However, not all of the datasets themselves are freely available.
Data for ICPSR members
Most ICPSR data have been acquired, processed, and archived through the support of ICPSR member institutions. These data are available only to persons at ICPSR member institutions. If you are at a member institution, you will be able to download these datasets; if you are not, you may download only the documentation files for that study. A note on the study home page will tell you if the data are available only to ICPSR members.

Freely Available Data
Studies funded by federal agencies are usually available to the general public (unless restricted because of disclosure risk). These externally-funded thematic collections focus on aging, child care, criminal justice, demographic information, education, health and mental health, substance abuse, and terrorism.
Restricted Data
Some datasets are protected because there is a risk that the identity of research participants could be disclosed. These datasets are released with special protections. See restricted data and confidentiality. Datasets in which the level of disclosure risk is especially high can be examined only in a special data enclave in the ICPSR office in Ann Arbor. See enclave data.
If you find a dataset where access is restricted, you will see a note about this on the study home page.

MyData Account
You will need a MyData account to download data from ICPSR.
Downloading Data
When you find a study that you wish to download, you will see a variety of download links on the study home page:
- Download documentation files (login not required)
- Download select files (traditional download page)
- Download all files (file size)
The "download all files" link is also present as a large green button just to the right of the study title. Before downloading the data, you may want to do two things to ensure that the study meets your needs.
Click on the "View study description" link and carefully review the description of the study. This may include important information about file formats and other unique characteristics of the data.
Click on the "Browse documentation files" link and view the documentation files, including the codebook.
When you are ready to download, click on the "Download > select formats/datasets" link (or "Download > all files"). You will see a table displaying the datasets as rows (for studies that have more than one dataset) and format options as columns. Also note the two tabs atop the table, which allow you to choose between system files and ASCII files.

Note that by downloading data from ICPSR, you signify that you agree not to share the data with anyone not authorized by ICPSR to receive them. You further agree to the terms of use.
Analyzing Data
Using the downloaded data requires some basic knowledge of statistical analysis techniques and some familiarity with statistical analysis software packages. If you lack these skills, obtain assistance from a colleague or knowledgeable faculty and staff at your university. Each ICPSR member institution has a designated Official Representative. These individuals often provide such assistance. Find your institution's Official Representative by consulting the member list.
Setup Files
Although most ICPSR datasets provide system files, for some datasets only ASCII data and setup files are available. Setup files contain the syntax or program code needed to read the ASCII data files into your statistical software package. ICPSR currently provides setup files for SAS, SPSS, and Stata.
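To make the role of a setup file concrete, here is a minimal Python sketch of the same idea: a list of column positions that tells a program how to slice each fixed-width ASCII record into named variables. The variable names, column positions, and sample records are hypothetical and do not come from any ICPSR study. Real setup files are written in SAS, SPSS, or Stata syntax, and the actual layout for a study is given in its codebook.

```python
# Hypothetical column layout: (variable name, start column, end column).
# Columns are 1-based and inclusive, as codebooks usually state them.
LAYOUT = [
    ("CASEID", 1, 4),
    ("AGE", 5, 6),
    ("SEX", 7, 7),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of integer values."""
    record = {}
    for name, start, end in layout:
        field = line[start - 1:end].strip()
        # A blank field is treated as missing in this sketch.
        record[name] = int(field) if field else None
    return record


# Made-up sample records that follow the layout above.
sample = ["0001342", "0002  1"]
rows = [parse_record(line) for line in sample]
print(rows)
```

In this sketch the second record has a blank AGE field, so AGE is stored as None. How missing values are actually coded varies by study and should be checked in the documentation.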
Importing Data
These pages provide step-by-step instructions for loading ASCII data files into your software package.
System Files
In addition to setup files, many ICPSR studies come with system (or application) files. These files are specific to a particular statistical package. SPSS files have the .por or .sav extension, SAS files have the .xpt or .stc extension, and Stata files have the .dta extension. Some of the SAS and Stata files have supplemental syntax files that should be used in conjunction with the system files in order to apply the missing values specified by the study investigator(s).
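As a quick illustration of the extensions listed above, the following Python sketch maps a downloaded file name to the statistical package it is meant for. The function name and the example file names are hypothetical; only the extension-to-package pairs come from the text.

```python
from pathlib import Path

# Extension-to-package pairs as listed in the text above.
SYSTEM_FILE_PACKAGES = {
    ".por": "SPSS",
    ".sav": "SPSS",
    ".xpt": "SAS",
    ".stc": "SAS",
    ".dta": "Stata",
}


def package_for(filename):
    """Return the package a system file belongs to, or None if unrecognized."""
    return SYSTEM_FILE_PACKAGES.get(Path(filename).suffix.lower())


# Hypothetical file names, used only for illustration.
for name in ["study-0001-Data.dta", "study-0001-Data.sav", "study-0001-Data.txt"]:
    print(name, "->", package_for(name))
```

A file with an unrecognized extension, such as a plain .txt ASCII file, returns None here, which is a hint that it probably needs a setup file rather than being opened directly.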
