Finding Statistics & Data Sets
Finding Statistics in the Literature
Statistics and information about data sets appear in the biomedical literature. In MEDLINE, two subheadings are often applied to articles containing statistics. For articles about diseases or conditions, use the subheading EP - Epidemiology. For everything else, use the subheading SN - Statistics and Numerical Data. To find articles in which large databases were used, search on the MeSH term "Databases" along with your subject.
Once you know the source of the data, you can search the web to see whether the data set has been updated or expanded. Many statistical publications are updated yearly.
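The subheading searches described above can be sketched as query strings. This is a minimal illustration, assuming PubMed's search syntax for MEDLINE (other MEDLINE interfaces write subheadings differently) and the public NCBI E-utilities ESearch endpoint. The headings "Asthma" and "Hospitals" and the helper function names are hypothetical examples, not part of this guide.

```python
from urllib.parse import urlencode

# Subheading names as given in the text above (EP and SN).
EPIDEMIOLOGY = "epidemiology"
STATISTICS = "statistics and numerical data"


def mesh_with_subheading(heading: str, subheading: str) -> str:
    """Build a PubMed-style MeSH heading/subheading search term."""
    return f'"{heading}/{subheading}"[MeSH Terms]'


def esearch_url(query: str, retmax: int = 20) -> str:
    """Build an NCBI E-utilities ESearch URL for a PubMed query (no request is sent)."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


# Hypothetical example topics: a disease uses EP, anything else uses SN.
disease_query = mesh_with_subheading("Asthma", EPIDEMIOLOGY)
other_query = mesh_with_subheading("Hospitals", STATISTICS)

print(disease_query)
print(other_query)
print(esearch_url(disease_query))
```

The printed query strings can also be pasted directly into the PubMed search box. The URL is only constructed here, so the sketch runs without network access.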
Internet Search and Manipulation Hints
Two questions to consider before beginning your statistical search:
- Who cares about or has a mandate to study the topic?
- Who has the resources and staff to collect data in this topic area?
Look for information about:
- File Format (HTML, PDF, Excel, text, etc.)
- Dates of Data (not the same as the publication date of the document or page)
- Sources of Data
- Contact Person
- Suggested Citation
- Availability of Documentation
- Data Use Limitations
- Anything special about the data?
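One way to keep track of the checklist above while evaluating a source is a small record with one field per item. This is only a sketch: the class name, field names, and sample values are hypothetical and simply mirror the list.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataSourceNotes:
    """One field per checklist item; None means not yet found."""
    file_format: Optional[str] = None
    dates_of_data: Optional[str] = None
    sources_of_data: Optional[str] = None
    contact_person: Optional[str] = None
    suggested_citation: Optional[str] = None
    documentation: Optional[str] = None
    use_limitations: Optional[str] = None
    special_notes: Optional[str] = None

    def missing(self) -> List[str]:
        """Return the names of checklist items that are still unrecorded."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Hypothetical example: only two items have been located so far.
notes = DataSourceNotes(file_format="Excel", dates_of_data="2005-2006")
print(notes.missing())
```

Listing what is still missing makes it easy to see which questions to ask the contact person or look up in the documentation.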
Statistical Resources and Publications
- 2008 Statistical Abstract - statistics on social and economic conditions in the United States. Selected international data are included.
- American FactFinder (Census data)
- Statistical Universe (LEXIS-NEXIS) - Choose the Statistical subset, a compilation from federal, state, and international statistical publications. See also the Quick Info/Reference section. Some of the tabular content can be downloaded as Excel files.
- FedStats - Gateway to statistics from over 100 U.S. Federal agencies
- New York State Department of Health - Statistics and Data
- New York City: New York City Department of Health & Mental Hygiene - Data and Statistics
Example: Summary of Vital Statistics
Example: My Community's Health - National Center for Health Workforce Analysis - HRSA
Projects Presenting Spatial Data (Geographic Information Systems)
- New York State: Cancer Surveillance Improvement Initiative
- New York State County Health Indicator Profiles
- GIS Downloadable Data Sets - GIS users can access several popular geographic data sets in a variety of formats that are downloadable for immediate use with your GIS software. You can preview these data sets and then download selected areas in a ready-to-use GIS format such as an ESRI shapefile. Some of the data may be downloaded for free while other data may be ordered through a simple e-commerce transaction.
Projects Consolidating Data Sets
- Reported Volume for Selected Procedures Performed in New York State Licensed Hospitals and Ambulatory Surgery Centers - Center for Medical Consumers - data derived from New York State SPARCS/Ambulatory Surgery databases.
WCMC/NYP/CU Data and Statistical Expertise
- Clinical Epidemiology and Health Services Research Program
- Biostatistics and Research Methodology Core Facility
- Digital Collections - Mann Library, Cornell University - The USDA Economics and Statistics System and CUGIR (The Cornell University Geospatial Information Repository)
- Department of Biological Statistics and Computational Biology Consulting Service
Web Directories of Data Sets
- Directory of Health and Human Services Data Resources - Compilation of collection systems sponsored by the U.S. Department of Health and Human Services (HHS). Databases from continuing departmental data projects or program administrative and evaluation activities that met the criterion of broad utility were included. Such data projects and systems included recurring surveys and disease registries either maintained or sponsored by HHS. Databases from one-time studies or data collections were also included when the data may have broad interest.
- Health Services & Sciences Research Resources (HSRR) - HSRR is a searchable database of information about datasets and instruments/indices employed in Health Services Research, Behavioral and Social Sciences and Public Health with links to PubMed.
- Health and Medical Care Archive - Robert Wood Johnson Foundation - Sponsored data sets held at the Inter-University Consortium for Political and Social Research (ICPSR). More data sets are available at http://www.icpsr.umich.edu/ - Cornell is a member.
Data Sets
- National Behavioral Risk Factor Surveillance System
- NYS BRFSS
- National Center for Health Statistics (NCHS) - National and state data sets as well as statistical reports. Includes information about ordering data sets that cannot be downloaded.
- CDC Data and Statistics page - much more than NCHS
- CDC WONDER - WONDER provides a single point of access to a wide variety of reports and numeric public health data.
- Agency for Healthcare Research and Quality - Data and Surveys
- HCUPnet - Online searchable data from the Healthcare Cost & Utilization Project in an interactive Tool for Community Hospital Statistics (National, Regional & participating States)
- RAND - Public Use Databases
- Statewide Planning and Research Cooperative System (SPARCS) - Data dictionaries, documentation and request forms. No searchable data online.
- U.S. Census Bureau
- WHOSIS - WHO Statistical Information System
Tools and Software for Data Acquisition and Analysis
- Epi Info / Epi Map - Epi Info and Epi Map are public domain software designed to provide for easy database construction, data entry, and analysis with epidemiologic statistics, maps, and graphs.
- DataFerrett - DataFerrett, a collaborative effort between the National Center for Health Statistics and the Bureau of the Census, is a data mining and extraction tool. It allows you to select a databasket full of variables, recode those variables as needed, and then develop and customize tables and charts. DataFerrett helps you locate and retrieve the data you need across the Internet to your desktop or system, regardless of where the data reside.
- SAS (Statistical Analysis System) - loaded on PCs in the Library Computer Room
- Free Statistical Analysis Tools - Compiled by David Lane, Rice University
Background Reading & Additional Training in Finding and Using Data
- Institutional Review Boards and Health Services Research Data Privacy: A Workshop Summary (2000) - Institute of Medicine
- Electronic Statistics Textbook - StatSoft
- HyperStat Online Textbook - David Lane, Rice University
- Statistical Books in the Library's online catalog, Tri-Cat
- Finding and Using Health Statistics: A Self-Study Course
- Training modules on the Behavioral Risk Factor Surveillance System - National Center for Chronic Disease Prevention and Health Promotion
- Request a consultation with a Librarian about finding statistics or data sets.

