Weill Cornell Medicine Samuel J. Wood Library

Biomedical Research Database Resources

This page provides information on and access to a range of biomedical databases and database-related applications. Click on the software name to view details, register for update alerts, download, or obtain a license.


i2b2 enables researchers to discover cohorts of patients using data from EHR systems. Through a point-and-click interface, researchers can build queries drawing from demographics, diagnoses, procedures, medications, and results recorded in Epic, the EHR system of Weill Cornell Physician Organization outpatient clinics (access to data from Eclipsys/Allscripts Sunrise Clinical Manager, the EHR system of NewYork-Presbyterian inpatient units, is planned). i2b2 supports different types of queries of clinical data, including whether clinical concepts occurred at any point in a patient’s medical history, during a particular visit, or in a sequence of events. Using de-identified data, investigators can identify potential cohorts of interest and later obtain identified or limited data sets with Institutional Review Board (IRB) approval.

Information links

System requirements

Costs and fees

This software is available at no cost to WCM faculty, staff and students.

Request database service from the library

Find related database by function

The New York City Clinical Data Research Network (NYC-CDRN) was established to improve and streamline patient-centered research. NYC-CDRN draws on a large volume of robust, high-quality patient data and support services from a collaboration of more than 20 partners, including WCM and NYP. The network collects comprehensive medical histories for what will be as many as 6 million patients.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function


Medicare is the federally funded program that provides health insurance for the elderly, persons with end-stage renal disease, and some persons with disabilities. Of persons age 65 and over, 97 percent are eligible for Medicare. Almost all Medicare beneficiaries have Part A coverage, which includes hospital, skilled-nursing facility, hospice, and some home health care. Information about Medicare eligibility and enrollment is available for all persons included in the SEER-Medicare data (both cancer and non-cancer cases). Medicare claims (bills) are available only for persons with fee-for-service coverage.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey combines interviews and physical examinations. NHANES is a major program of the National Center for Health Statistics (NCHS), which is part of the Centers for Disease Control and Prevention (CDC).
The survey examines a nationally representative sample of about 5,000 persons each year. These persons are located in counties across the country, 15 of which are visited each year. The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component consists of medical, dental, and physiological measurements, as well as laboratory tests administered by highly trained medical personnel.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

The SEER-Medicare data reflect the linkage of two large population-based sources of data that provide detailed information about Medicare beneficiaries with cancer. The data come from the Surveillance, Epidemiology and End Results (SEER) program of cancer registries, which collect clinical, demographic, and cause-of-death information for persons with cancer, and from the Medicare claims for covered health care services from the time of a person's Medicare eligibility until death.
The linkage of these two data sources results in a unique population-based source of information that can be used for an array of epidemiological and health services research. For example, investigators using this combined dataset have conducted studies on patterns of care for persons with cancer before a cancer diagnosis, over the period of initial diagnosis and treatment, and during long-term follow-up. Investigators have also examined the use of cancer tests and procedures and the costs of cancer treatment.
The linked SEER-Medicare data files are large and complex. Before beginning an analysis, researchers are advised to read all documentation to determine whether the data will support their proposed research question. In addition, the SEER-Medicare data have a number of particular qualities and anomalies. Researchers are strongly encouraged to understand the complexity of the data before undertaking any analyses or publishing findings.

Information links

System requirements

Windows NT, Windows 95 or later if using a PC
GUNZIP required if using UNIX or Linux

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

The Limited Access Death Master File (DMF) from the Social Security Administration (SSA) contains over 86 million records created from SSA payment records. This file includes the following information on each decedent, if the data are available to the SSA: social security number, name, date of birth, and date of death. The SSA does not have a death record for all persons; therefore, the SSA does not guarantee the veracity of the file, and the absence of a particular person is not proof that the person is alive. The SSDMF database is updated once a week.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

SPARCS is a comprehensive all-payer data reporting system, initially created to collect information on discharges from hospitals. SPARCS currently collects patient-level detail on patient characteristics, diagnoses and treatments, services, and charges for each hospital inpatient stay and outpatient (ambulatory surgery, emergency department, and outpatient services) visit, and for each ambulatory surgery and outpatient services visit to a hospital extension clinic or a diagnostic and treatment center licensed to provide ambulatory surgery services. SPARCS offers three levels of data access: public, limited, and identifiable. Public use data are openly available. Limited or identifiable data require the submission of an application.

Information links

System requirements

Costs and fees

This software is open source and can be freely downloaded using the link below.

Request database service from the library

Find related database by function

The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute provides information on cancer statistics. The SEER research data include SEER incidence and population data associated by age, sex, race, year of diagnosis, and geographic areas (including SEER registry and county). SEER research data are released every spring based on the previous November’s submission of data. Additional datasets are also available, including Standard Population Data, U.S. Mortality Data, and U.S. Population Data.

Information links

System requirements

SEER*Stat (optional software for accessing SEER data) requires a PC

Costs and fees

This software is open source and can be freely downloaded using the link below.

Request database service from the library

Find related database by function


IPA is a web-based software application for the analysis, integration, and interpretation of data derived from 'omics experiments, such as RNAseq, small RNAseq, microarrays including miRNA and SNP, metabolomics, proteomics, and small-scale experiments that generate gene and chemical lists. Powerful analysis and search tools uncover the significance of data and identify new targets or candidate biomarkers within the context of biological systems.

Information links

System requirements

Windows or Mac

Costs and fees

This software is available at no cost to WCM faculty, staff and students.

Request database service from the library

Find related database by function

KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

TRANSFAC provides data on eukaryotic transcription factors, their experimentally proven binding sites, consensus binding sequences (positional weight matrices), and regulated genes. TRANSCompel contains data on eukaryotic transcription factors experimentally proven to act together in a synergistic or antagonistic manner.

Information links

System requirements

Costs and fees

This software is not currently licensed by ITS. If you would like to share the licensing costs of this product with other interested users, please register your name using the link below.

Request database service from the library

Find related database by function

