Sequence Read Archive

About the Sequence Read Archive (SRA)

The Sequence Read Archive (SRA) is the National Center for Biotechnology Information (NCBI) database that stores sequence data obtained from next-generation sequencing technology. Released in 2009, the SRA contains 9 million records and 12 petabytes of data. The SRA is a broad collection of experimental DNA and RNA sequences that represent genome diversity across the tree of life. Through this database, researchers can search the metadata for those sequences to locate the sequence reads they need for further analyses.
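As a rough illustration of what searching SRA metadata can look like programmatically, the sketch below queries NCBI's public E-utilities ESearch service with `db=sra`. E-utilities is not mentioned on this page, so its use here is an assumption about one possible route. The function names and the example search term are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint (assumed access route; not named on this page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(term, max_records=5):
    """Build an ESearch URL that queries the SRA database for `term`."""
    params = {"db": "sra", "term": term, "retmax": max_records, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_record_ids(response_text):
    """Extract the list of matching record UIDs from an ESearch JSON reply."""
    payload = json.loads(response_text)
    return payload.get("esearchresult", {}).get("idlist", [])


def search_sra(term, max_records=5):
    """Run the search over the network and return matching SRA record UIDs."""
    with urlopen(build_sra_search_url(term, max_records), timeout=30) as reply:
        return parse_record_ids(reply.read().decode("utf-8"))


# Hypothetical usage (requires network access):
# ids = search_sra("Escherichia coli[Organism]")
```

The returned UIDs identify matching SRA records. Retrieving the reads themselves is a separate step that this sketch does not cover.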

New Request for Information Seeks Public Input on Use of Cloud Resources and New File Formats for Sequence Read Archive Data

Submissions Due July 17

SRA in the Cloud

SRA data is available on Google Cloud Platform and Amazon Web Services through the STRIDES Initiative. All publicly available, unassembled read data and authorized-access human data can be accessed and used for computation through these cloud providers. For details on accessing and working with SRA data in the cloud, see the NCBI SRA in the Cloud documentation. Information is also available about the formats of SRA data in the cloud and about SRA data access costs in the cloud.

SRA Data Working Group

The Council of Councils advises the NIH Director on matters related to the Division of Program Coordination, Planning, and Strategic Initiatives (DPCPSI). The Council established the SRA Data Working Group in 2019 to provide recommendations on key factors for storing, managing, and accessing SRA data in cloud service provider environments.

The SRA Data Working Group is currently examining analyses of SRA data access, cost, and usage, among other areas. The working group is using these analyses, along with other factors and considerations, to evaluate and deliberate on data storage options. The group reports to the Council of Councils and will provide findings and draft recommendations on an ongoing basis.

The working group presented its interim report of draft recommendations at the Council meeting on Jan. 24. The group provided recommendations that would reduce the overall storage footprint of the SRA data while maintaining access to and use of the data by the research community.

This page last reviewed on May 28, 2020