#MentorMonday: Eric Sayers

Eric Sayers, Ph.D.
Staff Scientist and Head, Customer Engagement
National Center for Biotechnology Information (NCBI) at the National Library of Medicine

In support of the NIH STRIDES project, NCBI is moving the Sequence Read Archive database to the Google and Amazon cloud platforms. This is a massive undertaking (over 13 petabytes of data) that has the potential to revolutionize the way genomic scientists access and compute on large datasets.

Ensuring high value datasets are available to the community.

One critical measure of success is understanding how researchers use these datasets and, in particular, which datasets they use. Using dataset metadata as features, we are applying machine learning algorithms to identify patterns of access behavior so that we can keep the datasets of highest value to the community the most readily available, thereby making the most effective use of these cloud storage resources.

Helping make rapid progress.

We are excited to welcome Coding it Forward fellow Michelle Gan for the first time to our analytics team, and at exactly the right time! Not only can our fellow help us make rapid progress on this project, but we also hope to provide an environment where Michelle can explore interests and see how data science and computational skills are used to solve real, pressing problems. Bringing in a new set of eyes always refreshes a project, and we look forward to benefiting from the new perspectives our fellow will bring.

This page last reviewed on March 23, 2023