Meet the 2021 DATA Scholars

Dr. Ansu Chatterjee

Dr. Ansu Chatterjee joined the Division of Medical and Scientific Research (DMSR) in the All of Us Research Program. He is working on multimodal data integration and record linkage, on addressing informative missing-data issues in different surveys, and on the ethical data science and machine learning aspects of research on large biomedical databases. He will develop algorithms and projects for exploring All of Us data and related datasets to advance precision medicine for all individuals. Dr. Chatterjee’s research interests include statistical foundations of data sciences, differential privacy, algorithmic fairness and ethical data sciences, high-dimensional data geometry, Bayesian statistics, resampling methods, small area and record linkage techniques, and applications of data sciences in multiple domains. Dr. Chatterjee is a faculty member in the School of Statistics at the University of Minnesota.

Dr. Lara Clark

Dr. Lara Clark joined the National Institute of Environmental Health Sciences (NIEHS) team working to develop tools and resources to support the use of geospatial data in environmental health research. Dr. Clark is an environmental scientist with training in air quality, urban planning, and exposure science. Before joining NIEHS, she was a postdoctoral researcher at the University of Washington, where her research focused on the public health and equity impacts of transportation and the built environment. She brings experience integrating and analyzing diverse geospatial data on environmental, social, and infrastructural systems.

Dr. Anne Deslattes Mays

Dr. Anne Deslattes Mays joined the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) in the Office of Data Science and Sharing. She is charged with engaging researchers in identifying high-priority, cross-project use cases to inform the continued improvement of the Kids First Data Resource and the INCLUDE Data Hub. Dr. Deslattes Mays is a mathematician and software engineer with a Ph.D. in Tumor Biology. Her history in large-scale computing began with forming the software group at Celera. She is driven to use all relevant data and applications at scale in the cloud in reproducible, open, and transparent ways to address pressing pediatric cancer needs, including identifying needed diagnostics and therapeutics. Dr. Deslattes Mays participates in global efforts in this regard, specifically with the Global Alliance for Genomics and Health (GA4GH) in its cancer and federated analysis groups.

Dr. John Gachago

Dr. John Gachago joined the Office of Data Science Strategy (ODSS) to identify the challenges of using Electronic Health Records for health disparities research, and the implications of applying ML/AI techniques to this data without adequately addressing these challenges. Additionally, he facilitates AI-Ethics labs and serves as an ODSS liaison to the NIH AIM-AHEAD Advisory Committee. Dr. Gachago is the CEO and Co-founder of E-Health Solutions, LLC, a digital health consulting firm. His previous career was in international logistics and the pharmaceutical industry. He has led two mobile health/telemedicine platforms from inception to market launch and has published several articles and white papers on digital health related to rural healthcare intelligence, blockchain technology for lower middle-income countries (LMICs), and strategic planning for health information technology. He teaches telehealth, organizational management, and healthcare finance at CalBaptist University and has developed continuing medical education programs on adopting digital health and transforming healthcare with AI. Dr. Gachago has consulted on national eHealth strategy for countries including Rwanda, Kenya, Haiti, and Ghana.

Dr. Priyanka Ghosh

Dr. Priyanka Ghosh joined the National Center for Biotechnology Information in the National Library of Medicine (NLM) to develop scalable search methods that support identification of both known and as-yet undescribed sequences in the Sequence Read Archive (SRA), the NIH’s largest publicly available repository of high-throughput sequence data, which was recently moved to the cloud. Dr. Ghosh’s research interests lie at the intersection of high-performance computing and computational biology. She will be developing scalable algorithms for sequence search that leverage existing data structures or exploring optimizations to boost performance at scale on different cloud-based hardware platforms. These efforts could lead to the detection of novel microbial species and genes, providing a deeper understanding of genomic variation, gene expression, and functional genomics. Her background lies in designing scalable algorithms for solving various combinatorial problems arising in genome assembly. She received her PhD in computer science from Washington State University in 2019. Prior to joining NIH, she was a postdoctoral research associate at Pacific Northwest National Laboratory (PNNL). She is the recipient of an IEEE/ACM Best Paper Award (NOCS19).

Dr. Jaleal Sanjak

Dr. Jaleal Sanjak joined the National Center for Advancing Translational Sciences (NCATS) in the Informatics Core within the Division of Preclinical Innovation. He is working to apply advanced modeling techniques to find commonalities among rare diseases and to identify areas of overlap between common and rare diseases. His work will serve as proof-of-concept research use cases for a large, rare disease knowledge base underlying the NCATS Genetic and Rare Disease (GARD) community portal. Dr. Sanjak is a computational biologist with training in systems biology, bioinformatics, and statistical genetics. In addition to his academic research training, he spent several years in the private sector as a lead data scientist for a small-business research and consulting firm focused on the life sciences and public health. He hopes to merge his background in computational biology with the broad array of data science capabilities within NCATS to drive advancements in the study of genetic and rare diseases.

This page last reviewed on March 23, 2023