FOA Title: 
Request for Information (RFI): Input on Development of Analysis Methods and Software for Big Data
Release Date: 
Aug 08 2013
Notice Number: NOT-HG-13-014

Key Dates
Release Date: August 8, 2013
Response Due Date: September 6, 2013

Issued by
National Human Genome Research Institute (NHGRI)

Purpose
This is a Request for Information (RFI) to solicit comments and ideas for the development of analysis methods and software tools, as part of the overall Big Data to Knowledge (BD2K) Initiative. Specifically, this RFI solicits input on needs for software and analysis methods related to data compression/reduction, data visualization, data provenance, and data wrangling.

Background
Biomedical research is becoming more data-intensive as researchers are generating and using increasingly large, complex, and diverse datasets. This era of "Big Data" in biomedical research taxes the ability of many researchers to release, locate, analyze, and interact with these data and associated software due to the lack of tools, accessibility, and training. In response to these new challenges in biomedical research, and in response to the recommendations of the Data and Informatics Working Group (DIWG) of the Advisory Committee to the NIH Director, NIH has launched the trans-NIH Big Data to Knowledge (BD2K) Initiative.

The long-term goal of the NIH BD2K Initiative is to support advances in data science, quantitative sciences, policy, and training that are needed for the effective use of Big Data in biomedical research. (The term "biomedical" is used here in the broadest sense to include biological, biomedical, behavioral, social, environmental, and clinical studies that relate to understanding health and disease.)

The term "Big Data" refers to datasets that are increasingly larger and more complex, and which exceed the abilities of currently used approaches to manage and analyze. "Big Data" is also meant to capture the opportunities and address the challenges facing biomedical researchers in accessing, managing, analyzing, and integrating large datasets of diverse data types. Such data types include imaging, phenotypic, molecular (including omics), clinical, environmental, behavioral, and many other types of biological and biomedical data. "Big Data" also includes data generated for other purposes (e.g.,
social media, search histories, and cell phone data) when they are repurposed and applied to address health research questions.

Biomedical Big Data primarily emanate from three sources: 1) a small number of groups that produce very large amounts of data, usually as part of projects specifically funded to produce important resources for use by the research community at large, or large collections of electronic health records; 2) individual investigators who produce large datasets for their own projects, which might be broadly useful to the research community at large; and 3) an even greater number of investigators who each produce small datasets whose value can be amplified by aggregating or integrating them with other data.

One of the DIWG recommendations is to support the development, implementation, evaluation, maintenance, and dissemination of informatics methods and applications. NIH supports a wide range of bioinformatics and computational science through efforts such as the Biomedical Science and Technology Initiative and funding opportunities through programs supported by individual NIH institutes and centers.

NIH is now considering supporting the development of analytical methods and software tools that will focus initially on four targeted areas to begin to address critical current and emerging needs of the research community for using, managing, and analyzing complex and larger data sets: data compression/reduction, visualization, provenance, and wrangling. The NIH BD2K Working Group charged with exploring the development of informatics methods and tools seeks input from the biomedical research communities on the four targeted areas listed above to ensure that the research resources generated have the highest impact and value to the research community. NIH has determined that guidance is needed from the broad scientific community in the following areas:

Data Compression/Reduction
While data compression is important for BD2K since it helps reduce resource usage, most compression techniques involve trade-offs among various factors, including the degree of compression, the amount of distortion induced, and the computational resources required to compress and decompress the data.
Data reduction aims to more dramatically reduce data volume and, in the meantime, reduce the complexity/dimensionality of the data for easier analysis. It usually involves processing and/or reorganization of the data to minimize redundancy, eliminate noise, and preserve the signal and data integrity.

Data Visualization
Data visualization permits researchers to communicate information through graphical and interactive means and enables them to explore and gain insight/knowledge from the data. The challenge in the Big Data era is interpreting complex, high-throughput data, especially in the context of other relevant, often orthogonal, data.

Data Provenance
Provenance of digital scientific data is useful for determining attribution, identifying relationships between objects, tracking back differences in similar results, guaranteeing the reliability of the data, and allowing researchers to determine whether a particular dataset can be used in their research (by providing lineage information about the data).

Data Wrangling
Data wrangling is a term applied to the conversion, formatting, and mapping of data that enables researchers to more easily submit data to a database or expose data on the internet, and that allows data to be easily accessible and shareable. Researchers who generate datasets that, in aggregate, become "Big Data" often find it difficult to submit data, even when standards are well-established. Specialized informatics skills are often needed, for example, to format data, apply metadata, fill in gaps, use ontologies, capture provenance, annotate features, and apply functions to reformat, manipulate, transform, and process data.

Information Requested
To maximize the impact of these valuable research resources and tools (informatics methods and tools) and to facilitate their use by scientists with a broad range of expertise, we seek input from the scientific and informatics research and user communities on identifying and prioritizing needs and gaps in the four focus areas outlined above.

Submitting a Response
All responses must be submitted via email by Friday, September 6, 2013. Please include the Notice number in the subject line. Response to this RFI is voluntary. Responders are free to address any or all of the categories listed above. The submitted information will be reviewed by the NIH staff.
This request for information is for planning purposes only and should not be construed as a solicitation or as an obligation on the part of the Federal Government. The NIH does not intend to make any awards based on responses to this RFI or to otherwise pay for the preparation of any information submitted or for the Government's use of such information. The NIH will use the information submitted in response to this RFI at its discretion and will not provide comments on any responder's submission. However, responses to the RFI may be reflected in future funding opportunity announcements. The information provided will be analyzed and may appear in reports. Respondents are advised that the Government is under no obligation to acknowledge receipt of the information received or to provide feedback to respondents with respect to any information submitted. No proprietary, classified, confidential, or sensitive information should be included in your response. The Government reserves the right to use any non-proprietary technical information in any resultant solicitation(s).

Inquiries
Please direct inquiries to:

Jennifer Couch, Ph.D.
National Cancer Institute
Telephone: 240-276-6210
Email:
Website: