DataScience@NIH

Driving Discovery Through Data

What is the (NIH) Commons?
Vivien Bonazzi / 10.29.15

For a while, Phil and I have been talking about the concept of a Commons.  We’ve been describing it in broad strokes in keynotes and lectures, and we’ve been talking with academic groups and various companies about how to implement it.  Today we’re posting a more detailed description of the Commons and beginning a series of blog posts on its various components. We see this document as a starting point: it describes the building blocks of the Commons and how they might work together. We’ve also provided some examples of current activities (Commons pilots) that are helping test the ideas. We recognize that engagement with the broader community is key, so we hope to foster discussion and gather input about the Commons concepts and framework.

Below is a brief summary of the Commons.  To access the text of the full document, go here

The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows. It is a digital ecosystem that supports open science and leverages currently available computing platforms in a flexible and scalable manner. Through it, researchers can find and use computing environments, access public data sets and connect with other resources and tools (e.g. other data, software and workflows) associated with scholarly research. For digital objects to be in the Commons, they must have attributes that make them Findable, Accessible, Interoperable and Reusable (FAIR), i.e. they must follow the FAIR principles. How those principles might be achieved could be the subject of a whole blog post.

Although the Commons is a complex ecosystem, there are four main components, which fit together according to Figure 1.

[Image: Commons diagram, with hardware, data, and software stitched together by the compliance model.]

Figure 1: Commons Framework

  • A computing environment, such as the cloud or HPC (High Performance Computing) resources, which supports access, utilization and storage of digital objects.
  • Publicly available datasets that adhere to a Commons digital object compliance model.
  • Software services and tools to facilitate access to and use of data, whether that data is in the Commons or elsewhere.
  • A digital object compliance model that describes the properties of digital objects that enable them to be findable, accessible, interoperable and reusable (FAIR).
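
To make the idea of a compliance model a little more concrete, here is a minimal sketch in Python of what checking a digital object against FAIR-style attributes might look like. Everything in it is an assumption made for illustration: the `DigitalObject` record, its field names, the `fair_gaps` function and the example identifier and URL are hypothetical, and the actual Commons compliance model may define these properties quite differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DigitalObject:
    """Hypothetical metadata record for a dataset, tool, or workflow."""
    identifier: Optional[str] = None    # persistent ID (assumed to aid findability)
    access_url: Optional[str] = None    # where the object can be retrieved
    data_format: Optional[str] = None   # a documented, shared format
    license: Optional[str] = None       # terms under which it may be reused
    keywords: List[str] = field(default_factory=list)  # searchable descriptors


def fair_gaps(obj: DigitalObject) -> List[str]:
    """Return the FAIR attributes this record does not yet satisfy.

    The mapping from fields to attributes is an illustrative assumption,
    not the Commons compliance model itself.
    """
    checks = {
        "findable": bool(obj.identifier) and bool(obj.keywords),
        "accessible": bool(obj.access_url),
        "interoperable": bool(obj.data_format),
        "reusable": bool(obj.license),
    }
    return [name for name, ok in checks.items() if not ok]


if __name__ == "__main__":
    # A placeholder record with no license recorded.
    record = DigitalObject(
        identifier="doi:10.0000/example",
        access_url="https://example.org/data/example.vcf",
        data_format="VCF",
        keywords=["genomics"],
    )
    print(fair_gaps(record))
```

In this sketch a record with no license would be reported as not yet reusable. A real compliance model would likely involve richer metadata than four presence checks, but the shape is similar: a set of described properties that each digital object can be tested against.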

Each of these components will require further development and harmonization. A series of Commons pilots has been initiated to develop and test these components and to evaluate how well they contribute to an ecosystem that effectively supports the sharing and reuse of digital objects.

 

 

Comments

This is a wonderful concept, a nice framework for Open Data. Looking forward to seeing recommendations on how this computing environment would be set up. Perhaps an open source platform could extend this vision to allow for the collaborative growth of analytic tools from industry as well as academia.

Great post Vivien! Thanks for the detailed explanation.
