There’s lots of hope and excitement surrounding the promise of discovery held in the electronic health records (EHRs) that document the process of care, and there’s good reason for both. In the eight years since the passage of the Health Information Technology for Economic and Clinical Health (HITECH) Act, the percentage of hospitals and clinics using some type of electronic health record has grown from less than 30% to greater than 90%. This is indeed good news, providing some assurance that the observations collected at the point of care can improve the quality of phenotypes, provide a window into the patient’s life, and serve as the backdrop for evaluating the effectiveness of clinical interventions.
Yes, there are many challenges waiting to be resolved before the EHR becomes the new big data, and in fact EHRs themselves are changing. Whatever becomes the organizing technology for patient care data, several challenges must be resolved before those data can be widely used for data-driven discovery: assuring protection of, and adequate informed consent from, the patients to whom the records refer as well as the clinicians who generated them; creating privacy-preserving analytical strategies; and devising ways to integrate the records for a specific person over time and space. There has also been exceptional progress in improving the quality of data held in electronic health records, including widespread reliance on standard terminologies (e.g., SNOMED CT). SNOMED CT is a list of terms and clinical concepts. It is one of a suite of designated standards for use in U.S. Federal Government systems for the electronic exchange of clinical health information and is also a required standard in interoperability specifications of the U.S. Healthcare Information Technology Standards Panel. The clinical terminology is owned and maintained by the International Health Terminology Standards Development Organisation (IHTSDO), a nonprofit association.
It’s useful for data science experts, particularly those who envision greater use of EHRs in future investigations, to keep an eye on what’s happening in the electronic health records space. The rules of play are quite different from what one might find in research-oriented operations. For example, the electronic health records that are used in practice are purchased by a health care providing organization from one of the more than 200 EHR vendors that exist today. Certainly, the big players (Epic Systems, Cerner, Eclipsys) are well known to some, perhaps even through their own experiences with care and health services. Each vendor has a more or less sophisticated view of how research integrates with the care process and how the data from its specific EHR product may later be used in the course of research. In addition, the data captured and stored in EHRs are generally kept within the health care institution where the care occurred. Thus, access to EHR data for research requires interactions and negotiations with the care providing organizations.
There’s a lot of innovation in the EHR space right now, and data science investigators who envision using EHR datasets would be well advised to glance over there periodically to keep abreast of what’s happening. Two key innovations portend great benefit for care and, ultimately, for research uses of EHR data. The first is the emergence of the Fast Healthcare Interoperability Resources (FHIR) specification. FHIR is a draft standard for exchanging EHR data. It doesn’t specify the terms, but rather provides a way of organizing and easily sharing sets of terms with a wide range of applications and other EHRs. Imagine graphing clinical data on the fly: that’s one of the uses of FHIR. Among other things, the FHIR specification makes it easier to share data across disparate EHR systems and to link data from outside the care delivery system to the specific EHR. The benefit to researchers lies in a more comprehensive view of the patient.
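To make the "graphing on the fly" idea a bit more concrete, here is a minimal sketch of pulling a time series out of a FHIR search result. It assumes the data arrive as a JSON Bundle of Observation resources that carry LOINC codes, an `effectiveDateTime`, and a `valueQuantity`; real observations can take other shapes (for example, coded or component values). The sample bundle and the helper name `observation_series` are made up for illustration and are not part of any FHIR library.

```python
import json
from datetime import datetime

# Hypothetical sample data: a tiny FHIR-style Bundle with two heart-rate
# readings (LOINC 8867-4) and one body-temperature reading (LOINC 8310-5).
SAMPLE_BUNDLE = json.loads("""
{
  "resourceType": "Bundle",
  "type": "searchset",
  "entry": [
    {"resource": {"resourceType": "Observation",
      "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
      "effectiveDateTime": "2017-03-02T08:30:00+00:00",
      "valueQuantity": {"value": 80, "unit": "beats/minute"}}},
    {"resource": {"resourceType": "Observation",
      "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
      "effectiveDateTime": "2017-03-01T08:30:00+00:00",
      "valueQuantity": {"value": 72, "unit": "beats/minute"}}},
    {"resource": {"resourceType": "Observation",
      "code": {"coding": [{"system": "http://loinc.org", "code": "8310-5"}]},
      "effectiveDateTime": "2017-03-01T08:35:00+00:00",
      "valueQuantity": {"value": 37.1, "unit": "Cel"}}}
  ]
}
""")


def observation_series(bundle, loinc_code):
    """Return (timestamp, value, unit) tuples for one LOINC code, oldest first."""
    points = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue
        codings = resource.get("code", {}).get("coding", [])
        matches = any(
            c.get("system") == "http://loinc.org" and c.get("code") == loinc_code
            for c in codings
        )
        quantity = resource.get("valueQuantity")
        when = resource.get("effectiveDateTime")
        # Skip observations that do not fit the simple shape assumed here.
        if not matches or quantity is None or when is None:
            continue
        points.append(
            (datetime.fromisoformat(when), quantity["value"], quantity.get("unit"))
        )
    return sorted(points, key=lambda point: point[0])


heart_rate = observation_series(SAMPLE_BUNDLE, "8867-4")
for timestamp, value, unit in heart_rate:
    print(timestamp.isoformat(), value, unit)
```

The resulting list of timestamped values could be handed to any plotting tool. Because the structure is the same regardless of which EHR produced the bundle, the same few lines would in principle work across vendors.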
Another idea that is starting to take hold is the application of Blockchain to clinical data. Blockchain emerged from the financial industry. Rather than integrating the data themselves, it can integrate a record of all the locations where data related to a given entity (e.g., a patient) might be held. Moreover, a Blockchain structure could serve as a type of notification service, indicating when new updates to a person’s record exist. Blockchain may provide a secure and more complete index of the various places a patient’s data might appear, thus giving researchers a more comprehensive profile of the patient and his or her care experience.
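As a rough illustration of the index idea, the toy sketch below keeps an append-only, hash-chained list of pointers to where a patient's data live. It is not a real blockchain: there is no network, no consensus, and no access control, and the class name `LocationLedger`, its methods, and the example locations are all invented for this example. It shows only the two properties mentioned above: entries hold locations rather than clinical data, and any later tampering with an entry is detectable.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of an entry's contents."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class LocationLedger:
    """Hypothetical append-only index of where a patient's records are held."""

    def __init__(self):
        self.blocks = []

    def append(self, patient_id, location):
        """Record that data for patient_id exist at location (a pointer only)."""
        prev_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS_HASH
        body = {
            "index": len(self.blocks),
            "patient_id": patient_id,
            "location": location,
            "prev_hash": prev_hash,
        }
        self.blocks.append({**body, "hash": _digest(body)})

    def is_valid(self):
        """Check that every entry links to its predecessor and is unaltered."""
        prev_hash = GENESIS_HASH
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev_hash or block["hash"] != _digest(body):
                return False
            prev_hash = block["hash"]
        return True

    def locations_for(self, patient_id):
        """List every known location holding data for one patient."""
        return [b["location"] for b in self.blocks if b["patient_id"] == patient_id]

    def updates_since(self, patient_id, last_seen_index):
        """Entries for a patient added after a given index (a crude notification)."""
        return [
            b for b in self.blocks
            if b["patient_id"] == patient_id and b["index"] > last_seen_index
        ]


ledger = LocationLedger()
ledger.append("patient-001", "hospital-a/ehr")
ledger.append("patient-002", "clinic-b/ehr")
ledger.append("patient-001", "lab-c/results")

print(ledger.locations_for("patient-001"))
print(ledger.is_valid())
```

A researcher with appropriate permissions could consult such an index to learn where to request data, while the data themselves stay with the institutions that hold them.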
So, fellow data scientists, keep a watch on that EHR space. It may prove to be the highest-value data set to come!