Research Operations for Secondary Use of Clinical Sites’ EMR


Nicola Sawalhi-Leckenby, MSc
Research Associate
Real-World Evidence

Sofia Fernandes, MSc
Associate Director, Project Management
Real-World Evidence

Vernon Schabert, PhD
Independent Consultant and
Former Senior Research Scientist

Provisions of the US Food and Drug Administration’s 2016 21st Century Cures Act,1,2 and several initiatives funded by the European Medicines Agency,3 have greatly increased life sciences companies’ demand for real-world data (RWD). These initiatives have increased the potential for real-world evidence (RWE) derived from RWD to influence regulatory decision making, including approval of new indications for approved drugs. Uses of RWD that get closer to the approval of new indications greatly increase regulators’ scrutiny of study design rigor, richness of clinical detail, and validation of data against primary sources.4 Pre-curated RWD research databases that have been used widely to influence reimbursement or post-authorization decisions have rarely passed the scrutiny demanded for such uses.

Sponsors’ pharmacovigilance and medical affairs teams frequently gather RWD directly from medical sites for chart reviews, registries, and other observational studies. These sources have also become more attractive for supplementing new indication applications, particularly under accelerated approval schemes for breakthrough therapies and orphan indications.5-7 The human effort and time required for such data collection limit sponsors’ ability to conduct these studies at scale. However, the increasing global adoption of electronic medical records (EMRs) at clinical sites has prompted interest in using sites’ EMRs systematically for observational studies. Sponsors hope that sites can spend less time performing manual abstraction and resolving queries, leading to lower costs, faster data collection, larger sample sizes, and higher quality and accuracy.

Despite the attractiveness of EMR-based site studies, demand for such data frequently outpaces the data exchange technologies required to implement EMR data collection. Technology solutions are possible and are (at least partly) enabled by international data exchange standards implemented in most branded EMRs.8,9 However, through our experience implementing several EMR data collection studies at clinical sites, we have learned that operational issues can often pose greater barriers to EMR studies than the technology limitations. Stakeholders at clinical sites often lack knowledge and harbor reasonable apprehensions about providing access to EMR data, and their concerns have been amplified as sanctions have increased (and have been more widely publicized) following new privacy laws such as the 2018 European Union (EU) General Data Protection Regulation (GDPR). Implementing site-based EMR studies requires new collaborations and change management within clinical sites, and few have invested in changes to accommodate EMR data collection approaches. Here we present four key lessons for EMR studies that we have gathered through our experience working with sites in multiple countries.

    Lesson 1

    EMR Studies Operate in a Clinical Trials World

    Clinical sites’ interest in study participation is commensurate with their direct (and sometimes narrow) perception of benefit. Tangible benefits often outweigh intangible benefits in sites’ decisions to participate, particularly given pressures on clinician productivity and revenue generation present in many clinical settings. Transparency regulations ensure that site-based studies offer financial reimbursement commensurate with effort, so site investigators who make purely rational economic decisions would perceive equal effort versus reward between observational studies and RCTs. However, although reimbursement for effort is similar for RCTs and observational studies, investigators often prefer RCTs because of the larger reimbursement potential per study. RCTs also offer investigators access to new investigational products before approval, and greater research prestige relative to observational studies. When we have sent study invitations to experienced study sites, only a third as many sites have returned the initial Confidential Disclosure Agreement (CDA) for observational studies as for RCTs.

    Because site investigators more frequently opt for participation in RCTs, their institutions have often set up procedures optimized for RCTs but not for observational studies. This has multiple consequences for observational study sponsors. First, site-developed templates for study agreements, ethics applications, and data protection reviews often assume that all studies will be RCTs. When observational study teams plan on secondary use of pseudonymous data, they need to plan on additional time to ask sites how to manage exceptions to an RCT-optimized process. This may include forms that require copies of case report forms (CRFs) that won’t exist or a request for adverse event (AE) reporting procedures when no patients will be identifiable for these reports. In a US-based study using a site’s custom clinical outcome assessments (COAs) linked to their Epic-brand EMR, we found ourselves educating the sponsored projects office on its requirements under the US Health Insurance Portability and Accountability Act (HIPAA). The office was unaware that because they would be providing us with a “Limited Data Set” under HIPAA, they were required to negotiate a Data Use Agreement (DUA) with us that limited how their patient data could be used.

    Second, sites’ sponsored projects offices and contracting teams often feel less pressure from investigators to sign observational study agreements relative to clinical trial agreements. Observational study teams need to consider this lower motivation when they manage expectations regarding timelines and when developing risk management plans. Even in observational studies, site investigators value positive sponsor engagement, and this can improve a study team’s leverage with the site. Encouraging sponsors to plan on additional site engagement time early in the study process can result in more motivated investigators and more efficient site activation.

      Lesson 2

      It Takes a Village to Judge a Site’s EMR Feasibility

      Investigator motivation also has considerable impact on feasibility analysis when planning secondary use of sites’ EMR data. Researchers must lead feasibility assessments to ensure that 1) clinical data sources are complete and accurate records of relevant patient care, and 2) there is an achievable process to approve and execute the required data exchange. Unlike traditional site-based studies, EMR study feasibility requires coordination of input from site functions such as IT, administrators, sponsored projects offices, data protection, analytics, and legal departments. Site investigators often have little interaction with these functions when providing patient care, and these functions are also often unfamiliar with working together to approve or conduct studies. Therefore, sponsors and their study teams should plan on early and active engagement with multiple site stakeholders to understand whether the site’s data and infrastructure will support EMR studies.

      To minimize risk of delay and diffusion of responsibility, we recommend that study teams identify a non-investigator site contact who has capacity and desire to coordinate across multiple stakeholders and motivate completion of feasibility responses. Without such a motivated site coordinator, the risk for non-response and delays during feasibility is substantial. We are currently conducting a pilot of a technology partner’s EMR data exchange technology with sites in multiple European countries. We began with 10 interested sites that completed a CDA and began the feasibility process. Of these, only two completed their feasibility questionnaires before we moved on to ethics and data protection reviews. At these two sites, we were able to identify coordinators who committed adequate time to learn an unfamiliar process and convey it to relevant internal stakeholders. At sites where a strong coordinator was not available, our study teams spent substantial time being referred to new site contacts and re-explaining study objectives to staff with little research experience and little relationship with the investigator.

      We have also found that early financial reimbursement improves site willingness to support the higher feasibility effort required for EMR studies. In the study we referenced using a site’s Epic EMR, we executed a site start-up agreement to cover the feasibility process. This early agreement increased our ease of interaction with the investigator, study coordinators, data protection officer, and analytics team. We had a similar positive experience in a study using EMR data from a clinical site in Norway. Although we needed to negotiate second agreements with each of these sites after receiving all approvals to conduct the study, we consider start-up agreements a best practice for accelerating site activation in studies involving secondary use of a site’s data.

        Lesson 3

        EMR Studies Strain Ethics and Data Protection Workflows

        Prior to the availability of EMRs, site-based RWD studies were already employing electronic data collection. Case report forms have long been collected from sites through the use of electronic data capture (EDC) systems. However, compared to data collection from EMRs, site-based studies using EDC rely on human effort to transform source documentation into fit-for-purpose data entries for a study protocol. The human involvement in abstraction and EDC data entry has historically been leveraged to minimize inference and algorithm development by study database programmers, but it has also benefitted studies by further reducing the risk of patient re-identification from study data. Many CRF designers have adopted a set of informally shared practices to accrue these benefits, such as the replacement of specific service dates with date spans and recording of only those services critical to the study database analyses. These CRF design practices usually satisfy ethics bodies’ perceptions of low patient identification risk, and they have also limited the amount of technical knowledge ethics reviewers need to approve use of EDCs.
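The two CRF design practices named above (date spans instead of service dates, and recording only study-critical services) can be illustrated with a minimal Python sketch. The service names, the `STUDY_CRITICAL` set, and the `abstract_services` helper are hypothetical and chosen only for illustration; real CRFs would follow the study protocol.

```python
from datetime import date

# Hypothetical set of service types the study analyses actually need.
STUDY_CRITICAL = {"chemo_infusion", "ct_scan"}

def abstract_services(index_date, services):
    """Return study-ready rows: critical services only, dates as day spans."""
    rows = []
    for service_type, service_date in services:
        if service_type not in STUDY_CRITICAL:
            continue  # drop services the analysis does not need
        rows.append({
            "service_type": service_type,
            # Store the span from the index date, not the calendar date.
            "days_from_index": (service_date - index_date).days,
        })
    return rows

# Invented example chart entries for one patient.
raw = [
    ("chemo_infusion", date(2019, 3, 15)),
    ("flu_vaccine", date(2019, 4, 2)),
    ("ct_scan", date(2019, 6, 1)),
]
rows = abstract_services(date(2019, 3, 1), raw)
```

In this sketch the output rows carry no calendar dates and no services outside the analysis plan, which is the property that keeps re-identification risk low in abstracted records.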

        EMR studies hold promise for greater efficiency and scalability because they reduce or eliminate the need for human abstraction. This can only be achieved if raw records pass from the site to the study database programmer, and interpretation effort is shifted from the human abstractor to electronic algorithms applied to raw EMR records. Even if identifiers are removed from raw EMR records before transfer, risk of patient re-identification from pseudonymous EMR data is still higher than from abstracted CRF records. Ethics committees that could previously function without detailed technology competencies must navigate through unfamiliar concepts when evaluating risks and harms in EMR studies.
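As a rough illustration of why pseudonymous raw records still carry more risk than abstracted CRFs, the Python sketch below removes direct identifiers and replaces the patient ID with a keyed hash. The field names, the `DIRECT_IDENTIFIERS` set, and the key handling are assumptions made for this example, not a description of any particular data exchange product.

```python
import hashlib
import hmac

# Hypothetical names of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "national_id", "address"}

def pseudonymize(record, site_key):
    """Drop direct identifiers and replace the patient ID with a keyed hash."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(site_key, record["patient_id"].encode("utf-8"),
                      hashlib.sha256)
    out["patient_id"] = digest.hexdigest()[:16]  # shortened pseudonym
    return out

# Invented raw record; the secret key would stay at the site.
record = {
    "patient_id": "12345",
    "name": "Example Patient",
    "national_id": "000000",
    "address": "1 Example Street",
    "visit_date": "2019-03-15",
    "diagnosis_code": "C50.9",
}
pseudo = pseudonymize(record, b"site-held-secret")
```

Note that exact visit dates and diagnosis codes pass through unchanged. These quasi-identifiers are what an abstractor would normally coarsen, and they are the reason reviewers need to weigh residual re-identification risk.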

          We have seen substantial variation in the readiness of countries’ ethics bodies to handle the challenge of reviewing EMR studies. Ethics bodies in the UK received a head start through the Caldicott Principles, originally developed in 1997 (and revised in 2013) following a review of how the NHS handled patient information.10 By the time the UK implemented GDPR with its Data Protection Act of 2018,11 the practice of applying the Caldicott Principles was long established and highly consistent with GDPR protections. Research Ethics Committees (RECs) in England and Wales form one of the core functions of the Health Research Authority (HRA), which exists to provide a unified national system for the governance of health research. The HRA is responsible for governing the technological side of EMR data access, which allows RECs to focus on the traditional benefits and harms during study ethics review. The HRA can approve electronic data access through two separate mechanisms – Caldicott Guardians designated at individual sites of care, or a centralized approval known as Section 251.12

            Evidera has conducted multiple studies with NHS trusts in partnership with CIS Oncology. CIS Oncology’s ChemoCare drug ordering platform is also used by many trusts for submissions to the Systemic Anticancer Therapy (SACT) research database. Evidera and CIS Oncology have been able to streamline data collection for site investigators following ethics and data protection approvals, and we have completed analysis of treatments long before they appear in SACT. Caldicott Guardian approvals at NHS trusts can be highly efficient, but processes vary widely by trust. At some trusts, the process appears to have been infrequently used or documented for external study teams, which can lead to long delays and limited feedback before receiving approvals.

            In other countries outside the US, it pays to prepare for surprises. As we mentioned above, we are currently piloting a technology partner’s EMR data exchange with sites in two European countries. Preliminary discussions with one of the sites in Germany had confirmed that they required ethics approval before the Data Protection Officer (DPO) could review our study request. However, after multiple rounds of review, the ethics committee acknowledged the limits of their competencies to evaluate the data exchange technology. The ethics body instructed our study team to seek advice from the DPO before the ethics committee could issue its opinion. The DPO, once approached, also deferred a decision until the site’s IT department could evaluate the technology. The site’s IT department helpfully noted that it could not validate the data exchange technology until the study received ethics and data protection approvals! Study teams who implement site EMR approaches will need to plan for substantial education, coordination, and change management effort to facilitate ethics reviews at participating sites.

              Lesson 4

              If You’ve Seen One EMR, You Haven’t Seen Them All

              The feasibility processes we discussed in Lesson 2 will yield critical information needed to configure data exchange for an approved study. Study teams and sites will need to have thorough alignment on the technical details required to facilitate secure and private data exchange of a site’s existing data. However, even the most secure and efficient data exchange can’t support study objectives if the desired study data are not where they are expected. In our experience, sponsors and site investigators underestimate the dispersion of sites’ data and the heterogeneity of EMR systems. This increases the risk for disappointment when executing an EMR study.

              In the Norwegian EMR study mentioned above, our feasibility process showed that clinical data for the patient group of interest were stored in three separate clinical systems. Two of these were separate EMRs, both actively used by the site, but storing different information. EMR A was used to record diagnoses and text notes; it was kept in active use because of its ease for reporting to the Norwegian national patient register. EMR B was made available to the site through a regional partnership, was maintained by an external vendor at little cost to the site, and was used to store prescription and laboratory data. The reliance on an external vendor would add substantial time and cost when integrating EMR B data with EMR A for a clinical study. Fortunately, we also located a third data source, an internal registry managed by site clinicians. Study eligibility criteria required both diagnoses (stored in EMR A) and prescription data (stored in EMR B), but the internal registry permitted site investigators to identify eligible patients more efficiently and simplified the process for requesting supplemental data exports from each of the two EMRs. Site feasibility processes for EMR studies must identify all potential systems that store relevant study data; questions specific to systems used by place of service and by type of data content (e.g., diagnoses, orders, results) can increase the likelihood that multiple systems are identified in feasibility responses.
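One way to organize feasibility responses by type of data content is sketched below in Python. The inventory loosely mirrors the Norwegian example, but the dictionary keys, the `patient_list` entry, the `imaging` requirement, and the `systems_needed` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical feasibility responses: data content type -> system storing it.
site_inventory = {
    "diagnoses": "EMR A",
    "text_notes": "EMR A",
    "prescriptions": "EMR B",
    "laboratory_results": "EMR B",
    "patient_list": "internal registry",
}

def systems_needed(inventory, required_content):
    """Group required content types by source system; flag unlocated ones."""
    by_system, missing = {}, []
    for content in required_content:
        system = inventory.get(content)
        if system is None:
            missing.append(content)  # no system identified during feasibility
        else:
            by_system.setdefault(system, []).append(content)
    return by_system, missing

# Content the (hypothetical) protocol needs; "imaging" was never located.
by_system, missing = systems_needed(
    site_inventory, ["diagnoses", "prescriptions", "imaging"]
)
```

A check like this makes it visible early that eligibility criteria span more than one system, and that some required content has not yet been traced to any system at the site.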

              If this much variation can occur within a single site, it follows that variation will also be high across sites. Consolidation of EMR market share among US clinical sites offers some hope for consistency of site data, but study teams should not plan on seeing common EMR brands outside of the US. Across our various European EMR studies, we’ve gathered feasibility data for 20 sites in 11 countries. These 20 sites identified 16 different EMR brands in use. This diversity of implemented EMRs among sites poses significant barriers to efficiency in multi-site EMR studies.

              Fortunately, because EMRs are still required to exchange data with other clinical systems, EMR standardization efforts were underway long before demand increased for site-based RWD. Much of this standardization is accomplished through Health Level Seven (HL7), which has developed EMR data exchange standards since 1987.8 Virtually all electronic health data systems released to market since the year 2000 support at least one version of HL7 standards; estimates suggest that more than half of the world’s healthcare data are exchanged using an HL7 standard.13 The US Department of Health and Human Services has encouraged adoption of HL7-enabled EMRs through a successive program of legislation14,15 and rulemaking,16 including recent initiatives such as Blue Button.17,18 We hope that US efforts to promote standards-based data exchange will migrate to other countries through market forces. Given the diversity in EMR offerings witnessed globally, study teams cannot rely on developing custom data exchange procedures with each site if they aspire to use site EMRs at scale.
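To show why a shared exchange standard matters more than EMR brand, here is a minimal Python sketch that reads laboratory results from a pipe-delimited HL7 version 2 message. The sample message is invented, the helper names are hypothetical, and the parser handles only the simplest case (no escape sequences or repeating fields); a production study would use a maintained HL7 library and the versions each site actually supports.

```python
def parse_hl7_v2(message):
    """Split an HL7 v2 message into {segment_id: [list of field lists]}."""
    segments = {}
    for line in message.strip().split("\r"):  # segments end with carriage return
        fields = line.split("|")              # fields are pipe-delimited
        segments.setdefault(fields[0], []).append(fields)
    return segments

def lab_results(message):
    """Extract code, name, value, and units from each OBX (observation) segment."""
    results = []
    for obx in parse_hl7_v2(message).get("OBX", []):
        components = obx[3].split("^")        # OBX-3: observation identifier
        results.append({
            "code": components[0],
            "name": components[1] if len(components) > 1 else "",
            "value": obx[5],                  # OBX-5: observation value
            "units": obx[6],                  # OBX-6: units
        })
    return results

# Invented sample message with a pseudonymous patient identifier.
sample = "\r".join([
    r"MSH|^~\&|LAB|SITE|STUDY|SPONSOR|20190830||ORU^R01|MSG0001|P|2.5",
    "PID|1||PSEUDO123",
    "OBX|1|NM|718-7^Hemoglobin^LN||13.2|g/dL",
])
results = lab_results(sample)
```

The same extraction logic would apply to any system that emits conformant messages, which is the scalability argument for standards over site-by-site custom exports.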


                Increased adoption of EMR by clinical sites has the potential to transform healthcare not only through better clinical decision making, but also through more efficient clinical research. As we’ve shown, however, clinical sites, ethics bodies, and data protection officers require substantial education, reassurance, and change management support to be ready for using their EMR data for secondary research.

                The history of sponsor-funded clinical trials is relatively short. Drug approvals did not require well-controlled trials until the 1960s,19 around the same time that human subject protections were formalized in the Declaration of Helsinki.20 Most clinical sites that now participate in research have developed all their study infrastructure since that time. We trust that sites can and will continue to evolve their readiness and processes as sponsor demand expands to include more EMR use for observational studies.

                  Despite advancements in EMR technology and its increased adoption, heterogeneity of systems and inconsistent use within healthcare settings pose challenges for the researcher. Study teams need to pay careful attention to vetting sites’ use of their systems, including distribution of data across systems and interoperability. These are not things that the traditional site investigator knows well but are discoverable through careful coordination with investigators’ colleagues. Site feasibility in the era of EMR studies will involve the broader organization, including both technical and operational stakeholders, beyond the investigator and site coordinator. Site engagement and payment models will need to evolve to ensure efficient and effective EMR studies.

                    Pursuit of site data for EMR studies will also elevate data privacy concerns for site investigators and their colleagues. Electronic exchange of study data will often pose privacy and security risks comparable to those borne in studies using EDC systems, but sites will need educating and convincing that new procedures come with comparable safeguards. That process of convincing will require engagement with, and buy-in from, more contacts and functions within a site’s organization than are required for traditional observational studies. Our experiences engaging sites in these studies give us confidence that revised communication, coordination, and documentation can adequately educate and reassure sites that new paradigms offer comparable protections and the promise of greater efficiency.

                    Leveraging sites’ EMRs for secondary analysis still poses a set of critical technical challenges. Those challenges are magnified by a diverse range of proprietary systems and lagging adoption of data exchange standards. However, we’ve learned that data exchange technology is actually the last in a series of critical challenges facing the researcher interested in site-based RWD. We encourage sponsors and scientists to consider the human and operational impacts of secondary data use early in the study design phase, and to plan for change management at participating sites until new research models become more widely socialized in the clinical community.


                      1. US Food and Drug Administration. 21st Century Cures Act. Accessed August 30, 2019.
                      2. US Food and Drug Administration. Framework for FDA’s Real-World Evidence Program. Accessed August 30, 2019.
                      3. Plueschke K, McGettigan P, Pacurariu A, Kurz X, Cave A. EU-funded Initiatives for Real World Evidence: Descriptive Analysis of Their Characteristics and Relevance for Regulatory Decision-Making. BMJ Open. 2018 Jun 14;8(6):e021864.
                      4. Duke Margolis Center for Health Policy. A Framework for Regulatory Use of Real-World Evidence. September 13, 2017. Accessed September 4, 2019.
                      5. European Medicines Agency. PRIME: Priority Medicines. Accessed August 30, 2019.
                      6. European Medicines Agency. Orphan Designation: Overview. Accessed August 30, 2019.
                      7. US Food and Drug Administration. Breakthrough Therapy Approvals. Accessed August 30, 2019.
                      8. HL7 International. Introduction to HL7 Standards. Accessed August 30, 2019.
                      9. Meehan RA, Mon DT, Kelly KM, et al. Increasing EHR System Usability Through Standards: Conformance Criteria in the HL7 EHR-System Functional Model. J Biomed Inform. 2016 Oct;63:169-173. doi: 10.1016/j.jbi.2016.08.015. Epub 2016 Aug 11.
                      10. NHS. Information Governance Toolkit. Caldicott2 Principles. Accessed August 30, 2019.
                      11. NHS. General Data Protection Regulation (GDPR) - Information. Accessed August 30, 2019.
                      12. The United Kingdom Caldicott Guardian Council. Accessed August 30, 2019.
                      13. HL7 International 2018 Annual Report. Accessed August 30, 2019.
                      14. Schabert VF. Medical Specialty Societies – An Emerging Source of Real-World Evidence. The Evidence Forum, October 2016:17-22. Accessed August 30, 2019.
                      15. Centers for Medicare and Medicaid Services. Promoting Interoperability (PI). Accessed August 30, 2019.
                      16. Notice of Proposed Rulemaking to Improve the Interoperability of Health Information. Accessed August 30, 2019.
                      17. Blue Button. Accessed August 30, 2019.
                      18. CMS Blue Button 2.0. Accessed August 30, 2019.
                      19. US Food and Drug Administration. FDA and Clinical Drug Trials: A Short History. Accessed August 30, 2019.
                      20. World Medical Association. WMA Declaration of Helsinki – Ethical Principles for Medical Research Involving Human Subjects. July 9, 2018. Accessed September 4, 2019.