Coronavirus disease 2019 (COVID-19) surveillance system: Development of COVID-19 minimum data set and interoperable reporting framework
Mostafa Shanbehzadeh1, Hadi Kazemi-Arpanahi2, Komeil Mazhab-Jafari3, Hamideh Haghiri4
1 Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
2 Department of Health Information Technology, Abadan Faculty of Medical Sciences, Abadan, Iran
3 Department of Laboratory Sciences, Abadan Faculty of Medical Sciences, Abadan, Iran
4 Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|Date of Submission||05-May-2020|
|Date of Acceptance||21-May-2020|
|Date of Web Publication||31-Aug-2020|
Dr. Hadi Kazemi-Arpanahi
Department of Health Information Technology, Abadan Faculty of Medical Sciences, Abadan
Source of Support: None, Conflict of Interest: None
INTRODUCTION: The 2019 coronavirus disease (COVID-19) is a major global health concern. Joint efforts for effective surveillance of COVID-19 require immediate transmission of reliable data. In this regard, a standardized and interoperable reporting framework is essential for consistent and timely data exchange. Thus, this research aimed to determine the data requirements for interoperability.
MATERIALS AND METHODS: In this cross-sectional and descriptive study, a combination of literature review and expert consensus was used to design the COVID-19 Minimum Data Set (MDS). An MDS checklist was extracted and validated. The definitive data elements of the MDS were determined by applying the Delphi technique. Then, existing messaging and data standard templates (Health Level Seven-Clinical Document Architecture [HL7-CDA] and SNOMED-CT) were used to design the interoperable surveillance framework.
RESULTS: The proposed MDS was divided into administrative and clinical sections with three and eight data classes and 29 and 40 data fields, respectively. Then, for each data field, structured data values along with SNOMED-CT codes were defined and structured according to the HL7-CDA standard.
DISCUSSION AND CONCLUSION: The absence of an effective and integrated system for COVID-19 surveillance can delay critical public health measures, leading to increased disease prevalence and mortality. The heterogeneity of reporting templates and the lack of uniform data sets hamper optimal information exchange among multiple systems. Thus, developing a unified and interoperable reporting framework is an effective way to enable a prompt reaction to the COVID-19 outbreak.
Keywords: COVID-19, coronavirus disease 2019, minimum data set, semantic interoperability, surveillance system
|How to cite this article:|
Shanbehzadeh M, Kazemi-Arpanahi H, Mazhab-Jafari K, Haghiri H. Coronavirus disease 2019 (COVID-19) surveillance system: Development of COVID-19 minimum data set and interoperable reporting framework. J Edu Health Promot 2020;9:203
|How to cite this URL:|
Shanbehzadeh M, Kazemi-Arpanahi H, Mazhab-Jafari K, Haghiri H. Coronavirus disease 2019 (COVID-19) surveillance system: Development of COVID-19 minimum data set and interoperable reporting framework. J Edu Health Promot [serial online] 2020 [cited 2020 Sep 24];9:203. Available from: http://www.jehp.net/text.asp?2020/9/1/203/293946
| Introduction|| |
In December 2019, a cluster of pneumonia cases of initially unknown etiology emerged in Wuhan City, Hubei Province, China. After extensive speculation, a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was recognized as the causative pathogen of the disease. The disease was initially called "2019 novel coronavirus" and was later renamed coronavirus disease 2019 (COVID-19). The highly contagious nature of the disease and the rapid increase of new cases in China and many other countries led the World Health Organization (WHO), on January 30, 2020, to declare the COVID-19 outbreak a global public health threat.
Surveillance is the foundation of public health practice and research. To prepare for and respond to the COVID-19 pandemic, a robust and responsive surveillance system is needed, one that supports cooperation among public health practitioners, clinicians, and policymakers to direct disease control and prevention efforts. The effectiveness of a COVID-19 Surveillance System (COVSS) depends on clinical data and reports supplied by widely scattered public health and hospital information systems (e.g., hospital information systems (HIS), the Iranian Electronic Health Record (known as SEPAS), the Iranian Integrated Health System (known as SIB), and other clinical information systems). Effective implementation of a COVSS therefore requires clear and coherent data sets, along with unified standards for sharing these data rapidly, in support of e-health and P4 medicine (predictive, preventive, personalized, and participatory). A modular methodology should be adopted in the design and implementation of information systems to increase their integrity and enterprise usefulness. Data standardization and harmonization is the first important step in the System Development Life Cycle (SDLC) of an information system, and it should follow a proper plan. A Minimum Data Set (MDS) is a standard approach to data collection that provides accurate access to health data. In the development of Public Health Surveillance (PHS), an MDS improves the systematic collection, interpretation, comparison, and integration of data on health-related threats. However, data sharing may still be hindered if standardized methods are not used for coding and formatting data.
The use of Information and Communication Technology may help enable standardized, automated, and interoperable frameworks for data exchange between public health and hospital information systems running on heterogeneous platforms. Thus, the present study was conducted to provide a comprehensive MDS as a template for implementing a COVSS and then to design an interoperable exchange framework in the context of COVID-19.
| Materials and Methods|| |
This was a cross-sectional descriptive study conducted in 2020. Initially, to design the COVID-19 MDS, a combination of literature review and expert consensus was used. A review of the literature was conducted to retrieve related data resources on COVID-19, together with guidelines and instructions issued by local, national, and international organizations, especially the WHO and the Centers for Disease Control and Prevention. The literature review was limited to full-text English-language sources published between December 2019 and March 2020 and available in the PubMed, Scopus, Web of Science, ScienceDirect, Embase, and Cochrane databases.
To confirm the COVID-19 MDS, the preliminary data list was evaluated through the consensus of selected experts after review and discussion. We brought together a multidisciplinary team of 40 experts in virology, epidemiology, public health, infectious diseases, and health information management. A researcher-made questionnaire was created to validate the data fields. The participating experts were asked to review the initial draft of variables and to score each item according to its perceived importance on a 5-point Likert scale (ranging from 1, "very slightly important," to 5, "highly important").
The content validity of the questionnaire was evaluated using comments from medical informatics and health information technology experts (six persons in total, three from each field). The reliability of the questionnaire was assessed with the test–retest method among 10 infectious disease specialists. Using the decision Delphi technique in two rounds, decisions on which data fields to include were made based on the level of agreement. Specifically, data fields with <50% agreement were excluded in the first round, while those with more than 75% agreement were included in the first round. Those with 50%–75% agreement were surveyed again in the second round, and a field that reached 75% consensus there was regarded as a final data field. Further, experts who wished to change, delete, or add a variable were asked to give an acceptable reason in writing. The collected data were analyzed in SPSS 16, where Spearman's rank correlation coefficient was used to evaluate the reliability of the questionnaire, which showed a coefficient of 85%.
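The two-round decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the article does not state how the agreement percentage was derived from the Likert scores, so the sketch assumes that a score of 4 or 5 counts as agreement. The function names and the `threshold` parameter are hypothetical.

```python
from typing import List


def agreement_percent(scores: List[int], threshold: int = 4) -> float:
    """Percentage of experts whose Likert score is at or above `threshold`.

    ASSUMPTION: the source does not define how agreement is computed;
    counting scores of 4 or 5 as agreement is an illustrative choice.
    """
    if not scores:
        raise ValueError("at least one expert score is required")
    return 100.0 * sum(score >= threshold for score in scores) / len(scores)


def classify_round_one(agreement: float) -> str:
    """First round: <50% excluded, >75% included, 50%-75% sent to round two."""
    if agreement < 50:
        return "exclude"
    if agreement > 75:
        return "include"
    return "second_round"


def classify_round_two(agreement: float) -> str:
    """Second round: a field reaching 75% consensus becomes a final field."""
    return "include" if agreement >= 75 else "exclude"


# Example with made-up scores from four hypothetical experts.
example_agreement = agreement_percent([5, 4, 3, 2])
example_decision = classify_round_one(example_agreement)
```

With the made-up scores in the example, half of the experts agree, so the field would be carried into the second round rather than decided in the first.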
To determine the corresponding information content of data fields, a complete COVID-19 patient record sample in the Ayatollah Taleghani Hospital (focal center of COVID-19, Abadan, Iran) was selected and its contents were extracted by a checklist. Then, the information content was coded using selected classification or nomenclature systems.
In the next step, all scattered codes were mapped to Systematized Nomenclature of Medicine–Clinical Terms (SNOMED-CT) reference codes using the NPEX SNOMED-CT online browser (https://snomedbrowser.com/). This process was visualized with MindMaple Lite 1.71 software as a graphical user interface representing thesaurus mapping across multiple medical terminologies [Figure 1]. Next, the SNOMED-CT codes were structured into the Health Level Seven-Clinical Document Architecture (HL7-CDA) standard framework to provide the message syntax. Finally, Extensible Markup Language (XML) hierarchical rules were defined to standardize the message structure. XML provides a comprehensive and unified human- and machine-readable resource that formally defines and represents CDA information as a set of concepts in a given domain. Overall, the CDA schema was designed with a coded and structured header and body (CDA levels two and three) using SNOMED-CT reference codes and XML structure.
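To make the idea of a SNOMED-CT code carried inside CDA-style XML concrete, the following Python sketch builds a single coded observation with the standard library. It is a simplified illustration under stated assumptions, not the authors' schema: the helper name `coded_observation` is hypothetical, the example concept (386661006, Fever) is only an illustration and should be verified in a SNOMED-CT browser, and the fragment omits elements that a complete, schema-valid CDA entry would require.

```python
import xml.etree.ElementTree as ET

HL7_NS = "urn:hl7-org:v3"                 # XML namespace used by HL7 v3 / CDA
SNOMED_CT_OID = "2.16.840.1.113883.6.96"  # object identifier of the SNOMED CT code system

# Serialize HL7 elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", HL7_NS)


def coded_observation(code: str, display_name: str) -> ET.Element:
    """Build a minimal <observation> whose <code> points to a SNOMED CT concept.

    Hypothetical helper for illustration; a real CDA entry needs more
    elements (templateId, statusCode, effectiveTime, and so on).
    """
    observation = ET.Element(
        f"{{{HL7_NS}}}observation",
        {"classCode": "OBS", "moodCode": "EVN"},
    )
    ET.SubElement(
        observation,
        f"{{{HL7_NS}}}code",
        {
            "code": code,
            "codeSystem": SNOMED_CT_OID,
            "codeSystemName": "SNOMED CT",
            "displayName": display_name,
        },
    )
    return observation


# Illustrative concept; verify the identifier before any real use.
entry = coded_observation("386661006", "Fever")
xml_text = ET.tostring(entry, encoding="unicode")
```

Keeping the code system identifier next to every code is what lets a receiving system interpret the value without knowing anything about the sender's local vocabulary.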
| Results|| |
After the literature review, the proposed COVID-19 MDS was divided into administrative and clinical data categories. These categories contained three and eight data classes and 52 and 85 data fields, respectively. The administrative data category included the demographic, admission, and report ID data classes. The clinical data category covered clinical presentation, exposure to causal factors, physical examination, signs and symptoms, laboratory findings, CT results, treatment plan, and discharge outcome. Then, Delphi surveys were used to finalize the primary MDS. The results of the two Delphi rounds are presented in [Table 1].
|Table 1: Administrative and clinical data classes for a minimum data set for coronavirus disease-19 reporting|
After the second round of Delphi, 45 data fields in the clinical category and 23 fields in the administrative category were excluded from the primary MDS [Table 1]. Overall, the final numbers of data fields for the administrative and clinical categories were 29 and 40, respectively. In the next stage, the corresponding content for each finalized data field was extracted from real patient medical records. After the information content of the fields was defined, it was coded using selected classification or nomenclature systems (preferred codes). Then, all scattered codes were mapped to integrated SNOMED-CT codes through MindMaple software. [Table 2] and [Table 3] report the data classes, fields, corresponding content, data format, content definition, and preferred and reference codes for the administrative and clinical data categories, respectively.
|Table 2: Administrative minimum data set description for information exchange of coronavirus disease-19|
|Table 3: Clinical minimum data set description for information exchange of coronavirus disease-19|
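The mapping from preferred codes to SNOMED-CT reference codes can be pictured as a simple crosswalk table. The Python sketch below is only an illustration: the article does not name the source classification systems, so ICD-10 is assumed here, and the three code pairs are examples that should be checked against an authoritative terminology browser rather than taken from this sketch. The names `CROSSWALK` and `to_reference_code` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (source system, source code) -> (SNOMED CT concept ID, display term)
# ASSUMPTION: ICD-10 as the preferred coding system; pairs are illustrative.
CROSSWALK: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("ICD-10", "U07.1"): ("840539006", "COVID-19"),
    ("ICD-10", "R50.9"): ("386661006", "Fever"),
    ("ICD-10", "R05"): ("49727002", "Cough"),
}


def to_reference_code(system: str, code: str) -> Optional[Tuple[str, str]]:
    """Return the SNOMED CT reference code for a preferred code, or None.

    Unmapped codes return None so they can be routed to manual review
    instead of being guessed.
    """
    return CROSSWALK.get((system.strip().upper(), code.strip().upper()))


cough = to_reference_code("icd-10", "r05")
unmapped = to_reference_code("ICD-10", "Z99")
```

Returning `None` for an unmapped code, rather than a default, keeps gaps in the crosswalk visible to whoever maintains it.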
The XML schemas for COVID-19 provide a means of defining the structure, content, and semantics of exchanged reports. The report template is divided into administrative and clinical sections. [Figure 2] presents the XML-based CDA framework for COVID-19 reporting.
|Figure 2: Extensible Markup Language-based Clinical Document Architecture hierarchical framework related to COVID-19 disease reporting|
The HL7-CDA standard was used to standardize the message syntax. In the CDA structure, the data fields that identify entities were placed in the document header, while the CDA body contained detailed information about clinical findings [Figure 3].
|Figure 3: Free-text Health Level Seven-Clinical Document Architecture framework for information exchange of COVID-19 reporting|
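The split between an identifying header and a clinical body can be sketched as follows. This Python example is a deliberately reduced skeleton and not the template shown in the figures: the function `build_report`, the identifiers `R-001` and `P-123`, and the section title are hypothetical, and a schema-valid CDA document would need further header elements (for example typeId, effectiveTime, author, and custodian).

```python
import xml.etree.ElementTree as ET
from typing import List

NS = "urn:hl7-org:v3"
ET.register_namespace("", NS)


def q(tag: str) -> str:
    """Qualify a tag name with the HL7 v3 namespace."""
    return f"{{{NS}}}{tag}"


def build_report(report_id: str, patient_id: str, findings: List[str]) -> ET.Element:
    """Build a reduced CDA-like document: identifiers in the header, findings in the body."""
    doc = ET.Element(q("ClinicalDocument"))

    # Header: data that identify the report and the patient.
    ET.SubElement(doc, q("id"), {"extension": report_id})
    ET.SubElement(doc, q("title")).text = "COVID-19 case report"
    record_target = ET.SubElement(doc, q("recordTarget"))
    patient_role = ET.SubElement(record_target, q("patientRole"))
    ET.SubElement(patient_role, q("id"), {"extension": patient_id})

    # Body: clinical findings, here as a human-readable list in one section.
    component = ET.SubElement(doc, q("component"))
    body = ET.SubElement(component, q("structuredBody"))
    section = ET.SubElement(ET.SubElement(body, q("component")), q("section"))
    ET.SubElement(section, q("title")).text = "Clinical findings"
    item_list = ET.SubElement(ET.SubElement(section, q("text")), q("list"))
    for finding in findings:
        ET.SubElement(item_list, q("item")).text = finding
    return doc


# Hypothetical identifiers and findings, for illustration only.
report = build_report("R-001", "P-123", ["Fever", "Cough"])
report_xml = ET.tostring(report, encoding="unicode")
```

Because the header and body are separate subtrees, a receiving system can route or index a report from the header alone and parse the clinical section only when needed.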
| Discussion|| |
With the widespread outbreak of COVID-19, the Iranian Ministry of Health and Medical Education has focused on the coordination of care and has highlighted the need for standardized data collection to streamline and improve the surveillance capabilities of the Iranian health system in response to this pandemic. In this regard, developing a unified and interoperable reporting framework is the most effective way to promptly detect and track cases, investigate causes, and control a disease outbreak. The purpose of an MDS is to standardize the collection and reporting of a minimal amount of data as a basis for implementing any electronic system for clinical, research, surveillance, and management purposes. The MDS developed in this study primarily focused on PHS; however, it can be used for other applications. We first defined the MDS required for unified data reporting of COVID-19. Then, the structure and semantics of COVID-19 disease reporting were standardized according to HL7-CDA for the purpose of information exchange.
The quality of surveillance systems can be limited by poor uptake or an unreliable data entry process. Manual data entry is time-consuming and suffers from inconsistent, poorly structured forms that yield poor-quality data. Furthermore, reports are often inadequate, and data are entered into incorrect fields. Thus, a reliable and user-friendly data entry process is crucial for capturing high-quality data. Each data field should also be comprehensive so that it can be recorded in a few clicks. From a health-care provider's perspective, it is easier to analyze data fields with predefined options than free-text data. To comply with data quality criteria such as consistency and comparability in the COVSS, we defined not only a COVID-19 MDS but also more detailed categories (levels) and data formats for data capture.
New improvements in data collection instruments support the findability, accessibility, interoperability, and reusability (FAIR) of data, emphasizing the need for uniform data that can be integrated from distributed databases. In this regard, this study supports the exchange, aggregation, and proper management of data in order to reach FAIR data on COVID-19.
Given the prevalence of COVID-19 in Iran, the current study determined the national COVSS MDS for collecting, analyzing, and reporting COVID-19 indicators. Each data element was mapped to common coding standards and terminologies to facilitate interoperability between various health systems at local, national, and global levels.
The COVSS MDS can be used in other countries as a main prerequisite for implementing a COVID-19 surveillance system. This study also highlights the benefits of standardizing COVID-19 data exchange processes, which can be useful to other public health domains. Interoperable reporting for COVID-19 provides timely and reliable clinical data for measuring disease trends, efficiently applying control and prevention actions, detecting high-risk populations or geographic zones, and keeping the clinical community informed through warnings, recommendations, notifications, and guidelines.
Our study method had three major strengths. First, the proposed COVSS MDS was gathered through an extensive literature review combined with a two-round Delphi survey, so the data elements draw on both published evidence and expert judgment. Second, the adoption of a standard nomenclature such as SNOMED-CT is suggested for the Electronic Health Record (EHR), as it captures clinical information at the level of detail required by clinicians for care provision in most health-care disciplines and settings. Finally, we leveraged HL7-CDA, a standard for the exchange of clinical documents that are readable by both computers and humans. HL7-CDA is an XML-based standard with a simple and very flexible text format for structuring and exchanging information on the Web.
Given some of the unfamiliar aspects of this novel outbreak, we recommend developing conceptual models of surveillance systems and conducting a pilot study, including a further Delphi stage, to refine some data categories. In addition, this MDS may need to be appraised by a larger group of clinical and public health professionals before it can be applied nationwide. Further, this study provides a COVID-19 interoperable reporting framework from a data management perspective; its technological aspects still need to be resolved and are beyond the scope of this article.
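The difference between a coded field and a free-text field can be illustrated with a small validation sketch in Python. It is an assumption-laden example rather than part of the proposed MDS: the class `CodedField`, the field name `discharge_outcome`, and its three allowed values are hypothetical and are not taken from the study's tables.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CodedField:
    """A data field that accepts only values from a predefined list."""

    name: str
    allowed_values: FrozenSet[str]
    required: bool = True

    def validate(self, value: Optional[str]) -> Optional[str]:
        """Return the normalized value, or raise ValueError if it is not acceptable."""
        normalized = (value or "").strip().lower()
        if not normalized:
            if self.required:
                raise ValueError(f"{self.name}: a value is required")
            return None
        if normalized not in self.allowed_values:
            raise ValueError(f"{self.name}: '{value}' is not an allowed value")
        return normalized


# Hypothetical field and value list, for illustration only.
DISCHARGE_OUTCOME = CodedField(
    name="discharge_outcome",
    allowed_values=frozenset({"recovered", "deceased", "transferred"}),
)

accepted = DISCHARGE_OUTCOME.validate(" Recovered ")
```

Rejecting unlisted values at entry time, instead of cleaning them later, is one way to obtain the consistency and comparability that the paragraph above asks of the data capture process.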
| Conclusion|| |
An effective COVID-19 surveillance system requires complete and timely information to guide fully informed decisions and to reduce the further spread of the disease through early preventive measures. The template presented in this study can enable interoperability across the many clinical and public health information systems that populate the COVID-19 surveillance system. The main output of the proposed template supports collaboration among various health-care providers and public health agencies in patient care management as well as for research or public health purposes. Given some of the unfamiliar aspects of this novel outbreak, we recommend developing conceptual models of surveillance systems and conducting a pilot study, including a further Delphi stage, to refine some data categories.
This article is the result of a research project approved by the research committee at Abadan Faculty of Medical Sciences (Iran) (Ethics code number: IR.ABADANUMS.REC.1398.109). The authors thank all of the clinical and health information management experts who cooperated with them in completing the questionnaire.
Financial support and sponsorship
This research project was financially supported by Abadan Faculty of Medical Sciences (Iran) under contract number 98U749.
Conflicts of interest
There are no conflicts of interest.
| References|| |
Jung SM, Akhmetzhanov AR, Hayashi K, Linton NM, Yang Y, Yuan B, et al. Real-time estimation of the risk of death from novel coronavirus (COVID-19) infection: Inference using exported cases. J Clin Med 2020;9:523.
Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis 2020;20:533-4.
Wu C, Chen X, Cai Y, Zhou X, Xu S, Huang H, et al. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China. JAMA Intern Med 2020 Mar 13.
Lai CC, Shih TP, Ko WC, Tang HJ, Hsueh PR. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): The epidemic and the challenges. Int J Antimicrob Agents 2020; 3 (55).
Bai Y, Yao L, Wei T, Tian F, Jin DY, Chen L, et al. Presumed asymptomatic carrier transmission of COVID-19. JAMA 2020;323:1406-7.
Linton NM, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov AR, Jung SM, et al. Epidemiological characteristics of novel coronavirus infection: A statistical analysis of publicly available case data. medRxiv; 2020.
World Health Organization. Coronavirus Disease 2019 (COVID-19): Situation Report; 2020. p. 45.
Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020;382:1708-20.
Richards CL, Iademarco MF, Atkinson D, Pinner RW, Yoon P, Mac Kenzie WR, et al. Advances in public health surveillance and information dissemination at the centers for disease control and prevention. Public Health Rep 2017;132:403-10.
Dixon BE, Rahurkar S, Ho Y, Arno JN. Reliability of administrative data to identify sexually transmitted infections for population health: A systematic review. BMJ Health Care Inform 2019 Aug1; 26(1).
Streefkerk HR, Verkooijen RP, Bramer WM, Verbrugh HA. Electronically assisted surveillance systems of healthcare-associated infections: A systematic review. Euro Surveill 2020 Jan16; 25(2).
Allam Z, Jones DS. On the coronavirus (COVID-19) outbreak and the smart City Network: Universal Data Sharing Standards Coupled with Artificial Intelligence (AI) to Benefit Urban Health Monitoring and Management. Healthcare (Basel) 2020 Feb22; 8(1).
Safdari R, Ghazi Saeedi M, Masoumi-Asl H, Rezaei-Hachesu P, Mirnia K, Mohammadzadeh N, et al. National minimum data set for antimicrobial resistance management: Toward global surveillance system. Iran J Med Sci 2018;43:494-505.
Garcia MC, Garrett NY, Singletary V, Brown S, Hennessy-Burt T, Haney G, et al. An assessment of information exchange practices, challenges and opportunities to support US disease surveillance in three states. J Public Health Manag Practice 2018;24:546.
Gansel X, Mary M, van Belkum A. Semantic data interoperability, digital medicine, and e-health in infectious disease management: A review. Eur J Clin Microbiol Infect Dis 2019;38:1023-34.
Pilot E, Roa R, Jena B, Kauhl B, Krafft T, Murthy G. Towards sustainable public health surveillance in India: Using routinely collected electronic emergency medical service data for early warning of infectious diseases. Sustainability 2017;9:604.
Gazzarata R, Monteverde ME, Ruggiero C, Maggi N, Palmieri D, Parruti G, et al. Healthcare associated infections: An interoperable infrastructure for multidrug resistant organism surveillance. Int J Environ Res Public Health 2020 Jan; 17(2):465.
Sheikhali SA, Abdallat M, Mabdalla S, Al Qaseer B, Khorma R, Malik M, et al. Design and implementation of a national public health surveillance system in Jordan. Int J Med Inform 2016;88:58-61.
Cato KD, Cohen B, Larson E. Data elements and validation methods used for electronic surveillance of health care-associated infections: A systematic review. Am J Infect Control 2015;43:600-5.
Raeisi A, Tabrizi JS, Gouya MM. IR of Iran national mobilization against COVID-19 Epidemic. Arch Iran Med 2020;23:216-9.
Mounesan L, Eybpoosh S, Haghdoost A, Moradi G, Mostafavi E. Is reporting many cases of COVID-19 in Iran due to strength or weakness of Iran's health system? Iran J Microbiol 2020;12:73-6.
Moradzadeh R. The challenges and considerations of community-based preparedness at the onset of COVID-19 outbreak in Iran, 2020. Epidemiol Infect 2020;148:e82.
Shanbehzadeh M, Ahmadi M. Identification of the necessary data elements to report AIDS: A systematic review. Electron Physician 2017;9:5920-31.
Kazemi-Arpanahi H, Vasheghani-Farahani A, Baradaran A, Mohammadzadeh N, Ghazisaeedi M. Developing a minimum data set (MDS) for cardiac electronic implantable devices implantation. Acta Inform Med 2018;26:164-8.
Kazemi-Arpanahi H, Vasheghani-Farahani A, Baradaran A, Ghazisaeedi M, Mohammadzadeh N, Bostan H. Development of a minimum data set for cardiac electrophysiology study ablation. J Educ Health Promot 2019;8:101.
Baunsgaard CB, Chhabra H, Harvey L, Savic G, Sisto SA, Qureshi F, et al. Reliability of the international spinal cord injury musculoskeletal basic data set. Spinal Cord 2016;54:1105-13.
Davey CJ, Slade SV, Shickle D. A proposed minimum data set for international primary care optometry: A modified Delphi study. Ophthalmic Physiol Opt 2017;37:428-39.
Revere D, Hills RH, Dixon BE, Gibson PJ, Grannis SJ. Notifiable condition reporting practices: Implications for public health agency participation in a health information exchange. BMC Public Health 2017;17:247.
Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 2016; 3.
Haywood KL, Griffin XL, Achten J, Costa ML. Developing a core outcome set for hip fracture trials. Bone Joint J 2014;96-B:1016-23.
Lutomski JE, Baars MA, Schalk BW, Boter H, Buurman BM, den Elzen WP, et al. The development of the older persons and informal caregivers survey minimum dataset (TOPICS-MDS): A large-scale data sharing initiative. PloS One 2013;8:e81673.
Riley WT, Glasgow RE, Etheredge L, Abernethy AP. Rapid, responsive, relevant (R3) research: A call for a rapid learning health research enterprise. Clin Translat Med 2013;2:10.
Reza G, Fatemeh H. Covid-19 and Iran: Swimming with hands tied! Swiss Med Weekly 2020 Apr 7; 150(1516).
Zandifar A, Badrfam R. Fighting COVID-19 in Iran; Economic challenges ahead. Arch Iran Med 2020;23:284.
Raoofi A, Takian A, Akbari Sari A, Olyaeemanesh A, Haghighi H, Aarabi M. COVID-19 Pandemic and Comparative Health Policy Learning in Iran. Arch Iran Med 2020;23:220-34.
Gong M, Liu L, Sun X, Yang Y, Wang S, Zhu H. Cloud-based system for effective surveillance and control of COVID-19: Useful experiences from Hubei, China. J Med Internet Res 2020; 22:e18948.
Desjardins MR, Hohl A, Delmelle EM. Rapid surveillance of COVID-19 in the United States using a prospective space-time scan statistic: Detecting and evaluating emerging clusters. Appl Geography 2020;118.
Foddai A, Lindberg A, Lubroth J, Ellis-Iversen J. Surveillance to improve evidence for community control decisions during the COVID-19 pandemic– Opening the animal epidemic toolbox for Public Health. One Health 2020; 9.
Liu D, Wang X, Pan F, Xu Y, Yang P, Rao K. Web-based infectious disease reporting using XML forms. Int J Med Inform 2008;77:630-40.
Kokkinakis I, Selby K, Favrat B, Genton B, Cornuz J. Covid-19 diagnosis: Clinical recommendations and performance of nasopharyngeal swab-PCR. Rev Med Suisse 2020;16:699-701.