Abstract
Research-ready data (data curated to a defined standard) increase scientific opportunity and rigour by integrating the data environment. The development of research platforms has highlighted the value of research-ready data, particularly for multi-cohort analyses. Following stakeholder consultation, a standard data model (C-Surv), optimised for data discovery, was developed using data from 5 population and clinical cohort studies. The model uses a four-tier nested structure based on 18 data themes selected according to user behaviour or technology. Standard variable naming conventions are applied to uniquely identify variables within the context of longitudinal studies. The data model was used to develop a harmonised dataset for 11 cohorts. This dataset populated the Cohort Explorer data discovery tool for assessing the feasibility of an analysis prior to making a data access request. Data preparation times were compared between cohort-specific data models and C-Surv. It was concluded that adopting a common data model as a data standard for the discovery and analysis of research cohort data offers multiple benefits.
Publisher
Springer Science and Business Media LLC
Cited by
5 articles.