Vitalis 2025

Ensuring Ethical and Trustworthy Secondary Use of Health Data: Insights from the Dorieh Platform

Wednesday, 21 May 2025, 13:00 - 13:15, Vitalis Plaza

Speakers: Michael Bouzinier, Francesco Pontiggia

Track: Health data

The vast proliferation of health data presents an enormous opportunity for research and policy-making but also poses significant challenges in trust, efficiency, and regulatory compliance. The secondary use of health data requires robust infrastructure to ensure that data is accurately harnessed for insights while meeting ethical and legal standards. This paper explores how advanced data management tools, such as descriptive workflow languages and domain-specific languages (DSLs), can make health data infrastructures more trustworthy and efficient.


The primary focus of the research was the Dorieh Data Platform, developed by Harvard University Research Computing in partnership with the Harvard T.H. Chan School of Public Health. Dorieh embodies a data management approach built on descriptive dataflow operators, which enable granular tracing of data transformations. In doing so, it addresses critical aspects of data provenance, that is, the ability to trace and validate the lineage of every data element. Dorieh is deployed in the Harvard University FISMA-compliant Trusted Research Environment (TRE) leveraging Open OnDemand infrastructure. Dorieh is being used to prepare and document research datasets for National Studies of Air Pollution and Health.
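To make the idea of tracing data transformations concrete, the following is a minimal Python sketch of operators that record lineage as they run. It is not Dorieh's implementation: the class names `TracedTable` and `LineageStep`, the operator labels, and the sample record are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LineageStep:
    """One recorded transformation: which operator produced which column."""
    operator: str
    inputs: List[str]
    output: str


@dataclass
class TracedTable:
    """Rows plus the ordered list of transformations applied to them."""
    rows: List[Dict[str, object]]
    lineage: List[LineageStep] = field(default_factory=list)

    def derive(self, output: str, inputs: List[str],
               fn: Callable[..., object], operator: str) -> "TracedTable":
        """Add a derived column and record how it was computed."""
        new_rows = [{**row, output: fn(*(row[c] for c in inputs))}
                    for row in self.rows]
        step = LineageStep(operator=operator, inputs=inputs, output=output)
        return TracedTable(new_rows, self.lineage + [step])

    def trace(self, column: str) -> List[LineageStep]:
        """Return every recorded step that contributes to `column`."""
        needed, steps = {column}, []
        for step in reversed(self.lineage):
            if step.output in needed:
                steps.append(step)
                needed.update(step.inputs)
        return list(reversed(steps))


# Hypothetical record; the column names are illustrative only.
table = TracedTable([{"birth_year": 1950, "visit_year": 2020, "state": "ma"}])
table = table.derive("age", ["birth_year", "visit_year"],
                     lambda b, v: v - b, operator="subtract")
table = table.derive("state_code", ["state"], str.upper, operator="uppercase")
table = table.derive("is_senior", ["age"], lambda a: a >= 65,
                     operator="threshold_65")

for step in table.trace("is_senior"):
    print(f"{step.output} <- {step.operator}({', '.join(step.inputs)})")
```

Tracing `is_senior` walks the recorded steps backwards and keeps only those that feed the requested column, so the unrelated `state_code` step is left out of the result.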


Central to this work is a domain-specific language for data modeling, which allows transformations to be defined explicitly and enhances reproducibility and accountability in the secondary use of health data. By integrating it with descriptive workflow languages, we create comprehensive frameworks that better adapt to the demands of modern data science, particularly in healthcare, where regulatory compliance requirements are rigorous.
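As a rough illustration of what an explicit, declarative definition of transformations can look like, the sketch below describes a table as plain data and renders it to SQL. The model layout, the table and column names, and the `render_view` function are assumptions made for this example. They do not reproduce the syntax of Dorieh's DSL.

```python
from typing import Dict

# Hypothetical declarative model: every column states its type and the
# source expression that defines it, so each transformation is explicit.
MODEL: Dict[str, Dict[str, Dict[str, str]]] = {
    "enrollment": {
        "beneficiary_id": {"type": "VARCHAR", "source": "bene_id"},
        "year": {"type": "INT", "source": "year"},
        "age": {"type": "INT", "source": "year - birth_year"},
    }
}


def render_view(model: Dict[str, Dict[str, Dict[str, str]]],
                table: str, source_table: str) -> str:
    """Render one modeled table as a SQL view over a raw source table."""
    columns = model[table]
    select_list = ",\n".join(
        f"    CAST({spec['source']} AS {spec['type']}) AS {name}"
        for name, spec in columns.items()
    )
    return (f"CREATE VIEW {table} AS\n"
            f"SELECT\n{select_list}\n"
            f"FROM {source_table};")


sql = render_view(MODEL, "enrollment", "raw_enrollment")
print(sql)
```

Because the model is data rather than hand-written code, the same definition can serve both as documentation of each column and as the source of the executable query.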


Applying these methodologies to Medicare data highlighted data inconsistencies and underscored the effectiveness of Dorieh's approach in maintaining data quality. By providing detailed data lineage and error logging, Dorieh bolsters the trustworthiness and regulatory adherence of data-driven projects. We advocate adopting similar DSL tools across diverse health-related domains so that data lineage is meticulously documented, thereby reinforcing the reliability and validity of research outcomes.
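The kind of inconsistency check and error logging described above might be sketched as follows. The abstract does not say which inconsistencies were found, so the rule used here (one identifier carrying conflicting values of an attribute across records) is only an assumed example, and the function name, field names, and records are hypothetical.

```python
import logging
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Mapping, Set

logger = logging.getLogger("lineage_sketch")


def find_conflicts(records: Iterable[Mapping[str, Hashable]],
                   key: str, attribute: str) -> Dict[Hashable, Set[Hashable]]:
    """Return keys whose records disagree on `attribute`, logging each one."""
    seen: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for record in records:
        seen[record[key]].add(record[attribute])
    conflicts = {k: v for k, v in seen.items() if len(v) > 1}
    for k, values in conflicts.items():
        # Log the conflict instead of silently choosing one value.
        logger.warning("inconsistent %s for %s=%s: %s",
                       attribute, key, k, sorted(values))
    return conflicts


# Hypothetical yearly records for two individuals.
records = [
    {"id": "A", "year": 2019, "birth_year": 1950},
    {"id": "A", "year": 2020, "birth_year": 1951},
    {"id": "B", "year": 2019, "birth_year": 1940},
    {"id": "B", "year": 2020, "birth_year": 1940},
]
conflicts = find_conflicts(records, key="id", attribute="birth_year")
```

Keeping the conflicting values in the log, rather than resolving them quietly, lets a later reviewer see which records were affected and why.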


While the methodologies discussed were developed within a tightly controlled environment, they are positioned for scalability to more complex ecosystems like the European Health Data Space (EHDS). By addressing multimodal regulatory requirements, including FISMA and EMA-HMA Data Quality stipulations, this approach stands to become a pivotal element in modern data governance, ensuring that AI models and health policy decisions are based on transparent and scientifically sound data processing methods.

Language

English

Topic

Data and information

Seminar type

Live + on site

Lecture format

Presentation

Lecture purpose

Tools for implementation

Knowledge level

In-depth

Target audience

Managers/Decision-makers
Technicians/IT/Developers
Researchers (including students)

Keywords

Real-world examples (good/bad)
Innovation/research
Apps
Information/government agency
Informatics/Interoperability

Speakers

Michael Bouzinier, Speaker

AI Data Architect
IDEXX Laboratories / Harvard University

Michael (Misha) Bouzinier is an AI Data Architect with IDEXX Laboratories. He has over 30 years of diverse experience in software research and development and 10 years as a professional educator. Misha’s intellectual interests include semiotics, natural language processing and text analytics, data visualization, evolutionary and medical genetics, computer simulations, and explainable AI. Throughout his career, he has worked with and led diverse international teams, successfully collaborating with developers and researchers from the US, UK, Sweden, Finland, Belgium, the Netherlands, and Japan. In his free time, Misha loves to enjoy the outdoors, travel, and interact with people from diverse backgrounds.

Francesco Pontiggia, Speaker

Francesco Pontiggia is a Senior Director of Harvard University Research Computing.