
Evidence‑Based Data for Evidence‑Based AI

Wednesday May 6, 2026 08:30 - 08:45 YE - room not yet assigned

Lecturers: Dmitry Etin, Michael Bouzinier, Scott Yockel

Track: Trustworthy data for trustworthy AI

AI‑powered tools in hospitals are moving from pilots to operational use, rapidly changing the questions being asked by oversight bodies, regulators and the general public.

Like other components of a clinical process, these tools are expected to be evidence‑based, and that evidence should be clear and trusted. The hard part is that modern healthcare models learn from data that is rarely used “as is”. It is assembled through pipelines that normalise, harmonise and clean messy inputs, often across multiple systems and teams. Those upstream choices quietly shape what patterns an AI system can learn, and where it may fail. To maintain trust, healthcare organisations must be able to answer questions such as: where did the data come from, what was excluded, what changed since last month, how is drift monitored, and who is accountable when outputs influence care? Today, software‑intensive upstream data processes often obscure this evidence and erode trust.

Standardising data processing and making it more transparent is no longer optional in Europe and is increasingly expected in other advanced healthcare systems. Regulatory frameworks such as the EU AI Act and EHDS will require organisations to demonstrate data governance, quality management and fitness-for-use in ways that support oversight, accountability and reuse over time. Standards bodies are beginning to describe complex data pipelines, but current approaches still do not scale because they produce narrative artefacts that are difficult to inspect, compare or use as evidence in decision-making.

In this session we take that a step further and introduce a complementary approach to trust: instead of reconstructing data histories from documents, we can capture transformations automatically as workflows execute, creating structured, queryable records. Instead of manual compliance checks, we can express key requirements as formal, verifiable rules that run over those records. Drawing on real‑world implementations in large‑scale claims data pipelines (including Medicare) and trusted research environments, we will show how this approach helps clinical and programme leaders understand what was actually done to data, not just what should have been done, enabling them to approve deployments with genuine confidence and scale AI responsibly.
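As a miniature sketch of the two ideas above, structured provenance records captured as a pipeline runs, plus a formal rule evaluated over them, the shape might look like the following. This is illustrative only, not the presenters' actual tooling; all names (record fields, the exclusion-log rule) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    """One structured, queryable record per transformation step,
    emitted automatically as the workflow executes."""
    step: str          # name of the transformation
    inputs: list       # input dataset identifiers
    outputs: list      # output dataset identifiers
    rows_in: int       # record count before the step
    rows_out: int      # record count after the step
    owner: str         # accountable team or person
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def rule_exclusions_documented(records, exclusion_log):
    """Formal, verifiable rule: any step that drops rows must have a
    documented reason in the exclusion log. Returns violating steps."""
    return [
        r.step
        for r in records
        if r.rows_out < r.rows_in and r.step not in exclusion_log
    ]

# Two captured steps; the second drops rows without a documented reason.
records = [
    ProvenanceRecord("normalise_codes", ["raw_claims"], ["claims_v1"],
                     1000, 1000, "team-a"),
    ProvenanceRecord("drop_incomplete", ["claims_v1"], ["claims_v2"],
                     1000, 940, "team-a"),
]
print(rule_exclusions_documented(records, exclusion_log={}))
# → ['drop_incomplete']
```

The point of the sketch is that the rule runs over machine-captured records rather than a narrative document, so a reviewer can see what was actually done to the data, and re-run the check after every pipeline execution.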

A copy of the book that informed this perspective will be reserved for the audience question that most constructively advances the discussion.

Language

English

Topic

Data and Information

Seminar type

Live + On site

Lecture type

Presentation

Objective of lecture

Orientation

Level of knowledge

Intermediate

Target audience

Management/decision makers
Politicians
Technicians/IT/Developers
Researchers
Healthcare professionals

Keyword

Innovation/research
Informatics/Interoperability

Conference

Vitalis

Lecturers


Dmitry Etin Lecturer

Forome | Deggendorf Institute of Technology

Dmitry Etin is a digital health strategist working at the intersection of healthcare technology, policy and implementation, with a focus on interoperability, health data governance and trustworthy AI. His work centres on translating regulatory and policy expectations into operational data and AI systems that can be used in real clinical and research settings. Dmitry is actively involved in European Health Data Space initiatives and advises international organisations, including the European Medicines Agency, on large-scale interoperability and data-sharing programmes.
He leads a boutique health data consultancy, working with healthcare organisations, researchers and industry partners on health data strategy and implementation. In parallel, he co-develops Forome Association as an open science organisation contributing to research and open-source approaches for transparent and auditable health data pipelines, including work on data provenance and traceability. Dmitry brings extensive hands-on experience from multinational technology companies and national health IT programmes and also teaches digital health at Deggendorf Institute of Technology, focusing on practical interoperability, governance and data-driven healthcare transformation.

Michael Bouzinier Lecturer

Architect
Harvard University Research Computing

Michael (Misha) Bouzinier is a Senior Research Software Engineer within University Research Computing and an AI Data Architect at IDEXX Laboratories. He has over 30 years of diverse experience in software research and development and 10 years as a professional educator. Misha’s intellectual interests include semiotics, natural language processing and text analytics, data visualization, evolutionary and medical genetics, computer simulations and explainable AI. Misha is a co-founder of Forome, a collaborative initiative advancing open-source and research-driven tooling for transparent and auditable health data pipelines, including structured provenance. Throughout his career, he has built and led diverse international teams, collaborating successfully with developers and researchers in the US, UK, Sweden, Finland, Belgium, the Netherlands and Japan.

Scott Yockel Lecturer

Research Computing and Data Infrastructure Leader
Independent

Scott Yockel served as University Research Computing Officer at Harvard, where he worked with researchers across campus to develop and champion a university-wide research computing strategy in support of Harvard’s research mission. His work focused on identifying emerging needs, engaging faculty and university leadership, and building sustainable research computing infrastructure at institutional and national scale.

Scott has been deeply involved in national and regional initiatives such as CaRCC and MGHPCC and has served as PI or co-PI on multiple externally funded projects, including the New England Research Cloud, the Northeast Storage Exchange and the NSF Center of Excellence, RCD-Nexus.