What question did this study set out to answer?

To evaluate the effectiveness of anonymized versus synthetic data for medication safety assessments and their implications for privacy and data utility.

April 15, 2026 · Open Access

A case study comparing anonymized and synthetic health insurance claims data for medication safety assessments

Key Points

Compared anonymized and synthetic health claims data
Assessed their impact on fidelity, reproducibility, and privacy risks
Conducted analyses under different risk scenarios informed by threat modeling
Both data types yielded similar results to original data
Higher data utility correlated with increased privacy risks
Anonymized data showed higher fidelity compared to synthetic data, but with greater uncertainty in hazard ratios

Abstract

Synthetic data generation is increasingly proposed as an alternative to classical anonymization for sharing health data. We compared concrete applications of both approaches on a small, high-dimensional health claims dataset, assessing their impact on fidelity, reproducibility of study outcomes, and privacy risks. To reflect different sharing contexts, we considered a context-independent, higher-risk scenario with no assumptions about potential attacks, and a context-dependent, lower-risk scenario informed by threat modeling. Analyses on anonymized and synthetic data yielded results similar to those from the original study data, but came at the cost of higher uncertainty when estimating hazard ratios. As expected, higher data utility and fidelity were related to higher privacy risks. Our findings provide a reusable workflow and comparative insights into anonymization and synthetization. They show that both methods are valuable means of lowering privacy risks in data sharing scenarios, but results should be verified on the original data whenever possible.

Authors

Mehmed Halilovic

Thierry Meurers

Marco Alibone

Journals

npj Digital Medicine

Institutions

Berlin Institute of Health at Charité - Universitätsmedizin Berlin

Federal Institute for Drugs and Medical Devices

IGES Institut
