Abstract
Cancer is associated with many pre-existing health conditions (PHCs), but accurately quantifying these links remains challenging. Although some studies have examined these associations, large-scale analyses using diverse electronic health record (EHR) data remain limited, and they cannot evaluate cancer risk when patients are stratified by interpersonal differences. Using a real-world EHR dataset from a large Louisiana health system comprising 8,283,236 records from 1,460,738 patients (2013–2022), we evaluated associations between PHCs and subsequent cancer diagnoses within a fixed five-year risk window. We applied epidemiological, statistical, and artificial intelligence methods to the full dataset and to subgroups stratified by gender, race, and area deprivation index (ADI), for overall cancer and for 20 cancer types. We identified 221 PHCs across nine ICD-10 chapters, including Chapter 4 (metabolic) and Chapter 14 (genitourinary), that were linked to increased cancer risk (RR > 1, 95% CI excluding 1.0, BH-FDR-adjusted p < 0.05). Key PHCs include systemic sclerosis, blood type, benign mammary dysplasia, immune mechanism disorders, disturbances of smell, lipoprotein metabolism disorders, HIV, vitamin D deficiency, and diabetes. Chapter 12 (skin diseases) and Chapter 9 (circulatory diseases) showed strong associations with 10 and 13 cancer types, respectively. Age-, gender-, race-, and ADI-specific high-risk PHCs were also identified. However, these findings should be interpreted carefully: ADI may not fully capture individual-level socioeconomic or environmental exposures, and the lack of tobacco data may introduce residual confounding.
Alam et al. (Fri,) studied this question.
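The significance criteria quoted in the abstract (RR > 1, 95% CI excluding 1.0, BH-FDR-adjusted p < 0.05) can be illustrated with a minimal sketch. The abstract does not state which estimator the study used, so the log-scale Wald interval and the classic Benjamini-Hochberg step-up procedure below are assumptions, and the function names and example counts are hypothetical.

```python
import math

def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Relative risk with a Wald confidence interval on the log scale.

    Assumption: this is a textbook estimator, not necessarily the one
    used in the study. z=1.96 gives an approximate 95% interval.
    """
    rr = (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)
    # Standard error of log(RR) for a cohort-style 2x2 table.
    se = math.sqrt(
        1 / cases_exposed - 1 / n_exposed
        + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: 30 cancers among 1,000 patients with a PHC,
# 10 cancers among 1,000 patients without it.
rr, lower, upper = relative_risk(30, 1000, 10, 1000)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Under these assumptions, a PHC would be flagged only when `rr > 1`, the interval `(lower, upper)` excludes 1.0, and its adjusted p-value is below 0.05.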