Local EHR-based ML model accurately predicts risk of severe hypoglycemia in older adults with diabetes

Severe hypoglycemia (SH), a dangerously low blood glucose level (<70 mg/dL) that requires assistance from a third party, is a major concern for older adults with diabetes.1-3 Beyond the immediate medical emergency, SH is associated with serious consequences, including prolonged hospitalization, higher healthcare expenditure, higher mortality and increased risks of cardiovascular disease, falls and cognitive decline.1 In response to this clinical challenge, researchers from The Chinese University of Hong Kong (CUHK)'s Faculty of Medicine (CU Medicine) developed a novel machine learning (ML) model that predicts the one-year risk of SH requiring hospitalization in older adults with diabetes.1 This predictive tool could enable targeted interventions and improve diabetes management outcomes in this high-risk population.


Diabetes is a leading global health issue that raises the risk of serious complications.1 Diabetes management centers on lowering blood glucose through lifestyle changes and medications, but these interventions can increase the risk of SH, especially in older adults (≥75 years), who experience the highest SH-related hospitalization rates.1,2 To enable personalized prevention and management strategies, researchers used local electronic health record (EHR) data to develop an ML model that predicts the one-year risk of SH requiring hospitalization in older adults with diabetes.1

In this territory-wide cohort and modeling study in Hong Kong, researchers used 1,456,618 EHRs from 364,863 older adults (≥65 years) with diabetes who had interacted with the public healthcare system between 2013 and 2018.1 The records were identified in the Hospital Authority Data Collaboration Laboratory (HADCL), which holds a broad range of patient information collected from all public hospitals and clinics in Hong Kong.1 Of these individuals, 9,616 had been hospitalized due to SH, accounting for 11,128 SH events.1 Compared to controls, SH-hospitalized patients were older (77.9 ± 7.6 vs. 74.4 ± 8.0 years), had more inpatient and outpatient encounters and more often had a history of SH (10% vs. 0.7%). They were also more likely to be taking sulfonylureas, insulin and DPP-4 inhibitors, but less likely to be on lipid-regulating drugs.1

A total of 258 predictors, including demographics, admissions, diagnoses, medications and routine laboratory tests recorded over a one-year period, were used to predict SH events requiring hospitalization in the following 12 months.1 The cohort was randomly split into training (70%), testing (20%) and internal validation (10%) sets, and six supervised ML algorithms were used to train the SH risk prediction models: generalized linear model, distributed random forest, gradient boosting machine, RuleFit, deep neural network and extreme gradient boosting (XGBoost).1 Predictive performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC) and positive predictive value (PPV).1
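The split and the three metrics described above can be illustrated with a small sketch. This is not the researchers' code: the function names, the random seed, the 0.5 threshold and the toy labels and scores are all assumptions made for illustration, and AUPRC is approximated here by average precision, which is one common estimator among several.

```python
import random


def split_cohort(ids, seed=0):
    """Randomly split IDs into 70% training, 20% testing, 10% validation."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 7 // 10  # integer arithmetic avoids float rounding
    n_test = n * 2 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of the precision at the rank of each true positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank
    return total / true_positives


def ppv(y_true, scores, threshold):
    """Share of patients flagged at or above the threshold who truly had the event."""
    flagged = [y for y, s in zip(y_true, scores) if s >= threshold]
    return sum(flagged) / len(flagged)


# Toy data (made up): 1 = SH hospitalization in the following 12 months.
y_true = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(round(auroc(y_true, scores), 3))              # 14/15, about 0.933
print(round(average_precision(y_true, scores), 3))  # (1 + 1 + 0.75) / 3, about 0.917
print(ppv(y_true, scores, threshold=0.5))           # 3 of 4 flagged, 0.75
```

AUROC looks at how well positives are ranked above negatives overall, whereas AUPRC and PPV focus on how many of the flagged patients actually have the event, which matters more when the outcome is rare.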

All six ML algorithms showed strong discrimination, with AUROC values exceeding 0.8 in the training, testing and validation datasets.1 The XGBoost model performed best, achieving an AUROC of 0.978 (95% CI: 0.972-0.984), an AUPRC of 0.670 (95% CI: 0.652-0.688) and a PPV of 0.721 (95% CI: 0.703-0.739).1 It outperformed a conventional logistic regression model based on 11 previously identified risk factors, which had an AUROC of 0.906, an AUPRC of 0.085 and a PPV of 0.468.1 Even the best ML model trained using only those 11 variables achieved an AUPRC of just 0.280, well below the 0.670 of the full XGBoost model.1
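To put the AUPRC figures in context, it helps to recall that a model with no predictive skill has an expected AUPRC roughly equal to the outcome prevalence. The short calculation below uses only counts and metrics reported in this article. Which denominator (patients or records) matches the study's unit of analysis is not stated here, so both are shown as assumptions.

```python
def prevalence(events, population):
    """Fraction of the population (or of records) with the outcome."""
    return events / population


# Counts reported in the article; the unit of analysis is an assumption.
patient_level = prevalence(9_616, 364_863)     # patients ever hospitalized for SH
record_level = prevalence(11_128, 1_456_618)   # SH events per EHR record

# AUPRC values reported in the article.
auprc_xgboost = 0.670
auprc_logistic = 0.085

print(f"patient-level prevalence: {patient_level:.3%}")
print(f"record-level prevalence:  {record_level:.3%}")
print(f"XGBoost vs. logistic AUPRC ratio: {auprc_xgboost / auprc_logistic:.1f}x")
```

Because the outcome is rare under either denominator, two models can have similar AUROC values yet very different AUPRC values, which is why the gap between 0.670 and 0.085 is informative.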

The key predictors in the XGBoost model included non-use of lipid-regulating medications, recent inpatient admission, urgent emergency department triage, insulin use and a history of SH.1 A sensitivity analysis showed that the model's predictive power was largely retained using only the 30 most important variables, with a validation AUPRC of 0.632 compared to 0.670 for the full 258-variable model.1
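The idea behind this sensitivity analysis, keeping only the highest-ranked variables and checking how much performance survives, can be sketched as follows. The feature names and importance scores are placeholders invented for illustration, and the helper `top_k_features` is hypothetical; only the two AUPRC values come from the article.

```python
def top_k_features(importances, k):
    """Return the k feature names with the largest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Placeholder importances (made up); a real model would supply 258 of these.
importances = {
    "feature_a": 0.31,
    "feature_b": 0.05,
    "feature_c": 0.22,
    "feature_d": 0.12,
}

selected = top_k_features(importances, k=2)
print(selected)  # ['feature_a', 'feature_c']

# Share of AUPRC retained by the reduced model, from the reported values.
retained = 0.632 / 0.670
print(f"{retained:.1%}")  # about 94.3%
```

In practice the reduced model would be retrained on the selected variables and re-evaluated on the validation set, rather than simply masking the dropped inputs.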

To further validate the model's performance, the researchers applied it to an independent temporal validation cohort from the Hong Kong Diabetes Register, using predictors defined in 2018 and outcome events in 2019.1 Among 13,917 older adults with diabetes in this validation set, the model maintained strong discrimination, with an AUROC of 0.856 (95% CI: 0.838-0.873) and an AUPRC of 0.286.1 These results support the robustness and generalizability of the XGBoost-based SH risk prediction model.

In summary, the researchers developed a highly accurate ML model to forecast the 12-month risk of severe hypoglycemic events requiring hospitalization in older adults with diabetes.1 The model outperformed conventional approaches, with better discrimination and fewer false positives, which could otherwise lead to unnecessary interventions.1 The researchers believe this predictive tool has strong potential to be integrated into electronic health record decision support systems, enabling targeted interventions for the highest-risk patients.1
