NEWS & PERSPECTIVE
Local EHR-based ML model accurately predicts risk of severe hypoglycemia in older adults with diabetes
Severe hypoglycemia (SH), a dangerously low blood glucose level (<70 mg/dL) that requires assistance from a third party, is a major concern for older adults with diabetes.1-3 Beyond the immediate medical emergency, SH is associated with a range of serious consequences, including prolonged hospitalizations, higher healthcare expenditures, increased risks of cardiovascular disease, falls and cognitive decline, and elevated mortality.1 In response to this clinical challenge, researchers from The Chinese University of Hong Kong (CUHK)'s Faculty of Medicine (CU Medicine) developed a novel machine learning (ML) model capable of predicting the one-year risk of SH requiring hospitalization in older adults with diabetes.1 This predictive tool has the potential to enable targeted interventions and improve diabetes management outcomes in this high-risk population.
Diabetes is a leading global health issue, elevating the risks of serious complications.1 While diabetes management centers on lowering blood glucose through lifestyle changes and medications, these interventions can increase the risk of SH, especially in older adults (≥75 years), who experience the highest SH-related hospitalization rates.4 To enable personalized prevention and management strategies, researchers used local electronic health record (EHR) data to develop an ML model that accurately predicts the one-year risk of SH requiring hospitalization in older adults with diabetes.1
In this Hong Kong-based territory-wide cohort and modeling study, researchers used 1,456,618 EHRs of 364,863 older adults (≥65 years) with diabetes who had interacted with the public healthcare system between 2013 and 2018. The records were identified in the Hospital Authority Data Collaboration Laboratory (HADCL), which holds a broad range of patient information collected from all public hospitals and clinics in Hong Kong.1 Of these individuals, 9,616 had been hospitalized due to SH, totaling 11,128 SH events.1 Compared to controls, SH-hospitalized patients were older (77.9 ± 7.6 vs. 74.4 ± 8.0 years), had more inpatient and outpatient encounters, more often had a prior history of SH (10% vs. 0.7%) and were more likely to be taking sulfonylureas, insulin and DPP-4 inhibitors, but less likely to be on lipid-regulating drugs.1
A total of 258 predictors, including demographics, admissions, diagnoses, medications and routine laboratory tests recorded over a one-year period, were used to predict SH events requiring hospitalization in the following 12 months.1 The cohort was randomly split into training (70%), testing (20%) and internal validation (10%) sets, and 6 supervised ML algorithms were used to train the SH risk prediction models.1 The 6 algorithms were generalized linear model, distributed random forest, gradient boosting machine, RuleFit, deep neural network and extreme gradient boosting (XGBoost).1 Predictive performance was assessed using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC) and positive predictive value (PPV).1
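For readers less familiar with these steps, the following minimal Python sketch illustrates a 70/20/10 random split and the three metrics named above. It is not the researchers' code: the function names, the random seed and the toy inputs are assumptions made for illustration, and the AUPRC is approximated by average precision without special handling of tied scores.

```python
import random

def split_cohort(ids, seed=0, fractions=(0.7, 0.2, 0.1)):
    """Randomly partition ids into training, testing and validation sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_train = round(fractions[0] * len(shuffled))
    n_test = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

def auroc(labels, scores):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Step-wise approximation of the area under the precision-recall curve."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each true positive
    return total / hits

def ppv(labels, scores, threshold):
    """Share of patients flagged at or above the threshold who had the event."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    return sum(flagged) / len(flagged)
```

The AUROC here is computed pairwise, which is simple to read but slow on large cohorts; a rank-based implementation would be used in practice.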
The ML models demonstrated strong predictive performance, with all algorithms yielding AUROC values exceeding 0.8 in the training, testing and validation datasets.1 Among them, the XGBoost model emerged as the top-performing approach, achieving an AUROC of 0.978 (95% CI: 0.972-0.984), an AUPRC of 0.670 (95% CI: 0.652-0.688) and a PPV of 0.721 (95% CI: 0.703-0.739).1 This significantly outperformed a conventional logistic regression model based on 11 previously identified risk factors, which had an AUROC of 0.906, an AUPRC of 0.085 and a PPV of 0.468.1 Even the best ML model trained using only those 11 variables achieved an AUPRC of just 0.280, well below the 0.670 of the full XGBoost model.1
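A model can pair a high AUROC with a low AUPRC, as the logistic regression model did, because SH is a rare outcome: even a small false-positive rate, applied to a very large group of patients without SH, produces many false alarms. The sketch below works through that arithmetic. The sensitivity and specificity values are hypothetical, and treating each of the 1,456,618 records as one observation to estimate the event rate is an assumption made purely for illustration.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Rough event rate: SH events per record, using the counts reported above.
# This denominator is an assumption, not a figure from the study.
prevalence = 11_128 / 1_456_618

# Hypothetical operating points, chosen only to show the effect of specificity.
for specificity in (0.90, 0.99):
    value = ppv_from_rates(0.80, specificity, prevalence)
    print(f"specificity={specificity:.2f} -> PPV={value:.3f}")
```

Under these assumptions, most flagged patients are false positives unless specificity is very high, which is why AUPRC and PPV are more informative than AUROC for a rare event.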
The key predictors in the XGBoost model included non-use of lipid-regulating medications, recent inpatient admission, urgent emergency department triage, insulin use and a history of prior SH.1 Interestingly, a sensitivity analysis revealed that the model's predictive power could be largely retained using just the top 30 most important variables, with a validation AUPRC of 0.632 compared to 0.670 for the full 258-variable model.1
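The idea behind this sensitivity analysis can be sketched in a few lines of Python: rank the predictors by importance, keep the top k and retrain on the reduced set. The feature names and importance scores below are hypothetical placeholders, not values from the study, and the retraining step is omitted.

```python
def top_k_features(importance, k):
    """Return the k feature names with the largest importance scores."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def project(record, keep):
    """Restrict one patient record (a dict) to the retained features."""
    return {name: record[name] for name in keep}

# Hypothetical importance scores, for illustration only.
importance = {
    "prior_sh": 0.30,
    "insulin_use": 0.25,
    "recent_admission": 0.20,
    "age": 0.15,
    "hba1c": 0.10,
}
keep = top_k_features(importance, 3)

# Share of the full model's validation AUPRC retained by the 30-variable model,
# using the two figures reported above.
retained = 0.632 / 0.670
```

With the reported figures, the reduced model keeps roughly 94% of the full model's validation AUPRC while using 30 of the 258 predictors.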
To further validate the model's performance, the researchers applied it to an independent temporal validation cohort from the Hong Kong Diabetes Register, using predictors defined in 2018 and outcome events in 2019.1 Among 13,917 older adults with diabetes in this validation set, the model maintained strong discrimination, with an AUROC of 0.856 (95% CI: 0.838-0.873) and an AUPRC of 0.286.1 These results demonstrate the robustness and generalizability of the XGBoost-based SH risk prediction model.
In summary, the researchers developed a highly accurate ML model to forecast the 12-month risk of SH events requiring hospitalization in older adults with diabetes.1 This model demonstrated superior performance compared to conventional approaches, with improved discrimination and reduced false positives that could lead to unnecessary interventions.1 The researchers believe this predictive tool has strong potential to be integrated into EHR decision support systems, enabling targeted interventions for the highest-risk patients.1
In a subsequent interview with Omnihealth Practice, Professor Aimin Yang and Dr. Elaine Yee-Kwan Chow, two of the researchers behind this prediction model, shared their thoughts on adopting the model in Hong Kong’s setting.
Question 1: Could you elaborate on the limitations of existing SH risk prediction models? How might the ML model you developed help address these challenges?
Prof. Yang: Existing risk prediction models for SH rely on a limited set of in-patient factors, such as age, previous SH history, and use of insulin, along with conventional statistical methods. As a result, these models can only predict relatively short-term SH risk. Another concern with these models is the potential for false alarms, which could lead to inappropriate treatment.
Given that SH is a rare but consequential condition, especially for older adults with diabetes, improving the accuracy of SH event prediction is of crucial importance. Our prediction model was developed using a more comprehensive dataset – the HA EHR. By incorporating a broader range of outpatient predictors, this model can enhance the stability and reliability of SH risk predictions, extending the prediction window to 1 year and enabling earlier interventions.
Question 2: How could this prediction model best be integrated to optimize SH prevention in routine clinical care? What potential barriers may hinder its effectiveness?
Dr. Chow: Ideally, the prediction model would be incorporated as a component of the routine annual diabetes complication screening for older adults with diabetes, alongside assessments for diabetic retinopathy and neuropathy. This would allow for pre-emptive interventions to be initiated for high-risk patients, such as adjusting their insulin regimen and providing education to both the patient and their caregivers on appropriate SH management strategies.
A potential limitation is that the model was developed using data from the public HA EHR system. This may limit its applicability to diabetes patients who predominantly use private healthcare services. This data availability issue may be addressed with the EHR Sharing System, eHealth, which promotes the two-way exchange of EHR data between the public and private sectors.
Question 3: Are there plans to further enhance the model by incorporating patient-reported outcomes or to validate its performance in settings beyond the Hong Kong healthcare system?
Dr. Chow: Patient-reported outcomes such as diet and awareness of SH, as well as monitoring data from continuous glucose monitoring (CGM), are not integrated within the EHR of local patients. However, other assessment methods such as questionnaires could potentially be used to include them in the prediction model. As for the model’s applicability in other healthcare systems, we have developed another prediction model with 30 predictors that are widely available in EHR. If the model were to be implemented in other healthcare systems, recalibration and validation would be necessary.
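One common way to recalibrate a risk model for a new population is logistic recalibration, which refits only an intercept and a slope on the logit of the original predictions. The Python sketch below illustrates that generic technique under stated assumptions; it is not the researchers' procedure, and the function names, learning rate and number of steps are illustrative choices.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_recalibration(probs, labels, steps=2000, lr=0.5):
    """Fit p_new = sigmoid(a + b * logit(p_old)) by gradient descent on log loss."""
    xs = [logit(p) for p in probs]
    a, b = 0.0, 1.0  # start from the original, unadjusted predictions
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(a + b * x) - y
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def recalibrate(p, a, b):
    """Map an original predicted risk to its recalibrated value."""
    return sigmoid(a + b * logit(p))
```

Once fitted on local data, the average recalibrated risk matches the observed event rate in that data, while the ranking of patients stays the same as long as the slope is positive. Discrimination would still need to be validated separately in the new setting.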