CONFERENCE UPDATE: AASLD 2025

AI-driven multiple instance learning advances precision in HCC histological grading

25 Jan 2026

Accurate histological differentiation of hepatocellular carcinoma (HCC) is essential for informing clinical decision-making, yet conventional grading remains labor-intensive and susceptible to inter-observer variability among pathologists.1 While artificial intelligence (AI) has demonstrated potential in computational pathology, the lack of systematic comparisons of AI frameworks for HCC grading limits their adoption in clinical practice.1 At the AASLD Annual Meeting 2025, Professor Wai-Kay Seto from the University of Hong Kong presented a novel AI-based framework utilizing multiple instance learning (MIL) on digitized histopathology slides, offering a potential path toward standardized and efficient HCC grading.1

Digital histology, which converts glass or paraffin slides into high-resolution whole slide images (WSIs) for virtual viewing, enables remote collaboration, objective analysis, and efficient storage and retrieval.1 Using these WSIs, the study evaluated and compared AI frameworks built on MIL strategies for aggregating patch-level features, with the aim of optimizing automated, slide-level histological grading of HCC.1

This study analyzed WSIs derived from 54 patients with surgically resected HCC, comprising a total of 392 slides containing tumor and non-tumor tissue, graded according to the American Joint Committee on Cancer (AJCC) 8th edition criteria.1 Tumor differentiation was determined by an experienced liver histopathologist and categorized as well- or moderately differentiated (G1/G2) or poorly differentiated (G3).1 WSIs were segmented into 1024×1024-pixel patches, with patches containing tumor tissue labeled according to histological differentiation.1
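
For readers interested in the mechanics of this step, the short Python sketch below illustrates one common way to tile a WSI into non-overlapping 1024×1024-pixel patches using the open-source openslide-python library; the file name and the background-filtering threshold are illustrative assumptions, not parameters reported in the study.

# Minimal sketch of WSI tiling into 1024x1024 patches (illustrative only;
# not the presenters' actual pipeline). Assumes openslide-python and numpy
# are installed and that "slide.svs" is a local whole slide image.
import numpy as np
import openslide

PATCH_SIZE = 1024

slide = openslide.OpenSlide("slide.svs")   # hypothetical file name
width, height = slide.dimensions           # level-0 (full resolution) size

patches = []
for y in range(0, height - PATCH_SIZE + 1, PATCH_SIZE):
    for x in range(0, width - PATCH_SIZE + 1, PATCH_SIZE):
        region = slide.read_region((x, y), 0, (PATCH_SIZE, PATCH_SIZE)).convert("RGB")
        rgb = np.asarray(region)
        # Crude background filter: keep patches with enough non-white tissue.
        # The 0.5 fraction and 220 intensity cutoff are assumptions.
        if (rgb.mean(axis=2) < 220).mean() > 0.5:
            patches.append(((x, y), rgb))

print(f"kept {len(patches)} tissue patches")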

Following patch extraction, each patch was processed using the pretrained foundation model UNI, a large vision transformer trained on over 100,000 hematoxylin-and-eosin (H&E)-stained WSIs spanning multiple organ systems, to generate patch-level feature embeddings.1 Uniform manifold approximation and projection (UMAP) was then applied to the patch-level embeddings to visualize feature distributions and qualitatively assess histological patterns.1 These embeddings were subsequently aggregated at the slide level using MIL to enable slide-level grading.1
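
As an illustration of this step, the sketch below loads UNI following its published timm/Hugging Face usage, embeds individual patches, and projects the embeddings with UMAP. Gated-model access, the resizing of 1024×1024 patches to the model's input size, and all hyperparameters shown are assumptions rather than details reported in the abstract.

# Sketch: patch-level feature embeddings with the pretrained UNI vision
# transformer and a 2-D UMAP projection. Model loading mirrors UNI's
# published timm/Hugging Face usage; gated access setup, preprocessing,
# and hyperparameters are assumptions, not details from the abstract.
import numpy as np
import timm
import torch
import umap
from PIL import Image
from timm.data import resolve_data_config, create_transform

model = timm.create_model(
    "hf-hub:MahmoodLab/UNI", pretrained=True, init_values=1e-5, dynamic_img_size=True
)
model.eval()
transform = create_transform(**resolve_data_config(model.pretrained_cfg, model=model))

@torch.no_grad()
def embed(patch_rgb: np.ndarray) -> np.ndarray:
    """Return a feature embedding for one H&E patch (numpy RGB array)."""
    x = transform(Image.fromarray(patch_rgb)).unsqueeze(0)  # resized to model input
    return model(x).squeeze(0).numpy()                      # ~1024-dim feature vector

# One embedding row per patch ('patches' from the tiling sketch above)
embeddings = np.stack([embed(rgb) for (_, rgb) in patches])

# 2-D UMAP projection for qualitative inspection of histological clusters
coords = umap.UMAP(n_components=2, random_state=0).fit_transform(embeddings)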

MIL requires only a slide-level label (positive/negative or G1-G3) and automatically identifies the most discriminative regions contributing to the final slide-level prediction.1 Three MIL-based aggregation frameworks were evaluated for slide-level grading in the study: logistic regression as a baseline method, attention-based MIL, and transformer-based MIL.1 Model performance was assessed using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).1
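
To make the aggregation idea concrete, the PyTorch sketch below implements a generic attention-based MIL pooling layer in the spirit of Ilse et al.: each patch embedding receives a learned attention weight, the weighted sum forms a slide-level representation, and the attention weights indicate which regions drove the prediction. Layer sizes and the classification head are illustrative assumptions, not the presenters' architecture.

# Sketch of attention-based MIL pooling over patch embeddings (generic,
# Ilse-et-al.-style). Dimensions and the two-class head are assumptions.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, in_dim: int = 1024, hidden_dim: int = 256, n_classes: int = 2):
        super().__init__()
        # Attention network: one unnormalized score per patch embedding
        self.attention = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, bag: torch.Tensor):
        # bag: (n_patches, in_dim) embeddings for one slide
        a = torch.softmax(self.attention(bag), dim=0)  # (n_patches, 1) attention weights
        slide_repr = (a * bag).sum(dim=0)               # weighted sum -> slide-level feature
        logits = self.classifier(slide_repr)            # slide-level grade logits
        return logits, a.squeeze(-1)                    # weights flag discriminative regions

# Example: one slide represented by 500 patch embeddings of dimension 1024
logits, weights = AttentionMIL()(torch.randn(500, 1024))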

Visualization using UMAP demonstrated that the UNI feature extractor effectively identified meaningful histological patterns.1 Morphologically similar tissue patches clustered closely, with clear separation between non-tumor tissue and HCC tumor regions.1 This finding supports the suitability of the pretrained foundation model for downstream histopathological analysis.1

Across all evaluated metrics, both advanced MIL approaches outperformed the baseline logistic regression model.1 The transformer-based MIL model achieved the highest overall performance, with an AUC of 0.888 (95% CI: 0.706-0.998), sensitivity of 0.892 (95% CI: 0.784-0.976), and specificity of 0.957 (95% CI: 0.913-0.988) for slide-level differentiation grading.1 The model achieved a PPV of 0.891 (95% CI: 0.780-0.974) and a high NPV of 0.957 (95% CI: 0.915-0.987), underscoring its robustness in accurately excluding negative cases.1 The superior performance of transformer-based MIL was attributed to its ability to model contextual relationships among patches, allowing the algorithm to integrate spatial and co-occurring patterns across the entire slide rather than relying on isolated features.1
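
The sketch below illustrates, in generic form, how a transformer-based aggregator captures such inter-patch context: patch embeddings are treated as a token sequence, a learnable class token attends to all patches through self-attention, and its output yields the slide-level prediction. Dimensions and layer counts are assumptions; this is not the model presented at the meeting.

# Sketch of a transformer-based MIL aggregator: self-attention lets every
# patch token interact with all others, so the slide-level prediction can
# use co-occurring patterns rather than isolated patches. Generic
# illustration with assumed dimensions, not the presented model.
import torch
import torch.nn as nn

class TransformerMIL(nn.Module):
    def __init__(self, in_dim: int = 1024, d_model: int = 256,
                 n_heads: int = 8, n_layers: int = 2, n_classes: int = 2):
        super().__init__()
        self.project = nn.Linear(in_dim, d_model)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, bag: torch.Tensor):
        # bag: (n_patches, in_dim) embeddings for one slide
        tokens = self.project(bag).unsqueeze(0)              # (1, n_patches, d_model)
        tokens = torch.cat([self.cls_token, tokens], dim=1)  # prepend class token
        encoded = self.encoder(tokens)                        # self-attention over patches
        return self.head(encoded[:, 0])                       # class-token readout -> logits

logits = TransformerMIL()(torch.randn(500, 1024))  # one slide, 500 patch embeddings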

Differentiation-specific analyses further highlighted the robustness of the approach.1 The transformer-based MIL model achieved an AUC of 0.926 (95% CI: 0.831-0.985) for non-tumor tissue, 0.881 (95% CI: 0.784-0.949) for well- and moderately differentiated tumors (G1/G2), and 0.924 (95% CI: 0.781-0.995) for poorly differentiated tumors (G3).1 Notably, performance remained strong even for G3 tumors, which are often more heterogeneous and challenging to classify.1 High negative predictive values across differentiation categories suggest reliable exclusion of incorrect grades, supporting potential clinical utility in assisting diagnostic workflows.1
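
For readers reproducing this style of evaluation, the sketch below shows one standard way to compute per-class (one-vs-rest) AUC, sensitivity, specificity, PPV, and NPV with scikit-learn; the labels and predicted probabilities are synthetic placeholders, as the abstract does not describe the evaluation code.

# Sketch: per-class (one-vs-rest) slide-level metrics with scikit-learn.
# Synthetic labels/scores for illustration; not the study's evaluation code.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

classes = ["non-tumor", "G1/G2", "G3"]
y_true = np.random.randint(0, 3, size=200)           # placeholder slide labels
y_score = np.random.dirichlet(np.ones(3), size=200)  # placeholder predicted probabilities
y_pred = y_score.argmax(axis=1)

for k, name in enumerate(classes):
    auc = roc_auc_score((y_true == k).astype(int), y_score[:, k])
    tn, fp, fn, tp = confusion_matrix((y_true == k).astype(int),
                                      (y_pred == k).astype(int)).ravel()
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    ppv, npv = tp / (tp + fp), tn / (tn + fn)
    print(f"{name}: AUC={auc:.3f} sens={sens:.3f} spec={spec:.3f} "
          f"PPV={ppv:.3f} NPV={npv:.3f}")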

In conclusion, MIL frameworks surpassed traditional probability aggregation, and the transformer model’s ability to process long patch sequences further enhanced diagnostic accuracy.1 Integrating MIL with pretrained foundation models substantially improves slide-level HCC differentiation grading by effectively capturing histological patterns across WSIs.1 The ability of transformer-based MIL to model contextual relationships further underscores its potential for complex computational pathology tasks.1 With future validation in larger, multi-center cohorts, this AI-driven approach may become an important step toward standardized, scalable histological assessment in HCC.1
