Objective: To develop a regression model incorporating long non-coding RNAs (lncRNAs) that can predict the survival of patients with colon cancer before surgery. Methods: Clinical and gene expression data of patients with colon cancer were downloaded from the TCGA database. lncRNAs differentially expressed between tumor and paracancerous tissues were screened and then combined with the patients’ clinical information to construct a Cox proportional hazards regression model. Results: A total of 26 lncRNAs showed statistically significant differences in expression between paracancerous and tumor tissues (P<0.05). After repeated screening and comparison of predictive performance, the final prediction model comprised patient age, M stage, N stage, and the expression levels of three lncRNAs (ZFAS1, SNHG25, and SNHG7). Age [HR=4.00, 95%CI: (1.48, 10.84), P=0.006], M stage [HR=3.96, 95%CI: (2.23, 7.04), P<0.001], N stage [HR=1.87, 95%CI: (1.24, 2.84), P=0.003], ZFAS1 expression level [HR=0.60, 95%CI: (0.41, 0.86), P=0.006], SNHG25 expression level [HR=0.85, 95%CI: (0.73, 1.00), P=0.045], and SNHG7 expression level [HR=2.32, 95%CI: (1.53, 3.52), P<0.001] were all independent predictors of postoperative survival in patients with colon cancer. The areas under the ROC curves for predicting 1-, 3-, and 5-year overall survival were 0.802, 0.828, and 0.771, respectively, indicating good predictive ability. Conclusion: The predictive model combining the expression levels of ZFAS1, SNHG25, and SNHG7 with M stage, N stage, and age predicts the overall survival of patients well before surgery, which can help guide clinical decision-making and the choice of the most suitable treatment for each patient.
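
As an illustration of how the hazard ratios of a Cox proportional hazards model combine, the following minimal sketch computes a hazard relative to a reference patient as exp(Σ ln(HR_i)·x_i), using the hazard ratios reported above. It is not the authors' fitted model: the abstract does not state how the covariates were coded, so the coding used here (0/1 indicators for age, M stage, and N stage, and a one-unit change for each lncRNA expression level) is an assumption, and the names `HAZARD_RATIOS` and `relative_hazard` are hypothetical.

```python
import math

# Hazard ratios as reported in the abstract above.
# ASSUMPTION: each covariate is coded so that a value of 1.0 means
# "one unit above the reference level"; the abstract does not give the coding.
HAZARD_RATIOS = {
    "age": 4.00,
    "m_stage": 3.96,
    "n_stage": 1.87,
    "zfas1": 0.60,
    "snhg25": 0.85,
    "snhg7": 2.32,
}


def relative_hazard(covariates):
    """Return exp(sum(beta_i * x_i)) with beta_i = ln(HR_i).

    The result is the hazard relative to a reference patient whose
    covariates are all zero (relative hazard 1.0).
    """
    linear_predictor = sum(
        math.log(HAZARD_RATIOS[name]) * value
        for name, value in covariates.items()
    )
    return math.exp(linear_predictor)


# Hypothetical patients, for illustration only.
reference = {name: 0.0 for name in HAZARD_RATIOS}
patient = dict(reference, m_stage=1.0, snhg7=1.0)

print(relative_hazard(reference))           # reference patient
print(round(relative_hazard(patient), 2))   # product of the two HRs
```

Under these assumptions, a patient who differs from the reference only in M stage and in one unit of SNHG7 expression has a relative hazard equal to the product of the two hazard ratios (3.96 × 2.32, about 9.2). A relative hazard is not a survival probability; converting it into one would additionally require the baseline survival function, which the abstract does not report.
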
Objective: To develop and validate a machine learning model based on preoperative clinical characteristics, laboratory indices, and radiological features for the non-invasive prediction of spread through air spaces (STAS) in patients with early-stage lung adenocarcinoma. Methods: Preoperative data from patients with early-stage lung adenocarcinoma who underwent surgical resection at Northern Jiangsu People's Hospital between January 2020 and August 2025 were retrospectively collected, including clinical characteristics, laboratory indices, and radiological features. Patients were divided into STAS-positive and STAS-negative groups based on postoperative pathological findings. The dataset was randomly split into a training set and a testing set at a 7:3 ratio. Feature variables were selected using the maximum relevance and minimum redundancy (mRMR) algorithm and least absolute shrinkage and selection operator (LASSO) regression. Five machine learning models were constructed: logistic regression (LR), random forest (RF), support vector machine (SVM), light gradient boosting machine (LightGBM), and extreme gradient boosting (XGBoost). Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and decision curve analysis (DCA). The Shapley additive explanations (SHAP) method was used to interpret the optimal prediction model. Results: A total of 377 patients were included, comprising 177 males (46.9%) and 200 females (53.1%), with a mean age of (63.31±9.73) years. There were 261 patients in the training set and 116 in the testing set. In the training set, statistically significant differences were observed between the STAS-positive group (n=130) and the STAS-negative group (n=131) in multiple features, including age, sex, neutrophil-to-lymphocyte ratio (NLR), monocyte-to-lymphocyte ratio (MLR), clinical T stage, and maximum solid component diameter (P<0.05). A final set of 10 feature variables was selected by combining mRMR and LASSO regression and was used to build the five models. The XGBoost model performed strongly in both the training and testing sets, with AUCs of 0.947 [95%CI (0.920, 0.975)] and 0.943 [95%CI (0.894, 0.993)], respectively, and was the best-performing model in the testing set. DCA indicated that the XGBoost model provided a high net clinical benefit across a wide range of threshold probabilities. SHAP analysis showed that the vessel convergence sign, clinical T stage, age, consolidation-to-tumor ratio (CTR), and MLR contributed most to STAS prediction. Conclusion: The XGBoost model effectively predicts preoperative STAS status in early-stage lung adenocarcinoma, with excellent discriminative performance and good clinical interpretability. Key predictors such as the vessel convergence sign, clinical T stage, age, and CTR provide an important reference for preoperative risk assessment and the individualized selection of surgical strategies, ultimately benefiting patients.
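
The two evaluation measures named above, AUC and the net benefit used in decision curve analysis, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's code: the labels and predicted probabilities are made-up toy values, and the function names `auc`, `net_benefit`, and `net_benefit_treat_all` are hypothetical. AUC is computed as the fraction of positive/negative pairs ranked correctly, and net benefit at threshold probability p_t as TP/n − (FP/n)·p_t/(1 − p_t).

```python
def auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. O(n_pos * n_neg), fine for a sketch.
    """
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    score = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                score += 1.0
            elif pos == neg:
                score += 0.5
    return score / (len(positives) * len(negatives))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented for illustration (1 = STAS-positive, 0 = STAS-negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.7, 0.6]

print(auc(y_true, y_prob))
for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots `net_benefit` against the threshold probability alongside the treat-all and treat-none (net benefit 0) reference strategies; a model is useful over the range of thresholds where its curve lies above both.
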
[Abstract] High-grade histologic subtypes of lung adenocarcinoma, such as the micropapillary and solid patterns, are characterized by high invasiveness, increased risk of recurrence, and poor prognosis. Early preoperative identification of these subtypes is crucial for achieving individualized treatment and improving clinical outcomes. This review summarizes the clinical features, imaging manifestations, molecular mechanisms, and diagnostic advances related to these aggressive patterns. Studies have shown that micropapillary and solid subtypes are more common in male smokers and often present as solid nodules, and that FDG-PET metabolic parameters and CT-based radiomics models have strong predictive value for these subtypes. At the molecular level, EGFR mutations are more frequently observed in the micropapillary subtype, whereas the solid subtype is often associated with high PD-L1 expression and TP53 mutations, suggesting distinct strategies for targeted therapy and immunotherapy. In addition, serum markers such as CEA and CYFRA21-1, along with inflammatory indices such as NLR and SII, may serve as auxiliary tools for subtype identification. Histologic subtypes of lung adenocarcinoma are evolving from descriptive classifications into critical determinants of treatment decisions and precision management. Clinicians should incorporate comprehensive histologic evaluation into individualized therapeutic planning. Multimodal integration technologies, combined with artificial intelligence algorithms, are advancing the accurate preoperative prediction and management of high-risk subtypes, thereby facilitating early diagnosis and stratified treatment of lung adenocarcinoma.