World Journal of Oncology, ISSN 1920-4531 print, 1920-454X online, Open Access
Article copyright, the authors; Journal compilation copyright, World J Oncol and Elmer Press Inc
Journal website https://wjon.elmerpub.com

Original Article

Volume 16, Number 6, December 2025, pages 630-644


Analysis of the SH3-Domain Kinase Binding Protein 1 Predictive Model for Pancreatic Ductal Adenocarcinoma and CCCTC-Binding Factor Transcriptional Regulatory Study

Figures

Figure 1. Flow diagram.
Figure 2. Machine learning feature selection and SHAP interpretability analysis based on multiple datasets. (a) Ten-fold cross-validation λ parameter selection curve for LASSO regression, with the optimal λ value marked by a red vertical line. (b) Venn diagram of feature selection results from three machine learning algorithms (LASSO, RF, SVM-RFE), with SH3KBP1 located in the intersection of all three. (c) LASSO regression coefficient shrinkage path diagram. (d) Ten-fold cross-validation accuracy as a function of the number of features, showing a maximum accuracy of 0.850. (e) Ten-fold cross-validation error rate curve, showing a minimum error rate of 0.141. (f) ROC performance comparison of multiple machine learning algorithms, with the PLS algorithm having the highest AUC (0.994). (g) Univariate prediction efficiency analysis of key genes. (h) SHAP feature importance ranking diagram. (i) SHAP value summary diagram, showing the distribution of the impact of gene expression on prediction. (j) SHAP dependence diagram, showing the interaction between SH3KBP1 and genes such as PCDH8B, OR7E31P, USR1, and CAPZA3. (k) Flowchart of the final prediction model constructed based on key genes.
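The intersection step behind the Venn diagram in Figure 2b can be sketched as follows. This is a minimal illustration, assuming each algorithm has already returned a set of selected gene names; the function name `consensus_features` and every gene label other than SH3KBP1 are hypothetical placeholders, not the study's actual selections.

```python
def consensus_features(*selections):
    """Return, in sorted order, the features chosen by every selector."""
    if not selections:
        return []
    common = set(selections[0])
    for chosen in selections[1:]:
        common &= set(chosen)  # keep only features shared with this selector
    return sorted(common)


# Hypothetical selections; only SH3KBP1 is taken from the figure legend.
lasso = {"SH3KBP1", "GENE_A", "GENE_B"}
rf = {"SH3KBP1", "GENE_B", "GENE_C"}
svm_rfe = {"SH3KBP1", "GENE_A", "GENE_C"}

print(consensus_features(lasso, rf, svm_rfe))
```

Requiring agreement among all selectors is a conservative choice: a gene retained by only one or two algorithms is dropped.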
Figure 3. Analysis of SH3KBP1 expression and its clinical correlations in the TCGA-PDAC dataset. (a) Expression distribution of SH3KBP1 in normal and tumor groups. (b) ROC diagnostic curve analysis of SH3KBP1. (c) Comparison of expression level distribution, mean value, and statistical characteristics. (d) Kaplan-Meier survival analysis of OS, DSS, and PFI by SH3KBP1 expression level.
Figure 4. SH3KBP1 protein expression analysis based on THPA (antibody: HPA003351). (a) Immunohistochemical images showing the expression of SH3KBP1 in non-PDAC and PDAC tissues at ×40 and ×80 magnification. (b) Proportion of SH3KBP1-positive patients across tumor types.
Figure 5. SH3KBP1 protein expression analysis based on the PDC database. (a) Violin plot showing the expression distribution of SH3KBP1 in non-PDAC and PDAC samples (300 cases in each group). (b) ROC curve evaluating the discriminative ability of SH3KBP1 in PDAC (AUC = 0.911).
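An AUC such as the one reported in Figure 5b can be read as the probability that a randomly chosen tumor sample scores higher than a randomly chosen non-tumor sample. The sketch below computes it that way; it is an illustration only, the function name `roc_auc` is hypothetical, and the expression values are made-up toy numbers, not PDC data.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy expression values (hypothetical, for illustration only).
tumor = [2.1, 3.4, 2.9, 1.8]
normal = [1.2, 2.0, 1.8, 0.9]

print(roc_auc(tumor, normal))
```

The pairwise loop is quadratic in the number of samples; for large cohorts a rank-based implementation would be preferable.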
Figure 6. CRISPR knockout score of the SH3KBP1 gene in PDAC cell lines based on the DepMap database. Bar chart showing the CRISPR dependency scores (CRISPR score) of SH3KBP1 in four PDAC cell lines (JOPACA1, PATU8988S, BXPC3, and KP2).
Figure 7. Analysis of cell heterogeneity and SH3KBP1 expression in PDAC tissues based on single-cell sequencing data from the NCBI SRA project (accession number: PRJNA885258). (a) Sample quality control parameters, including the number of genes (nFeature), number of UMIs (nCount), and mitochondrial gene ratio (MT.percent). (b) t-SNE dimensionality reduction map showing the spatial distribution of different cell populations. (c) Pie chart showing the composition ratio of various cell types in the sample. (d) DotPlot showing the expression levels and expression proportions of characteristic marker genes in each cell population. (e) Expression distribution of SH3KBP1 in various cell populations (t-SNE map). (f) Inference of whether cells are malignant based on single-cell CNV analysis, with the upper panel showing reference cells (e.g., T cells) and the lower panel showing the CNV heatmap of other cells.
Figure 8. Spatial expression analysis of SH3KBP1 based on GSM8452847 spatial transcriptome data. (a) Spatial tissue structure map, spatial UMI count map (nCount_Spatial), and expression distribution of tumor key genes (CDKN2A, TP53) in tissue sections. (b) Spatial expression distribution of common driver genes (SMAD4, KRAS) in PDAC. (c) Tumor region heatmap and binary classification region map (Tumor vs. Normal) divided according to the expression patterns of key molecules (e.g., KRAS, TP53) in (b). (d) Spatial expression heatmap of SH3KBP1 and violin plot of its expression difference analysis between tumor regions and normal regions.
Figure 9. Prediction analysis of potential transcription factors of SH3KBP1 based on the CDB database. (a) Venn diagram showing the intersection of SH3KBP1-coexpressed genes (P < 0.05) in TCGA-PDAC, upregulated DEGs, and TF targets in the PANC-1 cell line, which yields four candidate transcription factors (CTCF, ETS1, ZEB1, and SIN3A). (b) Binding sites and scores of transcription factors predicted by the CDB database in the SH3KBP1 promoter region (chrX).
Figure 10. Expression characteristics of CTCF in PDAC and its correlation with SH3KBP1. (a) Analysis of CTCF expression difference between normal tissues and tumor tissues in the TCGA database (Log2(FPKM+1)). (b) ROC curve evaluating the diagnostic efficacy of CTCF for PDAC (AUC). (c) Spearman correlation analysis of SH3KBP1 and CTCF expression levels in PDAC samples (Log2(TPM+1)). (d) ChIP-seq enrichment map of CTCF in the SH3KBP1 gene region (hg38 reference genome), showing its potential binding sites and transcriptional regulatory regions.
Figure 11. Schematic diagram of the role of SH3KBP1 in PDAC.