Combining BART and Reciprocal LASSO for High-Dimensional Gene Expression Modeling

Authors

  • Saif Hosam Raheem, University of Al-Qadisiyah, Iraq

DOI:

https://doi.org/10.37119/jpss2026.v24i1.1001

Abstract

High-dimensional gene expression data pose major challenges for statistical modeling because of the large number of predictors, strong correlations among them, and the presence of nonlinear regulatory structures. This study proposes a hybrid Bayesian framework that combines Bayesian Additive Regression Trees (BART) with the Reciprocal LASSO prior to achieve flexible nonlinear modeling and structured sparsity within a unified model. The theoretical development integrates aggressive shrinkage for variable selection with a sum-of-trees architecture that captures complex gene–gene interactions. A comprehensive simulation study across multiple dimensionality and correlation settings demonstrates that the proposed BART–RL model consistently achieves lower prediction error and higher true positive rates than classical LASSO, elastic net, BART, and Bayesian reciprocal LASSO. Application to a real gene expression dataset further confirms the advantages of the hybrid approach: it yields improved predictive performance and identifies biologically meaningful genes supported by functional annotations. These results highlight the utility of combining nonlinear Bayesian tree ensembles with adaptive shrinkage priors for high-dimensional genomic modeling.
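
The "aggressive shrinkage" attributed to the reciprocal LASSO can be illustrated with the penalty as it is usually stated in the literature: lambda times the sum of 1/|beta_j| over the nonzero coefficients, in contrast to the classical LASSO penalty of lambda times the sum of |beta_j|. The Python sketch below is illustrative only. The abstract does not give the exact prior or its placement within the BART–RL model, so the function names and the example coefficient vectors are hypothetical and are not taken from the paper.

```python
def lasso_penalty(beta, lam):
    """Classical LASSO penalty: lam * sum of |b| over all coefficients."""
    return lam * sum(abs(b) for b in beta)


def reciprocal_lasso_penalty(beta, lam):
    """Reciprocal LASSO penalty: lam * sum of 1/|b| over NONZERO coefficients.

    Coefficients that are exactly zero contribute nothing.
    """
    return lam * sum(1.0 / abs(b) for b in beta if b != 0)


if __name__ == "__main__":
    lam = 1.0  # hypothetical penalty strength
    # Two hypothetical coefficient vectors: one truly sparse, one with a
    # tiny nonzero coefficient in the first position.
    for beta in ([0.0, 0.0, 2.0], [0.01, 0.0, 2.0]):
        print(beta, lasso_penalty(beta, lam), reciprocal_lasso_penalty(beta, lam))
```

Under the reciprocal penalty, a tiny nonzero coefficient is very expensive while an exact zero costs nothing, so each coefficient is pushed either to exactly zero or clearly away from zero. The classical LASSO penalty, by comparison, barely changes when a near-zero coefficient is added.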

Published

2026-03-01