Partially sufficient dimension reduction of the categorical predictors based on LASSO

Abstract: Sliced inverse regression (SIR) has proved highly effective for sufficient dimension reduction and data visualization. With advances in data collection technology, high-dimensional data have become common, and classical dimension reduction methods that analyze all features are prone to overfitting. Building on SIR, this paper incorporates categorical predictors and applies LASSO for feature selection, yielding the LASSO-PSIR method; the consistency of the resulting estimator is also proved. Numerical simulations show that LASSO-PSIR retains the original information while fully accounting for the influence of the categorical predictors, and better recovers the partial central subspace.
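For readers unfamiliar with the starting point, the following is a minimal sketch of classical SIR only, assuming NumPy. It is not the LASSO-PSIR estimator, whose details the abstract does not give: there is no categorical predictor and no LASSO step here. The function name `sir_directions` and its parameters are hypothetical, chosen for illustration.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical SIR sketch (hypothetical helper, not the paper's method)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize the predictors: Z = (X - mean) @ Sigma^{-1/2}.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the sample by the sorted response and accumulate the
    # weighted covariance of the slice means of Z.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original X scale.
    _, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)  # unit-length columns
```

On simulated data where the response depends on the predictors only through one linear combination, the returned column should align closely (up to sign) with that combination. In the high-dimensional setting the abstract describes, this plain version uses every feature, which is the overfitting concern that motivates adding a selection step.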

     
