ZHANG Junying, ZHANG Riquan, WANG Hang, LU Zhiping. Marginal Empirical Likelihood Independence Screening in Sparse Ultrahigh Dimensional Additive Models[J]. Chinese Journal of Applied Probability and Statistics, 2019, 35(2): 126-140. DOI: 10.3969/j.issn.1001-4268.2019.02.002

Marginal Empirical Likelihood Independence Screening in Sparse Ultrahigh Dimensional Additive Models

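  • Summary: In an additive model, the covariates act on the response variable through functions of the covariates, which makes it a more flexible nonparametric statistical model. When the number of covariates exceeds the sample size and grows at an exponential rate, reducing the dimension to a range that classical methods can handle is a problem statisticians urgently need to solve. This paper studies variable screening for additive models with ultrahigh-dimensional data and proposes a marginal empirical likelihood variable screening method. The method selects variables by ranking the marginal empirical likelihood ratios evaluated at 0. We prove that the selected variable set asymptotically contains the true variable set with probability tending to 1, and we also propose an iterative marginal empirical likelihood variable screening method. Simulation studies and a real data analysis confirm the feasibility of the proposed methods.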

     

    Abstract: The additive model is a flexible nonparametric statistical model that allows a data-analytic transform of the covariates. When the number of covariates is large and grows exponentially with the sample size, the urgent issue is to reduce the dimensionality from high to a moderate scale. In this paper, we propose and investigate marginal empirical likelihood screening methods for ultrahigh-dimensional additive models. The proposed nonparametric screening method selects variables by ranking the marginal empirical likelihood ratio evaluated at zero, which measures the contribution of each covariate to the response variable. We show that, under some mild technical conditions, the proposed methods have the sure screening property, and we explicitly quantify the extent to which the dimensionality can be reduced. We also propose a data-driven thresholding method and an iterative marginal empirical likelihood method to enhance finite-sample performance when fitting sparse additive models. Simulation results and a real data analysis demonstrate that the proposed methods are competitive and outperform competing methods when the errors are heteroscedastic.
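
    The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the basis or the estimating function, so the standardized polynomial basis, the estimating function B(X_j) * Y with a centered response, and the names `el_ratio_at_zero`, `polynomial_basis`, and `mels_rank` are all assumptions made for illustration. The data-driven threshold and the iterative variant are not shown.

```python
import numpy as np

def el_ratio_at_zero(g, max_iter=100, tol=1e-8):
    """Empirical likelihood ratio statistic for H0: E[g] = 0.

    Computes 2 * max_lam sum_i log(1 + lam' g_i) by damped Newton steps.
    """
    n, d = g.shape
    lam = np.zeros(d)

    def log_el(l):
        w = 1.0 + g @ l
        # EL weights p_i = 1 / (n * w_i) must lie in (0, 1], so w_i >= 1/n.
        return np.sum(np.log(w)) if np.all(w >= 1.0 / n) else -np.inf

    current = 0.0
    for _ in range(max_iter):
        w = 1.0 + g @ lam
        gw = g / w[:, None]
        grad = gw.sum(axis=0)
        if np.linalg.norm(grad) < tol * n:
            break
        # Negative Hessian of the concave objective, with a tiny ridge.
        neg_hess = gw.T @ gw + 1e-10 * np.eye(d)
        step = np.linalg.solve(neg_hess, grad)
        t = 1.0
        # Backtrack until the objective strictly improves.
        while t > 1e-12 and log_el(lam + t * step) <= current:
            t *= 0.5
        if t <= 1e-12:
            break
        lam = lam + t * step
        current = log_el(lam)
    return 2.0 * current

def polynomial_basis(x, degree=3):
    """Assumed basis: centered, standardized powers x, x^2, ..., x^degree."""
    B = np.column_stack([x ** k for k in range(1, degree + 1)])
    B = B - B.mean(axis=0)
    return B / B.std(axis=0)

def mels_rank(X, y, degree=3):
    """Rank covariates by their marginal EL ratio at zero (largest first)."""
    y_c = (y - y.mean()) / y.std()
    stats = np.array([
        el_ratio_at_zero(polynomial_basis(X[:, j], degree) * y_c[:, None])
        for j in range(X.shape[1])
    ])
    return np.argsort(-stats), stats

# Toy data: only the first two covariates enter the additive model.
rng = np.random.default_rng(0)
n, p = 300, 20
X = rng.uniform(-1, 1, size=(n, p))
y = 2 * X[:, 0] + 2 * (3 * X[:, 1] ** 2 - 1) + 0.5 * rng.standard_normal(n)
order, stats = mels_rank(X, y)
print(order[:2])
```

    A large statistic indicates that the marginal moment condition is far from holding at zero, so the covariate is ranked as more likely to be active; keeping the top-ranked covariates gives the screened set.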

     
