NONPARAMETRIC REGRESSION FUNCTION ESTIMATION BASED ON CENSORED DATA
Abstract: This paper considers estimation of the nonparametric regression function $m(x) = E(Y \mid x)$ from censored data. Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be i.i.d. $R^d \times R^1$ random vectors with $E|Y| < +\infty$, and let $T_1, \ldots, T_n$ be i.i.d. samples of a random variable $T$ that is independent of $\{(X_i, Y_i)\}_{i=1}^n$. Set $F(x, y) = P(X \ge x, Y \ge y)$ and $G(t) = P(T \ge t)$; both $F$ and $G$ are unknown continuous survival functions. We do not observe $Y$ itself, only $Z = \min(Y, T)$ and $\delta = [Y \le T]$, where $[A]$ denotes the indicator of the event $A$.

Given the training sample $\{(X_i, Z_i, \delta_i)\}_{i=1}^n$ and the current sample $(X, Z, \delta)$, write

$$\xi_i(\cdot) = [Z_i \ge \cdot], \quad N^+(\cdot) = \sum_{i=1}^n \xi_i(\cdot),$$

$$V_n(\cdot) = \prod_{i=1}^n \left\{ \frac{1 + N^+(Z_i)}{2 + N^+(Z_i)} \right\}^{[\delta_i = 0,\ Z_i < \cdot]}, \quad U_n(\cdot) = \sum_{i=1}^n W_{ni}(x)\, \xi_i(\cdot),$$

and define

$$m_n(x) = \int_0^{u_n} \frac{U_n(y)}{V_n(y)}\, dy,$$

where $u_n = F_2^{-1}(n^{-\alpha})$, $0 < \alpha < 1/2$ is a real constant, and $F_2(\cdot) = P(Y \ge \cdot)$ is the (right-side) distribution function of $Y$. Under a set of conditions on the weight functions $\{W_{ni}(x)\}_{i=1}^n$ and on the distribution of $(X, Y, T)$, we prove that $m_n(x)$ is a strongly consistent estimate of $m(x)$, that is, $m_n(x) \to m(x)$ a.s. as $n \to +\infty$.
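As an illustration, the estimator $m_n(x)$ described in the abstract can be sketched numerically. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the weights $W_{ni}(x)$, so normalized Gaussian kernel weights are assumed here. The truncation point $u_n = F_2^{-1}(n^{-\alpha})$ depends on the unknown $F_2$, so it is passed in as a parameter. The function name, the `bandwidth` argument, and the requirement $u_n > 0$ are hypothetical choices made for this example. Because $U_n$ and $V_n$ are step functions that change only at the observed $Z_i$, the integral reduces to a finite sum.

```python
import numpy as np


def censored_regression_estimate(x, X, Z, delta, u_n, bandwidth=0.5):
    """Sketch of m_n(x) = integral over [0, u_n] of U_n(y) / V_n(y) dy.

    Assumptions (not from the source): Gaussian kernel weights for
    W_ni(x), and u_n > 0 supplied by the caller.
    """
    Z = np.asarray(Z, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Z), -1)
    delta = np.asarray(delta, dtype=int)
    x = np.atleast_1d(np.asarray(x, dtype=float))

    # Assumed weights W_ni(x): normalized Gaussian kernel in X.
    sq_dist = np.sum((X - x) ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist / bandwidth ** 2)
    w = w / w.sum()

    # N^+(Z_i) = number of observations Z_j with Z_j >= Z_i.
    n_plus = (Z[None, :] >= Z[:, None]).sum(axis=1)
    factor = (1.0 + n_plus) / (2.0 + n_plus)

    # Breakpoints of the step functions inside [0, u_n].
    inside = Z[(Z > 0.0) & (Z < u_n)]
    knots = np.unique(np.concatenate(([0.0], inside, [u_n])))

    total = 0.0
    for a, b in zip(knots[:-1], knots[1:]):
        y = 0.5 * (a + b)  # U_n and V_n are constant on (a, b)
        # U_n(y) = sum_i W_ni(x) [Z_i >= y]
        U = np.sum(w * (Z >= y))
        # V_n(y) = product of factors over censored i with Z_i < y
        V = np.prod(np.where((delta == 0) & (Z < y), factor, 1.0))
        total += (b - a) * U / V
    return total
```

When no observation is censored, $V_n \equiv 1$, and if every $Z_i$ lies in $[0, u_n]$ the sketch returns the weighted average $\sum_i W_{ni}(x) Z_i$. Censored observations make $V_n(y) < 1$ for larger $y$, which inflates the corresponding part of the integrand to compensate for the values of $Y$ hidden by censoring.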