26 April 2024, Volume 40 Issue 2
    

  • ZHU Nenghui, YOU Jinhong, XU Qunfang
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 201-228. https://doi.org/10.3969/j.issn.1001-4268.2024.02.001
    By utilizing a robust loss function, B-spline approximation and the adaptive group Lasso, a nonparametric additive model is investigated to identify insignificant covariates in the "large $p$, small $n$" setting. Compared with the ordinary least-squares adaptive group Lasso, the proposed method is resistant to heavy-tailed errors and outliers in the responses. To facilitate the presentation, a more general weighted robust group Lasso estimator is considered. Moreover, the weight vectors play a pivotal role in allowing the suggested estimators to enjoy the model selection oracle property and asymptotic normality. The robust group Lasso and the adaptive robust group Lasso can be seen as special cases corresponding to different weight vectors. In practice, we use the robust group Lasso to obtain an initial estimator that reduces the dimension of the problem, and then apply the iterative adaptive robust group Lasso to select nonzero components. The results of simulation studies show that the proposed methods work well with samples of moderate size. A high-dimensional gene TRIM32 data set is used to illustrate the application of the proposed method.
  • QIN Jing
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 229-263. https://doi.org/10.3969/j.issn.1001-4268.2024.02.002
    Biased sampling is a pervasive issue that transcends various disciplines, impacting fields such as econometrics, epidemiology, medicine, survey research, and more recently, machine learning and artificial intelligence (AI). This ubiquitous challenge arises when the selection of data points for analysis or research introduces systematic biases, potentially compromising the accuracy and reliability of research outcomes. In this paper, our objective is to provide a comprehensive overview of the foundational concepts related to biased sampling problems and the methods of inference. Furthermore, we aim to establish a connection between biased sampling issues and the more recent discussions in machine learning regarding distribution shift problems. Additionally, we will delve into the latest advancements in biased sampling, particularly within the context of transfer learning and conformal inference for predictive confidence intervals. Our ultimate goal is to present this material in a manner that is accessible to graduate students, enabling them to identify applications of biased sampling problems within their own research endeavors.

    It is with deep respect and gratitude that we dedicate this paper to the memory of the late Professor Shisong Mao, whose guidance and wisdom have been invaluable throughout the years.

  • HAO Hongxia, HU Hongqian, HAN Zhongcheng, LIN Jinguan
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 264-276. https://doi.org/10.3969/j.issn.1001-4268.2024.02.003
    To effectively capture the time-varying asymmetry of leverage effects in financial time series, this paper introduces a semi-parametric stochastic volatility model incorporating time-varying leverage effects based on linear splines. The parameters of this model are estimated using the Bayesian Markov Chain Monte Carlo (MCMC) method. Simulation studies indicate that the Bayesian MCMC method performs well in parameter estimation for the proposed model, even with limited sample sizes. Finally, the suggested semi-parametric stochastic volatility model with time-varying leverage effects is applied to the empirical analysis of daily returns data for the Shanghai Composite Index and the Shenzhen Component Index from January 4, 2000, to August 18, 2020. The results demonstrate the superiority of the proposed method.

  • HAN Dong, TSUNG Fugee
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 277-286. https://doi.org/10.3969/j.issn.1001-4268.2024.02.004
    This paper studies non-Bayesian change-point detection for sequences of finitely dependent samples. By introducing nonnegative dynamic random control limits, we not only construct two control charts and prove their optimality, but also obtain expressions for the minimum values of Lorden's measure and Pollak's measure that are easier to calculate than the original definitions.
  • SHAO Jun, WANG Lei
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 287-297. https://doi.org/10.3969/j.issn.1001-4268.2024.02.005
    This paper aims to develop a covariate selection approach for a high-dimensional covariate vector in the presence of nonignorable nonresponse. Because of nonignorable missing responses, a novel covariate selection method has to be developed to eliminate covariates associated with neither the response variable nor the nonresponse mechanism. Once the redundant covariates are removed, existing methods for propensity estimation and other analyses by inverse propensity weighting can be applied. We provide some simulation results to show the effectiveness of our approach.
  • ZHOU Shirong, TANG Yincai, WANG Pingping, ZHUANG Liangliang, XU Jiawei
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 298-322. https://doi.org/10.3969/j.issn.1001-4268.2024.02.006
    The outbreak of COVID-19 in Shanghai in the spring of 2022 had a serious impact on the society, economy, and daily life of residents. The spread of COVID-19 often exhibits complex non-linear dynamics influenced by environment, demographics, medical conditions, frequency of nucleic acid or antigen testing, epidemic control strategies, etc. Long short-term memory (LSTM) models with complex network structures and extensive training are widely adopted to learn and predict the spread of the epidemic. However, such a model neither explains the uncertainty in the data nor takes the influence of various covariates and heterogeneities into account. Therefore, a two-stage LSTM nested generalized Poisson regression (LNGPR) model is proposed in this paper to analyze COVID-19 infection data from the outbreak in Shanghai in the spring of 2022. In the first stage, a multi-layer LSTM network is trained to learn district-specific infection data, and the trained LSTM is then used to fit and predict the number of symptomatic COVID-19 infections. In the second stage, the predicted number of cases is modeled by a generalized Poisson regression model under a hierarchical Bayesian framework, in which the logarithm of the relative risks is modeled as a linear function of covariates and random effects with spatio-temporal heterogeneities. Facilitated by a deep learning approach, the spatio-temporal generalized Poisson regression model can forecast the number of daily new symptomatic infections and quantify its uncertainty. Furthermore, the predictions based on the proposed Bayesian deep learning approach perform better than those based on the LSTM method by virtue of borrowing strength from covariates and from spatial and temporal heterogeneity.
  • CHEN Zhen-Qing
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 323-342. https://doi.org/10.3969/j.issn.1001-4268.2024.02.007
    In this paper, we survey some recent progress in the study of time-fractional equations and their interplay with anomalous sub-diffusions, with some improvements and extensions.
  • PU Xiaolong, XIANG Dongdong, CHEN Xinyan
    CHINESE JOURNAL OF APPLIED PROBABILITY AND STATISTICS. 2024, 40(2): 343-363. https://doi.org/10.3969/j.issn.1001-4268.2024.02.008
    With the increasing complexity of production processes, there has been a growing focus on online algorithms within the domain of multivariate statistical process control (SPC). Nonetheless, conventional methods, which assume complete data obtained at uniform time intervals, perform poorly in the presence of missing data. To make the most of the available information, we propose an adaptive exponentially weighted moving average (EWMA) control chart employing a weighted imputation approach that leverages the relationships between complete and incomplete data. Specifically, we introduce two recovery methods: an improved K-nearest-neighbors imputed value and the conventional univariate EWMA statistic. We then formulate an adaptive weighting function to combine these methods, assigning a smaller weight to the EWMA statistic when the sample information suggests an increased likelihood of the process being out of control, and a larger weight otherwise. The robustness and sensitivity of the proposed scheme are shown through simulation results and an illustrative example.
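
    The weighted imputation idea in the last abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the exponential form of the weight, the `shift_evidence` input, and all function and parameter names are assumptions made for illustration only. The EWMA recursion itself is the standard one, z_t = lam * x_t + (1 - lam) * z_{t-1}.

```python
import math


def ewma_update(prev, x, lam=0.2):
    """Standard univariate EWMA recursion: lam * x + (1 - lam) * prev."""
    return lam * x + (1.0 - lam) * prev


def ewma_weight(shift_evidence, scale=1.0):
    """Hypothetical weight given to the EWMA-based recovery value.

    The weight is 1 when there is no evidence of a shift and decreases
    toward 0 as the evidence grows. The exponential form is an assumption,
    not the weighting function of the paper.
    """
    return math.exp(-scale * max(shift_evidence, 0.0))


def impute(knn_value, ewma_value, shift_evidence, scale=1.0):
    """Combine two recovery values for a missing observation.

    knn_value:      value recovered from similar complete observations
                    (assumed to be computed elsewhere).
    ewma_value:     current univariate EWMA statistic for the component.
    shift_evidence: nonnegative score; larger means the process looks
                    more likely to be out of control (hypothetical input).
    """
    w = ewma_weight(shift_evidence, scale)
    return w * ewma_value + (1.0 - w) * knn_value


if __name__ == "__main__":
    z = 0.0
    for x in [0.1, -0.2, 0.3]:
        z = ewma_update(z, x)
    # With little evidence of a shift the EWMA value dominates;
    # with strong evidence the neighbor-based value dominates.
    print(impute(knn_value=2.0, ewma_value=z, shift_evidence=0.1))
    print(impute(knn_value=2.0, ewma_value=z, shift_evidence=5.0))
```

    The design choice mirrors the description above: the EWMA statistic summarizes in-control history, so it is trusted less once the data suggest the process has shifted.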