ZHOU Shirong, TANG Yincai, WANG Pingping, ZHUANG Liangliang, XU Jiawei. Spatio-Temporal Forecasting and Uncertainty Quantification of COVID-19 Cases in Shanghai via a Bayesian Deep Learning Approach[J]. Chinese Journal of Applied Probability and Statistics, 2024, 40(2): 298-322. DOI: 10.3969/j.issn.1001-4268.2024.02.006

Spatio-Temporal Forecasting and Uncertainty Quantification of COVID-19 Cases in Shanghai via a Bayesian Deep Learning Approach

Abstract: The outbreak of COVID-19 in Shanghai in the spring of 2022 had a serious impact on the city's society, economy, and the daily lives of its residents. The spread of COVID-19 often exhibits complex nonlinear dynamics influenced by the environment, demographics, medical conditions, the frequency of nucleic acid or antigen testing, epidemic control strategies, and other factors. Long short-term memory (LSTM) models with complex network structures and extensive training are widely used to learn and predict the spread of epidemics. However, such models neither account for the uncertainty in the data nor take the influence of covariates and heterogeneity into account. This paper therefore proposes a two-stage LSTM-nested generalized Poisson regression (LNGPR) model to analyze COVID-19 infection data from the Shanghai outbreak in the spring of 2022. In the first stage, a multi-layer LSTM network is trained on district-specific infection data, and the trained LSTM is then used to fit and predict the number of symptomatic COVID-19 infections. In the second stage, the predicted number of cases is modeled by a generalized Poisson regression model under a hierarchical Bayesian framework, in which the logarithm of the relative risk is modeled as a linear function of covariates and of random effects that capture spatio-temporal heterogeneity. With the help of the deep learning stage, the spatio-temporal generalized Poisson regression model can forecast the number of daily new symptomatic infections and quantify the uncertainty of that forecast. Furthermore, the predictions based on the proposed Bayesian deep learning approach perform better than those based on the LSTM alone, by virtue of borrowing strength from covariates and from spatial and temporal heterogeneity.
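The second stage described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact parameterization, so the following are assumptions made only for illustration: Consul's form of the generalized Poisson distribution, re-expressed through its mean; a hypothetical expected-count offset `expected_count`; and the names `beta`, `spatial_effect` and `temporal_effect` for the regression coefficients and random effects. Priors and posterior sampling are omitted; the sketch only evaluates the likelihood term for one district-day.

```python
import math
import numpy as np

def generalized_poisson_logpmf(y, mu, lam):
    """Log-pmf of Consul's generalized Poisson with mean mu > 0 and dispersion 0 <= lam < 1.

    Assumed parameterization (not taken from the paper):
    P(Y = y) = theta * (theta + lam*y)**(y-1) * exp(-theta - lam*y) / y!,
    with theta = mu * (1 - lam), so that E[Y] = mu. lam = 0 gives the ordinary Poisson.
    """
    theta = mu * (1.0 - lam)
    return (math.log(theta)
            + (y - 1) * math.log(theta + lam * y)
            - (theta + lam * y)
            - math.lgamma(y + 1))

def log_relative_risk(x, beta, spatial_effect, temporal_effect):
    """Linear predictor: covariate term plus spatial and temporal random effects."""
    return float(np.dot(x, beta)) + spatial_effect + temporal_effect

def district_day_loglik(y, expected_count, x, beta, spatial_effect, temporal_effect, lam):
    """Log-likelihood of a count y (e.g. a rounded stage-one LSTM prediction).

    The mean is the hypothetical offset times the relative risk exp(eta).
    """
    eta = log_relative_risk(x, beta, spatial_effect, temporal_effect)
    mu = expected_count * math.exp(eta)
    return generalized_poisson_logpmf(y, mu, lam)

# Illustrative values only; none of these numbers come from the paper.
x = np.array([1.0, 0.5])        # intercept and one standardized covariate
beta = np.array([0.1, -0.2])    # hypothetical regression coefficients
ll = district_day_loglik(y=12, expected_count=10.0, x=x, beta=beta,
                         spatial_effect=0.05, temporal_effect=-0.02, lam=0.3)
print(ll)
```

In a full hierarchical Bayesian fit, this log-likelihood would be summed over districts and days and combined with priors on `beta`, the random effects and `lam`; the dispersion parameter is what lets the predictive variance exceed the mean and so widen the uncertainty bands relative to a plain Poisson model.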

     
