Citation: QIN Jing. Selective Review of Biased Sampling Problems with Applications in Modern Statistics[J]. Chinese Journal of Applied Probability and Statistics, 2024, 40(2): 229-263. DOI: 10.3969/j.issn.1001-4268.2024.02.002

Selective Review of Biased Sampling Problems with Applications in Modern Statistics

  • Abstract: Biased sampling is a pervasive problem across disciplines, affecting econometrics, epidemiology, medicine, survey research and, more recently, machine learning and artificial intelligence (AI). It arises when the selection of data points for analysis or research introduces systematic bias, which can compromise the accuracy and reliability of research findings. This paper gives a comprehensive overview of the foundational concepts of biased sampling problems and the associated methods of inference. We also connect biased sampling to recent discussions of distribution shift in machine learning, and we review the latest advances in biased sampling, particularly in transfer learning and in conformal inference for predictive confidence intervals. Our ultimate goal is to present this material in a way that is accessible to graduate students, so that they can recognize applications of biased sampling problems in their own research. With deep respect and gratitude, we dedicate this paper to the memory of the late Professor Shisong Mao, whose guidance and wisdom have been invaluable over the years.

     
