一个计算鞅差相关系数的快速算法

A Fast Algorithm for Computing Martingale Difference Correlation

  • 摘要: 鞅差散度或鞅差相关系数被广泛地用于度量一维响应变量与多维预测变量关于条件均值独立性的偏离.根据定义直接计算样本鞅差散度的时间复杂度为O(n2),其中n为样本量,这极大地限制了鞅差散度在大数据中的应用.为了能够快速计算两个一维随机变量的样本鞅差散度,本文提出一个简单的、准确的、时间复杂度为O(nlog(n))的快速算法.本文所提出的快速算法基本上只包含了一个排序步骤,所以容易实现.数值模拟的结果表明本文所提出的算法显著优于对鞅差散度的蛮力算法,这给研究者在大数据集中探索复杂相依结构提供了很大的便利.

     

    Abstract: Martingale difference divergence and martingale difference correlation have been widely adopted in measuring the departure of conditional mean independence between a scalar response variable and a vector predictor variable. The computation of sample martingale difference divergence is implemented directly accordingly to its definition typically requires O(n2) cost, where n is the sample size, which greatly limits the use of the distance covariance for large data. To calculate the sample martingale difference divergence between two univariate random variables, a simple, exact O(nlog(n)) algorithm is developed. The proposed algorithm essentially consists of one sorting step, so it is easy to implement. The numerical simulation results show that the proposed algorithm is significantly faster than the brute force method, which enables researchers to explore complicated dependence structures in large datasets.

     

/

返回文章
返回