Variable Selection with Copula Entropy
-
Abstract
Variable selection is important for classification and regression tasks in machine learning and statistical applications where both predictability and explainability are needed. This paper proposes a Copula Entropy (CE) based method for variable selection that uses CE-based ranks to select variables. The method is both model-free and tuning-free. The proposed method was compared experimentally with traditional variable selection methods, including distance correlation, the Hilbert-Schmidt independence criterion, stepwise selection, regularized generalized linear models, and adaptive LASSO, on the UCI heart disease data. The experimental results show that the CE-based method selects the 'right' variables more effectively and yields more interpretable results than the traditional methods, without sacrificing accuracy. We believe that CE-based variable selection can help to build more explainable models.
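To make the idea of CE-based ranking concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes that the negative CE between a candidate variable and the target is estimated in two steps: a rank transform to the empirical copula, followed by a k-nearest-neighbour (Kozachenko-Leonenko) entropy estimate. The function names, the synthetic data, and the choice k = 3 are illustrative assumptions. Variables are then ordered by negative CE, which coincides with mutual information, so a larger value indicates stronger dependence on the target.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n, via harmonic numbers."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def rank_transform(values):
    """Map a sample to empirical CDF values rank/n (ties broken by position)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / n
    return u


def knn_entropy(points, k=3):
    """Kozachenko-Leonenko entropy estimate (max-norm, brute-force search)."""
    n, d = len(points), len(points[0])
    log_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(
            max(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(points)
            if j != i
        )
        # 2 * (distance to the k-th neighbour) is the side of the max-norm ball.
        log_sum += math.log(2.0 * dists[k - 1])
    return digamma_int(n) - digamma_int(k) + d * log_sum / n


def negative_copula_entropy(x, y, k=3):
    """Estimate -CE(x, y): entropy of the rank-transformed pairs, negated."""
    points = list(zip(rank_transform(x), rank_transform(y)))
    return -knn_entropy(points, k)


def rank_variables(columns, y, k=3):
    """Return (name, score) pairs sorted from strongest to weakest dependence."""
    scores = {name: negative_copula_entropy(col, y, k) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical synthetic data: x1 drives the target, x2 is pure noise.
random.seed(0)
n_samples = 200
x1 = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
x2 = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
y = [a + 0.1 * random.gauss(0.0, 1.0) for a in x1]

ranking = rank_variables({"x1": x1, "x2": x2}, y)
for name, score in ranking:
    print(f"{name}: negative CE estimate = {score:.3f}")
```

In this sketch the informative variable should receive a clearly larger score than the noise variable, whose estimate should be close to zero. The brute-force neighbour search is quadratic in the sample size, so it is suitable only for small data sets such as the one simulated here.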
-
-