Author(s): G. Tsoumakas, A. Dimou, E. Spyromitros-Xioufis, V. Mezaris, I. Kompatsiaris, I. Vlahavas.
Title: “Correlation-Based Pruning of Stacked Binary Relevance Models for Multi-Label Learning”.
Proceedings of the 1st International Workshop on Learning from Multi-Label Data (MLD'09), G. Tsoumakas, Min-Ling Zhang, Zhi-Hua Zhou (Eds.), pp. 101-116, Bled, Slovenia, 2009.
Abstract: Binary relevance (BR) learns a single binary model for each different
label of multi-label data. It has linear complexity with respect to the number of
labels, but does not take into account label correlations and may fail to accurately
predict label combinations and rank labels according to relevance to a new instance.
Stacking the models of BR in order to learn a model that associates their
output to the true value of each label is a way to alleviate this problem. In this
paper we propose the pruning of the models participating in the stacking process,
by explicitly measuring the degree of label correlation using the phi coefficient.
Exploratory analysis of phi shows that the correlations detected are meaningful
and useful. Empirical evaluation of the pruning approach shows that it leads to
substantial reduction of the computational cost of stacking and occasional improvements
in predictive performance.
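
As a rough illustration of the pruning idea described in the abstract, the sketch below computes the phi coefficient between every pair of labels and, for each target label, keeps only the base models whose labels are sufficiently correlated with it. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names phi_matrix and select_meta_inputs, the absolute-value threshold rule, the threshold of 0.5, the choice to assign phi = 0 to labels that never vary, and the toy label matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def phi_matrix(Y):
    """Pairwise phi coefficients between the columns of a binary label matrix.

    Y has shape (n_samples, n_labels) with entries in {0, 1}.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.shape[0]
    pos = Y.sum(axis=0)            # per-label count of positive examples
    neg = n - pos                  # per-label count of negative examples
    n11 = Y.T @ Y                  # both labels present
    n10 = pos[:, None] - n11       # row label present, column label absent
    n01 = pos[None, :] - n11       # row label absent, column label present
    n00 = n - n11 - n10 - n01      # both labels absent
    numer = n11 * n00 - n10 * n01
    denom = np.sqrt(np.outer(pos * neg, pos * neg))
    # Assumption: a label that is constant in the data gets phi = 0 with
    # every other label, since the coefficient is undefined in that case.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, numer / safe, 0.0)

def select_meta_inputs(phi, threshold=0.5):
    """For each target label, list the labels whose base-model outputs
    would be fed to its second-level (stacked) model.

    Assumption: a label is kept when |phi| >= threshold; the target
    label's own base model is always kept.
    """
    keep = np.abs(phi) >= threshold
    np.fill_diagonal(keep, True)
    return [np.flatnonzero(row).tolist() for row in keep]

# Hypothetical toy data: labels 0 and 1 always co-occur, label 2 is
# only weakly (negatively) related to them.
Y = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
])

phi = phi_matrix(Y)
selected = select_meta_inputs(phi, threshold=0.5)
print(np.round(phi, 3))
print(selected)
```

In a stacked BR setup along these lines, the second-level model for label j would be trained only on the first-level outputs indexed by selected[j] rather than on all of them, which is where the reduction in computational cost would come from.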