A Meta-Learning Method to Select Under-Sampling Algorithms for Imbalanced Data Sets

Romero F. A. B. de Morais

Abstract

Imbalanced data sets are pervasive in real-world problems such as medical diagnosis. Learning from them poses its own challenges, because common classifiers assume a balanced class distribution in the data. Sampling techniques counter the imbalance by modifying the class distribution of the examples. Unfortunately, selecting a sampling technique together with its parameters is still an open problem. Current solutions include the brute-force approach (try as many techniques as possible) and the random search approach (choose the most appropriate technique from a random subset). In this work, we propose a new method to select sampling techniques for imbalanced data sets. It uses Meta-Learning and works by recommending a technique for an imbalanced data set based on solutions to previous problems. Our experiments compared the proposed method against the brute-force approach, all techniques with their default parameters, and the random search approach. The results show that the proposed method is comparable to the brute-force approach, outperforms the techniques with their default parameters most of the time, and always surpasses the random search approach.
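
The idea of recommending a technique from solutions to previous problems can be illustrated with a minimal sketch. It is not the method evaluated in the paper: the abstract does not specify the meta-features, the meta-learner, or the candidate techniques, so everything below is an assumption made for illustration. The sketch describes each data set by three hypothetical meta-features (log size, number of attributes, imbalance ratio), stores past data sets with the technique assumed to have worked best on them, and recommends by nearest-neighbour vote. The names `meta_features`, `recommend`, and `meta_base`, the technique labels, and the toy numbers are all placeholders.

```python
import math
from collections import Counter


def meta_features(X, y):
    """Describe a data set by three illustrative (assumed) meta-features."""
    counts = Counter(y)
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return (math.log10(len(X)), float(len(X[0])), imbalance_ratio)


def recommend(meta_base, query, k=1):
    """Vote among the k past data sets closest to the query in meta-feature space."""
    ranked = sorted(meta_base, key=lambda entry: math.dist(entry[0], query))
    votes = Counter(best for _, best in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy meta-base: (meta-features, technique assumed best on that past data set).
# The values are made up for illustration, not experimental results.
meta_base = [
    ((2.0, 5.0, 1.5), "random_under_sampling"),
    ((3.0, 10.0, 9.0), "tomek_links"),
    ((3.2, 12.0, 12.0), "tomek_links"),
    ((2.5, 8.0, 30.0), "one_sided_selection"),
]

# New imbalanced data set: 100 examples, 10 attributes, 90 majority vs 10 minority.
X = [[0.1 * i] * 10 for i in range(100)]
y = [1] * 10 + [0] * 90

query = meta_features(X, y)
print(recommend(meta_base, query, k=1))
```

In this toy setting the closest past data set shares the query's imbalance ratio, so its technique is recommended. A realistic version would scale the meta-features before measuring distance and would fill the meta-base by actually evaluating each candidate technique on each past data set.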

Article Details

How to Cite
F. A. B. DE MORAIS, Romero. A Meta-Learning Method to Select Under-Sampling Algorithms for Imbalanced Data Sets. BRACIS, [S.l.], dec. 2016. Available at: <http://143.54.25.88/index.php/bracis/article/view/116>. Date accessed: 19 sep. 2024. doi: https://doi.org/10.1235/bracis.vi.116.
Section
Articles