In:
IEEJ Transactions on Electrical and Electronic Engineering, Wiley, Vol. 13, No. 10 (2018-10), p. 1483-1491
Abstract:
This paper proposes a novel oversampling method for imbalanced data classification in which minority class samples are synthesized in a feature space, so that the generated samples do not fall into majority class regions. For this purpose, it introduces a multi‐linear feature space (MLFS) based on a quasi‐linear kernel, which is composed from a pretrained neural network (NN). By using the quasi‐linear kernel, the proposed MLFS oversampling method avoids both directly computing Euclidean distances among the samples when oversampling the minority class and explicitly mapping the samples to the high‐dimensional feature space, which makes it easy to apply to the classification of high‐dimensional datasets. Moreover, because the NN is used for kernel learning rather than representation learning, unsupervised learning, or even transfer learning, can easily be employed to pretrain the NN: a kernel is usually less dependent on a specific problem, so the imbalance problem need not be considered at the pretraining stage. Finally, a method is developed to generate the synthetic minority samples by computing the quasi‐linear kernel matrix rather than the very high‐dimensional MLFS feature vectors directly. The proposed MLFS oversampling method is applied to several real‐world datasets, including an image dataset, and simulation results confirm its effectiveness. © 2018 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
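The idea of oversampling through a kernel matrix rather than through explicit feature vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the quasi‐linear kernel, so an RBF kernel stands in for it, and the helper names (`rbf_kernel`, `kernel_matrix`, `oversample_kernel`) and the random choice of minority pairs are assumptions made for illustration only. Each synthetic point is a convex combination of two minority samples in feature space, and its kernel entries follow from the bilinearity of the inner product.

```python
import math
import random


def rbf_kernel(a, b, gamma=0.5):
    # Stand-in kernel (assumption): the paper's quasi-linear kernel,
    # built from a pretrained NN, is not specified in the abstract.
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_matrix(samples, kernel):
    # Full Gram matrix of the original samples.
    return [[kernel(a, b) for b in samples] for a in samples]


def oversample_kernel(K, minority_idx, n_new, rng):
    """Return K extended with n_new synthetic minority points.

    Each synthetic point is z = (1 - d) * phi(x_i) + d * phi(x_j) for two
    minority samples i and j (hypothetical pairing rule: uniform at random).
    The point is stored only as a weight vector over the original samples,
    so the feature map phi is never evaluated.
    """
    n = len(K)
    # Row r of W writes point r as a combination of original feature vectors.
    W = [[1.0 if c == r else 0.0 for c in range(n)] for r in range(n)]
    for _ in range(n_new):
        i, j = rng.sample(minority_idx, 2)
        d = rng.random()
        w = [0.0] * n
        w[i] = 1.0 - d
        w[j] = d
        W.append(w)
    # Augmented kernel K_aug = W K W^T, by bilinearity of the inner product.
    WK = [[sum(w[c] * K[c][k] for c in range(n)) for k in range(n)] for w in W]
    return [[sum(wk[k] * w2[k] for k in range(n)) for w2 in W] for wk in WK]


# Toy data (made up for illustration): three minority and four majority points.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
     [2.0, 2.0], [2.1, 1.9], [1.9, 2.2], [2.2, 2.1]]
minority = [0, 1, 2]

K = kernel_matrix(X, rbf_kernel)
K_aug = oversample_kernel(K, minority, n_new=4, rng=random.Random(0))
print(len(K_aug), len(K_aug[0]))  # 11 11
```

The augmented matrix could then be passed to any classifier that accepts a precomputed kernel, with the four appended rows labeled as minority samples.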
Type of Medium:
Online Resource
ISSN:
1931-4973, 1931-4981
DOI:
10.1002/tee.2018.13.issue-10
Language:
English
Publisher:
Wiley
Publication Date:
2018
ZDB ID:
2241861-1