GLORIA — GEOMAR Library Ocean Research Information Access

Online Resource

Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain

Althnian, Alhanoof ; AlSaeed, Duaa ; Al-Baity, Heyam ; [et al.]

MDPI AG ; 2021

In: Applied Sciences Vol. 11, No. 2 ( 2021-01-15), p. 796-

Details

In: Applied Sciences, MDPI AG, Vol. 11, No. 2 ( 2021-01-15), p. 796-

Abstract: Dataset size is considered a major concern in the medical domain, where lack of data is a common occurrence. This study aims to investigate the impact of dataset size on the overall performance of supervised classification models. We examined the performance of six widely used models in the medical field, namely support vector machine (SVM), neural network (NN), C4.5 decision tree (DT), random forest (RF), AdaBoost (AB), and naïve Bayes (NB), on eighteen small medical UCI datasets. We further implemented three dataset size reduction scenarios on two large datasets and analyzed the performance of the models when trained on each resulting dataset with respect to accuracy, precision, recall, F-score, specificity, and area under the ROC curve (AUC). Our results indicated that the overall performance of classifiers depends on how well a dataset represents the original distribution rather than on its size. Moreover, we found that the most robust models for limited medical data are AB and NB, followed by SVM, and then RF and NN, while the least robust model is DT. Furthermore, an interesting observation is that a model's robustness to limited data does not necessarily imply that it provides the best performance compared to other models.
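
Note: The threshold-based measures named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of the standard textbook definitions, not the authors' evaluation code; the function name `classification_metrics` and the example counts are hypothetical. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives,
    true negatives, false negatives (illustrative inputs).
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        # F-score here is F1, the harmonic mean of precision and recall.
        "f_score": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting specificity alongside recall matters for small, imbalanced medical datasets, where accuracy alone can look high even when one class is rarely predicted correctly.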

Type of Medium: Online Resource

ISSN: 2076-3417

DOI: 10.3390/app11020796

Language: English

Publisher: MDPI AG

Publication Date: 2021

ZDB-ID: 2704225-X
