In:
INFORMS Journal on Computing, Institute for Operations Research and the Management Sciences (INFORMS), Vol. 31, No. 2 (2019-04), p. 207-225
Abstract:
In this paper, we present a method for comparing and evaluating different collections of machine learning algorithms on the basis of a given performance measure (e.g., accuracy, area under the curve (AUC), F-score). Such a method can be used to compare standard machine learning platforms such as SAS, IBM SPSS, and Microsoft Azure ML. A recent trend in automation of machine learning is to exercise a collection of machine learning algorithms on a particular problem and then use the best performing algorithm. Thus, the proposed method can also be used to compare and evaluate different collections of algorithms for automation on a certain problem type and find the best collection. In the study reported here, we applied the method to compare six machine learning platforms – R, Python, SAS, IBM SPSS Modeler, Microsoft Azure ML, and Apache Spark ML. We compared the platforms on the basis of predictive performance on classification problems because a significant majority of the problems in machine learning are of that type. The general question that we addressed is the following: Are there platforms that are superior to others on some particular performance measure? For each platform, we used a collection of six classification algorithms from the following six families of algorithms – support vector machines, multilayer perceptrons, random forest (or variant), decision trees/gradient boosted trees, Naive Bayes/Bayesian networks, and logistic regression. We compared their performance on the basis of classification accuracy, F-score, and AUC. We used F-score and AUC measures to compare platforms on two-class problems only. For testing the platforms, we used a mix of data sets from (1) the University of California, Irvine (UCI) library, (2) the Kaggle competition library, and (3) high-dimensional gene expression problems. We performed some hyperparameter tuning on algorithms wherever possible. The online supplement is available at https://doi.org/10.1287/ijoc.2018.0825.
Type of Medium:
Online Resource
ISSN:
1091-9856, 1526-5528
DOI:
10.1287/ijoc.2018.0825
Language:
English
Publisher:
Institute for Operations Research and the Management Sciences (INFORMS)
Publication Date:
2019
ZDB-ID:
2070411-2, 2004082-9
SSG:
3,2