Development of Prediction Models for Shear Strength of Rockfill Material Using Machine Learning Techniques

Ahmad, Mahmood; Kamiński, Paweł; Olczak, Piotr; Alam, Muhammad; Iqbal, Muhammad Junaid; Ahmad, Feezan; Sasui, Sasui; Khan, Beenish Jehan

doi:10.3390/app11136167

¹ Department of Civil Engineering, University of Engineering and Technology Peshawar (Bannu Campus), Bannu 28100, Pakistan
² Faculty of Mining and Geoengineering, AGH University of Science and Technology, Mickiewicza 30 Av., 30-059 Kraków, Poland
³ Mineral and Energy Economy Research Institute, Polish Academy of Sciences, 7A Wybickiego St., 31-261 Cracow, Poland
⁴ Department of Civil Engineering, University of Engineering and Technology, Mardan 23200, Pakistan
⁵ State Key Laboratory of Coastal and Offshore Engineering, Dalian University of Technology, Dalian 116024, China
⁶ Department of Architectural Engineering, Chungnam National University, Daejeon 34134, Korea
⁷ Department of Civil Engineering, CECOS University of IT and Emerging Sciences, Peshawar 25000, Pakistan
* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(13), 6167; https://doi.org/10.3390/app11136167

Submission received: 24 May 2021 / Revised: 30 June 2021 / Accepted: 1 July 2021 / Published: 2 July 2021

(This article belongs to the Special Issue Trends and Prospects in Geotechnics)

Abstract: Supervised machine learning is increasingly used to predict the mechanical properties of rockfill material (RFM). This study investigates four supervised learning algorithms, namely support vector machine (SVM), random forest (RF), AdaBoost, and k-nearest neighbor (KNN), for the prediction of RFM shear strength. A total of 165 RFM case studies with 13 key material properties for rockfill characterization were used to construct and validate the models. The performance of the SVM, RF, AdaBoost, and KNN models is assessed using statistical parameters, including the coefficient of determination (R²), Nash–Sutcliffe efficiency (NSE) coefficient, root mean square error (RMSE), and ratio of the RMSE to the standard deviation of measured data (RSR). The four models are compared and discussed. The analysis of R² together with NSE, RMSE, and RSR for the RFM shear strength data set demonstrates that the SVM achieved the best prediction performance (R² = 0.9655, NSE = 0.9639, RMSE = 0.1135, and RSR = 0.1899), followed by the RF model (R² = 0.9545, NSE = 0.9542, RMSE = 0.1279, and RSR = 0.2140), the AdaBoost model (R² = 0.9390, NSE = 0.9388, RMSE = 0.1478, and RSR = 0.2474), and the KNN model (R² = 0.6233, NSE = 0.6180, RMSE = 0.3693, and RSR = 0.6181). Furthermore, the sensitivity analysis shows that normal stress was the key parameter affecting the shear strength of RFM.

Keywords: AdaBoost; support vector machine; k-nearest neighbor; random forest; rockfill materials; shear strength

1. Introduction

Rockfill materials (RFMs) are commonly used as fill in civil engineering projects such as rockfill dams, slopes, and embankments. The material is obtained either from a river’s alluvial deposits or by blasting available rock [1,2]. RFMs are widely used in the construction of rockfill dams because of their inherent flexibility, capacity to absorb large seismic energy, and adaptability to various foundation conditions. The behavior of RFMs used in rockfill dams is important for the safe and cost-effective construction of these structures. Generally, rockfill behaves like a Mohr–Coulomb material, albeit without cohesion and with relatively high internal friction angles. Crushed rockfill, loosely layered, can behave like coarse sand. The shear strength of both types of RFM is affected by many factors, such as mineral composition, surface structure, particle size, shape, relative density, and individual particle strength [3,4,5]. Because of the variable jointing, angularity/roundness, and rock particle size distribution, RFM can be considered the most complex material [6]. Extensive field and laboratory research is therefore essential for understanding RFM behavior and determining the shear strength parameters needed to design safe and cost-effective structures. An in situ direct shear system has been used to monitor the shear strength of RFM, as well as the variation in the shear strength of rockfill with the fill lift [7]. Linero [8] carried out large-scale shear resistance experiments to simulate the material’s original grain size distribution and the expected load level. RFM with a large particle size (maximum particle size of 1200 mm) is not suited to laboratory testing [9]. Because the test apparatus restricts the particle size that can be tested, it is difficult to design representative, realistic large-scale strength tests. Furthermore, determining the shear strength of RFM directly is considered a costly and difficult process. Large-scale shear tests are often time-consuming and complex, and estimating the nonlinear shear strength function without using an analytical method is difficult. As a result, several researchers have attempted to determine the mechanical properties of RFM using indirect methods based on machine learning (ML) techniques.

In recent years, several researchers have applied ML algorithms successfully in civil engineering and other sectors, such as environmental [10] and geotechnical [11,12,13,14,15,16,17,18] engineering and other fields of science [19,20,21,22,23,24,25,26,27,28]. Numerous researchers have documented the behavior of RFM. Marsal [3], Mirachi et al. [4], Venkatachalam [5], Gupta [29], Abbas [30], and Honkanadavar and Sharma [31] carried out laboratory experiments on different rockfill materials and concluded that the stress-strain behavior is nonlinear, inelastic, and dependent on the stress level. They also noted that the angle of internal friction increases with maximum particle size for riverbed rockfill material, whereas the reverse pattern is observed for quarried rockfill material. Frossard et al. [32] proposed a rational approach for assessing rockfill shear strength on the basis of size effects; Honkanadavar and Gupta [9] developed a power law relating the shear strength parameter to some index properties of riverbed RFM. The difficulty of describing the mechanical behavior of rockfill materials and the challenges of large-scale strength tests have prompted several approaches to modeling the behavior of such materials. In this context, the artificial neural network (ANN) approach used by Kaunda [33] needs fewer rockfill parameters and was found to be more efficient in predicting RFM shear strength. Zhou et al. [34] recently used cubist and random forest regression algorithms and found that both can deliver better predictions of RFM shear strength than ANN and conventional regression models. This field, however, remains open to further exploration. Considering that large-scale strength tests to characterize the shear strength are challenging, ML models based on the support vector machine (SVM), random forest (RF), AdaBoost, and k-nearest neighbor (KNN) algorithms are proposed here. Furthermore, these algorithms have demonstrated excellent prediction efficiency in a variety of fields [35,36,37,38,39] because of their generalization capability. According to the literature survey, however, their application in civil engineering, and particularly in the prediction of RFM shear strength, is limited.

The main aim of the present study is to explore the capability of the SVM, RF, AdaBoost, and KNN algorithms to establish a more precise and parsimonious model for predicting RFM shear strength. A critical review of the existing literature suggests that, despite the successful implementation of these techniques in various domains, their use in the prediction of RFM shear strength has scarcely been explored. A primary contribution of this study is that the data were divided into training and testing sets with due regard to statistical aspects such as the maximum, minimum, mean, and standard deviation. Splitting the data in this way allows the predictive capability and generalization performance of the established models to be determined and helps to evaluate them better. Additionally, a sensitivity analysis is carried out to find the main parameter influencing RFM shear strength. In short, the present study investigates and expands the scope of machine learning algorithms for modeling RFM shear strength, providing researchers with a basis for selecting suitable machine learning algorithms to improve the predictive performance for RFM shear strength.

The rest of this article is structured as follows: Section 2 describes the database, introduces the algorithms used in the proposed approach, and discusses the model evaluation metrics. The development of the SVM, RF, AdaBoost, and KNN models is described in Section 3. Section 4 is dedicated to the performance and comparison of the proposed models. Finally, Section 5 draws conclusions and outlines promising directions for future work.

2. Materials and Methods

2.1. Data Set

In this study, 165 RFM shear strength case histories acquired by Kaunda [33], presented in Table A1 in Appendix A, were used to develop and evaluate the proposed models. The case history data are summarized in Table 1, where D₁₀, D₃₀, D₆₀, and D₉₀ correspond to the 10%, 30%, 60%, and 90% passing sieve sizes (mm); C_c and C_u are the coefficients of curvature and uniformity, respectively; FM and GM are the fineness modulus and gradation modulus, respectively; R is the ISRM hardness rating; UCS_min and UCS_max are the minimum and maximum uniaxial compression strengths (MPa); γ is the dry unit weight (kN/m³); σ_n is the normal stress (MPa); and τ is the shear strength of RFM (MPa). The single output variable was the shear stress at failure of the test samples. The database was divided into a training set containing 80 percent of the data (132 cases) and a testing set containing 20 percent (33 cases). The testing set was used to determine when training should be stopped in order to avoid overfitting. To achieve a consistent data split, different combinations of training and testing sets were tried. The final selection was made such that the maximum (Max), minimum (Min), mean, and standard deviation of the parameters were consistent in the training and testing data sets (Table 2).
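The search for a statistically consistent split described above can be illustrated with a minimal sketch. It assumes a simple random search over candidate splits and scores a single column only; the function names, the scoring rule, and the stand-in data are hypothetical and are not taken from the study, which compared the statistics of all parameters.

```python
import random
import statistics


def split_score(train, test):
    """Sum of absolute gaps in min, max, mean and standard deviation."""
    return (
        abs(min(train) - min(test))
        + abs(max(train) - max(test))
        + abs(statistics.mean(train) - statistics.mean(test))
        + abs(statistics.stdev(train) - statistics.stdev(test))
    )


def consistent_split(values, test_fraction=0.2, n_trials=200, seed=0):
    """Try several random splits and keep the one with the closest statistics."""
    rng = random.Random(seed)
    n_test = round(len(values) * test_fraction)
    indices = list(range(len(values)))
    best = None
    for _ in range(n_trials):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        score = split_score(
            [values[i] for i in train_idx], [values[i] for i in test_idx]
        )
        if best is None or score < best[0]:
            best = (score, sorted(train_idx), sorted(test_idx))
    return best


# Hypothetical stand-in for one column of the database (165 values).
tau = [0.01 * i for i in range(1, 166)]
score, train_idx, test_idx = consistent_split(tau)
print(len(train_idx), len(test_idx), round(score, 4))
```

With 165 values and a test fraction of 0.2, the sketch reproduces the 132/33 case counts used in the study; extending the score to all 14 columns would be a straightforward sum over columns.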

2.2. Support Vector Machine

Boser, Guyon, and Vapnik were the first to formulate and introduce the support vector machine (SVM) [40]. In the case of non-separable data, the “ideal boundary” must be introduced to accommodate errors for certain objects i:

\begin{cases} \text{minimize } \left( \frac{1}{2} {\| δ \|}^{2} + C \sum_{i = 1}^{n} ξ_{i} \right) \\ \text{under the constraints } y_{i} (b + δ \cdot x_{i}) + ξ_{i} \geq 1 \text{ and } ξ_{i} \geq 0 \end{cases}

(1)

where C is the penalty parameter; δ and b are, respectively, the normal vector and the bias of the hyperplane; and each ξ_i is the distance between object i and the respective margin hyperplane [41,42].

Data are implicitly mapped to a higher-dimensional space through Mercer kernels, which can be decomposed into a dot product, K(x_i, x_j) = φ(x_i) · φ(x_j), in order to learn nonlinearly separable functions [42]. The widely used radial basis function (RBF) kernel is:

K (x_{i}, x_{j}) = \exp (- σ {\| x_{i} - x_{j} \|}^{2})

(2)

where σ is the kernel parameter.
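As a small illustration of Equation (2), the RBF kernel between two feature vectors can be evaluated as follows. This is a sketch only; the function name and the sample vectors are hypothetical, and the parameter `sigma` plays the role of the kernel parameter σ.

```python
import math


def rbf_kernel(x_i, x_j, sigma):
    """RBF kernel: exp(-sigma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sigma * squared_distance)


# Identical points give 1; the value decays towards 0 as points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 0.0], sigma=1.0)
print(same, apart)
```

A larger σ makes the kernel decay faster with distance, so each training point influences a smaller neighborhood.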

2.3. Random Forest

The random forest (RF) is built on a large collection of low-dimensional regression trees. The theoretical development of RF is described by Breiman [43]. RF is an example of ensemble learning, which requires building a large number of decision trees. In general, there are two types of decision trees: regression trees and classification trees. Regression trees were used in the RF model since the main goal of this analysis was to predict the shear strength of RFM. Figure 1 depicts a general architecture for RF analysis. The analysis can be divided into two stages:

Stage 1: To create a sequence of sub-data sets, the bootstrap statistical technique is used to randomly sample from the initial data set (training data). The forest is then built using regression trees based on these sub-data sets. Each tree is trained by choosing a set of variables at random (a fixed number of descriptive variables selected from the random subset). Two important parameters that can be adjusted during the training stage are the number of trees (ntree) and the number of variables (mtry).

Stage 2: Once the model has been trained, a prediction can be made. In an ensemble approach, input variables are evaluated for all regression trees first, and then the final output is calculated by measuring the average value of each individual tree’s prediction.
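The two stages can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the implementation used in the study: each "tree" is reduced to a single split (a stump), the function names are hypothetical, and the toy data are invented. The parameters `ntree` and `mtry` correspond to the number of trees and the number of randomly chosen variables mentioned above.

```python
import random


def fit_stump(X, y, feature):
    """One-split regression tree on one feature.

    Returns (sse, feature, threshold, left_mean, right_mean).
    """
    values = sorted({row[feature] for row in X})
    mean_y = sum(y) / len(y)
    best = (float("inf"), feature, None, mean_y, mean_y)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for row, t in zip(X, y) if row[feature] <= thr]
        right = [t for row, t in zip(X, y) if row[feature] > thr]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if sse < best[0]:
            best = (sse, feature, thr, lm, rm)
    return best


def fit_forest(X, y, ntree=50, mtry=1, seed=0):
    """Stage 1: bootstrap the training data and fit one stump per sample."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        candidates = rng.sample(range(p), mtry)  # random subset of variables
        forest.append(min((fit_stump(Xb, yb, f) for f in candidates),
                          key=lambda s: s[0]))
    return forest


def predict_forest(forest, row):
    """Stage 2: average the predictions of all trees."""
    total = 0.0
    for _, feature, thr, left_mean, right_mean in forest:
        if thr is None or row[feature] <= thr:
            total += left_mean
        else:
            total += right_mean
    return total / len(forest)


# Invented toy data: first column mimics normal stress, second is constant.
sigma = [0.1 * i for i in range(1, 11)]
X = [[s, 1.0] for s in sigma]
y = [0.8 * s for s in sigma]
forest = fit_forest(X, y, ntree=50, mtry=1, seed=0)
low = predict_forest(forest, [0.1, 1.0])
high = predict_forest(forest, [1.0, 1.0])
print(round(low, 3), round(high, 3))
```

In a full random forest each tree is grown much deeper and the random variable subset is redrawn at every split; the stump version only keeps the bootstrap and averaging logic visible.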

2.4. AdaBoost Algorithm

The sequential ensemble technique AdaBoost, or adaptive boosting, is based on the concept of developing many weak learners using different training sub-sets drawn at random from the original training data set. Weights are allocated to the instances during each training round, and these are used to learn each hypothesis. The weights are used to calculate the hypothesis error on the data set and are a measure of the relative importance of each instance. After each iteration, the weights are recalculated so that instances predicted wrongly by the previous hypothesis obtain higher weights. This allows the algorithm to concentrate on instances that are more difficult to predict. The algorithm’s most important task is to assign updated weights to instances that were wrongly predicted. In regression, each instance carries a real-valued error. The AdaBoost technique can mark the calculated error as an error or not an error by comparing it to a predefined threshold prediction error. Instances with a larger error on previous learners are more likely (i.e., have a higher probability) to be chosen for training the next base learner. Finally, an ensemble estimate of the individual base learner predictions is made using a weighted average or median [44].
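One reweighting round of a threshold-based boosting scheme can be sketched as follows. The source does not state the exact update rule, so the rule below (weights of well-predicted instances are multiplied by a power of the weighted error rate, then renormalized) is an assumption chosen to match the description above; the function names, the `power` parameter, and the numbers are hypothetical.

```python
import math


def reweight(weights, relative_errors, threshold, power=2):
    """One boosting round: shrink the weights of well-predicted instances.

    An instance counts as an error when its relative error exceeds the
    threshold. Returns the renormalized weights and the factor beta.
    """
    is_error = [e > threshold for e in relative_errors]
    error_rate = sum(w for w, bad in zip(weights, is_error) if bad)
    if error_rate <= 0.0 or error_rate >= 1.0:
        return list(weights), None  # nothing to rebalance in this round
    beta = error_rate ** power
    updated = [w if bad else w * beta for w, bad in zip(weights, is_error)]
    total = sum(updated)
    return [w / total for w in updated], beta


def combine(predictions, betas):
    """Weighted average of base learner predictions, weight = log(1 / beta)."""
    learner_weights = [math.log(1.0 / b) for b in betas]
    weighted = sum(w * p for w, p in zip(learner_weights, predictions))
    return weighted / sum(learner_weights)


# Four instances with equal weights; only the third is predicted badly.
new_weights, beta = reweight([0.25] * 4, [0.01, 0.02, 0.50, 0.03], threshold=0.1)
print([round(w, 4) for w in new_weights], beta)
ensemble = combine([1.0, 3.0], [0.5, 0.5])
```

After the update, the badly predicted instance carries most of the weight, so it is more likely to be drawn when training the next base learner.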

2.5. k-Nearest Neighbor

The supervised ML algorithm k-nearest neighbor (KNN) can be used to solve both classification and regression problems. It is, however, most commonly used in classification problems [45]. The output depends on whether KNN is used for classification or regression. In regression problems, the prediction for a query point is based on the k training samples closest to it in the feature space, and the predicted value is the mean of the target values of these k nearest neighbors. To locate the k nearest neighbors of a data point, a distance metric such as the Euclidean or Mahalanobis distance can be used [46,47].
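KNN regression as described above fits in a few lines. The sketch below assumes the Euclidean distance and unweighted averaging; the function name and the toy data are hypothetical.

```python
import math


def knn_predict(X_train, y_train, query, k=3):
    """Mean target of the k training points closest to the query."""
    by_distance = sorted(
        (math.dist(row, query), target) for row, target in zip(X_train, y_train)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / k


# Invented one-feature example: the far point (1.0) is ignored for k = 3.
X_train = [[0.1], [0.2], [0.3], [1.0]]
y_train = [0.1, 0.2, 0.3, 0.9]
prediction = knn_predict(X_train, y_train, [0.2], k=3)
print(round(prediction, 3))
```

Because the distance mixes all features, inputs on very different scales (for example UCS in MPa and D₁₀ in mm) would normally be rescaled before applying KNN; that step is omitted here.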

2.6. Performance Metric

The coefficient of determination (R²), Nash–Sutcliffe efficiency (NSE) coefficient, root mean square error (RMSE), and the ratio of the RMSE to the standard deviation of measured data (RSR) were taken into account to examine the predictive capacity of the models, as shown in Equations (3)–(6) [48,49,50]:

R^{2} = {\left[ \frac{\sum_{i = 1}^{n} (O_{i} - \bar{O}) (P_{i} - \bar{P})}{\sqrt{\sum_{i = 1}^{n} {(O_{i} - \bar{O})}^{2}} \sqrt{\sum_{i = 1}^{n} {(P_{i} - \bar{P})}^{2}}} \right]}^{2}

(3)

NSE = \frac{\sum_{i = 1}^{n} {(O_{i} - \bar{O})}^{2} - \sum_{i = 1}^{n} {(P_{i} - O_{i})}^{2}}{\sum_{i = 1}^{n} {(O_{i} - \bar{O})}^{2}}

(4)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(P_{i} - O_{i})}^{2}}

(5)

RSR = \frac{\sqrt{\sum_{i = 1}^{n} {(O_{i} - P_{i})}^{2}}}{\sqrt{\sum_{i = 1}^{n} {(O_{i} - \bar{O})}^{2}}}

(6)

where n is the number of observations under consideration, O_i is the ith observed value, \bar{O} is the mean observed value, P_i is the ith model-predicted value, and \bar{P} is the mean model-predicted value.

R-squared, also called the determination coefficient, describes how much of the variation in the data is explained by the fit. It normally ranges from 0 to 1. The model is considered to be efficient if the R² value is greater than 0.8 and is close to 1 [51]. The NSE is a normalized statistic that determines the relative magnitude of the residual variance compared with the variance of the measured data [52]. The NSE varies between −∞ and 1. When NSE = 1, the observed and predicted values match perfectly. Model predictive output with a range of 0.75 < NSE ≤ 1.00, 0.65 < NSE ≤ 0.75, 0.50 < NSE ≤ 0.65, 0.40 < NSE ≤ 0.50, or NSE ≤ 0.4 is graded as very good, good, acceptable, or unacceptable, respectively [53,54]. The RMSE is the square root of the mean of the squared deviations between the predicted and observed values over the n observations. The RMSE is greater than or equal to 0, where 0 indicates a statistically perfect fit to the observed data [55,56,57]. The RSR is the ratio of the RMSE to the standard deviation of the measured data. The RSR varies between an optimal value of 0 and a large positive value. A lower RSR corresponds to a lower RMSE, which indicates the model’s greater predictive efficiency. RSR classification ranges are described as very good, good, acceptable, and unacceptable with ranges of 0.00 ≤ RSR ≤ 0.50, 0.50 < RSR ≤ 0.60, 0.60 < RSR ≤ 0.70, and RSR > 0.70, respectively [53].
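The four metrics can be computed together from paired observed and predicted values. The sketch below takes R² as the squared correlation coefficient and RSR as the RMSE divided by the standard deviation of the observations, following the definitions given in the text; the function name and the sample values are hypothetical.

```python
import math


def metrics(observed, predicted):
    """Return R2, NSE, RMSE and RSR for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    ss_obs = sum((o - o_mean) ** 2 for o in observed)
    ss_pred = sum((p - p_mean) ** 2 for p in predicted)
    covariance = sum((o - o_mean) * (p - p_mean)
                     for o, p in zip(observed, predicted))
    ss_res = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return {
        "R2": (covariance / (math.sqrt(ss_obs) * math.sqrt(ss_pred))) ** 2,
        "NSE": (ss_obs - ss_res) / ss_obs,
        "RMSE": math.sqrt(ss_res / n),
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_obs),
    }


# Invented example values, not taken from the study.
m = metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print({name: round(value, 4) for name, value in m.items()})
```

With these definitions NSE = 1 − RSR², so the two metrics always rank models in the same order; this is a property of the formulas rather than an independent confirmation of model quality.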

3. Model Development to Predict RFM Shear Strength

The models for RFM shear strength prediction were developed, and all data processing was carried out, using Orange (version 3.13), a popular open-source environment for statistical computing and data visualization. Orange provides the most prevalent supervised learning algorithms. More information about input parameters, implementation, and references can be found in the package documentation.

The structure of the model was based on an input matrix defined by the predictor variables, x = {D₁₀, D₃₀, D₆₀, D₉₀, C_c, C_u, GM, FM, R, UCS_min, UCS_max, γ, σ_n}, and the output, also called the target variable (y), was the RFM shear strength. In every modeling process, achieving a consistent data division and an appropriate size of the training and testing data sets is the most important task. The statistical features of the data sets, such as the minimum, maximum, mean, and standard deviation, were therefore taken into account in the splitting process. The statistical consistency of the training and testing data sets improves the performance of the models and ultimately helps to evaluate them better. The models were built on 132 cases and tested on the remaining 33 cases. To fairly assess the predictive performance of the models, the data set used for testing was kept the same for all models.

To optimize the RFM shear strength prediction, all the models (AdaBoost, RF, SVM, and KNN) were tuned by trial and error in order to find the parameters giving the best prediction accuracy. Initial values were chosen for the tuning parameters of each model and then gradually varied over successive trials until the best fitness measures were achieved. Figure 2 shows the schematic diagram of the proposed methodology. The critical hyperparameters tuned for the AdaBoost, RF, SVM, and KNN algorithms are listed in Table 3 together with their definitions.
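The trial-and-error procedure amounts to looping over candidate hyperparameter values and keeping the one with the best fitness measure. The sketch below does this for the number of neighbors k of a KNN regressor, using RMSE on a held-out set as the fitness measure; the candidate values, function names, and data are hypothetical and do not reproduce the settings in Table 3.

```python
import math


def knn_predict(X_train, y_train, query, k):
    """Mean target of the k training points closest to the query."""
    by_distance = sorted(
        (math.dist(row, query), target) for row, target in zip(X_train, y_train)
    )
    return sum(target for _, target in by_distance[:k]) / k


def rmse(observed, predicted):
    """Root mean square error between paired values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def tune_k(X_train, y_train, X_val, y_val, candidates):
    """Evaluate every candidate k and return the best one with all scores."""
    scores = {}
    for k in candidates:
        predictions = [knn_predict(X_train, y_train, q, k) for q in X_val]
        scores[k] = rmse(y_val, predictions)
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Invented data: target proportional to a single feature.
X_train = [[0.1 * i] for i in range(1, 11)]
y_train = [0.8 * row[0] for row in X_train]
X_val = [[0.15], [0.55], [0.95]]
y_val = [0.8 * row[0] for row in X_val]
best_k, scores = tune_k(X_train, y_train, X_val, y_val, candidates=[1, 2, 3, 5])
print(best_k, {k: round(v, 4) for k, v in scores.items()})
```

The same loop applies to any of the four algorithms by swapping the model and the parameter being varied; with several hyperparameters the loop becomes a grid over their combinations.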

4. Results and Discussion

In this study, R², NSE, RMSE, and RSR are chosen as the criteria for assessing model performance. The database is split into a training data set and a testing data set to evaluate the performance of the presented models. To make a fair comparison, all the models are developed on the same RFM shear strength training and testing data sets. Figure 3 displays the scatter plot of the actual and predicted RFM shear strength for the training phase. For the training data set, the analysis of R² together with NSE, RMSE, and RSR demonstrates that the SVM achieved the best prediction performance (R² = 0.9655, NSE = 0.9639, RMSE = 0.1135, and RSR = 0.1899), followed by the RF model (R² = 0.9545, NSE = 0.9542, RMSE = 0.1279, and RSR = 0.2140), the AdaBoost model (R² = 0.9390, NSE = 0.9388, RMSE = 0.1478, and RSR = 0.2474), and the KNN model (R² = 0.6233, NSE = 0.6180, RMSE = 0.3693, and RSR = 0.6181).

Figure 4 plots the predicted RFM shear strength against the actual RFM shear strength. For the test data set, all models demonstrated very good predictive potential (R² > 0.8) with the exception of KNN, which displayed worse results (R² = 0.6304). The R² results show that the SVM, RF, and AdaBoost models are appropriate, whereas KNN is not; the SVM model performed best, with the highest R² value (0.9656), followed by the RF (0.9181) and AdaBoost (0.8951) models. In comparison to the other models, the KNN model presented the worst estimates with maximum dispersion (Figure 4).

In addition, ranking the models by NSE from highest to lowest predictive strength gives SVM (0.9654) > RF (0.9164) > AdaBoost (0.8835) > KNN (0.6076), the same order as for R². With regard to RMSE, the SVM model also had the best predictive ability, with the lowest RMSE (0.0513), followed by RF (0.0797), AdaBoost (0.0941), and KNN (0.1727).

Finally, the reliability of all applied models was divided into four groups based on RSR values: unsatisfactory, satisfactory, good, and very good, with ranges of RSR > 0.70, 0.60 < RSR ≤ 0.70, 0.50 < RSR ≤ 0.60, and 0.00 ≤ RSR ≤ 0.50, respectively. The RSR values therefore indicate very good results for all the established models except the KNN model, whose performance is considered satisfactory. Figure 5 depicts bar graphs comparing R², NSE, RMSE, and RSR for the training and testing data sets of all the models. The R² defines the degree of collinearity between the predicted and actual data. The RMSE gives more weight to large errors than to small errors. A lower RSR indicates a lower RMSE and hence better predictive efficiency. The SVM model has the highest R² and NSE and the lowest RMSE and RSR values, revealing that it is preferable for predicting the RFM shear strength for the testing data. For the test data, the SVM achieved a better prediction performance (R² = 0.9655, RMSE = 0.0513, and mean absolute error (MAE) = 0.0184) than the cubist method (R² = 0.9645, RMSE = 0.0975, and MAE = 0.0644) reported by Zhou et al. [34] and the ANN method (R² = 0.9386, RMSE = 0.1320, and MAE = 0.0841) reported by Kaunda [33]. Additionally, the agreement between measured and calculated values of shear strength obtained with the linear regression method reported by Andjelkovic et al. [58] (R² = 0.836) was lower than that of the proposed SVM model. In general, the SVM algorithm generalizes well and is reliable, and larger data sets can yield better prediction results.

In the present research, a sensitivity analysis was also conducted using Yang and Zang’s [59] method to evaluate the influence of input parameters on RFM shear strength. This approach has been used in several studies [60,61,62,63] and is formulated as:

r_{i j} = \frac{\sum_{m = 1}^{n} (y_{i m} \times y_{o m})}{\sqrt{\sum_{m = 1}^{n} y_{i m}^{2} \sum_{m = 1}^{n} y_{o m}^{2}}}

(7)

where n is the number of data values (this study used 132 data values) and y_im and y_om are the input and output parameters. The r_ij value ranges from zero to one for each input parameter, and the highest r_ij value indicates the input parameter with the strongest influence on the output parameter (RFM shear strength in this study). The r_ij values for all input parameters are presented in Figure 6. It can be seen from Figure 6 that σ_n has an r_ij of 0.990. Similar sensitivity analyses on RFM shear strength were carried out by Kaunda [33] and Zhou et al. [34]. Their findings demonstrated that normal stress is the most sensitive factor, which agrees with the present results.
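Equation (7) is straightforward to evaluate for one input column at a time. The sketch below applies it to σ_n and τ for cases 1 to 3 of Table A1 purely as an illustration; the function name is hypothetical, and three cases are far too few to reproduce the value reported in Figure 6.

```python
import math


def sensitivity(input_values, output_values):
    """r_ij of Equation (7) for one input parameter and the output."""
    numerator = sum(a * b for a, b in zip(input_values, output_values))
    denominator = math.sqrt(
        sum(a * a for a in input_values) * sum(b * b for b in output_values)
    )
    return numerator / denominator


# Normal stress and shear strength (MPa) for cases 1-3 of Table A1.
sigma_n = [0.022, 0.044, 0.088]
tau = [0.013, 0.025, 0.049]
r = sensitivity(sigma_n, tau)
print(round(r, 4))
```

For non-negative data the measure lies between 0 and 1 and equals 1 when the output is exactly proportional to the input, which is why nearly proportional pairs such as these give a value close to 1.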

Despite the fact that the proposed model produces desirable prediction results, certain limitations should be addressed in the future.

(1): Like other machine learning methods, the SVM, RF, AdaBoost, and KNN models have the major disadvantage of being sensitive to the quality of the data set. Generally, if the data set is small, the generalization and reliability of the model are affected. Nevertheless, the SVM, RF, and AdaBoost algorithms, unlike KNN, worked well with the limited data set of 165 cases. The prediction performance could be better on a larger data set. Furthermore, the developed models can always be updated to yield better results as new data become available.
(2): Other indicators, such as the Los Angeles abrasion value and lithology, may also influence the prediction of RFM shear strength. It would therefore be worthwhile to analyze the influence of these indicators on the prediction results in order to improve performance.

5. Conclusions

This study employed and examined the SVM, RF, AdaBoost, and KNN algorithms in the RFM shear strength prediction problem. To construct and validate a new model on the basis of the aforementioned algorithms, a comprehensive database containing 165 RFM case studies was collected from the available literature. Thirteen different predictive variables for rockfill characterization were selected as the input variables: D₁₀ (mm), D₃₀ (mm), D₆₀ (mm), D₉₀ (mm), C_c, C_u, GM, FM, R, UCS_min (MPa), γ (kN/m³), UCS_max (MPa), and σ_n (MPa). The predictive performance of the proposed models is verified and compared. The conclusions can be outlined as follows:

(1): In the test data set, the SVM model (R² = 0.9656, NSE = 0.9654, RMSE = 0.0513, and RSR = 0.1861) achieved higher prediction efficiency than RF (R² = 0.9181, NSE = 0.9164, RMSE = 0.0797, and RSR = 0.2891), AdaBoost (R² = 0.8951, NSE = 0.8835, RMSE = 0.0941, and RSR = 0.3414), and KNN (R² = 0.6304, NSE = 0.6076, RMSE = 0.1727, and RSR = 0.6264). Because the same methodology (the same training and test data sets) was used to build all models, the comparison is fair, and the SVM model gave the best performance. This implies that this algorithm is more robust than the others in RFM shear strength prediction.
(2): The performance (in terms of R²) on the test data set for the SVM, RF, and AdaBoost algorithms falls in the range of 0.8951–0.9656 with 13 input variables. The results indicate that it is rational and feasible to estimate the shear strength of RFM from the gradation, particle size, dry unit weight (γ), material hardness, FM, and normal stress (σ_n).
(3): Sensitivity analysis results revealed that normal stress (σ_n) was the key parameter affecting the shear strength of RFM.

The findings show that the SVM model is a useful and accurate artificial intelligence technique for predicting RFM shear strength and can be used in various fields. Further, to generalize the proposed approach and improve its performance, more experimental data should be collected in future research. Finally, RFM shear strength prediction using advanced machine learning algorithms (i.e., deep learning) is left as a future research topic.

Author Contributions

Conceptualization, M.A. (Mahmood Ahmad); P.K. and P.O.; methodology, M.A. (Mahmood Ahmad), P.K.; software, M.A. (Mahmood Ahmad) and F.A.; validation, P.O., F.A., M.A. (Mahmood Ahmad), M.A. (Muhammad Alam) and F.A.; formal analysis, M.A. (Mahmood Ahmad); investigation, M.A. (Mahmood Ahmad), F.A., P.K., M.J.I. and S.S.; resources, P.K. and P.O.; data curation, M.A. (Mahmood Ahmad), M.A. (Muhammad Alam) and M.J.I.; writing—original draft preparation, M.A. (Mahmood Ahmad); writing—review and editing, M.A. (Mahmood Ahmad), B.J.K., F.A. and S.S.; supervision, P.K. and P.O.; project administration, P.K.; funding acquisition, P.K. and P.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Notation

ANN	Artificial neural network
AdaBoost	Adaptive boosting
KNN	k-nearest neighbor
NSE	Nash–Sutcliffe efficiency coefficient
R²	Coefficient of determination
RF	Random forest
RFM	Rockfill material
RMSE	Root mean square error
RSR	Ratio of RMSE to the standard deviation of the measured data
ISRM	International Society for Rock Mechanics
SVM	Support vector machine
D₁₀	Sieve size at 10 percent passing
D₃₀	Sieve size at 30 percent passing
D₆₀	Sieve size at 60 percent passing
D₉₀	Sieve size at 90 percent passing
C_c	Coefficient of curvature
C_u	Coefficient of uniformity
GM	Gradation modulus
FM	Fineness modulus
R	ISRM hardness rating
UCS_min	Minimum uniaxial compression strength
γ	Dry unit weight
UCS_max	Maximum uniaxial compression strength
σ_n	Normal stress
τ	Shear strength
φ	Angle of internal friction

Appendix A

Table A1. Rockfill shear strength database.

Case No.	Location	D₁₀ (mm)	D₃₀ (mm)	D₆₀ (mm)	D₉₀ (mm)	C_c	C_u	GM	FM	R	UCS_min (MPa)	UCS_max (MPa)	γ (kN/m³)	σ_n (MPa)	τ (MPa)
1	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.022	0.013
2	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.044	0.025
3	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.088	0.049
4	Canada	0.03	2.1	6.6	18	22.27	220	4.22	4.73	1	1	5	38.9	0.022	0.013
5	Canada	0.03	2.1	6.6	18	22.27	220	4.22	4.73	1	1	5	38.9	0.044	0.024
6	Canada	0.03	2.1	6.6	18	22.27	220	4.22	4.73	1	1	5	38.9	0.088	0.048
7	Canada	0.09	0.92	3.2	10	2.94	35.56	5	3.94	1	1	5	37	0.022	0.014
8	Canada	0.09	0.92	3.2	10	2.94	35.56	5	3.94	1	1	5	37	0.044	0.027
9	Canada	0.09	0.92	3.2	10	2.94	35.56	5	3.94	1	1	5	37	0.088	0.053
10	U.K.	1	6	19	29	1.89	19	2.61	6.36	5	100	250	19.62	0.059	0.163
11	U.K.	1	6	19	29	1.89	19	2.61	6.36	5	100	250	19.62	0.098	0.218
12	U.K.	1	6	19	29	1.89	19	2.61	6.36	5	100	250	19.62	0.198	0.367
13	U.K.	1	6	19	29	1.89	19	2.61	6.36	5	100	250	19.62	0.299	0.513
14	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.058	0.15
15	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.097	0.204
16	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.195	0.33
17	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.297	0.456
18	U.K.	1	6	19	29	1.89	19	2.65	6.36	5	100	250	19.62	0.179	0.262
19	U.K.	1	6	19	29	1.89	19	2.65	6.36	5	100	250	19.62	0.538	0.697
20	U.K.	1	6	19	29	1.89	19	2.65	6.36	5	100	250	19.62	0.887	0.112
21	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.177	0.245
22	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.529	0.666
23	U.K.	0.3	3.2	16	30	2.13	53.33	3.37	5.74	5	100	250	18.0504	0.876	0.102
24	Iran	0.1	1.2	7.5	17.3	1.92	75	4.32	7.42	4	50	100	9.3195	0.101	0.16
25	Iran	0.1	1.2	7.5	17.3	1.92	75	4.32	7.42	4	50	100	9.3195	0.301	0.34
26	Iran	0.1	1.2	7.5	17.3	1.92	75	4.32	7.42	4	50	100	9.3195	0.503	0.5
27	Iran	0.4	2.8	11	30	1.78	27.5	3.38	5.6	4	50	100	9.3195	0.172	0.207
28	Iran	0.4	2.8	11	30	1.78	27.5	3.38	5.6	4	50	100	9.3195	0.497	0.476
29	Iran	0.4	2.8	11	30	1.78	27.5	3.38	5.6	4	50	100	9.3195	0.83	0.751
30	Japan	1.3	4.6	15	32	1.09	11.54	2.68	6.28	5	100	250	17.81496	0.094	0.136
31	Japan	1.3	4.6	15	32	1.09	11.54	2.68	6.28	5	100	250	17.81496	0.177	0.242
32	Japan	1.3	4.6	15	32	1.09	11.54	2.68	6.28	5	100	250	17.81496	0.351	0.415
33	Japan	1.3	4.6	15	32	1.09	11.54	2.68	6.28	5	100	250	17.81496	0.512	0.552
34	Japan	0.6	2	16	30	0.42	26.67	3.24	5.86	4	50	100	21.8763	0.093	0.165
35	Japan	0.6	2	16	30	0.42	26.67	3.24	5.86	4	50	100	21.8763	0.182	0.308
36	Japan	0.6	2	16	30	0.42	26.67	3.24	5.86	4	50	100	21.8763	0.359	0.523
37	Japan	0.6	2	16	30	0.42	26.67	3.24	5.86	4	50	100	21.8763	0.535	0.744
38	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	0.177	0.214
39	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	0.514	0.525
40	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	0.839	0.773
41	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	1.172	1.07
42	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	1.494	1.312
43	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	21	1.97	1.648
44	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	20.8	0.18	0.24
45	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	20.8	0.5	0.447
46	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	20.8	0.821	0.689
47	Iran	0.4	2.9	9.7	31	2.17	24.25	3.41	5.57	4	50	100	20.8	1.142	0.93
48	Iran	0.5	2.8	9.7	30	1.62	19.4	3.43	5.61	5	100	250	21.1	0.487	0.39
49	Iran	0.5	2.8	9.7	30	1.62	19.4	3.43	5.61	5	100	250	21.1	0.972	0.766
50	Iran	0.5	2.8	9.7	30	1.62	19.4	3.43	5.61	5	100	250	21.1	1.448	1.11
51	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	0.168	0.157
52	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	0.373	0.634
53	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	0.731	1.088
54	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	0.906	1.258
55	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	1.262	1.699
56	Iran	0.2	2.5	19.4	42.2	1.61	97	3.19	5.77	5	100	250	21	1.437	1.883
57	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	0.092	0.14
58	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	0.179	0.23
59	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	0.344	0.357
60	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	0.514	0.52
61	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	0.859	0.887
62	Iran	0.4	3.3	10.3	33.3	2.64	25.75	3.32	5.64	4	50	100	21.8	1.186	1.149
63	Iran	1.2	2.1	4.2	25.3	0.88	3.5	3.63	5.34	4	50	100	21.8	0.092	0.147
64	Iran	1.2	2.1	4.2	25.3	0.88	3.5	3.63	5.34	4	50	100	21.8	0.178	0.22
65	Iran	1.2	2.1	4.2	25.3	0.88	3.5	3.63	5.34	4	50	100	21.8	0.503	0.461
66	Iran	1.2	2.1	4.2	25.3	0.88	3.5	3.63	5.34	4	50	100	21.8	1.148	0.959
67	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21	0.34	0.332
68	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21	0.99	0.843
69	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21	1.618	1.271
70	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21	2.399	1.799
71	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21.5	0.342	0.342
72	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21.5	0.994	0.865
73	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21.5	1.63	1.321
74	Iran	0.4	2.9	9.7	31	2.17	24.25	3.4	5.53	4	50	100	21.5	2.422	1.891
75	Australia	33.9	42.4	50	60.2	1.06	1.47	0.2	8.8	5	100	250	21.7	0.163	0.23
76	Australia	33.9	42.4	50	60.2	1.06	1.47	0.2	8.8	5	100	250	21.7	0.215	0.275
77	Australia	33.9	42.4	50	60.2	1.06	1.47	0.2	8.8	5	100	250	21.7	0.412	0.424
78	Australia	30	34	40.8	50	0.94	1.36	0.52	8.5	5	100	250	21.7	0.165	0.252
79	Australia	30	34	40.8	50	0.94	1.36	0.52	8.5	5	100	250	21.7	0.215	0.28
80	Australia	30	34	40.8	50	0.94	1.36	0.52	8.5	5	100	250	21.7	0.412	0.424
81	Germany	4	11.7	36.2	98.2	0.95	9.05	1.47	7.55	4	50	100	24.2	1.039	1.114
82	Germany	4	11.7	36.2	98.2	0.95	9.05	1.47	7.55	4	50	100	24.2	2.034	1.964
83	Germany	4	11.7	36.2	98.2	0.95	9.05	1.47	7.55	4	50	100	24.2	3.004	2.705
84	Germany	3	9.1	30.4	98.2	0.91	10.13	1.67	7.28	4	50	100	24.2	0.533	0.658
85	Germany	3	9.1	30.4	98.2	0.91	10.13	1.67	7.28	4	50	100	24.2	1.039	1.114
86	Germany	3	9.1	30.4	98.2	0.91	10.13	1.67	7.28	4	50	100	24.2	2.018	1.882
87	Germany	4.2	12.8	41.2	99	0.95	9.81	1.37	7.62	4	50	100	24.2	0.512	0.512
88	Germany	4.2	12.8	41.2	99	0.95	9.81	1.37	7.62	4	50	100	24.2	1.001	0.902
89	Germany	4.2	12.8	41.2	99	0.95	9.81	1.37	7.62	4	50	100	24.2	1.987	1.728
90	USA	0.9	3	18.8	99	0.53	20.89	2.64	6.35	5	100	250	21.7	0.861	0.898
91	USA	0.9	3	18.8	99	0.53	20.89	2.64	6.35	5	100	250	21.7	1.67	1.509
92	USA	0.9	3	18.8	99	0.53	20.89	2.64	6.35	5	100	250	21.7	4.049	3.198
93	U.K.	0.44	1.5	6.99	27.5	0.73	15.89	3.82	5.16	4	50	100	18.7	0.159	0.189
94	U.K.	0.44	1.5	6.99	27.5	0.73	15.89	3.82	5.16	4	50	100	18.7	0.471	0.424
95	U.K.	0.44	1.5	6.99	27.5	0.73	15.89	3.82	5.16	4	50	100	18.7	1.13	0.905
96	Iran	0.4	2.3	12.2	44.4	1.08	30.5	3.3	5.69	4	50	100	26.2	0.815	0.66
97	Iran	0.4	2.3	12.2	44.4	1.08	30.5	3.3	5.69	4	50	100	18.7	0.794	0.577
98	Iran	0.4	2.3	12.2	44.4	1.08	30.5	3.3	5.69	5	100	250	24.5	0.994	0.864
99	India	0.5	1.5	4.6	15.6	0.98	9.2	4.22	4.8	4	50	100	24.5	1.384	0.881
100	India	0.95	2.8	12.5	34.9	0.66	13.16	3.02	5.97	4	50	100	24.5	1.369	0.836
101	India	1.3	4.6	18.9	55.9	0.86	14.54	2.4	6.54	4	50	100	24.5	1.358	0.803
102	Iran (multiple)	0.5	3	10.4	31.2	1.73	20.8	3.36	5.63	4	50	100	26.2	1.056	0.625
103	Iran (multiple)	0.4	2.8	9.2	30.1	2.13	23	3.42	5.53	4	50	100	24.5	0.501	0.451
104	Iran (multiple)	0.4	2.8	9.2	30.1	2.13	23	3.42	5.53	4	50	100	24.5	0.986	0.827
105	Iran (multiple)	0.4	2.8	9.2	30.1	2.13	23	3.42	5.53	4	50	100	24.5	1.479	1.241
106	Iran (multiple)	0.5	3.3	10.2	31	2.14	20.4	3.28	5.7	5	100	250	24.5	0.485	0.379
107	Iran (multiple)	0.5	3.3	10.2	31	2.14	20.4	3.28	5.7	5	100	250	24.5	0.808	0.631
108	Iran (multiple)	0.5	3.3	10.2	31	2.14	20.4	3.28	5.7	5	100	250	24.5	1.131	0.884
109	Iran (multiple)	0.4	2.8	10.4	31.2	1.88	26	3.4	5.58	4	50	100	18.7	0.808	0.631
110	Iran (multiple)	0.4	2.8	10.4	31.2	1.88	26	3.4	5.58	4	50	100	18.7	1.131	0.884
111	USA	2.4	19.3	80.1	100	1.94	33.38	1.32	7.72	6	250	400	25.6	0.85	0.836
112	USA	2.4	19.3	80.1	100	1.94	33.38	1.32	7.72	6	250	400	25.6	1.695	1.637
113	USA	2.4	19.3	80.1	100	1.94	33.38	1.32	7.72	6	250	400	25.6	4.205	3.921
114	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	0.241	0.183
115	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	0.468	0.316
116	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	0.921	0.582
117	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	0.265	0.313
118	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	0.511	0.506
119	Iran (multiple)	0.01	1	10.4	43.9	9.62	1040	4	4.93	4	50	100	24.2	1.001	0.902
120	USA	0.2	0.56	1.2	2.6	0.1	6	6	3	4	50	100	16.1	0.021	0.029
121	USA	0.2	0.56	1.2	2.6	0.1	6	6	3	4	50	100	16.1	0.042	0.051
122	USA	0.2	0.56	1.2	2.6	0.1	6	6	3	4	50	100	16.1	0.068	0.071
123	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.054	0.005
124	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.089	0.028
125	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.11	0.049
126	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.152	0.067
127	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.191	0.081
128	India	0.1	1.3	6.5	15	2.6	65	4.42	4.55	5	100	250	19.9	0.24	0.092
129	India	0.1	1	6.2	17	1.61	62	4.5	4.44	5	100	250	22.3	0.706	0.94
130	India	0.1	1	6.2	17	1.61	62	4.5	4.44	5	100	250	22.3	1.31	1.296
131	India	0.1	1	6.2	17	1.61	62	4.5	4.44	5	100	250	22.3	1.868	1.536
132	India	0.2	2.9	12.3	32	3.42	61.5	3.46	5.53	5	100	250	22.3	0.702	0.903
133	India	0.2	2.9	12.3	32	3.42	61.5	3.46	5.53	5	100	250	22.3	1.305	1.25
134	India	0.2	2.9	12.3	32	3.42	61.5	3.46	5.53	5	100	250	22.3	1.862	1.486
135	India	0.4	4.4	21.2	59.8	2.28	53	2.74	6.27	5	100	250	22.3	0.697	0.862
136	India	0.4	4.4	21.2	59.8	2.28	53	2.74	6.27	5	100	250	22.3	1.283	1.167
137	India	0.4	4.4	21.2	59.8	2.28	53	2.74	6.27	5	100	250	22.3	1.819	1.358
138	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.002	0.007
139	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.02	0.052
140	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.032	0.072
141	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.054	0.095
142	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.111	0.168
143	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.162	0.217
144	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.209	0.259
145	Australia	27.1	32.6	41.3	53	0.95	1.52	0.57	8.5	5	100	250	15.3	0.401	0.409
146	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.003	0.008
147	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.021	0.062
148	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.035	0.089
149	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.058	0.116
150	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.115	0.191
151	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.155	0.206
152	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.209	0.259
153	Australia	20.7	26.7	32.8	53	1.05	1.58	0.89	8.2	5	100	250	15.3	0.394	0.401
154	Thailand	3.1	7.8	22	46.4	0.89	7.1	1.98	7.01	4	50	100	21	0.833	0.745
155	Thailand	3.1	7.8	22	46.4	0.89	7.1	1.98	7.01	4	50	100	21	1.649	1.407
156	Thailand	3.1	7.8	22	46.4	0.89	7.1	1.98	7.01	4	50	100	21	2.451	2.01
157	Thailand	3.1	7.8	22	46.4	0.89	7.1	1.98	7.01	4	50	100	21	3.223	2.492
158	Thailand	3.5	7.1	19.8	45.7	0.73	5.66	2.03	6.98	4	50	100	21	0.808	0.631
159	Thailand	3.5	7.1	19.8	45.7	0.73	5.66	2.03	6.98	4	50	100	21	1.592	1.169
160	Thailand	3.5	7.1	19.8	45.7	0.73	5.66	2.03	6.98	4	50	100	21	2.338	1.576
161	Thailand	3.5	7.1	19.8	45.7	0.73	5.66	2.03	6.98	4	50	100	21	3.14	2.178
162	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.014	0.028
163	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.028	0.048
164	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.055	0.082
165	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.108	0.143
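
The C_u and C_c columns of Table A1 can be cross-checked against the sieve sizes. The sketch below assumes the standard soil-mechanics definitions C_u = D₆₀/D₁₀ and C_c = D₃₀²/(D₁₀·D₆₀), which are not stated explicitly in this appendix; the function names are illustrative.

```python
def coefficient_of_uniformity(d10, d60):
    """C_u = D60 / D10 (sieve sizes in mm)."""
    return d60 / d10

def coefficient_of_curvature(d10, d30, d60):
    """C_c = D30^2 / (D10 * D60) (sieve sizes in mm)."""
    return d30 ** 2 / (d10 * d60)

# (D10, D30, D60) in mm, copied from Table A1
case_1 = (0.02, 0.94, 4.0)   # Canada: tabulated C_u = 200, C_c = 11.05
case_10 = (1.0, 6.0, 19.0)   # U.K.: tabulated C_u = 19, C_c = 1.89

for d10, d30, d60 in (case_1, case_10):
    cu = coefficient_of_uniformity(d10, d60)
    cc = coefficient_of_curvature(d10, d30, d60)
    print(f"C_u = {cu:.3f}, C_c = {cc:.3f}")
```

Both cases agree with the tabulated values to within rounding, which suggests these are the definitions behind the two columns.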

References

Aghaei, A.A.; Soroush, A.; Reyhani, M. Large-scale triaxial testing and numerical modeling of rounded and angular rockfill materials. Sci. Iran. Trans. A Civ. Eng. 2010, 17, 169–183.
Varadarajan, A.; Sharma, K.; Venkatachalam, K.; Gupta, A. Testing and modeling two rockfill materials. J. Geotech. Geoenviron. Eng. 2003, 129, 206–218.
Marsal, R.J. Large scale testing of rockfill materials. J. Soil Mech. Found. Div. 1967, 93, 27–43.
Marachi, N.D. Strength and Deformation Characteristics of Rockfill Materials. Ph.D. Thesis, University of California, Berkeley, CA, USA, 1969.
Venkatachalam, K. Prediction of Mechanical Behaviour of Rockfill Materials. Ph.D. Thesis, Indian Institute of Technology, Delhi, India, 1993.
Leps, T.M. Review of shearing strength of rockfill. J. Soil Mech. Found. Div. 1970, 96, 1159–1170.
Liu, S.-H. Application of in situ direct shear device to shear strength measurement of rockfill materials. Water Sci. Eng. 2009, 2, 48–57.
Linero, S.; Palma, C.; Apablaza, R. Geotechnical characterisation of waste material in very high dumps with large scale triaxial testing. In Proceedings of the 2007 International Symposium on Rock Slope Stability in Open Pit Mining and Civil Engineering; Australian Centre for Geomechanics: Perth, Australia, 2007; pp. 59–75.
Honkanadavar, N.; Gupta, S. Prediction of shear strength parameters for prototype riverbed rockfill material using index properties. In Proceedings of the Indian Geotechnical Conference, Mumbai, India, 16–18 December 2010; pp. 335–338.
Froemelt, A.; Dürrenmatt, D.J.; Hellweg, S. Using data mining to assess environmental impacts of household consumption behaviors. Environ. Sci. Technol. 2018, 52, 8467–8478.
Ahmad, M.; Tang, X.-W.; Qiu, J.-N.; Gu, W.-J.; Ahmad, F. A hybrid approach for evaluating CPT-based seismic soil liquefaction potential using Bayesian belief networks. J. Cent. South Univ. 2020, 27, 500–516.
Ahmad, M.; Tang, X.-W.; Qiu, J.-N.; Ahmad, F. Evaluating Seismic Soil Liquefaction Potential Using Bayesian Belief Network and C4.5 Decision Tree Approaches. Appl. Sci. 2019, 9, 4226.
Ahmad, M.; Tang, X.; Qiu, J.; Ahmad, F.; Gu, W. LLDV-a Comprehensive framework for assessing the effects of liquefaction land damage potential. In Proceedings of the 2019 IEEE 14th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Dalian, China, 14–16 November 2019; pp. 527–533.
Ahmad, M.; Tang, X.-W.; Qiu, J.-N.; Ahmad, F.; Gu, W.-J. A step forward towards a comprehensive framework for assessing liquefaction land damage vulnerability: Exploration from historical data. Front. Struct. Civ. Eng. 2020, 14, 1476–1491.
Ahmad, M.; Tang, X.; Ahmad, F. Evaluation of liquefaction-induced settlement using random forest and REP tree models: Taking Pohang earthquake as a case of illustration. In Natural Hazards-Impacts, Adjustments & Resilience; IntechOpen: London, UK, 2020.
Ahmad, M.; Al-Shayea, N.A.; Tang, X.-W.; Jamal, A.; Al-Ahmadi, H.M.; Ahmad, F. Predicting the pillar stability of underground mines with random trees and C4.5 decision trees. Appl. Sci. 2020, 10, 6486.
Pirhadi, N.; Tang, X.; Yang, Q.; Kang, F. A new equation to evaluate liquefaction triggering using the response surface method and parametric sensitivity analysis. Sustainability 2019, 11, 112.
Pirhadi, N.; Tang, X.; Yang, Q. Energy evaluation of triggering soil liquefaction based on the response surface method. Appl. Sci. 2019, 9, 694.
Mosavi, A.; Shirzadi, A.; Choubin, B.; Taromideh, F.; Hosseini, F.S.; Borji, M.; Shahabi, H.; Salvati, A.; Dineva, A.A. Towards an ensemble machine learning model of random subspace based functional tree classifier for snow avalanche susceptibility mapping. IEEE Access 2020, 8, 145968–145983.
Mosavi, A.; Golshan, M.; Janizadeh, S.; Choubin, B.; Melesse, A.M.; Dineva, A.A. Ensemble models of GLM, FDA, MARS, and RF for flood and erosion susceptibility mapping: A priority assessment of sub-basins. Geocarto Int. 2020, 1–20.
Mosavi, A.; Hosseini, F.S.; Choubin, B.; Goodarzi, M.; Dineva, A.A. Groundwater salinity susceptibility mapping using classifier ensemble and Bayesian machine learning models. IEEE Access 2020, 8, 145564–145576.
Mosavi, A.; Sajedi-Hosseini, F.; Choubin, B.; Taromideh, F.; Rahi, G.; Dineva, A.A. Susceptibility mapping of soil water erosion using machine learning models. Water 2020, 12, 1995.
Mosavi, A.; Hosseini, F.S.; Choubin, B.; Abdolshahnejad, M.; Gharechaee, H.; Lahijanzadeh, A.; Dineva, A.A. Susceptibility Prediction of Groundwater Hardness Using Ensemble Machine Learning Models. Water 2020, 12, 2770.
Mosavi, A.; Hosseini, F.S.; Choubin, B.; Taromideh, F.; Ghodsi, M.; Nazari, B.; Dineva, A.A. Susceptibility mapping of groundwater salinity using machine learning models. Environ. Sci. Pollut. Res. 2021, 28, 10804–10817.
Mosavi, A.; Hosseini, F.S.; Choubin, B.; Goodarzi, M.; Dineva, A.A.; Sardooi, E.R. Ensemble boosting and bagging based machine learning models for groundwater potential prediction. Water Resour. Manag. 2021, 35, 23–37.
Choubin, B.; Borji, M.; Hosseini, F.S.; Mosavi, A.; Dineva, A.A. Mass wasting susceptibility assessment of snow avalanches using machine learning models. Sci. Rep. 2020, 10, 18363.
Lovrić, M.; Pavlović, K.; Žuvela, P.; Spataru, A.; Lučić, B.; Kern, R.; Wong, M.W. Machine learning in prediction of intrinsic aqueous solubility of drug-like compounds: Generalization, complexity, or predictive ability? J. Chemom. 2021, e3349.
Lovrić, M.; Meister, R.; Steck, T.; Fadljević, L.; Gerdenitsch, J.; Schuster, S.; Schiefermüller, L.; Lindstaedt, S.; Kern, R. Parasitic resistance as a predictor of faulty anodes in electro galvanizing: A comparison of machine learning, physical and hybrid models. Adv. Model. Simul. Eng. Sci. 2020, 7, 1–16.
Gupta, A.K. Constitutive Modelling of Rockfill Materials. Ph.D. Thesis, Indian Institute of Technology, Delhi, India, 2000.
Abbas, S.; Varadarajan, A.; Sharma, K. Prediction of shear strength parameter of prototype rockfill material. IGC-2003 Roorkee 2003, 1, 5–8.
Honkanadavar, N.; Sharma, K. Testing and modeling the behavior of riverbed and blasted quarried rockfill materials. Int. J. Geomech. 2014, 14, 04014028.
Frossard, E.; Hu, W.; Dano, C.; Hicher, P.-Y. Rockfill shear strength evaluation: A rational method based on size effects. Géotechnique 2012, 62, 415–427.
Kaunda, R. Predicting shear strengths of mine waste rock dumps and rock fill dams using artificial neural networks. Int. J. Min. Mineral. Eng. 2015, 6, 139–171.
Zhou, J.; Li, E.; Wei, H.; Li, C.; Qiao, Q.; Armaghani, D.J. Random forests and cubist algorithms for predicting shear strengths of rockfill materials. Appl. Sci. 2019, 9, 1621.
Alkhatib, K.; Najadat, H.; Hmeidi, I.; Shatnawi, M.K.A. Stock price prediction using k-nearest neighbor (kNN) algorithm. Int. J. Bus. Humanit. Technol. 2013, 3, 32–44.
Vijayan, V.; Ravikumar, A. Study of data mining algorithms for prediction and diagnosis of diabetes mellitus. Int. J. Comput. Appl. 2014, 95, 12–16.
Thongkam, J.; Xu, G.; Zhang, Y. AdaBoost algorithm with random forests for predicting breast cancer survivability. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–8 June 2008; pp. 3062–3069.
Samui, P.; Sitharam, T.; Contadakis, M. Machine learning modelling for predicting soil liquefaction susceptibility. Nat. Hazards Earth Syst. Sci. 2011, 11, 1–9.
Pal, M. Support vector machines-based modelling of seismic liquefaction potential. Int. J. Numer. Anal. Methods Geomech. 2006, 30, 983–996.
Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A training algorithm for optimal margin classifiers. In Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, USA, 27–29 July 1992; pp. 144–152.
Zhou, J.; Li, X.; Shi, X. Long-term prediction model of rockburst in underground openings using heuristic algorithms and support vector machines. Saf. Sci. 2012, 50, 629–644.
Zhou, J.; Li, X.; Wang, S.; Wei, W. Identification of large-scale goaf instability in underground mine using particle swarm optimization and support vector machine. Int. J. Min. Sci. Technol. 2013, 23, 701–707.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Seo, D.K.; Kim, Y.H.; Eo, Y.D.; Park, W.Y.; Park, H.C. Generation of radiometric, phenological normalized image based on random forest regression for change detection. Remote Sens. 2017, 9, 1163.
Maillo, J.; Ramírez, S.; Triguero, I.; Herrera, F. kNN-IS: An iterative spark-based design of the k-nearest neighbors classifier for big data. Knowl. Based Syst. 2017, 117, 3–15.
Chomboon, K.; Chujai, P.; Teerarassamee, P.; Kerdprasop, K.; Kerdprasop, N. An empirical study of distance metrics for k-nearest neighbor algorithm. In Proceedings of the 3rd International Conference on Industrial Application Engineering, Kitakyushu, Japan, 28–30 March 2015; pp. 280–285.
Prasath, V.; Alfeilat, H.A.A.; Hassanat, A.; Lasassmeh, O.; Tarawneh, A.S.; Alhasanat, M.B.; Salman, H.S.E. Distance and similarity measures effect on the performance of K-nearest neighbor classifier—A review. arXiv 2017, arXiv:1708.04321.
Farooq, F.; Nasir Amin, M.; Khan, K.; Rehan Sadiq, M.; Faisal Javed, M.; Aslam, F.; Alyousef, R. A comparative study of random forest and genetic engineering programming for the prediction of compressive strength of High Strength Concrete (HSC). Appl. Sci. 2020, 10, 7330.
Golmohammadi, G.; Prasher, S.; Madani, A.; Rudra, R. Evaluating three hydrological distributed watershed models: MIKE-SHE, APEX, SWAT. Hydrology 2014, 1, 20–39.
Nhu, V.-H.; Shahabi, H.; Nohani, E.; Shirzadi, A.; Al-Ansari, N.; Bahrami, S.; Miraki, S.; Geertsema, M.; Nguyen, H. Daily water level prediction of Zrebar Lake (Iran): A comparison between M5P, random forest, random tree and reduced error pruning trees algorithms. ISPRS Int. J. Geo. Inf. 2020, 9, 479.
Gandomi, A.H.; Babanajad, S.K.; Alavi, A.H.; Farnam, Y. Novel approach to strength modeling of concrete under triaxial compression. J. Mater. Civ. Eng. 2012, 24, 1132–1143.
Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol. 1970, 10, 282–290.
Khosravi, K.; Mao, L.; Kisi, O.; Yaseen, Z.M.; Shahid, S. Quantifying hourly suspended sediment load using data mining models: Case study of a glacierized Andean catchment in Chile. J. Hydrol. 2018, 567, 165–179.
Moriasi, D.N.; Arnold, J.G.; Van Liew, M.W.; Bingner, R.L.; Harmel, R.D.; Veith, T.L. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE 2007, 50, 885–900.
Koopialipoor, M.; Fallah, A.; Armaghani, D.J.; Azizi, A.; Mohamad, E.T. Three hybrid intelligent models in estimating flyrock distance resulting from blasting. Eng. Comput. 2019, 35, 243–256.
Asteris, P.G.; Tsaris, A.K.; Cavaleri, L.; Repapis, C.C.; Papalou, A.; Di Trapani, F.; Karypidis, D.F. Prediction of the fundamental period of infilled RC frame structures using artificial neural networks. Comput. Intell. Neurosci. 2016.
Koopialipoor, M.; Armaghani, D.J.; Hedayat, A.; Marto, A.; Gordan, B. Applying various hybrid intelligent systems to evaluate and predict slope stability under static and dynamic conditions. Soft Comput. 2019, 23, 5913–5929.
Andjelkovic, V.; Pavlovic, N.; Lazarevic, Z.; Radovanovic, S. Modelling of shear strength of rockfills used for the construction of rockfill dams. Soils Found. 2018, 58, 881–893.
Yang, Y.; Zhang, Q. A hierarchical analysis for rock engineering using artificial neural networks. Rock Mech. Rock Eng. 1997, 30, 207–222.
Faradonbeh, R.S.; Armaghani, D.J.; Abd Majid, M.; Tahir, M.M.; Murlidhar, B.R.; Monjezi, M.; Wong, H. Prediction of ground vibration due to quarry blasting based on gene expression programming: A new model for peak particle velocity prediction. Int. J. Environ. Sci. Technol. 2016, 13, 1453–1464.
Chen, W.; Hasanipanah, M.; Rad, H.N.; Armaghani, D.J.; Tahir, M. A new design of evolutionary hybrid optimization of SVR model in predicting the blast-induced ground vibration. Eng. Comput. 2021, 37, 1455–1471.
Rad, H.N.; Bakhshayeshi, I.; Jusoh, W.A.W.; Tahir, M.; Foong, L.K. Prediction of flyrock in mine blasting: A new computational intelligence approach. Nat. Resour. Res. 2020, 29, 609–623.
Ahmad, M.; Hu, J.-L.; Ahmad, F.; Tang, X.-W.; Amjad, M.; Iqbal, M.J.; Asim, M.; Farooq, A. Supervised learning methods for modeling concrete compressive strength prediction at high temperature. Materials 2021, 14, 1983.

Figure 1. Schematic representation of RF analysis.

Figure 2. The flowchart of the methodology.

Figure 3. Scatter plots of actual vs. predicted RFM shear strength in training stage: (a) SVM, (b) RF, (c) AdaBoost, and (d) KNN.

Figure 4. Scatter plots of actual vs. predicted RFM shear strength in testing stage: (a) SVM, (b) RF, (c) AdaBoost, and (d) KNN.

Figure 5. Comparison of R², NSE, RMSE, and RSR values from the SVM, RF, AdaBoost, and KNN models in (a) training phase and (b) testing phase.

Figure 6. Sensitivity analysis results.

Table 1. Rockfill materials shear strength case history data.

Case No.	Location	D₁₀/mm	D₃₀/mm	D₆₀/mm	D₉₀/mm	C_c	C_u	GM	FM	R	UCS_min/MPa	UCS_max/MPa	γ/kN·m⁻³	σ_n/MPa	τ/MPa
1	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.022	0.013
2	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.044	0.025
3	Canada	0.02	0.94	4	18	11.05	200	4.78	4.19	1	1	5	15.4	0.088	0.049
…	…	…	…	…	…	…	…	…	…	…	…	…	…	…	…
163	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.028	0.048
164	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.055	0.082
165	Netherlands	11	15	23	32	0.89	2.09	1.48	7.48	5	100	250	16.8	0.108	0.143

Table 2. Statistical parameters of the training and testing data sets.

Parameter	Data Set	Min Value	Max Value	Mean	Standard Deviation
D₁₀ (mm)	Training	0.010	33.900	4.857	9.179
D₁₀ (mm)	Testing	0.010	33.900	2.887	7.453
D₃₀ (mm)	Training	0.560	42.400	8.465	10.577
D₃₀ (mm)	Testing	0.560	42.400	5.442	9.050
D₆₀ (mm)	Training	1.200	80.100	19.287	15.135
D₆₀ (mm)	Testing	1.200	50.000	14.252	10.349
D₉₀ (mm)	Training	2.600	100.000	40.386	22.018
D₉₀ (mm)	Testing	2.600	99.000	38.091	24.289
C_c	Training	0.100	22.270	2.199	3.075
C_c	Testing	0.100	22.270	3.226	4.492
C_u	Training	1.360	1040.000	53.324	156.064
C_u	Testing	1.470	1040.000	134.510	294.958
GM	Training	0.200	6.000	2.788	1.243
GM	Testing	0.200	6.000	3.365	1.331
FM	Training	3.000	8.800	6.250	1.261
FM	Testing	3.000	8.800	5.709	1.374
R	Training	1.000	6.000	4.364	0.910
R	Testing	1.000	5.000	4.182	1.131
UCS_min (MPa)	Training	1.000	250.000	75.045	39.230
UCS_min (MPa)	Testing	1.000	100.000	68.273	32.444
UCS_max (MPa)	Training	5.000	400.000	170.682	88.010
UCS_max (MPa)	Testing	5.000	250.000	159.545	87.957
γ (kN/m³)	Training	9.320	38.900	20.766	4.605
γ (kN/m³)	Testing	9.320	38.900	20.932	5.854
σ_n (MPa)	Training	0.002	4.205	0.729	0.780
σ_n (MPa)	Testing	0.021	3.223	0.756	0.816
τ (MPa)	Training	0.005	3.921	0.660	0.662
τ (MPa)	Testing	0.024	2.492	0.668	0.619

Table 3. Hyperparameter optimization results.

Algorithm	Hyperparameter	Explanation	Optimal Value
AdaBoost	Number of estimators	Number of trees	2
AdaBoost	Learning rate	Degree to which newly acquired information overrides previously acquired information	0.1
AdaBoost	Boosting algorithm	Updates the weight of the base estimator with probability estimates (SAMME.R) or classification results (SAMME)	SAMME
AdaBoost	Regression loss function	Linear/square/exponential	Linear
RF	Number of trees	Number of trees in the forest	15
RF	Limit depth of individual trees	Maximum depth to which the trees are grown	3
SVM	Cost (C)	Penalty term for loss; applies to both classification and regression tasks	8
SVM	Regression loss epsilon (ε)	Distance between true and predicted values within which no penalty is applied	0.1
SVM	Kernel type	Function that transforms the attribute space into a new feature space to fit the maximum-margin hyperplane; the available kernels are linear, polynomial, RBF, and sigmoid	RBF
KNN	Number of neighbors	Number of nearest neighbors	5
KNN	Metric	Distance parameter: Euclidean/Manhattan/Chebyshev/Mahalanobis	Euclidean
KNN	Weight	Uniform: all points in each neighborhood are weighted equally; distance: closer neighbors of a query point have a greater influence than neighbors further away	Uniform
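
To make the KNN settings in Table 3 concrete (5 neighbors, Euclidean metric, uniform weights), the following is a minimal pure-Python sketch of a k-nearest-neighbor regressor. It is not the authors' implementation: the function name `knn_predict` is hypothetical, and for brevity only normal stress σ_n is used as the input feature, whereas the study uses 13 inputs. The training pairs are cases 154–161 (Thailand) from Table A1.

```python
import math

def knn_predict(train_x, train_y, query, k=5):
    """Predict by averaging the targets of the k nearest training points.

    Distances are Euclidean and all k neighbors carry equal (uniform) weight.
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

# sigma_n (MPa) as a one-element feature vector, tau (MPa) as the target;
# values copied from cases 154-161 of Table A1.
sigma_n = [(0.833,), (1.649,), (2.451,), (3.223,), (0.808,), (1.592,), (2.338,), (3.14,)]
tau = [0.745, 1.407, 2.01, 2.492, 0.631, 1.169, 1.576, 2.178]

# Hypothetical query: shear strength at a normal stress of 1.6 MPa.
tau_hat = knn_predict(sigma_n, tau, query=(1.6,))
print(f"predicted tau = {tau_hat:.4f} MPa")
```

With uniform weights the prediction is a plain average, so distant neighbors pull the estimate as strongly as close ones; the distance-weighted option listed in Table 3 would reduce that effect.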

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

