Variable: fNR

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             20
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             4.672807
R squared:                        0.8641326
OOB RMSE:                         2.162

Variable importance (top 15):
  Lai_avg                       210.5710
  M13RB3ALT                     203.1728
  NCluster_1_AF_1km             198.1328
  GPMIMERGALT                   182.3191
  MMOD4avg                      176.2156
  af_agg_30cm_AWCpF23__M_1km    160.3095
  af_agg_30cm_TAWCpF23__M_1km   154.6684
  M13RB1A04                     154.5297
  NCluster_19_AF_1km            149.1198
  M43BSALT                      148.5649
  SLTPPT_M_agg30cm_AF_1km       144.0797
  CRFVOL_M_agg30cm_AF_1km       144.0794
  PET                           143.3862
  ENTENV3                       137.8375
  K_M_agg30cm_AF_1km            137.6526

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 162, 162, 162
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      2.425931  1
  0.3  2          100      2.425675  1
  0.3  2          150      2.425675  1
  0.3  3           50      2.444060  1
  0.3  3          100      2.443705  1
  0.3  3          150      2.443705  1
  0.3  4           50      2.654540  1
  0.3  4          100      2.654296  1
  0.3  4          150      2.654296  1
  0.3  5           50      2.330494  1
  0.3  5          100      2.330179  1
  0.3  5          150      2.330179  1
  0.3  6           50      2.276201  1
  0.3  6          100      2.275701  1
  0.3  6          150      2.275701  1
  0.3  7           50      3.036777  1
  0.3  7          100      3.036542  1
  0.3  7          150      3.036542  1
  0.3  8           50      2.439459  1
  0.3  8          100      2.439249  1
  0.3  8          150      2.439249  1
  0.4  2           50      2.268432  1
  0.4  2          100      2.268432  1
  0.4  2          150      2.268432  1
  0.4  3           50      2.302722  1
  0.4  3          100      2.302721  1
  0.4  3          150      2.302721  1
  0.4  4           50      2.773044  1
  0.4  4          100      2.773044  1
  0.4  4          150      2.773044  1
  0.4  5           50      2.640358  1
  0.4  5          100      2.640358  1
  0.4  5          150      2.640358  1
  0.4  6           50      1.868391  1
  0.4  6          100      1.868391  1
  0.4  6          150      1.868391  1
  0.4  7           50      3.066985  1
  0.4  7          100      3.066984  1
  0.4  7          150      3.066984  1
  0.4  8           50      2.519071  1
  0.4  8          100      2.519070  1
  0.4  8          150      2.519070  1
  0.5  2           50      1.496433  1
  0.5  2          100      1.496433  1
  0.5  2          150      1.496433  1
  0.5  3           50      3.143188  1
  0.5  3          100      3.143188  1
  0.5  3          150      3.143188  1
  0.5  4           50      1.729344  1
  0.5  4          100      1.729344  1
  0.5  4          150      1.729344  1
  0.5  5           50      2.973623  1
  0.5  5          100      2.973623  1
  0.5  5          150      2.973623  1
  0.5  6           50      2.328293  1
  0.5  6          100      2.328293  1
  0.5  6          150      2.328293  1
  0.5  7           50      2.140677  1
  0.5  7          100      2.140677  1
  0.5  7          150      2.140677  1
  0.5  8           50      2.600661  1
  0.5  8          100      2.600661  1
  0.5  8          150      2.600661  1

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 2, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 1.496  R2: 1

XGBoost variable importance:
                        Feature      Gain     Cover Frequency
1:  af_agg_30cm_AWCpF23__M_1km 0.6550504 0.6538462 0.6538462
2: af_agg_30cm_TAWCpF23__M_1km 0.3449496 0.3461538 0.3461538
(rows 3-15: NA; only two features received nonzero importance, so the top-15 printout is padded with NA)

Ensemble validation RMSE: 1.919  R2: 0.999
--------------------------------------
Variable: fPR

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             8
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             0.08260831
R squared:                        0.1292402
OOB RMSE:                         0.287

Variable importance (top 15):
  af_BDRICM_T__M_1km                 0.5145925
  M43WSALT                           0.4582392
  M13RB3A04                          0.4347164
  CRFVOL_M_agg30cm_AF_1km            0.4301633
  REDL14                             0.3913628
  ECN_M_agg30cm_AF_1km               0.3637455
  ENTENV3                            0.3590912
  M13RB3ALT                          0.3553882
  Sorghum_rainfed_intermed_baseline  0.3431506
  yGapSorghum                        0.3410478
  MANMCF5                            0.3384627
  BIO1ALT                            0.3378400
  NCluster_17_AF_1km                 0.3363356
  NCluster_12_AF_1km                 0.3176117
  K_M_agg30cm_AF_1km                 0.3127619

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 162, 162, 162
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2           50      0.1893527  0.7762238
  0.3  2          100      0.1893527  0.7762238
  0.3  2          150      0.1893527  0.7762238
  0.3  3           50      0.1605534  0.7762238
  0.3  3          100      0.1605534  0.7762238
  0.3  3          150      0.1605534  0.7762238
  0.3  4           50      0.1511921  0.7762238
  0.3  4          100      0.1511921  0.7762238
  0.3  4          150      0.1511921  0.7762238
  0.3  5           50      0.1972418  0.7762238
  0.3  5          100      0.1972418  0.7762238
  0.3  5          150      0.1972418  0.7762238
  0.3  6           50      0.1916338  0.7762238
  0.3  6          100      0.1916338  0.7762238
  0.3  6          150      0.1916338  0.7762238
  0.3  7           50      0.1520646  0.7762238
  0.3  7          100      0.1520646  0.7762238
  0.3  7          150      0.1520646  0.7762238
  0.3  8           50      0.1742277  0.7762238
  0.3  8          100      0.1742277  0.7762238
  0.3  8          150      0.1742277  0.7762238
  0.4  2           50      0.1793417  0.7762238
  0.4  2          100      0.1793417  0.7762238
  0.4  2          150      0.1793417  0.7762238
  0.4  3           50      0.2093956  0.7762238
  0.4  3          100      0.2093956  0.7762238
  0.4  3          150      0.2093956  0.7762238
  0.4  4           50      0.1447457  0.7762238
  0.4  4          100      0.1447457  0.7762238
  0.4  4          150      0.1447457  0.7762238
  0.4  5           50      0.2063902  0.7762238
  0.4  5          100      0.2063902  0.7762238
  0.4  5          150      0.2063902  0.7762238
  0.4  6           50      0.1716541  0.7762238
  0.4  6          100      0.1716541  0.7762238
  0.4  6          150      0.1716541  0.7762238
  0.4  7           50      0.1935563  0.7762238
  0.4  7          100      0.1935563  0.7762238
  0.4  7          150      0.1935563  0.7762238
  0.4  8           50      0.1847403  0.7762238
  0.4  8          100      0.1847403  0.7762238
  0.4  8          150      0.1847403  0.7762238
  0.5  2           50      0.1859101  0.7762238
  0.5  2          100      0.1859109  0.7762238
  0.5  2          150      0.1859118  0.7762238
  0.5  3           50      0.1791163  0.7762238
  0.5  3          100      0.1791171  0.7762238
  0.5  3          150      0.1791180  0.7762238
  0.5  4           50      0.1439660  0.7762238
  0.5  4          100      0.1439666  0.7762238
  0.5  4          150      0.1439674  0.7762238
  0.5  5           50      0.1962033  1.0000000
  0.5  5          100      0.1962042  1.0000000
  0.5  5          150      0.1962050  1.0000000
  0.5  6           50      0.1439742  0.7762238
  0.5  6          100      0.1439748  0.7762238
  0.5  6          150      0.1439756  0.7762238
  0.5  7           50      0.1834947  0.7762238
  0.5  7          100      0.1834957  0.7762238
  0.5  7          150      0.1834965  0.7762238
  0.5  8           50      0.2371453  1.0000000
  0.5  8          100      0.2371453  1.0000000
  0.5  8          150      0.2371453  1.0000000

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 4, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 0.144  R2: 0.776

XGBoost variable importance:
                        Feature       Gain     Cover Frequency
1:  af_agg_30cm_AWCpF23__M_1km 0.71585998 0.5454545 0.5454545
2:                 AAIavg_GYGA 0.26631486 0.2727273 0.2727273
3: af_agg_30cm_TAWCpF23__M_1km 0.01782517 0.1818182 0.1818182
(rows 4-15: NA)

Ensemble validation RMSE: 0.209  R2: 0.545
--------------------------------------
Variable: fKR

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             2.392915
R squared:                        -0.7288387
OOB RMSE:                         1.547

Variable importance (top 15):
  VW1MOD1avg            6.427328
  GPMIMERGALT           5.573757
  MAXENV3               5.322609
  yGapSorghum           4.487883
  AAIavg_GYGA           4.381383
  Al_M_agg30cm_AF_1km   4.268731
  Water_balance         3.978185
  Wdvi                  3.958375
  Zn_M_agg30cm_AF_1km   3.747678
  GAEZ_ET               3.664601
  P_M_agg30cm_AF_1km    3.652944
  B14CHE3               3.546727
  BIO1ALT               3.509092
  NCluster_M_AF_1km     3.477146
  fPR_SorghumTrials     3.404805

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 163, 162, 161
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      1.209553  0.5785635
  0.3  2          100      1.209622  0.5785635
  0.3  2          150      1.209622  0.5785635
  0.3  3           50      1.152243  0.5785635
  0.3  3          100      1.152243  0.5785635
  0.3  3          150      1.152243  0.5785635
  0.3  4           50      1.108437  0.5785635
  0.3  4          100      1.108477  0.5785635
  0.3  4          150      1.108477  0.5785635
  0.3  5           50      1.122304  0.5785635
  0.3  5          100      1.122380  0.5785635
  0.3  5          150      1.122380  0.5785635
  0.3  6           50      1.112275  0.5785635
  0.3  6          100      1.112311  0.5785635
  0.3  6          150      1.112311  0.5785635
  0.3  7           50      1.283292  0.5785635
  0.3  7          100      1.283415  0.5785635
  0.3  7          150      1.283415  0.5785635
  0.3  8           50      1.128434  0.5785635
  0.3  8          100      1.128454  0.5785635
  0.3  8          150      1.128454  0.5785635
  0.4  2           50      1.061531  0.5785635
  0.4  2          100      1.061531  0.5785635
  0.4  2          150      1.061531  0.5785635
  0.4  3           50      1.135295  0.5785635
  0.4  3          100      1.135295  0.5785635
  0.4  3          150      1.135295  0.5785635
  0.4  4           50      1.067817  0.5785635
  0.4  4          100      1.067817  0.5785635
  0.4  4          150      1.067817  0.5785635
  0.4  5           50      1.139136  0.5785635
  0.4  5          100      1.139136  0.5785635
  0.4  5          150      1.139136  0.5785635
  0.4  6           50      1.146593  0.5785635
  0.4  6          100      1.146593  0.5785635
  0.4  6          150      1.146593  0.5785635
  0.4  7           50      1.120269  0.5785635
  0.4  7          100      1.120269  0.5785635
  0.4  7          150      1.120269  0.5785635
  0.4  8           50      1.177948  0.5785635
  0.4  8          100      1.177948  0.5785635
  0.4  8          150      1.177948  0.5785635
  0.5  2           50      1.158217  0.5785635
  0.5  2          100      1.158217  0.5785635
  0.5  2          150      1.158217  0.5785635
  0.5  3           50      1.160264  0.5785635
  0.5  3          100      1.160264  0.5785635
  0.5  3          150      1.160264  0.5785635
  0.5  4           50      1.092881  0.5785635
  0.5  4          100      1.092881  0.5785635
  0.5  4          150      1.092881  0.5785635
  0.5  5           50      1.166668  0.5785635
  0.5  5          100      1.166668  0.5785635
  0.5  5          150      1.166668  0.5785635
  0.5  6           50      1.322120  0.5785635
  0.5  6          100      1.322120  0.5785635
  0.5  6          150      1.322120  0.5785635
  0.5  7           50      1.106120  0.5785635
  0.5  7          100      1.106120  0.5785635
  0.5  7          150      1.106120  0.5785635
  0.5  8           50      1.098582  0.5785635
  0.5  8          100      1.098582  0.5785635
  0.5  8          150      1.098582  0.5785635

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 2, eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 1.062  R2: 0.579

XGBoost variable importance:
                       Feature         Gain      Cover  Frequency
1:                 AAIavg_GYGA 0.7031680839 0.66717511 0.66666667
2:          af_BDRICM_T__M_1km 0.1646448958 0.07413057 0.07407407
3:      af_agg_30cm_PWP__M_1km 0.1076861098 0.03698902 0.03703704
4:  af_agg_30cm_AWCpF23__M_1km 0.0153343971 0.16610738 0.16666667
5:         Al_M_agg30cm_AF_1km 0.0087662481 0.03706528 0.03703704
6:                     B04CHE3 0.0004002652 0.01853264 0.01851852
(rows 7-15: NA)

Ensemble validation RMSE: 1.39  R2: 0.854
--------------------------------------
Variable: fNRec

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      45
Number of independent variables:  377
Mtry:                             18
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             42.56502
R squared:                        0.8786719
OOB RMSE:                         6.524

Variable importance (top 15):
  af_agg_30cm_AWCpF23__M_1km                      884.9369
  OC_M_agg30cm_AF_1km                             531.8681
  BIO1ALT                                         474.6042
  af_agg_30cm_TAWCpF23__M_1km                     438.6698
  PRSCHE3                                         393.6380
  PCHE3avg                                        388.3338
  Water_balance                                   376.3854
  LSTD_avgIRI_Jul2002_Sep2016_mosaicLAEA_celsius  339.4872
  AAIavg_GYGA                                     334.3338
  af_agg_30cm_TAWCpF23mm__M_1km                   325.6297
  VW1MOD1avg                                      296.6799
  M02MOD4                                         283.9430
  af_agg_30cm_TETAs__M_1km                        271.9653
  DEMENV5                                         269.6975
  SW2L14                                          267.4982

eXtreme Gradient Boosting

45 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 30, 30, 30
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      5.427074  0.9197517
  0.3  2          100      5.427070  0.9197431
  0.3  2          150      5.427070  0.9197431
  0.3  3           50      5.285672  0.9242248
  0.3  3          100      5.285930  0.9242125
  0.3  3          150      5.285930  0.9242125
  0.3  4           50      5.322856  0.9235119
  0.3  4          100      5.323120  0.9234994
  0.3  4          150      5.323120  0.9234994
  0.3  5           50      5.262612  0.9248469
  0.3  5          100      5.262876  0.9248342
  0.3  5          150      5.262876  0.9248342
  0.3  6           50      5.297378  0.9231282
  0.3  6          100      5.297626  0.9231166
  0.3  6          150      5.297626  0.9231166
  0.3  7           50      5.347853  0.9219818
  0.3  7          100      5.348127  0.9219695
  0.3  7          150      5.348127  0.9219695
  0.3  8           50      5.224710  0.9256515
  0.3  8          100      5.224983  0.9256392
  0.3  8          150      5.224983  0.9256392
  0.4  2           50      5.263249  0.9249519
  0.4  2          100      5.263249  0.9249519
  0.4  2          150      5.263249  0.9249519
  0.4  3           50      5.283462  0.9242447
  0.4  3          100      5.283462  0.9242447
  0.4  3          150      5.283462  0.9242447
  0.4  4           50      5.284006  0.9242399
  0.4  4          100      5.284005  0.9242399
  0.4  4          150      5.284005  0.9242399
  0.4  5           50      5.287791  0.9241037
  0.4  5          100      5.287791  0.9241037
  0.4  5          150      5.287791  0.9241037
  0.4  6           50      5.281812  0.9243122
  0.4  6          100      5.281812  0.9243122
  0.4  6          150      5.281812  0.9243122
  0.4  7           50      5.265462  0.9249128
  0.4  7          100      5.265462  0.9249128
  0.4  7          150      5.265462  0.9249128
  0.4  8           50      5.282195  0.9242986
  0.4  8          100      5.282195  0.9242986
  0.4  8          150      5.282195  0.9242986
  0.5  2           50      5.258701  0.9248500
  0.5  2          100      5.258701  0.9248500
  0.5  2          150      5.258701  0.9248500
  0.5  3           50      5.366822  0.9221986
  0.5  3          100      5.366822  0.9221986
  0.5  3          150      5.366822  0.9221986
  0.5  4           50      5.275580  0.9243526
  0.5  4          100      5.275580  0.9243526
  0.5  4          150      5.275580  0.9243526
  0.5  5           50      5.400426  0.9205695
  0.5  5          100      5.400426  0.9205695
  0.5  5          150      5.400426  0.9205695
  0.5  6           50      5.264717  0.9246309
  0.5  6          100      5.264717  0.9246309
  0.5  6          150      5.264717  0.9246309
  0.5  7           50      5.269000  0.9245119
  0.5  7          100      5.269000  0.9245119
  0.5  7          150      5.269000  0.9245119
  0.5  8           50      5.264652  0.9235680
  0.5  8          100      5.264652  0.9235680
  0.5  8          150      5.264652  0.9235680

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 8, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 5.225  R2: 0.926

XGBoost variable importance:
                          Feature         Gain       Cover   Frequency
1:                    AAIavg_GYGA 5.674871e-01 0.115282051 0.150375940
2:     af_agg_30cm_AWCpF23__M_1km 2.539863e-01 0.306051282 0.255639098
3:             af_BDRICM_T__M_1km 7.974852e-02 0.028923077 0.030075188
4:       af_agg_30cm_TETAs__M_1km 6.156749e-02 0.144000000 0.142857143
5:  af_agg_30cm_TAWCpF23mm__M_1km 2.553226e-02 0.018051282 0.015037594
6:                        B07CHE3 4.722996e-03 0.018461538 0.015037594
7:                        ASSDAC3 3.930541e-03 0.032615385 0.037593985
8:         af_agg_30cm_PWP__M_1km 2.387659e-03 0.007179487 0.007518797
9:        EACKCL_M_agg30cm_AF_1km 5.532296e-04 0.144000000 0.135338346
10:                       C03GLC5 6.577836e-05 0.028102564 0.030075188
11:                       B14CHE3 1.105866e-05 0.031384615 0.037593985
12:           Zn_M_agg30cm_AF_1km 2.693615e-06 0.052923077 0.045112782
13:          ECN_M_agg30cm_AF_1km 2.580043e-06 0.020307692 0.022556391
14: af_agg_ERZD_TAWCpF23mm__M_1km 1.731055e-06 0.020102564 0.030075188
15:           Al_M_agg30cm_AF_1km 2.110120e-09 0.002871795 0.007518797

Ensemble validation RMSE: 5.166  R2: 0.923
--------------------------------------
Variable: fPRec

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      46
Number of independent variables:  377
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             8.514012
R squared:                        0.6055574
OOB RMSE:                         2.918

Variable importance (top 15):
  GAEZ_LGP                 21.80462
  K_M_agg30cm_AF_1km       17.68083
  Usgs_lithologyc14        16.52063
  DEMENV5                  15.84365
  Mn_M_agg30cm_AF_1km      15.60250
  CLYPPT_M_agg30cm_AF_1km  15.22223
  ASSDAC3                  15.03173
  EACKCL_M_agg30cm_AF_1km  14.89279
  GAEZ_ratioP_PETsea       14.40419
  ESMOD5avg                13.31871
  NCluster_9_AF_1km        13.28559
  Hypsclassc2              13.16998
  GIEMSD3                  12.93039
  Lai_avg                  12.44735
  Mg_M_agg30cm_AF_1km      11.29865

eXtreme Gradient Boosting

46 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 30, 31, 31
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      3.625735  0.4804011
  0.3  2          100      3.626471  0.4804065
  0.3  2          150      3.626471  0.4804065
  0.3  3           50      3.754613  0.4659969
  0.3  3          100      3.755443  0.4660104
  0.3  3          150      3.755443  0.4660104
  0.3  4           50      3.739210  0.4758829
  0.3  4          100      3.740005  0.4759038
  0.3  4          150      3.740005  0.4759038
  0.3  5           50      3.594869  0.4801906
  0.3  5          100      3.595710  0.4801995
  0.3  5          150      3.595710  0.4801995
  0.3  6           50      3.725211  0.4779022
  0.3  6          100      3.726127  0.4779113
  0.3  6          150      3.726127  0.4779113
  0.3  7           50      3.721382  0.4785191
  0.3  7          100      3.722264  0.4785303
  0.3  7          150      3.722264  0.4785303
  0.3  8           50      3.759138  0.4769082
  0.3  8          100      3.760036  0.4769175
  0.3  8          150      3.760036  0.4769175
  0.4  2           50      3.574583  0.4902282
  0.4  2          100      3.574583  0.4902281
  0.4  2          150      3.574583  0.4902281
  0.4  3           50      3.439661  0.5060143
  0.4  3          100      3.439661  0.5060144
  0.4  3          150      3.439661  0.5060144
  0.4  4           50      3.706356  0.4783160
  0.4  4          100      3.706356  0.4783161
  0.4  4          150      3.706356  0.4783161
  0.4  5           50      3.774204  0.4720046
  0.4  5          100      3.774204  0.4720046
  0.4  5          150      3.774204  0.4720046
  0.4  6           50      3.675684  0.4751679
  0.4  6          100      3.675684  0.4751679
  0.4  6          150      3.675684  0.4751679
  0.4  7           50      3.806590  0.4696272
  0.4  7          100      3.806590  0.4696272
  0.4  7          150      3.806590  0.4696272
  0.4  8           50      3.933515  0.4515375
  0.4  8          100      3.933516  0.4515375
  0.4  8          150      3.933516  0.4515375
  0.5  2           50      3.721327  0.4737252
  0.5  2          100      3.721327  0.4737252
  0.5  2          150      3.721327  0.4737252
  0.5  3           50      3.784097  0.4691820
  0.5  3          100      3.784097  0.4691820
  0.5  3          150      3.784097  0.4691820
  0.5  4           50      3.764998  0.4725135
  0.5  4          100      3.764998  0.4725135
  0.5  4          150      3.764998  0.4725135
  0.5  5           50      3.763357  0.4696550
  0.5  5          100      3.763357  0.4696550
  0.5  5          150      3.763357  0.4696550
  0.5  6           50      3.797820  0.4698387
  0.5  6          100      3.797820  0.4698387
  0.5  6          150      3.797820  0.4698387
  0.5  7           50      3.752556  0.4787252
  0.5  7          100      3.752556  0.4787252
  0.5  7          150      3.752556  0.4787252
  0.5  8           50      3.958565  0.4527790
  0.5  8          100      3.958565  0.4527790
  0.5  8          150      3.958565  0.4527790

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 3, eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 3.44  R2: 0.506

XGBoost variable importance:
                   Feature         Gain      Cover  Frequency
1:     Al_M_agg30cm_AF_1km 4.851715e-01 0.25065274 0.21052632
2:      NCluster_13_AF_1km 2.590908e-01 0.01091859 0.00877193
3:                 C08GLC5 1.097308e-01 0.01091859 0.00877193
4:                 ASSDAC3 7.726657e-02 0.02183717 0.01754386
5:  af_agg_30cm_PWP__M_1km 2.794324e-02 0.09233325 0.09649123
6:     Ca_M_agg30cm_AF_1km 2.358623e-02 0.05174460 0.04385965
7:                    Wdvi 9.201538e-03 0.14882507 0.12280702
8:             AAIavg_GYGA 3.487304e-03 0.02919535 0.08771930
9:              M13NDVIA01 2.373955e-03 0.02088773 0.01754386
10:                B04CHE3 1.554056e-03 0.04367434 0.03508772
11:     N_M_agg30cm_AF_1km 2.399973e-04 0.01993829 0.01754386
12:                CHIRPSA 1.155730e-04 0.01708996 0.02631579
13:                B13CHE3 9.289333e-05 0.01044386 0.00877193
14:                C02GLC5 5.500118e-05 0.04177546 0.03508772
15:    Na_M_agg30cm_AF_1km 4.806902e-05 0.01542844 0.01754386

Ensemble validation RMSE: 2.941  R2: 0.591
--------------------------------------
Variable: fKRec

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      42
Number of independent variables:  377
Mtry:                             4
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             19.789
R squared:                        0.2918842
OOB RMSE:                         4.448

Variable importance (top 15):
  SNDPPT_M_agg30cm_AF_1km     17.29231
  Sorghum_actual_baseline     17.17472
  GPMIMERGALT                 14.63410
  VBFMRG5                     13.87158
  M13NDVIA01                  12.88385
  PET                         12.53161
  Na_M_agg30cm_AF_1km         12.33826
  Al_M_agg30cm_AF_1km         12.20269
  af_agg_30cm_AWCpF23__M_1km  11.69979
  N_M_agg30cm_AF_1km          11.55984
  EXMOD5avg                   11.09915
  PRSCHE3                     10.90869
  SLTPPT_M_agg30cm_AF_1km     10.73099
  M13RB1A08                   10.47957
  CRFVOL_M_agg30cm_AF_1km     10.43697

eXtreme Gradient Boosting

42 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 29, 28, 27
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      2.930272  0.7247193
  0.3  2          100      2.930076  0.7247117
  0.3  2          150      2.930076  0.7247117
  0.3  3           50      2.838434  0.7363819
  0.3  3          100      2.838054  0.7363849
  0.3  3          150      2.838054  0.7363849
  0.3  4           50      2.896578  0.7349935
  0.3  4          100      2.896051  0.7350291
  0.3  4          150      2.896051  0.7350291
  0.3  5           50      2.844131  0.7428222
  0.3  5          100      2.843696  0.7428474
  0.3  5          150      2.843696  0.7428474
  0.3  6           50      3.033772  0.7183799
  0.3  6          100      3.033303  0.7184064
  0.3  6          150      3.033303  0.7184064
  0.3  7           50      2.869669  0.7357116
  0.3  7          100      2.869172  0.7357576
  0.3  7          150      2.869172  0.7357576
  0.3  8           50      2.822949  0.7382126
  0.3  8          100      2.822528  0.7382389
  0.3  8          150      2.822528  0.7382389
  0.4  2           50      2.991132  0.6869321
  0.4  2          100      2.991126  0.6869321
  0.4  2          150      2.991126  0.6869321
  0.4  3           50      2.904120  0.7099639
  0.4  3          100      2.904119  0.7099639
  0.4  3          150      2.904119  0.7099639
  0.4  4           50      2.919211  0.7173472
  0.4  4          100      2.919211  0.7173472
  0.4  4          150      2.919211  0.7173472
  0.4  5           50      3.034012  0.6836179
  0.4  5          100      3.034012  0.6836179
  0.4  5          150      3.034012  0.6836179
  0.4  6           50      2.865428  0.7251791
  0.4  6          100      2.865428  0.7251791
  0.4  6          150      2.865428  0.7251791
  0.4  7           50      2.991225  0.7388196
  0.4  7          100      2.991225  0.7388196
  0.4  7          150      2.991225  0.7388196
  0.4  8           50      2.878659  0.7200511
  0.4  8          100      2.878659  0.7200511
  0.4  8          150      2.878659  0.7200511
  0.5  2           50      2.841020  0.7278870
  0.5  2          100      2.841020  0.7278870
  0.5  2          150      2.841020  0.7278870
  0.5  3           50      2.562859  0.7716113
  0.5  3          100      2.562859  0.7716113
  0.5  3          150      2.562859  0.7716113
  0.5  4           50      2.915784  0.7528129
  0.5  4          100      2.915784  0.7528129
  0.5  4          150      2.915784  0.7528129
  0.5  5           50      2.585994  0.7651366
  0.5  5          100      2.585994  0.7651366
  0.5  5          150      2.585994  0.7651366
  0.5  6           50      2.803403  0.7350487
  0.5  6          100      2.803403  0.7350487
  0.5  6          150      2.803403  0.7350487
  0.5  7           50      2.594996  0.7692621
  0.5  7          100      2.594996  0.7692621
  0.5  7          150      2.594996  0.7692621
  0.5  8           50      2.852083  0.7215681
  0.5  8          100      2.852083  0.7215681
  0.5  8          150      2.852083  0.7215681

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 3, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 2.563  R2: 0.772

XGBoost variable importance:
                          Feature         Gain      Cover  Frequency
1:                     M13NDVIA01 4.195736e-01 0.02837838 0.02631579
2:           ECN_M_agg30cm_AF_1km 3.790045e-01 0.01418919 0.01315789
3:            Al_M_agg30cm_AF_1km 1.243541e-01 0.38108108 0.35526316
4:                        C01GLC5 4.261787e-02 0.01418919 0.01315789
5:     af_agg_30cm_AWCpF23__M_1km 1.921603e-02 0.10574324 0.11842105
6:                    AAIavg_GYGA 1.381695e-02 0.07871622 0.09210526
7:                         BARL10 6.135391e-04 0.05675676 0.05263158
8:         af_agg_30cm_PWP__M_1km 3.256417e-04 0.18175676 0.19736842
9:    af_agg_30cm_TAWCpF23__M_1km 2.809591e-04 0.01351351 0.01315789
10:             NCluster_6_AF_1km 1.365175e-04 0.01385135 0.01315789
11:                       GTDHYS3 5.193545e-05 0.02770270 0.02631579
12:                       C08GLC5 6.883999e-06 0.02837838 0.02631579
13: af_agg_30cm_TAWCpF23mm__M_1km 1.462613e-06 0.01385135 0.01315789
14: af_agg_ERZD_TAWCpF23mm__M_1km 8.315153e-08 0.04189189 0.03947368
(row 15: NA)

Ensemble validation RMSE: 3.128  R2: 0.665
--------------------------------------
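For reference, the per-variable workflow implied by the printed ranger Call and the caret output above can be sketched as follows. This is a minimal sketch, not the original script: `dfs`, `formulaString.lst`, `j`, and the prior caret run `t.mrfX` (which supplied the tuned mtry) are assumed from the log, and the xgbTree grid columns depend on the caret version (newer versions also require a `subsample` column).

```r
# Sketch of the fitting workflow behind each "Variable:" block above.
# Assumptions: `dfs` holds the training data and `formulaString.lst[[j]]`
# the formula for the j-th target, as in the printed ranger() Call;
# `t.mrfX` is a prior caret run that tuned mtry for the random forest.
library(caret)
library(ranger)

# Tuning grid matching the log: eta 0.3-0.5, max_depth 2-8,
# nrounds 50/100/150; gamma, colsample_bytree and min_child_weight
# held constant at the values the log reports.
tune.grid <- expand.grid(
  nrounds          = c(50, 100, 150),
  max_depth        = 2:8,
  eta              = c(0.3, 0.4, 0.5),
  gamma            = 0,
  colsample_bytree = 0.8,
  min_child_weight = 1
)

# 3-fold CV; caret selects the optimal model by smallest RMSE,
# exactly as the printed summary states.
t.mgbX <- train(formulaString.lst[[j]], data = dfs,
                method    = "xgbTree",
                trControl = trainControl(method = "cv", number = 3),
                tuneGrid  = tune.grid)

# Random forest refit at the previously tuned mtry, reproducing
# the Call shown in each "Ranger result" block.
m.rf <- ranger(formulaString.lst[[j]], data = dfs,
               importance = "impurity", write.forest = TRUE,
               mtry = t.mrfX$bestTune$mtry, num.trees = 500)
```

The hypothetical object name `t.mgbX` is chosen by analogy with `t.mrfX`; only the grid values and the ranger arguments are taken from the log itself.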
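The "Ensemble validation RMSE / R2" line that closes each block scores a combined prediction from the two learners. The metrics are the standard ones; how the two predictions are combined is not shown in the log, so the simple unweighted average below is an assumption, illustrated on hypothetical numbers:

```r
# RMSE and R-squared as reported in the log (standard definitions)
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

# Hypothetical example: average the random-forest and xgboost
# predictions, then score the ensemble against the observed target
obs      <- c(10, 12, 15, 9, 14)
pred.rf  <- c(10.5, 11.6, 14.8, 9.4, 13.9)
pred.xgb <- c(9.8, 12.3, 15.2, 8.9, 14.2)
pred.ens <- (pred.rf + pred.xgb) / 2

rmse(obs, pred.ens)
r2(obs, pred.ens)
```

A weighted average (e.g. weights inversely proportional to each learner's CV RMSE) is an equally plausible reading of "ensemble" here; the log does not say which was used.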