Variable: fNR

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             14
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             4.209339
R squared:                        0.8776085
OOB RMSE:                         2.052

Variable importance:
                                                    [,1]
K_M_agg30cm_AF_1km                              242.6144
M13RB1ALT                                       204.2825
GAEZ_LGP                                        198.5951
af_agg_30cm_AWCpF23__M_1km                      193.2777
LSTD_avgIRI_Jul2002_Sep2016_mosaicLAEA_celsius  187.0816
Sorghum_rainfed_intermed_baseline               181.7180
AfSIS_WRBc109                                   181.0244
B02CHE3                                         170.8982
M13RB3A01                                       165.0259
Zn_M_agg30cm_AF_1km                             164.5170
M43BSALT                                        159.8934
MY2LSTNALT_200207_201609                        155.0845
M02MOD4                                         154.4834
NIRL14                                          148.5649
ENTENV3                                         148.4262

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 162, 162, 162

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE         Rsquared
  0.3  2           50      0.721906559  0.9753203
  0.3  2          100      0.720863653  0.9753503
  0.3  2          150      0.720863653  0.9753503
  0.3  3           50      0.067528137  0.9998482
  0.3  3          100      0.066724186  0.9998472
  0.3  3          150      0.066724186  0.9998472
  0.3  4           50      0.365057387  0.9946988
  0.3  4          100      0.364428232  0.9946916
  0.3  4          150      0.364428232  0.9946916
  0.3  5           50      0.157144913  0.9991225
  0.3  5          100      0.156260207  0.9991216
  0.3  5          150      0.156260207  0.9991216
  0.3  6           50      0.219705468  0.9982222
  0.3  6          100      0.218660740  0.9982242
  0.3  6          150      0.218660740  0.9982242
  0.3  7           50      0.700277533  0.9770084
  0.3  7          100      0.699255230  0.9770346
  0.3  7          150      0.699255230  0.9770346
  0.3  8           50      0.305828725  0.9963922
  0.3  8          100      0.304935910  0.9963924
  0.3  8          150      0.304935910  0.9963924
  0.4  2           50      0.288638964  0.9967940
  0.4  2          100      0.288638681  0.9967940
  0.4  2          150      0.288638681  0.9967940
  0.4  3           50      0.108725415  0.9995850
  0.4  3          100      0.108725132  0.9995850
  0.4  3          150      0.108725132  0.9995850
  0.4  4           50      0.164129226  0.9990269
  0.4  4          100      0.164128943  0.9990269
  0.4  4          150      0.164128943  0.9990269
  0.4  5           50      0.229020615  0.9980416
  0.4  5          100      0.229020333  0.9980416
  0.4  5          150      0.229020333  0.9980416
  0.4  6           50      0.447013607  0.9916795
  0.4  6          100      0.447013325  0.9916795
  0.4  6          150      0.447013325  0.9916795
  0.4  7           50      0.285017262  0.9968797
  0.4  7          100      0.285016979  0.9968797
  0.4  7          150      0.285016979  0.9968797
  0.4  8           50      0.083431715  0.9997589
  0.4  8          100      0.083431432  0.9997589
  0.4  8          150      0.083431432  0.9997589
  0.5  2           50      0.081747687  0.9997687
  0.5  2          100      0.081747687  0.9997687
  0.5  2          150      0.081747687  0.9997687
  0.5  3           50      0.051698335  0.9999090
  0.5  3          100      0.051698335  0.9999090
  0.5  3          150      0.051698335  0.9999090
  0.5  4           50      0.236045683  0.9979121
  0.5  4          100      0.236045683  0.9979121
  0.5  4          150      0.236045683  0.9979121
  0.5  5           50      0.292185028  0.9967088
  0.5  5          100      0.292185028  0.9967088
  0.5  5          150      0.292185028  0.9967088
  0.5  6           50      0.005548306  0.9999990
  0.5  6          100      0.005548306  0.9999990
  0.5  6          150      0.005548306  0.9999990
  0.5  7           50      0.449242895  0.9915870
  0.5  7          100      0.449242895  0.9915870
  0.5  7          150      0.449242895  0.9915870
  0.5  8           50      0.034904070  0.9999589
  0.5  8          100      0.034904070  0.9999589
  0.5  8          150      0.034904070  0.9999589

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 6,
eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
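The 63-row table above is caret's evaluation of a full Cartesian tuning grid. As a sketch (the object name `tune.grid` is hypothetical; the parameter values are read off the output above), such a grid can be built with base R's `expand.grid`:

```r
# Hypothetical reconstruction of the tuning grid implied by the output:
# 3 eta values x 7 depths x 3 round counts = 63 combinations, with the
# remaining xgboost parameters held constant.
tune.grid <- expand.grid(
  eta              = c(0.3, 0.4, 0.5),
  max_depth        = 2:8,
  nrounds          = c(50, 100, 150),
  gamma            = 0,
  colsample_bytree = 0.8,
  min_child_weight = 1
)
nrow(tune.grid)  # 63
```

Passed as `tuneGrid` to `caret::train(..., method = "xgbTree")`, a grid of this shape would yield a resampling table like the one printed here.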
RMSE: 0.006
R2: 1

XGBoost variable importance:
                          Feature          Gain       Cover   Frequency
1:    af_agg_30cm_TAWCpF23__M_1km  0.6413752858  0.30769231  0.30769231
2:     af_agg_30cm_AWCpF23__M_1km  0.3584173494  0.65384615  0.65384615
3:  af_agg_30cm_TAWCpF23mm__M_1km  0.0002073649  0.03846154  0.03846154
(rows 4-15: NA)

Ensemble validation
RMSE: 0.01
R2: 1

--------------------------------------
Variable: fPR

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             0.0807908
R squared:                        0.1483983
OOB RMSE:                         0.284

Variable importance:
                                  [,1]
af_agg_30cm_TETAs__M_1km     0.4498803
af_agg_30cm_TAWCpF23__M_1km  0.3697929
Sorghum_intermed             0.3428970
OC_M_agg30cm_AF_1km          0.3423977
Temperature                  0.3422347
Lai_avg                      0.3407124
GAEZ_LGP                     0.3353661
M13NDVIA04                   0.3245097
fNR_SorghumT2                0.3144237
GAEZ_NPP                     0.3121376
EACKCL_M_agg30cm_AF_1km      0.3108299
MY2LSTNALT_200207_201609     0.3098515
M13RB3ALT                    0.3037426
M43WVALT                     0.2983198
VW1MOD1avg                   0.2890946

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 162, 162, 162

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2           50      0.2881569  0.9728324
  0.3  2          100      0.2881563  0.9728324
  0.3  2          150      0.2881563  0.9728324
  0.3  3           50      0.2877488  0.9838807
  0.3  3          100      0.2877481  0.9838807
  0.3  3          150      0.2877481  0.9838807
  0.3  4           50      0.2873432  0.9916248
  0.3  4          100      0.2873425  0.9916248
  0.3  4          150      0.2873425  0.9916248
  0.3  5           50      0.2861809  0.9999946
  0.3  5          100      0.2861802  0.9999946
  0.3  5          150      0.2861802  0.9999946
  0.3  6           50      0.2895462  0.9057715
  0.3  6          100      0.2895455  0.9057715
  0.3  6          150      0.2895455  0.9057715
  0.3  7           50      0.2861470  1.0000000
  0.3  7          100      0.2861464  1.0000000
  0.3  7          150      0.2861464  1.0000000
  0.3  8           50      0.2868149  0.9976264
  0.3  8          100      0.2868142  0.9976264
  0.3  8          150      0.2868142  0.9976264
  0.4  2           50      0.2873899  0.9911109
  0.4  2          100      0.2873898  0.9911109
  0.4  2          150      0.2873898  0.9911109
  0.4  3           50      0.2875646  0.9880381
  0.4  3          100      0.2875645  0.9880381
  0.4  3          150      0.2875645  0.9880381
  0.4  4           50      0.2874572  0.9899921
  0.4  4          100      0.2874570  0.9899921
  0.4  4          150      0.2874570  0.9899921
  0.4  5           50      0.2863361  0.9998535
  0.4  5          100      0.2863359  0.9998535
  0.4  5          150      0.2863359  0.9998535
  0.4  6           50      0.2864729  0.9995193
  0.4  6          100      0.2864727  0.9995193
  0.4  6          150      0.2864727  0.9995193
  0.4  7           50      0.2861626  1.0000000
  0.4  7          100      0.2861625  1.0000000
  0.4  7          150      0.2861625  1.0000000
  0.4  8           50      0.2863361  0.9998535
  0.4  8          100      0.2863359  0.9998535
  0.4  8          150      0.2863359  0.9998535
  0.5  2           50      0.2861521  1.0000000
  0.5  2          100      0.2861521  1.0000000
  0.5  2          150      0.2861521  1.0000000
  0.5  3           50      0.2861521  1.0000000
  0.5  3          100      0.2861521  1.0000000
  0.5  3          150      0.2861521  1.0000000
  0.5  4           50      0.2861521  1.0000000
  0.5  4          100      0.2861521  1.0000000
  0.5  4          150      0.2861521  1.0000000
  0.5  5           50      0.2865915  0.9990153
  0.5  5          100      0.2865915  0.9990153
  0.5  5          150      0.2865915  0.9990153
  0.5  6           50      0.2861521  1.0000000
  0.5  6          100      0.2861521  1.0000000
  0.5  6          150      0.2861521  1.0000000
  0.5  7           50      0.2862067  0.9999858
  0.5  7          100      0.2862067  0.9999858
  0.5  7          150      0.2862067  0.9999858
  0.5  8           50      0.2861521  1.0000000
  0.5  8          100      0.2861521  1.0000000
  0.5  8          150      0.2861521  1.0000000

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 100, max_depth = 7,
eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
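Each block closes with a validation RMSE and R2. These two statistics can be computed from held-out observations and predictions as below; `obs` and `pred` are illustrative stand-ins, and the squared-correlation form of R2 is caret's convention, which may differ from the definition used for the figures in this log:

```r
# RMSE and R-squared for a validation set (illustrative data only).
obs  <- c(2.1, 3.4, 1.8, 4.0, 2.9)
pred <- c(2.0, 3.6, 1.7, 3.9, 3.1)

rmse <- sqrt(mean((obs - pred)^2))  # root mean squared error
r2   <- cor(obs, pred)^2            # squared Pearson correlation
round(c(RMSE = rmse, R2 = r2), 3)
```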
RMSE: 0.286
R2: 1

XGBoost variable importance:
                        Feature        Gain      Cover  Frequency
1:   af_agg_30cm_AWCpF23__M_1km  0.48113619  0.6666667  0.6666667
2:                  AAIavg_GYGA  0.47111997  0.1666667  0.1666667
3:  af_agg_30cm_TAWCpF23__M_1km  0.04774384  0.1666667  0.1666667
(rows 4-15: NA)

Ensemble validation
RMSE: 0.312
R2: 0.083

--------------------------------------
Variable: fKR

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      243
Number of independent variables:  377
Mtry:                             4
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             2.417062
R squared:                        -0.7462844
OOB RMSE:                         1.555

Variable importance:
                                   [,1]
MY2LSTNALT_200207_201609       6.245826
ECN_M_agg30cm_AF_1km           5.494395
Sorghum_actual_baseline        4.218615
CMCF5avg                       4.034069
B07CHE3                        3.788514
af_agg_ERZD_TAWCpF23mm__M_1km  3.713238
Al_M_agg30cm_AF_1km            3.705416
NCluster_14_AF_1km             3.698286
M13RB1A01                      3.677263
Fcover                         3.563114
K_M_agg30cm_AF_1km             3.495953
ENTENV3                        3.459652
M43BVALT                       3.446987
Zn_M_agg30cm_AF_1km            3.431057
NCluster_1_AF_1km              3.351465

eXtreme Gradient Boosting

243 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 163, 162, 161

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2           50      1.2204002  0.5785431
  0.3  2          100      1.2204725  0.5785431
  0.3  2          150      1.2204716  0.5785431
  0.3  3           50      1.1256813  0.5785431
  0.3  3          100      1.1256890  0.5785431
  0.3  3          150      1.1256881  0.5785431
  0.3  4           50      1.1944748  0.5785431
  0.3  4          100      1.1944951  0.5785431
  0.3  4          150      1.1944943  0.5785431
  0.3  5           50      1.1001437  0.5785431
  0.3  5          100      1.1001420  0.5785431
  0.3  5          150      1.1001411  0.5785431
  0.3  6           50      1.1145603  0.5785431
  0.3  6          100      1.1146251  0.5785431
  0.3  6          150      1.1146251  0.5785431
  0.3  7           50      1.1286029  0.5785431
  0.3  7          100      1.1286301  0.5785431
  0.3  7          150      1.1286293  0.5785431
  0.3  8           50      1.1751169  0.5785431
  0.3  8          100      1.1751484  0.5785431
  0.3  8          150      1.1751476  0.5785431
  0.4  2           50      1.0212933  0.5785431
  0.4  2          100      1.0212933  0.5785431
  0.4  2          150      1.0212933  0.5785431
  0.4  3           50      0.9749087  0.5785431
  0.4  3          100      0.9749087  0.5785431
  0.4  3          150      0.9749087  0.5785431
  0.4  4           50      1.1362426  0.5785431
  0.4  4          100      1.1362426  0.5785431
  0.4  4          150      1.1362426  0.5785431
  0.4  5           50      1.0762322  0.5785431
  0.4  5          100      1.0762322  0.5785431
  0.4  5          150      1.0762322  0.5785431
  0.4  6           50      1.1883697  0.5785431
  0.4  6          100      1.1883697  0.5785431
  0.4  6          150      1.1883697  0.5785431
  0.4  7           50      1.1325468  0.5785431
  0.4  7          100      1.1325468  0.5785431
  0.4  7          150      1.1325468  0.5785431
  0.4  8           50      1.1383418  0.5785431
  0.4  8          100      1.1383418  0.5785431
  0.4  8          150      1.1383418  0.5785431
  0.5  2           50      1.0473247  0.5785431
  0.5  2          100      1.0473247  0.5785431
  0.5  2          150      1.0473247  0.5785431
  0.5  3           50      1.1046328  0.5785431
  0.5  3          100      1.1046328  0.5785431
  0.5  3          150      1.1046328  0.5785431
  0.5  4           50      1.1057460  0.5785431
  0.5  4          100      1.1057460  0.5785431
  0.5  4          150      1.1057460  0.5785431
  0.5  5           50      1.1356617  0.5785431
  0.5  5          100      1.1356617  0.5785431
  0.5  5          150      1.1356617  0.5785431
  0.5  6           50      1.0889676  0.5785431
  0.5  6          100      1.0889676  0.5785431
  0.5  6          150      1.0889676  0.5785431
  0.5  7           50      1.0825763  0.5785431
  0.5  7          100      1.0825763  0.5785431
  0.5  7          150      1.0825763  0.5785431
  0.5  8           50      1.0904730  0.5785431
  0.5  8          100      1.0904730  0.5785431
  0.5  8          150      1.0904730  0.5785431

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 3,
eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
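In each ranger summary above, the "OOB prediction error" is the out-of-bag mean squared error, and the reported OOB RMSE is its square root. This relationship is easy to verify from the logged values:

```r
# ranger reports the OOB MSE as "OOB prediction error" for regression;
# the OOB RMSE lines in this log are its square root.
oob_mse <- c(fNR = 4.209339, fPR = 0.0807908, fKR = 2.417062)
round(sqrt(oob_mse), 3)  # 2.052  0.284  1.555 -- as reported
```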
RMSE: 0.975
R2: 0.579

XGBoost variable importance:
                       Feature        Gain       Cover   Frequency
1:          af_BDRICM_T__M_1km  0.41652423  0.22239170  0.22222222
2:                 AAIavg_GYGA  0.36661518  0.51891397  0.51851852
3:      af_agg_30cm_PWP__M_1km  0.10771679  0.05544539  0.05555556
4:         Al_M_agg30cm_AF_1km  0.09384008  0.05559793  0.05555556
5:  af_agg_30cm_AWCpF23__M_1km  0.01530372  0.14765101  0.14814815
(rows 6-15: NA)

Ensemble validation
RMSE: 1.324
R2: 0.708

--------------------------------------
Variable: fNRec

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      45
Number of independent variables:  377
Mtry:                             18
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             45.54028
R squared:                        0.8701912
OOB RMSE:                         6.748

Variable importance:
                                                    [,1]
af_agg_30cm_TAWCpF23mm__M_1km                   581.5430
OC_M_agg30cm_AF_1km                             461.8674
PRSCHE3                                         455.7511
af_agg_30cm_AWCpF23__M_1km                      419.8327
AAIavg_GYGA                                     404.1346
Fcover                                          395.9256
LSTD_avgIRI_Jul2002_Sep2016_mosaicLAEA_celsius  390.7130
PCHE3avg                                        382.0430
B02CHE3                                         373.4869
GAEZ_LGP                                        368.6217
M13RB3ALT                                       347.6289
Water_balance                                   345.2260
BIO1ALT                                         337.3796
af_agg_30cm_TAWCpF23__M_1km                     337.1992
M43BSALT                                        335.8670

eXtreme Gradient Boosting

45 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 30, 30, 30

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      4.904457  0.9226034
  0.3  2          100      4.904474  0.9226033
  0.3  2          150      4.904474  0.9226033
  0.3  3           50      4.885137  0.9240887
  0.3  3          100      4.885146  0.9240887
  0.3  3          150      4.885146  0.9240887
  0.3  4           50      4.780493  0.9258862
  0.3  4          100      4.780520  0.9258857
  0.3  4          150      4.780520  0.9258857
  0.3  5           50      4.836779  0.9238733
  0.3  5          100      4.836788  0.9238733
  0.3  5          150      4.836788  0.9238733
  0.3  6           50      4.743781  0.9278244
  0.3  6          100      4.743786  0.9278246
  0.3  6          150      4.743786  0.9278246
  0.3  7           50      4.720466  0.9281266
  0.3  7          100      4.720473  0.9281264
  0.3  7          150      4.720473  0.9281264
  0.3  8           50      4.823961  0.9247796
  0.3  8          100      4.823989  0.9247791
  0.3  8          150      4.823989  0.9247791
  0.4  2           50      4.928938  0.9214548
  0.4  2          100      4.928938  0.9214548
  0.4  2          150      4.928938  0.9214548
  0.4  3           50      4.860544  0.9232602
  0.4  3          100      4.860544  0.9232602
  0.4  3          150      4.860544  0.9232602
  0.4  4           50      4.758846  0.9272082
  0.4  4          100      4.758846  0.9272082
  0.4  4          150      4.758846  0.9272082
  0.4  5           50      4.920816  0.9213149
  0.4  5          100      4.920816  0.9213149
  0.4  5          150      4.920816  0.9213149
  0.4  6           50      4.916252  0.9215805
  0.4  6          100      4.916252  0.9215805
  0.4  6          150      4.916252  0.9215805
  0.4  7           50      4.872307  0.9228009
  0.4  7          100      4.872307  0.9228009
  0.4  7          150      4.872307  0.9228009
  0.4  8           50      4.801129  0.9258113
  0.4  8          100      4.801129  0.9258113
  0.4  8          150      4.801129  0.9258113
  0.5  2           50      4.728747  0.9293604
  0.5  2          100      4.728747  0.9293604
  0.5  2          150      4.728747  0.9293604
  0.5  3           50      4.880158  0.9230322
  0.5  3          100      4.880158  0.9230322
  0.5  3          150      4.880158  0.9230322
  0.5  4           50      4.859704  0.9258240
  0.5  4          100      4.859704  0.9258240
  0.5  4          150      4.859704  0.9258240
  0.5  5           50      4.887596  0.9230445
  0.5  5          100      4.887596  0.9230445
  0.5  5          150      4.887596  0.9230445
  0.5  6           50      4.827132  0.9259196
  0.5  6          100      4.827132  0.9259196
  0.5  6          150      4.827132  0.9259196
  0.5  7           50      5.133617  0.9201620
  0.5  7          100      5.133617  0.9201620
  0.5  7          150      5.133617  0.9201620
  0.5  8           50      4.884830  0.9257621
  0.5  8          100      4.884830  0.9257621
  0.5  8          150      4.884830  0.9257621

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 7,
eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
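The "Ensemble validation" figures combine the ranger and xgboost predictions, but the log does not say how the two models are weighted. The equal-weight average below is therefore only an assumption for illustration, with made-up prediction vectors:

```r
# Hypothetical two-model ensemble: equal-weight average of the ranger
# and xgboost predictions. The actual weighting used in this run is
# not shown in the log.
pred_ranger <- c(10.2, 12.5, 9.8)
pred_xgb    <- c(10.6, 12.1, 10.0)
pred_ens    <- (pred_ranger + pred_xgb) / 2
pred_ens  # 10.4 12.3 9.9
```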
RMSE: 4.72
R2: 0.928

XGBoost variable importance:
                           Feature          Gain        Cover    Frequency
 1:     af_agg_30cm_AWCpF23__M_1km  6.782292e-01  0.286801522  0.235294118
 2:                    AAIavg_GYGA  1.436308e-01  0.095533747  0.132352941
 3:       af_agg_30cm_TETAs__M_1km  1.409369e-01  0.201882636  0.191176471
 4:  af_agg_30cm_TAWCpF23mm__M_1km  2.552494e-02  0.017624675  0.014705882
 5:                        B07CHE3  4.722864e-03  0.027037853  0.022058824
 6:                        B04CHE3  2.900726e-03  0.007210094  0.007352941
 7:         af_agg_30cm_PWP__M_1km  2.387659e-03  0.011215702  0.014705882
 8:                        ASSDAC3  1.029815e-03  0.024634488  0.029411765
 9:        EACKCL_M_agg30cm_AF_1km  5.503244e-04  0.115561787  0.110294118
10:                        C03GLC5  6.540251e-05  0.024233927  0.029411765
11:                        B14CHE3  1.105996e-05  0.030042059  0.036764706
12:           ECN_M_agg30cm_AF_1km  3.889465e-06  0.014219908  0.014705882
13:            Zn_M_agg30cm_AF_1km  2.719302e-06  0.043060284  0.036764706
14:  af_agg_ERZD_TAWCpF23mm__M_1km  1.729097e-06  0.022631684  0.036764706
15:                        ENTENV3  1.561583e-06  0.008612057  0.007352941

Ensemble validation
RMSE: 5.149
R2: 0.924

--------------------------------------
Variable: fPRec

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      46
Number of independent variables:  377
Mtry:                             12
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             8.121836
R squared:                        0.6237264
OOB RMSE:                         2.85

Variable importance:
                                                    [,1]
LSTD_avgIRI_Jul2002_Sep2016_mosaicLAEA_celsius  25.08695
EACKCL_M_agg30cm_AF_1km                         20.21598
Al_M_agg30cm_AF_1km                             19.13720
Hypsclassc2                                     17.41442
DEMENV5                                         16.63617
SW2L00                                          16.17168
Sorghum_actual_baseline                         16.08887
Mg_M_agg30cm_AF_1km                             15.23419
GIEMSD3                                         14.97347
B14CHE3                                         14.60650
GAEZ_LGP                                        14.14186
Lai_avg                                         14.11367
GAEZ_ratioP_PETsea                              14.01766
Na_M_agg30cm_AF_1km                             13.97959
Usgs_lithologyc14                               13.56502

eXtreme Gradient Boosting

46 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 30, 31, 31

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      3.066398  0.5426156
  0.3  2          100      3.066843  0.5426400
  0.3  2          150      3.066843  0.5426400
  0.3  3           50      3.291840  0.4894097
  0.3  3          100      3.292309  0.4894170
  0.3  3          150      3.292309  0.4894170
  0.3  4           50      2.967386  0.5826766
  0.3  4          100      2.967923  0.5826603
  0.3  4          150      2.967923  0.5826603
  0.3  5           50      3.106606  0.5123817
  0.3  5          100      3.106969  0.5123902
  0.3  5          150      3.106969  0.5123902
  0.3  6           50      3.025719  0.5760598
  0.3  6          100      3.026131  0.5760676
  0.3  6          150      3.026131  0.5760676
  0.3  7           50      3.113093  0.5103154
  0.3  7          100      3.113700  0.5102862
  0.3  7          150      3.113700  0.5102862
  0.3  8           50      2.889100  0.5830113
  0.3  8          100      2.889547  0.5829983
  0.3  8          150      2.889547  0.5829983
  0.4  2           50      3.475612  0.4837804
  0.4  2          100      3.475612  0.4837804
  0.4  2          150      3.475612  0.4837804
  0.4  3           50      3.626318  0.3954307
  0.4  3          100      3.626317  0.3954308
  0.4  3          150      3.626317  0.3954308
  0.4  4           50      3.115798  0.5059001
  0.4  4          100      3.115798  0.5059001
  0.4  4          150      3.115798  0.5059001
  0.4  5           50      3.192907  0.5334616
  0.4  5          100      3.192907  0.5334616
  0.4  5          150      3.192907  0.5334616
  0.4  6           50      3.115490  0.4941349
  0.4  6          100      3.115491  0.4941349
  0.4  6          150      3.115491  0.4941349
  0.4  7           50      3.544186  0.4335993
  0.4  7          100      3.544186  0.4335993
  0.4  7          150      3.544186  0.4335993
  0.4  8           50      3.376925  0.4673266
  0.4  8          100      3.376925  0.4673266
  0.4  8          150      3.376925  0.4673266
  0.5  2           50      3.711671  0.3888550
  0.5  2          100      3.711671  0.3888550
  0.5  2          150      3.711671  0.3888550
  0.5  3           50      3.118694  0.4778107
  0.5  3          100      3.118694  0.4778107
  0.5  3          150      3.118694  0.4778107
  0.5  4           50      3.281244  0.5293359
  0.5  4          100      3.281244  0.5293359
  0.5  4          150      3.281244  0.5293359
  0.5  5           50      3.453678  0.4518965
  0.5  5          100      3.453678  0.4518965
  0.5  5          150      3.453678  0.4518965
  0.5  6           50      3.131806  0.5466416
  0.5  6          100      3.131806  0.5466416
  0.5  6          150      3.131806  0.5466416
  0.5  7           50      3.636166  0.3925634
  0.5  7          100      3.636166  0.3925634
  0.5  7          150      3.636166  0.3925634
  0.5  8           50      3.442849  0.4323985
  0.5  8          100      3.442849  0.4323985
  0.5  8          150      3.442849  0.4323985

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 8,
eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 2.889
R2: 0.583

XGBoost variable importance:
                 Feature          Gain        Cover    Frequency
 1:  Al_M_agg30cm_AF_1km  0.5703926649  0.261972838  0.220689655
 2:   NCluster_13_AF_1km  0.2079547238  0.008220157  0.006896552
 3:              C08GLC5  0.1342619752  0.008220157  0.006896552
 4: af_agg_30cm_PWP__M_1km 0.0265146054 0.026268763  0.048275862
 5:              ASSDAC3  0.0205380617  0.082201573  0.068965517
 6:  Ca_M_agg30cm_AF_1km  0.0174869835  0.027877055  0.027586207
 7:  Mg_M_agg30cm_AF_1km  0.0100293620  0.022873481  0.020689655
 8:                 Wdvi  0.0066911246  0.046104360  0.041379310
 9:   af_BDRICM_T__M_1km  0.0024050348  0.022873481  0.020689655
10:              RANENV3  0.0013486413  0.022873481  0.020689655
11:           M13NDVIA01  0.0009349336  0.039313796  0.034482759
12:          AAIavg_GYGA  0.0005329277  0.030378842  0.055172414
13:   NCluster_12_AF_1km  0.0003059639  0.007862759  0.006896552
14:   B_M_agg30cm_AF_1km  0.0002389635  0.016440315  0.020689655
15:              ENTENV3  0.0001952415  0.021265189  0.027586207

Ensemble validation
RMSE: 2.739
R2: 0.646

--------------------------------------
Variable: fKRec

Ranger result

Call:
ranger(formulaString.lst[[j]], data = dfs, importance = "impurity",
       write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      42
Number of independent variables:  377
Mtry:                             20
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             19.38071
R squared:                        0.3064945
OOB RMSE:                         4.402

Variable importance:
                                   [,1]
Sorghum_actual_baseline        46.48986
Na_M_agg30cm_AF_1km            43.20279
VBFMRG5                        37.35582
CRFVOL_M_agg30cm_AF_1km        27.91457
C01GLC5                        20.02671
PET                            18.30140
Fapar                          17.95414
M13NDVIA01                     17.78230
Fcover                         17.35199
NCluster_8_AF_1km              16.95519
CMCF5avg                       16.51659
af_agg_ERZD_TAWCpF23mm__M_1km  16.13445
C03GLC5                        15.99471
Al_M_agg30cm_AF_1km            15.77632
GTDHYS3                        15.74647

eXtreme Gradient Boosting

42 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 28, 27, 29

Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      3.352625  0.7314804
  0.3  2          100      3.352623  0.7314921
  0.3  2          150      3.352623  0.7314921
  0.3  3           50      3.357790  0.7303801
  0.3  3          100      3.357535  0.7304346
  0.3  3          150      3.357535  0.7304346
  0.3  4           50      3.311991  0.7375904
  0.3  4          100      3.311730  0.7376505
  0.3  4          150      3.311730  0.7376505
  0.3  5           50      3.325573  0.7379548
  0.3  5          100      3.325296  0.7380193
  0.3  5          150      3.325296  0.7380193
  0.3  6           50      3.341557  0.7340366
  0.3  6          100      3.341321  0.7340909
  0.3  6          150      3.341321  0.7340909
  0.3  7           50      3.460178  0.7109959
  0.3  7          100      3.459849  0.7110551
  0.3  7          150      3.459849  0.7110551
  0.3  8           50      3.487951  0.7050129
  0.3  8          100      3.487667  0.7050720
  0.3  8          150      3.487667  0.7050720
  0.4  2           50      3.321819  0.7390350
  0.4  2          100      3.321818  0.7390350
  0.4  2          150      3.321818  0.7390350
  0.4  3           50      3.341865  0.7373579
  0.4  3          100      3.341863  0.7373579
  0.4  3          150      3.341863  0.7373579
  0.4  4           50      3.443930  0.7208459
  0.4  4          100      3.443929  0.7208459
  0.4  4          150      3.443929  0.7208459
  0.4  5           50      3.310055  0.7419557
  0.4  5          100      3.310053  0.7419557
  0.4  5          150      3.310053  0.7419557
  0.4  6           50      3.290646  0.7421744
  0.4  6          100      3.290645  0.7421744
  0.4  6          150      3.290645  0.7421744
  0.4  7           50      3.298227  0.7450063
  0.4  7          100      3.298225  0.7450063
  0.4  7          150      3.298225  0.7450063
  0.4  8           50      3.238208  0.7557433
  0.4  8          100      3.238207  0.7557433
  0.4  8          150      3.238207  0.7557433
  0.5  2           50      3.298423  0.7472330
  0.5  2          100      3.298423  0.7472330
  0.5  2          150      3.298423  0.7472330
  0.5  3           50      3.567931  0.6921416
  0.5  3          100      3.567931  0.6921416
  0.5  3          150      3.567931  0.6921416
  0.5  4           50      3.296449  0.7478061
  0.5  4          100      3.296449  0.7478061
  0.5  4          150      3.296449  0.7478061
  0.5  5           50      3.328516  0.7383348
  0.5  5          100      3.328516  0.7383348
  0.5  5          150      3.328516  0.7383348
  0.5  6           50      3.297728  0.7474355
  0.5  6          100      3.297728  0.7474355
  0.5  6          150      3.297728  0.7474355
  0.5  7           50      3.298851  0.7473034
  0.5  7          100      3.298851  0.7473034
  0.5  7          150      3.298851  0.7473034
  0.5  8           50      3.295464  0.7480966
  0.5  8          100      3.295464  0.7480966
  0.5  8          150      3.295464  0.7480966

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1

RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 100, max_depth = 8,
eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 3.238
R2: 0.756

XGBoost variable importance:
                           Feature          Gain        Cover    Frequency
 1:           ECN_M_agg30cm_AF_1km  3.281976e-01  0.010033445  0.009708738
 2:                     M13NDVIA01  3.173703e-01  0.020066890  0.019417476
 3:     af_agg_30cm_AWCpF23__M_1km  1.625710e-01  0.123745819  0.126213592
 4:            Al_M_agg30cm_AF_1km  1.322312e-01  0.506688963  0.495145631
 5:                        C01GLC5  2.300619e-02  0.069278548  0.067961165
 6:       af_agg_30cm_TETAs__M_1km  1.643433e-02  0.049211658  0.058252427
 7:                    AAIavg_GYGA  1.549077e-02  0.047061634  0.048543689
 8:                         BARL10  2.850520e-03  0.050167224  0.048543689
 9:  af_agg_30cm_TAWCpF23mm__M_1km  1.481900e-03  0.009555662  0.009708738
10:                          rElev  1.656497e-04  0.019589107  0.019417476
11:             af_BDRICM_T__M_1km  1.536690e-04  0.036789298  0.038834951
12:            Ca_M_agg30cm_AF_1km  3.727328e-05  0.010033445  0.009708738
13:    af_agg_30cm_TAWCpF23__M_1km  9.555292e-06  0.009077879  0.009708738
14:                        C08GLC5  2.273076e-08  0.020066890  0.019417476
15:                        B04CHE3  5.183620e-09  0.018633540  0.019417476

Ensemble validation
RMSE: 3.44
R2: 0.591

--------------------------------------
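For reference, the ensemble validation metrics from the six variable blocks above can be collected into one table (values transcribed from the log):

```r
# Ensemble validation metrics transcribed from the six blocks above.
ens <- data.frame(
  variable = c("fNR", "fPR", "fKR", "fNRec", "fPRec", "fKRec"),
  RMSE     = c(0.01, 0.312, 1.324, 5.149, 2.739, 3.44),
  R2       = c(1, 0.083, 0.708, 0.924, 0.646, 0.591)
)
ens$variable[order(-ens$R2)]  # fNR, fNRec, fKR, fPRec, fKRec, fPR
```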