Variable: YA
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             20
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             32291.05
R squared:                        0.3722427

 OOB RMSE: 179.697

 Variable importance:
                            [,1]
CMCF5avg                62686.74
MANMCF5                 59976.82
BIO12ALT                53573.98
CHIRPSA                 42889.42
PCHE3avg                38771.34
Millet_actual_baseline  36917.96
B13CHE3                 30281.92
PRSCHE3                 30239.35
BIO1ALT                 29423.93
NCluster_10_AF_1km      29362.86
BARL10                  28012.75
M17GPPALTfill           27063.34
Water_balance           26932.57
Mn_M_agg30cm_AF_1km     24723.33
B_M_agg30cm_AF_1km      22036.57

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 23, 23, 22
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2               50  185.4638  0.4273957
  0.3  2              100  185.5001  0.4271507
  0.3  2              150  185.5001  0.4271507
  0.3  3               50  192.6272  0.3502211
  0.3  3              100  192.6340  0.3501662
  0.3  3              150  192.6340  0.3501662
  0.3  4               50  206.0569  0.3006360
  0.3  4              100  206.0602  0.3006259
  0.3  4              150  206.0602  0.3006259
  0.3  5               50  189.1910  0.3571259
  0.3  5              100  189.1890  0.3571430
  0.3  5              150  189.1890  0.3571430
  0.3  6               50  184.5387  0.3796015
  0.3  6              100  184.5370  0.3796146
  0.3  6              150  184.5370  0.3796146
  0.3  7               50  205.6149  0.3634743
  0.3  7              100  205.6180  0.3634785
  0.3  7              150  205.6180  0.3634785
  0.3  8               50  183.0899  0.4058131
  0.3  8              100  183.0897  0.4058284
  0.3  8              150  183.0897  0.4058284
  0.4  2               50  184.3182  0.4470550
  0.4  2              100  184.3170  0.4470613
  0.4  2              150  184.3170  0.4470613
  0.4  3               50  194.1120  0.3785302
  0.4  3              100  194.1122  0.3785295
  0.4  3              150  194.1122  0.3785295
  0.4  4               50  197.9725  0.2978766
  0.4  4              100  197.9725  0.2978774
  0.4  4              150  197.9725  0.2978774
  0.4  5               50  209.7909  0.2987336
  0.4  5              100  209.7913  0.2987316
  0.4  5              150  209.7913  0.2987316
  0.4  6               50  202.1993  0.3457794
  0.4  6              100  202.1996  0.3457784
  0.4  6              150  202.1996  0.3457784
  0.4  7               50  203.5797  0.3264907
  0.4  7              100  203.5799  0.3264897
  0.4  7              150  203.5799  0.3264897
  0.4  8               50  195.3224  0.4070705
  0.4  8              100  195.3225  0.4070707
  0.4  8              150  195.3225  0.4070707
  0.5  2               50  187.1386  0.4312036
  0.5  2              100  187.1390  0.4312026
  0.5  2              150  187.1390  0.4312026
  0.5  3               50  176.0295  0.3816246
  0.5  3              100  176.0295  0.3816246
  0.5  3              150  176.0295  0.3816246
  0.5  4               50  188.3921  0.3121897
  0.5  4              100  188.3921  0.3121897
  0.5  4              150  188.3921  0.3121897
  0.5  5               50  187.4493  0.3813393
  0.5  5              100  187.4493  0.3813393
  0.5  5              150  187.4493  0.3813393
  0.5  6               50  203.6042  0.2892352
  0.5  6              100  203.6042  0.2892352
  0.5  6              150  203.6042  0.2892352
  0.5  7               50  163.6281  0.5061223
  0.5  7              100  163.6281  0.5061223
  0.5  7              150  163.6281  0.5061223
  0.5  8               50  196.7589  0.3558250
  0.5  8              100  196.7589  0.3558250
  0.5  8              150  196.7589  0.3558250

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 7, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
RMSE: 163.628 R2: 0.506

XGBoost variable importance:
    Feature                     Gain         Cover        Frequency
 1: CMCF5avg                    0.487143677  0.005852987  0.003194888
 2: B02CHE3                     0.145041531  0.009984507  0.006389776
 3: BIO1ALT                     0.118532446  0.098812188  0.060702875
 4: af_agg_30cm_TETAs__M_1km    0.081735904  0.050955414  0.031948882
 5: PCHE3avg                    0.035682765  0.010156653  0.006389776
 6: NCluster_M_AF_1km           0.029968909  0.016009640  0.015974441
 7: Al_M_agg30cm_AF_1km         0.023226004  0.018075400  0.035143770
 8: yGapMillet                  0.016770520  0.019280427  0.012779553
 9: NCluster_10_AF_1km          0.015382171  0.005508693  0.003194888
10: Zn_M_agg30cm_AF_1km         0.013581746  0.063694268  0.035143770
11: af_agg_30cm_AWCpF23__M_1km  0.004537525  0.074539508  0.095846645
12: yFertilised_MilletTrials    0.003454830  0.003959373  0.003194888
13: Cu_M_agg30cm_AF_1km         0.003237545  0.075916681  0.044728435
14: C04GLC5                     0.003182491  0.021690480  0.015974441
15: IMOD4avg                    0.002968696  0.043553107  0.028753994

Ensemble validation RMSE: 170.724 R2: 0.42
--------------------------------------
Variable: YW
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             12
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             816142.9
R squared:                        0.5722625

 OOB RMSE: 903.406

 Variable importance:
                         [,1]
CHIRPSA             1687525.0
B02CHE3             1388399.2
GAEZ_NPP            1381456.7
GAEZ_ratioP_PETsea  1363645.9
GAEZ_ET             1289858.4
BIO12ALT            1288164.1
NCluster_1_AF_1km   1263464.0
CMCF5avg            1153415.5
B07CHE3             1137662.9
M17GPPALTfill       1050788.2
B14CHE3             1014433.3
PRSCHE3             1012997.4
SW2L14               983622.2
B04CHE3              944574.0
PCHE3avg             932017.2

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 22, 24, 22
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2               50   907.0491  0.5525314
  0.3  2              100   906.9956  0.5526217
  0.3  2              150   906.9953  0.5526221
  0.3  3               50   943.0562  0.5087391
  0.3  3              100   942.9940  0.5088013
  0.3  3              150   942.9935  0.5088017
  0.3  4               50   914.4234  0.5335455
  0.3  4              100   914.4093  0.5335692
  0.3  4              150   914.4092  0.5335693
  0.3  5               50   878.3913  0.6004002
  0.3  5              100   878.3633  0.6004258
  0.3  5              150   878.3633  0.6004258
  0.3  6               50   956.7665  0.5174330
  0.3  6              100   956.7570  0.5174454
  0.3  6              150   956.7566  0.5174457
  0.3  7               50   919.8814  0.5187778
  0.3  7              100   919.8544  0.5188143
  0.3  7              150   919.8542  0.5188145
  0.3  8               50   885.9853  0.5795410
  0.3  8              100   885.9670  0.5795615
  0.3  8              150   885.9666  0.5795620
  0.4  2               50  1000.8143  0.4663685
  0.4  2              100  1000.8141  0.4663759
  0.4  2              150  1000.8141  0.4663759
  0.4  3               50   935.2556  0.5435800
  0.4  3              100   935.2534  0.5435809
  0.4  3              150   935.2534  0.5435809
  0.4  4               50   884.5193  0.5584263
  0.4  4              100   884.5176  0.5584272
  0.4  4              150   884.5176  0.5584272
  0.4  5               50   874.2541  0.5641919
  0.4  5              100   874.2528  0.5641921
  0.4  5              150   874.2528  0.5641921
  0.4  6               50   925.1150  0.5273244
  0.4  6              100   925.1123  0.5273268
  0.4  6              150   925.1123  0.5273268
  0.4  7               50   901.7873  0.5422601
  0.4  7              100   901.7866  0.5422607
  0.4  7              150   901.7866  0.5422607
  0.4  8               50   972.6673  0.4811179
  0.4  8              100   972.6650  0.4811194
  0.4  8              150   972.6650  0.4811194
  0.5  2               50   943.9552  0.5143625
  0.5  2              100   943.9559  0.5143623
  0.5  2              150   943.9559  0.5143623
  0.5  3               50  1066.5430  0.4389411
  0.5  3              100  1066.5428  0.4389413
  0.5  3              150  1066.5428  0.4389413
  0.5  4               50  1029.2432  0.4931047
  0.5  4              100  1029.2432  0.4931047
  0.5  4              150  1029.2432  0.4931047
  0.5  5               50  1064.3446  0.4268038
  0.5  5              100  1064.3446  0.4268039
  0.5  5              150  1064.3446  0.4268039
  0.5  6               50   886.7222  0.5647607
  0.5  6              100   886.7222  0.5647607
  0.5  6              150   886.7222  0.5647607
  0.5  7               50   993.9014  0.5002498
  0.5  7              100   993.9014  0.5002498
  0.5  7              150   993.9014  0.5002498
  0.5  8               50  1078.7756  0.4019794
  0.5  8              100  1078.7756  0.4019794
  0.5  8              150  1078.7756  0.4019794

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 100, max_depth = 5, eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 874.253 R2: 0.564

XGBoost variable importance:
    Feature                  Gain         Cover        Frequency
 1: B04CHE3                  0.418154910  0.003871555  0.002375297
 2: CHIRPSA                  0.189116076  0.006262810  0.007125891
 3: B14CHE3                  0.087473607  0.005351856  0.007125891
 4: NCluster_19_AF_1km       0.056508381  0.004213163  0.004750594
 5: M17NPPALTfill            0.039991909  0.003643817  0.004750594
 6: B02CHE3                  0.036496294  0.006490549  0.011876485
 7: M02MOD4                  0.026379029  0.009451150  0.009501188
 8: af_agg_30cm_PWP__M_1km   0.026041844  0.016966522  0.030878860
 9: GAEZ_ET                  0.020345232  0.024709633  0.019002375
10: fNR_MilletTrials         0.016881295  0.005465725  0.007125891
11: P.T_M_agg30cm_AF_1km     0.016734220  0.006035072  0.007125891
12: M13RB3ALT                0.013794441  0.004554771  0.004750594
13: SW1L14                   0.009374796  0.009678889  0.007125891
14: EACKCL_M_agg30cm_AF_1km  0.007036019  0.194944204  0.123515439
15: M13NDVIA08               0.006753646  0.009678889  0.011876485

Ensemble validation RMSE: 878.278 R2: 0.594
--------------------------------------
Variable: YW_SD
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             14
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             751858.5
R squared:                        0.6245901

 OOB RMSE: 867.098

 Variable importance:
                         [,1]
BIO12ALT              2121280
GAEZ_ET               1899733
MANMCF5               1789928
GAEZ_ratioP_PETsea    1753309
CHIRPSA               1732740
B07CHE3               1701304
CMCF5avg              1612278
VW1MOD1avg            1560928
GAEZ_ratioP_PETan     1442238
GAEZ_LGP              1316833
SW1L14                1231506
B02CHE3               1214980
PCHE3avg              1203068
ECN_M_agg30cm_AF_1km  1201093
NCluster_1_AF_1km     1109866

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 22, 23, 23
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2               50   815.2715  0.6558488
  0.3  2              100   815.0198  0.6560358
  0.3  2              150   815.0192  0.6560364
  0.3  3               50   874.6667  0.6437207
  0.3  3              100   874.6666  0.6436874
  0.3  3              150   874.6666  0.6436873
  0.3  4               50   869.1281  0.6294515
  0.3  4              100   869.1493  0.6294349
  0.3  4              150   869.1492  0.6294349
  0.3  5               50   863.3529  0.6586564
  0.3  5              100   863.3465  0.6586441
  0.3  5              150   863.3459  0.6586444
  0.3  6               50   921.5038  0.5983719
  0.3  6              100   921.4982  0.5983625
  0.3  6              150   921.4982  0.5983625
  0.3  7               50   873.8109  0.6268515
  0.3  7              100   873.7921  0.6268455
  0.3  7              150   873.7916  0.6268458
  0.3  8               50   914.9574  0.5846476
  0.3  8              100   914.9542  0.5846301
  0.3  8              150   914.9536  0.5846303
  0.4  2               50   945.1715  0.5380508
  0.4  2              100   945.1731  0.5380526
  0.4  2              150   945.1731  0.5380526
  0.4  3               50   937.6721  0.5770047
  0.4  3              100   937.6729  0.5770041
  0.4  3              150   937.6729  0.5770041
  0.4  4               50   939.0871  0.5550676
  0.4  4              100   939.0880  0.5550666
  0.4  4              150   939.0880  0.5550666
  0.4  5               50   975.6508  0.5208341
  0.4  5              100   975.6511  0.5208345
  0.4  5              150   975.6511  0.5208345
  0.4  6               50   949.3634  0.5440484
  0.4  6              100   949.3638  0.5440480
  0.4  6              150   949.3638  0.5440480
  0.4  7               50   939.0124  0.5469907
  0.4  7              100   939.0120  0.5469907
  0.4  7              150   939.0120  0.5469907
  0.4  8               50   925.5896  0.5543964
  0.4  8              100   925.5901  0.5543960
  0.4  8              150   925.5901  0.5543960
  0.5  2               50   945.3514  0.5499838
  0.5  2              100   945.3503  0.5499843
  0.5  2              150   945.3503  0.5499843
  0.5  3               50   903.7078  0.5971592
  0.5  3              100   903.7077  0.5971592
  0.5  3              150   903.7077  0.5971592
  0.5  4               50   886.0584  0.5990101
  0.5  4              100   886.0584  0.5990101
  0.5  4              150   886.0584  0.5990101
  0.5  5               50   971.5670  0.5410409
  0.5  5              100   971.5671  0.5410409
  0.5  5              150   971.5671  0.5410409
  0.5  6               50  1000.2867  0.5329630
  0.5  6              100  1000.2868  0.5329630
  0.5  6              150  1000.2868  0.5329630
  0.5  7               50   982.3036  0.5143513
  0.5  7              100   982.3035  0.5143513
  0.5  7              150   982.3035  0.5143513
  0.5  8               50   956.7631  0.5283748
  0.5  8              100   956.7631  0.5283748
  0.5  8              150   956.7631  0.5283748

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 150, max_depth = 2, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 815.019 R2: 0.656

XGBoost variable importance:
    Feature                      Gain         Cover        Frequency
 1: GAEZ_ratioP_PETan            0.318763807  0.003393891  0.0025
 2: B02CHE3                      0.177512334  0.004991016  0.0075
 3: SW1L14                       0.138014490  0.020063885  0.0150
 4: M17GPPALTfill                0.099529230  0.006588141  0.0050
 5: BIO12ALT                     0.057453396  0.003892993  0.0050
 6: C04GLC5                      0.034146386  0.001497305  0.0025
 7: NCluster_20_AF_1km           0.032226810  0.002894789  0.0050
 8: M43BNALT                     0.027447958  0.001796766  0.0025
 9: GAEZ_ET                      0.022384071  0.004591735  0.0050
10: VW1MOD1avg                   0.019216463  0.003393891  0.0050
11: EACKCL_M_agg30cm_AF_1km      0.009210767  0.096626073  0.0725
12: CMCF5avg                     0.006642250  0.001597125  0.0025
13: SLTPPT_M_agg30cm_AF_1km      0.005592613  0.003793172  0.0050
14: Millet_rainfed_low_baseline  0.005534746  0.002595328  0.0025
15: IMOD4avg                     0.005075612  0.003393891  0.0025

Ensemble validation RMSE: 835.505 R2: 0.653
--------------------------------------