Variable: yControl000

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      26
Number of independent variables:  372
Mtry:                             4
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             2127952
R squared:                        -0.1749759
OOB RMSE:                         1458.75

Variable importance (top 15):
                              [,1]
ESMOD5avg                 447837.2
EACKCL_M_agg30cm_AF_1km   442993.3
NMSD3avg                  435802.0
af_BDRICM_T__M_1km        428961.9
NCluster_M_AF_1km         424308.7
Al_M_agg30cm_AF_1km       409390.9
PET                       401401.6
SLTPPT_M_agg30cm_AF_1km   400633.5
M13RB3A01                 389777.9
M13RB3A08                 379102.3
LRI_M_agg30cm_AF_1km      332207.5
NCluster_6_AF_1km         326746.3
NCluster_4_AF_1km         320559.3
M13RB3ALT                 318811.6
NCluster_16_AF_1km        311915.4

eXtreme Gradient Boosting

26 samples
372 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 18, 18, 16
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      1533.863  0.15789432
  0.3  2          100      1533.846  0.15790430
  0.3  2          150      1533.846  0.15790438
  0.3  3           50      1746.710  0.34245219
  0.3  3          100      1746.715  0.34245794
  0.3  3          150      1746.715  0.34245785
  0.3  4           50      1643.378  0.38394984
  0.3  4          100      1643.350  0.38392787
  0.3  4          150      1643.350  0.38392787
  0.3  5           50      1650.583  0.25515656
  0.3  5          100      1650.589  0.25515688
  0.3  5          150      1650.589  0.25515691
  0.3  6           50      1735.909  0.24748352
  0.3  6          100      1735.895  0.24749209
  0.3  6          150      1735.895  0.24749211
  0.3  7           50      1712.420  0.29368206
  0.3  7          100      1712.398  0.29368991
  0.3  7          150      1712.398  0.29369018
  0.3  8           50      1617.428  0.19602091
  0.3  8          100      1617.418  0.19601344
  0.3  8          150      1617.418  0.19601344
  0.4  2           50      1558.103  0.14960777
  0.4  2          100      1558.102  0.14960785
  0.4  2          150      1558.102  0.14960785
  0.4  3           50      1716.874  0.16764943
  0.4  3          100      1716.873  0.16764947
  0.4  3          150      1716.873  0.16764947
  0.4  4           50      1707.926  0.29904116
  0.4  4          100      1707.925  0.29904147
  0.4  4          150      1707.925  0.29904147
  0.4  5           50      1703.661  0.23334817
  0.4  5          100      1703.660  0.23334837
  0.4  5          150      1703.660  0.23334837
  0.4  6           50      1696.052  0.17124339
  0.4  6          100      1696.053  0.17124420
  0.4  6          150      1696.053  0.17124420
  0.4  7           50      1710.068  0.21021637
  0.4  7          100      1710.068  0.21021655
  0.4  7          150      1710.068  0.21021655
  0.4  8           50      1723.542  0.39211866
  0.4  8          100      1723.540  0.39211950
  0.4  8          150      1723.540  0.39211950
  0.5  2           50      1776.915  0.32770434
  0.5  2          100      1776.915  0.32770419
  0.5  2          150      1776.915  0.32770419
  0.5  3           50      1868.837  0.34231526
  0.5  3          100      1868.837  0.34231528
  0.5  3          150      1868.837  0.34231528
  0.5  4           50      1608.651  0.08730871
  0.5  4          100      1608.651  0.08730872
  0.5  4          150      1608.651  0.08730872
  0.5  5           50      1624.980  0.07757550
  0.5  5          100      1624.980  0.07757549
  0.5  5          150      1624.980  0.07757549
  0.5  6           50      1781.730  0.43443524
  0.5  6          100      1781.730  0.43443524
  0.5  6          150      1781.730  0.43443524
  0.5  7           50      1641.000  0.23225729
  0.5  7          100      1641.000  0.23225728
  0.5  7          150      1641.000  0.23225728
  0.5  8           50      1699.238  0.21094610
  0.5  8          100      1699.238  0.21094606
  0.5  8          150      1699.238  0.21094606

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 150, max_depth = 2, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
RMSE: 1533.846
R2: 0.158

XGBoost variable importance (top 15):
    Feature                    Gain        Cover        Frequency
 1: NCluster_M_AF_1km          0.17376309  0.061663991  0.050000000
 2: TMSD3avg                   0.12382135  0.027822365  0.023684211
 3: NCluster_14_AF_1km         0.11334141  0.011369716  0.010526316
 4: CEC_M_agg30cm_AF_1km       0.10963767  0.005484216  0.005263158
 5: af_agg_30cm_PWP__M_1km     0.10241419  0.006019262  0.010526316
 6: TMNMOD3                    0.07001614  0.003477796  0.002631579
 7: EACKCL_M_agg30cm_AF_1km    0.04576921  0.027688604  0.026315789
 8: VBFMRG5                    0.03628353  0.027956126  0.023684211
 9: Rice_intermed              0.03047935  0.010700910  0.010526316
10: REDL14                     0.02655700  0.002140182  0.002631579
11: M13RB3A01                  0.02264589  0.006955591  0.005263158
12: Wdvi                       0.02208626  0.058587480  0.055263158
13: Zn_M_agg30cm_AF_1km        0.02052617  0.001337614  0.002631579
14: ESMOD5avg                  0.01885873  0.117174960  0.094736842
15: NCluster_4_AF_1km          0.01757172  0.005082932  0.005263158

Ensemble validation
RMSE: 1485.07
R2: 0.175
--------------------------------------
Variable: ymx000

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      14
Number of independent variables:  372
Mtry:                             10
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             1125174
R squared:                        -0.09513549
OOB RMSE:                         1060.742

Variable importance (top 15):
                                    [,1]
af_BDRICM_T__M_1km              156954.6
EXMOD5avg                       144771.8
ECN_M_agg30cm_AF_1km            137104.6
Lai_avg                         125436.7
CEC_M_agg30cm_AF_1km            119591.0
M17NPPALTfill                   114637.1
ENTENV3                         114033.7
M13RB3A04                       109813.9
Cu_M_agg30cm_AF_1km             109757.5
M13NDVIALT                      108831.0
af_agg_ERZD_TAWCpF23mm__M_1km   108312.8
Fcover                          107038.0
NCluster_12_AF_1km              105556.0
SLTPPT_M_agg30cm_AF_1km         104426.4
Fe_M_agg30cm_AF_1km             103634.0

eXtreme Gradient Boosting

14 samples
372 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 10, 9, 9
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2           50      1026.2793  0.2453573
  0.3  2          100      1026.5204  0.2453250
  0.3  2          150      1026.5204  0.2453250
  0.3  3           50      1006.6921  0.2605700
  0.3  3          100      1007.0189  0.2604371
  0.3  3          150      1007.0189  0.2604371
  0.3  4           50       988.7679  0.2816522
  0.3  4          100       989.0805  0.2814981
  0.3  4          150       989.0805  0.2814981
  0.3  5           50       983.2437  0.2875633
  0.3  5          100       983.5149  0.2874457
  0.3  5          150       983.5149  0.2874457
  0.3  6           50       991.4094  0.2742239
  0.3  6          100       991.7678  0.2740687
  0.3  6          150       991.7678  0.2740687
  0.3  7           50       994.2803  0.2762284
  0.3  7          100       994.6281  0.2760563
  0.3  7          150       994.6281  0.2760563
  0.3  8           50      1068.0987  0.2264557
  0.3  8          100      1068.4336  0.2264142
  0.3  8          150      1068.4336  0.2264142
  0.4  2           50       998.3449  0.3052295
  0.4  2          100       998.3591  0.3052239
  0.4  2          150       998.3591  0.3052239
  0.4  3           50      1067.4926  0.2242278
  0.4  3          100      1067.5011  0.2242284
  0.4  3          150      1067.5011  0.2242284
  0.4  4           50      1048.9666  0.2565145
  0.4  4          100      1048.9762  0.2565132
  0.4  4          150      1048.9762  0.2565132
  0.4  5           50       974.0843  0.3011804
  0.4  5          100       974.0930  0.3011797
  0.4  5          150       974.0930  0.3011797
  0.4  6           50      1069.5100  0.2169851
  0.4  6          100      1069.5187  0.2169858
  0.4  6          150      1069.5187  0.2169858
  0.4  7           50       956.1719  0.3300298
  0.4  7          100       956.1795  0.3300298
  0.4  7          150       956.1795  0.3300298
  0.4  8           50      1014.7630  0.2579022
  0.4  8          100      1014.7722  0.2579018
  0.4  8          150      1014.7722  0.2579018
  0.5  2           50       991.3653  0.2747345
  0.5  2          100       991.3658  0.2747342
  0.5  2          150       991.3658  0.2747342
  0.5  3           50       948.2607  0.3352190
  0.5  3          100       948.2609  0.3352189
  0.5  3          150       948.2609  0.3352189
  0.5  4           50      1150.5599  0.1751631
  0.5  4          100      1150.5600  0.1751632
  0.5  4          150      1150.5600  0.1751632
  0.5  5           50      1012.1674  0.2600106
  0.5  5          100      1012.1677  0.2600105
  0.5  5          150      1012.1677  0.2600105
  0.5  6           50      1027.4764  0.2490433
  0.5  6          100      1027.4768  0.2490432
  0.5  6          150      1027.4768  0.2490432
  0.5  7           50      1058.7797  0.2277704
  0.5  7          100      1058.7801  0.2277704
  0.5  7          150      1058.7801  0.2277704
  0.5  8           50      1042.9192  0.2357372
  0.5  8          100      1042.9195  0.2357371
  0.5  8          150      1042.9195  0.2357371

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 3, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 948.261
R2: 0.335

XGBoost variable importance (top 15):
    Feature                        Gain          Cover        Frequency
 1: af_agg_ERZD_TAWCpF23mm__M_1km  3.775595e-01  0.137659784  0.15294118
 2: af_BDRICM_T__M_1km             2.192966e-01  0.269419862  0.24705882
 3: ECN_M_agg30cm_AF_1km           1.986482e-01  0.051130777  0.04705882
 4: ASSDAC3                        1.947386e-01  0.027531957  0.02352941
 5: ENTENV3                        5.145910e-03  0.013765978  0.01176471
 6: af_agg_30cm_AWCpF23__M_1km     4.294248e-03  0.079646018  0.09411765
 7: af_agg_30cm_PWP__M_1km         1.557535e-04  0.192723697  0.16470588
 8: AAIavg_GYGA                    1.472949e-04  0.132743363  0.15294118
 9: af_ERZD__M_1km                 1.325622e-05  0.067846608  0.07058824
10: CEC_M_agg30cm_AF_1km           6.700351e-07  0.011799410  0.01176471
11: Al_M_agg30cm_AF_1km            2.889083e-11  0.004916421  0.01176471
12: B04CHE3                        4.227199e-13  0.010816126  0.01176471
(rows 13-15: NA — the boosted model used only 12 features)

Ensemble validation
RMSE: 993.584
R2: 0.056
--------------------------------------
Variable: fRyld

Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      7
Number of independent variables:  372
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             1542568
R squared:                        -0.1975367
OOB RMSE:                         1242.002

Variable importance (top 15):
                            [,1]
rElevIndex             157937.10
BIO12ALT               155234.25
GAEZ_ratioP_PETsea     116508.27
Lai_avg                106738.13
NCluster_15_AF_1km     102126.84
RANENV3                 96821.29
Rice_actual_baseline    91778.69
C02GLC5                 90526.31
af_BDRICM_T__M_1km      86486.39
GAEZ_LGP                85440.60
CMCF5avg                84379.19
ENAX_M_agg30cm_AF_1km   83479.62
M13RB3A08               77435.76
EXMOD5avg               76941.30
PCHE3avg                76411.27

eXtreme Gradient Boosting

7 samples
372 predictors

No pre-processing
Resampling:
Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 4, 5, 5
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE       Rsquared
  0.3  2           50      1030.1500  0.9621942
  0.3  2          100      1030.3072  0.9621942
  0.3  2          150      1030.3072  0.9621942
  0.3  3           50      1009.6836  0.9621942
  0.3  3          100      1009.8428  0.9621942
  0.3  3          150      1009.8428  0.9621942
  0.3  4           50       952.2733  0.9621942
  0.3  4          100       952.4347  0.9621942
  0.3  4          150       952.4347  0.9621942
  0.3  5           50      1020.5379  0.9621942
  0.3  5          100      1020.6967  0.9621942
  0.3  5          150      1020.6967  0.9621942
  0.3  6           50      1003.9080  0.9621942
  0.3  6          100      1004.0690  0.9621942
  0.3  6          150      1004.0690  0.9621942
  0.3  7           50       978.7971  0.9621942
  0.3  7          100       978.9558  0.9621942
  0.3  7          150       978.9558  0.9621942
  0.3  8           50      1032.2118  0.9621942
  0.3  8          100      1032.3693  0.9621942
  0.3  8          150      1032.3693  0.9621942
  0.4  2           50      1052.4208  0.9621942
  0.4  2          100      1052.4269  0.9621942
  0.4  2          150      1052.4269  0.9621942
  0.4  3           50      1049.4442  0.9621942
  0.4  3          100      1049.4510  0.9621942
  0.4  3          150      1049.4510  0.9621942
  0.4  4           50      1030.8733  0.9621942
  0.4  4          100      1030.8800  0.9621942
  0.4  4          150      1030.8800  0.9621942
  0.4  5           50      1015.6117  0.9621942
  0.4  5          100      1015.6185  0.9621942
  0.4  5          150      1015.6185  0.9621942
  0.4  6           50       989.9549  0.9621942
  0.4  6          100       989.9617  0.9621942
  0.4  6          150       989.9617  0.9621942
  0.4  7           50       990.1993  0.9621942
  0.4  7          100       990.2061  0.9621942
  0.4  7          150       990.2061  0.9621942
  0.4  8           50      1002.2176  0.9621942
  0.4  8          100      1002.2244  0.9621942
  0.4  8          150      1002.2244  0.9621942
  0.5  2           50      1041.1224  0.9621942
  0.5  2          100      1041.1226  0.9621942
  0.5  2          150      1041.1226  0.9621942
  0.5  3           50      1052.7270  0.9621942
  0.5  3          100      1052.7272  0.9621942
  0.5  3          150      1052.7272  0.9621942
  0.5  4           50       990.8526  0.9621942
  0.5  4          100       990.8527  0.9621942
  0.5  4          150       990.8527  0.9621942
  0.5  5           50       885.8870  0.9621942
  0.5  5          100       885.8872  0.9621942
  0.5  5          150       885.8872  0.9621942
  0.5  6           50       962.5159  0.9621942
  0.5  6          100       962.5161  0.9621942
  0.5  6          150       962.5161  0.9621942
  0.5  7           50       980.5664  0.9621942
  0.5  7          100       980.5666  0.9621942
  0.5  7          150       980.5666  0.9621942
  0.5  8           50      1033.1970  0.9621942
  0.5  8          100      1033.1972  0.9621942
  0.5  8          150      1033.1972  0.9621942

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 5, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 885.887
R2: 0.962

XGBoost variable importance (top 15):
   Feature                        Gain          Cover       Frequency
1: af_agg_30cm_AWCpF23__M_1km     7.544602e-01  0.07054674  0.07608696
2: af_BDRICM_T__M_1km             2.428591e-01  0.18518519  0.16304348
3: ASSDAC3                        1.534911e-03  0.02469136  0.02173913
4: AAIavg_GYGA                    1.130698e-03  0.47619048  0.48913043
5: af_agg_30cm_PWP__M_1km         1.310072e-05  0.21164021  0.21739130
6: af_agg_ERZD_TAWCpF23mm__M_1km  2.024005e-06  0.03174603  0.03260870
(rows 7-15: NA — the boosted model used only 6 features)

Ensemble validation
RMSE: 1109.143
R2: 0.001
--------------------------------------
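The log above does not show the fitting code itself, only its printed output. Below is a minimal R sketch of the per-variable loop the output implies: the `ranger` call, CV settings, and tuning grid are taken directly from the printed summaries, while the object names (`dfs`, `formulaString.lst`, `t.mrfX`) are copied from the printed `Call:`. The `subsample` entry (not shown in the log) and the simple prediction averaging for the "Ensemble validation" line are assumptions.

```r
library(caret)   # train(), trainControl()
library(ranger)  # ranger()

fit_one_variable <- function(formula, dfs) {
  # 3-fold CV, repeated once, as in "Cross-Validated (3 fold, repeated 1 times)"
  ctrl <- trainControl(method = "repeatedcv", number = 3, repeats = 1)

  # Tune the random forest first; t.mrfX$bestTune$mtry feeds the final ranger fit
  t.mrfX <- train(formula, data = dfs, method = "ranger", trControl = ctrl)

  # Final fit, matching the printed Call: (impurity importance, 500 trees)
  m.rf <- ranger(formula, data = dfs, importance = "impurity",
                 write.forest = TRUE, mtry = t.mrfX$bestTune$mtry,
                 num.trees = 500)

  # XGBoost grid matching the printed tuning table (3 x 7 x 3 = 63 rows);
  # subsample = 1 is an assumption required by current caret's xgbTree method
  grid <- expand.grid(eta = c(0.3, 0.4, 0.5), max_depth = 2:8,
                      nrounds = c(50, 100, 150), gamma = 0,
                      colsample_bytree = 0.8, min_child_weight = 1,
                      subsample = 1)
  m.xgb <- train(formula, data = dfs, method = "xgbTree",
                 trControl = ctrl, tuneGrid = grid)

  # "Ensemble validation": a plain average of the two predictions (assumption)
  pred <- (predict(m.rf, dfs)$predictions + predict(m.xgb, dfs)) / 2
  list(rf = m.rf, xgb = m.xgb, ensemble_pred = pred)
}
```

Note that with sample sizes of 7-26 observations against 372 predictors, both the negative OOB R-squared values and the degenerate constant Rsquared column in the last tuning table are expected symptoms of far too few samples per fold.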