Variable: YA
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             101998.1
R squared:                        0.1139767

OOB RMSE: 319.371

Variable importance:
                         [,1]
MAXENV3              49809.70
SW2L00               47781.12
NCluster_6_AF_1km    43973.83
Ca_M_agg30cm_AF_1km  43071.50
EVEENV3              40270.90
M13RB1ALT            39772.71
NCluster_4_AF_1km    39042.96
Maize_intermed       37316.39
POSMRG5              37074.95
PCHE3avg             37023.59
Temperature          36828.06
M02MOD4              36477.31
TMOD3avg             36437.28
Water_balance        36156.36
B13CHE3              35333.78

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 22, 24, 22
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      340.3182  0.12814619
  0.3  2          100      340.3155  0.12825059
  0.3  2          150      340.3154  0.12825072
  0.3  3           50      330.4091  0.14996384
  0.3  3          100      330.3942  0.15001628
  0.3  3          150      330.3942  0.15001628
  0.3  4           50      357.4032  0.05252469
  0.3  4          100      357.4008  0.05253547
  0.3  4          150      357.4008  0.05253547
  0.3  5           50      336.7455  0.11345238
  0.3  5          100      336.7463  0.11345266
  0.3  5          150      336.7463  0.11345266
  0.3  6           50      349.9395  0.05154231
  0.3  6          100      349.9408  0.05153376
  0.3  6          150      349.9408  0.05153376
  0.3  7           50      347.4954  0.08996219
  0.3  7          100      347.4970  0.08996197
  0.3  7          150      347.4970  0.08996197
  0.3  8           50      333.3672  0.13961645
  0.3  8          100      333.3696  0.13962226
  0.3  8          150      333.3696  0.13962226
  0.4  2           50      382.5747  0.05518692
  0.4  2          100      382.5768  0.05518661
  0.4  2          150      382.5768  0.05518661
  0.4  3           50      333.4137  0.12138874
  0.4  3          100      333.4138  0.12139045
  0.4  3          150      333.4138  0.12139045
  0.4  4           50      353.8558  0.06750637
  0.4  4          100      353.8558  0.06750572
  0.4  4          150      353.8558  0.06750572
  0.4  5           50      343.8505  0.06415432
  0.4  5          100      343.8504  0.06415437
  0.4  5          150      343.8504  0.06415437
  0.4  6           50      347.8365  0.05736403
  0.4  6          100      347.8363  0.05736441
  0.4  6          150      347.8363  0.05736441
  0.4  7           50      343.4025  0.05727717
  0.4  7          100      343.4021  0.05727813
  0.4  7          150      343.4021  0.05727813
  0.4  8           50      337.9275  0.09839150
  0.4  8          100      337.9278  0.09839011
  0.4  8          150      337.9278  0.09839011
  0.5  2           50      382.1061  0.07295780
  0.5  2          100      382.1064  0.07295754
  0.5  2          150      382.1064  0.07295754
  0.5  3           50      351.9197  0.07264392
  0.5  3          100      351.9197  0.07264392
  0.5  3          150      351.9197  0.07264392
  0.5  4           50      359.5721  0.04167844
  0.5  4          100      359.5721  0.04167844
  0.5  4          150      359.5721  0.04167844
  0.5  5           50      369.9785  0.10106177
  0.5  5          100      369.9785  0.10106177
  0.5  5          150      369.9785  0.10106177
  0.5  6           50      386.2365  0.04973464
  0.5  6          100      386.2365  0.04973463
  0.5  6          150      386.2365  0.04973463
  0.5  7           50      391.8329  0.04747290
  0.5  7          100      391.8329  0.04747290
  0.5  7          150      391.8329  0.04747290
  0.5  8           50      369.1049  0.06026437
  0.5  8          100      369.1049  0.06026437
  0.5  8          150      369.1049  0.06026437

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 100, max_depth = 3, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.
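
The selection rule stated above ("RMSE was used to select the optimal model using the smallest value") can be illustrated with a minimal Python sketch. The rows are copied from the YA tuning grid; only a subset that includes the lowest-RMSE rows is listed. The function name `select_best` is hypothetical, and the tie handling (keep the first row when RMSE values are equal at printed precision) is an assumption of this sketch, not a description of how the tuning software resolves ties.

```python
# Rows copied from the YA tuning grid: (eta, max_depth, nrounds, RMSE).
# Subset only; it includes the rows with the lowest RMSE in the full grid.
grid = [
    (0.3, 2, 50, 340.3182),
    (0.3, 2, 100, 340.3155),
    (0.3, 2, 150, 340.3154),
    (0.3, 3, 50, 330.4091),
    (0.3, 3, 100, 330.3942),
    (0.3, 3, 150, 330.3942),
    (0.3, 8, 50, 333.3672),
    (0.4, 3, 50, 333.4137),
]


def select_best(rows):
    """Return the row with the smallest RMSE.

    Python's min() returns the first minimal element, so when two rows
    print the same RMSE the earlier one (here, fewer rounds) is kept.
    This tie rule is an assumption made for the sketch.
    """
    return min(rows, key=lambda row: row[3])


best = select_best(grid)
print("eta=%s max_depth=%s nrounds=%s RMSE=%s" % best)
```

With these rows the sketch picks eta = 0.3, max_depth = 3, nrounds = 100, which agrees with the final values reported for YA.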
RMSE: 330.394
R2: 0.15

XGBoost variable importance:
     Feature                 Gain        Cover        Frequency
 1:  LRI_M_agg30cm_AF_1km    0.28819600  0.011819235  0.007594937
 2:  Ca_M_agg30cm_AF_1km     0.13178860  0.019814600  0.015189873
 3:  BIO1ALT                 0.09777676  0.010196987  0.010126582
 4:  RANENV3                 0.09643041  0.044495944  0.037974684
 5:  Mn_M_agg30cm_AF_1km     0.07036719  0.022016222  0.015189873
 6:  Na_M_agg30cm_AF_1km     0.04952297  0.020509849  0.017721519
 7:  Fe_M_agg30cm_AF_1km     0.04117421  0.006257242  0.005063291
 8:  rElev                   0.03867157  0.027694090  0.022784810
 9:  BARL10                  0.03390969  0.142757822  0.093670886
10:  Cu_M_agg30cm_AF_1km     0.03142159  0.011819235  0.007594937
11:  DEMENV5                 0.02657643  0.001622248  0.005063291
12:  af_agg_30cm_PWP__M_1km  0.01460392  0.015527231  0.027848101
13:  NCluster_4_AF_1km       0.01450878  0.025492468  0.017721519
14:  ESMOD5avg               0.01025184  0.017728853  0.012658228
15:  LCEE10c11               0.00865107  0.001622248  0.002531646

Ensemble validation RMSE: 317.714
R2: 0.113
--------------------------------------
Variable: YW
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             10
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             7952277
R squared:                        0.274502

OOB RMSE: 2819.978

Variable importance:
                                                   [,1]
Ca_M_agg30cm_AF_1km                             7002549
NCluster_20_AF_1km                              6419252
PRSCHE3                                         5722597
BIO12ALT                                        5397037
NCluster_6_AF_1km                               5170397
af_agg_30cm_TETAs__M_1km                        5015539
SW1L00                                          4962922
BLDFIE_M_agg30cm_AF_1km                         4658686
B14CHE3                                         4631958
GAEZ_NPP                                        4590620
NCluster_16_AF_1km                              4501405
Temperature                                     4420290
GAEZ_LGP                                        4383764
M43BVALT                                        4347222
LSTD_avgIRI_Jul2002_Sep2016_mosaicLAEA_celsius  4093462

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 23, 23, 22
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      3184.506  0.1451617
  0.3  2          100      3184.552  0.1451341
  0.3  2          150      3184.550  0.1451352
  0.3  3           50      3329.823  0.1810264
  0.3  3          100      3330.014  0.1810102
  0.3  3          150      3330.015  0.1810108
  0.3  4           50      3238.031  0.1284840
  0.3  4          100      3238.045  0.1285523
  0.3  4          150      3238.044  0.1285533
  0.3  5           50      3064.601  0.1987337
  0.3  5          100      3064.495  0.1987999
  0.3  5          150      3064.492  0.1988011
  0.3  6           50      3081.392  0.1836624
  0.3  6          100      3081.326  0.1836750
  0.3  6          150      3081.324  0.1836755
  0.3  7           50      3145.326  0.1832677
  0.3  7          100      3145.172  0.1833735
  0.3  7          150      3145.168  0.1833756
  0.3  8           50      3083.762  0.1680258
  0.3  8          100      3083.727  0.1680884
  0.3  8          150      3083.725  0.1680896
  0.4  2           50      2724.038  0.4257605
  0.4  2          100      2724.015  0.4257750
  0.4  2          150      2724.010  0.4257772
  0.4  3           50      3257.415  0.1250568
  0.4  3          100      3257.412  0.1250590
  0.4  3          150      3257.409  0.1250598
  0.4  4           50      3141.534  0.1696510
  0.4  4          100      3141.541  0.1696485
  0.4  4          150      3141.539  0.1696494
  0.4  5           50      3186.649  0.1316057
  0.4  5          100      3186.642  0.1316095
  0.4  5          150      3186.639  0.1316105
  0.4  6           50      3170.289  0.1447476
  0.4  6          100      3170.291  0.1447509
  0.4  6          150      3170.290  0.1447521
  0.4  7           50      3003.302  0.2253872
  0.4  7          100      3003.296  0.2253899
  0.4  7          150      3003.294  0.2253908
  0.4  8           50      3056.082  0.2036167
  0.4  8          100      3056.082  0.2036206
  0.4  8          150      3056.080  0.2036218
  0.5  2           50      2826.630  0.3555623
  0.5  2          100      2826.617  0.3555679
  0.5  2          150      2826.613  0.3555687
  0.5  3           50      2941.425  0.2463963
  0.5  3          100      2941.423  0.2463966
  0.5  3          150      2941.420  0.2463971
  0.5  4           50      3193.043  0.1765151
  0.5  4          100      3193.042  0.1765158
  0.5  4          150      3193.041  0.1765166
  0.5  5           50      3283.674  0.1314201
  0.5  5          100      3283.674  0.1314209
  0.5  5          150      3283.674  0.1314217
  0.5  6           50      3063.717  0.1822198
  0.5  6          100      3063.715  0.1822204
  0.5  6          150      3063.712  0.1822210
  0.5  7           50      2987.048  0.2487338
  0.5  7          100      2987.046  0.2487339
  0.5  7          150      2987.043  0.2487340
  0.5  8           50      3045.929  0.1981894
  0.5  8          100      3045.927  0.1981907
  0.5  8          150      3045.925  0.1981920

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 150, max_depth = 2, eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 2724.01
R2: 0.426

XGBoost variable importance:
     Feature                   Gain        Cover        Frequency
 1:  BIO12ALT                  0.28986714  0.005858405  0.007915567
 2:  Maize_actualbaseline      0.15895890  0.019859001  0.015831135
 3:  Ca_M_agg30cm_AF_1km       0.09981280  0.007943600  0.007915567
 4:  Wdvi                      0.08721422  0.024923046  0.021108179
 5:  Al_M_agg30cm_AF_1km       0.05071594  0.011816106  0.018469657
 6:  af_agg_30cm_TETAs__M_1km  0.04687744  0.009631616  0.007915567
 7:  Slopeclassc3              0.03654210  0.010128091  0.007915567
 8:  Slopeclassc2              0.03289229  0.006752060  0.005277045
 9:  NMOD3avg                  0.02868141  0.009433026  0.013192612
10:  NIRL14                    0.02177568  0.001390130  0.002638522
11:  M13RB1A08                 0.01637261  0.014199186  0.013192612
12:  M13RB3A08                 0.01565928  0.011418926  0.010554090
13:  M13RB3ALT                 0.01407083  0.003574620  0.007915567
14:  C08GLC5                   0.01376690  0.021844901  0.021108179
15:  B14CHE3                   0.01200341  0.001985900  0.002638522

Ensemble validation RMSE: 2695.826
R2: 0.333
--------------------------------------
Variable: YW_SD
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      34
Number of independent variables:  377
Mtry:                             18
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             12587618
R squared:                        0.2539374

OOB RMSE: 3547.903

Variable importance:
                              [,1]
SW1L00                    14150886
CHIRPSA                   13440184
MMOD4avg                  10169034
REDL00                     9792696
af_agg_30cm_TETAs__M_1km   9189087
M17NPPALTfill              9177758
SW2L00                     9171570
M43BSALT                   8610880
Al_M_agg30cm_AF_1km        8413703
Ca_M_agg30cm_AF_1km        7940473
M43WNALT                   7784251
Fcover                     6977463
Zn_M_agg30cm_AF_1km        6787631
BLDFIE_M_agg30cm_AF_1km    6777648
M43WSALT                   6760409

eXtreme Gradient Boosting

34 samples
377 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 22, 23, 23
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      3892.012  0.21278797
  0.3  2          100      3892.366  0.21272419
  0.3  2          150      3892.366  0.21272433
  0.3  3           50      4111.127  0.09817792
  0.3  3          100      4111.229  0.09815660
  0.3  3          150      4111.227  0.09815703
  0.3  4           50      4233.596  0.06951602
  0.3  4          100      4233.818  0.06947498
  0.3  4          150      4233.817  0.06947513
  0.3  5           50      3829.154  0.24080144
  0.3  5          100      3829.025  0.24088285
  0.3  5          150      3829.023  0.24088366
  0.3  6           50      4282.128  0.08571491
  0.3  6          100      4282.012  0.08579207
  0.3  6          150      4282.010  0.08579291
  0.3  7           50      3798.335  0.24793079
  0.3  7          100      3798.221  0.24802553
  0.3  7          150      3798.221  0.24802530
  0.3  8           50      4083.220  0.15464457
  0.3  8          100      4083.043  0.15473931
  0.3  8          150      4083.040  0.15474019
  0.4  2           50      3766.534  0.25941512
  0.4  2          100      3766.569  0.25939982
  0.4  2          150      3766.568  0.25939947
  0.4  3           50      4117.195  0.14561404
  0.4  3          100      4117.186  0.14561935
  0.4  3          150      4117.185  0.14561959
  0.4  4           50      4116.905  0.13911046
  0.4  4          100      4116.911  0.13910919
  0.4  4          150      4116.909  0.13911009
  0.4  5           50      4009.731  0.14651872
  0.4  5          100      4009.721  0.14652419
  0.4  5          150      4009.719  0.14652503
  0.4  6           50      4497.375  0.08052092
  0.4  6          100      4497.367  0.08052852
  0.4  6          150      4497.364  0.08052947
  0.4  7           50      4080.071  0.18812521
  0.4  7          100      4080.065  0.18812943
  0.4  7          150      4080.064  0.18812992
  0.4  8           50      4180.102  0.08695757
  0.4  8          100      4180.093  0.08696206
  0.4  8          150      4180.093  0.08696195
  0.5  2           50      4005.648  0.13765966
  0.5  2          100      4005.645  0.13766072
  0.5  2          150      4005.645  0.13766072
  0.5  3           50      4031.907  0.15128818
  0.5  3          100      4031.906  0.15128894
  0.5  3          150      4031.904  0.15128985
  0.5  4           50      4052.549  0.15563333
  0.5  4          100      4052.547  0.15563432
  0.5  4          150      4052.545  0.15563529
  0.5  5           50      4169.760  0.08561522
  0.5  5          100      4169.758  0.08561615
  0.5  5          150      4169.756  0.08561704
  0.5  6           50      4121.484  0.12913741
  0.5  6          100      4121.482  0.12913844
  0.5  6          150      4121.480  0.12913947
  0.5  7           50      4194.428  0.14434782
  0.5  7          100      4194.426  0.14434835
  0.5  7          150      4194.425  0.14434880
  0.5  8           50      4136.746  0.11493390
  0.5  8          100      4136.745  0.11493466
  0.5  8          150      4136.743  0.11493549

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 2, eta = 0.4, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 3766.534
R2: 0.259

XGBoost variable importance:
     Feature               Gain        Cover        Frequency
 1:  B14CHE3               0.24530825  0.014792899  0.014184397
 2:  Maize_actualbaseline  0.13943928  0.011242604  0.014184397
 3:  NCluster_6_AF_1km     0.11574418  0.019822485  0.014184397
 4:  GAEZ_NPP              0.09473638  0.003550296  0.007092199
 5:  MMOD4avg              0.06056243  0.010059172  0.007092199
 6:  NCluster_4_AF_1km     0.04348980  0.013609467  0.014184397
 7:  M43BSALT              0.04248088  0.020118343  0.014184397
 8:  Wdvi                  0.04088890  0.040236686  0.028368794
 9:  Zn_M_agg30cm_AF_1km   0.03274240  0.031065089  0.028368794
10:  Ca_M_agg30cm_AF_1km   0.03215390  0.021893491  0.021276596
11:  M02MOD4               0.02649481  0.010355030  0.014184397
12:  SW1L14                0.01765272  0.010059172  0.007092199
13:  MAXENV3               0.01474098  0.009171598  0.007092199
14:  Na_M_agg30cm_AF_1km   0.01404684  0.058579882  0.049645390
15:  AAIavg_GYGA           0.01126712  0.005621302  0.035460993

Ensemble validation RMSE: 3623.802
R2: 0.204
--------------------------------------
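
For regression forests, ranger reports the OOB prediction error as a mean squared error, so each "OOB RMSE" line in the Ranger results above should equal its square root. The following Python sketch checks that relationship on the three reported pairs. The helper name `oob_rmse` is hypothetical, and the assumption that the log computes OOB RMSE this way is inferred from the numbers, not stated in the output.

```python
import math

# (variable, OOB prediction error, reported OOB RMSE), copied from the log.
reported = [
    ("YA", 101998.1, 319.371),
    ("YW", 7952277.0, 2819.978),
    ("YW_SD", 12587618.0, 3547.903),
]


def oob_rmse(oob_mse):
    """Square root of the out-of-bag mean squared error."""
    return math.sqrt(oob_mse)


for name, mse, rmse in reported:
    # Differences should stay below the rounding of the printed values.
    diff = abs(oob_rmse(mse) - rmse)
    print(f"{name}: sqrt({mse}) = {oob_rmse(mse):.3f}, reported {rmse}, diff {diff:.4f}")
```

All three pairs agree to the three decimals printed in the log, which supports reading the OOB prediction error as an MSE on the scale of the squared target variable.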