Variable: yControl000
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      82
Number of independent variables:  372
Mtry:                             6
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             79573.69
R squared:                        0.3247622

OOB RMSE: 282.088

Variable importance:
                           [,1]
ENTENV3               173178.76
P.T_M_agg30cm_AF_1km  135798.52
EVEENV3               125688.89
N_M_agg30cm_AF_1km    109005.16
MANMCF5               105632.39
IMOD4avg              101341.51
M13RB1A01              93130.05
RANENV3                92621.20
Zn_M_agg30cm_AF_1km    92418.29
B_M_agg30cm_AF_1km     89388.38
REDL14                 89089.31
MAXENV3                87281.58
CMCF5avg               83889.20
SW2L00                 83692.60
EXBX_M_agg30cm_AF_1km  81824.95

eXtreme Gradient Boosting

82 samples
372 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 56, 54, 54
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      292.3543  0.3479042
  0.3  2          100      293.0167  0.3474811
  0.3  2          150      293.0271  0.3474856
  0.3  3           50      308.8528  0.3076790
  0.3  3          100      308.9275  0.3076555
  0.3  3          150      308.9276  0.3076554
  0.3  4           50      324.1985  0.3205565
  0.3  4          100      324.2158  0.3205825
  0.3  4          150      324.2158  0.3205825
  0.3  5           50      325.8871  0.3297620
  0.3  5          100      325.8998  0.3297991
  0.3  5          150      325.8998  0.3297991
  0.3  6           50      324.5591  0.3079955
  0.3  6          100      324.5697  0.3080492
  0.3  6          150      324.5697  0.3080492
  0.3  7           50      330.9029  0.2976964
  0.3  7          100      330.9113  0.2977535
  0.3  7          150      330.9113  0.2977535
  0.3  8           50      315.1729  0.3319289
  0.3  8          100      315.1822  0.3319838
  0.3  8          150      315.1822  0.3319838
  0.4  2           50      297.1765  0.3258370
  0.4  2          100      297.2652  0.3259284
  0.4  2          150      297.2651  0.3259289
  0.4  3           50      306.1021  0.3060457
  0.4  3          100      306.1078  0.3060420
  0.4  3          150      306.1078  0.3060420
  0.4  4           50      319.8300  0.3075929
  0.4  4          100      319.8311  0.3075941
  0.4  4          150      319.8311  0.3075941
  0.4  5           50      327.3044  0.3135183
  0.4  5          100      327.3049  0.3135205
  0.4  5          150      327.3049  0.3135205
  0.4  6           50      324.1915  0.3138366
  0.4  6          100      324.1922  0.3138385
  0.4  6          150      324.1922  0.3138385
  0.4  7           50      323.0954  0.3311902
  0.4  7          100      323.0959  0.3311922
  0.4  7          150      323.0959  0.3311922
  0.4  8           50      311.8202  0.3130925
  0.4  8          100      311.8208  0.3130947
  0.4  8          150      311.8208  0.3130947
  0.5  2           50      302.9037  0.3085183
  0.5  2          100      302.9127  0.3085280
  0.5  2          150      302.9127  0.3085280
  0.5  3           50      315.3066  0.3093170
  0.5  3          100      315.3068  0.3093172
  0.5  3          150      315.3068  0.3093172
  0.5  4           50      320.3082  0.3235459
  0.5  4          100      320.3082  0.3235458
  0.5  4          150      320.3082  0.3235458
  0.5  5           50      330.5220  0.2862642
  0.5  5          100      330.5220  0.2862642
  0.5  5          150      330.5220  0.2862642
  0.5  6           50      323.1975  0.3098371
  0.5  6          100      323.1975  0.3098372
  0.5  6          150      323.1975  0.3098372
  0.5  7           50      307.8031  0.3302540
  0.5  7          100      307.8031  0.3302540
  0.5  7          150      307.8031  0.3302540
  0.5  8           50      310.9659  0.3147316
  0.5  8          100      310.9659  0.3147316
  0.5  8          150      310.9659  0.3147316

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 2, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 292.354
R2: 0.348

XGBoost variable importance:
                   Feature       Gain       Cover   Frequency
 1:               M43WNALT 0.17817768 0.020370598 0.023809524
 2:                ENTENV3 0.15683642 0.050312922 0.039682540
 3:               NMSD3avg 0.13035663 0.010062584 0.007936508
 4:                EVEENV3 0.13025500 0.090563259 0.071428571
 5:                 REDL14 0.04897308 0.009817155 0.007936508
 6: Millet_actual_baseline 0.04378272 0.021965885 0.023809524
 7:          M17GPPALTfill 0.03509624 0.007117438 0.007936508
 8:          PHIHOXagg0_30 0.02943417 0.002945147 0.007936508
 9:  EXBX_M_agg30cm_AF_1km 0.02860392 0.054976071 0.047619048
10:     B_M_agg30cm_AF_1km 0.02687787 0.009817155 0.007936508
11:     N_M_agg30cm_AF_1km 0.02472875 0.029328752 0.023809524
12:    Zn_M_agg30cm_AF_1km 0.02248856 0.039268622 0.031746032
13:    Mn_M_agg30cm_AF_1km 0.02037160 0.009817155 0.007936508
14:                BIO1ALT 0.01848809 0.037673334 0.031746032
15:               BIO12ALT 0.01250083 0.009449012 0.007936508

Ensemble validation RMSE: 273.806
R2: 0.36
--------------------------------------
Variable: ymx000
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      97
Number of independent variables:  372
Mtry:                             18
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             254368.5
R squared:                        0.3469892

OOB RMSE: 504.35

Variable importance:
                              [,1]
Fe_M_agg30cm_AF_1km      1301383.2
ENTENV3                   786096.8
NCluster_16_AF_1km        768296.2
MAXENV3                   730835.1
P.T_M_agg30cm_AF_1km      684064.5
BIO1ALT                   637155.3
B14CHE3                   587931.9
af_agg_30cm_TETAs__M_1km  572849.3
LRI_M_agg30cm_AF_1km      566288.4
Ca_M_agg30cm_AF_1km       557410.9
M13NDVIA04                539230.5
C03GLC5                   534783.7
EXBX_M_agg30cm_AF_1km     521396.8
IMOD4avg                  487809.7
EACKCL_M_agg30cm_AF_1km   441917.6

eXtreme Gradient Boosting

97 samples
372 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 64, 65, 65
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      463.6504  0.5323388
  0.3  2          100      463.6339  0.5323950
  0.3  2          150      463.6335  0.5323957
  0.3  3           50      460.5342  0.5380756
  0.3  3          100      460.5312  0.5381120
  0.3  3          150      460.5312  0.5381120
  0.3  4           50      461.3153  0.5371348
  0.3  4          100      461.3101  0.5371618
  0.3  4          150      461.3101  0.5371618
  0.3  5           50      460.0508  0.5390850
  0.3  5          100      460.0545  0.5391026
  0.3  5          150      460.0545  0.5391026
  0.3  6           50      461.1404  0.5368381
  0.3  6          100      461.1443  0.5368557
  0.3  6          150      461.1443  0.5368557
  0.3  7           50      462.7445  0.5338515
  0.3  7          100      462.7488  0.5338706
  0.3  7          150      462.7488  0.5338706
  0.3  8           50      492.2612  0.4976657
  0.3  8          100      492.2535  0.4977022
  0.3  8          150      492.2535  0.4977022
  0.4  2           50      462.0130  0.5357741
  0.4  2          100      462.0138  0.5357593
  0.4  2          150      462.0138  0.5357593
  0.4  3           50      460.5143  0.5393571
  0.4  3          100      460.5146  0.5393575
  0.4  3          150      460.5146  0.5393575
  0.4  4           50      463.0478  0.5335280
  0.4  4          100      463.0479  0.5335289
  0.4  4          150      463.0479  0.5335289
  0.4  5           50      462.7523  0.5334845
  0.4  5          100      462.7524  0.5334854
  0.4  5          150      462.7524  0.5334854
  0.4  6           50      461.1329  0.5365486
  0.4  6          100      461.1330  0.5365495
  0.4  6          150      461.1330  0.5365495
  0.4  7           50      459.6145  0.5403179
  0.4  7          100      459.6146  0.5403187
  0.4  7          150      459.6146  0.5403187
  0.4  8           50      468.0259  0.5277785
  0.4  8          100      468.0263  0.5277788
  0.4  8          150      468.0263  0.5277788
  0.5  2           50      461.4957  0.5359806
  0.5  2          100      461.4961  0.5359788
  0.5  2          150      461.4961  0.5359788
  0.5  3           50      463.1721  0.5336983
  0.5  3          100      463.1721  0.5336983
  0.5  3          150      463.1721  0.5336983
  0.5  4           50      468.7969  0.5261951
  0.5  4          100      468.7969  0.5261951
  0.5  4          150      468.7969  0.5261951
  0.5  5           50      467.0036  0.5247573
  0.5  5          100      467.0036  0.5247573
  0.5  5          150      467.0036  0.5247573
  0.5  6           50      460.0953  0.5392467
  0.5  6          100      460.0953  0.5392467
  0.5  6          150      460.0953  0.5392467
  0.5  7           50      457.7187  0.5448076
  0.5  7          100      457.7187  0.5448076
  0.5  7          150      457.7187  0.5448076
  0.5  8           50      457.9808  0.5427786
  0.5  8          100      457.9808  0.5427786
  0.5  8          150      457.9808  0.5427786

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 100, max_depth = 7, eta = 0.5, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 457.719
R2: 0.545

XGBoost variable importance:
                    Feature         Gain       Cover   Frequency
 1:                 BIO1ALT 0.3495376739 0.032390605 0.024038462
 2:     Fe_M_agg30cm_AF_1km 0.2980978917 0.018009998 0.014423077
 3:             AAIavg_GYGA 0.1531604918 0.003423954 0.043269231
 4:                 GTDHYS3 0.0540571247 0.023488324 0.019230769
 5:     Ca_M_agg30cm_AF_1km 0.0441660394 0.166815038 0.125000000
 6:   EXBX_M_agg30cm_AF_1km 0.0384567272 0.006642471 0.004807692
 7:                 ASSDAC3 0.0196214652 0.016092584 0.019230769
 8:     Na_M_agg30cm_AF_1km 0.0158569118 0.040608094 0.038461538
 9:              LCEE10c190 0.0090669410 0.025885092 0.019230769
10:                 C01GLC5 0.0064795314 0.007738136 0.009615385
11:      B_M_agg30cm_AF_1km 0.0029334317 0.005683764 0.019230769
12: BLDFIE_M_agg30cm_AF_1km 0.0022556108 0.021639389 0.028846154
13:                 B13CHE3 0.0022290691 0.005204410 0.004807692
14:       NCluster_M_AF_1km 0.0019892845 0.011572964 0.014423077
15:      K_M_agg30cm_AF_1km 0.0005982639 0.033212354 0.028846154

Ensemble validation RMSE: 470.116
R2: 0.43
--------------------------------------
Variable: fRyld
Ranger result

Call:
 ranger(formulaString.lst[[j]], data = dfs, importance = "impurity", write.forest = TRUE, mtry = t.mrfX$bestTune$mtry, num.trees = 500)

Type:                             Regression
Number of trees:                  500
Sample size:                      33
Number of independent variables:  372
Mtry:                             14
Target node size:                 5
Variable importance mode:         impurity
OOB prediction error:             167163.9
R squared:                        0.6184794

OOB RMSE: 408.857

Variable importance:
                            [,1]
EXBX_M_agg30cm_AF_1km   572910.8
IMOD4avg                503095.8
NCluster_16_AF_1km      465556.0
RANENV3                 464897.6
Ca_M_agg30cm_AF_1km     451117.7
Zn_M_agg30cm_AF_1km     403369.6
B14CHE3                 380228.9
N_M_agg30cm_AF_1km      378362.6
ENTENV3                 353751.1
Fe_M_agg30cm_AF_1km     343985.8
EVEENV3                 338239.0
EACKCL_M_agg30cm_AF_1km 325958.6
P.T_M_agg30cm_AF_1km    319154.3
POSMRG5                 305026.2
M17NPPALTfill           268804.0

eXtreme Gradient Boosting

33 samples
372 predictors

No pre-processing
Resampling: Cross-Validated (3 fold, repeated 1 times)
Summary of sample sizes: 23, 22, 21
Resampling results across tuning parameters:

  eta  max_depth  nrounds  RMSE      Rsquared
  0.3  2           50      555.9972  0.3482769
  0.3  2          100      556.0943  0.3483263
  0.3  2          150      556.0943  0.3483263
  0.3  3           50      521.5728  0.3782138
  0.3  3          100      521.6528  0.3782721
  0.3  3          150      521.6528  0.3782721
  0.3  4           50      582.2447  0.3281721
  0.3  4          100      582.3379  0.3282431
  0.3  4          150      582.3379  0.3282431
  0.3  5           50      510.7806  0.3889806
  0.3  5          100      510.8639  0.3890308
  0.3  5          150      510.8639  0.3890308
  0.3  6           50      592.1380  0.3184533
  0.3  6          100      592.2438  0.3185136
  0.3  6          150      592.2438  0.3185136
  0.3  7           50      562.0029  0.3414213
  0.3  7          100      562.0871  0.3414904
  0.3  7          150      562.0871  0.3414904
  0.3  8           50      576.1075  0.3301947
  0.3  8          100      576.1881  0.3302714
  0.3  8          150      576.1881  0.3302714
  0.4  2           50      558.2675  0.3512822
  0.4  2          100      558.2736  0.3512836
  0.4  2          150      558.2736  0.3512836
  0.4  3           50      602.8446  0.3099323
  0.4  3          100      602.8493  0.3099357
  0.4  3          150      602.8493  0.3099357
  0.4  4           50      604.4042  0.3079429
  0.4  4          100      604.4089  0.3079460
  0.4  4          150      604.4089  0.3079460
  0.4  5           50      580.2560  0.3251522
  0.4  5          100      580.2606  0.3251553
  0.4  5          150      580.2606  0.3251553
  0.4  6           50      555.4656  0.3480592
  0.4  6          100      555.4702  0.3480617
  0.4  6          150      555.4702  0.3480617
  0.4  7           50      610.0099  0.3016880
  0.4  7          100      610.0136  0.3016919
  0.4  7          150      610.0136  0.3016919
  0.4  8           50      591.3518  0.3168983
  0.4  8          100      591.3564  0.3169015
  0.4  8          150      591.3564  0.3169015
  0.5  2           50      543.9157  0.3579671
  0.5  2          100      543.9160  0.3579670
  0.5  2          150      543.9160  0.3579670
  0.5  3           50      604.9489  0.3052039
  0.5  3          100      604.9491  0.3052039
  0.5  3          150      604.9491  0.3052039
  0.5  4           50      599.9746  0.3094068
  0.5  4          100      599.9748  0.3094068
  0.5  4          150      599.9748  0.3094068
  0.5  5           50      580.4859  0.3255066
  0.5  5          100      580.4860  0.3255066
  0.5  5          150      580.4860  0.3255066
  0.5  6           50      563.4565  0.3410000
  0.5  6          100      563.4567  0.3410001
  0.5  6          150      563.4567  0.3410001
  0.5  7           50      539.3267  0.3657868
  0.5  7          100      539.3269  0.3657868
  0.5  7          150      539.3269  0.3657868
  0.5  8           50      635.9572  0.2817088
  0.5  8          100      635.9573  0.2817088
  0.5  8          150      635.9573  0.2817088

Tuning parameter 'gamma' was held constant at a value of 0
Tuning parameter 'colsample_bytree' was held constant at a value of 0.8
Tuning parameter 'min_child_weight' was held constant at a value of 1
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, max_depth = 5, eta = 0.3, gamma = 0, colsample_bytree = 0.8 and min_child_weight = 1.

RMSE: 510.781
R2: 0.389

XGBoost variable importance:
                          Feature         Gain       Cover   Frequency
 1:                       B14CHE3 0.7779793103 0.054595444 0.038461538
 2:           Ca_M_agg30cm_AF_1km 0.1988700151 0.200903378 0.132478632
 3:                       POSMRG5 0.0076058388 0.018263943 0.012820513
 4:       BLDFIE_M_agg30cm_AF_1km 0.0046571079 0.022191673 0.021367521
 5:             NCluster_9_AF_1km 0.0038173062 0.006087981 0.004273504
 6:                 M17NPPALTfill 0.0017059691 0.058326787 0.038461538
 7: af_agg_30cm_TAWCpF23mm__M_1km 0.0016033415 0.011390416 0.017094017
 8:             NCluster_M_AF_1km 0.0010796938 0.006087981 0.004273504
 9:                   AAIavg_GYGA 0.0009477892 0.031225452 0.149572650
10:           Al_M_agg30cm_AF_1km 0.0006005653 0.081893166 0.072649573
11:                  Slopeclassc1 0.0002800203 0.012961508 0.008547009
12:                       VBFMRG5 0.0002635087 0.029457973 0.025641026
13: af_agg_ERZD_TAWCpF23mm__M_1km 0.0001246546 0.023173606 0.029914530
14:                       C04GLC5 0.0001236428 0.002749411 0.004273504
15:        af_agg_30cm_PWP__M_1km 0.0001054378 0.020031422 0.034188034

Ensemble validation RMSE: 409.612
R2: 0.608
--------------------------------------