
This function reduces the number of surrogate variables in a forest created by the getTreeranger, addLayer, and getSurrogates functions. It can therefore be applied to forests that were used for surrogate minimal depth variable importance.

Usage

reduce.surrogates(forest, s = 10)

Arguments

forest

A list containing allvariables and trees. allvariables is a character vector of all variable names in the original data set. trees is a list of trees generated by the getTreeranger, addLayer, and getSurrogates functions.

s

Number of surrogate variables in the new forest. It has to be less than the number of surrogate variables in the random forest stored in trees. The default is 10.

Value

A forest with s surrogate variables.
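
To illustrate the idea of the reduction, here is a minimal base R sketch on a toy forest. It assumes a simplified, hypothetical node layout (a list with split_var, surrogates, and adj, ordered from strongest to weakest surrogate), which is not the internal format produced by getTreeranger, addLayer, and getSurrogates. The helpers make_node, trim_node, and trim_forest are made up for this illustration and are not part of the package.

```r
# Hypothetical toy node: NOT the internal node format of the package.
# Each node stores its surrogate variables and their adjusted agreement,
# ordered from strongest to weakest surrogate.
make_node <- function(n_surr) {
  list(
    split_var  = "X1",
    surrogates = paste0("cp1_", seq_len(n_surr)),
    adj        = seq(0.9, by = -0.05, length.out = n_surr)
  )
}

# Keep only the first s surrogates of a single node.
trim_node <- function(node, s) {
  keep <- seq_len(min(s, length(node$surrogates)))
  node$surrogates <- node$surrogates[keep]
  node$adj <- node$adj[keep]
  node
}

# Trim every node of every tree; allvariables is left unchanged.
trim_forest <- function(forest, s) {
  forest$trees <- lapply(
    forest$trees,
    function(tree) lapply(tree, trim_node, s = s)
  )
  forest
}

# Toy forest with one tree and two nodes (5 and 3 surrogates).
toy_forest <- list(
  allvariables = c("X1", paste0("cp1_", 1:5)),
  trees = list(list(make_node(5), make_node(3)))
)

toy_small <- trim_forest(toy_forest, s = 2)

# Number of surrogates per node after trimming
lengths(lapply(toy_small$trees[[1]], `[[`, "surrogates"))
#> [1] 2 2
```

In this sketch the returned object keeps the same top-level elements (allvariables and trees) as the input, which mirrors how the reduced forest is passed on to var.select.smd and var.relations in the examples.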

Examples

# \donttest{
data("SMD_example_data")
# use the result of SMD variable importance and reduce surrogate variables to 10
# select variables with smd variable importance (usually more trees are needed)
set.seed(42)
res <- var.select.smd(
  x = as.data.frame(SMD_example_data[, 2:ncol(SMD_example_data)]),
  y = SMD_example_data[, 1],
  s = 100,
  num.trees = 10,
  num.threads = 1
)
forest.new <- reduce.surrogates(forest = res$forest, s = 10)

# execute SMD on the forest with the reduced number of surrogates
res.new <- var.select.smd(
  forest = forest.new,
  num.threads = 1
)
res.new$var
#>  [1] "X2"      "X3"      "X4"      "X5"      "X6"      "X8"      "cp1_1"  
#>  [8] "cp1_4"   "cp1_5"   "cp1_6"   "cp1_8"   "cp1_9"   "cp2_2"   "cp2_3"  
#> [15] "cp2_4"   "cp2_5"   "cp2_10"  "cp3_3"   "cp3_4"   "cp3_5"   "cp3_6"  
#> [22] "cp7_1"   "cp8_2"   "cp8_5"   "cgn_4"   "cgn_19"  "cgn_47"  "cgn_49" 
#> [29] "cgn_62"  "cgn_75"  "cgn_121"

# investigate variable relations
rel <- var.relations(
  forest = forest.new,
  variables = c("X1", "X7"),
  candidates = res$forest[["allvariables"]][1:100],
  t = 5,
  num.threads = 1
)
rel$var
#> $X1
#>  [1] "cp1_1"  "cp1_2"  "cp1_3"  "cp1_4"  "cp1_5"  "cp1_6"  "cp1_7"  "cp1_8" 
#>  [9] "cp1_9"  "cp1_10" "cp8_1" 
#> 
#> $X7
#>  [1] "cp7_1"  "cp7_2"  "cp7_3"  "cp7_4"  "cp7_5"  "cp7_6"  "cp7_7"  "cp7_8" 
#>  [9] "cp7_9"  "cp7_10"
#> 
# }