Investigate variable relations of a specific variable with mean adjusted agreement
Source: R/MeanAdjustedAgreement.R
This function uses the mean adjusted agreement to select variables that are related to a defined variable, using a threshold T. The parameter t is used to calculate T: t = 1 means that every variable with a higher probability than "by chance" is identified as "important"; t = 2 means the probability has to be twice as high, and so on. Based on the threshold, a vector containing the related variables is created.
Usage
MeanAdjustedAgreement(
  RFS,
  t = 5,
  variables = RFS$ranger$forest$independent.variable.names,
  candidates = RFS$ranger$forest$independent.variable.names,
  related = TRUE,
  num.threads = 1
)
Arguments
- RFS
A RandomForestSurrogates() object.
- t
(Default: 5) Used to calculate threshold.
- variables
Vector of variable names for which related variables should be searched. (Default: All variables used to create the random forest.)
- candidates
Vector of variable names that are candidates to be related to the variables. (Default: All variables used to create the random forest.)
- related
(Default: TRUE) Whether related variables should be identified.
- num.threads
(Default: 1) Number of threads used for determination of relations.
Value
A MeanAdjustedAgreement list object:
- RFS
The original RandomForestSurrogates() object.
- relations
Matrix with mean adjusted agreement values. Rows: variables. Columns: candidates.
- threshold
The threshold used to select related variables.
- related
A list of vectors for each variable containing related candidates. Only present if related = TRUE (Default).
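The relationship between the relations matrix, the threshold, and the related list can be illustrated with a minimal base R sketch. It is not the package's implementation: the matrix values are made up, `chance_level` is a placeholder for the "by chance" level (how the package derives it is not stated here), and `select_related()` is a hypothetical helper that keeps every candidate whose value lies strictly above the threshold.

```r
# Toy mean adjusted agreement matrix (made-up values, not package output).
# Rows: variables. Columns: candidates.
relations <- matrix(
  c(0.40, 0.02, 0.10,
    0.35, 0.01, 0.30),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("X1", "X7"), c("X2", "X3", "X4"))
)

# Assumption: the threshold T scales a "by chance" level by t.
# chance_level is an illustrative placeholder, not a value from the package.
chance_level <- 0.05
t <- 5
threshold <- t * chance_level

# Hypothetical helper: for each variable (row), return the names of the
# candidates (columns) whose value is strictly above the threshold.
select_related <- function(relations, threshold) {
  vars <- rownames(relations)
  lapply(
    stats::setNames(vars, vars),
    function(v) {
      row <- relations[v, ]
      names(row)[!is.na(row) & row > threshold]
    }
  )
}

related <- select_related(relations, threshold)
print(related)
```

With a larger t, the threshold rises and fewer candidates are kept for each variable; with t = 1 in this sketch, every candidate above `chance_level` would be selected.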
Examples
# \donttest{
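# Load the example data shipped with the package.
data("SMD_example_data")

# Build a small random forest with surrogates (first column is the outcome).
rfs <- RandomForestSurrogates(
  x = SMD_example_data[, -1],
  y = SMD_example_data[, 1],
  s = 10,
  seed = 42,
  num.trees = 10,
  num.threads = 1
)

# Search the first 100 predictors for variables related to X7 and X1.
maa <- MeanAdjustedAgreement(
  rfs,
  variables = c("X7", "X1"),
  candidates = colnames(SMD_example_data)[2:101],
  t = 5,
  num.threads = 1
)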
# }