Changelog
Source: NEWS.md
RFSurrogates 0.4.2
- Added a new optional parameter round_digits (default: 2) to MeanAdjustedAgreement() and meanAdjAgree(). The default preserves the previous behaviour; other values round to a different number of decimal places.
RFSurrogates 0.4.1
- Fix SurrogateMinimalDepth’s result$selected returning modified names (#13).
RFSurrogates 0.4.0
Added RandomForestSurrogates().
- This function aims to replace the first section of the variable selection and relation functions by creating a single reusable object that contains the random forest ranger::ranger() model as well as the trees list with layers and surrogates added.
- Returns a RandomForestSurrogates object, which serves as the base object for later analysis.
- Additional ... params are passed directly to ranger::ranger().
- s.pct is a helper for calculating the number of surrogates as a fraction of the number of variables (default: 0.01). s can be set to override this default.
- mtry supports the following values:
  - One of the documented string values, which causes the mtry passed to ranger::ranger() to be a function that accepts the number of variables and returns the corresponding transformation, floored.
  - A function that takes the number of variables as its first and only param and returns the value of mtry.
  - A numeric value for mtry.
  - The default is "^3/4".
- type also uses match.arg() and still defaults to "regression".
- num.threads defaults to 1.
- permutate will, if set to TRUE, apply a random permutation to the data in each feature. (This is used in permutation importance approaches.)
- seed is now a strongly recommended optional parameter (a warning is issued whenever it is not set).
  - Setting seed causes a call to set.seed() when permuting. It is also used as the seed param of the ranger::ranger() call.
  - Passing seed as a function parameter is preferred because the result then does not depend on the global state of the random number generator, which is not reproducible unless it was seeded immediately before the function call.
- The inner call to ranger::ranger() includes the following defaults: keep.inbag = TRUE and respect.unordered.factors = "partition".
- Data is passed as a data.frame with the special column y, and the optional special column status for survival forests.
  - x must not contain the column names y or status, as this may lead to unexpected behavior.
- In general, input parameters are more strictly validated.
- The function uses num.threads to also parallelize creating the list of trees with layers.
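The three accepted forms of mtry described above can be sketched in base R. This is only a minimal illustration, not the package's implementation: resolve_mtry() is a hypothetical helper, only the default string is handled, and reading "^3/4" as the number of variables raised to the power 3/4 is an assumption based on the name of that default.

```r
# Hypothetical helper (not part of RFSurrogates): turn the three accepted
# forms of `mtry` into a value that could be handed on to ranger::ranger().
resolve_mtry <- function(mtry = "^3/4") {
  # A function or a number is passed through unchanged.
  if (is.function(mtry) || is.numeric(mtry)) {
    return(mtry)
  }
  # Assumption: only the default string from the changelog is supported here.
  mtry <- match.arg(mtry, choices = c("^3/4"))
  switch(mtry,
    # Assumed meaning of "^3/4": p^(3/4), floored as the changelog describes.
    "^3/4" = function(p) floor(p^(3 / 4))
  )
}

# The default resolves to a function of the number of variables.
mtry_fun <- resolve_mtry()
mtry_fun(100)

# A numeric value or a user-supplied function is returned as given.
resolve_mtry(5)
resolve_mtry(function(p) max(1, floor(sqrt(p))))
```

Resolving the string to a function rather than to a number keeps the choice independent of the data, since the number of variables is only known once the model is fitted.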
Added SurrogateMinimalDepth() as a replacement for var.select.smd(). It takes just a RandomForestSurrogates object as a param, and returns a SurrogateMinimalDepth list object.
Value:
- RFS: The RandomForestSurrogates object used.
- selected: The selected variables (previously var).
- depth: The surrogate minimal depth of each variable (previously info$depth).
- threshold: The threshold used to select variables (previously info$threshold).
- surrogates: Details on the average number of surrogates in the forest, and per layer.
  - average: Previously s$s.a.
  - layer: Previously s$s.l.
Added MeanAdjustedAgreement() as a replacement for var.relations(). It takes a RandomForestSurrogates object as a param, and returns a MeanAdjustedAgreement list object.
Value:
- RFS: The RandomForestSurrogates object used.
- relations: The matrix of mean adjusted agreements for the investigated variables (rows) and the possible candidates (columns) (previously surr.res).
- threshold: The threshold used to select related variables.
- related: A list of vectors for each investigated variable containing the related candidate variables (previously var).
Added MFI() as a replacement for var.relations.mfi(). MFI() takes the same arguments as RandomForestSurrogates(), with the additional variables and candidates params. Variable selection is performed with MutualForestImpactVariableSelection().
Added MutualImpurityReduction() as a replacement for var.select.mir(), where variable selection is performed with MutualImpurityReductionVariableSelection().
RFSurrogates 0.3.4
- var.select.smd(), var.select.md(), var.relations(), var.relations.mfi(): Made several improvements to developer experience:
  - create.forest now defaults to is.null(forest), so it will automatically be TRUE if no forest is provided, and FALSE otherwise.
  - x is no longer required if create.forest is FALSE.
  - (Internal) Inverted some nested guard clauses for readability.
- addLayer(): Refactor for-loop to lapply().
  - Add num.threads param to enable parallelization using parallel::mclapply(). It defaults to 1 for backward compatibility.
- getTreeranger(): Refactor lapply() to parallel::mclapply().
  - Add num.threads param (passed to mc.cores in parallel::mclapply()). It defaults to 1 for backward compatibility.
  - Add add_layer param to include the effect of addLayer() within the same loop. Defaults to FALSE for backward compatibility.
  - (Internal) getsingletree(): Add add_layer param to enable adding layers within the same loop.
- addSurrogates():
  - Clarified default value for num.threads to be parallel::detectCores() by adding it as a default to the parameter.
  - Added assertion that RF is a ranger object.
  - Added assertion that RF$num.trees and length(trees) are equal. This is not considered a breaking change since these values should always be equal when the function is used correctly.
- Added S3 classes to the trees list objects.
  - getTreeranger() now returns a RangerTrees list.
  - addLayer() and getTreeranger(add_layer = TRUE) add the LayerTrees class to the list (indicating presence of the layer list item). addLayer() now requires that its trees param inherits RangerTrees.
  - addSurrogates() now adds the SurrogateTrees class. It now requires that its trees param inherits RangerTrees.
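Two of the mechanisms in this release, the create.forest default derived from forest and the S3 class tags on the trees list, can be illustrated with a short base R sketch. All function names below (select_sketch(), as_ranger_trees(), add_layer_sketch()) are hypothetical stand-ins, not the package's functions, and the bodies only mimic the behaviour described in the list above.

```r
# Hypothetical stand-in for the var.select.*() signatures: the default of
# `create.forest` is evaluated lazily and depends on whether `forest` was given.
select_sketch <- function(x = NULL, forest = NULL,
                          create.forest = is.null(forest)) {
  if (create.forest) {
    # `x` is only needed when a new forest has to be grown.
    if (is.null(x)) stop("`x` is required when a forest has to be created")
    return("would build a new forest from x")
  }
  "would reuse the supplied forest"
}

# Hypothetical constructor: tag a plain list of trees with an S3 class.
as_ranger_trees <- function(trees) {
  class(trees) <- c("RangerTrees", class(trees))
  trees
}

# Hypothetical guard: refuse input that lacks the expected class, then add
# a further class to signal that layer information is present.
add_layer_sketch <- function(trees) {
  stopifnot(inherits(trees, "RangerTrees"))
  class(trees) <- c("LayerTrees", class(trees))
  trees
}

select_sketch(x = data.frame(a = 1))   # no forest given, so one is created
select_sketch(forest = list())         # forest given, so x can be omitted

trees <- add_layer_sketch(as_ranger_trees(list()))
class(trees)
```

Prepending the class rather than replacing it keeps earlier tags in place, so a check with inherits() for RangerTrees still succeeds after layers have been added.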
RFSurrogates 0.3.3
- Fixed meanAdjAgree() bug which caused mean adjusted agreement pairings to be set to NA incorrectly when variables was a subset of, or ordered differently from, candidates.
RFSurrogates 0.3.2
- Moved to new repository: AGSeifert/RFSurrogates
- Fixed mean.index() bug which caused the return value to be of incorrect length in some cases (#4).
This is the legacy changelog of SurrogateMinimalDepth, previously located at Version_info.txt, adapted to Tidyverse Style for news. Future releases should begin following Semantic Versioning, and changes should be easier to track using Conventional Commits.
RFSurrogates 0.3.1 (RFSurrogates 0.3.1)
- Renamed package SurrogateMinimalDepth to RFSurrogates.
- Renamed parameter ntree to num.trees throughout the package.
RFSurrogates 0.3.0 (SurrogateMinimalDepth 0.3.0)
- Round Mean Adjusted Agreement to 2 digits.
- meanAdjAgree(): Sped up.
- Relations to variables not used as a primary split are now set to NA (previously 0).
- var.relations.corr() renamed to var.relations.mfi() (as named in the paper).
- Bugfix for MFI and MIR.
- Permutation approach to determine p-values was optimized for the selection of important and related variables.
- Default approach for p-value calculation was set to "permutation".
RFSurrogates 0.2.1 (SurrogateMinimalDepth 0.2.1)
- Included the possibility to analyze categorical variables.
RFSurrogates 0.2.0 (SurrogateMinimalDepth 0.2.0)
- New var.select.mir() to select important variables based on Mutual Impurity Reduction.
- New var.relations.corr() to calculate unbiased relations.
- Added multicore calculation of variable relations.
- var.relations(): Included the possibility to only calculate relations and not select variables.
- Included the possibility to set case weights.
RFSurrogates 0.1.10 (SurrogateMinimalDepth 0.1.10)
- Fixed bug not including first and last surrogate.
- Added random selection of surrogates when adjusted agreements are the same.
- Fixed bug in var.relation() example.
- New build.clusters() to obtain variable groups.
RFSurrogates 0.1.9 (SurrogateMinimalDepth 0.1.9)
- var.select.smd(): Some specifics to set s were adapted.
- var.select.smd(), var.select.md(): Added save.memory parameter (to build the forest with ranger).
RFSurrogates 0.1.8 (SurrogateMinimalDepth 0.1.8)
- var.select.md() now executes var.select.smd() with s = 0 instead of using a separate function.
- New reduce.surrogates() function added.
- var.select.smd(): Added parameters create.forest and forest to use existing forests (e.g. created by reduce.surrogates()).
- Renamed parameter trees to forest containing trees and variable names.
- The C-code is updated to enable multicore analysis.
RFSurrogates 0.1.7 (SurrogateMinimalDepth 0.1.7)
- var.select.smd(), var.select.md(): Added parameter save.ranger to save the ranger object.
RFSurrogates 0.1.6 (SurrogateMinimalDepth 0.1.6)
- var.select.smd(): Adapted threshold for low depth trees.
RFSurrogates 0.1.5 (SurrogateMinimalDepth 0.1.5)
- var.select.smd(): Added s as a parameter.
- var.select.smd(), var.select.md(): Implemented survival function.