- Adjust behavior of "positive" arg for `classif.logreg` (#2846)
- Consistent naming for dummy feature encoding of variables with different levels count (#2847)
- Remove {nodeHarvest} learners (#2841)
- Remove {rknn} learner (#2842)
- Remove all {DiscriMiner} learners (#2840)
- Remove {extraTrees} learner (#2839)
- Remove deprecated {rrlda} learner
- Resolve some {ggplot} deprecation warnings
- Fixed `information.gain` filter calculation. Before, `chi.squared` was calculated even though `information.gain` was requested due to a glitch in the filter naming (#2816, @jokokojote)
- Make `helpLearnerParam()`'s HTML parsing more robust (#2843)
- Add HTML5 support for help pages
- `FSelectorRcpp::relief()`: this C++ based implementation of the RelieF filter algorithm is way faster than the Java-based one from the {FSelector} package (#2804)
- `FilterWrapper` objects
- `fix.factors.prediction = TRUE` causes the generation of NAs for new factor levels in prediction (@jakob-r, #2794)
- `newdata` (@jakob-r, #2794)
- `praznik_MRMR`: Remove handling of survival tasks (#2790, @bommert)
- `objective` default changed from `reg:linear` (deprecated) to `reg:squarederror`
- `blocking` was set in the Task but `blocking.cv` was not set within `makeResampleDesc()` (#2788)
- `generateLearningCurveData()` (#2768)
- `getFeatureImportance()`: Account for feature importance weight of linear xgboost models
- did not match the learner note) (#2747)
- `createSpatialResamplingPlots()`: the package caused issues on R-devel. In addition, users should set custom themes by themselves.
- `getNestedTuneResultsOptPathDf()` (#2754)
- `regr_slim` learner removed due to pkg {flare} being orphaned on CRAN
- `clValid::dunn` and its tests removed (package orphaned) (#2742)
- `tuneThreshold()` now accounts for the direction of the measure. Beforehand, the performance measure was always minimized (#2732).
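  A minimal sketch of the new behaviour (the bundled `sonar.task` and the `classif.lda` learner are arbitrary choices for illustration): `auc` is a maximized measure, and the tuned threshold now reflects that.

  ```r
  library(mlr)

  # train a probabilistic classifier on the bundled Sonar task
  lrn = makeLearner("classif.lda", predict.type = "prob")
  mod = train(lrn, sonar.task)
  pred = predict(mod, sonar.task)

  # tuneThreshold() now respects that auc is to be maximized
  res = tuneThreshold(pred, measure = auc)
  res$th   # tuned probability threshold
  res$perf # auc achieved at that threshold
  ```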
- `more.args` for simple filters (@annette987, #2709)
- `print.FeatSelResult()` when `bits.to.features` is used in `selectFeatures()` (#2721)
- `getFeatureImportance()` (#2708)
- pkgdown: Move changelog to Appendix
- Account for {checkmate} v2.0.0 update (#2734)
- Refactor function calls from packages (`<pkg::fun>`) within ParamSets (#2730) to avoid errors in `listLearners()` if those pkgs are not installed
- `listLearners()` should not fail if a package is not installed (#2717)
- `n.show` argument had no effect in `plotFilterValues()`. Thanks @albersonmiranda. (#2689)
- PR: #2638 (@pfistl):
  - Added several learners for regression and classification on functional data
  - Added preprocessing steps for feature extraction from functional data
  - Fixed a bug where multiclass to binary-class reduction techniques did not work with functional data
  - Several other minor bug fixes and code improvements
  - Extended and clarified documentation for several fda components
- `tree_method` (@albersonmiranda, #2701)
- `getFeatureImportance()` now returns a long `data.frame` with columns `variable` and `importance`. Beforehand, a wide `data.frame` was returned with each variable representing a column (@pat-s, #1755).
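  A minimal sketch of the new output shape (learner and task are arbitrary; `classif.rpart` is used because it supports feature importance):

  ```r
  library(mlr)

  mod = train(makeLearner("classif.rpart"), iris.task)
  imp = getFeatureImportance(mod)

  # $res is now in long format: one row per feature,
  # with columns `variable` and `importance`
  imp$res
  ```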
- `filterFeatures()`: Arg `thresh` was not working correctly when applied to ensemble filters (@annette987, #2699)
- `classif.xgboost` prevented passing a `watchlist` for binary tasks. This was caused by a suboptimal internal label inversion approach. Thanks to @001ben for reporting (#32) (@mllg)
- `fda.usc` learners updated to work with package version >= 2.0
- `glmnet` learners updated to upstream package version 3.0.0
- `xgboost` learners updated to upstream version 0.90.2 (@pat-s & @be-marc, #2681)
- `classif.gbm` and `regr.gbm`: param `shrinkage` now defaults to 0.1 instead of 0.001. Also, more choices for param `distribution` have been added. Internal parallelization by the package is now suppressed (param `n.cores`). (@pat-s, #2651)
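  A short sketch of restoring the previous behaviour explicitly if needed (the value shown is the former default):

  ```r
  library(mlr)

  # override the new default shrinkage of 0.1 with the old value
  lrn = makeLearner("classif.gbm", shrinkage = 0.001)
  getHyperPars(lrn)
  ```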
- `h2o.deeplearning` learners (@albersonmiranda, #2668)
- `configureMlr()` moved to `.onLoad()`, possibly fixing some edge cases (#2585) (@pat-s, #2637)
- `h2o.gbm` learners were not running until `wcol` was passed somehow due to an internal bug. In addition, this bug caused another issue during prediction where the prediction `data.frame` was somehow formatted as character rather than numeric. Thanks to @nagdevAmruthnath for bringing this up in #2630.
- Bugfix: Allow `method = "vh"` for filter `randomForestSRC_var.select` and return an informative error message for unsupported values. Also, argument `conservative` can now be passed. See #2646 and #2639 for more information (@pat-s, #2649)
- Bugfix: Allow `method = "md"` of filter `randomForestSRC_var.select` to set the value returned for features below its threshold to NA (Issue #2687)
- Bugfix: With the new praznik v7.0.0 release, filter `praznik_CMIM` no longer returns a result for logical features. See https://gitlab.com/mbq/praznik/issues/19 for more information
- Filter values are now returned in a long (tidy) `tibble` instead of a `data.frame`. This makes it easier to apply post-processing methods (like `group_by()`, etc.) (@pat-s, #2456)
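  A minimal sketch of post-processing the long format (the filter methods are arbitrary; dplyr is used purely for illustration):

  ```r
  library(mlr)
  library(dplyr)

  fv = generateFilterValuesData(iris.task,
    method = c("anova.test", "kruskal.test"))

  # long format: one row per feature/filter combination
  fv$data %>%
    group_by(filter) %>%
    summarise(top_feature = name[which.max(value)])
  ```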
- `benchmark()` does not store the tuning results (`$extract` slot) anymore by default. If you want to keep this slot (e.g. for post-tuning analysis), set `keep.extract = TRUE`. This change originated from the fact that the size of `BenchmarkResult` objects with extensive tuning got very large (~ GB), which can cause memory problems during runtime if multiple `benchmark()` calls are executed on HPCs.
- `benchmark()` does not store the created models (`$models` slot) anymore by default. The reason is the same as for the `$extract` slot above. Storing can be enabled using `models = TRUE`.
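  A minimal sketch of re-enabling both slots when they are needed (learners, task and resampling are arbitrary illustration choices):

  ```r
  library(mlr)

  rdesc = makeResampleDesc("CV", iters = 3)
  bmr = benchmark(
    learners = list(makeLearner("classif.rpart"), makeLearner("classif.lda")),
    tasks = iris.task,
    resamplings = rdesc,
    models = TRUE,        # keep fitted models ($models slot)
    keep.extract = TRUE   # keep e.g. tuning results ($extract slot)
  )
  ```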
- `generateFeatureImportanceData()` gains argument `show.info`, which shows the name of the current feature being calculated, its index in the queue and the elapsed time for each feature (@pat-s, #26222)
- `classif.liquidSVM` and `regr.liquidSVM` have been removed because liquidSVM has been removed from CRAN.
- `data.table`'s default in `rbindlist()`. See #2578 for more information. (@mllg, #2579)
- `regr.randomForest` gains three new methods to estimate the standard error:
  - `se.method = "jackknife"`
  - `se.method = "bootstrap"`
  - `se.method = "sd"`
  See `?regr.randomForest` for more details. `regr.ranger` relies on the functions provided by the package ("jackknife" and "infjackknife" (default)) (@jakob-r, #1784)
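  A minimal sketch of requesting standard-error predictions with one of the new estimators (the estimator and the bundled `bh.task` are arbitrary illustration choices):

  ```r
  library(mlr)

  lrn = makeLearner("regr.randomForest",
    predict.type = "se", se.method = "jackknife")
  mod = train(lrn, bh.task)

  # standard errors of the predictions
  head(getPredictionSE(predict(mod, bh.task)))
  ```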
- `regr.gbm` now supports `quantile` distribution (@bthieurmel, #2603)
- `classif.plsdaCaret` now supports multiclass classification (@GegznaV, #2621)
- `getClassWeightParam()` now also works for `Wrapper*` models and ensemble models (@ja-thomas, #891)
- `getLearnerNote()` to query the "Note" slot of a learner (@alona-sydorova, #2086)
- `e1071::svm()` now only uses the formula interface if factors are present. This change is supposed to prevent the "stack overflow" issues some users encountered when using large datasets. See #1738 for more information. (@mb706, #1740)
- `cluster.MiniBatchKmeans` from package {ClusterR} (@Prasiddhi, #2554)
- `plotHyperParsEffect()` now supports facet visualization of hyperparam effects for nested CV (@MasonGallo, #1653)
- `options(on.learner.error)` was not respected in `benchmark()`. This caused `benchmark()` to stop even if it should have continued, including `FailureModels` in the result (@dagola, #1984)
- `praznik_mrmr` also supports `regr` and `surv` tasks
- `plotFilterValues()` got a bit "smarter" and easier now regarding the ordering of multiple facets. (@pat-s, #2456)
- `filterFeatures()`, `generateFilterValuesData()` and `makeFilterWrapper()` gained new examples. (@pat-s, #2456)
- `makeResampleDesc(fixed = TRUE)` (@pat-s, #2412).
- `Task` help pages are now split into separate ones, e.g. `RegrTask`, `ClassifTask` (@pat-s, #2564)
- `deleteCacheDir()`: Clear the default mlr cache directory (@pat-s, #2463)
- `getCacheDir()`: Return the default mlr cache directory (@pat-s, #2463)
- `getResamplingIndices(inner = TRUE)` now correctly returns the inner indices (before, inner indices referred to the subset of the respective outer-level train set) (@pat-s, #2413).
- Filter value caching for `fw.perc`, `fw.abs` or `fw.threshold`: it can be triggered with the new `cache` argument in `makeFilterWrapper()` or `filterFeatures()` (@pat-s, #2463).
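  A minimal sketch of enabling the cache (filter method and percentage are arbitrary); `getCacheDir()` and `deleteCacheDir()` from the entries above manage the cache location:

  ```r
  library(mlr)

  # filter values are cached and reused instead of being recomputed
  filtered.task = filterFeatures(iris.task, method = "anova.test",
    perc = 0.5, cache = TRUE)

  getCacheDir()       # inspect the default cache directory
  # deleteCacheDir()  # clear it again if desired
  ```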
- Added filters `FSelectorRcpp_gain.ratio`, `FSelectorRcpp_information.gain` and `FSelectorRcpp_symmetrical.uncertainty` from package `FSelectorRcpp`. These filters are ~ 100 times faster than the implementation of the `FSelector` pkg. Please note that both implementations do things slightly differently internally and the `FSelectorRcpp` methods should not be seen as direct replacements for the `FSelector` pkg.
- Additionally, filter names have been harmonized using the following scheme:
  - `information.gain` -> `FSelector_information.gain`
  - `gain.ratio` -> `FSelector_gain.ratio`
  - `symmetrical.uncertainty` -> `FSelector_symmetrical.uncertainty`
  - `chi.squared` -> `FSelector_chi.squared`
  - `relief` -> `FSelector_relief`
  - `oneR` -> `FSelector_oneR`
  - `randomForestSRC.rfsrc` -> `randomForestSRC_importance`
  - `randomForestSRC.var.select` -> `randomForestSRC_var.select`
  - `randomForest.importance` -> `randomForest_importance`
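  A short sketch of requesting filters by their harmonized `<pkg>_<method>` names (assumes the FSelector and FSelectorRcpp packages are installed):

  ```r
  library(mlr)

  fv = generateFilterValuesData(iris.task,
    method = c("FSelector_information.gain", "FSelectorRcpp_information.gain"))
  fv$data
  ```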
- Fixed a bug related to the loading of namespaces for required filter packages (@pat-s, #2483)
- `"h2o.use.data.table" = TRUE` is now the default (@j-hartshorn, #2508)
- `x.bit.names` that stores the optimal bits
- `x` now always contains the real feature names and not the `bit.names`
- `makeFeatSelWrapper` usable with custom `bit.names`
- `sffs` crashed in some cases (@bmihaljevic, #2486)
- `resample.fun` to specify a custom resampling function to use.