Parameters without defaults are required and will trigger an error if no values are supplied for them when the constructor is called.
Ideally, the constructor should produce a valid model if called without any arguments; i.e., not have any required arguments.
The source package defaults will be used for parameters with NULL values.
Model formula, data, and weights are separate from model parameters and should not be included among the constructor arguments.
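As a rough illustration of the NULL convention, the base R sketch below shows one way a constructor could collect its arguments and drop those left as NULL, so that the source package defaults apply. SketchModel, its parameters, and the use of stats::loess as a stand-in for a source package fit function are assumptions made for this example; this is not MachineShop's own implementation.

```r
# Hypothetical constructor: every parameter has a default, and NULL means
# "defer to the source package default".
SketchModel <- function(span = NULL, degree = 1) {
  args <- as.list(environment())   # collect the constructor arguments
  Filter(Negate(is.null), args)    # drop the ones left as NULL
}

# The constructor is callable without any arguments.
params <- SketchModel()

# Formula and data are supplied at fit time, not to the constructor.
# stats::loess stands in for a source package fit function here.
fit <- do.call(stats::loess, c(list(dist ~ speed, data = cars), params))
```

Because span is dropped before the call, loess falls back on its own default for it, while degree is passed through as set by the constructor.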
Include all packages whose functions are called directly from within the constructor.
Use :: or ::: to reference source package functions.
Include the response types ("factor", "numeric", "ordered", and/or "Surv") that can be analyzed with the model.
Use params(environment()) if all arguments are to be passed to the source package fit function as supplied. Additional steps may be needed to pass the constructor arguments to the source package in a different format; e.g., when some model parameters must be passed in a control structure, as in C50Model and CForestModel.
Should have a data argument that represents a model frame and return its number of analytic predictor variables.
The first three arguments should be formula, data, and weights, followed by a three-dot ellipsis.
Set environment(formula) <- environment() if the formula will be passed to the source package fit function. Some fit functions expect the formula to be defined in the environment to which the data and weights belong; i.e., the fit formula environment.
If weights are not supported, the following should be included in the function:
if(!all(weights == 1)) warning("weights are unsupported and will be ignored")
Only add elements to the resulting fit object if they are needed and will be used in the predict, response, or varimp functions.
Return the fit object.
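The fit function conventions above can be sketched in base R as follows. This is a minimal sketch, not code from the package: fit_sketch and fit_unweighted_sketch are hypothetical names, and stats::lm stands in for a source package fit function. In the second function, lm is only assumed, for the sake of the example, to lack weights support.

```r
# Sketch of a fit function; stats::lm stands in for a source package
# fit function that accepts a formula.
fit_sketch <- function(formula, data, weights, ...) {
  # Let the source function resolve `weights` through the formula's
  # environment, which is now this function's environment.
  environment(formula) <- environment()
  stats::lm(formula, data = data, weights = weights, ...)
}

# Variant for a source function assumed not to support weights.
fit_unweighted_sketch <- function(formula, data, weights, ...) {
  if (!all(weights == 1)) warning("weights are unsupported and will be ignored")
  stats::lm(formula, data = data, ...)
}

w <- seq_len(nrow(cars))
fit <- fit_sketch(dist ~ speed, data = cars, weights = w)
```

Without the environment assignment, lm would look for weights in the environment where the formula was created rather than in the fit function, which is the situation the convention is meant to avoid.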
The arguments are a model fit object, newdata frame, optionally time for prediction at survival time points, and an ellipsis.
Extract any new elements added in the fit function and then recast with unMLModelFit to convert the fit object to its original class.
The predict function should return a vector or matrix of probabilities for 2-level factors, a matrix for factors with more than 2 levels, a vector or matrix of expected responses if numeric, a matrix of survival probabilities if follow-up times are supplied, or a vector of survival risks if no follow-up times.
Should have a single model fit object argument, followed by an ellipsis, and return the original response variable supplied to the fit function.
Sometimes the original response is easily extracted from the fit function result. At other times, it may be nested within the results and require some extra work to extract (see GBMModel). If it is impossible to extract the response from the source package fit function results, add the response as a new element to the results in the fit function.
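For the easy case, where the fit object retains its model frame, a response function might look like the base R sketch below. response_sketch is a hypothetical name, and an lm fit is used only as a convenient stand-in for a source package fit object.

```r
# Sketch of a response function for a fit object that stores its model frame.
response_sketch <- function(object, ...) {
  stats::model.response(stats::model.frame(object))
}

fit <- lm(dist ~ speed, data = cars)
y <- response_sketch(fit)
```

When the source package fit object does not carry the response in any form, this approach is not available, and the response has to be stored as an extra element in the fit function instead.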
Should have a single model fit object argument followed by an ellipsis.
Variable importance results should generally be returned as a vector with elements named after the corresponding predictor variables. The package will take care of conversions to a data frame and VarImp object. If there is more than one set of relevant variable importance measures, they can be returned as a matrix or data frame with predictor variable names as the row names.
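The named-vector return format can be illustrated with a short base R sketch. varimp_sketch is a hypothetical name, and the importance measure (absolute t statistics from an lm fit with numeric predictors) is chosen only for illustration; it is not the measure used by any particular MachineShop model.

```r
# Sketch of a varimp function returning a named numeric vector.
varimp_sketch <- function(object, ...) {
  tstats <- coef(summary(object))[, "t value"]
  # Drop the intercept so that only predictor variables remain.
  abs(tstats[names(tstats) != "(Intercept)"])
}

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
vi <- varimp_sketch(fit)
```

With numeric predictors, the coefficient names coincide with the predictor variable names, so the vector elements are named as the convention requires. Factor predictors would need extra work to map coefficients back to variables.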
Include the first sentences from the source package.
Start sentences with the parameter value type (logical, numeric, character, etc.).
Start sentences with lowercase.
Omit indefinite articles (a, an, etc.) from the starting sentences.
Include response types (factor, numeric, ordered, and/or Surv).
Include the following sentence:
Default values for the arguments and further model details can be found in the source link below.
MLModel class object.
\code{\link[<source package>]{<fit function>}}, \code{\link{fit}},
\code{\link{resample}}, \code{\link{tune}}
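Put together, the documentation conventions above might produce a block like the following. This sketch assumes roxygen2-style comments; CustomModel, its parameters, and their descriptions are hypothetical, and the <source package> and <fit function> placeholders are left unfilled on purpose.

```r
#' Custom Model
#'
#' @param ntree numeric number of trees to grow.
#' @param replace logical indicating whether to sample with replacement.
#'
#' @details
#' Response types: factor, numeric.
#'
#' Default values for the arguments and further model details can be found in
#' the source link below.
#'
#' @return MLModel class object.
#'
#' @seealso \code{\link[<source package>]{<fit function>}}, \code{\link{fit}},
#' \code{\link{resample}}, \code{\link{tune}}
CustomModel <- function(ntree = NULL, replace = NULL) {
  # Constructor body omitted in this sketch.
  NULL
}
```

Each parameter description starts with the value type, in lowercase, and without an indefinite article.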
For full compatibility with the MachineShop package, model constructors must belong to its namespace.
This can be accomplished either by rebuilding the package with the source code for the new model added, or by defining a new model constructor, say CustomModel, and manually adding it to the namespace as follows.
environment(CustomModel) <- asNamespace("MachineShop")
"MLCustomModel.R".