With the psfmi package you can pool logistic regression models using the following pooling methods: RR (Rubin’s Rules), D1, D2, D3 and MPR (Median P Rule).
You can also apply forward or backward selection to the pooled model.
This vignette shows examples of how to apply these procedures.
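To make the idea of pooling concrete, the sketch below applies Rubin's Rules by hand in base R. It uses a small simulated data set in a stacked layout, with one block of rows per imputation and a column holding the imputation number. The data, the column names `imp`, `y` and `x`, and the helper `pool_rr` are hypothetical; independently simulated data sets stand in for real imputations, and this is only an illustration of the arithmetic, not psfmi's implementation.

```r
# Hypothetical stacked data: 5 simulated data sets of 100 rows each,
# standing in for 5 imputed data sets.
set.seed(1)
m <- 5
n <- 100
stacked <- do.call(rbind, lapply(seq_len(m), function(i) {
  x <- rnorm(n)
  y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
  data.frame(imp = i, y = y, x = x)
}))

# Fit the same logistic regression model in every data set.
fits <- lapply(split(stacked, stacked$imp), function(d) {
  glm(y ~ x, family = binomial, data = d)
})

# Rubin's Rules for a single coefficient.
pool_rr <- function(fits, term) {
  est <- sapply(fits, function(f) coef(f)[[term]])
  se  <- sapply(fits, function(f) sqrt(diag(vcov(f)))[[term]])
  m   <- length(fits)
  w   <- mean(se^2)             # within-imputation variance
  b   <- var(est)               # between-imputation variance
  tot <- w + (1 + 1 / m) * b    # total variance
  c(estimate = mean(est), std.error = sqrt(tot))
}

pooled <- pool_rr(fits, "x")
pooled
```

The pooled standard error is never smaller than the average within-imputation standard error, because the between-imputation variance is added on top.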
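Pooling logistic regression models over 5 imputed datasets using method D1, without predictor selection.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "Function", "JobControl",
                                 "JobDemands", "SocialSupport"), method="D1")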
pool_lr$RR_model
#> $`Step 1 - no variables removed -`
#>            term    estimate  std.error    statistic        df     p.value
#> 1   (Intercept) -0.02145084 2.49485297 -0.008598036 104.09644 0.993156301
#> 2        Gender -0.35445151 0.41807427 -0.847819477 141.28927 0.397972465
#> 3       Smoking  0.07565036 0.34084592  0.221948835 147.74179 0.824660215
#> 4      Function -0.14188458 0.04337897 -3.270815252 132.02927 0.001368147
#> 5    JobControl  0.00690354 0.02053384  0.336203110  88.93815 0.737509628
#> 6    JobDemands  0.00227508 0.03872846  0.058744401 103.72259 0.953268722
#> 7 SocialSupport  0.04434046 0.05750883  0.771019941 126.70867 0.442130487
#>          OR   lower.EXP   upper.EXP
#> 1 0.9787776 0.007362449 130.1205053
#> 2 0.7015581 0.309165955   1.5919729
#> 3 1.0785854 0.552994262   2.1037224
#> 4 0.8677214 0.796994613   0.9447246
#> 5 1.0069274 0.967206964   1.0482791
#> 6 1.0022777 0.929012848   1.0813204
#> 7 1.0453382 0.933908457   1.1700632
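The `OR`, `lower.EXP` and `upper.EXP` columns give the pooled coefficient and its confidence limits on the odds-ratio scale. As a check, the sketch below recomputes them for `Function` from the printed `estimate` and `std.error`. The multiplier 1.96 is an assumption inferred from the printed numbers (it reproduces the limits shown above), not something taken from the package documentation.

```r
# Values copied from the Function row of the output above.
estimate  <- -0.14188458
std_error <- 0.04337897

or    <- exp(estimate)                       # odds ratio
lower <- exp(estimate - 1.96 * std_error)    # lower confidence limit
upper <- exp(estimate + 1.96 * std_error)    # upper confidence limit

round(c(OR = or, lower.EXP = lower, upper.EXP = upper), 4)
```

Because the interval for `Function` excludes 1, it agrees with the small p-value in the same row.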
Pooling logistic regression models over 5 imputed datasets with backward selection, using a p-value of 0.05 and method D1, and forcing the predictor “Smoking” to stay in the model during selection.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "Function", "JobControl",
                                 "JobDemands", "SocialSupport"), keep.predictors = "Smoking",
                    p.crit = 0.05, method="D1", direction = "BW")
#> Removed at Step 1 is - JobDemands
#> Removed at Step 2 is - JobControl
#> Removed at Step 3 is - SocialSupport
#> Removed at Step 4 is - Gender
#>
#> Selection correctly terminated,
#> No more variables removed from the model
pool_lr$RR_model_final
#> $`Step 5`
#>          term    estimate  std.error  statistic       df    p.value        OR
#> 1 (Intercept)  1.20696975 0.48230894  2.5024827 138.9717 0.01349142 3.3433381
#> 2     Smoking  0.06427314 0.33804675  0.1901309 151.8086 0.84946055 1.0663836
#> 3    Function -0.14058914 0.04225212 -3.3273866 121.6357 0.00115993 0.8688462
#>   lower.EXP upper.EXP
#> 1 1.2990643 8.6045855
#> 2 0.5497462 2.0685436
#> 3 0.7997922 0.9438623
pool_lr$predictors_out
#>         Gender Smoking Function JobControl JobDemands SocialSupport
#> Step 1       0       0        0          0          1             0
#> Step 2       0       0        0          1          0             0
#> Step 3       0       0        0          0          0             1
#> Step 4       1       0        0          0          0             0
#> Removed      1       0        0          1          1             1
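`predictors_out` is an indicator matrix: a 1 marks the predictor removed at that step, and the `Removed` row summarises all exclusions. The sketch below rebuilds the matrix printed above by hand, so that it runs without psfmi, and reads off the removal order. With a fitted object you would presumably start from `pool_lr$predictors_out` instead.

```r
# Matrix typed in from the printed output above.
vars <- c("Gender", "Smoking", "Function", "JobControl",
          "JobDemands", "SocialSupport")
predictors_out <- rbind(
  "Step 1"  = c(0, 0, 0, 0, 1, 0),
  "Step 2"  = c(0, 0, 0, 1, 0, 0),
  "Step 3"  = c(0, 0, 0, 0, 0, 1),
  "Step 4"  = c(1, 0, 0, 0, 0, 0),
  "Removed" = c(1, 0, 0, 1, 1, 1)
)
colnames(predictors_out) <- vars

# Predictor removed at each step (drop the summary row first).
steps <- predictors_out[rownames(predictors_out) != "Removed", , drop = FALSE]
removal_order <- apply(steps, 1, function(r) names(r)[r == 1])
removal_order

# Predictors that were never removed.
retained <- colnames(predictors_out)[predictors_out["Removed", ] == 0]
retained
```

Here `retained` contains `Smoking` (forced into the model) and `Function`, matching the final model shown above.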
Pooling logistic regression models over 5 imputed datasets with backward selection, using a p-value of 0.05 and method MPR, and forcing the predictor “Smoking” to stay in the model during selection.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "Function", "JobControl",
                                 "JobDemands", "SocialSupport"), keep.predictors = "Smoking",
                    p.crit = 0.05, method="MPR", direction = "BW")
#> Removed at Step 1 is - JobDemands
#> Removed at Step 2 is - JobControl
#> Removed at Step 3 is - SocialSupport
#> Removed at Step 4 is - Gender
#>
#> Selection correctly terminated,
#> No more variables removed from the model
pool_lr$RR_model_final
#> $`Step 5`
#>          term    estimate  std.error  statistic       df    p.value        OR
#> 1 (Intercept)  1.20696975 0.48230894  2.5024827 138.9717 0.01349142 3.3433381
#> 2     Smoking  0.06427314 0.33804675  0.1901309 151.8086 0.84946055 1.0663836
#> 3    Function -0.14058914 0.04225212 -3.3273866 121.6357 0.00115993 0.8688462
#>   lower.EXP upper.EXP
#> 1 1.2990643 8.6045855
#> 2 0.5497462 2.0685436
#> 3 0.7997922 0.9438623
pool_lr$predictors_out
#>         Gender Smoking Function JobControl JobDemands SocialSupport
#> Step 1       0       0        0          0          1             0
#> Step 2       0       0        0          1          0             0
#> Step 3       0       0        0          0          0             1
#> Step 4       1       0        0          0          0             0
#> Removed      1       0        0          1          1             1
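With `method="MPR"` the test for a predictor is pooled by taking the median of the p-values obtained in the separate imputed data sets. A minimal sketch of that rule follows; the p-values are made up for illustration and do not come from `lbpmilr`.

```r
# Hypothetical p-values for one predictor, one per imputed data set.
p_values <- c(0.031, 0.048, 0.062, 0.027, 0.055)

# Median P Rule: the pooled p-value is the median.
p_pooled <- median(p_values)
p_pooled

# Compare against the selection threshold used in the example above.
p_crit <- 0.05
p_pooled < p_crit
```

In this made-up case the predictor would stay in the model, even though two of the five individual p-values exceed the threshold.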
Pooling logistic regression models over 5 imputed datasets with backward selection, using a p-value of 0.05 and method D2. Several interaction terms, including one with a categorical predictor, are part of the selection procedure.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "Function", "JobControl"),
                    p.crit = 0.05, cat.predictors = c("Carrying", "Satisfaction"),
                    int.predictors = c("Carrying:Smoking", "Gender:Smoking"),
                    method="D2", direction = "BW")
#> Removed at Step 1 is - JobControl
#> Removed at Step 2 is - factor(Satisfaction)
#> Removed at Step 3 is - Gender*Smoking
#> Removed at Step 4 is - Gender
#> Removed at Step 5 is - factor(Carrying)*Smoking
#> Removed at Step 6 is - Smoking
#> Removed at Step 7 is - Function
#>
#> Selection correctly terminated,
#> No more variables removed from the model
pool_lr$RR_model_final
#> $`Step 8`
#>                term  estimate std.error statistic       df      p.value
#> 1       (Intercept) -1.582393 0.3773100 -4.193880 151.2668 4.652717e-05
#> 2 factor(Carrying)2  1.391554 0.4709708  2.954650 144.1330 3.657304e-03
#> 3 factor(Carrying)3  2.248897 0.4750324  4.734198 151.4269 5.010441e-06
#>          OR  lower.EXP  upper.EXP
#> 1 0.2054828 0.09808504  0.4304753
#> 2 4.0210931 1.59751783 10.1214455
#> 3 9.4772792 3.73532116 24.0458093
pool_lr$predictors_out
#>         Gender Smoking Function JobControl factor(Carrying)
#> Step 1       0       0        0          1                0
#> Step 2       0       0        0          0                0
#> Step 3       0       0        0          0                0
#> Step 4       1       0        0          0                0
#> Step 5       0       0        0          0                0
#> Step 6       0       1        0          0                0
#> Step 7       0       0        1          0                0
#> Removed      1       1        1          1                0
#>         factor(Satisfaction) factor(Carrying)*Smoking Gender*Smoking
#> Step 1                     0                        0              0
#> Step 2                     1                        0              0
#> Step 3                     0                        0              1
#> Step 4                     0                        0              0
#> Step 5                     0                        1              0
#> Step 6                     0                        0              0
#> Step 7                     0                        0              0
#> Removed                    1                        1              1
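The D2 method pools the chi-square test statistics from the imputed data sets rather than the parameter estimates. The sketch below implements the combination rule as it is usually presented, following Li, Meng, Raghunathan and Rubin (1991). The function name `pool_d2` and the input values are hypothetical, and psfmi's internal implementation may differ in detail.

```r
# d: chi-square statistics, one per imputed data set
# k: degrees of freedom of the test
pool_d2 <- function(d, k) {
  m  <- length(d)
  r  <- (1 + 1 / m) * var(sqrt(d))              # relative increase in variance
  d2 <- (mean(d) / k - (m + 1) / (m - 1) * r) / (1 + r)
  v  <- k^(-3 / m) * (m - 1) * (1 + 1 / r)^2    # denominator degrees of freedom
  list(statistic = d2, df1 = k, df2 = v,
       p.value = pf(d2, k, v, lower.tail = FALSE))
}

# Made-up chi-square values from 3 imputed data sets, 1 degree of freedom.
res <- pool_d2(d = c(4, 9, 16), k = 1)
res$statistic   # 3 for these inputs
res$p.value
```

The pooled statistic is referred to an F distribution, so strong variation between the imputed data sets (a large `r`) both shrinks the statistic and lowers the denominator degrees of freedom.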
Same predictors as above, but now with forward selection using a p-value of 0.157 and method D1, and forcing several predictors, including an interaction term, to stay in the model during selection.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "Function", "JobControl"),
                    p.crit = 0.157, cat.predictors = c("Carrying", "Satisfaction"),
                    int.predictors = c("Carrying*Smoking", "Gender*Smoking"),
                    keep.predictors = c("Smoking*Carrying", "JobControl"),
                    method="D1", direction = "FW")
#>
#> Selection correctly terminated,
#> No new variables entered the model
pool_lr$RR_model_final
#> $`Final model`
#>                        term     estimate std.error  statistic        df
#> 1               (Intercept) -0.810522255 1.3650185 -0.5937812  55.33263
#> 2                   Smoking -1.796541680 1.1699026 -1.5356336  65.48650
#> 3                JobControl -0.004625312 0.0216596 -0.2135455  58.16427
#> 4         factor(Carrying)2  0.723452199 0.6214600  1.1641171 107.17663
#> 5         factor(Carrying)3  1.534813529 0.5908820  2.5974958 107.09227
#> 6 Smoking:factor(Carrying)2  2.093737680 1.3149790  1.5922214  66.01171
#> 7 Smoking:factor(Carrying)3  2.370029492 1.3934064  1.7008889  51.83267
#>      p.value         OR  lower.EXP  upper.EXP
#> 1 0.55507827  0.4446258 0.03062439   6.455380
#> 2 0.12944599  0.1658715 0.01674676   1.642907
#> 3 0.83164826  0.9953854 0.95401285   1.038552
#> 4 0.24696141  2.0615378 0.60980909   6.969293
#> 5 0.01071096  4.6404601 1.45744482  14.775084
#> 6 0.11611271  8.1151905 0.61654673 106.814803
#> 7 0.09495675 10.6977078 0.69694615 164.203434
pool_lr$predictors_in
#>          Gender Smoking Function JobControl factor(Carrying)
#> Step 1        0       1        0          1                1
#> Step 2        0       1        0          1                1
#> Step 3        0       1        0          1                1
#> Step 4        0       1        0          1                1
#> Included      0       1        0          1                1
#>          factor(Satisfaction) factor(Carrying)*Smoking Gender*Smoking
#> Step 1                      0                        1              0
#> Step 2                      0                        1              0
#> Step 3                      0                        1              0
#> Step 4                      0                        1              0
#> Included                    0                        1              0
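The threshold `p.crit = 0.157` in the call above is a common choice because, for a predictor with one degree of freedom, it corresponds to selection by Akaike's Information Criterion: a term improves the AIC when its likelihood-ratio chi-square exceeds 2. The short check below shows the arithmetic in base R.

```r
# Tail probability of a chi-square value of 2 on 1 degree of freedom.
p_aic <- pchisq(2, df = 1, lower.tail = FALSE)
round(p_aic, 3)
```

For predictors with more than one degree of freedom, such as a categorical variable with three levels, the AIC-equivalent p-value is different, so 0.157 is only an approximation there.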
Pooling logistic regression models over 5 imputed datasets with backward selection, using a p-value of 0.157 and method D1. A spline predictor and an interaction term are part of the selection procedure.
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
                    predictors=c("Gender", "Smoking", "JobControl"),
                    p.crit = 0.157, cat.predictors = c("Carrying", "Satisfaction"),
                    spline.predictors=c("Function"), int.predictors = c("Carrying:Smoking"),
                    nknots=3, method="D1", direction = "BW")
#> Removed at Step 1 is - JobControl
#> Removed at Step 2 is - Gender
#> Removed at Step 3 is - rcs(Function,3)
#> Removed at Step 4 is - factor(Satisfaction)
#> Removed at Step 5 is - factor(Carrying)*Smoking
#> Removed at Step 6 is - Smoking
#>
#> Selection correctly terminated,
#> No more variables removed from the model
pool_lr$RR_model_final
#> $`Step 7`
#>                term  estimate std.error statistic       df      p.value
#> 1       (Intercept) -1.582393 0.3773100 -4.193880 151.2668 4.652717e-05
#> 2 factor(Carrying)2  1.391554 0.4709708  2.954650 144.1330 3.657304e-03
#> 3 factor(Carrying)3  2.248897 0.4750324  4.734198 151.4269 5.010441e-06
#>          OR  lower.EXP  upper.EXP
#> 1 0.2054828 0.09808504  0.4304753
#> 2 4.0210931 1.59751783 10.1214455
#> 3 9.4772792 3.73532116 24.0458093
pool_lr$predictors_out
#>         Gender Smoking JobControl factor(Carrying) factor(Satisfaction)
#> Step 1       0       0          1                0                    0
#> Step 2       1       0          0                0                    0
#> Step 3       0       0          0                0                    0
#> Step 4       0       0          0                0                    1
#> Step 5       0       0          0                0                    0
#> Step 6       0       1          0                0                    0
#> Removed      1       1          1                0                    1
#>         rcs(Function,3) factor(Carrying)*Smoking
#> Step 1                0                        0
#> Step 2                0                        0
#> Step 3                1                        0
#> Step 4                0                        0
#> Step 5                0                        1
#> Step 6                0                        0
#> Removed               1                        1