Overview
precmed was developed to help researchers with the implementation of precision medicine in R. A key objective of precision medicine is to determine the optimal treatment separately for each patient instead of applying a common treatment to all patients. Personalizing treatment decisions becomes particularly relevant when treatment response differs across patients, or when patients have different preferences about benefits and harms. This package offers statistical methods to develop and validate prediction models for estimating individualized treatment effects. These treatment effects are also known as conditional average treatment effects (CATEs) and describe how different subgroups of patients respond to the same treatment. Presently, precmed focuses on the personalization of two competing treatments using randomized data from a clinical trial (Zhao et al. 2013) or using real-world data (RWD) from a non-randomized study (Yadlowsky et al. 2020).
Installation
The precmed package can be installed from CRAN as follows:
install.packages("precmed")
The latest version can be installed from GitHub as follows:
install.packages("devtools")
devtools::install_github(repo = "smartdata-analysis-and-statistics/precmed")
Package capabilities
The main functions in the precmed package are:
Function | Description |
---|---|
catefit() | Estimation of the conditional average treatment effect (CATE) |
atefit() | Doubly robust estimator for the average treatment effect (ATE) |
catecv() | Development and cross-validation of the CATE |
abc() | Compute the area between the average treatment difference curve of competing models for the CATE (Zhao et al. 2013) |
plot() | Two side-by-side line plots of validation curves from the precmed object |
boxplot() | Plot the proportion of subjects with an estimated treatment effect no less than a threshold c over a range of values of c (Zhao et al. 2013) |
For more info: https://smartdata-analysis-and-statistics.github.io/precmed/
Recommended workflow
We recommend the following workflow to develop a model for estimating the CATE in order to identify treatment effect heterogeneity:
- Compare up to five modelling approaches (e.g., Poisson regression, boosting) for estimating the CATE using cross-validation through catecv().
- Select the best modelling approach using 3 metrics:
  - Compare the steepness of the validation curves in the validation samples across methods using plot(). Two side-by-side plots are generated, visualizing the estimated average treatment effects in a series of nested subgroups. On the left side the curve is shown for the training set, and on the right side the curve is shown for the validation set. Each line in the plots represents one scoring method (e.g., boosting, randomForest) specified under the argument score.method.
  - The area between curves (ABC) using abc() quantifies a model’s ability to capture treatment effect heterogeneity. Higher ABC values are preferable as they indicate that more treatment effect heterogeneity is captured by the scoring method.
  - Compare the distribution of the estimated ATE across different levels of the CATE score percentiles using boxplot().
- Apply the best modelling approach in the original data or in a new external dataset using catefit().
- Optional. Use atefit() to estimate the ATE between 2 treatment groups with a doubly robust estimator and estimate the variability of the ATE with a bootstrap approach.
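The workflow above can be sketched in R roughly as follows. This is a minimal illustration, not the package's canonical example. It assumes the countExample dataset bundled with precmed (count outcome y, treatment trt, exposure time years). The covariates, the two scoring methods, cv.n = 3, n.boot = 50, and higher.y = FALSE (fewer events taken as the better outcome) are illustrative choices; the function help pages give the full argument lists.

```r
library(precmed)

# Illustrative formulas; the column names assume the bundled countExample data.
cate_formula <- y ~ age + female + previous_treatment + previous_cost +
  previous_number_relapses + offset(log(years))
ps_formula <- trt ~ age + previous_treatment

# Step 1: cross-validate candidate scoring methods for the CATE.
cv <- catecv(
  response = "count",
  data = countExample,
  score.method = c("poisson", "contrastReg"),
  cate.model = cate_formula,
  ps.model = ps_formula,
  initial.predictor.method = "poisson",  # initial predictor used by contrastReg
  higher.y = FALSE,                      # assumption: lower counts are better
  cv.n = 3,                              # few iterations to keep the sketch fast
  seed = 999
)

# Step 2: compare the methods with the three metrics.
plot(cv)                # validation curves, training vs. validation samples
abc_values <- abc(cv)   # area between curves for each scoring method
boxplot(cv)             # estimated ATEs across CATE score percentiles

# Step 3: refit the preferred method (here assumed to be contrastReg).
fit <- catefit(
  response = "count",
  data = countExample,
  score.method = "contrastReg",
  cate.model = cate_formula,
  ps.model = ps_formula,
  initial.predictor.method = "poisson",
  higher.y = FALSE,
  seed = 999
)

# Step 4 (optional): doubly robust ATE with bootstrap variability.
ate <- atefit(
  response = "count",
  data = countExample,
  cate.model = cate_formula,
  ps.model = ps_formula,
  n.boot = 50,   # small bootstrap size, for illustration only
  seed = 999
)
```

In a real analysis, more cross-validation iterations and bootstrap samples would normally be used than in this sketch.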
In the vignettes, we adopt a different workflow that moves gradually from simple to more complex methods.
User input
When applying catefit() or catecv(), the user has to (at least) input:
- response: type of outcome/response (either count or survival)
- data: a data frame with individual patient data
- score.method: methods to estimate the CATE (e.g., boosting, poisson, twoReg, contrastReg)
- cate.model: a formula describing the outcome model (e.g., outcome ~ age + gender + previous_treatment)
- ps.model: a formula describing the propensity score model to adjust for confounding (e.g., treatment ~ age + previous_treatment)
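As a minimal sketch, a call that supplies only these five inputs might look like the following. It assumes the countExample dataset shipped with precmed, and the covariates in both formulas are chosen purely for illustration. All other arguments are left at their defaults, which may not suit a real analysis (for example, the assumed direction of benefit of the outcome).

```r
library(precmed)

fit <- catefit(
  response = "count",          # type of outcome
  data = countExample,         # individual patient data (assumed bundled dataset)
  score.method = "poisson",    # method used to estimate the CATE
  # Outcome model; the offset accounts for differing exposure time.
  cate.model = y ~ age + female + previous_treatment + offset(log(years)),
  # Propensity score model for the treatment variable.
  ps.model = trt ~ age + previous_treatment
)

# List the components returned by the fit.
names(fit)
```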