Estimates the conditional average treatment effect (CATE) by regressing a
doubly robust pseudo-outcome, derived from the form of the efficient
influence function (a key quantity in semiparametric statistics), on the
segmentation covariates. Note that the data for this estimation procedure is
created according to the specifications provided in set_est_data, so this
function takes only those arguments directly relevant to nuisance parameter
estimation.
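For orientation, a doubly robust pseudo-outcome of this kind is commonly written in the augmented inverse probability weighted (AIPW) form below. The notation is an assumption introduced here for illustration (Y the response, A the binary treatment, X the covariates, \pi the propensity score, \mu_a the outcome regression); the package's internal parameterization may differ.

```latex
\varphi(Y, A, X) =
  \frac{A - \pi(X)}{\pi(X)\,\{1 - \pi(X)\}}\,\{Y - \mu_A(X)\}
  + \mu_1(X) - \mu_0(X),
\qquad
\pi(X) = P(A = 1 \mid X), \quad
\mu_a(X) = E[Y \mid A = a, X]
```

Regressing \varphi on the segmentation covariates then targets the CATE. Under standard conditions the result remains consistent if either \pi or \mu_a is estimated consistently, which is the sense in which the pseudo-outcome is doubly robust.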
est_cate(data_est_spec, cv_folds = 5L, split_type = c("inner", "outer"),
ps_learner, or_learner, cate_learner, use_cv_selector = F)
Arguments
data_est_spec |
An input data.table object
created from the input data by set_est_data . Note that this
data container object has specialized attributes appended to it, so it must
be created by that internal utility function. |
cv_folds |
A numeric specifying the number of cross-validation
folds to be used for sample-splitting when estimating nuisance parameters. |
split_type |
A character string (of length one) indicating the
sample-splitting "level" at which estimation of the CATE is performed. The
choices are "inner", for estimation of the CATE within folds (i.e., at the
same level at which nuisance parameters are estimated), and "outer", in
which case the CATE is estimated at the "full-sample" level. |
ps_learner |
Either an instantiated learner object (of a class inheriting
from Lrnr_base) from sl3, a list of specifications, or a constant rate
between 0 and 1, to be used for estimation of the propensity score (the
probability of receiving treatment, conditional on covariates). If a list:
each entry may be an instantiated learner object, or a list in which one
item is an instantiated learner object whose modeling requires
specification and the other item is a list of character vectors, where each
vector specifies an interaction term. If a constant rate: this rate
represents the population probability of being assigned to treatment in an
A/B test. Note that the outcome of this estimation task is strictly binary,
so algorithms or ensemble models should be set up accordingly. |
or_learner |
Either an instantiated learner object (of a class inheriting
from Lrnr_base) from sl3, or a list of specifications, to be used for
estimation of the outcome regression (the mean of the response variable,
conditional on exposure and covariates). If a list: each entry may be an
instantiated learner object, or a list in which one item is an instantiated
learner object whose modeling requires specification and the other item is
a list of character vectors, where each vector specifies an interaction
term. |
cate_learner |
Either an instantiated learner object (of a class inheriting
from Lrnr_base) from sl3, or a list of specifications, to be used to
estimate the CATE, based on a regression of a doubly robust pseudo-outcome
on the specified segmentation covariates. If a list: each entry may be an
instantiated learner object, or a list in which one item is an instantiated
learner object whose modeling requires specification and the other item is
a list of character vectors, where each vector specifies an interaction
term. Note that the outcome of this estimation task is derived from the
other nuisance parameter estimates and should be expected to always be
continuous-valued, so algorithms or
ensemble models should be set up accordingly. |
use_cv_selector |
A logical. If TRUE, cross-validation is used to choose the best among a
list of learners when fitting ps_learner, or_learner, or cate_learner. If
FALSE (the default), the default metalearner for the outcome type is used.
This argument has no effect when a learner is supplied as a single
instantiated learner object rather than as a list. |
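The three accepted forms of a learner argument can be sketched as the configuration fragment below. It uses only learner classes that ship with sl3 (Lrnr_glm, Lrnr_mean); the covariate names "age" and "tenure" are hypothetical, and the order of the two items inside the nested list is an assumption, since the descriptions above do not fix it.

```r
# Hypothetical learner specifications for illustration only.
library(sl3)

# Constant rate: known assignment probability in a randomized A/B test
ps_learner <- 0.5

# Single instantiated learner object
or_learner <- Lrnr_glm$new()

# List of specifications: a plain learner, plus a learner paired with
# interaction terms (each character vector names one interaction;
# "age" and "tenure" are made-up covariate names)
cate_learner <- list(
  Lrnr_mean$new(),
  list(Lrnr_glm$new(), list(c("age", "tenure")))
)
```

With a list such as cate_learner, use_cv_selector decides whether the candidates are combined by the default metalearner or whether the single best candidate is chosen by cross-validation.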
Value
A data.table of the full data, augmented with additional columns that
specify estimates of the nuisance parameters, the doubly robust
pseudo-outcome, and the estimated CATE.
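The overall procedure can be illustrated with a minimal base R sketch. This is not the package's implementation: glm and lm stand in for the sl3 learners, the simulated data and every column name are hypothetical, and the final regression is run on the full sample, which corresponds roughly to split_type = "outer" as described above.

```r
# Illustrative sketch only; all names and the simulated data are hypothetical.
set.seed(1)
n <- 2000
dat <- data.frame(
  x   = rnorm(n),            # baseline covariate
  seg = rbinom(n, 1, 0.5)    # segmentation covariate
)
dat$a <- rbinom(n, 1, 0.5)   # randomized binary treatment
dat$y <- 1 + dat$x + dat$a * (0.5 + dat$seg) + rnorm(n)

# Assign each observation to one of cv_folds folds
cv_folds <- 5L
fold <- sample(rep(seq_len(cv_folds), length.out = n))
dat$ps  <- NA_real_
dat$mu1 <- NA_real_
dat$mu0 <- NA_real_

# Cross-fitting: fit nuisance models on the training folds,
# predict on the held-out fold
for (k in seq_len(cv_folds)) {
  train <- dat[fold != k, ]
  valid <- fold == k
  ps_fit <- glm(a ~ x + seg, family = binomial(), data = train)
  or_fit <- lm(y ~ a * (x + seg), data = train)
  dat$ps[valid]  <- predict(ps_fit, newdata = dat[valid, ], type = "response")
  dat$mu1[valid] <- predict(or_fit, newdata = transform(dat[valid, ], a = 1))
  dat$mu0[valid] <- predict(or_fit, newdata = transform(dat[valid, ], a = 0))
}

# Doubly robust (AIPW-type) pseudo-outcome
mu_a <- ifelse(dat$a == 1, dat$mu1, dat$mu0)
dat$dr_pseudo <- (dat$a - dat$ps) / (dat$ps * (1 - dat$ps)) * (dat$y - mu_a) +
  dat$mu1 - dat$mu0

# CATE: regress the pseudo-outcome on the segmentation covariate
cate_fit <- lm(dr_pseudo ~ seg, data = dat)
dat$cate <- predict(cate_fit, newdata = dat)
```

The resulting data frame mirrors the shape of the returned value: the original data plus columns for the nuisance estimates (ps, mu1, mu0), the pseudo-outcome, and the estimated CATE.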