Sherlock Holmes was a consulting detective who had spectacular powers of
deduction and logical reasoning. Within sherlock's causal segmentation
framework, `sherlock_calculate` takes data from a segmentation "case",
the roles of the different variables, and specifications for assessing the
conditional treatment effects required for deriving a segmentation. Being
the workhorse, this function is the most demanding, as it computes all of
the nuisance parameters required for subsequent analyses. The complementary
functions `watson_segment` and `mycroft_assess` can be used once Sherlock
has consulted on the causal segmentation case.

sherlock_calculate(data_from_case, baseline, exposure, outcome, segment_by,
ids = NULL, treatment_cost = NULL, cv_folds = 5L,
split_type = c("inner", "outer"), ps_learner, or_learner, cate_learner,
use_cv_selector = FALSE)
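A minimal sketch of how a call might be assembled, assuming a simulated A/B test. The data and all column names (`age`, `tenure`, `is_new`, `treated`, `spend`) are hypothetical, and the choice of `sl3::Lrnr_glm` for the outcome regression and CATE is an illustrative assumption rather than a package default. The call itself is attempted only when both sherlock and sl3 are installed.

```r
set.seed(221)
n <- 500

# Simulated "case" data; every column name here is hypothetical.
case_data <- data.frame(
  age    = round(runif(n, 18, 70)),
  tenure = rpois(n, 3),
  is_new = rbinom(n, 1, 0.4)
)
case_data$treated <- rbinom(n, 1, 0.5)  # randomized exposure, known rate 0.5
case_data$spend <- 10 + 0.1 * case_data$age +
  case_data$treated * (1 + 2 * case_data$is_new) + rnorm(n)

# Variable roles are passed as character strings or vectors of column names.
baseline   <- c("age", "tenure", "is_new")
exposure   <- "treated"
outcome    <- "spend"
segment_by <- c("tenure", "is_new")

# Sanity checks mirroring the argument descriptions.
stopifnot(
  all(c(baseline, exposure, outcome) %in% names(case_data)),
  all(segment_by %in% baseline),
  length(segment_by) < length(baseline)  # strict subset of baseline
)

# Only attempt the call if the packages are available.
if (requireNamespace("sherlock", quietly = TRUE) &&
    requireNamespace("sl3", quietly = TRUE)) {
  results <- sherlock::sherlock_calculate(
    data_from_case = case_data,
    baseline       = baseline,
    exposure       = exposure,
    outcome        = outcome,
    segment_by     = segment_by,
    cv_folds       = 5L,
    split_type     = "inner",
    ps_learner     = 0.5,                   # constant rate for an A/B test
    or_learner     = sl3::Lrnr_glm$new(),   # assumed learner choice
    cate_learner   = sl3::Lrnr_glm$new()    # assumed learner choice
  )
}
```

Passing a constant `ps_learner` reflects the documented option for randomized experiments; with observational data, a fitted learner would take its place.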

## Arguments

data_from_case |
Rectangular input data, whether a `data.frame`,
`data.table`, or `tibble`. |

baseline |
A `character` vector specifying the column names in
`data_from_case` that correspond to the baseline covariates (conditioning
set). These variables should temporally precede the exposure and outcome. |

exposure |
A `character` string (of length one) specifying the
column in `data_from_case` corresponding to the exposure or treatment. This
variable should follow those in `baseline` in time but precede the
response variable `outcome`. |

outcome |
A `character` string (of length one) specifying the
column in `data_from_case` corresponding to the response variable. |

segment_by |
A `character` vector specifying the column names in
`data_from_case` that correspond to the covariates over which segmentation
should be performed. This should be a strict subset of `baseline`. |

ids |
A `character` string (of length one) specifying the column
in `data_from_case` that gives observation-level IDs. The default value of
`NULL` assumes that all rows of `data_from_case` are independent. |

treatment_cost |
A `character` string (of length one) specifying
the column in `data_from_case` that provides the cost associated with
treating the given unit. The default value of `NULL` assumes that all units
are equally costly to treat. |

cv_folds |
A `numeric` specifying the number of cross-validation
folds to be used for sample-splitting when estimating nuisance parameters. |

split_type |
A `character` string (of length one) indicating the
sample-splitting "level" at which estimation of the CATE is performed. The
choices are `"inner"`, for estimation of the CATE within folds (i.e., at
the same level at which nuisance parameters are estimated), and `"outer"`,
in which case the CATE is estimated at the "full-sample" level. |

ps_learner |
Either an instantiated learner object (class inheriting
from `Lrnr_base`) from sl3, a `list` of specifications, or a constant
rate between 0 and 1, to be used for estimation of the propensity score
(the probability of receiving treatment, conditional on covariates). If
`list`: each entry can be an instantiated learner object, or can be a
list where one item is an instantiated learner object whose modeling
requires specification, and the other item is a list of character vectors,
where each vector specifies an interaction term. If constant rate: this
rate represents the population probability of being assigned to treatment
in an A/B test. Note that the outcome of this estimation task is strictly
binary and that algorithms or ensemble models should be set up
accordingly. |

or_learner |
Either an instantiated learner object (class inheriting
from `Lrnr_base`) from sl3, or a `list` of
specifications, to be used for estimation of the outcome regression (the
mean of the response variable, conditional on exposure and covariates). If
`list`: each entry can be an instantiated learner object, or can be a
list where one item is an instantiated learner object whose modeling
requires specification, and the other item is a list of character vectors,
where each vector specifies an interaction term. |

cate_learner |
Either an instantiated learner object (class inheriting
from `Lrnr_base`) from sl3, or a `list` of
specifications, to be used to estimate the CATE, based on a regression of a
doubly robust pseudo-outcome on the specified segmentation covariates. If
`list`: each entry can be an instantiated learner object, or can be a
list where one item is an instantiated learner object whose modeling
requires specification, and the other item is a list of character vectors,
where each vector specifies an interaction term. Note that the outcome of
this estimation task is derived from the other nuisance parameter estimates
and should be expected to always be continuous-valued, so algorithms or
ensemble models should be set up accordingly. |

use_cv_selector |
If `TRUE`, cross-validation is used to choose the best among a list of
learners when fitting `ps_learner`, `or_learner`, or `cate_learner`. If
`FALSE` (default), the default metalearner for the outcome type (from sl3)
is used. This argument is ignored for a learner that is not a list but is
instead an instantiated learner object, since there is then nothing to
select among. |
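
The three learner arguments correspond to the pieces of a doubly robust CATE estimate. The following is a rough base-R sketch of that general idea, not sherlock's internal implementation: it uses `lm` in place of sl3 learners, omits cross-fitting entirely, and runs on simulated data in which `W1`, `W2`, `A`, and `Y` are hypothetical names. The propensity score is taken to be a known constant, as in an A/B test.

```r
set.seed(1887)
n <- 2000

# Hypothetical simulated data: W1 is the segmentation covariate,
# W2 is an additional baseline covariate.
W1 <- rbinom(n, 1, 0.5)
W2 <- rnorm(n)
A  <- rbinom(n, 1, 0.5)                     # randomized exposure, rate 0.5
Y  <- 1 + W2 + A * (1 + 2 * W1) + rnorm(n)  # true CATE: 1 if W1 = 0, 3 if W1 = 1
dat <- data.frame(W1, W2, A, Y)

# Propensity score: a constant rate, playing the role of ps_learner.
g1 <- rep(0.5, n)

# Outcome regression (role of or_learner): mean of Y given exposure and covariates.
or_fit <- lm(Y ~ A * (W1 + W2), data = dat)
Q1 <- predict(or_fit, newdata = transform(dat, A = 1))  # predicted Y under treatment
Q0 <- predict(or_fit, newdata = transform(dat, A = 0))  # predicted Y under control
QA <- ifelse(dat$A == 1, Q1, Q0)                        # prediction at observed exposure

# Doubly robust (AIPW-style) pseudo-outcome; it is continuous-valued
# regardless of the type of Y.
pseudo <- (dat$A / g1 - (1 - dat$A) / (1 - g1)) * (dat$Y - QA) + (Q1 - Q0)

# CATE (role of cate_learner): regress the pseudo-outcome on the
# segmentation covariate only.
cate_fit <- lm(pseudo ~ W1, data = dat)
cate_hat <- predict(cate_fit, newdata = data.frame(W1 = c(0, 1)))
print(round(cate_hat, 2))
```

Because the pseudo-outcome is built from the other nuisance estimates, the final regression is always a continuous-outcome problem, which is why the CATE learner should be configured for a continuous response even when `outcome` is binary.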