mrtool.core package
data
Data module for the mrtool package.
-
class MRData(obs=<factory>, obs_se=<factory>, covs=<factory>, study_id=<factory>, data_id=<factory>)
Bases: object
Data for simple linear mixed effects model.
-
get_covs(covs)
Get covariate matrix.
Parameters: covs (Union[List[str], str]) – List of covariate names or one covariate name.
Returns: Covariate matrix, with covariates stored as columns.
Return type: np.ndarray
-
get_study_data(studies)
Get study-specific data.
Parameters: studies (Union[List[Any], Any]) – List of studies or one study.
Returns: Data object containing the study-specific data.
Return type: MRData
-
has_covs(covs)
Check whether the data has the provided covariates.
Parameters: covs (Union[List[str], str]) – List of covariate names or one covariate name.
Returns: True if the data has the covariates.
Return type: bool
-
has_studies(studies)
Check whether the data has the provided study_id values.
Parameters: studies (Union[List[Any], Any]) – List of studies or one study.
Returns: True if the data has the studies.
Return type: bool
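Both has_covs and has_studies accept either a single value or a list. A minimal sketch of that calling convention follows; the helper name has_all and the plain-list storage are assumptions made for illustration, not mrtool's implementation.

```python
from typing import Any, List, Union


def has_all(available: List[Any], requested: Union[List[Any], Any]) -> bool:
    """Return True if every requested item is present (hypothetical helper)."""
    # Treat a single value as a one-element list, mirroring the Union[...] type.
    items = requested if isinstance(requested, list) else [requested]
    return all(item in available for item in items)


study_ids = ["A", "B", "C"]
single = has_all(study_ids, "A")          # one study
several = has_all(study_ids, ["A", "D"])  # "D" is missing
```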
-
load_df(data, col_obs=None, col_obs_se=None, col_covs=None, col_study_id=None, col_data_id=None)
Load data from data frame.
-
load_xr(data, var_obs=None, var_obs_se=None, var_covs=None, coord_study_id=None)
Load data from xarray.
-
normalize_covs(covs=None)
Normalize covariates by the largest absolute value for each covariate.
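The normalization described for normalize_covs can be sketched with NumPy. This only illustrates the documented behavior (dividing each covariate by its largest absolute value); the helper name normalize_by_max_abs, the dict-of-arrays layout, and the handling of all-zero covariates are assumptions.

```python
import numpy as np


def normalize_by_max_abs(covs):
    """Scale each covariate by its largest absolute value (hypothetical helper)."""
    scaled = {}
    for name, values in covs.items():
        values = np.asarray(values, dtype=float)
        scale = np.max(np.abs(values))
        # Assumption: leave an all-zero covariate unchanged to avoid dividing by zero.
        scaled[name] = values if scale == 0.0 else values / scale
    return scaled


covs = {"dose": [2.0, -4.0, 1.0], "intercept": [1.0, 1.0, 1.0]}
normalized = normalize_by_max_abs(covs)
```

After scaling, every covariate lies in [-1, 1], which keeps coefficients of differently scaled covariates comparable.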
-
num_covs
Number of covariates.
-
num_obs
Number of observations.
-
num_points
Number of data points.
-
num_studies
Number of studies.
-
cov_model
Covariates model for mrtool.
-
class CovModel(alt_cov, name=None, ref_cov=None, use_re=False, use_re_mid_point=False, use_spline=False, use_spline_intercept=False, spline_knots_type='frequency', spline_knots=array([0., 0.33333333, 0.66666667, 1. ]), spline_degree=3, spline_l_linear=False, spline_r_linear=False, prior_spline_derval_gaussian=None, prior_spline_derval_gaussian_domain=(0.0, 1.0), prior_spline_derval_uniform=None, prior_spline_derval_uniform_domain=(0.0, 1.0), prior_spline_der2val_gaussian=None, prior_spline_der2val_gaussian_domain=(0.0, 1.0), prior_spline_der2val_uniform=None, prior_spline_der2val_uniform_domain=(0.0, 1.0), prior_spline_funval_gaussian=None, prior_spline_funval_gaussian_domain=(0.0, 1.0), prior_spline_funval_uniform=None, prior_spline_funval_uniform_domain=(0.0, 1.0), prior_spline_monotonicity=None, prior_spline_monotonicity_domain=(0.0, 1.0), prior_spline_convexity=None, prior_spline_convexity_domain=(0.0, 1.0), prior_spline_num_constraint_points=20, prior_spline_maxder_gaussian=None, prior_spline_maxder_uniform=None, prior_spline_normalization=None, prior_beta_gaussian=None, prior_beta_uniform=None, prior_beta_laplace=None, prior_gamma_gaussian=None, prior_gamma_uniform=None, prior_gamma_laplace=None)
Bases: object
Covariates model.
-
create_constraint_mat()
Create constraint matrix.
Returns: Linear constraints matrix and its uniform prior.
Return type: tuple{numpy.ndarray, numpy.ndarray}
-
create_design_mat(data)
Create design matrix.
Parameters: data (mrtool.MRData) – The data frame used for storing the data.
Returns: The design matrix for a linear covariate or spline.
Return type: tuple{numpy.ndarray, numpy.ndarray}
-
create_regularization_mat()
Create regularization matrix.
Returns: Linear regularization matrix and its Gaussian prior.
Return type: tuple{numpy.ndarray, numpy.ndarray}
-
create_spline(data, spline_knots=None)
Create spline given current spline parameters.
Parameters: - data (mrtool.MRData) – The data frame used for storing the data.
- spline_knots (np.ndarray, optional) – Spline knots; if None, determined by frequency or domain.
Returns: The spline object.
Return type: xspline.XSpline
-
num_constraints
-
num_regularizations
-
num_x_vars
-
num_z_vars
-
-
class LinearCovModel(*args, **kwargs)
Bases: mrtool.core.cov_model.CovModel
Linear Covariates Model.
-
class LogCovModel(*args, **kwargs)
Bases: mrtool.core.cov_model.CovModel
Log Covariates Model.
-
create_constraint_mat(threshold=1e-06)
Create constraint matrix. Overrides the superclass method, adding non-negative constraints.
-
create_x_fun(data)
Create design functions for the fixed effects.
Parameters: data (mrtool.MRData) – The data frame used for storing the data.
Returns: Design functions for fixed effects.
Return type: tuple{function, function}
-
create_z_mat(data)
Create design matrix for the random effects.
Parameters: data (mrtool.MRData) – The data frame used for storing the data.
Returns: Design matrix for random effects.
Return type: numpy.ndarray
-
num_constraints
-
num_z_vars
-
model
Model module for mrtool package.
-
class MRBRT(data, cov_models, inlier_pct=1.0)
Bases: object
MR-BRT Object
-
create_draws(data, beta_samples, gamma_samples, random_study=True, sort_by_data_id=False)
Create draws for the given data set.
Parameters: - data (MRData) – MRData object containing the data to predict for.
- beta_samples (np.ndarray) – Samples of beta.
- gamma_samples (np.ndarray) – Samples of gamma.
- random_study (bool, optional) – If True, the draws will include uncertainty from study heterogeneity.
- sort_by_data_id (bool, optional) – If True, sort the final prediction in the order of the original data frame that was used to create the data. Defaults to False.
Returns: Outcome sample matrix.
Return type: np.ndarray
-
fit_model(**fit_options)
Fit the model through limetr.
Parameters: - x0 (np.ndarray) – Initial guess for the optimization problem.
- inner_print_level (int) – If non-zero, print iteration information of the inner problem.
- inner_max_iter (int) – Maximum number of inner iterations.
- inner_tol (float) – Tolerance of the inner problem.
- outer_verbose (bool) – If True, print iteration information of the outer problem.
- outer_max_iter (int) – Maximum number of outer iterations.
- outer_step_size (float) – Step size of the outer problem.
- outer_tol (float) – Tolerance of the outer problem.
- normalize_trimming_grad (bool) – If True, normalize the gradient of the outer trimming problem.
-
predict(data, predict_for_study=False, sort_by_data_id=False)
Create new prediction with existing solution.
Parameters: - data (MRData) – MRData object containing the data to predict for.
- predict_for_study (bool, optional) – If True, use the random effects information to predict for specific studies. If a study_id in data is not contained in the fitting data, the corresponding random effect is assumed to be 0.
- sort_by_data_id (bool, optional) – If True, sort the final prediction in the order of the original data frame that was used to create the data. Defaults to False.
Returns: Predicted outcome array.
Return type: np.ndarray
-
sample_soln(sample_size=1, sim_prior=True, sim_re=True, max_iter=100, print_level=0)
Sample solutions.
Parameters: - sample_size (int, optional) – Number of samples.
- sim_prior (bool, optional) – If True, simulate priors.
- sim_re (bool, optional) – If True, simulate random effects.
- max_iter (int, optional) – Maximum number of iterations. Defaults to 100.
- print_level (int, optional) – Level of detail of the optimization information printed during the sampling process. If 0, no information is printed.
Returns: Beta samples and gamma samples.
Return type: Tuple[np.ndarray, np.ndarray]
-
-
class MRBeRT(data, ensemble_cov_model, ensemble_knots, cov_models=None, inlier_pct=1.0)
Bases: object
Ensemble model of MRBRT.
-
create_draws(data, beta_samples, gamma_samples, random_study=True, sort_by_data_id=False)
Create draws. For the function description, see create_draws for MRBRT.
Return type: ndarray
-
fit_model(x0=None, inner_print_level=0, inner_max_iter=20, inner_tol=1e-08, outer_verbose=False, outer_max_iter=100, outer_step_size=1.0, outer_tol=1e-06, normalize_trimming_grad=False, scores_weights=array([1., 1.]), slopes=array([ 2., 10.]), quantiles=array([0.4, 0.4]))
Fit the model through limetr.
-
predict(data, predict_for_study=False, sort_by_data_id=False, return_avg=True)
Create new prediction with existing solution.
Parameters: return_avg (bool) – If True, return an average prediction weighted by the scores; if False, return a list of predictions from all groups.
Return type: ndarray
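The score-based average behind return_avg can be pictured as a weighted mean over the ensemble's predictions. The sketch below is an assumption about the arithmetic only: the helper name weighted_average_prediction and the normalization of scores to sum to 1 are not taken from mrtool.

```python
import numpy as np


def weighted_average_prediction(predictions, scores):
    """Average per-model predictions using normalized scores (hypothetical helper)."""
    predictions = np.asarray(predictions, dtype=float)  # shape (num_models, num_points)
    weights = np.asarray(scores, dtype=float)
    # Assumption: scores are non-negative and are normalized to sum to 1.
    weights = weights / weights.sum()
    return weights @ predictions


per_model = [[1.0, 2.0], [3.0, 4.0]]  # two models, two data points
averaged = weighted_average_prediction(per_model, scores=[1.0, 3.0])
```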
-
sample_soln(sample_size=1, sim_prior=True, sim_re=True, max_iter=100, print_level=0)
Sample solution.
Return type: Tuple[List[ndarray], List[ndarray]]
-
-
create_knots_samples(data, alt_cov_names=None, ref_cov_names=None, l_zero=True, num_splines=50, num_knots=5, width_pct=0.2, return_settings=False)
Create knot samples for relative risk application.
Parameters: - data (MRData) – Data object.
- alt_cov_names (List[str], optional) – Names of the alternative exposures; if None, use ['b_0', 'b_1']. Defaults to None.
- ref_cov_names (List[str], optional) – Names of the reference exposures; if None, use ['a_0', 'a_1']. Defaults to None.
- l_zero (bool, optional) – If True, assume the exposure minimum is 0. Defaults to True.
- num_splines (int, optional) – Number of splines. Defaults to 50.
- num_knots (int, optional) – Number of spline knots. Defaults to 5.
- width_pct (float, optional) – Minimum percentage distance between knots. Defaults to 0.2.
- return_settings (bool, optional) – If True, return the knots settings. Defaults to False.
Returns: Knots samples.
Return type: np.ndarray
utils
Utils module of the mrtool package.
-
avg_integral(mat, spline=None, use_spline_intercept=False)
Compute average integral.
Parameters: - mat (numpy.ndarray) – Matrix that contains the starting and ending points of the integral, or a single column that represents the mid-points.
- spline (xspline.XSpline | None, optional) – Spline to integrate over; when None, treat the function as linear.
- use_spline_intercept (bool, optional) – If True, use all bases from the spline; otherwise remove the first basis.
Returns: Design matrix when spline is not None, otherwise the mid-points.
Return type: numpy.ndarray
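When spline is None the function is treated as linear, and the average of f(x) = x over an interval [a, b] is its mid-point (a + b) / 2, which is why mid-points are returned in that case. A minimal sketch of that arithmetic follows; the helper name average_of_linear is hypothetical and the spline branch is deliberately left out.

```python
import numpy as np


def average_of_linear(mat):
    """Average of f(x) = x over each row's interval [a, b] (hypothetical helper)."""
    mat = np.asarray(mat, dtype=float)
    if mat.ndim == 1 or mat.shape[1] == 1:
        # A single column is already interpreted as mid-points.
        return mat.reshape(-1)
    start, end = mat[:, 0], mat[:, 1]
    # (1 / (b - a)) * integral of x dx from a to b simplifies to (a + b) / 2.
    return 0.5 * (start + end)


intervals = np.array([[0.0, 2.0], [1.0, 4.0]])
mid_points = average_of_linear(intervals)
```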
-
combine_cols(cols)
Combine column names into one list of names.
Parameters: cols (list{str | list{str}}) – A list whose items are column names or lists of column names.
Returns: Combined names of columns.
Return type: list{str}
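Combining a mix of single names and lists of names amounts to flattening one level. A minimal sketch, assuming the input order is preserved and duplicates are kept; the helper name combine_names is hypothetical and is not mrtool's code.

```python
def combine_names(cols):
    """Flatten a list of names and lists of names into one list (hypothetical helper)."""
    combined = []
    for col in cols:
        if isinstance(col, str):
            combined.append(col)   # a single column name
        else:
            combined.extend(col)   # a list of column names
    return combined


names = combine_names(["obs", ["x1", "x2"], "study_id"])
```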
-
expand_array(array, shape, value, name)
Expand array when it is empty.
Parameters: - array (np.ndarray) – Target array. If array is empty, fill in the value. When it is not empty, assert that the shape agrees and return the original array.
- shape (Tuple[int]) – The expected shape of the array.
- value (Any) – The expected value in the final array.
- name (str) – Variable name of the array (for error messages).
Returns: Expanded array.
Return type: np.ndarray
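The fill-or-check behavior described for expand_array can be sketched as follows. The helper name expand_if_empty and the wording of the error message are assumptions; only the documented rule (fill when empty, otherwise assert the shape) is illustrated.

```python
import numpy as np


def expand_if_empty(array, shape, value, name):
    """Fill an empty array with `value`; otherwise check its shape (hypothetical helper)."""
    array = np.asarray(array)
    if array.size == 0:
        return np.full(shape, value)
    # A non-empty array must already have the expected shape.
    assert array.shape == shape, f"{name} must have shape {shape}, got {array.shape}."
    return array


filled = expand_if_empty(np.array([]), (3,), 1.0, "obs_se")
kept = expand_if_empty(np.array([0.1, 0.2, 0.3]), (3,), 1.0, "obs_se")
```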
-
get_cols(df, cols)
Return the columns of the given data frame.
Parameters: - df (pandas.DataFrame) – Given data frame.
- cols (str | list{str} | None) – Given column name(s); if None, an empty data frame is returned.
Returns: The data frame containing the columns.
Return type: pandas.DataFrame | pandas.Series
-
input_cols(cols, append_to=None, default=None)
Process the input column name.
Parameters: - cols (str | list{str} | None) – The input column name(s).
- append_to (list{str} | None, optional) – A list that keeps track of all the column names.
- default (str | list{str} | None, optional) – Default value when cols is None.
Returns: The name of the column(s).
Return type: str | list{str}
-
input_gaussian_prior(prior, size)
Process the input Gaussian prior.
Parameters: - prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
- size (int, optional) – Size of the variable that the prior relates to.
Returns: Prior after processing, with shape (2, size); the first row stores the mean and the second row stores the standard deviation.
Return type: numpy.ndarray
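The (2, size) layout can be illustrated by broadcasting a single (mean, standard deviation) pair across all variables. This is a sketch under the assumption that a one-dimensional prior is repeated for every variable; the helper name broadcast_gaussian_prior is hypothetical, and empty or None priors are not handled here.

```python
import numpy as np


def broadcast_gaussian_prior(prior, size):
    """Return a (2, size) array: row 0 holds means, row 1 standard deviations."""
    prior = np.asarray(prior, dtype=float)
    if prior.ndim == 1:
        # Assumption: one (mean, sd) pair is shared by all `size` variables.
        return np.repeat(prior[:, None], size, axis=1)
    # Already two-dimensional: expect one column per variable.
    assert prior.shape == (2, size), f"expected shape (2, {size}), got {prior.shape}"
    return prior


shared = broadcast_gaussian_prior([0.0, 1.0], size=3)
```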
-
input_laplace_prior(prior, size)
Process the input Laplace prior.
Parameters: - prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
- size (int, optional) – Size of the variable that the prior relates to.
Returns: Prior after processing, with shape (2, size); the first row stores the mean and the second row stores the standard deviation.
Return type: numpy.ndarray
-
input_uniform_prior(prior, size)
Process the input uniform prior.
Parameters: - prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the lower bound and the second group to the upper bound.
- size (int, optional) – Size of the variable that the prior relates to.
Returns: Prior after processing, with shape (2, size); the first row stores the lower bound and the second row stores the upper bound.
Return type: numpy.ndarray
-
is_cols(cols)
Check whether the variable type falls into the column name category.
Parameters: cols (str | list{str} | None) – Column names candidate.
Returns: True if cols is either str, list{str} or None.
Return type: bool
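The type check described for is_cols can be sketched in plain Python. The helper name is_column_spec is hypothetical, and treating an empty list as valid is an assumption of this sketch.

```python
def is_column_spec(cols):
    """Return True for None, a str, or a list made up only of str (hypothetical helper)."""
    if cols is None or isinstance(cols, str):
        return True
    return isinstance(cols, list) and all(isinstance(col, str) for col in cols)


valid_single = is_column_spec("obs")
valid_list = is_column_spec(["x1", "x2"])
invalid = is_column_spec(["x1", 2])
```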
-
is_gaussian_prior(prior, size=None)
Check whether the variable satisfies the Gaussian prior format.
Parameters: prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
Keyword Arguments: size (int | None, optional) – Size of the variable that the prior relates to.
Returns: True if the condition is satisfied.
Return type: bool
-
is_laplace_prior(prior, size=None)
Check whether the variable satisfies the Laplace prior format.
Parameters: prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
Keyword Arguments: size (int | None, optional) – Size of the variable that the prior relates to.
Returns: True if the condition is satisfied.
Return type: bool
-
is_numeric_array(array)
Check if an array is numeric.
Parameters: array (np.ndarray) – Array to be checked.
Returns: True if the array is numeric.
Return type: bool
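One way to test whether an array is numeric is to inspect its dtype. The sketch below uses np.issubdtype, which may differ from the check mrtool performs; the helper name is_numeric is hypothetical.

```python
import numpy as np


def is_numeric(array):
    """Return True if the array's dtype is a NumPy numeric type (hypothetical helper)."""
    return bool(np.issubdtype(np.asarray(array).dtype, np.number))


numeric = is_numeric(np.array([1.0, 2.0]))
text = is_numeric(np.array(["a", "b"]))
```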
-
is_uniform_prior(prior, size=None)
Check whether the variable satisfies the uniform prior format.
Parameters: prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the lower bound and the second group to the upper bound.
Keyword Arguments: size (int | None, optional) – Size of the variable that the prior relates to.
Returns: True if the condition is satisfied.
Return type: bool
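A format check for a uniform prior might verify the layout (lower bounds first, upper bounds second) and that each lower bound does not exceed its upper bound. The exact conditions mrtool enforces are not listed here, so the rules below, and the helper name looks_like_uniform_prior, are assumptions.

```python
import numpy as np


def looks_like_uniform_prior(prior, size=None):
    """Check the (lower, upper) layout of a uniform prior (hypothetical helper)."""
    prior = np.asarray(prior, dtype=float)
    # Expect two groups: lower bounds, then upper bounds.
    if prior.ndim not in (1, 2) or prior.shape[0] != 2:
        return False
    # For a two-dimensional prior, expect one column per variable.
    if size is not None and prior.ndim == 2 and prior.shape[1] != size:
        return False
    # Assumption: every lower bound must not exceed its upper bound.
    return bool(np.all(prior[0] <= prior[1]))


ok = looks_like_uniform_prior([0.0, 1.0])
reversed_bounds = looks_like_uniform_prior([1.0, 0.0])
```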
-
sample_knots(num_intervals, knot_bounds=None, interval_sizes=None, num_samples=1)
Sample knots given a set of rules.
Parameters: - num_intervals (int) – Number of intervals (number of knots minus 1).
- knot_bounds (Optional[ndarray]) – Bounds for the interior knots. The domain is assumed to span 0 to 1, so the bound for a knot should lie between 0 and 1, e.g. [0.1, 0.2]. knot_bounds should have one row per interior knot, and each row is the bound for the corresponding knot, e.g. knot_bounds=np.array([[0.0, 0.2], [0.3, 0.4], [0.3, 1.0]]) when there are three interior knots.
- interval_sizes (Optional[ndarray]) – Bounds for the distances between knots. For the same reason, the elements of interval_sizes should lie between 0 and 1. For example, interval_sizes=np.array([[0.1, 0.2], [0.1, 0.3], [0.1, 0.5], [0.1, 0.5]]) means that the distance between the first (0) and second knot has to be between 0.1 and 0.2, etc. The number of rows of interval_sizes must equal num_intervals.
- num_samples (int) – Number of knot samples.
Returns: Knot samples as an array, with num_samples rows and one column per knot.
Return type: np.ndarray
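How the two sets of rules interact can be seen in a small rejection-sampling sketch: draw each interior knot inside its bounds, then keep the draw only if every gap between neighboring knots is within its allowed size. Rejection sampling is a technique swapped in for illustration and may not be what mrtool does; the helper name sample_knots_sketch and its seed and max_tries parameters are hypothetical.

```python
import random


def sample_knots_sketch(num_intervals, knot_bounds, interval_sizes, seed=0, max_tries=10_000):
    """Draw one knot vector on [0, 1] by rejection sampling (hypothetical helper)."""
    assert len(knot_bounds) == num_intervals - 1   # one row per interior knot
    assert len(interval_sizes) == num_intervals    # one row per interval
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Draw each interior knot uniformly inside its own bounds.
        interior = [rng.uniform(lo, hi) for lo, hi in knot_bounds]
        knots = [0.0] + interior + [1.0]
        gaps = [b - a for a, b in zip(knots[:-1], knots[1:])]
        # Accept only if every gap lies within its allowed size range.
        if all(lo <= gap <= hi for gap, (lo, hi) in zip(gaps, interval_sizes)):
            return knots
    raise RuntimeError("no valid knots found; the rules may be infeasible")


knots = sample_knots_sketch(
    num_intervals=3,
    knot_bounds=[(0.2, 0.4), (0.6, 0.8)],
    interval_sizes=[(0.1, 0.5)] * 3,
)
```

Because the lower bound on every gap is positive, any accepted draw is automatically sorted in increasing order.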