mrtool.core package¶

data¶

Data module for the mrtool package.

class MRData(obs=<factory>, obs_se=<factory>, covs=<factory>, study_id=<factory>, data_id=<factory>)[source]¶

Bases: object

Data for simple linear mixed effects model.

get_covs(covs)[source]¶

Get covariate matrix.

Parameters:	covs (Union[List[str], str]) – List of covariate names or one covariate name.
Returns:	Covariate matrix, with one column per covariate.
Return type:	np.ndarray

get_study_data(studies)[source]¶

Get study specific data.

Parameters:	studies (Union[List[Any], Any]) – List of studies or one study.
Returns:	Data object containing the study-specific data.
Return type:	`MRData`

has_covs(covs)[source]¶

Check whether the data has the provided covariates.

Parameters:	covs (Union[List[str], str]) – List of covariate names or one covariate name.
Returns:	True if the data has the provided covariates.
Return type:	bool

has_studies(studies)[source]¶

Check whether the data has the provided study_id.

Parameters:	studies (Union[List[Any], Any]) – List of studies or one study.
Returns:	True if the data has the provided studies.
Return type:	bool

is_cov_normalized(covs=None)[source]¶

Return true when covariates are normalized.

Return type:	`bool`

is_empty()[source]¶

Return True when the object contains no data.

Return type:	`bool`

load_df(data, col_obs=None, col_obs_se=None, col_covs=None, col_study_id=None, col_data_id=None)[source]¶: Load data from data frame.

load_xr(data, var_obs=None, var_obs_se=None, var_covs=None, coord_study_id=None)[source]¶: Load data from xarray.

normalize_covs(covs=None)[source]¶: Normalize covariates by the largest absolute value for each covariate.
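
The normalization described for `normalize_covs` can be illustrated with plain NumPy. This is a minimal sketch rather than the package's implementation: the helper name `normalize_by_max_abs` and the handling of all-zero columns are assumptions made for the example.

```python
import numpy as np


def normalize_by_max_abs(cov_mat):
    """Scale each column by its largest absolute value (hypothetical helper)."""
    cov_mat = np.asarray(cov_mat, dtype=float)
    scale = np.max(np.abs(cov_mat), axis=0)
    # Assumption: leave all-zero columns unchanged to avoid dividing by zero.
    scale[scale == 0.0] = 1.0
    return cov_mat / scale


# Two covariates stored as columns, three data points.
covs = np.array([[1.0, -4.0], [2.0, 2.0], [-0.5, 1.0]])
normalized = normalize_by_max_abs(covs)
```

After this scaling every covariate lies in [-1, 1], and the largest absolute value in each column is 1.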

num_covs¶: Number of covariates.

num_obs¶: Number of observations.

num_points¶: Number of data points.

num_studies¶: Number of studies.

reset()[source]¶: Reset all the attributes to default values.

to_df()[source]¶

Convert data object to data frame.

Return type:	`DataFrame`

cov_model¶

Covariates model for mrtool.

class CovModel(alt_cov, name=None, ref_cov=None, use_re=False, use_re_mid_point=False, use_spline=False, use_spline_intercept=False, spline_knots_type='frequency', spline_knots=array([0., 0.33333333, 0.66666667, 1. ]), spline_degree=3, spline_l_linear=False, spline_r_linear=False, prior_spline_derval_gaussian=None, prior_spline_derval_gaussian_domain=(0.0, 1.0), prior_spline_derval_uniform=None, prior_spline_derval_uniform_domain=(0.0, 1.0), prior_spline_der2val_gaussian=None, prior_spline_der2val_gaussian_domain=(0.0, 1.0), prior_spline_der2val_uniform=None, prior_spline_der2val_uniform_domain=(0.0, 1.0), prior_spline_funval_gaussian=None, prior_spline_funval_gaussian_domain=(0.0, 1.0), prior_spline_funval_uniform=None, prior_spline_funval_uniform_domain=(0.0, 1.0), prior_spline_monotonicity=None, prior_spline_monotonicity_domain=(0.0, 1.0), prior_spline_convexity=None, prior_spline_convexity_domain=(0.0, 1.0), prior_spline_num_constraint_points=20, prior_spline_maxder_gaussian=None, prior_spline_maxder_uniform=None, prior_spline_normalization=None, prior_beta_gaussian=None, prior_beta_uniform=None, prior_beta_laplace=None, prior_gamma_gaussian=None, prior_gamma_uniform=None, prior_gamma_laplace=None)[source]¶

Bases: object

Covariates model.

attach_data(data)[source]¶: Attach data.

create_constraint_mat()[source]¶

Create constraint matrix.

Returns:	Linear constraints matrix and its uniform prior.
Return type:	tuple{numpy.ndarray, numpy.ndarray}

create_design_mat(data)[source]¶

Create design matrix.

Parameters:	data (mrtool.MRData) – The data frame used for storing the data
Returns:	Return the design matrix for linear cov or spline.
Return type:	tuple{numpy.ndarray, numpy.ndarray}

create_regularization_mat()[source]¶

Create regularization matrix.

Returns:	Linear regularization matrix and its Gaussian prior.
Return type:	tuple{numpy.ndarray, numpy.ndarray}

create_spline(data, spline_knots=None)[source]¶

Create spline given current spline parameters.

Parameters:	data (mrtool.MRData) – The data frame used for storing the data
	spline_knots (np.ndarray, optional) – Spline knots; if None, determined by frequency or domain.
Returns:	The spline object.
Return type:	xspline.XSpline

create_x_fun(data)[source]¶

create_z_mat(data)[source]¶

has_data()[source]¶: Return True if there is one data object attached.

num_constraints¶

num_regularizations¶

num_x_vars¶

num_z_vars¶

class LinearCovModel(*args, **kwargs)[source]¶

Bases: mrtool.core.cov_model.CovModel

Linear Covariates Model.

create_x_fun(data)[source]¶: Create design function for the fixed effects.

create_z_mat(data)[source]¶

Create design matrix for the random effects.

Parameters:	data (mrtool.MRData) – The data frame used for storing the data
Returns:	Design matrix for random effects.
Return type:	numpy.ndarray

class LogCovModel(*args, **kwargs)[source]¶

Bases: mrtool.core.cov_model.CovModel

Log Covariates Model.

create_constraint_mat(threshold=1e-06)[source]¶: Create constraint matrix. Overwrite the super class, adding non-negative constraints.

create_x_fun(data)[source]¶

Create design functions for the fixed effects.

Parameters:	data (mrtool.MRData) – The data frame used for storing the data
Returns:	Design functions for fixed effects.
Return type:	tuple{function, function}

create_z_mat(data)[source]¶

Create design matrix for the random effects.

Parameters:	data (mrtool.MRData) – The data frame used for storing the data
Returns:	Design matrix for random effects.
Return type:	numpy.ndarray

num_constraints¶

num_z_vars¶

model¶

Model module for mrtool package.

class LimeTr[source]¶: Bases: object

class MRBRT(data, cov_models, inlier_pct=1.0)[source]¶

Bases: object

MR-BRT Object

attach_data(data=None)[source]¶: Attach data to cov_model.

check_input()[source]¶: Check the input type of the attributes.

create_c_mat()[source]¶: Create the constraints matrices.

create_draws(data, beta_samples, gamma_samples, random_study=True, sort_by_data_id=False)[source]¶

Create draws for the given data set.

Parameters:	data (MRData) – MRData object containing the prediction data.
	beta_samples (np.ndarray) – Samples of beta.
	gamma_samples (np.ndarray) – Samples of gamma.
	random_study (bool, optional) – If True, the draws include uncertainty from study heterogeneity.
	sort_by_data_id (bool, optional) – If True, sort the final prediction in the order of the original data frame that was used to create the data. Defaults to False.
Returns:	Returns outcome sample matrix.
Return type:	np.ndarray

create_gprior()[source]¶: Create direct gaussian prior.

create_h_mat()[source]¶: Create the regularizer matrices.

create_lprior()[source]¶: Create direct laplace prior.

create_uprior()[source]¶: Create direct uniform prior.

create_x_fun(data=None)[source]¶: Create the fixed effects function, link with limetr.

create_z_mat(data=None)[source]¶: Create the random effects matrix, link with limetr.

extract_re(study_id)[source]¶

Extract the random effect for a given dataset.

Return type:	`ndarray`

fit_model(**fit_options)[source]¶

Fitting the model through limetr.

Parameters:

x0 (np.ndarray) – Initial guess for the optimization problem.
inner_print_level (int) – If non-zero printing iteration information of the inner problem.
inner_max_iter (int) – Maximum inner number of iterations.
inner_tol (float) – Tolerance of the inner problem.
outer_verbose (bool) – If True print out iteration information.
outer_max_iter (int) – Maximum outer number of iterations.
outer_step_size (float) – Step size of the outer problem.
outer_tol (float) – Tolerance of the outer problem.
normalize_trimming_grad (bool) – If True, normalize the gradient of the outer trimming problem.

get_cov_model(name)[source]¶

Choose covariate model with name.

Return type:	`CovModel`

get_cov_model_index(name)[source]¶

From cov_model name get the index.

Return type:	`int`

predict(data, predict_for_study=False, sort_by_data_id=False)[source]¶

Create new prediction with existing solution.

Parameters:	data (MRData) – MRData object containing the prediction data.
	predict_for_study (bool, optional) – If True, use the random effects information to predict for specific studies. If a study_id in data is not contained in the fitting data, the corresponding random effect is assumed to be 0.
	sort_by_data_id (bool, optional) – If True, sort the final prediction in the order of the original data frame that was used to create the data. Defaults to False.
Returns:	Predicted outcome array.
Return type:	np.ndarray

sample_soln(sample_size=1, sim_prior=True, sim_re=True, max_iter=100, print_level=0)[source]¶

Sample solutions.

Parameters:	sample_size (int, optional) – Number of samples.
	sim_prior (bool, optional) – If True, simulate priors.
	sim_re (bool, optional) – If True, simulate random effects.
	max_iter (int, optional) – Maximum number of iterations. Defaults to 100.
	print_level (int, optional) – Level of detail of the optimization information printed out during the sampling process. If 0, no information is printed.
Returns:	Return beta samples and gamma samples.
Return type:	Tuple[np.ndarray, np.ndarray]

summary()[source]¶

Return the summary data frame.

Return type:	`Tuple`[`DataFrame`, `DataFrame`]

class MRBeRT(data, ensemble_cov_model, ensemble_knots, cov_models=None, inlier_pct=1.0)[source]¶

Bases: object

Ensemble model of MRBRT.

create_draws(data, beta_samples, gamma_samples, random_study=True, sort_by_data_id=False)[source]¶

Create draws. For function description please check create_draws for MRBRT.

Return type:	`ndarray`

fit_model(x0=None, inner_print_level=0, inner_max_iter=20, inner_tol=1e-08, outer_verbose=False, outer_max_iter=100, outer_step_size=1.0, outer_tol=1e-06, normalize_trimming_grad=False, scores_weights=array([1., 1.]), slopes=array([ 2., 10.]), quantiles=array([0.4, 0.4]))[source]¶: Fitting the model through limetr.

predict(data, predict_for_study=False, sort_by_data_id=False, return_avg=True)[source]¶

Create new prediction with existing solution.

Parameters:	return_avg (bool) – When it is True, the function will return an average prediction based on the score, and when it is False the function will return a list of predictions from all groups.
Return type:	`ndarray`

sample_soln(sample_size=1, sim_prior=True, sim_re=True, max_iter=100, print_level=0)[source]¶

Sample solution.

Return type:	`Tuple`[`List`[`ndarray`], `List`[`ndarray`]]

score_model(scores_weights=array([1., 1.]), slopes=array([ 2., 10.]), quantiles=array([0.4, 0.4]))[source]¶: Score the models by their fit and variation.

summary()[source]¶

Create summary data frame.

Return type:	`Tuple`[`DataFrame`, `DataFrame`]

create_knots_samples(data, alt_cov_names=None, ref_cov_names=None, l_zero=True, num_splines=50, num_knots=5, width_pct=0.2, return_settings=False)[source]¶

Create knot samples for relative risk application.

Parameters:	data (MRData) – Data object.
	alt_cov_names (List[str], optional) – Names of the alternative exposures; if None, use [‘b_0’, ‘b_1’]. Defaults to None.
	ref_cov_names (List[str], optional) – Names of the reference exposures; if None, use [‘a_0’, ‘a_1’]. Defaults to None.
	l_zero (bool, optional) – If True, assume the exposure min is 0. Defaults to True.
	num_splines (int, optional) – Number of splines. Defaults to 50.
	num_knots (int, optional) – Number of the spline knots. Defaults to 5.
	width_pct (float, optional) – Minimum percentage distance between knots. Defaults to 0.2.
	return_settings (bool, optional) – Returns the knots setting if True. Defaults to False.
Returns:	Knots samples.
Return type:	np.ndarray

score_sub_models_datafit(mr)[source]¶: score the result of mrbert

score_sub_models_variation(mr, ensemble_cov_model_name, n=1)[source]¶

score the result of mrbert

Return type:	`float`

utils¶

Utils module of the mrtool package.

class Matrix[source]¶: Bases: object

class Polyhedron[source]¶

Bases: object

get_generators()[source]¶

class RepType[source]¶

Bases: object

INEQUALITY = None¶

avg_integral(mat, spline=None, use_spline_intercept=False)[source]¶

Compute average integral.

Parameters:	mat (numpy.ndarray) – Matrix that contains the starting and ending points of the integral, or a single column that represents the mid-points.
	spline (xspline.XSpline | None, optional) – Spline to integrate over; when None, treat the function as linear.
	use_spline_intercept (bool, optional) – If True, use all bases from the spline; otherwise remove the first basis.
Returns:	Design matrix when spline is not None, otherwise the mid-points.
Return type:	numpy.ndarray

col_diff_mat(n)[source]¶: column difference matrix
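
The description of `col_diff_mat` is terse. One plausible reading, stated here as an assumption rather than taken from the source, is a first-difference matrix that maps a vector to the differences between consecutive entries. The helper name `first_difference_matrix` is hypothetical.

```python
import numpy as np


def first_difference_matrix(n):
    """Return the (n - 1, n) matrix D with (D @ x)[i] == x[i + 1] - x[i]."""
    # Ones on the first superdiagonal minus ones on the main diagonal.
    return np.eye(n - 1, n, k=1) - np.eye(n - 1, n)


D = first_difference_matrix(4)
diffs = D @ np.array([1.0, 3.0, 6.0, 10.0])
```

A matrix of this kind is a common building block for constraints such as monotonicity, where consecutive differences are required to share a sign.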

combine_cols(cols)[source]¶

Combine column names into one list of names.

Parameters:	cols (list{str | list{str}}) – A list whose items are column names or lists of column names.
Returns:	Combined names of columns.
Return type:	list{str}
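
The flattening that `combine_cols` describes might look like the following sketch. The name `combine_cols_sketch` is hypothetical and the code is an illustration of the documented behavior, not the package's implementation.

```python
def combine_cols_sketch(cols):
    """Flatten a mix of column names and lists of column names into one list."""
    combined = []
    for item in cols:
        if isinstance(item, str):
            # A single column name is appended as-is.
            combined.append(item)
        else:
            # A list of column names is spliced in, preserving order.
            combined.extend(item)
    return combined


all_cols = combine_cols_sketch(["obs", ["cov1", "cov2"], "study_id"])
```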

empty_array()[source]¶

expand_array(array, shape, value, name)[source]¶

Expand array when it is empty.

Parameters:	array (np.ndarray) – Target array. If array is empty, fill in the `value`. When it is not empty, assert that the `shape` agrees and return the original array.
	shape (Tuple[int]) – The expected shape of the array.
	value (Any) – The expected value in the final array.
	name (str) – Variable name of the array (for error message).
Returns:	Expanded array.
Return type:	np.ndarray
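
The fill-or-check behavior described for `expand_array` can be sketched as follows. The function name `expand_array_sketch` and the wording of the error message are assumptions for illustration only.

```python
import numpy as np


def expand_array_sketch(array, shape, value, name):
    """Fill an empty array with `value`; otherwise check its shape."""
    array = np.asarray(array)
    if array.size == 0:
        # Empty input: build a new array of the expected shape.
        return np.full(shape, value)
    # Non-empty input: the shape has to agree with the expectation.
    assert array.shape == tuple(shape), (
        f"{name} must have shape {tuple(shape)}, got {array.shape}."
    )
    return array


filled = expand_array_sketch(np.array([]), (3,), 1.0, "obs_se")
kept = expand_array_sketch(np.array([0.1, 0.2, 0.3]), (3,), 1.0, "obs_se")
```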

get_cols(df, cols)[source]¶

Return the columns of the given data frame.

Parameters:	df (pandas.DataFrame) – Given data frame.
	cols (str | list{str} | None) – Given column name(s); if None, return an empty data frame.
Returns:	The data frame containing the columns.
Return type:	pandas.DataFrame | pandas.Series

input_cols(cols, append_to=None, default=None)[source]¶

Process the input column name.

Parameters:	cols (str | list{str} | None) – The input column name(s).
	append_to (list{str} | None, optional) – A list that keeps track of all the column names.
	default (str | list{str} | None, optional) – Default value when cols is None.
Returns:	The name of the column(s)
Return type:	str | list{str}

input_gaussian_prior(prior, size)[source]¶

Process the input Gaussian prior

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
	size (int, optional) – Size of the variable that the prior relates to.
Returns:	Prior after processing, with shape (2, size), where the first row stores the mean and the second row stores the standard deviation.
Return type:	numpy.ndarray
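
The (2, size) output format described for `input_gaussian_prior` can be illustrated with a small broadcasting sketch. The helper `broadcast_prior_sketch` is hypothetical, and it only covers the case where a prior is actually supplied; how the package treats a missing prior is not stated here.

```python
import numpy as np


def broadcast_prior_sketch(prior, size):
    """Return a (2, size) array: row 0 holds means, row 1 standard deviations."""
    prior = np.asarray(prior, dtype=float)
    if prior.ndim == 1:
        # One (mean, sd) pair shared by every entry of the variable.
        prior = np.repeat(prior[:, None], size, axis=1)
    assert prior.shape == (2, size), f"expected shape (2, {size})"
    return prior


shared = broadcast_prior_sketch([0.0, 1.0], size=3)
per_entry = broadcast_prior_sketch([[0.0, 0.5], [1.0, 2.0]], size=2)
```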

input_laplace_prior(prior, size)¶

Process the input Laplace prior, which uses the same format as the Gaussian prior.

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
	size (int, optional) – Size of the variable that the prior relates to.
Returns:	Prior after processing, with shape (2, size), where the first row stores the mean and the second row stores the standard deviation.
Return type:	numpy.ndarray

input_uniform_prior(prior, size)[source]¶

Process the input uniform prior.

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the lower bound and the second group to the upper bound.
	size (int, optional) – Size of the variable that the prior relates to.
Returns:	Prior after processing, with shape (2, size), where the first row stores the lower bound and the second row stores the upper bound.
Return type:	numpy.ndarray

is_cols(cols)[source]¶

Check whether the variable type falls into the column name category.

Parameters:	cols (str | list{str} | None) – Column names candidate.
Returns:	True if cols is either str, list{str} or None.
Return type:	bool
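
The type check that `is_cols` describes can be sketched in a few lines. `is_cols_sketch` is a hypothetical name, and the code only mirrors the documented rule (a string, a list of strings, or None).

```python
def is_cols_sketch(cols):
    """Return True if cols is None, a str, or a list containing only str."""
    if cols is None or isinstance(cols, str):
        return True
    return isinstance(cols, list) and all(isinstance(c, str) for c in cols)
```

For example, `is_cols_sketch(["cov1", "cov2"])` is accepted while `is_cols_sketch(["cov1", 2])` is rejected because one entry is not a string.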

is_gaussian_prior(prior, size=None)[source]¶

Check whether the variable satisfies the Gaussian prior format.

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
	size (int | None, optional) – Size of the variable that the prior relates to.
Returns:	True if the condition is satisfied.
Return type:	bool

is_laplace_prior(prior, size=None)¶

Check whether the variable satisfies the Laplace prior format, which is the same as the Gaussian prior format.

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the mean and the second group to the standard deviation.
	size (int | None, optional) – Size of the variable that the prior relates to.
Returns:	True if the condition is satisfied.
Return type:	bool

is_numeric_array(array)[source]¶

Check if an array is numeric.

Parameters:	array (np.ndarray) – Array to be checked.
Returns:	True if the array is numeric.
Return type:	bool

is_uniform_prior(prior, size=None)[source]¶

Check whether the variable satisfies the uniform prior format.

Parameters:	prior (numpy.ndarray) – Either a one- or two-dimensional array, with the first group referring to the lower bound and the second group to the upper bound.
	size (int | None, optional) – Size of the variable that the prior relates to.
Returns:	True if the condition is satisfied.
Return type:	bool
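
A format check along the lines described for `is_uniform_prior` might look like the sketch below. The exact conditions the package enforces are not spelled out in this page, so the rules used here (two rows, matching `size`, lower bound not above upper bound) are assumptions, and `is_uniform_prior_sketch` is a hypothetical name.

```python
import numpy as np


def is_uniform_prior_sketch(prior, size=None):
    """Check a (2,) or (2, size) array of lower and upper bounds."""
    prior = np.asarray(prior, dtype=float)
    # The first axis must hold exactly two groups: lower and upper bounds.
    if prior.ndim not in (1, 2) or prior.shape[0] != 2:
        return False
    # When a size is given, a two-dimensional prior must match it.
    if size is not None and prior.ndim == 2 and prior.shape[1] != size:
        return False
    # Assumption: every lower bound is at most its upper bound.
    return bool(np.all(prior[0] <= prior[1]))
```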

mat_to_fun(alt_mat, ref_mat=None)[source]¶

mat_to_log_fun(alt_mat, ref_mat=None, add_one=True)[source]¶

nonlinear_trans(score, slope=6.0, quantile=0.7)[source]¶

ravel_dict(x)[source]¶

Ravel dictionary.

Return type:	`dict`

sample_knots(num_intervals, knot_bounds=None, interval_sizes=None, num_samples=1)[source]¶

Sample knots given a set of rules.

Parameters:	num_intervals (`int`) – Number of intervals (number of knots minus 1).
	knot_bounds (`Optional`[`ndarray`]) – Bounds for the interior knots. The domain is assumed to span 0 to 1, so the bound for a knot should be between 0 and 1, e.g. `[0.1, 0.2]`. `knot_bounds` should have one row per interior knot, and each row is the bound for the corresponding knot, e.g. `knot_bounds=np.array([[0.0, 0.2], [0.3, 0.4], [0.3, 1.0]])` when there are three interior knots.
	interval_sizes (`Optional`[`ndarray`]) – Bounds for the distances between knots. For the same reason, elements of `interval_sizes` are assumed to be between 0 and 1. For example, `interval_sizes=np.array([[0.1, 0.2], [0.1, 0.3], [0.1, 0.5], [0.1, 0.5]])` means that the distance between the first (0) and second knot has to be between 0.1 and 0.2, etc. The number of rows of `interval_sizes` has to equal `num_intervals`.
	num_samples (`int`) – Number of knots samples.
Returns:	Knots samples as an array, with num_samples rows and one column per knot.
Return type:	np.ndarray
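
To make the `interval_sizes` rule concrete, the sketch below draws knots on [0, 1] whose consecutive distances respect per-interval bounds. It uses simple rejection sampling, which is an assumption chosen for clarity and not necessarily the algorithm behind `sample_knots`. The names `sample_knots_sketch`, `seed`, and `max_tries` are hypothetical.

```python
import numpy as np


def sample_knots_sketch(interval_sizes, num_samples=1, seed=0, max_tries=10_000):
    """Rejection-sample knots on [0, 1] with bounded interval sizes."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(interval_sizes, dtype=float)
    lower, upper = bounds[:, 0], bounds[:, 1]
    samples = []
    for _ in range(max_tries):
        if len(samples) == num_samples:
            break
        # Draw every interval except the last within its own bounds.
        head = rng.uniform(lower[:-1], upper[:-1])
        # The last interval is fixed by the requirement that sizes sum to 1.
        last = 1.0 - head.sum()
        if lower[-1] <= last <= upper[-1]:
            sizes = np.append(head, last)
            samples.append(np.concatenate([[0.0], np.cumsum(sizes)]))
    if len(samples) < num_samples:
        raise RuntimeError("could not satisfy the interval bounds")
    return np.vstack(samples)


interval_sizes = np.array([[0.1, 0.2], [0.1, 0.3], [0.1, 0.5], [0.1, 0.5]])
knots = sample_knots_sketch(interval_sizes, num_samples=5)
```

With four intervals each row holds five knots, starting at 0 and ending at 1.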

sample_simplex(n, N=1)[source]¶: sample from n dimensional simplex
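
One standard way to sample from a simplex is to draw from a flat Dirichlet distribution. The sketch below assumes that "n dimensional simplex" means n non-negative coordinates that sum to one and that `N` is the number of samples; both readings are assumptions, as is the helper name `sample_simplex_sketch`.

```python
import numpy as np


def sample_simplex_sketch(n, N=1, seed=None):
    """Draw N points uniformly from the simplex with n coordinates."""
    rng = np.random.default_rng(seed)
    # Dirichlet(1, ..., 1) is the uniform distribution on the simplex.
    return rng.dirichlet(np.ones(n), size=N)


points = sample_simplex_sketch(3, N=4, seed=0)
```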

sizes_to_indices(sizes)[source]¶

Convert sizes to the corresponding indices.

Parameters:	sizes (numpy.ndarray) – An array consisting of non-negative numbers.
Returns:	List of the indices.
Return type:	list{range}
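
The conversion from sizes to index ranges can be sketched with a running sum. This is an illustration of the documented input and output types, written with a hypothetical name `sizes_to_indices_sketch`, and it assumes the ranges are consecutive and start at zero.

```python
from itertools import accumulate


def sizes_to_indices_sketch(sizes):
    """Turn group sizes into consecutive index ranges starting at zero."""
    ends = list(accumulate(sizes))
    starts = [0] + ends[:-1]
    return [range(a, b) for a, b in zip(starts, ends)]


indices = sizes_to_indices_sketch([2, 0, 3])
```

Here the second group is empty, so its range contains no indices, and the third group picks up where the first one ended.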

to_list(obj)[source]¶

Convert an object to a list of objects.

Parameters:	obj (Any) – Object to be converted.
Returns:	If obj already is a list object, return obj itself, otherwise wrap obj with a list and return it.
Return type:	List[Any]
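
The wrapping rule stated for `to_list` fits in one line. The sketch below uses the hypothetical name `to_list_sketch` and simply mirrors the documented behavior.

```python
def to_list_sketch(obj):
    """Return obj itself if it is a list, otherwise wrap it in a list."""
    return obj if isinstance(obj, list) else [obj]


names = ["cov1", "cov2"]
same = to_list_sketch(names)
wrapped = to_list_sketch("cov1")
```

Note that a list input is returned as the same object rather than a copy, so mutating the result also mutates the input.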