
Machine learning based causal inference/uplift in Python

Installation

causeinfer can be installed via pip or directly from source:

    # Install the latest release from PyPI
    pip install causeinfer

    # Or install from source
    git clone https://github.com/andrewtavis/causeinfer.git
    cd causeinfer
    python setup.py install

The package can then be imported in Python:

    import causeinfer

standard_algorithms

The standard_algorithms module compiles causal inference modeling techniques for quick application.

Base Models

Base models for the following algorithms:

  • The Two Model Approach

  • The Interaction Term Approach

  • The Binary Class Transformation (BCT) Approach

  • The Quaternary Class Transformation (QCT) Approach

  • The Reflective Uplift Approach

  • The Pessimistic Uplift Approach

Note: these classes should not be used directly. Please use derived classes instead.

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Contents
BaseModel Class

fit, predict

TransformationModel Class (see annotation/methodology explanation)

is_treatment_positive, is_control_positive, is_control_negative, is_treatment_negative

class causeinfer.standard_algorithms.base_models.BaseModel[source]

Base class for the Two Model and Interaction Term Approaches.

fit(X, y, w)[source]
Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Designates the original treatment allocation across units.

Returns
selfobject
predict(X, w)[source]
Parameters
Xnumpy.ndarray(num_pred_units, num_pred_features)int, float

New data on which to make a prediction.

wnumpy.ndarray(num_pred_units, num_pred_features)int, float

Treatment allocation for predicted units.

Returns
y_prednumpy.ndarray(num_pred_units,)int, float

Vector of predicted unit responses.

class causeinfer.standard_algorithms.base_models.TransformationModel[source]

Base class for the Response Transformation Approaches.

Notes

The following is non-standard annotation to combine marketing and other methodologies.

Traditional marketing annotation is found in parentheses.

The response transformation approach splits the units based on response and treatment:

  • TP : Treatment Positives (Treatment Responders).

  • CP : Control Positives (Control Responders).

  • CN : Control Negatives (Control Nonresponders).

  • TN : Treatment Negatives (Treatment Nonresponders).

From these four known classes we want to derive the characteristic responses of four unknown classes:

  • AP : Affected Positives (Persuadables) : within TPs and CNs.

  • UP : Unaffected Positives (Sure Things) : within TPs and CPs.

  • UN : Unaffected Negatives (Lost Causes) : within CNs and TNs.

  • AN : Affected Negatives (Do Not Disturbs) : within CPs and TNs.

The focus then falls onto predicting APs and ANs via their known classes.
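
For orientation, the following sketch derives the four known classes from binary response and treatment vectors with plain NumPy; the per-unit checks that follow (is_treatment_positive and the related methods) encode the same logic. The data here are synthetic and purely illustrative.

    # Illustrative only: deriving the four known classes from binary
    # responses y and treatment assignments w.
    import numpy as np

    rng = np.random.default_rng(42)
    y = rng.integers(0, 2, size=10)  # unit responses
    w = rng.integers(0, 2, size=10)  # treatment allocations

    tp = (w == 1) & (y == 1)  # Treatment Positives (Treatment Responders)
    cp = (w == 0) & (y == 1)  # Control Positives (Control Responders)
    cn = (w == 0) & (y == 0)  # Control Negatives (Control Nonresponders)
    tn = (w == 1) & (y == 0)  # Treatment Negatives (Treatment Nonresponders)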

is_treatment_positive(y, w)[source]

Checks if a subject did respond when treated.

Parameters
yint, float

The target response.

wint, float

The treatment value.

Returns
is_treatment_positivebool
is_control_positive(y, w)[source]

Checks if a subject did respond when not treated.

Parameters
yint, float

The target response.

wint, float

The treatment value.

Returns
is_control_positivebool
is_control_negative(y, w)[source]

Checks if a subject didn’t respond when not treated.

Parameters
yint, float

The target response.

wint, float

The treatment value.

Returns
is_control_negativebool
is_treatment_negative(y, w)[source]

Checks if a subject didn’t respond when treated.

Parameters
yint, float

The target response.

wint, float

The treatment value.

Returns
is_treatment_negativebool

Two Model

The Two Model Approach (Double Model, Separate Model).

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Hansotia, B. and B. Rukstales (2002). “Incremental value modeling”. In: Journal of Interactive Marketing 16(3), pp. 35–46. URL: https://search.proquest.com/openview/1f86b52432f7d80e46101b2b4b7629c0/1?cbl=32002& pq-origsite=gscholar

Devriendt, F. et al. (2018). A Literature Survey and Experimental Evaluation of the State-of-the-Art in Uplift Modeling: A Stepping Stone Toward the Development of Prescriptive Analytics. Big Data, Vol. 6, No. 1, March 1, 2018, pp. 1-29. Codes found at: data-lab.be/downloads.php.

Contents
TwoModel Class

fit, predict, predict_proba

class causeinfer.standard_algorithms.two_model.TwoModel(control_model=None, treatment_model=None)[source]
fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
treatment_model, control_modelcauseinfer.standard_algorithms.TwoModel

Two trained models (one for the treatment group, one for the control group).

predict(X)[source]

Predicts a causal effect given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
predictionsnumpy.ndarray(num_units, 2)float

Predicted causal effects for all units given treatment model and control.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted probability to respond for all units given treatment and control models.
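
The following is a minimal sketch of the Two Model Approach on synthetic data. The scikit-learn base learners and the data are illustrative, and the column order of predict_proba (treatment probability first, control second) is an assumption rather than part of the documented API.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    from causeinfer.standard_algorithms.two_model import TwoModel

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))     # covariates
    y = rng.integers(0, 2, size=1000)  # binary responses
    w = rng.integers(0, 2, size=1000)  # treatment allocations

    tm = TwoModel(
        treatment_model=RandomForestClassifier(n_estimators=100),
        control_model=RandomForestClassifier(n_estimators=100),
    )
    tm.fit(X, y, w)

    # (num_units, 2) response probabilities given the treatment and control models.
    probas = tm.predict_proba(X)
    uplift = probas[:, 0] - probas[:, 1]  # assumed column order: [treatment, control]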

Interaction Term

The Interaction Term Approach (The True Lift Model, The Dummy Variable Approach).

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Lo, VSY. (2002). “The true lift model: a novel data mining approach to response modeling in database marketing”. In:SIGKDD Explor4 (2), 78–86. URL: https://dl.acm.org/citation.cfm?id=772872

Devriendt, F. et al. (2018). A Literature Survey and Experimental Evaluation of the State-of-the-Art in Uplift Modeling: A Stepping Stone Toward the Development of Prescriptive Analytics. Big Data, Vol. 6, No. 1, March 1, 2018, pp. 1-29. Codes found at: data-lab.be/downloads.php.

Contents
InteractionTerm Class

fit, predict, predict_proba

class causeinfer.standard_algorithms.interaction_term.InteractionTerm(model=None)[source]
fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
selfcauseinfer.standard_algorithms.InteractionTerm

A trained model.

predict(X)[source]

Predicts a causal effect given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
predictionsnumpy.ndarray(num_units, 2)float

Predicted causal effects for all units given a 1 and 0 interaction term.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted causal probabilities for all units given a 1 and 0 interaction term.
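
As with the Two Model sketch above, the following is an illustrative use of the Interaction Term Approach on synthetic data; the uplift calculation assumes the first probability column corresponds to the treated (1) interaction term.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    from causeinfer.standard_algorithms.interaction_term import InteractionTerm

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)
    w = rng.integers(0, 2, size=1000)

    it = InteractionTerm(model=LogisticRegression(max_iter=1000))
    it.fit(X, y, w)

    # (num_units, 2) probabilities given interaction terms of 1 and 0.
    probas = it.predict_proba(X)
    uplift = probas[:, 0] - probas[:, 1]  # assumed column order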

Binary Class Transformation

The Binary Class Transformation Approach (Influential Marketing, Response Transformation Approach).

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Lai, L.Y.-T. (2006). “Influential marketing: A new direct marketing strategy addressing the existence of voluntary buyers”. Master of Science thesis, Simon Fraser University School of Computing Science, Burnaby, BC,Canada. URL: https://summit.sfu.ca/item/6629

Shaar, A., Abdessalem, T., and Segard, O. (2016). “Pessimistic Uplift Modeling”. ACM SIGKDD, August 2016, San Francisco, California USA, arXiv:1603.09738v1. URL:https://pdfs.semanticscholar.org/a67e/401715014c7a9d6a6679df70175be01daf7c.pdf.

Devriendt, F. et al. (2018). A Literature Survey and Experimental Evaluation of the State-of-the-Art in Uplift Modeling: A Stepping Stone Toward the Development of Prescriptive Analytics. Big Data, Vol. 6, No. 1, March 1, 2018, pp. 1-29. Codes found at: data-lab.be/downloads.php.

Contents
BinaryTransformation Class

_binary_transformation, _binary_regularization, fit, predict (not available at this time), predict_proba

class causeinfer.standard_algorithms.binary_transformation.BinaryTransformation(model=None, regularize=False)[source]
_binary_transformation(y, w)[source]

Derives which of the unknown Affected Positive or Affected Negative classes the unit could fall into based on known outcomes.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
np.array(y_transformed)numpy.ndarray

An array of transformed unit classes.
_binary_regularization(y=None, w=None)[source]

Regularization of binary classes is based on the positive and negative binary affectual classes.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
fav_ratio, unfav_ratiofloat

Regularized ratios of favorable and unfavorable classes.

fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
selfcauseinfer.standard_algorithms.BinaryTransformation

A trained model.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted probabilities for being a favorable class and unfavorable class.
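
A minimal sketch of the Binary Class Transformation Approach follows; regularize=True turns on the class regularization described above, and treating the first probability column as the favorable class is an assumption.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    from causeinfer.standard_algorithms.binary_transformation import BinaryTransformation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)
    w = rng.integers(0, 2, size=1000)

    bt = BinaryTransformation(model=RandomForestClassifier(n_estimators=100), regularize=True)
    bt.fit(X, y, w)

    # (num_units, 2) probabilities of the favorable and unfavorable classes.
    probas = bt.predict_proba(X)
    uplift = probas[:, 0] - probas[:, 1]  # assumed column order: [favorable, unfavorable]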

Quaternary Class Transformation

The Quaternary Class Transformation Approach (Response Transformation Approach).

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Kane, K., Lo, VSY., and Zheng, J. (2014). “Mining for the truly responsive customers and prospects using truelift modeling: Comparison of new and existing methods”. In:Journal of Marketing Analytics 2(4), 218–238. URL: https://link.springer.com/article/10.1057/jma.2014.18

Devriendt, F. et al. (2018). A Literature Survey and Experimental Evaluation of the State-of-the-Art in Uplift Modeling: A Stepping Stone Toward the Development of Prescriptive Analytics. Big Data, Vol. 6, No. 1, March 1, 2018, pp. 1-29. Codes found at: data-lab.be/downloads.php.

Contents
QuaternaryTransformation Class

_quaternary_transformation, _quaternary_regularization, fit, predict (not available at this time), predict_proba

class causeinfer.standard_algorithms.quaternary_transformation.QuaternaryTransformation(model=None, regularize=False)[source]
_quaternary_transformation(y, w)[source]

Assigns known quaternary (TP, CP, CN, TN) classes to units.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
np.array(y_transformed)np.array

An array of transformed unit classes.

_quaternary_regularization(y=None, w=None)[source]

Regularization of quaternary classes is based on their treatment assignment.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
control_count, treatment_countint

Regularized amounts of control and treatment classes.

fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
selfcauseinfer.standard_algorithms.QuaternaryTransformation

A trained model.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted probabilities for being a favorable class and an unfavorable class.
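
The Quaternary Class Transformation is used in the same way; the sketch below is illustrative and mirrors the binary transformation example above.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    from causeinfer.standard_algorithms.quaternary_transformation import QuaternaryTransformation

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)
    w = rng.integers(0, 2, size=1000)

    qt = QuaternaryTransformation(model=RandomForestClassifier(n_estimators=100), regularize=True)
    qt.fit(X, y, w)
    probas = qt.predict_proba(X)  # (num_units, 2): favorable and unfavorable classes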

Reflective Uplift Transformation

The Reflective Uplift Transformation Approach.

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Shaar, A., Abdessalem, T., and Segard, O. (2016). “Pessimistic Uplift Modeling”. ACM SIGKDD, August 2016, San Francisco, California USA, arXiv:1603.09738v1. URL:https://pdfs.semanticscholar.org/a67e/401715014c7a9d6a6679df70175be01daf7c.pdf.

Contents
ReflectiveUplift Class

fit, predict (not available at this time), predict_proba, _reflective_transformation, _reflective_weights

class causeinfer.standard_algorithms.reflective.ReflectiveUplift(model=None)[source]
fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
selfcauseinfer.standard_algorithms.ReflectiveUplift

A trained model.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted probabilities for being a favorable class and an unfavorable class.

_reflective_transformation(y, w)[source]

Assigns known quaternary (TP, CP, CN, TN) classes to units.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
np.array(y_transformed)np.array

An array of transformed unit classes.

_reflective_weights(y, w)[source]

Derives weights to normalize binary transformation noise.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
p_tp_fav, p_cp_fav, p_cn_unfav, p_tn_unfavnp.array

Probabilities of being a quaternary class per binary class.

Pessimistic Uplift Transformation

The Pessimistic Uplift Transformation Approach.

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Shaar, A., Abdessalem, T., and Segard, O. (2016). “Pessimistic Uplift Modeling”. ACM SIGKDD, August 2016, San Francisco, California USA, arXiv:1603.09738v1. URL:https://pdfs.semanticscholar.org/a67e/401715014c7a9d6a6679df70175be01daf7c.pdf.

Contents
PessimisticUplift Class

fit, predict (not available at this time), predict_proba

class causeinfer.standard_algorithms.pessimistic.PessimisticUplift(model=None)[source]
fit(X, y, w)[source]

Trains a model given covariates, responses and assignments.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

Matrix of covariates.

ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
selfcauseinfer.standard_algorithms.PessimisticUplift

A trained model.

predict_proba(X)[source]

Predicts the probability that a subject will be a given class given covariates.

Parameters
Xnumpy.ndarray(num_units, num_features)int, float

New data on which to make predictions.

Returns
probasnumpy.ndarray(num_units, 2)float

Predicted probabilities for being a favorable class and an unfavorable class.
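
The Reflective and Pessimistic Uplift Transformations share the same interface; a hedged sketch on synthetic data follows, with the favorable/unfavorable column order again assumed.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    from causeinfer.standard_algorithms.reflective import ReflectiveUplift
    from causeinfer.standard_algorithms.pessimistic import PessimisticUplift

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)
    w = rng.integers(0, 2, size=1000)

    for uplift_model in [ReflectiveUplift(model=RandomForestClassifier(n_estimators=100)),
                         PessimisticUplift(model=RandomForestClassifier(n_estimators=100))]:
        uplift_model.fit(X, y, w)
        probas = uplift_model.predict_proba(X)  # (num_units, 2)
        uplift = probas[:, 0] - probas[:, 1]    # assumed: favorable minus unfavorable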

evaluation

The evaluation module provides methods for accuracy measurement and presentation.

Metrics

causeinfer metrics provide statistical measures of model performance.

Functions

causeinfer.evaluation.get_cum_effect(df, models=None, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None)[source]

Gets average causal effects of model estimates in cumulative population.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizeboolnot implemented (default=False)

For consistency with gain and qini.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

Returns
effectspandas.DataFrame

Average causal effects of model estimates in cumulative population.

causeinfer.evaluation.get_cum_gain(df, models=None, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None)[source]

Gets cumulative gains of model estimates in population.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

Returns
gainspandas.DataFrame

Cumulative gains of model estimates in population.

causeinfer.evaluation.get_qini(df, models=None, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None)[source]

Gets Qini of model estimates in population.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

Returns
qinispandas.DataFrame

Qini of model estimates in population.

causeinfer.evaluation.auuc_score(df, models=None, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None)[source]

Calculates the AUUC score (Gini): the Area Under the Uplift Curve.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, for inheritance (default=None)

Random seed for numpy.random.rand().

Returns
AUUC scorefloat
causeinfer.evaluation.qini_score(df, models=None, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None)[source]

Calculates the Qini score: the area between the Qini curve of a model and random assignment.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, for inheritance (default=None)

Random seed for numpy.random.rand().

Returns
Qini scorefloat
causeinfer.evaluation.signal_to_noise(y, w)[source]

Computes the signal to noise ratio of a dataset to derive the potential for causal inference efficacy.

Parameters
ynumpy.ndarray(num_units,)int, float

Vector of unit responses.

wnumpy.ndarray(num_units,)int, float

Vector of original treatment allocations across units.

Returns
sn_ratiofloat

Notes

  • The signal to noise ratio is the difference between the treatment and control responses relative to the control response.

  • Values close to 0 imply that CI would have little benefit over predictive modeling.
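
A hedged sketch of the metrics functions follows. The model estimate columns here are random placeholders standing in for real uplift predictions, and it is assumed that y, w, and the estimate columns are sufficient when no true treatment effect column is available (with simulated data a true effect can be passed via treatment_effect_col).

    import numpy as np
    import pandas as pd

    from causeinfer.evaluation import auuc_score, qini_score, signal_to_noise

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "y": rng.integers(0, 2, size=1000),  # actual outcomes
        "w": rng.integers(0, 2, size=1000),  # treatment indicators
        "two_model": rng.normal(size=1000),  # placeholder model estimates
        "interaction_term": rng.normal(size=1000),
    })

    model_cols = ["two_model", "interaction_term"]
    auuc = auuc_score(df, models=model_cols, outcome_col="y", treatment_col="w")
    qini = qini_score(df, models=model_cols, outcome_col="y", treatment_col="w")
    sn_ratio = signal_to_noise(y=df["y"].values, w=df["w"].values)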

Plots

causeinfer plots provide graphical representations of model performance.

Functions

causeinfer.evaluation.plot_eval(df, kind=None, n=100, percent_of_pop=False, normalize=False, figsize=(15, 5), fontsize=20, axis=None, legend_metrics=None, *args, **kwargs)[source]

Plots one of the effect/gain/qini charts of model estimates.

Parameters
dfpandas.DataFrame

A data frame with model estimates and unit outcomes as columns.

kindstroptional (default=’gain’)

The kind of plot to draw: ‘effect,’ ‘gain,’ and ‘qini’ are supported.

nint, optional (default=100)

The number of samples to be used for plotting.

percent_of_popbooloptional (default=False)

Whether the X-axis is displayed as a percent of the whole population.

normalizeboolfor inheritance (default=False)

Passes this argument to interior functions directly.

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

legend_metricsbooloptional (default=True)

Calculate AUUC or Qini metrics to add to the plot legend for gain and qini respectively.

causeinfer.evaluation.plot_cum_effect(df, n=100, models=None, percent_of_pop=False, outcome_col='y', treatment_col='w', treatment_effect_col='tau', random_seed=None, figsize=None, fontsize=20, axis=None, legend_metrics=None)[source]

Plots the causal effect chart of model estimates in cumulative population.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

kindeffect

The kind of plot to draw.

nint, optional (default=100)

The number of samples to be used for plotting.

modelslist

A list of models corresponding to estimated treatment effect columns.

percent_of_popbooloptional (default=False)

Whether the X-axis is displayed as a percent of the whole population.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

legend_metricsbooloptional (default=False)

Not supported for plot_cum_effect - the user will be notified.

Returns
A plot of the cumulative effects of all models in df.
causeinfer.evaluation.plot_cum_gain(df, n=100, models=None, percent_of_pop=False, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None, figsize=None, fontsize=20, axis=None, legend_metrics=True)[source]

Plots the cumulative gain chart (or uplift curve) of model estimates.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

kindgain

The kind of plot to draw.

nint, optional (default=100)

The number of samples to be used for plotting.

modelslist

A list of models corresponding to estimated treatment effect columns.

percent_of_popbooloptional (default=False)

Whether the X-axis is displayed as a percent of the whole population.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

legend_metricsbooloptional (default=True)

Calculates AUUC metrics to add to the plot legend.

Returns
A plot of the cumulative gains of all models in df.
causeinfer.evaluation.plot_qini(df, n=100, models=None, percent_of_pop=False, outcome_col='y', treatment_col='w', treatment_effect_col='tau', normalize=False, random_seed=None, figsize=None, fontsize=20, axis=None, legend_metrics=True)[source]

Plots the Qini chart (or uplift curve) of model estimates.

Parameters
dfpandas.DataFrame

A data frame with model estimates and actual data as columns.

kindqini

The kind of plot to draw.

nint, optional (default=100)

The number of samples to be used for plotting.

modelslist

A list of models corresponding to estimated treatment effect columns.

percent_of_popbooloptional (default=False)

Whether the X-axis is displayed as a percent of the whole population.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

treatment_effect_colstroptional (default=tau)

The column name for the true treatment effect.

normalizebooloptional (default=False)

Whether to normalize the y-axis to 1 or not.

random_seedint, optional (default=None)

Random seed for numpy.random.rand().

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

legend_metricsbooloptional (default=True)

Calculates Qini metrics to add to the plot legend.

Returns
A plot of the qini curves of all models in df.
causeinfer.evaluation.plot_batch_metrics(df, kind=None, n=10, models=None, outcome_col='y', treatment_col='w', normalize=False, figsize=(15, 5), fontsize=20, axis=None, *args, **kwargs)[source]

Plots the batch chart: the cumulative batch metrics predicted by a model given ranked treatment effects.

Parameters
dfpandas.DataFrame

A data frame with model estimates and unit outcomes as columns.

kindstroptional (default=’gain’)

The kind of plot to draw: ‘effect,’ ‘gain,’ ‘qini,’ and ‘response’ are supported.

nint, optional (default=10 for deciles; 5 for quintiles is also standard)

The number of batches to split the units into.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

Returns
A plot of batch metrics of all models in df.
causeinfer.evaluation.plot_batch_responses(df, n=10, models=None, outcome_col='y', treatment_col='w', normalize=False, figsize=(15, 5), fontsize=20, axis=None)[source]

Plots the batch response chart: the cumulative batch responses predicted by a model given ranked treatment effects.

Parameters
dfpandas.DataFrame

A data frame with model estimates and unit outcomes as columns.

kindresponse

The kind of plot to draw.

nint, optional (default=10 for deciles; 5 for quintiles is also standard)

The number of batches to split the units into.

modelslist

A list of models corresponding to estimated treatment effect columns.

outcome_colstroptional (default=y)

The column name for the actual outcome.

treatment_colstroptional (default=w)

The column name for the treatment indicator (0 or 1).

figsizetupleoptional

Allows for quick changes of figures sizes.

fontsizeint or floatoptional (default=20)

The font size of the plots, with all labels scaled accordingly.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

Returns
A plot of batch responses of all models in df.
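
A hedged plotting sketch follows, using the same style of DataFrame as in the Metrics example above; the column names and model estimates are again placeholders.

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from causeinfer.evaluation import plot_cum_gain

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "y": rng.integers(0, 2, size=1000),
        "w": rng.integers(0, 2, size=1000),
        "two_model": rng.normal(size=1000),
        "interaction_term": rng.normal(size=1000),
    })

    fig, ax = plt.subplots(figsize=(15, 5))
    plot_cum_gain(
        df,
        n=100,
        models=["two_model", "interaction_term"],
        percent_of_pop=True,
        outcome_col="y",
        treatment_col="w",
        normalize=True,
        axis=ax,
        legend_metrics=True,
    )
    plt.show()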

Iteration

Iteration methods allow a researcher or practitioner to derive average model accuracy.

Functions

causeinfer.evaluation.iterate_model(model, X_train, y_train, w_train, X_test, y_test, w_test, tau_test=None, n=10, pred_type='predict', eval_type=None, normalize_eval=False, verbose=True)[source]

Trains and makes predictions with a model multiple times to derive average predictions and their variance.

Parameters
modelobject

A model over which iterations will be done.

X_trainnumpy.ndarray(num_train_units, num_features)int, float

Matrix of covariates.

y_trainnumpy.ndarray(num_train_units,)int, float

Vector of unit responses.

w_trainnumpy.ndarray(num_train_units,)int, float

Vector of original treatment allocations across units.

X_testnumpy.ndarray(num_test_units, num_features)int, float

A matrix of covariates.

y_testnumpy.ndarray(num_test_units,)int, float

A vector of unit responses.

w_testnumpy.ndarray(num_test_units,)int, float

A vector of original treatment allocations across units.

tau_testnumpy.ndarray(num_test_units,)int, float

A vector of the actual treatment effects given simulated data.

nint (default=10)

The number of train and prediction iterations to run.

pred_typestr (default=predict)

predict or predict_proba: the type of prediction the iterations will make.

eval_typestr (default=None)

qini or auuc: the type of evaluation to be done on the predictions.

Note: if None, model predictions will be averaged without their variance being calculated.

normalize_evalbooloptional (default=False)

Whether to normalize the evaluation metric.

verbosebool (default=True)

Whether to show a tqdm progress bar for the query.

Returns
avg_preds_probasnumpy.ndarray (num_units, 2)float

Averaged per unit predictions.

all_preds_probasdict

A dictionary of all predictions produced during iterations.

avg_evalfloat

The average of the iterated model evaluations.

eval_variancefloat

The variance of all prediction evaluations.

all_evalsdict

A dictionary of all evaluations produced during iterations.
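
A hedged sketch of iterating a model follows. Because the exact shape of the returned values depends on pred_type and eval_type, the result is kept in a single variable rather than unpacked; the data and base learners are synthetic placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    from causeinfer.evaluation import iterate_model
    from causeinfer.standard_algorithms.two_model import TwoModel
    from causeinfer.utils import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)
    w = rng.integers(0, 2, size=1000)

    X_train, X_test, y_train, y_test, w_train, w_test = train_test_split(
        X, y, w, percent_train=0.7, random_state=42, maintain_proportions=True
    )

    tm = TwoModel(
        treatment_model=RandomForestClassifier(n_estimators=100),
        control_model=RandomForestClassifier(n_estimators=100),
    )

    results = iterate_model(
        model=tm,
        X_train=X_train, y_train=y_train, w_train=w_train,
        X_test=X_test, y_test=y_test, w_test=w_test,
        n=10,
        pred_type="predict_proba",
        eval_type="qini",
        verbose=True,
    )
    # results contains the averaged predictions, all per-iteration
    # predictions, and the evaluation statistics listed above.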

causeinfer.evaluation.eval_table(eval_dict, variances=False, annotate_vars=False)[source]

Displays the evaluation of models given a dictionary of their evaluations over datasets.

Parameters
eval_dictdict

A dictionary of model evaluations over datasets.

variancesbool (default=False)

Whether to annotate the evaluations with their variances.

annotate_varsbool (default=False)

Whether to annotate the evaluation variances with stars given their sds.

Returns
eval_tablepandas.DataFrame(num_datasets, num_models)

A dataframe of dataset to model evaluation comparisons.

data

The data module provides example datasets from business, medical, and socio-economic fields as benchmarks for causal inference (CI) techniques.

Hillstrom Email Marketing

An email marketing dataset from Kevin Hillstrom’s MineThatData blog.

See an example using this data at causeinfer/examples/business_hillstrom.

Description found at:

https://blog.minethatdata.com/2008/03/minethatdata-e-mail-analytics-and-data.html

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

K. Hillstrom. “The MineThatData E-Mail Analytics And Data Mining Challenge”. 2008. URL: https://blog.minethatdata.com/2008/03/minethatdata-e-mail-analytics-and-data.html.

Contents

download_hillstrom, _format_data, load_hillstrom

causeinfer.data.hillstrom.download_hillstrom(data_path=None, url='http://www.minethatdata.com/Kevin_Hillstrom_MineThatData_E-MailAnalytics_DataMiningChallenge_2008.03.20.csv')[source]

Downloads the dataset from Kevin Hillstrom’s blog.

Parameters
data_pathstroptional (default=None)

A user specified path for where the data should go.

urlstr

The url from which the data is to be downloaded.

Returns
The data ‘hillstrom.csv’ in a ‘datasets’ folder, unless otherwise specified.
causeinfer.data.hillstrom._format_data(df, format_covariates=True, normalize=True)[source]

Formats the data upon loading for consistent data preparation.

Parameters
dfpd.DataFrame

The original unformatted version of the data.

format_covariatesbooloptional (default=True), controlled in load_hillstrom
  • True: creates dummy columns and encodes the data.

  • False: only steps for data readability will be taken.

normalizebooloptional (default=True), controlled in load_hillstrom

Normalize dataset columns to prepare them for ML methods.

Returns
dfpd.DataFrame

A formatted version of the data.

causeinfer.data.hillstrom.load_hillstrom(file_path=None, format_covariates=True, download_if_missing=True, normalize=True)[source]

Loads the Hillstrom dataset with formatting if desired.

Parameters
file_pathstroptional (default=None)

Specify another path for the dataset.

By default the dataset should be stored in the ‘datasets’ folder in the cwd.

format_covariatesbooloptional (default=True)

Indicates whether raw data should be loaded without covariate manipulation.

download_if_missingbooloptional (default=True)

Download the dataset if it is not downloaded before using ‘download_hillstrom’.

normalizebooloptional (default=True)

Normalize dataset columns to prepare them for ML methods.

Returns
datadict object with the following attributes:
data.descriptionstr

A description of the Hillstrom email marketing dataset.

data.dataset_fullnumpy.ndarray(64000, 12) or formatted (64000, 22)

The full dataset with features, treatment, and target variables.

data.dataset_full_nameslist, size 12 or formatted 22

List of dataset variable names.

data.featuresnumpy.ndarray(64000, 8) or formatted (64000, 18)

Each row corresponding to the 8 feature values in order.

data.feature_nameslist, size 8 or formatted 18

List of feature names.

data.treatmentnumpy.ndarray(64000,)

Each value corresponds to the treatment.

data.response_spendnumpy.ndarray(64000,)

Each value corresponds to how much customers spent during the two-week outcome period.

data.response_visitnumpy.ndarray(64000,)

Each value corresponds to whether people visited the site during the two-week outcome period.

data.response_conversionnumpy.ndarray(64000,)

Each value corresponds to whether they purchased at the site (i.e. converted) during the two-week outcome period.
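
A hedged loading sketch follows. The loader is documented above as returning a dict object with the listed fields, so key-style access is assumed here; the 'visit' response is used as a binary outcome.

    from causeinfer.data import hillstrom
    from causeinfer.utils import train_test_split

    hillstrom.download_hillstrom()  # saves 'hillstrom.csv' to a 'datasets' folder
    data = hillstrom.load_hillstrom(format_covariates=True, normalize=True)

    X = data["features"]
    y = data["response_visit"]  # whether the site was visited in the two-week window
    w = data["treatment"]

    X_train, X_test, y_train, y_test, w_train, w_test = train_test_split(
        X, y, w, percent_train=0.7, random_state=42, maintain_proportions=True
    )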

Mayo Clinic PBC

A dataset on medical trials to combat primary biliary cholangitis (PBC, formerly cirrhosis) of the liver from the Mayo Clinic.

See an example using this data at causeinfer/examples/medical_mayo_pbc.

Description found at:

https://www.mayo.edu/research/documents/pbchtml/DOC-10027635

Based on

Mayo Clinic. “Primary Biliary Cirrhosis”. 1991. URL: https://www.mayo.edu/research/documents/pbchtml/DOC-10027635.

Contents

download_mayo_pbc, _format_data, load_mayo_pbc

causeinfer.data.mayo_pbc.download_mayo_pbc(data_path=None, url='http://www.mayo.edu/research/documents/pbcdat/DOC-10026921')[source]

Downloads the dataset from the Mayo Clinic’s research documents.

Parameters
data_pathstroptional (default=None)

A user specified path for where the data should go.

urlstr

The url from which the data is to be downloaded.

Returns
The text file ‘mayo_pbc’ in a ‘datasets’ folder, unless otherwise specified.
causeinfer.data.mayo_pbc._format_data(dataset_path, format_covariates=True, normalize=True)[source]

Formats the data upon loading for consistent data preparation.

Parameters
dataset_pathstr

The original file is a text file with inconsistent spacing and periods for NaNs.

Furthermore, the loading process only includes those units that took part in the randomized trial, as there are 106 cases that were monitored but not included in the trial.

format_covariatesbooloptional (default=True)
  • True: creates dummy columns and encodes the data.

  • False: only steps for data readability will be taken.

normalizebooloptional (default=True)

Normalization step controlled in load_mayo_pbc.

Returns
dfpd.DataFrame

A formatted version of the data.

causeinfer.data.mayo_pbc.load_mayo_pbc(file_path=None, format_covariates=True, download_if_missing=True, normalize=True)[source]

Loads the Mayo PBC dataset with formatting if desired.

Parameters
file_pathstroptional (default=None)

Specify another path for the dataset.

By default the dataset should be stored in the ‘datasets’ folder in the cwd.

format_covariatesbooloptional (default=True)

Indicates whether raw data should be loaded without covariate manipulation.

download_if_missingbooloptional (default=True)

Download the dataset if it is not downloaded before using ‘download_mayo_pbc’.

normalizebooloptional (default=True)

Normalize the dataset to prepare it for ML methods.

Returns
datadict object with the following attributes:
data.descriptionstr

A description of the Mayo Clinic PBC dataset.

data.dataset_fullnumpy.ndarray(312, 19) or formatted (312, 24)

The full dataset with features, treatment, and target variables.

data.dataset_full_nameslist, size 19 or formatted 24

List of dataset variable names.

data.featuresnumpy.ndarray(312, 17) or formatted (312, 22)

Each row corresponding to the 17 feature values in order.

data.feature_nameslist, size 17 or formatted 22

List of feature names.

data.treatmentnumpy.ndarray(312,)

Each value corresponds to the treatment (1 = treat, 0 = control).

data.responsenumpy.ndarray(312,)

Each value corresponds to one of the outcomes (0 = alive, 1 = liver transplant, 2 = dead).

CMF Microfinance

A dataset on microfinance from The Centre for Micro Finance (CMF) at the Institute for Financial Management Research (Chennai, India).

See an example using this data at causeinfer/examples/socioeconomic_cmf_micro.

Description found at:

https://www.aeaweb.org/articles?id=10.1257/app.20130533 (see paper)

Based on

A. Banerjee et al. “The Miracle of Microfinance? Evidence from a Randomized Evaluation”. In: American Economic Journal: Applied Economics 7 (1 2015), pp. 22–53. URL: https://www.aeaweb.org/articles?id=10.1257/app.20130533.

Contents

download_cmf_micro (deprecated), _format_data, load_cmf_micro

causeinfer.data.cmf_micro._format_data(dataset_path, format_covariates=True, normalize=True)[source]

Formats the data upon loading for consistent data preparation.

Source: https://github.com/thmstang/apa19-microfinance/blob/master/helpers.r (R-version)

Parameters
dataset_pathstr

The original file is a folder that has various .dta sets.

format_covariatesbooloptional (default=True)
  • True: creates dummy columns and encodes the data.

  • False: only steps for data readability will be taken.

normalizebooloptional (default=True)

Normalization step controlled in load_cmf_micro.

Returns
dfpd.DataFrame

A formatted version of the data.

causeinfer.data.cmf_micro.load_cmf_micro(file_path=None, format_covariates=True, normalize=True)[source]

Loads the CMF micro dataset with formatting if desired.

Parameters
file_pathstroptional (default=None)

Specify another path for the dataset.

By default the dataset should be stored in the ‘datasets’ folder in the cwd.

format_covariatesbooloptional (default=True)

Indicates whether raw data should be loaded without covariate manipulation.

download_if_missingbooloptional (default=True) (Deprecated)

Download the dataset if it is not downloaded before using ‘download_cmf_micro’.

normalizebooloptional (default=True)

Normalize the dataset to prepare it for ML methods.

Returns
datadict object with the following attributes:
data.descriptionstr

A description of the CMF microfinance data.

data.dataset_fullnumpy.ndarray(5328, 183) or formatted (5328, 60)

The full dataset with features, treatment, and target variables.

data.dataset_full_nameslist, size 61

List of dataset variable names.

data.featuresnumpy.ndarray(5328, 186) or formatted (5328, 57)

Each row corresponding to the 58 feature values in order (note that the other target can also be used as a feature).

data.feature_nameslist, size 58

List of feature names.

data.treatmentnumpy.ndarray(5328,)

Each value corresponds to the treatment (1 = treat, 0 = control).

data.response_biz_indexnumpy.ndarray(5328,)

Each value corresponds to the business index of each of the participants.

data.response_women_empnumpy.ndarray(5328,)

Each value corresponds to the women’s empowerment index of each of the participants.

Download Utilities

Utility functions for downloading data.

Based on

Kuchumov, A. pyuplift: Lightweight uplift modeling framework for Python. (2019). URL: https://github.com/duketemon/pyuplift. License: https://github.com/duketemon/pyuplift/blob/master/LICENSE.

Contents

download_file, get_download_paths

causeinfer.data.download_utils.download_file(url: str, output_path: str, zip_file=False)[source]

Downloads a file from a url to a specified path.

Parameters
urlstr

The URL from which the file can be downloaded.

output_pathstr

A user specified path, which defaults to a ‘files’ folder in the cwd.

causeinfer.data.download_utils.get_download_paths(file_path, file_directory='files', file_name='file')[source]

Derives paths for a file folder and a file.

Parameters
file_pathstr

A user specified path where the file should go.

file_directorystr (default=files)

A user specified directory.

file_namestr (default=file)

The name to call the file.

utils

The utils module provides helper functions for causal inference model testing and deployment.

Functions

causeinfer.utils.train_test_split(X, y, w, percent_train=0.7, random_state=None, maintain_proportions=False)[source]

Split unit X covariates and (y,w) outcome tuples into training and testing sets.

Parameters
Xnumpy.ndarray(n_samples, n_features)

Matrix of unit covariate features.

ynumpy.ndarray(n_samples,)

Array of unit responses.

wnumpy.ndarray(n_samples,)

Array of unit treatments.

percent_trainfloat

The percent of the covariates and outcomes to delegate to model training.

random_stateint (default=None)

A seed for the random number generator for consistency.

maintain_proportionsbooloptional (default=False)

Whether to maintain the treatment group proportions within the split samples.

Returns
X_train, X_test, y_train, y_test, w_train, w_testnumpy.ndarray

Arrays of split covariates and outcomes.

causeinfer.utils.plot_unit_distributions(df, variable, treatment=None, bins=None, axis=None)[source]

Plots seaborn countplots of unit covariate and outcome distributions.

Parameters
dfpandas.DataFrame(n_samples, n_features)

The data from which the plot is made.

variablestr

A unit covariate or outcome for which the plot is desired.

treatmentstroptional (default=None)

The treatment variable for comparing across segments.

binsint (default=None)

Bins the column values such that larger distributions can be plotted.

axisstroptional (default=None)

Adds an axis to the plot so they can be combined.

Returns
axmatplotlib.axes

Displays a seaborn plot of unit distributions across the given covariate or outcome value.
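
A small hedged sketch of plotting a covariate distribution split by treatment follows; the DataFrame and its column names are illustrative.

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    from causeinfer.utils import plot_unit_distributions

    rng = np.random.default_rng(0)
    df_units = pd.DataFrame({
        "visits": rng.integers(0, 5, size=500),     # illustrative covariate
        "treatment": rng.integers(0, 2, size=500),  # treatment indicator
    })

    fig, ax = plt.subplots(figsize=(10, 5))
    plot_unit_distributions(df_units, variable="visits", treatment="treatment", axis=ax)
    plt.show()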

causeinfer.utils.over_sample(X_1, y_1, w_1, sample_2_size, shuffle=True, random_state=None)[source]

Over-samples to provide equality between a given sample and another it is smaller than.

Parameters
X_1numpy.ndarray(num_sample1_units, num_sample1_features)

Dataframe of sample covariates.

y_1numpy.ndarray(num_sample1_units,)

Vector of sample unit responses.

w_1numpy.ndarray(num_sample1_units,)

Designates the original treatment allocation across sample units.

sample_2_sizeint

The size of the other sample to match.

shufflebooloptional (default=True)

Whether to shuffle the new sample after it’s created.

random_stateint (default=None)

A seed for the random number generator to allow for consistency.

Returns
The provided covariates and outcomes, having been over-sampled to match another.
  • X_os : numpy.ndarray : (num_sample2_units, num_sample2_features).

  • y_os : numpy.ndarray : (num_sample2_units,).

  • w_os : numpy.ndarray : (num_sample2_units,).
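
A hedged sketch of over-sampling a smaller control sample up to the size of a larger treatment sample; the data here are synthetic.

    import numpy as np

    from causeinfer.utils import over_sample

    rng = np.random.default_rng(0)
    X_control = rng.normal(size=(300, 5))      # smaller control sample
    y_control = rng.integers(0, 2, size=300)
    w_control = np.zeros(300, dtype=int)

    X_os, y_os, w_os = over_sample(
        X_1=X_control,
        y_1=y_control,
        w_1=w_control,
        sample_2_size=700,  # size of the larger treatment sample to match
        shuffle=True,
        random_state=42,
    )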

Contributing to causeinfer

Thank you for your consideration in contributing to this project!

Please take a moment to review this document in order to make the contribution process easy and effective for everyone involved.

Following these guidelines helps to communicate that you respect the time of the developers managing and developing this open source project. In return, and in accordance with this project’s code of conduct, other contributors will reciprocate that respect in addressing your issue or assessing patches and features.

Using the issue tracker

The issue tracker for causeinfer is the preferred channel for bug reports, features requests and submitting pull requests.

Bug reports

A bug is a demonstrable problem that is caused by the code in the repository. Good bug reports are extremely helpful - thank you!

Guidelines for bug reports:

  1. Use the GitHub issue search to check if the issue has already been reported.

  2. Check if the issue has been fixed by trying to reproduce it using the latest main or development branch in the repository.

  3. Isolate the problem to make sure that the code in the repository is definitely responsible for the issue.

Great Bug Reports tend to have:

  • A quick summary

  • Steps to reproduce

  • What you expected would happen

  • What actually happens

  • Notes (why this might be happening, things tried that didn’t work, etc)

Again, thank you for your time in reporting issues!

Feature requests

Feature requests are more than welcome! Please take a moment to find out whether your idea fits with the scope and aims of the project. When making a suggestion, provide as much detail and context as possible, and further make clear the degree to which you would like to contribute in its development.

Pull requests

Good pull requests - patches, improvements and new features - are a fantastic help. They should remain focused in scope and avoid containing unrelated commits. Note that all contributions to this project will be made under the specified license and should follow the coding indentation and style standards (contact us if unsure).

Please ask first before embarking on any significant pull request (implementing features, refactoring code, etc), otherwise you risk spending a lot of time working on something that the developers might not want to merge into the project. With that being said, major additions are very appreciated!

When making a contribution, adhering to the GitHub flow process is the best way to get your work merged:

  1. Fork the repo, clone your fork, and configure the remotes:

    # Clone your fork of the repo into the current directory
    git clone https://github.com/<your-username>/<repo-name>
    # Navigate to the newly cloned directory
    cd <repo-name>
    # Assign the original repo to a remote called "upstream"
    git remote add upstream https://github.com/<upstream-owner>/<repo-name>
    
  2. If you cloned a while ago, get the latest changes from upstream:

    git checkout <dev-branch>
    git pull upstream <dev-branch>
    
  3. Create a new topic branch (off the main project development branch) to contain your feature, change, or fix:

    git checkout -b <topic-branch-name>
    
  4. Commit your changes in logical chunks, and please try to adhere to Conventional Commits. Use Git’s interactive rebase feature to tidy up your commits before making them public.

  5. Locally merge (or rebase) the upstream development branch into your topic branch:

    git pull --rebase upstream <dev-branch>
    
  6. Push your topic branch up to your fork:

    git push origin <topic-branch-name>
    
  7. Open a Pull Request with a clear title and description.

Thank you in advance for your contributions!

License

BSD 3-Clause License

Copyright (c) 2019, the causeinfer developers. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
  contributors may be used to endorse or promote products derived from
  this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Change log

Changelog

causeinfer tries to follow semantic versioning, a MAJOR.MINOR.PATCH version where increments are made of the:

  • MAJOR version when we make incompatible API changes

  • MINOR version when we add functionality in a backwards compatible manner

  • PATCH version when we make backwards compatible bug fixes

causeinfer 1.0.1 (June 3rd, 2022)

  • Updates source code files with direct references to the code they’re based on.

causeinfer 1.0.0 (December 28th, 2021)

causeinfer 0.1.2 (April 4th, 2021)

Changes include:

  • An src structure has been adopted to improve organization and testing

  • Users are now able to implement the following models:

    • Reflective Uplift (Shaar 2016)

    • Pessimistic Uplift (Shaar 2016)

  • The contribution guidelines have been expanded

  • Code quality checks via Codacy have been added

  • Extensive code formatting has been done to improve quality and style

  • Bug fixes and a more explicit use of exceptions

causeinfer 0.1.0 (Feb 25th, 2021)

First stable release of causeinfer

  • Users are able to implement baseline causal inference models including:

    • Two model

    • Interaction term (Lo 2002)

    • Binary transformation (Lai 2006)

    • Quaternary transformation (Kane 2014)

  • Plotting functions allow for graphical analysis of models

  • Functions useful for research such as model iterations, oversampling, and variance analysis are included

  • The package is fully documented

  • Virtual environment files are provided

  • Extensive testing of all modules with GH Actions and Codecov has been performed

  • A code of conduct and contribution guidelines are included
