CMF Microfinance

A dataset on microfinance from The Centre for Micro Finance (CMF) at the Institute for Financial Management Research (Chennai, India).

Description found at: https://www.aeaweb.org/articles?id=10.1257/app.20130533 (see paper)
Based on: A. Banerjee et al. “The Miracle of Microfinance? Evidence from a Randomized Evaluation”. In: American Economic Journal: Applied Economics 7 (1 2015), pp. 22–53. URL: https://www.aeaweb.org/articles?id=10.1257/app.20130533.
Contents: download_cmf_micro (deprecated), _format_data, load_cmf_micro

causeinfer.data.cmf_micro._format_data(dataset_path, format_covariates=True, normalize=True)

Formats the data upon loading for consistent data preparation.

Parameters:

dataset_path : str

Path to the original data, which is a folder containing several .dta files.

format_covariates : bool, optional (default=True)

normalize : bool, optional (default=True)

Whether to normalize the data. This step is controlled from load_cmf_micro.

Returns:

causeinfer.data.cmf_micro.load_cmf_micro(file_path=None, format_covariates=True, normalize=True)

Loads the CMF micro dataset with formatting if desired.

Parameters:

file_path : str, optional (default=None)

Specify another path for the dataset.

By default the dataset should be stored in the ‘datasets’ folder in the current working directory.

load_raw_data : bool, optional (default=True)

Indicates whether raw data should be loaded without covariate manipulation.

download_if_missing : bool, optional (default=True) (Deprecated)

Download the dataset with ‘download_cmf_micro’ if it has not already been downloaded.

normalize : bool, optional (default=True)

Normalize the dataset to prepare it for ML methods.
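The page does not say which normalization is applied. Purely as an illustration of what such a step can look like, the sketch below standardizes each column to zero mean and unit variance (z-scoring). The function name `zscore_columns` and the toy rows are hypothetical and are not part of causeinfer.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a row-major table to mean 0 and std 1.

    Assumption: z-scoring is used here only as an example; the actual
    normalization performed by load_cmf_micro is not documented above.
    Constant columns (std == 0) are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy rows (made up): the second column is constant on purpose.
toy_rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = zscore_columns(toy_rows)
print(scaled)
```

Scaling features to a common range mainly matters for distance-based or gradient-based learners; tree-based methods are largely insensitive to it.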

Returns:

data : dict object with the following attributes:

data.description : str: A description of the CMF microfinance data.
data.dataset_full : numpy.ndarray (5328, 183) or formatted (5328, 60): The full dataset with features, treatment, and target variables.
data.dataset_full_names : list, size 61: List of dataset variable names.
data.features : numpy.ndarray (5328, 186) or formatted (5328, 57): Each row corresponds to the 58 feature values in order (note that the other target can be used as a feature).
data.feature_names : list, size 58: List of feature names.
data.treatment : numpy.ndarray (5328,): Each value corresponds to the treatment (1 = treat, 0 = control).
data.response_biz_index : numpy.ndarray (5328,): Each value corresponds to the business index of each of the participants.
data.response_women_emp : numpy.ndarray (5328,): Each value corresponds to the women’s empowerment index of each of the participants.
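As a rough sketch of how the returned attributes might be used together, the example below splits a response by the treatment indicator and compares the group means. The helper `mean_response_by_arm` and the stand-in object `toy` with its values are hypothetical. The commented-out call assumes the dataset is already present in the ‘datasets’ folder.

```python
from statistics import mean
from types import SimpleNamespace


def mean_response_by_arm(treatment, response):
    """Return (treated_mean, control_mean) for a 0/1 treatment indicator."""
    treated = [float(y) for t, y in zip(treatment, response) if int(t) == 1]
    control = [float(y) for t, y in zip(treatment, response) if int(t) == 0]
    return mean(treated), mean(control)


# With the real data (assumes causeinfer is installed and the files exist):
# from causeinfer.data import cmf_micro
# data = cmf_micro.load_cmf_micro(format_covariates=True, normalize=True)

# Stand-in object with made-up values, mimicking the attribute names above.
toy = SimpleNamespace(
    treatment=[1, 0, 1, 0],
    response_biz_index=[0.5, 0.0, 1.5, 1.0],
)

treated_mean, control_mean = mean_response_by_arm(
    toy.treatment, toy.response_biz_index
)
print(treated_mean - control_mean)  # raw difference in means between arms
```

The same helper could be applied to `response_women_emp`. A raw difference in means is only a first look and does not replace the estimation approach used in the paper.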