CMF Microfinance

A dataset on microfinance from The Centre for Micro Finance (CMF) at the Institute for Financial Management Research (Chennai, India).

See an example using this data at causeinfer/examples/socioeconomic_cmf_micro.

Description found at:

https://www.aeaweb.org/articles?id=10.1257/app.20130533 (see paper)

Based on

A. Banerjee et al. “The Miracle of Microfinance? Evidence from a Randomized Evaluation”. In: American Economic Journal: Applied Economics 7 (1 2015), pp. 22–53. URL: https://www.aeaweb.org/articles?id=10.1257/app.20130533.

Contents

download_cmf_micro (deprecated), _format_data, load_cmf_micro

causeinfer.data.cmf_micro._format_data(dataset_path, format_covariates=True, normalize=True)

Formats the data upon loading for consistent data preparation.

Source: https://github.com/thmstang/apa19-microfinance/blob/master/helpers.r (R-version)

Parameters:
dataset_path : str

Path to the original data, a folder containing several .dta files.

format_covariates : bool, optional (default=True)
  • True: creates dummy columns and encodes the data.

  • False: only steps for data readability will be taken.

normalize : bool, optional (default=True)

The normalization step, controlled in load_cmf_micro.

Returns:
df : pd.DataFrame

A formatted version of the data.
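The effect of the two flags can be illustrated on toy data. The following is a minimal sketch using only the standard library, not causeinfer's actual implementation: the helper names `make_dummies` and `z_score`, the column names, and the choice of z-score normalization are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def make_dummies(rows, column):
    """Replace a categorical column with one 0/1 column per category (hypothetical helper)."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


def z_score(values):
    """Rescale values to mean 0 and unit standard deviation (assumed normalization)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]


# Toy rows standing in for survey records; these are not real CMF data.
rows = [
    {"hh_size": 4, "area_type": "urban"},
    {"hh_size": 6, "area_type": "rural"},
    {"hh_size": 5, "area_type": "urban"},
]

encoded = make_dummies(rows, "area_type")
normalized_sizes = z_score([row["hh_size"] for row in encoded])
```

With format_covariates=False, only the readability steps would apply and the dummy-encoding step sketched here would be skipped.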

causeinfer.data.cmf_micro.load_cmf_micro(file_path=None, format_covariates=True, normalize=True)

Loads the CMF micro dataset with formatting if desired.

Parameters:
file_path : str, optional (default=None)

Specify another path for the dataset.

By default the dataset should be stored in the ‘datasets’ folder in the cwd.

load_raw_data : bool, optional (default=True)

Indicates whether raw data should be loaded without covariate manipulation.

download_if_missing : bool, optional (default=True) (Deprecated)

Download the dataset using ‘download_cmf_micro’ if it has not already been downloaded.

normalize : bool, optional (default=True)

Normalize the dataset to prepare it for ML methods.

Returns:
data : dict object with the following attributes:
data.description : str

A description of the CMF microfinance data.

data.dataset_full : numpy.ndarray (5328, 183) or formatted (5328, 60)

The full dataset with features, treatment, and target variables.

data.dataset_full_names : list, size 61

List of dataset variable names.

data.features : numpy.ndarray (5328, 186) or formatted (5328, 57)

Each row corresponds to the 58 feature values in order (note that the other target can be used as a feature).

data.feature_names : list, size 58

List of feature names.

data.treatment : numpy.ndarray (5328,)

Each value corresponds to the treatment (1 = treat, 0 = control).

data.response_biz_index : numpy.ndarray (5328,)

Each value corresponds to the business index of each of the participants.

data.response_women_emp : numpy.ndarray (5328,)

Each value corresponds to the women’s empowerment index of each of the participants.
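
Given the treatment and response arrays, a simple first check is the difference in mean response between treated and control units. The following is a minimal sketch, assuming the returned object supports key access like a plain dict. The values below are small made-up stand-ins (not CMF data), and `mean_difference` is a hypothetical helper rather than part of causeinfer.

```python
from statistics import mean


def mean_difference(treatment, response):
    """Mean response of treated units minus mean response of control units."""
    treated = [r for t, r in zip(treatment, response) if t == 1]
    control = [r for t, r in zip(treatment, response) if t == 0]
    return mean(treated) - mean(control)


# Made-up stand-in for the dict returned by load_cmf_micro (key access is assumed).
data = {
    "treatment": [1, 0, 1, 0, 1, 0],
    "response_biz_index": [0.5, 0.25, 0.75, 0.0, 0.25, 0.5],
}

diff = mean_difference(data["treatment"], data["response_biz_index"])
print(f"Difference in mean business index: {diff:.3f}")
```

The same helper could be applied to data.response_women_emp, since both response arrays are aligned row by row with data.treatment.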