CMF Microfinance
A dataset on microfinance from The Centre for Micro Finance (CMF) at the Institute for Financial Management Research (Chennai, India).
See an example using this data at causeinfer/examples/socioeconomic_cmf_micro.
- Description found at:
https://www.aeaweb.org/articles?id=10.1257/app.20130533 (see paper)
- Based on
A. Banerjee et al. “The Miracle of Microfinance? Evidence from a Randomized Evaluation”. In: American Economic Journal: Applied Economics 7 (1 2015), pp. 22–53. URL: https://www.aeaweb.org/articles?id=10.1257/app.20130533.
- Contents
download_cmf_micro (deprecated), _format_data, load_cmf_micro
- causeinfer.data.cmf_micro._format_data(dataset_path, format_covariates=True, normalize=True)
Formats the data upon loading for consistent data preparation.
Source: https://github.com/thmstang/apa19-microfinance/blob/master/helpers.r (R-version)
- Parameters
- dataset_path : str
Path to the dataset. The original download is a folder containing several .dta files.
- format_covariates : bool, optional (default=True)
True: creates dummy columns and encodes the data.
False: only steps for data readability will be taken.
- normalize : bool, optional (default=True)
Normalization step controlled in load_cmf_micro.
- Returns
- df : pd.DataFrame
A formatted version of the data.
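The two flags described above can be illustrated on a toy frame. This is only a sketch of the general idea (dummy encoding, then normalization), not the library's actual implementation: the helper `format_toy_frame`, the column names, and the choice of z-score scaling are all assumptions made for illustration.

```python
import pandas as pd


def format_toy_frame(df, categorical_cols, numeric_cols,
                     format_covariates=True, normalize=True):
    """Hypothetical stand-in for the kind of work _format_data describes."""
    out = df.copy()
    if format_covariates:
        # Create one dummy (0/1) column per category level.
        out = pd.get_dummies(out, columns=categorical_cols, dtype=float)
    if normalize:
        # Assumed z-score scaling of the numeric covariates only;
        # the scaling used by the library itself is not stated here.
        centered = out[numeric_cols] - out[numeric_cols].mean()
        out[numeric_cols] = centered / out[numeric_cols].std(ddof=0)
    return out


# Toy data with made-up column names (not taken from the CMF dataset).
toy = pd.DataFrame({
    "household_size": [2, 4, 6],
    "area_type": ["a", "b", "a"],
})

formatted = format_toy_frame(toy, ["area_type"], ["household_size"])
readable = format_toy_frame(toy, ["area_type"], ["household_size"],
                            format_covariates=False, normalize=False)
print(formatted)
```

With `format_covariates=False` the categorical column is left as-is, which mirrors the "readability only" behavior described for that flag.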
- causeinfer.data.cmf_micro.load_cmf_micro(file_path=None, format_covariates=True, normalize=True)
Loads the CMF micro dataset with formatting if desired.
- Parameters
- file_path : str, optional (default=None)
Specify another path for the dataset.
By default the dataset should be stored in the ‘datasets’ folder in the cwd.
- load_raw_data : bool, optional (default=True)
Indicates whether raw data should be loaded without covariate manipulation.
- download_if_missing : bool, optional (default=True) (Deprecated)
Download the dataset with ‘download_cmf_micro’ if it has not been downloaded already.
- normalize : bool, optional (default=True)
Normalize the dataset to prepare it for ML methods.
- Returns
- data : dict object with the following attributes:
- data.description : str
A description of the CMF microfinance data.
- data.dataset_full : numpy.ndarray, (5328, 183) or formatted (5328, 60)
The full dataset with features, treatment, and target variables.
- data.dataset_full_names : list, size 61
List of dataset variable names.
- data.features : numpy.ndarray, (5328, 186) or formatted (5328, 57)
Each row corresponds to the 58 feature values in order (note that the other target can be a feature).
- data.feature_names : list, size 58
List of feature names.
- data.treatment : numpy.ndarray, (5328,)
Each value corresponds to the treatment (1 = treat, 0 = control).
- data.response_biz_index : numpy.ndarray, (5328,)
Each value corresponds to the business index of each of the participants.
- data.response_women_emp : numpy.ndarray, (5328,)
Each value corresponds to the women’s empowerment index of each of the participants.
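As a rough illustration of how the treatment and response arrays fit together, the sketch below computes a naive difference in mean response between treated and control units. The arrays are small synthetic stand-ins, not CMF data, and `difference_in_means` is a hypothetical helper; the commented-out call only repeats the signature documented above and assumes attribute-style access as written in the Returns list.

```python
import numpy as np

# With the real dataset, the arrays would come from the loader, e.g.:
# from causeinfer.data import cmf_micro
# data = cmf_micro.load_cmf_micro(format_covariates=True, normalize=True)
# treatment, response = data.treatment, data.response_biz_index


def difference_in_means(treatment, response):
    """Mean response of treated units minus mean response of controls."""
    treatment = np.asarray(treatment)
    response = np.asarray(response, dtype=float)
    treated = response[treatment == 1]   # 1 = treat
    control = response[treatment == 0]   # 0 = control
    return treated.mean() - control.mean()


# Synthetic stand-in values for illustration only.
treatment = np.array([1, 0, 1, 0, 1, 0])
response = np.array([0.5, 0.1, 0.3, -0.1, 0.4, 0.0])

effect = difference_in_means(treatment, response)
print(f"Naive difference in means: {effect:.3f}")
```

The same helper could be pointed at either response array, since both share the treatment vector's length.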