Document Type

Dissertation

Rights

This item is available under a Creative Commons License for non-commercial use only

Disciplines

Computer Sciences

Publication Details

Dissertation submitted in partial fulfilment of the requirements of Dublin Institute of Technology for the degree of M.Sc. in Computing (Stream), June 2018.

Abstract

This study evaluated the performance of an artificial neural network (ANN) multi-layer perceptron model and a logistic regression LogitBoost (LR) model in predicting default in chit funds. Two types of default were investigated: payment late by 30 days and payment late by 90 days. The dataset was split into training and validation datasets using random sampling, and k-fold cross-validation was used on the training dataset to assess the performance of the tuning parameters. The validation dataset was used to compare the performance of the two algorithms. Principal component analysis (PCA) was used to reduce the feature set while still explaining 95% of the variance in the data. The classes were highly imbalanced, and Synthetic Minority Oversampling Technique (SMOTE) and downsampling were used to overcome the class imbalance. Sixteen experiments were run, eight for each of the two types of default. The three key metrics measured in these experiments were balanced accuracy, area under the ROC curve (AUC) and F1 score. After applying Bonferroni’s adjustment to the original p value, the threshold for statistical significance was set to 0.003 when comparing multiple experiments. In these experiments the ANN model had the best results for balanced accuracy, AUC and F1 score. Statistical analysis using a paired t test showed that there was a statistically significant difference between the results of the ANN and LR models. The results also showed that there was very little difference in the contribution of the top 20 features to the first 30 principal components, which were used to predict default. These features included family id, income and address. Features that made little or no contribution to the principal components included Commission, Auction Amount, and the type of relation the nominee is to the chit fund member. These findings are context specific; in this case the context is chit funds from a digital chit fund operator in India.
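
The significance threshold and two of the metrics named in the abstract can be illustrated with a small Python sketch. It is not the dissertation's code. It assumes that the original significance level was 0.05 and that the Bonferroni adjustment divided it by the 16 experiments; the abstract states neither, although 0.05 / 16 rounds to the reported 0.003. The confusion-matrix counts are hypothetical and chosen only to show why balanced accuracy is more informative than plain accuracy on imbalanced classes.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni's adjustment."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on defaulters) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fn, fp):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Assumption: original alpha of 0.05 shared across 16 experiments.
adjusted_alpha = bonferroni_alpha(0.05, 16)

# Hypothetical counts for an imbalanced validation set (50 defaulters in 1000).
tp, fn, tn, fp = 40, 10, 900, 50
plain_accuracy = (tp + tn) / (tp + fn + tn + fp)
bal_acc = balanced_accuracy(tp, fn, tn, fp)
f1 = f1_score(tp, fn, fp)

print(f"adjusted alpha:    {adjusted_alpha:.6f}")
print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {bal_acc:.3f}")
print(f"F1 score:          {f1:.3f}")
```

With these hypothetical counts, plain accuracy is dominated by the majority (non-default) class, while balanced accuracy and F1 score reflect the errors made on the defaulters.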

DOI

10.21427/D7KN69
