Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy

This research addresses the dual challenges of improving credit scorecard accuracy and maintaining interpretability. While machine learning algorithms like random forest and eXtreme gradient boosting outperform traditional logistic regression in accuracy, their complex predictor variable representat...

Bibliographic Details
Main Author:	Hlongwane, Rivalani
Other Authors:	Ramaboa, Kutlwano
Format:	Thesis
Language:	English
Published:	Graduate School of Business (GSB) 2025
Subjects:	credit scorecard

_version_	1867613492584906752
access_status_str	Open Access
author	Hlongwane, Rivalani
author2	Ramaboa, Kutlwano
author_browse	Hlongwane, Rivalani Ramaboa, Kutlwano
author_facet	Ramaboa, Kutlwano Hlongwane, Rivalani
author_sort	Hlongwane, Rivalani
collection	Thesis
description	This research addresses the dual challenges of improving credit scorecard accuracy and maintaining interpretability. While machine learning algorithms like random forest and eXtreme gradient boosting outperform traditional logistic regression in accuracy, their complex predictor variable representation hinders interpretability. To reconcile this, the study discretizes numerical variables, applies one-hot encoding, and employs Shapley values to derive interpretable credit scores for random forest, eXtreme gradient boosting, light gradient boosting machine, and categorical boosting models. This approach produces credit scorecards that align with industry standards. Additionally, the investigation into the role of alternative data in credit scoring reveals its impact on model accuracy. By analysing unique predictor variables such as an applicant's social circle default status, regional ratings, and local population size, the significance of alternative data is demonstrated. Leveraging the model-X knockoffs framework for predictor variable selection contributes to superior model performance, achieving the highest area under the curve on the Kaggle home credit data.
format	Thesis
id	oai:open.uct.ac.za:11427/41623
institution	University of Cape Town (South Africa)
language	English eng
last_indexed	2026-06-10T12:37:00.852Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate	2025
publishDateRange	2025
publishDateSort	2025
publisher	Graduate School of Business (GSB)
publisherStr	Graduate School of Business (GSB)
record_format	dspace
source_str	UCTD — University of Cape Town Open Access Repository
spelling	oai:open.uct.ac.za:11427/41623 Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy Hlongwane, Rivalani Ramaboa, Kutlwano credit scorecard This research addresses the dual challenges of improving credit scorecard accuracy and maintaining interpretability. While machine learning algorithms like random forest and eXtreme gradient boosting outperform traditional logistic regression in accuracy, their complex predictor variable representation hinders interpretability. To reconcile this, the study discretizes numerical variables, applies one-hot encoding, and employs Shapley values to derive interpretable credit scores for random forest, eXtreme gradient boosting, light gradient boosting machine, and categorical boosting models. This approach produces credit scorecards that align with industry standards. Additionally, the investigation into the role of alternative data in credit scoring reveals its impact on model accuracy. By analysing unique predictor variables such as an applicant's social circle default status, regional ratings, and local population size, the significance of alternative data is demonstrated. Leveraging the model-X knockoffs framework for predictor variable selection contributes to superior model performance, achieving the highest area under the curve on the Kaggle home credit data. 2025-08-26T09:00:37Z 2025-08-26T09:00:37Z 2025 2025-08-26T08:57:52Z Thesis / Dissertation Doctoral PhD http://hdl.handle.net/11427/41623 en eng application/pdf Graduate School of Business (GSB) Faculty of Commerce University of Cape Town
spellingShingle	credit scorecard Hlongwane, Rivalani Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
thesis_degree_str	Doctoral
title	Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
title_full	Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
title_fullStr	Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
title_full_unstemmed	Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
title_short	Credit scorecards in retail banking: enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
title_sort	credit scorecards in retail banking enhancing interpretability through shapley values and evaluating the effectiveness of alternative data for improved accuracy
topic	credit scorecard
url	http://hdl.handle.net/11427/41623
work_keys_str_mv	AT hlongwanerivalani creditscorecardsinretailbankingenhancinginterpretabilitythroughshapleyvaluesandevaluatingtheeffectivenessofalternativedataforimprovedaccuracy
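
The pipeline summarized in the description field (discretize numerical variables, one-hot encode them, then use Shapley values to derive per-variable scores) can be illustrated with a small sketch. This is not the thesis's implementation: the bin edges, the toy log-odds model standing in for a tree ensemble, the background sample, and the scaling constants (`PDO`, `OFFSET`) are all hypothetical, and the Shapley values are computed by brute-force coalition enumeration rather than by a tree-specific algorithm.

```python
from itertools import combinations
from math import factorial, log

# Hypothetical bin upper bounds; values above the last edge fall in the final bin.
AGE_EDGES = [25, 40]
INCOME_EDGES = [2000, 5000]

def bin_index(value, edges):
    """Index of the first bin whose upper bound is >= value."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def encode(age, income):
    """Discretize both variables and concatenate their one-hot vectors."""
    return (one_hot(bin_index(age, AGE_EDGES), 3)
            + one_hot(bin_index(income, INCOME_EDGES), 3))

# Toy stand-in for a fitted model: returns the log-odds of default.
WEIGHTS = [0.8, 0.1, -0.4, 0.9, 0.0, -0.7]

def log_odds_default(row):
    linear = sum(w * v for w, v in zip(WEIGHTS, row))
    interaction = 0.5 * row[0] * row[3]  # youngest bin AND lowest income bin
    return -2.0 + linear + interaction

def shapley_values(predict, x, background):
    """Exact Shapley values; features outside a coalition take background values."""
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi

# Hypothetical scorecard scaling: PDO points double the odds of being good.
PDO = 20.0
OFFSET = 600.0
FACTOR = PDO / log(2)

def scorecard(phi, base_value):
    """Sum Shapley values per original variable and convert them to points."""
    points = {
        "base": OFFSET - FACTOR * base_value,
        "age": -FACTOR * sum(phi[0:3]),
        "income": -FACTOR * sum(phi[3:6]),
    }
    points["total"] = sum(points.values())
    return points

applicant = encode(22, 1500)
background = [encode(30, 3000), encode(50, 6000), encode(22, 1500), encode(45, 2500)]
phi = shapley_values(log_odds_default, applicant, background)
base = sum(log_odds_default(r) for r in background) / len(background)
print(scorecard(phi, base))
```

Because Shapley values sum to the gap between the applicant's prediction and the background average, the per-variable points add up to the applicant's total score, which is what gives the result the additive layout of a conventional scorecard.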

Similar Items