This study presents an innovative solution to the challenge of generating new data points for small data sets. It introduces a Singular Value Decomposition (SVD)-based model that draws inspiration from the ability of SVD to estimate a lower-rank matrix. This approach seeks to overcome the limitations...
| Main Author: | Biyana, Tlhologello |
|---|---|
| Other Authors: | Nyirenda, Juwa Chiza |
| Format: | Thesis |
| Language: | Eng |
| Published: | Department of Statistical Sciences, 2025 |
| Subjects: | Statistical Sciences |

| _version_ | 1867613343022317568 |
|---|---|
| access_status_str | Open Access |
| author | Biyana, Tlhologello |
| author2 | Nyirenda, Juwa Chiza |
| author_browse | Biyana, Tlhologello Nyirenda, Juwa Chiza |
| author_facet | Nyirenda, Juwa Chiza Biyana, Tlhologello |
| author_sort | Biyana, Tlhologello |
| collection | Thesis |
| description | This study presents an innovative solution to the challenge of generating new data points for small data sets. It introduces a Singular Value Decomposition (SVD)-based model that draws inspiration from the ability of SVD to estimate a lower-rank matrix. This approach seeks to overcome the limitations imposed by sample size constraints by expanding the available data. Motivated by challenges faced during algorithm development due to small data sets, the study proposes the SVD-based model, evaluates its efficacy in replicating the attributes of the original data and compares model performance on the new and original data. The method uses SVD to generate new data, mimicking a predictive modelling formula by combining systematic and error components. The generated data set retains the distribution of the original data but introduces distinct error values, facilitating efficient data generation. The effectiveness of the method is demonstrated through graphical and quantitative assessments, including histograms, box plots, correlation analysis and reconstruction error evaluations. The study compares SVD-generated data sets with the original data across three data sets: Abalone, Life Expectancy and NBA. Findings indicate that the SVD-generated data sets closely approximate the distribution, correlation and model performance attributes of the original data sets. Similarity improves as the number of observations increases, which enhances the comparability and model performance of the SVD-generated data. While minor deviations are noted in specific scenarios, the study underscores the potential of SVD for generating new data points from the original data sets, making it a valuable tool for data augmentation and analysis across diverse data sets. |
| format | Thesis |
| id | oai:open.uct.ac.za:11427/41476 |
| institution | University of Cape Town (South Africa) |
| language | Eng |
| last_indexed | 2026-06-10T12:34:38.153Z |
| license_str | Not specified — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository |
| publishDate | 2025 |
| publishDateRange | 2025 |
| publishDateSort | 2025 |
| publisher | Department of Statistical Sciences |
| publisherStr | Department of Statistical Sciences |
| record_format | dspace |
| source_str | UCTD — University of Cape Town Open Access Repository |
| spelling | oai:open.uct.ac.za:11427/41476 Generating new data points using singular value decomposition Biyana, Tlhologello Nyirenda, Juwa Chiza Statistical Sciences This study presents an innovative solution to the challenge of generating new data points for small data sets. It introduces a Singular Value Decomposition (SVD)-based model that draws inspiration from the ability of SVD to estimate a lower rank matrix. This approach seeks to overcome the limitations imposed by sample size constraints by expanding available data. Motivated by challenges faced during algorithm development due to small data sets, the study proposes the SVD-based model, evaluates its efficacy in replicating original data attributes and compares model performance with new and original data. The method involves utilising SVD to generate new data, mimicking a predictive modelling formula by combining systematic and error components. The generated data set retains the distribution of the original data but introduces distinct error values, facilitating efficient data generation. Through graphical and quantitative assessments, including histograms, box plots, correlation analysis and reconstruction error evaluations, the effectiveness of the method is demonstrated. The study focuses on comparing SVD-generated data sets with original data across three data sets: Abalone, Life Expectancy and NBA. Findings indicate close approximation of distribution, correlation and model performance attributes between SVD-generated and original data sets. Improved similarity with increasing observation count enhances comparability and model performance of SVD-generated data. While minor deviations are noted in specific scenarios, the study underscores the potential of SVD in generating new data points from the original data sets, making it a valuable tool for data augmentation and analysis across diverse data sets. 
2025-06-23T13:19:52Z 2025-06-23T13:19:52Z 2025 2025-06-23T13:17:18Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/41476 Eng application/pdf Department of Statistical Sciences Faculty of Science University of Cape Town |
| spellingShingle | Statistical Sciences Biyana, Tlhologello Generating new data points using singular value decomposition |
| thesis_degree_str | Master's |
| title | Generating new data points using singular value decomposition |
| title_full | Generating new data points using singular value decomposition |
| title_fullStr | Generating new data points using singular value decomposition |
| title_full_unstemmed | Generating new data points using singular value decomposition |
| title_short | Generating new data points using singular value decomposition |
| title_sort | generating new data points using singular value decomposition |
| topic | Statistical Sciences |
| url | http://hdl.handle.net/11427/41476 |
| work_keys_str_mv | AT biyanatlhologello generatingnewdatapointsusingsingularvaluedecomposition |
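
The description above outlines generating new data by combining a systematic component (a lower-rank SVD estimate) with an error component. The record does not give the thesis's exact procedure, so the following is only a minimal sketch under stated assumptions: the function name `svd_generate`, the `rank` parameter, the column centring, the toy data, and the choice to obtain "distinct error values" by shuffling residual rows are all hypothetical illustrations, not the author's method.

```python
import numpy as np

def svd_generate(X, rank, rng):
    """Sketch: rank-`rank` SVD reconstruction plus row-shuffled residuals.

    Assumption: this is one plausible reading of "systematic + error",
    not the procedure used in the thesis.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                   # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Systematic component: best rank-`rank` approximation of the centred data.
    systematic = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Error component: what the low-rank approximation leaves out.
    residual = Xc - systematic
    # Hypothetical choice: reassign residual rows at random so each
    # generated row receives a different error vector.
    shuffled = residual[rng.permutation(X.shape[0])]
    return mean + systematic + shuffled

# Toy data (illustrative only): 50 observations, 4 correlated features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(50, 4))
X_new = svd_generate(X, rank=2, rng=rng)
```

Because the residual rows are only permuted, the generated data keeps the same column means as the original in this sketch; other ways of drawing the error component (for example, sampling from a fitted distribution) would behave differently.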