The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression

Bibliographic Details
Main Author: Ombura, Martin
Other Authors: Nitschke, Geoff Stuart
Format: Thesis
Language: English
Published: Department of Computer Science 2021
Subjects:
_version_ 1867613344249151488
access_status_str Open Access
author Ombura, Martin
author2 Nitschke, Geoff Stuart
author_browse Nitschke, Geoff Stuart
Ombura, Martin
author_facet Nitschke, Geoff Stuart
Ombura, Martin
author_sort Ombura, Martin
collection Thesis
description Computer bugs and tests are antagonistic elements of the software development process, with the former attempting to corrupt a program and the latter aiming to identify and fix the introduced faults. The automation of bug identification and repair through automated software testing is an area of research that has seen success only in niche areas of software development and has failed to progress into general areas of computing, owing to the complexity and diversity of programming languages, codebases and developer coding practices. Unlike traditional engineering fields such as mechanical or civil engineering, where project specifications are carefully outlined and built towards, software engineering suffers from a lack of the global standardization required to “build from a spec”. In this study we investigate a coevolutionary, spec-based approach to dynamically damaging and repairing mathematical programs (functions). We opt for mathematical functions instead of software because of their functional similarities and their simpler syntax and semantics. We utilize symbolic regression (SR) as a framework to analyze the error that bugs maximize and tests minimize. We adopt a hybrid evolutionary algorithm (EA) that combines the tree-based phenotypic structure of genetic programming (GP) with the list-based chromosome of a genetic algorithm (GA), which permits the embedding of mathematical tree manipulation (MTM) strategies as well as adequate selection mechanisms for search. Bugs use the MTM strategies in their chromosome to manipulate the input program (IP) with the aim of maximizing the error, while tests adopt their own set of MTM strategies to repair the damaged program, using a spec generated from the IP to guide the repair process. Both adversarial agents are investigated in four common coevolutionary topologies: Hall of Fame (HoF), K-Random Tournaments (KRT), Round Robin (RR) and Single Elimination Tournament (SET).
We ran 1556 simulations, each generating a random polynomial that the bugs and tests would have to contend over in all four topologies. We observed that KRT with a low k value of 5 performs best from both a computational and a fitness standpoint for all bugs and tests. Bugs were dominant in nearly all topologies for all polynomial complexities, whereas tests struggled in the HoF, RR and SET topologies as the input programs became more complex. The competitive landscape, however, was quite chaotic: the best individuals lasted a maximum of 14 generations out of 300, and the average top individual lasted only 1 generation. This made it nearly impossible to predict when the best individuals would be born, as the coevolutionary landscape changed rapidly and non-deterministically. The kinds of MTM strategies selected by both bugs and tests depended on the level of complexity of the input programs. For input programs that had negative polynomials, the best bugs opted to delete the program entirely and build a completely new tree, whereas the best tests were unable to select viable specialized strategies to repair such programs. For programs that had large polynomial degrees, bugs opted for strategies that added nodes to their underlying GP tree, in the hope of damaging the input program more. Tests, on the other hand, implemented strategies to carefully reduce the complexity of the polynomial. Tests, however, frequently overcompensated when attempting to fix the fit bugs, leading to mediocre solutions.
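The error that bugs maximize and tests minimize, and the difference in evaluation cost between the RR and KRT topologies, can be illustrated with a minimal sketch. It is not the thesis implementation: the coefficient-list encoding of polynomials, the mean-squared-error measure, and all function names below are assumptions made for illustration only.

```python
import random
from typing import List, Sequence, Tuple

# Assumed encoding: coefficients listed from lowest to highest degree.
Poly = List[float]


def evaluate(poly: Poly, x: float) -> float:
    """Evaluate a polynomial at x using Horner's rule."""
    result = 0.0
    for coeff in reversed(poly):
        result = result * x + coeff
    return result


def spec_error(spec: Poly, candidate: Poly, xs: Sequence[float]) -> float:
    """Mean squared error between the spec and a candidate over sample points.

    In this sketch a bug would try to raise this value and a test to lower it.
    """
    return sum((evaluate(spec, x) - evaluate(candidate, x)) ** 2 for x in xs) / len(xs)


def round_robin_pairs(n_bugs: int, n_tests: int) -> List[Tuple[int, int]]:
    """Round Robin: every bug meets every test."""
    return [(b, t) for b in range(n_bugs) for t in range(n_tests)]


def k_random_pairs(n_bugs: int, n_tests: int, k: int,
                   rng: random.Random) -> List[Tuple[int, int]]:
    """K-Random Tournaments: every bug meets k distinct, randomly drawn tests."""
    k = min(k, n_tests)  # cannot draw more distinct opponents than exist
    pairs = []
    for b in range(n_bugs):
        for t in rng.sample(range(n_tests), k):
            pairs.append((b, t))
    return pairs


if __name__ == "__main__":
    spec = [1.0, 2.0, 3.0]     # 1 + 2x + 3x^2, a hypothetical input program
    damaged = [1.0, 2.0, 0.0]  # a hypothetical bug removed the x^2 term
    print(spec_error(spec, damaged, [0.0, 1.0, 2.0]))
    print(len(round_robin_pairs(10, 10)),
          len(k_random_pairs(10, 10, 5, random.Random(0))))
```

With 10 bugs and 10 tests, Round Robin schedules 100 encounters per generation while K-Random Tournaments with k = 5 schedules 50, which is one plausible reading of why a low k is cheaper computationally.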
format Thesis
id oai:open.uct.ac.za:11427/32904
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:34:39.078Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2021
publishDateRange 2021
publishDateSort 2021
publisher Department of Computer Science
publisherStr Department of Computer Science
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/32904 2021-02-19T12:49:21Z 2021-02-19T12:49:21Z 2020 2021-02-19T12:28:00Z Master Thesis Masters MSc http://hdl.handle.net/11427/32904 eng application/pdf Department of Computer Science Faculty of Science
spellingShingle computer science
Ombura, Martin
The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
thesis_degree_str Master's
title The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
title_full The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
title_fullStr The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
title_full_unstemmed The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
title_short The performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
title_sort performance of coevolutionary topologies in developing competitive tree manipulation strategies for symbolic regression
topic computer science
url http://hdl.handle.net/11427/32904
work_keys_str_mv AT omburamartin theperformanceofcoevolutionarytopologiesindevelopingcompetitivetreemanipulationstrategiesforsymbolicregression
AT omburamartin performanceofcoevolutionarytopologiesindevelopingcompetitivetreemanipulationstrategiesforsymbolicregression