| Main Author: | Makkink, Thomas |
|---|---|
| Other Authors: | Shock, Jonathan |
| Format: | Thesis |
| Language: | English |
| Published: | Department of Mathematics and Applied Mathematics, 2022 |
| Subjects: | Mathematics and Applied Mathematics |
| _version_ | 1867613421666566145 |
|---|---|
| access_status_str | Open Access |
| author | Makkink, Thomas |
| author2 | Shock, Jonathan |
| author_browse | Makkink, Thomas Shock, Jonathan |
| author_facet | Shock, Jonathan Makkink, Thomas |
| author_sort | Makkink, Thomas |
| collection | Thesis |
| description | Memory is an important component of effective learning systems and is crucial in non-Markovian as well as partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning; however, the success of transformers in natural language processing tasks has highlighted a promising and viable alternative. Memory in reinforcement learning is particularly difficult as rewards are often sparse and distributed over many time steps. Early research into transformers as memory mechanisms for reinforcement learning indicated that the canonical model is not suitable, and that additional gated recurrent units and architectural modifications are necessary to stabilize these models. Several additional improvements to the canonical model have further extended its capabilities, such as increasing the attention span, dynamically selecting the number of per-symbol processing steps and accelerating convergence. It remains unclear, however, whether combining these improvements could provide meaningful performance gains overall. This dissertation examines several extensions to the canonical Transformer as memory mechanisms in reinforcement learning and empirically studies their combination, which we term the Integrated Transformer. Our findings support prior work that suggests gating variants of the Transformer architecture may outperform LSTMs as memory networks in reinforcement learning. However, our results indicate that while gated variants of the Transformer architecture may be able to model dependencies over a longer temporal horizon, these models do not necessarily outperform LSTMs when tasked with retaining increasing quantities of information. |
| format | Thesis |
| id | oai:open.uct.ac.za:11427/35840 |
| institution | University of Cape Town (South Africa) |
| language | eng |
| last_indexed | 2026-06-10T12:35:53.219Z |
| license_str | Not specified — see source repository |
| provenance_str_mv | Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository |
| publishDate | 2022 |
| publishDateRange | 2022 |
| publishDateSort | 2022 |
| publisher | Department of Mathematics and Applied Mathematics |
| publisherStr | Department of Mathematics and Applied Mathematics |
| record_format | dspace |
| source_str | UCTD — University of Cape Town Open Access Repository |
| spelling | oai:open.uct.ac.za:11427/35840 Evaluating transformers as memory systems in reinforcement learning Makkink, Thomas Shock, Jonathan Pretorius, Arnu Mathematics and Applied Mathematics Memory is an important component of effective learning systems and is crucial in non-Markovian as well as partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning, however, the success of transformers in natural language processing tasks has highlighted a promising and viable alternative. Memory in reinforcement learning is particularly difficult as rewards are often sparse and distributed over many time steps. Early research into transformers as memory mechanisms for reinforcement learning indicated that the canonical model is not suitable, and that additional gated recurrent units and architectural modifications are necessary to stabilize these models. Several additional improvements to the canonical model have further extended its capabilities, such as increasing the attention span, dynamically selecting the number of per-symbol processing steps and accelerating convergence. It remains unclear, however, whether combining these improvements could provide meaningful performance gains overall. This dissertation examines several extensions to the canonical Transformer as memory mechanisms in reinforcement learning and empirically studies their combination, which we term the Integrated Transformer. Our findings support prior work that suggests gating variants of the Transformer architecture may outperform LSTMs as memory networks in reinforcement learning. However, our results indicate that while gated variants of the Transformer architecture may be able to model dependencies over a longer temporal horizon, these models do not necessarily outperform LSTMs when tasked with retaining increasing quantities of information. 
2022-02-23T15:40:59Z 2022-02-23T15:40:59Z 2021 2022-02-23T15:34:07Z Master Thesis Masters MSc http://hdl.handle.net/11427/35840 eng application/pdf Department of Mathematics and Applied Mathematics Faculty of Science |
| spellingShingle | Mathematics and Applied Mathematics Makkink, Thomas Evaluating transformers as memory systems in reinforcement learning |
| thesis_degree_str | Master's |
| title | Evaluating transformers as memory systems in reinforcement learning |
| title_full | Evaluating transformers as memory systems in reinforcement learning |
| title_fullStr | Evaluating transformers as memory systems in reinforcement learning |
| title_full_unstemmed | Evaluating transformers as memory systems in reinforcement learning |
| title_short | Evaluating transformers as memory systems in reinforcement learning |
| title_sort | evaluating transformers as memory systems in reinforcement learning |
| topic | Mathematics and Applied Mathematics |
| url | http://hdl.handle.net/11427/35840 |
| work_keys_str_mv | AT makkinkthomas evaluatingtransformersasmemorysystemsinreinforcementlearning |