An African Genome Variation Database and its applications in human diversity and health

African genomes exhibit the highest levels of sequence and haplotype diversity of all extant human populations. A combination of historical and geographical factors has contributed to the high level of genetic diversity in ancestral populations in Africa. Additionally, a series of concom...

Bibliographic Details
Main Author: Todt, Davis
Other Authors: Mulder, Nicola
Format: Thesis
Language: English
Published: Department of Integrative Biomedical Sciences (IBMS) 2022
Subjects: Bioinformatics
_version_ 1867614371941711872
access_status_str Open Access
author Todt, Davis
author2 Mulder, Nicola
author_browse Mulder, Nicola
Todt, Davis
author_facet Mulder, Nicola
Todt, Davis
author_sort Todt, Davis
collection Thesis
description African genomes exhibit the highest levels of sequence and haplotype diversity of all extant human populations. A combination of historical and geographical factors has contributed to the high level of genetic diversity in ancestral populations in Africa. Additionally, a series of concomitant migration events out of Africa, with founder populations harbouring only a subset of this genetic variation, has contributed to the relatively lower genetic diversity observed in non-Africans. Population genetic studies have refined our understanding of human evolutionary history, and clinical genomic studies have resulted in improved patient outcomes. However, despite the increased throughput and decreased cost afforded by next-generation sequencing (NGS), and despite the relatively higher genetic variation in Africans, relatively little of the genomic data currently available is representative of diverse African populations. This may result in adverse outcomes for minority populations with little representation in clinical databases. Given the under-representation of African genetic variation and the importance of highlighting and further characterizing it, the objectives of this project were to design, develop and deploy a proof-of-concept database and web application, the African Genome Variation Database (AGVD), for the storage, analysis and visualization of African genetic variant data. The AGVD was developed according to software industry design standards. The project also explored available genomic tools and databases in order to leverage existing software solutions where suitable. Additionally, relevant data sets were identified for use during testing and validation of the pilot phase of the project. To this end, the open-access 1000 Genomes Project phase 3 dataset was selected and the genotypes for several chromosomes were loaded into the AGVD.
The AGVD leverages the scalable, performant, and open-source genomics engine OpenCGA for data storage and analysis. A custom front-end web application was developed by applying a novel approach to render and serve static Vue.js assets from the Python Flask microframework. The web application supports rich search and filtering of loaded variants and allows end-users to interactively visualize annotations of genomic loci and allele changes, variant type, associated gene and transcript consequences, clinical significance, and allele frequency information for all annotated cohorts. A bespoke REST API also supports future analytical functionality. The AGVD has demonstrated proof of concept in the secure and scalable storage and visualization of African genomic data, providing a viable solution for H3ABioNet to extend in future iterations of the project and a valuable resource for researchers exploring African genetic variation.
format Thesis
id oai:open.uct.ac.za:11427/36188
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:50:59.472Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2022
publishDateRange 2022
publishDateSort 2022
publisher Department of Integrative Biomedical Sciences (IBMS)
publisherStr Department of Integrative Biomedical Sciences (IBMS)
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/36188 An African Genome Variation Database and its applications in human diversity and health Todt, Davis Mulder, Nicola Bioinformatics 2022-03-22T09:36:13Z 2022-03-22T09:36:13Z 2021 2022-03-22T06:07:05Z Master Thesis Masters MSc http://hdl.handle.net/11427/36188 eng application/pdf Department of Integrative Biomedical Sciences (IBMS) Faculty of Health Sciences
spellingShingle Bioinformatics
Todt, Davis
An African Genome Variation Database and its applications in human diversity and health
thesis_degree_str Master's
title An African Genome Variation Database and its applications in human diversity and health
title_full An African Genome Variation Database and its applications in human diversity and health
title_fullStr An African Genome Variation Database and its applications in human diversity and health
title_full_unstemmed An African Genome Variation Database and its applications in human diversity and health
title_short An African Genome Variation Database and its applications in human diversity and health
title_sort african genome variation database and its applications in human diversity and health
topic Bioinformatics
url http://hdl.handle.net/11427/36188
work_keys_str_mv AT todtdavis anafricangenomevariationdatabaseanditsapplicationsinhumandiversityandhealth
AT todtdavis africangenomevariationdatabaseanditsapplicationsinhumandiversityandhealth