High-resolution virtual try-on with garment extraction using generative adversarial networks

Image-based virtual try-on aims to depict an individual wearing a garment not originally worn by them. While existing literature predominantly focuses on garments from standalone images, this research addresses the use of images where the garment is already being worn by another individual. The stud...

Bibliographic Details
Main Author:	Charters, Daniel J
Other Authors:	Britz, Stefan S
Format:	Thesis
Language:	English
Published:	Department of Statistical Sciences 2025
Subjects:	data science

_version_	1867613183953338368
access_status_str	Open Access
author	Charters, Daniel J
author2	Britz, Stefan S
author_browse	Britz, Stefan S Charters, Daniel J
author_facet	Britz, Stefan S Charters, Daniel J
author_sort	Charters, Daniel J
collection	Thesis
description	Image-based virtual try-on aims to depict an individual wearing a garment not originally worn by them. While existing literature predominantly focuses on garments from standalone images, this research addresses the use of images where the garment is already being worn by another individual. The study bridges a notable gap, as most current systems are tailored for standalone garment images. The proposed system, given a pair of high-resolution images, extracts the garment from one, refines it using context-aware image inpainting, and subsequently transfers it onto the second image's subject. The methodology incorporates various off-the-shelf models, notably Part Grouping Network (PGN), DensePose, and OpenPose, for pre-processing. A state-of-the-art context-aware inpainting model refines the garments, and the final synthesis leverages the HR-VITON architecture, producing images at a resolution of 768 × 1024. Distinctively, our model processes both standalone and garment-on-person images. The models were evaluated on 2 032 high-resolution images under both paired and unpaired conditions. Performance was assessed using RMSE, Peak Signal-to-Noise Ratio (PSNR), Learned Perceptual Image Patch Similarity (LPIPS), Structural Similarity (SSIM), Inception Score (IS), Fréchet Inception Distance (FID), and Kernel Inception Distance (KID). Benchmarked against HR-VITON, ACGPN, and CP-VTON, our model slightly trailed HR-VITON but notably surpassed ACGPN and CP-VTON. In realistic, unpaired conditions, the model achieved an IS of 3.152, an FID of 15.3, and a KID of 0.0063, compared with an IS of 3.398, an FID of 11.93, and a KID of 0.0034 achieved by HR-VITON on the same data. ACGPN achieved an FID of 43.29 and a KID of 0.0373, while CP-VTON achieved an FID of 43.28 and a KID of 0.0376. IS was not measured for either ACGPN or CP-VTON. An ablation study underscored the importance of context-aware inpainting in our network. The findings highlight the model's ability to generate convincing, high-resolution virtual try-on images from garment-on-person extractions, addressing a prevalent gap in the literature and offering tangible applications in high-resolution virtual try-on image generation.
format	Thesis
id	oai:open.uct.ac.za:11427/40827
institution	University of Cape Town (South Africa)
language	eng
last_indexed	2026-06-10T12:32:06.010Z
license_str	Not specified — see source repository
provenance_str_mv	Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate	2025
publishDateRange	2025
publishDateSort	2025
publisher	Department of Statistical Sciences
publisherStr	Department of Statistical Sciences
record_format	dspace
source_str	UCTD — University of Cape Town Open Access Repository
spelling	oai:open.uct.ac.za:11427/40827 High-resolution virtual try-on with garment extraction using generative adversarial networks Charters, Daniel J Britz, Stefan S Bernicchi, Dino data science Image-based virtual try-on aims to depict an individual wearing a garment not originally worn by them. While existing literature predominantly focuses on garments from standalone images, this research addresses the use of images where the garment is already being worn by another individual. The study bridges a notable gap as most current systems are tailored for standalone garment images. The proposed system, given a pair of high-resolution images, extracts the garment from one, refines it using context-aware image inpainting, and subsequently transfers it onto the second image's subject. The methodology incorporates various off-the-shelf models, notably Part Grouping Network (PGN), Densepose, and OpenPose for pre-processing. A state-of-the-art context-aware inpainting model refines the garments, and the final synthesis leverages the HR-VITON architecture, producing images at a resolution of 768 × 1024. Distinctively, our model processes both standalone and garment-on-person images. Evaluating the models involves testing on 2 032 high-resolution images under both paired and unpaired conditions. Metrics such as RMSE, Peak Signal-to-Noise Ratio (PSNR), Learned Perceptual Image Patch Similarity (LPIPS), Structural Similarity (SSIM), Inception Score (IS), Fréchet Inception Distance (FID), and Kernel Inception Distance (KID) assessed the model's prowess. Benchmarked against HR-VITON, ACGPN, and CP-VTON, our model slightly trailed HR-VITON but notably surpassed ACGPN and CP-VTON. In realistic, unpaired conditions, the model achieved an IS of 3.152, an FID of 15.3, and a KID of 0.0063. This is compared to an IS of 3.398, an FID of 11.93, and a KID of 0.0034 achieved by HR-VITON on the same data. 
ACGPN has an FID of 43.29, and a KID of 0.0373, while CP-VTON has an FID of 43.28, while it has a KID of 0.0376. IS is not measured for both ACGPN and CP-VTON. An ablation study underscored the importance of context-aware inpainting in our network. The findings highlight the model's ability to generate convincing, high-resolution virtual try-on images from garment-on-person extractions, addressing a prevalent gap in the literature and offering tangible applications in high-resolution virtual try-on image generation. 2025-01-23T09:17:42Z 2025-01-23T09:17:42Z 2024 2025-01-23T08:00:21Z Thesis / Dissertation Masters MSc http://hdl.handle.net/11427/40827 eng application/pdf Department of Statistical Sciences Faculty of Science University of Cape Town
spellingShingle	data science Charters, Daniel J High-resolution virtual try-on with garment extraction using generative adversarial networks
thesis_degree_str	Master's
title	High-resolution virtual try-on with garment extraction using generative adversarial networks
title_full	High-resolution virtual try-on with garment extraction using generative adversarial networks
title_fullStr	High-resolution virtual try-on with garment extraction using generative adversarial networks
title_full_unstemmed	High-resolution virtual try-on with garment extraction using generative adversarial networks
title_short	High-resolution virtual try-on with garment extraction using generative adversarial networks
title_sort	high resolution virtual try on with garment extraction using generative adversarial networks
topic	data science
url	http://hdl.handle.net/11427/40827
work_keys_str_mv	AT chartersdanielj highresolutionvirtualtryonwithgarmentextractionusinggenerativeadversarialnetworks

Similar Items