Detection of HTTPS malware traffic without decryption

Bibliographic Details
Main Author: Nyathi, Miranda
Other Authors: Hutchison, Andrew
Format: Thesis
Language: English
Published: Department of Computer Science 2022
Subjects: computer science
_version_ 1867613190259474432
access_status_str Open Access
author Nyathi, Miranda
author2 Hutchison, Andrew
author_browse Hutchison, Andrew
Nyathi, Miranda
author_facet Hutchison, Andrew
Nyathi, Miranda
author_sort Nyathi, Miranda
collection Thesis
description Each year the world's dependence on the internet grows, particularly for critical infrastructure and social connections. More than 80% of internet traffic is encrypted using the Transport Layer Security (TLS) protocol, and this share is predicted to increase [8]. However, threat actors are also increasingly using TLS to hide malicious activities such as command and control, loading malware into a network, and exfiltrating sensitive data. This poses a challenge to security professionals because the traditional techniques used to detect HTTP malware cannot be applied to Hypertext Transfer Protocol Secure (HTTPS) encrypted malware. To manage this, companies are using a traditional method called Transport Layer Security Inspection (TLSI), which decrypts packets to perform full packet inspection. TLSI is computationally expensive and complex and, above all, it violates users' privacy. Researchers from Cisco have proposed that malicious encrypted traffic can be identified by techniques other than TLSI, because the unencrypted TLS handshake messages, certificates, and flow metadata of malicious traffic are distinct from those of benign traffic. These differences can be used in machine learning to classify encrypted traffic as malicious or benign [35]. This dissertation assesses the feasibility and effectiveness of the proposed alternative to TLSI. We sourced thousands of malware and benign flows and used the Cisco tool Joy to extract features from the unencrypted TLS handshake messages, certificates, and flow metadata. To understand how malicious and benign flows differ, we explored the data, summarized the unique values of the features in our datasets, and compared them with the feature values from the Cisco datasets used in the research paper [35]. We then selected the features with the most differentiating power in our dataset and used them as inputs to two supervised classifiers: logistic regression and random forest. The classifiers were trained and tested on offline datasets of benign and malware features, and the random forest performed better, with an average accuracy of 98.92%. We conclude that it is viable and effective to detect HTTPS malware without TLSI.
format Thesis
id oai:open.uct.ac.za:11427/36527
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:32:12.136Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2022
publishDateRange 2022
publishDateSort 2022
publisher Department of Computer Science
publisherStr Department of Computer Science
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/36527 Detection of HTTPS malware traffic without decryption Nyathi, Miranda Hutchison, Andrew computer science 2022-06-23T15:32:51Z 2022-06-23T15:32:51Z 2022 2022-06-23T14:34:06Z Master Thesis Masters MSc http://hdl.handle.net/11427/36527 eng application/pdf Department of Computer Science Faculty of Science
spellingShingle computer science
Nyathi, Miranda
Detection of HTTPS malware traffic without decryption
thesis_degree_str Master's
title Detection of HTTPS malware traffic without decryption
title_full Detection of HTTPS malware traffic without decryption
title_fullStr Detection of HTTPS malware traffic without decryption
title_full_unstemmed Detection of HTTPS malware traffic without decryption
title_short Detection of HTTPS malware traffic without decryption
title_sort detection of https malware traffic without decryption
topic computer science
url http://hdl.handle.net/11427/36527
work_keys_str_mv AT nyathimiranda detectionofhttpsmalwaretrafficwithoutdecryption
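
Illustrative note on the classification step: the description above says that features from the unencrypted TLS handshake, certificates, and flow metadata were fed to supervised classifiers. The following is a minimal sketch of that idea under stated assumptions, not the thesis's code. It covers logistic regression only (the random forest is omitted), trains it with plain gradient descent, and uses synthetic data. The record keys (`offered_cipher_suites`, `extensions`, `self_signed_cert`, `cert_validity_days`), the value distributions, and all function names are hypothetical; they are not Joy's output schema and not the features the thesis selected.

```python
import math
import random


def flow_features(flow):
    """Turn one flow's unencrypted metadata into a scaled numeric vector.

    The dictionary keys are hypothetical and are not Joy's output schema.
    """
    return [
        len(flow["offered_cipher_suites"]) / 20.0,
        len(flow["extensions"]) / 10.0,
        1.0 if flow["self_signed_cert"] else 0.0,
        flow["cert_validity_days"] / 1000.0,
    ]


def synthetic_flow(rng, malicious):
    """Draw a made-up flow; every distribution here is invented."""
    if malicious:
        suites, exts = rng.randint(3, 12), rng.randint(0, 3)
        self_signed, validity = rng.random() < 0.7, rng.randint(30, 400)
    else:
        suites, exts = rng.randint(10, 30), rng.randint(6, 14)
        self_signed, validity = rng.random() < 0.05, rng.randint(90, 825)
    return {
        "offered_cipher_suites": list(range(suites)),
        "extensions": list(range(exts)),
        "self_signed_cert": self_signed,
        "cert_validity_days": validity,
    }


def make_dataset(rng, n):
    """Return n feature vectors and labels (1 = malicious, 0 = benign)."""
    labels = [i % 2 for i in range(n)]
    X = [flow_features(synthetic_flow(rng, bool(y))) for y in labels]
    return X, labels


def predict_proba(w, b, x):
    """Logistic model: probability that a feature vector is malicious."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=1000, lr=1.0):
    """Fit logistic regression with full-batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of flows whose thresholded prediction matches the label."""
    hits = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)


def run_demo(seed=0):
    """Train on one synthetic sample and score on a second, held-out one."""
    rng = random.Random(seed)
    X_train, y_train = make_dataset(rng, 400)
    X_test, y_test = make_dataset(rng, 200)
    w, b = train_logistic(X_train, y_train)
    return accuracy(w, b, X_test, y_test)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic flows: {run_demo():.3f}")
```

Because the data are synthetic and deliberately easy to separate, the printed accuracy says nothing about real traffic and is not comparable to the 98.92% reported in the description. The sketch only shows the shape of the pipeline: metadata in, feature vector out, classifier trained and scored on held-out flows.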