Please use this identifier to cite or link to this item: https://hdl.handle.net/1946/47691
Solidity is a high-level, Turing-complete programming language widely used for developing smart contracts on various blockchain platforms, including Ethereum and Binance Smart Chain. Because smart contracts often hold significant financial value, their security is crucial, and recent studies have focused on Solidity vulnerability detection and repair. In Solidity, inline assembly enables the integration of machine-level instructions directly into smart contracts. As this allows for machine-level optimizations, inline assembly is widely used in existing smart contracts. However, static source code analyzers designed for Solidity struggle to detect vulnerabilities specific to inline assembly. Currently, only one existing tool reports errors from within inline assembly, and it covers only two error types. This potentially impacts both contract security and optimization.
To address this gap, this thesis aims to implement a method for creating a dataset and for detecting the types of vulnerabilities within inline assembly and Solidity using Large Language Models (LLMs). The objective is to expand the scope of vulnerability detection to cover code fragments in both Solidity and Yul, the language used for Solidity's inline assembly.
First, we gathered 50,000 real-world smart contracts from Open Source Software (OSS) projects on GitHub. Next, we curated a dataset of vulnerabilities, consisting of vulnerable code fragments in Solidity and Yul, each labeled with its vulnerability type. The results show the proficiency of our method, CIPHER, in detecting vulnerabilities in Solidity and Yul, achieving an accuracy of over 90% and an F-score of over 90% on the notable re-entrancy vulnerabilities. As the current tools do not have this functionality, this method could be used to expand smart contract security beyond the current baseline.
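For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy and an F-score can be derived from binary confusion counts. It assumes the F-score is the balanced F1 score (the abstract does not say which variant is used), and the function name `binary_metrics` and the counts in the usage lines are hypothetical illustrations, not results or code from the thesis.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall and F1 from binary confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for one vulnerability class (e.g. re-entrancy).
    """
    total = tp + fp + fn + tn
    # Guard every division so empty classes yield 0.0 instead of raising.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall (assumed variant).
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only; not taken from the thesis.
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=90)
```

With these made-up counts, precision and recall are both 0.9, so accuracy and F1 also come out at about 0.9. The two metrics can diverge sharply when classes are imbalanced, which is why both are usually reported together.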
These results shed light on the importance of exploring the use of other LLM implementations on similar tasks. We also present the dataset with vulnerabilities detected from smart contracts, including labeled vulnerabilities in inline assembly, with the collection of contracts used, their bug reports, and a curated smaller set of specific vulnerabilities we used for training.
File name | Size | Access | Description | File type | |
---|---|---|---|---|---|
Waltteri_Nuutinen_MSc_Thesis.pdf | 1.13 MB | Open | Full text | | View/Open |