An In-Depth Analysis of Smart Contract Forking in the Ethereum Ecosystem

Posted by AntChain Open Labs on 2024-01-10

Ethereum, one of the biggest ecosystems in the cryptocurrency world, has grown rapidly over the last several years. As of January 2024, there were about 64.34 million active smart contracts on Ethereum, according to the ZAN KYT database.

The rapid growth in smart contracts reflects a booming Ethereum developer community. The development toolchain, smart contract libraries, and open-source smart contract projects are all gradually maturing. We have noticed that, when writing smart contracts, more and more developers draw on contracts or flagship projects that have become de facto standards on Ethereum, such as OpenZeppelin, Uniswap, and Aave. This prompted the ZAN team to ask: to what extent do security issues in prominent projects affect the overall security of the Ethereum ecosystem? And can the affected projects be notified promptly so that they can implement countermeasures?

In the traditional software domain, the concept most closely associated with this phenomenon is Software Supply Chain Security. It is about securing every stage of software development, deployment, and maintenance, from source code to the final product. It covers, among other things, source code management, development tools, dependency libraries, build systems, application deployment, and update mechanisms. A software supply chain attack can implant malicious code or vulnerabilities at any stage of product development, creating security risks when users deploy and use the software. Over the past few years, numerous large-scale security incidents have been attributed to software supply chain security, such as those involving Log4j, OpenSSL, and many other widely used dependency libraries. These incidents exposed the downstream software and systems that used these libraries to security risks.

Software supply chain security has long been a prominent topic in software security research, with numerous emergency protocols and auxiliary security measures already in place. When a supply chain security problem arises, downstream software systems can be informed promptly so that the vulnerability can be fixed or the version updated. In the realm of smart contracts, however, the corresponding emergency protocols and security measures remain mostly undeveloped. We built a supply chain security tool for Ethereum smart contracts and used it to examine the supply chain conditions of Ethereum smart contracts, yielding some intriguing results.

Case Study

To highlight the significance of blockchain supply chain security, we present a case from December 2023.

OpenZeppelin formally released a significant security alert to the community on December 8, 2023, stating that projects using multicall-like methods (methods that make delegate calls into their own contract while external users retain control over the calldata) in conjunction with the ERC-2771 standard are vulnerable to arbitrary address spoofing attacks.
(https://twitter.com/openzeppelin/status/1732913331265036475?s=46&t=zDQbmeyWt2t9a8SGpOvbtw)

Less than twenty-four hours after the first compromised contract was attacked, eight more contracts suffered similar attacks, resulting in losses of about US$200,000. According to the ZAN team’s immediate analysis of the attack, the eight contracts had very similar source code. Furthermore, using our supply chain tools and our data set of smart contracts with verified source code, we discovered 62 smart contracts on Mainnet that are very similar to the compromised contract but have not been publicly disclosed.

The figure below shows the fork relationship between two affected contracts and the first attacked contract. There is a one-to-one correspondence between the contract files at the two affected addresses and the source files of the first attacked contract, which indicates that their source code is almost identical and that they are exposed to the same attack. This real-world incident is a reminder of how crucial it is to build smart contract supply chain security analysis tools and emergency mechanisms.

[Figure: OpenZeppelin ERC2771-Multicall contract vulnerability]
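
To make the mechanism more concrete, here is a simplified Python model of the issue. It is not Solidity and not OpenZeppelin's code. It assumes the commonly documented ERC-2771 behavior in which a contract that trusts a forwarder reads the original sender from the last 20 bytes of the calldata. The names `Token`, `transfer_all`, `TRUSTED_FORWARDER`, `ATTACKER`, and `VICTIM` are hypothetical and exist only for this illustration.

```python
# Simplified model of ERC-2771 sender extraction combined with multicall.
# All names are illustrative assumptions; this is not OpenZeppelin's code.

TRUSTED_FORWARDER = bytes.fromhex("f0" * 20)
ATTACKER = bytes.fromhex("aa" * 20)
VICTIM = bytes.fromhex("bb" * 20)


def msg_sender(caller, calldata):
    """ERC-2771 style: when the caller is the trusted forwarder, the real
    sender is read from the last 20 bytes of the calldata."""
    if caller == TRUSTED_FORWARDER and len(calldata) >= 20:
        return calldata[-20:]
    return caller


class Token:
    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer_all(self, caller, calldata, to):
        # Moves the whole balance of the resolved sender to `to`.
        sender = msg_sender(caller, calldata)
        amount = self.balances.get(sender, 0)
        self.balances[sender] = 0
        self.balances[to] = self.balances.get(to, 0) + amount

    def multicall(self, caller, subcalls):
        # Models delegatecall: the caller is unchanged (still the forwarder),
        # but each sub-call's calldata is whatever the user supplied.
        for calldata, to in subcalls:
            self.transfer_all(caller, calldata, to)


token = Token({VICTIM: 100, ATTACKER: 0})
# The attacker crafts inner calldata that ends with the victim's address,
# so the sub-call resolves the victim as the sender.
spoofed_inner = b"transfer_all" + VICTIM
token.multicall(TRUSTED_FORWARDER, [(spoofed_inner, ATTACKER)])
```

In this model the forwarder only vouches for the outer call, yet the inner call still sees the forwarder as its caller and therefore trusts the attacker-chosen suffix. After the call, the victim's balance has moved to the attacker.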

Our Findings

We studied Ethereum contracts that have been called at least 10 times. We collected 4,379,029 contracts with verified source code and split them into four groups by call count:

image

● 380 contracts that have been called more than 1 million times
● 3,877 contracts that have been called more than 100,000 times
● 25,012 contracts that have been called more than 10,000 times
● 4,349,760 contracts that have been called at least 10 times

The groups do not overlap: each contract is counted only in the highest band it reaches, and the four counts add up to 4,379,029.
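
The grouping above can be sketched in a few lines of Python. The function name `call_band` and the band labels are our own for illustration. The thresholds follow the wording of the list ("more than" for the top three bands, "at least 10" for the last).

```python
# Lower bounds of the top three call-count bands, highest first.
BANDS = [
    (1_000_000, "more than 1,000,000 calls"),
    (100_000, "more than 100,000 calls"),
    (10_000, "more than 10,000 calls"),
]

# Contract counts per band as reported in the list above.
REPORTED = {
    "more than 1,000,000 calls": 380,
    "more than 100,000 calls": 3_877,
    "more than 10,000 calls": 25_012,
    "at least 10 calls": 4_349_760,
}


def call_band(calls):
    """Return the highest band a contract reaches, or None below 10 calls."""
    for lower_bound, label in BANDS:
        if calls > lower_bound:
            return label
    return "at least 10 calls" if calls >= 10 else None


# The four reported counts add up to the size of the whole data set.
total_contracts = sum(REPORTED.values())
```

Assigning each contract to the highest band it reaches keeps the groups disjoint, which is consistent with the reported counts summing to 4,379,029.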

Q1. How many contracts have forked other contracts?

We first set the threshold of our similarity-matching algorithm to 90%, which is a high value by our evaluation criteria. If the similarity between two contracts exceeds 90%, we consider them identical or different only in minor modifications.
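
As a rough illustration of what a 90% threshold means in practice, the sketch below compares two Solidity sources token by token using Python's standard `difflib`. This is an assumption made for illustration only: the matching algorithm itself is not described here, and a production tool would likely be more robust. The helper names `normalize`, `similarity`, and `is_fork` are hypothetical.

```python
import re
from difflib import SequenceMatcher

THRESHOLD = 0.90  # the 90% threshold discussed above


def normalize(source):
    """Strip comments and whitespace, then split the source into tokens.
    Note: this naive comment stripping would also cut '//' inside strings."""
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", " ", source)
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def similarity(a, b):
    """Token-level similarity in [0, 1]; 1.0 means identical token streams."""
    matcher = SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()


def is_fork(a, b, threshold=THRESHOLD):
    return similarity(a, b) > threshold


ORIGINAL = """
// SPDX-License-Identifier: MIT
contract Vault {
    mapping(address => uint256) public balances;
    function deposit() external payable { balances[msg.sender] += msg.value; }
}
"""

# Same code with a different comment and layout.
COPY = """
/* copied from somewhere */
contract Vault {
    mapping(address => uint256) public balances;
    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }
}
"""

UNRELATED = "contract Counter { uint256 public n; function inc() external { n += 1; } }"
```

Here `is_fork(ORIGINAL, COPY)` is true because comments and layout are ignored, while `is_fork(ORIGINAL, UNRELATED)` is false.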

● Finding 1:
Out of the 3,877 contracts with over 100,000 calls, 1,233 (about 32%) are derived from other contracts. Of the 25,012 contracts with over 10,000 calls, 13,286 (53%) are derived from existing contracts. Among the remaining contracts with 10 or more calls, a substantial 65% (2,823,536 contracts) are forks of other projects. The pattern is clear: head projects show a significantly higher proportion of original code than non-head projects.
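
The shares in Finding 1 follow directly from the reported counts. A short check, using only the numbers stated above (the variable and function names are ours):

```python
# (forked contracts, total contracts) per call-count band, as reported above.
FORK_COUNTS = {
    "more than 100,000 calls": (1_233, 3_877),
    "more than 10,000 calls": (13_286, 25_012),
    "at least 10 calls": (2_823_536, 4_349_760),
}


def fork_share(forked, total):
    """Percentage of contracts in a band that are forks, to one decimal."""
    return round(100 * forked / total, 1)


shares = {band: fork_share(*counts) for band, counts in FORK_COUNTS.items()}
```

The resulting shares are 31.8%, 53.1%, and 64.9%, rising steadily as the call count drops.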

image

● Finding 2:
Moreover, we identified a pronounced trend in forked contracts. We sorted and analyzed contracts by how often they are forked. Across the roughly 4.38 million contracts, our analysis of 150 million similarity relationships showed that 67.46 million relationships pointed to the 200 most frequently forked contracts, constituting 44% of the total. This suggests that, when developing smart contracts, developers lean towards reusing the source code or logic of leading projects.

image

This deduction is further corroborated by the heatmap we generated. In this visualization, contract addresses are aggregated into points according to their fork relationships. The larger and darker the point, the greater the cumulative fork count of the aggregated contracts. From the heatmap, it is evident that only a small number of points are displayed in orange or red, signifying that a select few top contracts have been forked by a substantial number of other contracts.
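
The concentration measure behind Finding 2 can be sketched as follows. The sketch assumes the similarity relationships are available as `(forking_contract, forked_contract)` pairs. That representation and the function name `top_k_share` are assumptions for illustration.

```python
from collections import Counter


def top_k_share(edges, k):
    """Fraction of all fork relationships that point at the k most
    frequently forked contracts.

    edges: iterable of (forking_contract, forked_contract) pairs.
    """
    fork_counts = Counter(forked for _, forked in edges)
    total = sum(fork_counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in fork_counts.most_common(k))
    return top / total


# Toy data: three contracts fork X, one forks Y, one forks Z.
example_edges = [("a", "X"), ("b", "X"), ("c", "X"), ("d", "Y"), ("e", "Z")]
```

With the toy data, `top_k_share(example_edges, 1)` is 0.6: the single most-forked contract attracts 60% of all relationships. The analysis above reports the same kind of ratio with k = 200.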

Q2. How many projects could potentially be affected if security issues arise in standard contracts or common library contracts?

In the Ethereum ecosystem, OpenZeppelin has effectively become the standard for smart contracts. In addition, community projects such as Solady and Solmate provide library contracts focused on performance and gas optimization. We used our tools to scan the collected contracts and analyze their use of these library contracts. Because library contract source code is generally stable, we selected the latest versions of OpenZeppelin, OpenZeppelin-upgradeable, Solmate, and Solady as the comparison standards.

We found that 1,695,218 addresses use the OpenZeppelin contract library in their source code, approximately 38.7% of the contracts analyzed. Another 41,586 addresses use the OpenZeppelin-upgradeable contract library, around 0.95%. Combined, they make up almost 40%. Contracts using the Solmate and Solady libraries total 8,778. This implies that standard contracts significantly influence the security of smart contracts in the Ethereum ecosystem.
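
One simple way to approximate this kind of scan is to look at the import paths in a contract's verified source. The sketch below is a heuristic of our own for illustration and is not the comparison method described above, which matches contracts against library source code. It would miss flattened sources in which the library code has been pasted inline. The prefixes are the package paths commonly used for these libraries, and `libraries_used` is a hypothetical helper.

```python
import re

# Import-path prefixes commonly used for each library (assumed for this sketch).
LIBRARY_PREFIXES = {
    "OpenZeppelin-upgradeable": "@openzeppelin/contracts-upgradeable/",
    "OpenZeppelin": "@openzeppelin/contracts/",
    "Solmate": "solmate/",
    "Solady": "solady/",
}

# Captures the quoted path of a Solidity import statement.
IMPORT_PATH = re.compile(r"import\s+[^;]*?[\"']([^\"']+)[\"']\s*;")


def libraries_used(source):
    """Return the set of known libraries imported by a Solidity source."""
    paths = IMPORT_PATH.findall(source)
    return {
        name
        for name, prefix in LIBRARY_PREFIXES.items()
        if any(path.startswith(prefix) for path in paths)
    }


EXAMPLE_SOURCE = """
pragma solidity ^0.8.20;
import {IERC20} from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import "solady/src/utils/LibString.sol";
import "./Local.sol";
"""
```

For `EXAMPLE_SOURCE`, `libraries_used` reports OpenZeppelin and Solady and ignores the local import.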

Furthermore, we conducted a detailed analysis of the most frequently used library contracts within the OpenZeppelin contracts.

Contract Name                                          Library Contract Users
openzeppelin-contracts/token/ERC20/IERC20              1,190,866
openzeppelin-contracts/interfaces/IERC4626             1,118,187
openzeppelin-contracts/utils/Context                   1,115,134
openzeppelin-contracts/utils/Address                   928,030
openzeppelin-contracts/token/ERC721/IERC721Receiver    911,625

As the table above shows, the number of addresses using token standard contracts, such as ERC20 and ERC721, is on the order of one million. Beyond the contracts listed, the addresses using library contracts related to proxies or permission control, such as TransparentUpgradeableProxy, Ownable, and Initializable, also exceed 200,000.

● Finding 3:
De facto standard contracts, like OpenZeppelin, are employed by almost 40% of contracts in the current on-chain environment, indirectly impacting the security of contracts across nearly 1.7 million addresses.

Summary

Through a statistical analysis of smart contract fork relationships in the Ethereum ecosystem, we found that smart contracts fork from one another to a surprising extent. Modern software development relies increasingly on open-source software and publicly available code. On one hand, the nested reuse of open-source third-party libraries makes it hard for downstream developers to see the dependencies of upstream software clearly. On the other hand, developers who use open-source software may inadvertently introduce vulnerable code when reusing code or relying on large models.

For instance, in collaborative projects with multiple contributors, there is a risk if a developer carelessly copies sample code or flawed code from online sources. Likewise, a developer may directly fork an old project that harbors vulnerabilities and build on top of it. Both actions introduce risks to the software that are classified as supply chain risks. The smart contract supply chain within the Web3 ecosystem also shows a trend of centralization: a few prominent projects and de facto standard contracts can significantly affect the security of a large number of contracts. Research on supply chain security and the establishment of early warning mechanisms for smart contracts deserve more attention from the community.

Smart Contract Supply Chain Security Tool

In response to the escalating threats posed by smart contract supply chain security, the ZAN team has developed a smart contract supply chain security service aimed at tracking the fork relationships between smart contracts. This service, combined with our KYT transaction analysis and early warning capabilities, provides intelligent monitoring for smart contract security. It enables prompt notifications to affected projects and users when a contract faces a security vulnerability, safeguarding user funds.

ZAN’s smart contract supply chain security service uses technologies such as abstract syntax tree (AST) analysis of smart contracts, source code similarity analysis, and vulnerability matching. It supports the analysis and early warning of supply chain security risks associated with smart contracts, monitoring security along multiple dimensions:
● If the target contract is forked from a well-known project, early warnings can be issued to relevant stakeholders of the target contract when the forked project faces an attack.
● Leveraging our extensive security vulnerability library, we can search for on-chain contracts containing vulnerabilities and provide early warnings to relevant stakeholders of the affected contracts.
● By establishing a database of third-party library versions and vulnerabilities, relevant stakeholders of contracts utilizing these problematic third-party libraries can receive early warnings.
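
The first dimension above, warning forks when the forked project is attacked, can be sketched as a graph traversal. The sketch assumes fork relationships are stored as `(fork, upstream)` pairs and treats them as transitive, which is a simplification, since similarity between contracts is not strictly transitive. The function name `affected_contracts` and the toy addresses are hypothetical.

```python
from collections import defaultdict, deque


def affected_contracts(fork_edges, compromised):
    """Return every contract that directly or transitively forks the
    compromised contract.

    fork_edges: iterable of (fork, upstream) pairs.
    """
    forks_of = defaultdict(set)
    for fork, upstream in fork_edges:
        forks_of[upstream].add(fork)

    affected = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for fork in forks_of.get(current, ()):
            if fork != compromised and fork not in affected:
                affected.add(fork)
                queue.append(fork)
    return affected


# Toy data: B forks A, C forks B, D forks an unrelated contract X.
example_forks = [("B", "A"), ("C", "B"), ("D", "X")]
to_notify = affected_contracts(example_forks, "A")
```

In the toy data, an attack on A flags B and C for notification, while D is left alone.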

If you are interested in our supply chain security services, please feel free to contact us for a product trial.
ZAN Smart Contract Review: https://zan.top/home/contract-review
ZAN KYT: https://zan.top/home/know-your-transaction

About

AntChain Open Labs

AntChain Open Labs is a research center initiated by AntChain and world-leading computer scientists in the area of foundational trust technologies. It is dedicated to building a secure, transparent, and reliable Web3 infrastructure driven by innovative research, with the aim of advancing transformative services.
Website: https://openlabs-intl.antdigital.com/home

ZAN

ZAN, powered by AntChain Open Labs, provides solutions for Web3, such as Smart Contract Review, KYT, KYC, Node Service, and more.