How Gender and Race Labels Are Applied to NFT Data Analysis
2. Methodology
We describe our methods for analyzing gender and race biases in the prices of NFTs. We first summarize our data collection process (Sections 2.1 and 2.2) and then describe how we statistically quantify the gender and racial biases among different NFTs (Section 2.3). The steps are shown in Figure 2. A more detailed description of the methods and implementation can be found in the Appendix.
2.1 Initial Data Collection
Our dataset consists of NFTs transacted on OpenSea, the primary marketplace for NFTs on Ethereum. We query the OpenSea “v1/collections” endpoint at the end of November 2022 [15] to retrieve NFT metadata and last sale prices. We choose 790 collections from the Kaggle Ethereum NFTs dataset [11] and NFTs from the top 30-day and all-time OpenSea volume leaderboards around November 2021. After the data collection process, we end up with ∼ 2.5 NFTs, each of which has been transacted.
2.2 Retrieving Race and Gender Labels
Many NFT collections do not represent humans, and therefore cannot be directly studied through the lens of race and gender. We select collections that have metadata with the words “male” and “female” and end with a total of 44 such collections with gender labels representing different avatars. Statistics on these 44 collections with gender labels can be found in Table 6 in Appendix A.2.
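The selection step can be illustrated with a minimal sketch. The metadata layout below (a collection name mapped to a list of trait dictionaries) and the names `collections` and `gender_values` are assumptions made for illustration; they are not the OpenSea schema or the authors' code.

```python
# Hypothetical metadata layout: each collection name maps to a list of NFTs,
# and each NFT is a dict of trait name -> trait value.
collections = {
    "avatars_a": [{"Gender": "Male", "Hat": "Cap"}, {"Gender": "Female", "Hat": "None"}],
    "landscapes": [{"Biome": "Desert"}, {"Biome": "Forest"}],
    "avatars_b": [{"Type": "female"}, {"Type": "male"}, {"Type": "alien"}],
}


def gender_values(nfts):
    """Collect the gender words that appear as exact trait values."""
    found = set()
    for traits in nfts:
        for value in traits.values():
            word = str(value).strip().lower()
            if word in ("male", "female"):
                found.add(word)
    return found


# Keep only collections whose metadata contains both words.
gendered = [
    name for name, nfts in collections.items()
    if gender_values(nfts) == {"male", "female"}
]
print(gendered)
```

The sketch matches whole trait values rather than substrings, because "female" contains "male" and a substring search would count female-only collections as having both labels.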
As far as we know, this is the first NFT dataset with gender labels across many collections. However, race labels often do not exist in the metadata, so we are limited to only CryptoPunks, Avastar, and Dynamic Duelers for collections with race labels.
2.3 Statistical Tools to Analyze Bias in Gender and Race
To determine the statistical significance of the hypothesis that female NFTs are sold for less than male NFTs, we run both paired and unpaired one-sided Student’s t-tests [22].
Unpaired vs Paired t-test: For the unpaired t-test, we compare the mean of all NFT sale prices for male versus female NFTs. For the paired t-test, we calculate t-statistics on the paired difference of male and female prices marked to the daily mean price and the weekly mean price of the NFT. The paired t-test isolates the male versus female price difference by controlling for price variation across time.
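The difference between the two tests can be sketched on toy data. The code below is one possible reading of the pairing, assuming that sales are grouped by day and that each day contributes one difference between its mean female and mean male price. The `(day, gender, price)` tuple layout, the function names, and the prices are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(female, male):
    """Student's pooled-variance t statistic for mean(female) - mean(male)."""
    n_f, n_m = len(female), len(male)
    pooled = ((n_f - 1) * variance(female) + (n_m - 1) * variance(male)) / (n_f + n_m - 2)
    return (mean(female) - mean(male)) / sqrt(pooled * (1 / n_f + 1 / n_m))


def paired_t(sales):
    """t statistic on per-day differences (mean female price - mean male price)."""
    by_day = defaultdict(lambda: {"female": [], "male": []})
    for day, gender, price in sales:
        by_day[day][gender].append(price)
    # Only days with sales of both genders can form a pair.
    diffs = [
        mean(groups["female"]) - mean(groups["male"])
        for groups in by_day.values()
        if groups["female"] and groups["male"]
    ]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Toy sales: the market price doubles every day, and female NFTs sell
# slightly below male NFTs on each day.
sales = [
    (1, "female", 1.0), (1, "male", 1.2),
    (2, "female", 2.0), (2, "male", 2.3),
    (3, "female", 4.0), (3, "male", 4.2),
    (4, "female", 8.0), (4, "male", 8.5),
]
female = [p for _, g, p in sales if g == "female"]
male = [p for _, g, p in sales if g == "male"]

print("unpaired t:", unpaired_t(female, male))
print("paired t:  ", paired_t(sales))
```

On this toy data the day-to-day price swing dominates the unpaired statistic, while the paired statistic is far more negative because each difference is taken within a single day. A one-sided p-value would then be read from the lower tail of the t distribution with the matching degrees of freedom.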
Log Transformation: Because t-tests assume that the data are normally distributed, we apply a log transformation to the prices. With rare NFTs worth significantly more than common NFTs, NFT price distributions tend to follow a power law distribution [14]. Inspired by how stock prices follow a log-normal distribution [1], we applied the same transformation and found the distribution of the log of price to be closer to normal. We refer to running the t-test on the log of prices as the log t-test.
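The effect of the transformation can be checked on a small example. This is a sketch with made-up prices that double at each step, and it uses sample skewness as a rough symmetry measure; the paper does not say which diagnostic the authors used, so `skewness` is an assumption for illustration.

```python
from math import log
from statistics import mean


def skewness(values):
    """Population skewness: third central moment over variance^(3/2)."""
    mu = mean(values)
    m2 = mean([(v - mu) ** 2 for v in values])
    m3 = mean([(v - mu) ** 3 for v in values])
    return m3 / m2 ** 1.5


# Hypothetical prices spanning several orders of magnitude, with a few
# expensive items pulling the right tail out.
prices = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8]
log_prices = [log(p) for p in prices]

print("skewness of prices:    ", skewness(prices))
print("skewness of log prices:", skewness(log_prices))
```

The raw prices are strongly right-skewed, while their logs are evenly spaced and therefore symmetric. Real sale data will not be this clean, but the same comparison shows whether the log scale is the better input for a t-test.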
Outlier Trimming: Outliers may occur due to very high selling prices for rare NFTs or very low selling prices caused by human error while listing. We address outliers using Winsorization [19], or trimming outliers past a certain percentile. We report results for the 0.1%, 1%, 2.5%, and 5% percentiles.
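The text mentions both Winsorization and trimming, which treat the tails differently: Winsorization clips extreme values to the cutoff, while trimming drops them. The sketch below shows both under a simple nearest-rank percentile rule. The helper names and the toy prices are assumptions, and the paper does not state which percentile convention the authors used.

```python
def percentile_bounds(values, pct):
    """Cutoffs that leave roughly pct percent of the points in each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * pct / 100)  # number of points beyond each cutoff
    return ordered[k], ordered[len(ordered) - 1 - k]


def winsorize(values, pct):
    """Clip values outside the cutoffs to the cutoffs; the sample size is unchanged."""
    lo, hi = percentile_bounds(values, pct)
    return [min(max(v, lo), hi) for v in values]


def trim(values, pct):
    """Drop values outside the cutoffs; the sample size shrinks."""
    lo, hi = percentile_bounds(values, pct)
    return [v for v in values if lo <= v <= hi]


# 100 hypothetical prices: one mistaken low listing, one very rare NFT.
prices = [0.001] + [float(i) for i in range(1, 99)] + [5000.0]

winsorized = winsorize(prices, 1)
trimmed = trim(prices, 1)
print(len(winsorized), min(winsorized), max(winsorized))
print(len(trimmed), min(trimmed), max(trimmed))
```

Both variants remove the influence of the two extreme listings. The choice matters mainly for sample size and for how much weight remains at the cutoffs.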
The approach described above is also used to compare the prices of light and dark CryptoPunks. We report results from combinations of different t-tests and outlier detection methods to show our conclusion remains consistent regardless of the way we conduct the statistical significance test. For the figures and statistics in this paper, unless otherwise stated, we remove outliers at the 2.5% percentile.