There are three facets to Sybil defense: humans, algorithms and passports. Passport aims to proactively assess evidence of personhood, vetting individuals before they participate in a grant round. Algorithms are used to retrospectively identify Sybil behaviours among individuals already participating in the round. Humans are there to resolve disputes and provide subjective input where necessary, as well as to set thresholds and round eligibility requirements.
Each of these approaches ultimately aims to sort users into two categories: Sybil and non-Sybil. Real, honest users have high value to the Gitcoin ecosystem and should also have high trust scores because they exhibit clear non-Sybil behaviours in advance of, and during, a grant round. For example, they might have participated in many activities that can be represented in the form of passport stamps and may have reputation or credentials that can carry over from previous rounds. Sybils, on the other hand, have low (or even negative) value to the Gitcoin ecosystem and, assuming the system works well, should have a low apparent trust score. However, there are two substantial challenges:
- Sybils sometimes put considerable time, money and effort into increasing their apparent trust score without also adding value to the ecosystem. These create Type 2 errors (false negatives) in the Sybil defenses.
- Honest users who are new to Gitcoin or new to Web3 more generally may lack the credentials that give them high trust scores, despite being non-Sybil. These create Type 1 errors (false positives) in the Sybil defenses.
From the Sybil defense perspective, newbies and airdrop farmers/"cheap" Sybils probably look quite similar, while Web3-native contributors and sophisticated attackers probably look similar. This is a complex problem because raising the bar for proof-of-personhood to mitigate attacks also risks eliminating new users, while lowering the bar to enable new users makes the system easier to attack. The top-level challenge is to strike an appropriate balance between inclusivity and security.
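The trade-off can be illustrated with a toy threshold sweep. This is only a sketch: the `User` type, the 0-1 `trust_score` scale and the four example users are made up for illustration and do not come from GR15 data.

```python
from typing import NamedTuple


class User(NamedTuple):
    trust_score: float  # hypothetical score in [0, 1]
    is_sybil: bool      # ground truth, known only in this toy example


def error_rates(users: list[User], threshold: float) -> tuple[float, float]:
    """Return (share of honest users rejected, share of Sybils admitted)."""
    honest = [u for u in users if not u.is_sybil]
    sybils = [u for u in users if u.is_sybil]
    honest_rejected = sum(u.trust_score < threshold for u in honest) / len(honest)
    sybils_admitted = sum(u.trust_score >= threshold for u in sybils) / len(sybils)
    return honest_rejected, sybils_admitted


# Made-up population: newbies and cheap Sybils score low,
# Web3 natives and sophisticated attackers score high.
USERS = [
    User(0.2, False),  # newbie
    User(0.3, True),   # cheap Sybil / airdrop farmer
    User(0.8, False),  # Web3 native
    User(0.9, True),   # sophisticated attacker
]

for threshold in (0.1, 0.5, 0.95):
    rejected, admitted = error_rates(USERS, threshold)
    print(f"threshold={threshold}: honest rejected={rejected:.0%}, "
          f"Sybils admitted={admitted:.0%}")
```

In this toy population no single threshold drives both error rates to zero: a low bar admits every Sybil, a high bar rejects every honest user.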
Key takeaways
- Gitcoin Passport alone is not yet a robust Sybil Defense mechanism, but pairing it with Sybil scoring shows great potential.
- Creating Sybil scoring "legos" and allowing grant owners and round managers to apply them to sub-communities could be a big value-add to current Sybil defense.
Learning from data
We might hypothesize that:
- Developing more sophisticated algorithms will lead to more effective squelching of attackers. However, this runs the risk of creating an arms race since, in an open protocol, the algorithms can be known and gamed by attackers, necessitating constant updates.
- Introducing more passport stamps or weighting high-signal stamps more strongly in the Sybil defenses will eliminate more adversarial users from a grant round. However, this will also come at the cost of silencing more new users.
- Introducing more humans-in-the-loop to make subjective decisions will help reduce both type-1 and type-2 errors, but this is expensive, slow and vulnerable to bribery/corruption.
To test these hypotheses, we can interrogate GR15 data.
GR15 passport data
20,055 users used Gitcoin Passport in GR15. 87% of those users achieved a trust score greater than 100%, meaning their contributions were boosted in the matching pool. Passport users overall preferred Web2 stamps (Twitter, Facebook, GitHub, Discord, Google, LinkedIn) to Web3 stamps (ENS, POAP, BrightID etc). The median user collected 6 individual stamps. However, for the Web2 platforms there was a very steep drop-off in the frequency of stamps beyond initial verification. For example, almost all passports had a verified Twitter account, but less than half had >10 tweets, less than 1/3 had >100 followers, >1/10 had >1000 followers and ~1/100 had >5000 followers. For GitHub, 100% of users had an account, but <1/10 of users had 5 or more repositories. Almost all users had a Facebook account, but less than 2/5 had one with a profile picture.
While it is not surprising to have fewer users meeting more stringent requirements, the drop-off is unexpectedly sharp and might signify large numbers of potentially adversarial users gaming the system by aiming only to meet the basic standards that enable them to avoid being squelched. It is known, for example, that accounts on these platforms can be bought and sold in bulk on black markets. Accounts meeting basic requirements are likely to be cheaper and more readily available than more fully-featured versions. Alternatively, it might be that users simply don't know that we want them to maximise their stamps, rather than just collecting the minimum viable set. Either way, the data indicate that the Web2 stamps might only convey a relatively weak signal for Sybil defense.
On the other hand, Gitcoin data had a much more gradual drop off rate, indicating the stamp design works as intended, with increasing effort required to obtain increasingly prestigious stamps. This suggests these stamps might be weighted more heavily in future rounds to improve proactive Sybil defense.
Despite the high adoption of Gitcoin Passport and the promising results for the efficacy of Web3 stamps, we observed a low correlation between the trust bonus arising from passport stamps and the users labelled Sybil by FDD's algorithms (see bar chart below). If passport stamps were a viable standalone Sybil defense mechanism then we would expect a steady decrease in the number of squelched accounts as the trust bonus increases, but this is not what we observe. This is reinforced by the observation that 58% of the users that were squelched in GR15 had Gitcoin Passports.
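The expectation described here can be checked mechanically. The sketch below groups users by trust bonus and tests whether the squelched fraction falls steadily as the bonus rises. The function names and the `RECORDS` list are hypothetical, and it uses the squelched fraction rather than the raw count so that groups of different sizes are comparable. The records are made up and are not GR15 results.

```python
from collections import defaultdict


def squelch_rate_by_bonus(records: list[tuple[int, bool]]) -> dict[int, float]:
    """Map each trust bonus (in percent) to the fraction of its users squelched."""
    totals: dict[int, int] = defaultdict(int)
    squelched: dict[int, int] = defaultdict(int)
    for bonus, was_squelched in records:
        totals[bonus] += 1
        squelched[bonus] += was_squelched  # True counts as 1
    return {bonus: squelched[bonus] / totals[bonus] for bonus in sorted(totals)}


def is_steadily_decreasing(rates: dict[int, float]) -> bool:
    """True if the squelch rate never rises as the trust bonus increases."""
    values = [rates[bonus] for bonus in sorted(rates)]
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


# Made-up records: (trust bonus in percent, squelched?)
RECORDS = [
    (50, True), (50, False),
    (100, True), (100, False), (100, False), (100, False),
    (150, True), (150, True), (150, False), (150, False),
]

rates = squelch_rate_by_bonus(RECORDS)
print(rates)
print(is_steadily_decreasing(rates))
```

With these made-up records the rate rises again at the highest bonus, so the check fails. That is the pattern that would indicate the trust bonus is not a reliable standalone signal.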
Users' APU scores can boost the signal in some cases. The APU score is an Accumulated Partitioned Uniqueness value calculated from the number of stamps in a passport and the uniqueness of their combination. The relationship between APU score and squelching needs some further investigation because there was a condition in place that meant no-one could lose trust bonus during a round, which may have skewed the data by allowing some users with below-median APU scores to achieve the maximum +150% trust bonus.
Ultimately, though, the unlock comes from combining passport with algorithmic scoring. This approach gives a much stronger set of signals.
In addition, staking $GTC was a very strong signal for honesty in these analyses, but the result is probably skewed: folks who held $GTC before it was used to enhance a user's trust score are probably high-integrity users anyway, whereas in future rounds buying and staking $GTC will be known to increase a user's influence. Future attackers might buy and stake $GTC specifically to legitimize themselves so they can launch an attack.
For example, we might assert a priori that new users are likely to have minimal on-chain activity, a single wallet address, a unique username and a unique IP address, while airdrop farmers are likely to have more on-chain activity (such as POAP and NFT ownership), multiple wallet addresses, nonsense usernames and duplicate IP addresses. Classifying users by these criteria and then looking at their behaviours reveals some strong signals:
- New users tend to fund climate, DeSci, diversity and education topics.
- Web3 natives tend to fund crypto advocacy, Ethereum infrastructure and open source software.
- Airdrop farmers tend to focus on DeFi.
This makes sense because lots of users might onboard to Web3 specifically to contribute to causes they already care about, such as climate and DEI, whereas farmers are primarily interested in receiving an airdrop in the form of some token, which is most likely to come from DeFi. This implies that calibrating climate, DeSci and DEI more leniently might be appropriate in order to include the large population of new users attracted to those rounds, while DeFi needs to be calibrated strictly to defend against attackers and farmers. This is the kind of calibration that can be done by a round manager, with fine-tuning then done at the grant level in the new protocol-based grants system.
How to better identify noobs and farmers
In the previous section we made some assumptions about how new users and airdrop farmers are likely to behave. However, those hypotheses were presented in qualitative, subjective terms that don't lend themselves very well to detection algorithms. To algorithmically classify users into identity bins we need some quantitative measures in the parameter space of the Sybil detection algorithms that map to those qualitative behaviours so that they can be used as filtering measures. To determine which parameters might be diagnostic for farmers vs noobs, we plotted the scores from each metric for users known to be noobs and farmers and looked to see which showed distinct separation between the groups (see figure below). The three metrics that appear to be useful in distinguishing these two identities are the intersectionality score, donor DNA distance and shared IP address count. Farmers and noobs both spread across the full range of SAD scores, had similar APU scores and had a lot of crossover in Levenshtein distances.
From these data we can identify the following set of parameters that encode the identities of new users and farmers:
Measure | Noob | Farmer |
---|---|---|
Intersectionality score | 0 | >= 1 |
DonorDNADistance | 0.1 < x < 0.25 | <= 0.1 |
IPSharedCount | 0 | >= 3 |
This information could be used by round managers to algorithmically filter users on entry to a grant round in a way that usually welcomes new users and blocks airdrop farmers. The simplest form of filtering would simply be to gate entry to users that matched the Noob criteria, squelching those that match the farmer criteria. In this context, gating means disallowing participation by reducing the influence of a squelched account to zero as a consequence of the scoring legos, as opposed to "hard-gating" which explicitly prevents a user from accessing the platform, e.g. a blacklist.
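The table and the soft-gating rule can be sketched as a small filter. The thresholds are taken from the table. The `UserMetrics` type and the function names are hypothetical, and two things are assumptions rather than anything stated above: that all three criteria must hold at once, and that users matching neither profile are left untouched.

```python
from dataclasses import dataclass


@dataclass
class UserMetrics:
    intersectionality: float   # Intersectionality score
    donor_dna_distance: float  # DonorDNADistance
    ip_shared_count: int       # IPSharedCount


def classify(m: UserMetrics) -> str:
    """Return 'noob', 'farmer' or 'unclassified' using the table thresholds.

    Assumption: a user must satisfy all three criteria of a profile to match it.
    """
    if (m.intersectionality == 0
            and 0.1 < m.donor_dna_distance < 0.25
            and m.ip_shared_count == 0):
        return "noob"
    if (m.intersectionality >= 1
            and m.donor_dna_distance <= 0.1
            and m.ip_shared_count >= 3):
        return "farmer"
    return "unclassified"


def influence_multiplier(m: UserMetrics) -> float:
    """Soft gating: a farmer's influence drops to zero, everyone else keeps theirs.

    Assumption: unclassified users are left unchanged.
    """
    return 0.0 if classify(m) == "farmer" else 1.0


print(classify(UserMetrics(0, 0.2, 0)))
print(classify(UserMetrics(2, 0.05, 4)))
print(influence_multiplier(UserMetrics(2, 0.05, 4)))
```

A round manager who wanted a stricter round could instead squelch everyone who is not classified as a noob. That is a policy choice rather than a property of the metrics.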
Sybil defense legos
GR15 demonstrated that no single metric on its own was able to distinguish Sybils from honest users. The plot below shows large crossover between the squelched and non-squelched users for six individual parameters.
However, these metrics become more powerful when they are combined together. They can be thought of as Sybil-scoring legos - not very useful on their own but able to be connected and stacked in creative ways to form more useful super-structures. One example is breaking the total user population into classes and checking whether the aforementioned legos are better able to distinguish between specific subpopulations rather than simply distinguishing users that may have been squelched for a diverse set of reasons. In FDD we have two reference populations known as Thor and Loki representing known non-Sybil and Sybil users respectively. The Thor and Loki users are represented by the green and pink lines on the plot below.
The honest (Thor) users have very low LevDistance, Intersectionality and DNADistance scores and high SAD and IPSharedRatio scores, while the Loki users are the opposite. Squelched users and users without Gitcoin Passports tend to be more similar to Loki than Thor, although the signal is dampened across all metrics, presumably due to large variation across the populations. Users that have staked GTC ("self stakers") have a profile much more closely resembling Thor. This indicates high potential for stacking Sybil scoring metrics to better distinguish honest users from Sybils.
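One simple way to stack the metrics is to compare a user's whole profile with the two reference profiles and see which is closer. The sketch below uses a nearest-reference comparison with Euclidean distance. This technique is swapped in for illustration and is not described above as FDD's method. Only the low/high pattern of the reference profiles comes from the text: the numeric values and the 0-1 normalisation are made up.

```python
import math

METRICS = ("LevDistance", "Intersectionality", "DNADistance", "SAD", "IPSharedRatio")

# Hypothetical reference profiles on a normalised 0-1 scale.
THOR = {"LevDistance": 0.1, "Intersectionality": 0.1, "DNADistance": 0.1,
        "SAD": 0.9, "IPSharedRatio": 0.9}
LOKI = {"LevDistance": 0.9, "Intersectionality": 0.9, "DNADistance": 0.9,
        "SAD": 0.1, "IPSharedRatio": 0.1}


def distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two profiles across all five metrics."""
    return math.dist([a[m] for m in METRICS], [b[m] for m in METRICS])


def closest_reference(user: dict[str, float]) -> str:
    """Return 'Thor' or 'Loki', whichever reference profile is nearer."""
    return "Thor" if distance(user, THOR) <= distance(user, LOKI) else "Loki"


# A made-up user whose Intersectionality looks ambiguous on its own,
# but whose combined profile sits clearly nearer to Thor.
self_staker = {"LevDistance": 0.2, "Intersectionality": 0.5, "DNADistance": 0.1,
               "SAD": 0.7, "IPSharedRatio": 0.8}
print(closest_reference(self_staker))
```

The point of the example is that a metric which is inconclusive alone can still contribute to a clear result once it is combined with the others.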
Sybil-scoring-as-a-service
The utility of these Sybil-scoring legos can be levelled up by providing them via a web app. Users can then tweak parameters, analyse their own community of users and configure their own defenses. A prototype Sybil-scoring-as-a-service dashboard is shown below:
In this dashboard, a grant owner can analyse individual users or groups of users using all the different metrics and simulate the effects of imposing certain eligibility criteria (e.g. how many users are squelched when I set the following thresholds...). Additional pages could be used to view different combinations of legos.
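The "how many users are squelched" question can be sketched as a small simulation. This is an illustration only, not the dashboard's actual logic. `simulate_squelch`, the rule that a user is squelched when any metric exceeds its configured maximum, and the `COMMUNITY` data are all assumptions.

```python
def simulate_squelch(users: list[dict[str, float]],
                     max_allowed: dict[str, float]) -> int:
    """Count users who exceed at least one configured per-metric maximum."""
    return sum(
        any(user[metric] > limit for metric, limit in max_allowed.items())
        for user in users
    )


# Made-up community of four users.
COMMUNITY = [
    {"IPSharedCount": 0, "Intersectionality": 0},
    {"IPSharedCount": 1, "Intersectionality": 0},
    {"IPSharedCount": 4, "Intersectionality": 2},
    {"IPSharedCount": 3, "Intersectionality": 1},
]

# Try a lenient, a strict and a combined set of thresholds.
for limits in ({"IPSharedCount": 2},
               {"IPSharedCount": 0},
               {"IPSharedCount": 2, "Intersectionality": 0}):
    count = simulate_squelch(COMMUNITY, limits)
    print(limits, "->", count, "of", len(COMMUNITY), "squelched")
```

Running the same community through several threshold sets lets a grant owner see how many users a stricter setting would cost before applying it to a live round.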
Summary
Analysis of GR15 showed that Gitcoin Passport on its own was not a strong diagnostic tool for Sybils. Part of this may be because users don't realize that they are supposed to maximize their passport stamps, as opposed to meeting a set of minimal viable requirements. However, creating Sybil-scoring legos and pairing them to Gitcoin Passports holds great promise for better Sybil detection. Building services that allow grant owners and round managers to easily apply those legos to their own subcommunities of users could be a major unlock in a composability-focused grants protocol. Staking $GTC is a very strong signal for honesty in our pilot studies, but this needs to be confirmed with a larger population of users in case the result is skewed in favour of behaviours associated with current $GTC holders. People may well buy $GTC specifically to stake and launch a Sybil attack.