With big thanks to @Sov and @meglister
TL;DR
In this post, we'll share our Sybil resistance strategy for GG20. Our plan is based on three lines of thought:
- If you can only look at donation data, complete Sybil Resistance is impossible in any QF mechanism, no matter how fancy.
- That being said, COCM has a large (though not complete) sybil-reducing effect.
- The Passport team has recently introduced a new verification procedure called Passport Model-Based Detection, which reduces user burden and appears to be quite effective when combined with COCM.
Therefore, the strategy is clear: use COCM and Passport's Model-Based Detection for GG20, which will yield a smoother user experience and more sybil resistance.
Wait, what's COCM?
You can think of COCM (Connection-Oriented Cluster Match) as a tweak to the Pairwise Match algorithm suggested by Vitalik. Like Pairwise Match, COCM is based on standard QF and attenuates a project's funding if many of its donors have similar donation patterns. However, COCM calculates similarity differently. See this blog post for a full explanation.
Sybil resistance is at odds with what we like about QF
QF is commonly understood as a fairly democratic mechanism, in the sense that many small donors can out-class one large donor. It turns out that sybil resistance is at odds with this property.
The community uses the term "sybil" in many ways, so let's clarify what we mean. Imagine a person about to donate to Gitcoin Grants: say, $10 to project X and $4 to project Y. Before checking out, they pause and ask, "What if I actually split my donations between two separate wallets? I could donate, say, $5 to X and $2 to Y from each wallet, instead of making one big donation." If that person wouldn't be better off splitting up their donation (regardless of the original donation amounts or the way they split it up), then we'll say the QF mechanism is sybil resistant, since people are incentivized just to make their whole donation from one account.
The problem is that any mechanism with this sybil resistance property degenerates into a direct donation mechanism where no matching funds are added. This is because you can apply the logic of the definition in reverse. Imagine that Alice and Bob are two donors to a QF round. Even if Alice and Bob are real people, a mechanism has no way of knowing that their donations aren't really two sybil donations from a real user named Charlie, whose cart originally held all of Alice's donations plus all of Bob's.
In fact, we could replace Alice and Bob's donations with the donations from our hypothetical Charlie, and because of our sybil resistance property, we would still get the exact same funding results as before. But now our list of donors is one shorter. So what if we do the same thing again? We can pick two random users, combine their donations, and have a guarantee that our mechanism does the same thing before and after we squish them together.
Now we're in trouble, because we can keep "squishing" users together until our algorithm is just running on one donor who gave the sum of everybody's individual donations. But we already know what QF mechanisms do when there's only one donor: they don't add any matching funds! So our sybil resistant "QF" mechanism is really just a plain old direct donation mechanism.
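The squishing argument can be checked on standard QF with a few lines of Python. This is a minimal sketch, assuming the textbook single-project matching formula (the square of the summed square roots, minus the sum of donations). The names `qf_match` and `squish` are made up for illustration, and the sketch does not implement COCM.

```python
import math

def qf_match(donations):
    """Standard QF match for one project: (sum of sqrt(d))^2 - sum(d)."""
    return sum(math.sqrt(d) for d in donations) ** 2 - sum(donations)

def squish(donations):
    """Merge every donor into one hypothetical donor holding the total."""
    return [sum(donations)]

one_wallet = qf_match([10.0])        # a single $10 donation
two_wallets = qf_match([5.0, 5.0])   # the same $10 split across two wallets

print(f"match with one wallet:  {one_wallet:.2f}")
print(f"match with two wallets: {two_wallets:.2f}")
print(f"match after squishing:  {qf_match(squish([5.0, 5.0])):.2f}")
```

Splitting the donation raises the match, so standard QF is not sybil resistant in the sense defined above. Squishing the two wallets back into one brings the match to zero, which is the direct-donation outcome the argument ends with.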
In experiments, COCM is much less vulnerable to Sybil attacks
Given that sybil attacks are inevitable in any variation of QF (including COCM), the next step is to see how bad they get. To figure this out, we simulated a bunch of sybil attacks in the donation data for the Zuzalu tech round, which had just over 4,000 donors.
For each project, we had the top 10 donors to that project make some number of sybils (5, 25, 75, or 100) each. We created these sybils by splitting up the real donations among the appropriate number of new wallets, so no new money was added to the system. We then re-ran QF (or COCM) and saw how much the funding to that project changed. For each number of sybils per user, we measured the average funding gain across all projects, and the maximum funding gain that any project got. Results are below:
This chart makes it pretty clear that COCM is much more resilient to sybil attacks than standard QF. Let's zoom in on COCM:
Even in the worst case, generating 1,000 new sybils only nets an extra 0.2 percentage points of funding for a project.
Generating sybils makes so much less of a difference under COCM because it attenuates funding when donors look similar. It may be difficult to generate a large number of sybil accounts that look different enough to get past the attenuation imposed by COCM.
But this leads to a caveat: there are many, many strategies for generating sybil accounts. We tried many of them, but we didn't try all of them, and we certainly don't consider ourselves better sybil attack strategists than every other person in the web3 ecosystem. Some attack strategies may be more effective than what we tried.
Nevertheless, these experiments suggest that Gitcoin's sybil resistance strategy can be adjusted to have a lighter burden on the user. Next, we'll describe a new approach to Passport that does just that.
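For readers who want to try the splitting experiment from this section, here is a rough sketch for standard QF only. It assumes a toy round stored as `{project: {wallet: amount}}` and measures a project's share of the matching pool before and after an attack. The helpers `qf_shares` and `add_sybils` and the toy data are hypothetical, not the code used for the results above, and COCM is not implemented here.

```python
import math

def qf_shares(round_data):
    """Each project's share of the matching pool under standard QF."""
    raw = {
        project: sum(math.sqrt(a) for a in donors.values()) ** 2
        - sum(donors.values())
        for project, donors in round_data.items()
    }
    total = sum(raw.values())
    return {p: (m / total if total else 0.0) for p, m in raw.items()}

def add_sybils(round_data, project, top_k, n_sybils):
    """Copy of the round in which the top_k donors to `project` split every
    donation they made evenly across n_sybils new wallets (no new money)."""
    donors = round_data[project]
    attackers = set(sorted(donors, key=donors.get, reverse=True)[:top_k])
    attacked = {}
    for p, amounts in round_data.items():
        new_amounts = {}
        for wallet, amount in amounts.items():
            if wallet in attackers:
                for i in range(n_sybils):
                    new_amounts[f"{wallet}-sybil-{i}"] = amount / n_sybils
            else:
                new_amounts[wallet] = amount
        attacked[p] = new_amounts
    return attacked

# Toy round (hypothetical data, far smaller than a real round).
round_data = {
    "X": {"alice": 10.0, "bob": 4.0, "carol": 1.0},
    "Y": {"dave": 9.0, "erin": 9.0, "bob": 1.0},
}

before = qf_shares(round_data)["X"]
after = qf_shares(add_sybils(round_data, "X", top_k=1, n_sybils=5))["X"]
print(f"funding gain for X: {100 * (after - before):.1f} percentage points")
```

Swapping `qf_shares` for a COCM implementation and looping over projects and sybil counts would give the average and maximum gains described above.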
Passport's Model-Based Detection Provides Lower User Friction and Greater Effectiveness
The Gitcoin Passport team has recently introduced a Model-Based Detection System. This system analyzes the on-chain history of addresses and compares it to the historical data of known human and sybil addresses. Based on this comparison, the model assigns each address a score ranging from 0 to 100, where a score closer to 0 indicates a higher likelihood of the address being a sybil, and a score closer to 100 suggests a higher probability of the address belonging to a genuine human user.
This is a very different approach from the Stamp-Based System, in which users are prompted to connect with different identity providers to prove their personhood. In past rounds, we've heard many complaints that this manual system makes it too difficult to reach a high enough score for maximum matching.
One of the advantages of the Model-Based system is that it requires minimal user interaction and could greatly reduce user frustration. The new user experience would be mostly hands-free. Users' wallets will be automatically scanned in the background and assigned a score based on their history.
Additionally, we're keen to understand if this newer system can provide a level of sybil defense that is at least as effective as Passport Stamps in GG19. To find out, we'll look at how the GG19 results would have changed if it had been used instead.
One of the difficulties with the sybil resistance problem is that there is no answer sheet: we don't have a list of confirmed humans and sybils and have to move forward with our best guesses.
To determine if our solutions are protecting the matching pool from bad actors, we gut-check how our solutions affect which projects get funded. After investigating the projects in the round, a member of our team assigns each project a "Legitimacy Score," which is just a hand-wavy approximation. 5 means more legitimate, and 1 means less. When comparing COCM to normal QF, we've seen it take matching funds away from projects that have less legitimacy and shift them toward projects that have more.
If we were to use Model-Based Detection instead of Stamp-Based, we'd want to see even less matching going to the scammy projects and an uptick in the matching going to legitimate projects. This is exactly what we see:
Projects scoring 1 or 2 all see a slightly greater reduction in matching funding when switching from Stamps to the Model. This is a positive sign because it means the Model is removing voters from matching who are giving to these scammy projects. At the same time, we see a big increase in matching going to more legitimate projects, indicating that the Model predicts most of their voters are human and should stay in the dataset. This is also a great sign!
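The filtering step described here, where low-scoring wallets are dropped from matching and the rest stay in the dataset, could look roughly like the sketch below. The function name, the record layout, the cutoff of 50 and the choice to treat unscored wallets as 0 are all assumptions made for illustration. The post does not say what threshold Gitcoin uses or how the score is applied.

```python
def keep_for_matching(donations, scores, threshold):
    """Return only the donations whose wallet scores at or above `threshold`.

    `scores` maps wallet -> model score in [0, 100]; higher means more likely
    human. Wallets with no score are treated as 0 (an assumption).
    """
    return [d for d in donations if scores.get(d["wallet"], 0) >= threshold]

# Hypothetical donations and scores.
donations = [
    {"wallet": "0xaaa", "project": "X", "amount": 5.0},
    {"wallet": "0xbbb", "project": "X", "amount": 5.0},
    {"wallet": "0xccc", "project": "Y", "amount": 2.0},
]
scores = {"0xaaa": 92, "0xbbb": 7}

kept = keep_for_matching(donations, scores, threshold=50)
print([d["wallet"] for d in kept])
```

The filtered list would then be fed into QF or COCM in place of the full donation set.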
COCM alone redistributes 180,233.84 DAI of the matching fund compared to using normal quadratic funding and Gitcoin Passport. When combined with the Model, the redistribution increases to 227,453.29 DAI, an increase of about 26% in redistributed funds.
Evaluating the impact of the Model without COCM, we find that it still redistributes 82,356.04 DAI. The effects of COCM and the Model on the results show a moderate-to-strong positive correlation (0.66). This indicates that both strategies move results in the same direction and provide benefits when used separately. However, the strongest outcome is achieved by using them together.
In GG19, Gitcoin Passport alone, without COCM, would have redistributed 9,643.46 DAI compared to not using Passport at all. The results show a weak negative correlation (-0.13) with COCM results.
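The post does not spell out how "redistributed" funds or the correlations were computed, so the following is only one plausible reading. It assumes redistribution means the total matching that moves between projects when switching from a baseline allocation to an alternative one, and that the correlation is a Pearson correlation over per-project changes. All names and numbers are hypothetical toy values, not GG19 data.

```python
import math

def deltas(baseline, alternative):
    """Per-project change in matching when switching allocations."""
    return [alternative[p] - baseline[p] for p in sorted(baseline)]

def redistributed(baseline, alternative):
    """Total matching that moved: the sum of all per-project gains."""
    return sum(d for d in deltas(baseline, alternative) if d > 0)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy allocations of a 300-unit pool across three projects.
baseline = {"A": 100, "B": 100, "C": 100}
with_cocm = {"A": 130, "B": 90, "C": 80}
with_model = {"A": 110, "B": 100, "C": 90}

print("moved by COCM: ", redistributed(baseline, with_cocm))
print("moved by Model:", redistributed(baseline, with_model))
r = pearson(deltas(baseline, with_cocm), deltas(baseline, with_model))
print(f"correlation of effects: {r:.2f}")
```

A positive correlation here means the two interventions push the same projects up and down. A value near zero or below it means they are acting on different projects.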
It is noteworthy that using the Model would lead to significantly improved funding outcomes, representing a substantial change in the effectiveness of sybil detection. Moreover, this can be achieved without requiring the voter to interact beyond sharing their wallet address, greatly reducing user friction.
The Passport team has done an outstanding job with this model and we're very happy to be working with them.
However, we should add a caveat: the Model-Based system may look more effective today because no attacker has yet had a chance to game it. By contrast, people have had many rounds to try to game stamps. We'll have to keep reevaluating our strategy as attackers evolve.
Path Forward
Given these lines of analysis, we are planning for Gitcoin to use COCM and Passport Model-Based Detection as its sybil resistance tools for GG20. We believe this strategy will result in better funding outcomes for grantees and simultaneously reduce donor frustration.
We also used COCM in GG19 and have since offered it in an extremely limited beta to select partners. We're happy to share that we'll soon be rolling COCM out as an option for all our partners.
As we've mentioned above, no sybil resistance strategy will be foolproof. However, to the best of our knowledge, this is the best path forward with the tools currently available.