Our Sybil Resistance Strategy for GG20

By @Joel_m @umarkhaneth

With big thanks to @Sov and @meglister

TL;DR

In this post, we’ll share our Sybil resistance strategy for GG20. Our plan is based on three lines of thought:

  1. If you can only look at donation data, complete sybil resistance is impossible in any QF mechanism, no matter how fancy.
  2. That being said, COCM has a large (though not complete) sybil-reducing effect.
  3. The Passport team has recently introduced a new verification procedure called Model-Based Detection, which reduces user burden and appears to be quite effective when combined with COCM.

Therefore, the strategy is clear: use COCM and Passport’s Model-Based Detection for GG20, which should yield a smoother user experience and more sybil resistance.

Wait, what’s COCM?

You can think of COCM (Connection-Oriented Cluster Match) as a tweak to the Pairwise Match algorithm suggested by Vitalik. Like Pairwise Match, COCM is based on standard QF and attenuates a project’s funding if many of its donors have similar donation patterns. However, COCM calculates similarity differently. See this blog post for a full explanation.

Sybil resistance is at odds with what we like about QF

QF is commonly understood as a fairly democratic mechanism, in the sense that many small donors can outweigh one large donor. It turns out that sybil resistance is at odds with this property.

The community uses the term “sybil” in many ways, so let’s clarify what we mean. Imagine a person about to donate to Gitcoin Grants: say, $10 to project X and $4 to project Y. Before checking out, they pause and ask, “What if I split my donations between two separate wallets? I could donate, say, $5 to X and $2 to Y from each wallet, instead of making one big donation.” If that person would never be better off splitting up their donation (regardless of the original donation amounts or the way they split them up), then we’ll say the QF mechanism is sybil resistant, since people are incentivized to make their whole donation from one account.

The problem is that any mechanism with this sybil resistance property degenerates into a direct donation mechanism where no matching funds are added. This is because you can apply the logic of the definition in reverse. Imagine that Alice and Bob are two donors to a QF round. Even if Alice and Bob are real people, a mechanism has no way of knowing that their donations aren’t really two sybil donations from a real user named Charlie, whose cart originally held all of Alice’s donations plus all of Bob’s.

In fact, we could replace Alice and Bob’s donations with the donations from our hypothetical Charlie – and because of our Sybil resistance property, we still get the exact same funding results as before. But now our list of donors is one shorter. So what if we do the same thing again? We can pick two random users, combine their donations, and have a guarantee that our mechanism does the same thing before and after we squish them together.

Now we’re in trouble, because we can keep “squishing” users together until our algorithm is just running on one donor who gave the sum of everybody’s individual donations. But we already know what QF mechanisms do when there’s only one donor – they don’t add any matching funds! So our sybil resistant “QF” mechanism is really just a plain old direct donation mechanism.
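Both halves of this argument can be checked numerically. The sketch below assumes the textbook QF formula for a single project (the square of the sum of square roots of the donations, minus the donations themselves); the function names `qf_match` and `squish` are ours, chosen for illustration.

```python
from math import sqrt

def qf_match(donations):
    """Matching subsidy for one project under standard QF:
    (sum of square roots of donations)^2 minus the donations themselves."""
    return sum(sqrt(c) for c in donations) ** 2 - sum(donations)

def squish(donations):
    """Merge all donors into one hypothetical donor who gave the total."""
    return [sum(donations)]

# Splitting: $10 to project X from one wallet vs. $5 from each of two wallets.
one_wallet = qf_match([10.0])
two_wallets = qf_match([5.0, 5.0])

# Squishing: three real donors collapsed into a single "Charlie".
separate = qf_match([1.0, 4.0, 9.0])
merged = qf_match(squish([1.0, 4.0, 9.0]))

print(round(one_wallet, 6), round(two_wallets, 6))
print(round(separate, 6), round(merged, 6))
```

Under standard QF, the single $10 wallet earns essentially no match while the two $5 wallets earn about $10, so splitting pays. Going the other way, the three separate donors earn a match of about $22, and the squished single donor earns essentially nothing. A mechanism that pays the same before and after squishing would have to pay that zero everywhere.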

In experiments, COCM is much less vulnerable to Sybil attacks

Given that sybil attacks are inevitable in any variation of QF (including COCM), the next step is to see how bad they get. To figure this out, we simulated a bunch of sybil attacks in the donation data for the Zuzalu tech round, which had just over 4,000 donors.

For each project, we had the top 10 donors to that project each make some number of sybils (5, 25, 75, or 100). We created these sybils by splitting up the real donations among the appropriate number of new wallets, so no new money was added to the system. Then we re-ran QF (or COCM) and measured how much the funding to that project changed. For each number of sybils per user, we measured the average funding gain across all projects and the maximum funding gain that any project got. Results are below:
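The shape of this experiment can be sketched in a few lines. This is not the code we ran: it covers only standard QF, uses made-up toy data, and assumes each sybil wallet receives an equal slice of the donor’s whole cart. All function names and the data layout (wallet -> {project: amount}) are hypothetical.

```python
from math import sqrt

def qf_shares(donations):
    """Each project's share of the matching pool under standard QF.
    donations maps wallet -> {project: amount}."""
    roots, totals = {}, {}
    for cart in donations.values():
        for project, amount in cart.items():
            roots[project] = roots.get(project, 0.0) + sqrt(amount)
            totals[project] = totals.get(project, 0.0) + amount
    raw = {p: roots[p] ** 2 - totals[p] for p in roots}
    pool = sum(raw.values())
    return {p: (raw[p] / pool if pool else 0.0) for p in raw}

def split_wallet(donations, wallet, n):
    """Replace one wallet with n sybil wallets that evenly split its cart."""
    out = {w: cart for w, cart in donations.items() if w != wallet}
    for i in range(n):
        out[f"{wallet}#sybil{i}"] = {p: a / n for p, a in donations[wallet].items()}
    return out

def attack_gain(donations, project, top_k, n):
    """Change in a project's matching share when its top_k donors each make n sybils."""
    donors = sorted((w for w in donations if project in donations[w]),
                    key=lambda w: donations[w][project], reverse=True)[:top_k]
    attacked = donations
    for w in donors:
        attacked = split_wallet(attacked, w, n)
    return qf_shares(attacked)[project] - qf_shares(donations)[project]

# Toy data, invented for illustration only.
toy = {
    "alice": {"X": 16.0},
    "bob": {"X": 4.0, "Y": 9.0},
    "carol": {"Y": 9.0},
}
gain = attack_gain(toy, "X", top_k=1, n=4)
print(f"Project X gains {gain:.1%} of the matching pool")
```

In this toy round, splitting the top donor’s $16 into four wallets of $4 moves project X from roughly 47% to roughly 82% of the pool without adding any money, which is the kind of gain the experiment measures for standard QF.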

This chart makes it pretty clear that COCM is much more resilient to sybil attacks than standard QF. Let’s zoom in on COCM:

Even in the worst case, generating 1,000 new sybils (100 for each of a project’s top 10 donors) only nets an extra 0.2 percentage points of funding for a project.

Generating sybils makes so much less of a difference under COCM because it attenuates funding when donors look similar. It may be difficult to generate a large number of sybil accounts that look different enough to get past the attenuation imposed by COCM.
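To make the idea of attenuation concrete, here is a minimal sketch in the spirit of Pairwise Match, not COCM itself (COCM calculates similarity differently). It assumes each pair of donors gets a coefficient `m / (m + overlap)`, where `overlap` measures how much the pair co-donates across all projects and `m` is a hypothetical tuning constant.

```python
from math import sqrt
from itertools import combinations

def pairwise_match(donations, project, m=10.0):
    """Pairwise-attenuated match for one project.
    donations maps wallet -> {project: amount}; m is a hypothetical tuning constant."""
    total = 0.0
    for a, b in combinations(donations, 2):
        ca = donations[a].get(project, 0.0)
        cb = donations[b].get(project, 0.0)
        if ca == 0.0 or cb == 0.0:
            continue
        # Overlap: how strongly the two wallets co-donate across all projects.
        overlap = sum(sqrt(donations[a][p] * donations[b].get(p, 0.0))
                      for p in donations[a])
        k = m / (m + overlap)  # near 1 for dissimilar pairs, shrinks as overlap grows
        total += 2 * k * sqrt(ca * cb)
    return total

# Two donors to X with otherwise different interests...
diverse = {"a": {"X": 4.0, "Y": 9.0}, "b": {"X": 4.0, "Z": 9.0}}
# ...versus two wallets with identical carts, as naive sybils would have.
similar = {"a": {"X": 4.0, "Y": 9.0}, "b": {"X": 4.0, "Y": 9.0}}

print(pairwise_match(diverse, "X"), pairwise_match(similar, "X"))
```

The identical carts earn project X a smaller match than the diverse pair does, even though the donations to X are the same. As `m` grows very large, every coefficient approaches 1 and the result approaches the standard QF match.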

But this leads to a caveat: there are many, many strategies for generating sybil accounts. We tried many of them, but not all of them, and we certainly don’t consider ourselves better sybil attack strategists than every other person in the web3 ecosystem. Some attack strategies may be more effective than what we tried.

Nevertheless, these experiments suggest that Gitcoin’s sybil resistance strategy can be adjusted to have a lighter burden on the user. Next, we’ll describe a new approach to Passport that does just that.

Passport’s Model-Based Detection Provides Lower User Friction and Greater Effectiveness

The Gitcoin Passport team has recently introduced a Model-Based Detection System. This system analyzes the on-chain history of addresses and compares it to the historical data of known human and sybil addresses. Based on this comparison, the model assigns each address a score ranging from 0 to 100, where a score closer to 0 indicates a higher likelihood of the address being a sybil, and a score closer to 100 suggests a higher probability of the address belonging to a genuine human user.

This is a very different approach from the Stamp-Based System, in which users are prompted to connect with different identity providers to prove their personhood. In past rounds, we’ve heard many complaints that this manual system makes it too difficult to achieve a high enough score for maximum matching.

One of the advantages of the Model-Based system is that it requires minimal user interaction and could greatly reduce user frustration. The new user experience would be mostly hands-free. Users’ wallets will be automatically scanned in the background and assigned a score based on their history.

Additionally, we’re keen to understand whether this newer system can provide a level of sybil defense at least as effective as Passport Stamps did in GG19. To find out, we investigated how the GG19 results would have changed if it had been used instead.

One of the difficulties with the sybil resistance problem is that there is no answer sheet: we don’t have a list of confirmed humans and sybils and have to move forward with our best guesses.

To determine whether our solutions are protecting the matching pool from bad actors, we gut-check how they affect which projects get funded. After investigating the projects in the round, a member of our team assigns each project a ‘Legitimacy Score,’ which is just a hand-wavy approximation: 5 means more legitimate, and 1 means less. When comparing COCM to normal QF, we’ve seen it take matching funds away from projects that have less legitimacy and shift them toward projects that have more.

If we were to use Model-Based Detection instead of Stamp-Based, we’d want to see the matching going to the scammy projects decrease further while the matching going to legitimate projects ticks up. This is exactly what we see:

Projects scoring 1 and 2 all see a slightly greater reduction in matching funds when switching from Stamps to the Model. This is a positive sign: it means the Model is removing from matching the voters who give to these scammy projects. At the same time, we see a big increase in matching going to more legitimate projects, indicating that the Model predicts most of their voters are human and should stay in the dataset. This is also a great sign!

COCM alone redistributes 180,233.84 DAI of the matching fund compared to using normal quadratic funding and Gitcoin Passport. When combined with the Model, the redistribution increases to 227,453.29 DAI. That is 47,219.45 DAI more, or about a 26% increase over COCM alone.
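The size of that increase depends on the baseline, so here is the arithmetic spelled out. The variable names are ours; the two DAI figures are the ones quoted above.

```python
cocm_only = 180_233.84        # DAI redistributed by COCM alone
cocm_plus_model = 227_453.29  # DAI redistributed by COCM combined with the Model

extra = cocm_plus_model - cocm_only
relative_to_cocm = extra / cocm_only          # growth over the COCM-only figure
share_of_combined = extra / cocm_plus_model   # extra as a share of the combined figure

print(f"extra: {extra:,.2f} DAI")
print(f"relative to COCM alone: {relative_to_cocm:.1%}")
print(f"share of combined total: {share_of_combined:.1%}")
```

Measured against COCM alone, the Model adds roughly 26% more redistribution. Measured as a share of the combined total, the same 47,219.45 DAI is closer to 21%.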

Evaluating the impact of the Model without COCM, we find that it still redistributes 82,356.04 DAI. The redistribution shows a moderate to strong positive correlation (0.66) between the effects of COCM and the Model on the results. This indicates that both strategies move results in the same direction and provide benefits when used separately. However, the strongest outcome is achieved by using them together.

In GG19, Gitcoin Passport alone, without COCM, would have redistributed 9,643.46 DAI compared to not using Passport at all. The results show a weak negative correlation (-0.13) with COCM results.

It is noteworthy that using the Model would lead to significantly improved funding outcomes, representing a substantial change in the effectiveness of sybil detection. Moreover, this can be achieved without requiring the voter to interact beyond sharing their wallet address, greatly reducing user friction.

The Passport team has done an outstanding job with this model and we’re very happy to be working with them.

However, we should add a caveat: the Model-Based system may look more effective today because no attacker has yet had a chance to game it. People have had many rounds to try to game Stamps. We’ll have to keep reevaluating our strategy as attackers evolve.

Path Forward

Given these lines of analysis, we are planning for Gitcoin to use COCM and Passport Model-Based Detection as its sybil resistance tools for GG20. We believe this strategy will result in better funding outcomes for grantees and simultaneously reduce donor frustration.

We also used COCM in GG19 and have since offered it in an extremely limited beta to select partners. We’re happy to share that we’ll soon be rolling COCM out as an option for all our partners.

As we’ve mentioned above, no sybil resistance strategy will be foolproof. However, to the best of our knowledge, this is the best path forward with our currently available tools.


Thanks for sharing this detailed overview of Gitcoin’s sybil resistance strategy for GG20. It’s great to see the thoughtful approach you’re taking to enhance user experience and address the challenges posed by sybil attacks. The combination of COCM and Passport’s Model-Based Detection seems promising, and I’m looking forward to seeing how it contributes to better funding outcomes for grantees. Keep up the good work!


While this tech looks cool and it will be very interesting to see how it all plays out, I am left a bit frustrated. I’m trying to bring in a new community that likely has done very little on ethereum in the past. It sounds like they won’t be eligible for any kind of grant matching?

I guess I’ll just encourage them to get started so things will be in place for GG21, but it seems like I said something similar about building up their passport last season. (https://x.com/ICDevs_org/status/1729525201791017374)

Although I guess if we do find a hands-off approach it will be very welcomed by those donating that don’t want to make it feel like a job.


hey @skilesare ! You’ve definitely highlighted a gap in our current implementation. Our goal is first to optimize the experience for web3-native folks, which we’re addressing with Passport’s model-based detection + cluster matching (as well as moving off PGN, as you called out in your post.) Once we’ve nailed that, we’ll address the non-native audience needs, which likely means supporting fiat/credit card checkout.


That’s a great way to give more power to GG20. Awesome strategy, guys! :smile: LFG! :potted_plant: :green_circle: :green_heart: Let’s grow.