[Discussion & Proposal] Ratify the Results of Gitcoin’s Beta Round and Formally Request the Community Multisig Holders to Payout Matching Allocations

Heya Evan -

Couple things to respond to here.

(1) Surprised to hear the accusations of non-transparency, as it seems so far Gitcoin is the only DAO that has taken time from its contributors and dedicated significant resources to giving the ODC datasets to play with :slight_smile: I’d also push back on the idea of “volunteerism” when, again, Gitcoin has given significant funds to incentivize the hackathon experiment… As I told you in person: I’d highly recommend, in the best interest of the ODC, that use cases outside of Gitcoin be explored to add some legitimacy to this work and credibility to its quality.

(2) As I do for all other auditors and friends of the DAO, I’m happy to provide cleaned datasets if there’s something you’re missing that you’d like. These forum posts have always served as a high-level reporting opportunity, not a deep dive. As @connor mentioned here, we are moving towards an automatic rule system by which any rule triggered by a voter could be queried or otherwise looked up. (Same for grantee applications - moving towards “Kerckhoffs-compliant” a la @disruptionjoe’s thought leadership.) In the meantime, again, I’m happy to provide these scripts where there’s interest, but the timestamp checks and Passport checks mentioned are also all derived from public, on-chain data that can be pretty easily replicated with good fidelity.

Again - please feel free to follow up with me with questions about our methodology, but that seems a bit beyond scope here; we’ve always done significant sybil silencing, and a couple hundred MB of voter data doesn’t fit easily into a governance post… We aspire to regularly publish internal and external data for the whole community to learn from.

2 Likes

In the final report there is one grant, “WGMI community DAO [Web3 Community Round]”, with exactly 0 eligible contributions that also reports $8.81 of total USD received in donations.

This contradicts the explanation post, which made me assume that the “Total USD Received” column refers to the final subset of donations eligible for matching, that is:

The final results you see show data that has been calculated after imposing various eligibility controls and donor squelching. For example, donations from users without a sufficient Passport score or donations under the minimum will not be counted in the aggregate “Total Received USD” or “Contributions” column for your grant.

Can someone explain what happened here?

4 Likes

Hey, I know the “burn” felt during/after the Gitcoin rounds, so I just want to clarify one aspect. Evan said that the hackathon participants are volunteering their time because the prizes were awarded to them for the work they did during the hackathon. They are not obligated to dedicate more time to help or explain their analyses because they already received their prizes.

2 Likes

Kudos to everyone for all the work done, and congrats on finalizing yet another amazing Gitcoin round :robot:

My only worry (kinda related to @thedevanshmehta’s) is that projects with relatively few contributors got huge matching amounts (some even got the max amount). That’s concerning because it seems that in rounds with few donors (like the Web3 Community and Education rounds) it can encourage people (could be friends of the project, could be Sybils, could be actual members of their community - we cannot know for certain) to donate larger amounts to get larger matching.

So… either I don’t understand how the calculations are being done, or this will prove to be just another lesson that needs to be learned for the next round, I guess :saluting_face:

5 Likes

Thank you for posting this; as a first-time grantee, this information is enlightening.

Great work.

2 Likes

Thanks for your response. I guess this could be cleared up if we knew how the squelching was done.

Maybe I missed that?

I see the following text:

After on-chain data analysis and a manual sampling process, donations from addresses that were associated with these types of behaviors were excluded for the purposes of matching calculations. This includes things like:

  • Suspected bot activity based on specific transaction patterns and similarities
  • Flagging known Sybil networks/addresses from prior rounds
  • Enhanced analysis of Passport stamps and other data to flag evidence of abuse between different wallets
  • Self-donations from grantee wallets

Any details on the Legos used to identify the transaction patterns and similarities, the addresses squelched (we could put this in a private location if that was preferred - the OpenData Community keeps certain suspect Sybil addresses access controlled for example), and other explanations of the “enhanced analysis” and so on would be useful.

Soap box - and the nuance may be lost here - I’m 100% confident that great analysis was done. I’m also pretty sure that non-transparent analysis puts at risk the credibility we are all seeking to build or maybe rebuild in the space. By sharing more of how the analysis was done we can all gain confidence while learning more about how to protect rounds in the future.

4 Likes

Thank you for appreciating our concerns! From my understanding, the spirit of QF is letting the wisdom of the crowds decide which projects should be funded and in what amount. The results from the current round contravene that spirit, as pointed out by @ZER8, @flowscience, and many others on Twitter who have privately messaged me. The folks in the Climate round are particularly aggrieved at Mini Meadows nearly getting the matching cap despite only having 13 (!) contributors.

For me to vote in favor of ratifying these results, I would need to see some discussion around an alternative formula (even as simple as all votes are equal) and a spreadsheet showing how the funds in the beta round would have been allocated had we used the alternative formula. Only with this comparison can we intelligently decide the path ahead and whether to stick with pure QF or some other version thereof.
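To sketch the kind of comparison I mean (purely illustrative - the project names and donation amounts below are made up, and this is not Gitcoin’s production matching code), something like this would let us put standard QF and an “all votes are equal” split side by side:

```python
# Illustrative only: compares standard QF matching with a naive
# "all votes are equal" split. Project names and donations are made up,
# and this is not the production matching code.
import math

MATCHING_POOL = 10_000  # hypothetical pool size, in USD

# project -> list of individual donation amounts (USD)
donations = {
    "Project A": [5.0] * 40,             # many small donors
    "Project B": [500.0, 450.0, 300.0],  # a few large donors
}

def qf_weights(dons):
    # classic QF: weight = (sum of sqrt(donation))^2
    return {p: sum(math.sqrt(d) for d in ds) ** 2 for p, ds in dons.items()}

def equal_vote_weights(dons):
    # "all votes are equal": weight = number of contributions
    return {p: len(ds) for p, ds in dons.items()}

def allocate(weights, pool):
    total = sum(weights.values())
    return {p: pool * w / total for p, w in weights.items()}

qf_alloc = allocate(qf_weights(donations), MATCHING_POOL)
eq_alloc = allocate(equal_vote_weights(donations), MATCHING_POOL)

for p in donations:
    print(f"{p}: QF ${qf_alloc[p]:,.0f} vs equal-vote ${eq_alloc[p]:,.0f}")
```

In this made-up case QF still favors the many small donors, but the “all votes are equal” split shifts even more of the pool toward contributor count - seeing those two columns for the actual beta round data is what I’m asking for.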

4 Likes

Hey @duckdegen thank you for pointing this out - it sounds like the product team discovered a bug with the calculations impacting a handful of transactions, likely including this one, that has now been fixed. We’ll be sharing updated results soon and this should be resolved. Thank you again for bringing this to our attention!

@DistributedDoge thank you as well for flagging this - I believe this should also be fixed with the new patch mentioned above (and as you noted is a contradiction and not intended). Appreciate everyone who is digging into the data and helping us spot discrepancies like this!

2 Likes

Hey @nick_smr, thank you for sharing these ideas. Looking at your grant, it appears that one donor did not have a Passport, one donation was ~$0.90, and two others were flagged in the Sybil squelching.

I think this point is something that could be directly addressed by matching caps. Before each round, we do try to gather community feedback on eligibility criteria and other parameters like matching caps. If most of the Climate grantees felt the cap should be lower, it is certainly something that could be done (CC @M0nkeyFl0wer)

I do think part of the reason the Climate round was very skewed was that the total contributions and donation amounts were generally lower than we’ve seen in prior rounds. So with fewer “votes”, each vote (and its weight) matters more.

I do get your point about minimum support and low matching amounts. It is primarily the nature of the QF mechanism and letting the community “decide” where funding should be allocated. We do plan to add a variety of new and different funding mechanisms to the Allo protocol, which would likely provide allocations more in line with what you are thinking.

2 Likes

In response to this post and similar concerns from @thedevanshmehta and @flowscience -

I agree that, at a glance, many of the results do seem “off” when comparing contributions vs. total amounts and the associated matching amount (versus what might be expected). I think this can be broken into two distinct conversations:

  1. Why do these results seem different than usual?
  2. Is QF the optimal funding allocation mechanism?

Regarding 1 - my personal view here is that due to smaller numbers of donors and total amounts donated, specific large donations can have a larger impact than one might expect. This is still the same QF algorithm used in many past rounds, so the math is not different.

In this specific case, it is an interesting outcome, which I think is due to 9 and 41 being fairly small “sample sizes” of data to calculate from. If this were 10xed (i.e. 90 and 410 contributors), I believe the classic QF impact of “many small donations outweighing fewer large donations” would be much more amplified.
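To make the intuition concrete, here is a rough back-of-the-envelope sketch (my own made-up numbers, not the actual round data) showing how much one large donation moves a project’s classic QF weight depending on how many small donors it already has:

```python
# Back-of-the-envelope illustration (made-up numbers, not the round data):
# how much one large donation shifts a project's classic QF weight,
# depending on how many small donors it already has.
import math

def qf_weight(donations):
    # classic QF weight for one project: (sum of sqrt(d_i))^2
    return sum(math.sqrt(d) for d in donations) ** 2

for n_small in (9, 90):                      # few vs. many small donors
    small = [5.0] * n_small                  # $5 donations
    base = qf_weight(small)
    with_whale = qf_weight(small + [500.0])  # add one $500 donation
    print(f"{n_small} small donors: {base:,.0f} -> {with_whale:,.0f} "
          f"({with_whale / base:.1f}x) after one $500 donation")
```

With 9 small donors, the single $500 donation multiplies the project’s weight roughly 4.5x; with 90 small donors it only adds about 20%. Larger rounds naturally dilute the influence of any one large donor, which is why the smaller Beta rounds look skewed.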

Regarding 2 - this is a larger discussion happening in many places, so I won’t get into it much here, but I do think we should be experimenting with various tweaks to “classic” QF, as well as entirely new funding mechanisms. As Allo gets built out, there should be many more options, and we will hopefully be doing more retroactive data studies on how results may change.

Finally re: wash trading - this is something we are certainly aware of and looking for in Sybil and fraud analysis. We need better tooling and more automation, but that work is in progress, and any wash trading discovered will 100% be dealt with.

6 Likes

Hi @connor, thank you for taking the time to respond to my queries. It’s our first time participating in a Gitcoin round, so I’m not sure if this ever happens, but is it usually possible to know about and seek a review of the votes that were squelched as Sybil? Whatever the outcome, I believe this will put us in a better position as we prepare for future rounds.

3 Likes

Hey @epowell101, I really appreciate the constructive criticism and productive line of questioning, both in this initial post and your follow-up comment. I know you have the DAO and the community’s best interests at heart. I would love to collaborate more on this round and going forward.

So I actually had no idea that the ODC was looking at data and calculating results for these Beta rounds - this is news to me. And great news at that! We absolutely could use more eyes on the results here, and I am very open to collaborating. I’d love to learn more about what has been built in the hackathon and what differences you are seeing. I’ll reach out privately to chat about next steps.

This is great to hear - as I mentioned in one of my posts above, we have discovered a bug in the product impacting certain transactions and the resulting match calculations. We do plan to post an updated version of the round results in the coming days, taking this fix into account, as well as any other issues we discover or beneficial tactics that arise. Let’s combine forces - we have a chance to improve our Sybil defense and reach a “fairer” outcome before this goes through ratification and payouts.

This is an area I am somewhat torn on. We can definitely do a better job of openly sharing data, details of our methodologies, etc., but I’m also not sure if 100% transparency is the ideal solution right now. If we publish data on every single donation and whether it was counted, we’d create an environment where users may feel specifically targeted and where we’d have to justify every decision that was made (often by automated tools). Ideally, we could hear out objections and investigate each one, but I’m not sure that can work at scale (over 100k txs in the Beta), and we also don’t want to make it so only those with the loudest objections get rewarded.

But my bigger concern is if we publish every algorithm and tactic used for Sybil and fraud prevention, it makes it much easier for bad actors to use that knowledge to game the system in the future. There’s a reasonable argument to be made that “if your Sybil defense is strong enough it doesn’t matter if the methods are fully transparent, it still cannot be cheated” but frankly I don’t think Gitcoin (or Web3 identity systems as a whole) are anywhere near that point yet.

IMO there is a sweet spot between being completely transparent and being a black box, where we maximize community knowledge sharing while minimizing bad actor empowerment; we just need to find where that is. Perhaps we’re leaning too far in the “black box” direction right now, but I’d love to get your opinion and work together on a path forward.


I just want to reiterate since it may get buried in my string of posts - we do expect to re-run matching calculations based on new findings in the coming days and will share updated results here. If/when this happens, the timeline will be moved back, so there will be 5 additional days after updated results are posted before anything moves to a Snapshot vote.

9 Likes

If we publish data on every single donation and whether it was counted, we’d create an environment where users may feel specifically targeted and where we’d have to justify every decision that was made

It’s not the people who vote that count, it’s the people who count the votes.

The current trend of only sharing aggregate results means that the only way to prevent incorrect results from being ratified is to spot an obvious anomaly in the dataset. This time we were lucky: while the report was calculated incorrectly, there was indeed one row with an obvious anomaly, and @duckdegen was aware of a missing vote.

I would like to request a dataset with all eligible, non-Sybil votes that qualify for final matching. This would allow everyone to re-calculate matching amounts and confirm that inaccuracies in the reports were indeed corrected.

Moreover, raw data with final votes would also make it possible to recalculate matching using alternative algorithms as suggested by @thedevanshmehta.

Ideally - though this is of secondary importance to me - because some votes are “missing”, Gitcoin could also share an updated listing of all votes that were cast during a round, whether eligible or not.

As it stands, without the actual vote (donation) data on hand, how are stewards supposed to verify the accuracy of the matching amounts presented in the final report?

4 Likes

I hear you - I am open to sharing the raw vote data once we have the revised calculations ready, if the rest of the team is on board. We’ll discuss the best way to provide this and enable anyone to verify results. Thank you for this feedback, and for your willingness to dig into the data with us.

5 Likes

Thanks Connor - I see your DM as well.

I think my comment was poorly worded and somewhat conflated the round calculations with Sybil identification and squelching. There were bounties in the ODC hackathon having to do with Sybil identification - and many about building tools such as Legos, dashboards, and datasets to protect rounds. My comment made it sound like we also performed the round calculations - and from this thread I see there are ODC members doing that - however, there was NOT a bounty or anything like that for that work.

I totally get concerns about the downside of transparency - especially if it meant that we’d see a lot of appeals. For example, the ODC required everyone competing in the hackathon who wanted to use lists of likely Sybils to backtest their models to register and be added to an access list / “whitelist.” So the OpenData Community was not entirely open with its data either. Maybe there should be a document that “we” in the broader community collaborate on regarding transparency vs. secrecy of approaches.

Coming back to the initial point - as it stands, we are not aware of how the Sybil identification and squelching analysis was done. More depth on what algorithms were used and what patterns were seen will help the broader community prepare for future QF rounds and, incidentally, will help other communities fighting wash trading and fraud more broadly. We could stop short of publishing individual likely Sybils, which would presumably trigger appeals.

3 Likes

Thanks Connor - a quick look at the open source round (which saw a lot more participation) does confirm your analysis: low participation leads to the amount donated counting more than the number of contributors. In OSS, the projects receiving the matching cap also had the highest number of contributors, unlike in the other rounds.

It’s great to know of initiatives rethinking QF. One takeaway for me is that Gitcoin should keep open the option of using classic QF when there is high participation and weighted QF when there is low participation. This will be important as we run experimental rounds like journalism, which may similarly see lower participation in the first few seasons.

It’ll be near impossible to eliminate wash trading entirely. For example, my project has an SOP of creating new wallets for every grant received. We could easily have cycled grant funds already sitting in those wallets through Gitcoin and come out with our principal plus a profit (we obviously chose not to do this). The only solution is giving even more weight to the number of contributors than to the amount donated, especially in rounds with low participation.
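As one purely illustrative example of what such a weighting could look like (the numbers, cap, and project data below are made up - this is a sketch, not a concrete formula proposal): cap how much of any single donor’s amount counts toward the match, so contributor count dominates in low-participation rounds.

```python
# Illustrative sketch of one possible "weighted QF" variant: cap each
# donor's counted amount so the contributor count dominates the match.
# All amounts, caps, and project data here are made up.
import math

def qf_weight(donations):
    # classic QF weight: (sum of sqrt(d_i))^2
    return sum(math.sqrt(d) for d in donations) ** 2

def capped_qf_weight(donations, per_donor_cap=25.0):
    # same formula, but no single donation counts for more than the cap
    return sum(math.sqrt(min(d, per_donor_cap)) for d in donations) ** 2

few_large = [500.0] * 13   # 13 donors giving large amounts
many_small = [10.0] * 60   # 60 donors giving small amounts

print(qf_weight(few_large), qf_weight(many_small))                # ~84,500 vs 36,000
print(capped_qf_weight(few_large), capped_qf_weight(many_small))  # 4,225 vs 36,000
```

Under classic QF the 13 large donors would out-match the 60 small donors; with the cap, the crowd of small donors wins - which is the behavior I’d want in low-participation rounds.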

5 Likes

Hey @DistributedDoge, @nick_smr, others-

We are more than happy to send projects the outcomes of their particular round (so they can see exactly where and why certain votes were silenced). I will echo Connor in that we don’t want to embarrass anyone who might have accidentally gotten caught up donating during a bot attack, through no fault of their own :slight_smile: And this certainly happens - but if you accidentally gave exactly 1.5 DAI at the same time that 45 other wallets did the same, you will have blended in with the bots and will have been silenced…

We don’t want to contribute to the many unattended “sybil lists” that float around the space unless we are highly confident and transparent about inclusion on such a list. When we silence someone, it is not necessarily because they were a known, 100%-confirmed Sybil, but because they behaved in a way that was anomalous and likely intended to manipulate the round. I hope this makes sense - but we’re happy to speak to project owners about any particular concerns and be fully transparent with their results. I’ll personally send you a little sheet so you can see what happened to all of your voters.

In the same vein as the ideas above about balancing transparency with “shade-throwing”, we know many of our projects (especially the OSS crowd) are subject to airdrop farming speculation. So please know it is not a reflection on a project’s honesty and respectability if they had a high squelch rate. For example, anyone who is building on the Optimism L2 is likely going to have a lot of airdrop activity; that’s totally not their fault and certainly doesn’t “prove” a self-attack.

I hope this helps! Please feel free to message me on the gov forum or Discord if you want to keep talking through this.

3 Likes

Hey @epowell -

In terms of our use of the ODC hackathon results - I hope you can understand that we’re a bit far away from being able to utilize multiple notebooks across multiple systems without any criteria in place to gauge quality. I know I gave this feedback last hackathon about the need for consistent auditing standards - and I gave several notes on where the analysis could be improved or explained in more standard ways for a wider data science quality assurance process.

I personally am not much worried about appeals - I think if you accidentally donated in a sea of airdrop bots, you’ll understand why we had to discount you :slight_smile: Our community is smart and savvy about these types of things, although we do act on “anomalous” behavior, not proven Sybils, in order to protect the rounds. I know the ODC members also rely on statistical analysis - and I’d be happy to discuss any areas where they think we can improve the “rules”-based logic we’re moving into.

Here were the rules we ran for this round:

  • Passport pass/fail
  • Passport manipulation attempt (attempting to add the same hashed credentials across several Passports)
  • Suspected bot (you and 44+ of your friends donated in identical ways within very close proximity - the same 3 seconds - of one another, or otherwise gave yourselves away as a “bot” army; see the sketch after this list)
  • You attacked us during the Alpha, Fantom, or UNICEF rounds and we saw your Sybil network attack again (this is relatively rare, but there are a few larger networks we are tracking that move in this way; almost all also failed the above Passport checks, to be clear, but we’re keeping an eye on these beasties)
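To give a flavor of how the “suspected bot” rule above works operationally, here is a simplified sketch (the field names, grouping key, and thresholds are hypothetical, not our production pipeline): group donations by amount and a coarse time window, and squelch any cluster where dozens of distinct wallets gave the identical amount within a few seconds of each other.

```python
# Simplified sketch of the "suspected bot" rule described above: flag
# donations where 45+ distinct wallets gave the identical amount within
# the same few-second window. Field names, the grouping key, and the
# thresholds are hypothetical, not the production pipeline.
from collections import defaultdict

MIN_CLUSTER_SIZE = 45   # "you and 44+ of your friends"
WINDOW_SECONDS = 3

def flag_suspected_bots(donations):
    """donations: iterable of dicts with 'wallet', 'amount', and 'timestamp' (unix seconds)."""
    clusters = defaultdict(set)
    for d in donations:
        bucket = d["timestamp"] // WINDOW_SECONDS   # coarse 3-second time bucket
        clusters[(d["amount"], bucket)].add(d["wallet"])
    flagged = set()
    for wallets in clusters.values():
        if len(wallets) >= MIN_CLUSTER_SIZE:
            flagged |= wallets                      # squelch the whole cluster
    return flagged
```

The real rules layer Passport checks and prior-round network tracking on top of this kind of pattern matching, but the basic idea is the same: anomalous, coordinated behavior gets silenced for matching purposes.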

The energy and work of the ODC members have been truly impressive, and I’ve enjoyed meeting and talking with so many of the analysts, students, and data scientists you and @baoki have gathered here. If you all have ideas on how we can improve the rounds, I’d love to talk.

I would also expect any Sybil networks or grantee self-attacks to be reported to us, though, and not saved for last-minute reveals… I can’t seem to find any record that anyone notified us of a network being found - let me know if you think we missed this somehow. We are examining better ways to surface community reports of round manipulation.

1 Like

@thedevanshmehta Actually you were 100% on to something here with the Mini Meadows grant.

By itself this doesn’t provide any indication of calculation issues or bad behavior. Like I said, since the Climate round had a low-ish turnout, 13 unique donors giving high amounts can and will produce a big match - the QF math is working correctly.

But at a glance - yeah, it seems weird. The match checks out, so it could be legit, but the numbers stick out like a sore thumb: Total Received USD is 3-4x higher than all the grants around it, and contributions are 3-4x lower. So it’s definitely a big outlier.

So I dug into the on-chain data manually a bit and discovered it’s pretty much all one big Sybil ring, potentially involving more grants and more donors as well. I’m frankly disappointed that it slipped through the cracks, given how blatant it was.

I’m going to propose (both internally to the DAO and to the community here) that we halt any next steps towards finalizing and paying out the round until we can sit back and focus on a deep dive into Sybils and fraud using robust on-chain data (ideally with more automated tools), leveraging Passport data much more, and using other methods to flag these patterns I’m seeing. It will be a collaborative effort across multiple Gitcoin workstreams, as well as something we’d love to get the community involved in. I’ve been DMing with @epowell101 about what tools created by the ODC can be utilized. So I’m hopeful we’ll learn from this process and can streamline these calculations going forward.

6 Likes

Hi there!

It’s Ger, the founder of Mini Meadows. Really disappointed to hear this. I’ve been developing and iterating on this idea for quite a while now to help people take real climate action.

As everyone in the environmental movement knows, it’s a massive challenge to elicit climate action and outside my professional role, I’ve made it my full time (& unpaid) job to empower others to take such action.

It’s actually quite upsetting that I’m at risk of being excluded before I’ve even had the opportunity to deliver anything from the project because people donated to the cause.

I canvassed with my own network and on twitter spaces alike. I even onboarded new people to web3 to make donations.

I didn’t expect to receive such donations, tbh, and it’s deeply disappointing to see this reaction from a community that is supposed to value solidarity and is looking for widespread adoption.

1 Like