[Proposal] FDD Season 14 Budget Request

Pop · May 24, 2022, 3:15pm

Thank you for the continued collaboration on moving this forward and taking all the steward comments into account and really taking the steward feedback on board. It’s been one of my highlights of this season to really see us all collaborating to find a middle ground for how we move forward.

I am wondering if there is a combination between these two? I would opt for Option 4 but I am forever wanting innovation vs stagnation so what would be a balance here where we continue iterating models (maybe it’s ONE per season?) for improvement but without a 140k price tag? It is not an easy or pleasant task scrutinizing to this level but perhaps a better or more clear definition of what " models for making round to round improvements" may mean or entail?

DisruptionJoe · May 24, 2022, 5:02pm

Thanks for the meaningful and direct feedback.

Let’s say you are a sybil attacker. You make a bunch of accounts Simona1, Simona2, Simona3… This is a clear signal, but we wouldn’t see the usernames being closely related without

Noticing the behavior
Turning the insight into a data problem - A model
Creating a process for continually evaluating and updating the model

Here is the example of how we recently created another model for fighting off your attack: Fighting Simple Sybils: Levenshtein Distance - HackMD

Now we know how to detect it, but we need to either manual squelch those accounts or algorithmically do it. Algorithmic scales, manual does not.

To make this “feature” a part of the algorithmic detection we must go a step further. We can turn the detection of the behavior into a feature for the ml model. This is the overall algorithm that combines many features to determine if an account is sybil.

But what happens when the legit user “Simona69” gets caught in the heuristic? This is why we have humans in the loop. https://medium.com/vsinghbisen/what-is-human-in-the-loop-machine-learning-why-how-used-in-ai-60c7b44eb2c0

The humans will notice that Simona69 is not like the 68 before it. This info will then train the algo that the high quality signal DOES have exceptions.

The more humans that we have evaluate, the more opportunities for us to spot the algorithm being unfair to classes of users. The community model is finding new features and models to incorporate into what you might call a “meta-model” or ensemble which is best capable of accurately identifying sybil, and more importantly - NOT SYBIL accounts.

We have discussed moving the data science initiative to DAOops and maybe this is the time to do that. It helps FDD, but could be more beneficial to the DAO as a whole. The interesting part of connecting the group to FDD is that spotting sybil behavior is a lot like moving your Scrabble pieces around to see if you have a word. Sometimes we know exactly what data to request to build a feature, but spotting new behaviors is a mix of active analysis and serendipity.

We could lower the overall ask to $115k for the round to round improvements and ethical guardrails to be in place by transfering the data science budget to come from DAOops.

Pop · May 24, 2022, 6:50pm

This is really interesting actually and a trajectory I am seeing in other DAOs also - I wonder if there is enough understanding of the possibilities of DS for the DAO that would warrant a “DS as a service” function to be accessed by workstreams

krrisis · May 24, 2022, 8:57pm

Thanks for this updated proposal Joe & team. I’ll just copy what simona wrote here because I wholeheartedly agree:

A quick question on this:

… are you saying here that you would be counting on DAOops to take over these roles? Or am I misinterpreting this? Because this is not really in our current scope.

This is also not something we could take on at this time I fear, data analysis that is crucial to the DAO at the moment is FDD related, so I would propose for it to remain with FDD.

On my vote, I am unsure how I will vote at this moment and looking forward to more comments from others. I’d definitely vote for innovation, same comments as Simona.

Thanks for all your work on this, I know this is a tough process but it’s making our DAO more transparent + resilient and us all better at giving and receiving feedback. Appreciate all you do.

DisruptionJoe · May 24, 2022, 9:03pm

Nope. Saying that the role exists because DAOops requested us to have someone in that role. It shouldn’t be by storyteller though!

Option 3 is focused on improvements, but not necessarily innovation. Options 1 & 2 have innovation. The difference being that they are creating new value or systems rather than continually improving or tweaking the ones we have.

Based on your comment, I think you are most aligned with Option 3.

DisruptionJoe · May 24, 2022, 9:09pm

I think this is very insightful for anyone considering options 1 & 2 or looking at FDD as future drivers of dPoPP adoption. [S14 Proposal] FDD Season 14 Budget Request - #50 by danlessa

lefterisjp · May 25, 2022, 10:21am

Hello,

I want to echo other people’s comments and thank you for taking the time to incorporate feedback from the stewards in the budget proposals.

Before talking at all about the current options let me re-iterate the feedback I tried to give in the google doc you had DMed me which seems to have evolved to become this proposal.

What I would like to see come out of the FDD WG is an algorithmic software solution for evaluation of grants and sybil detection. With minimal human input. I do understand that human input will always be needed at some point but it should be kept at a minimum as it does not scale.

For simple manual reviews of grants which is what I understand Option 4 to be I think even Option 4 is very expensive. Manual review and classification of grants is a very simple work that can be accomplished even by someone from Upwork for $10/hr so to say.

Instead, and please please correct me as I may be understanding the budget wrong, what I see is:
A request of $222,975 for 4.8 months (3 months + 60% reserves) for 5 full time contributors.

That translates to a 222975 / 5 / 4.8 → $9290.625 per month for each employee to do “the bare minimum” which is what I understand is manual reviews.

This is an insane monthly salary for just manually reviewing grants.

That said what I want to see from FDD is to focus and iterate on delivering a software solution combined with minimal human input. Can you guys pull that off? So far I have not seen the most encouraging results.

I see lots of hard to understand jargon and no software yet.

I am at the moment not sure what to vote. I would also want to see what other stewards say.

ZER8 · May 25, 2022, 3:13pm

This is a very funny situation. Option 4 doesn’t mean only grant reviewing. On that logic everything that the other workstream do can be outsourced via Upwork? Am I wrong to assume this? I thought this was a DAO or is it a CAO in which we call it a DAO, but it’s just a corporation that’s online 24/7 ? Is that the vision? I don’t think so…

The FDD has multiple algorithmic solutions and the [Reward system] (GIA Rewards OKR Report) ( includes grant reviewers and is only one of the solutions).

The total cost of the grant reviews was exactly 14k last season and we protected millions of dollars of funds(more from an eligibility POV) . Do you think it’s expensive to do manual grant ?
reviews?

I don’t think you actually know the issues/challenges we face as the FDD and even more so in the GIA which is in charge of grant reviewing. 14K was the cost of 12.000-15.000 reviews! The FDD does not mean grant reviewing…

We could hire people from upwork but that will cost us even more…We will need to train each of them…takes a season or two. We actually pay our reviewers with 10-20$ per hour atm… On another note I really thought we had the ethos of decentralization here, but like reading your comments I kinda see through that “mirage”. Even the way in which we recruited the reviewers was fair and open for everyone. We recruited them chronologically in the order they reached out to us… and we wanted to keep things neutral and fair even within our squads.

Reading your comment is like all my work within the FDD for the past three-six months was null and I can’t be happy about that… Why are stewards not even looking at what we are building and just “hating” on us? Is that fair for the contributors here? Is that fair towards me? Who worked 60+ hours a week and 80+ during the round to make sure the grant creators, our donors and our matching funds are protected?(4 real, you can look in our Discord during the round and see what’s up). Associating the FDD with grant reviews nullifies all the hard work done by my peers.

If you really care about Gitcoin DAO, the grants program and our community please try to understand what major issues we face …we have entire organizations trying to game the system, we have hundreds of people that are creating fake grants…all while running the round, ensuring sybil protection and trying to solve “unsolved” research problems…

Those people are not reviewing grants, I am the driver of Round Management, only my squad handles grant reviewing and as I stated above that costs 10-20k MAX per round. This round we actually want to reduce that number with 50%, which is not fair for our contributors, but we need to reach a balance that satisfies the DAO.

That’s your truth, but is it the truth?

We can have a call anytime (takes 10 minutes) and I can explain and walk you though the whole process of grant reviewing and why we opted for this system which works amazingly btw

I’m sure that @DisruptionJoe @omnianalytics @Sirlupinwatson @David_Dyor and @vogue20033 can explain more about our collective efforts.

PS. I know and I completely agree what the DAO has some BIG issue. Parasitic behaviors are the threat to any organisation… The FDD is not the biggest spender and we work on one of the most complicated and complex set of problems. I don’t believe that by cutting our budget we will do the DAO a favor If people vote for Option 3 which is less than 400k they will see that we could and probably will continue to save funds(from our donors and matching funds) that are more than that number. Have a good day ser

Sirlupinwatson · May 25, 2022, 3:59pm

Hi there! In short,

Sorry about that, we will try to simplify the literature that we are using so you can understand what we are working on and what we are doing/trying to achieve.

Fundamentally, we have two working algorithm for Sybil Detection classification and one application that can automatically run prediction/analysis from the human-in-the-loop inputs (Human Evaluations)

“ASOP” → Anti-Sybil Operationalized Process or “SAD” Fraud-Detection-and-Defense (github.com)
Explained by @danlessa in depth here

Community Model

Software (You will need a username and a passcode to see it, if you are interested let me know)
@omnianalytics

lefterisjp · May 25, 2022, 3:59pm

The FDD is the biggest spender. (source: https://dune.com/ivanmolto/Gitcoin-DAO)

Those people are not reviewing grants, I am the driver of Round Management, only my squad handles grant reviewing and as I stated above that costs 10-20k MAX per round. This round we actually want to reduce that number with 50%, which is not fair for our contributors, but we need to reach a balance that satisfies the DAO.

Please describe in detail to me then what “bare minimum” is and where each part of the requested $222,975 goes where so I can make an informed decision.

You seem to have some kind of misunderstanding that it’s my or other steward’s fault for not understanding your problems or for not diving deep enough into what you are doing.

I am getting paid nothing to do this. You are asking money from the DAO and I am asked to say yes or no. If you want a yes you will have to keep it simple, keep it very short and explain where the money goes in enough detail so I can make an informed decision if the ask is too much or not.

If you give me to read very long essays I will simply default to saying no.

lefterisjp · May 25, 2022, 4:04pm

Thank you Armin.

The link to Github is broken. I assume because the repository is private?

https://github.com/Fraud-Detection-and-Defense/CASM <— this one.

Software (You will need a username and a passcode to see it, if you are interested let me know)

Yes that would be cool to try it out. Good idea. But first I would like to see code. I am a developer so that’s what I can judge best.

Sirlupinwatson · May 25, 2022, 4:10pm

Yes, these are private repo, you can request an access to @omnianalytics (OmniAnalytics#5482) on discord, just send him your GitHub username then he can add you in the group.

For the “ASOP” “SAD” @DisruptionJoe can add you in the group as well.

lefterisjp · May 25, 2022, 4:14pm

I will do that. But why private? Since we are all about opensource software here, I would expect what the DAO produces to be opensource too.

danlessa · May 25, 2022, 5:09pm

The repos contain sensitive information, like the specific features being used for sybil detection and how they’re retrieved. Making those open source will make obvious to the attackers to what look into in order to avoid detection. I don’t see a hard blocker for opening it up, but this should be part of an involved discussion (which FDD is already doing a lot) and the precautionary principle should apply.

Also, the SAD codebase is as of now a monolithic piece of software, which makes not trivial to separate the sensitive stuff into separate repos. It’s definitely doable as a part of the FDD evolution to develop into that, and opting for increasing community involvement on Sybil Detection will definitely make that highly desirable.

ZER8 · May 25, 2022, 5:26pm

I was talking about this season

ZER8 · May 25, 2022, 5:29pm

I understand, I’m actually upset because I’m sure we both have Gitcoins DAO best interest at hearth, Our solutions are not that easy to comprehend because the threats are the same.
Maybe next season there could be a Steward Budget call in which all the workstreams present their needs and accomplishments with their previous budget

The simplest way of seeing what we are doing is the diagram:

DisruptionJoe · May 25, 2022, 5:54pm

Here is another way to view the FDD work

Grant Eligibility = Content Moderation

Round execution - Reactive operational execution (All Round Management)

Two reviewers per grant minimum x 1000 new grants
Training of new reviewers - facilitating learning and communication during the round
Manual documentation of flags/disputes
Judgements on disputes
Manual documentation of appeals
Facilitation of conversations around appeals and judgements

Short Term Improvements - Proactively improving for the next round (Mandate Delivery, Data Science)

Ethelo devops support to bring down per review cost - spend $10k, save $10k per round
Gather better data from the approvals process
Training
Research on Kleros, Celeste and staking ideas (We had this on the roadmap 2 seasons ago, but then the dissolving of dGrants was a surprise and no one included us in the product conversations around building grants 2.0 so we thought we were supposed to maintain course and figure out our own solutions for grants 2.0)

Medium Term Connecting Current Course & Speed to Future State (Trust Builders)

Using review data from ethelo to run simulations of the decentralized protocol for reviewer reputation which would create stamps in Passport.
Software - https://github.com/dRewardsSystem/Rewards/commit/6eeaebf3584d21fbbdfcc42e580b870b4e75ea22
Description of software - GIA Rewards OKR Report

Long Term Vision - Future State = An ethically values aligned and sustainable solution (Trust Builders w/ Sybil Detection DAO & Passport)

Passport is used by everyone on earth - They have an option to participate in sybil hunting and grant curation for reward at anytime
The system doesn’t BAN sybils and fraudulent grant creators, it instead only allows them to play with each other
We avoid the web 2 moderation trap of becoming addicted to lean and inexpensive (but easily corrupted) delegated authority
We have built a system that offers communities the potential to choose “community curation” which is a decentralized review process aka a system where they can’t do wrong as opposed to shouldn’t do wrong
Stamps from this system are HIGH quality non-sybil signals

Sybil Defense = User Moderation

Round execution - Reactive operational tasks (Mostly SAD squad & Human Evals)

Run ASOP algorithm to identify sybil accounts
Push info to gitcoin backend for sanctioning/squelching (114,000 contributions out of 500k total in GR12)
Enough human reviews to statistically validate the model is working properly (>1,500 but decreasing returns over 8,000 or around there currently)
Enough human reviews to identify new behaviors of attackers (More is better as long as they are putting in genuine human subjective answers and not just using a “rubric”)
Enough human reviews to disperse bias across reviewer geographies, cultures, race, sex, etc.
(rather than programming the bias of the engineers)

Short Term Improvements - Proactively improving for the next round (Data Science & Community Model)

Identifying the high confidence sybil users (and known not sybil) and analyzing for correlations to Passport stamps
Turn Passport stamp correlations into features
Continue work identifying sybil behavior classes and new features
Analyze human evaluations for inter-reviewer reliability

Medium Term Connecting Current Course & Speed to Future State (Community Model, Mandate Delivery, Data Science)

Remove all cgrants backend data from algos - Use only non-pii publicly available (On chain) inputs
Identify long term value patterns that can prove the cost of forgery
Continue algorithmic sybil defense for communities that need it early. Although gitcoin is building a very long term valuable solution with passport, someone still needs to read the data AND we need to compare the results of our hypothesis that dPoPP will continue to solve the problem and not be gamed. We should not lose the current working system until we have tried to falsify the new hypothesis with the best data available!

Long Term Vision - Future State = An ethical & values aligned sustainable solution (Community Model, Sybil Detection DAO, Trust Builders)

Sybil Detection DAO decentralized user moderation - NOT by having an ever expanding set of human evaluators, but by using the machine learnin to scale the human subjectivity. Algorithms hold unknown amounts of bias. Keeping humans in the loop is an ethical solution in line with Gitcoin’s values.
Dynamic reading and peer predictions
A high values rules based system designed and dynamically updated using crowdsourced data analysis (Community model)
Large ownership stake in digital public infrastructure for Gitcoin & aqueducts!!!

Note: Evo - oXS & Evo - Ops are both operational functions. The first being our decision making, meetings, calendar updates, internal comms, etc. The latter being the roles with DAO ops has requested each workstream have and our payments.

DisruptionJoe · May 25, 2022, 6:14pm

Another thing that might help to understand FDD is it is all OUTCOME based organization.

Sybil Defense

Are we protecting grants from sybil attacks in a legitimate way that empowers the voices of the many (Its this part about empowering voices of many where @kyle and I disagree on whether community model and human evals should continue. I agree there is a high value system being built, but it is not there yet. By putting in the evaluation work now, we can better attempt to disprove the hypothesis that the new and untested high-value system Passport creates will be ungameable)

This round
Next round
Making moves in line with a grants 2.0 future
Identifying assumptions and risks
Reducing system bias
Decentralizing the inputs where it solves a specific problem - not for the sake of “decentralization”

Grants Intelligence Agency

Does Gitcoin grants provide maximum credible neutrality in the way it moderates content? (aka grant eligibility)

Executing reviews, disputes, appeals for the current round
Developing MVPs like work with Ethelo which will scale our reviews
Get better data on reviews to scale even further
We will still have to curate the main gitcoin rounds with grants 2.0 protocol
Should we accept that the content moderation on web 3’s premier value network uses delegated authority a la Twitter/Facebook style unless someone else is willing to invest in a decentralized grant review protocol or should we offer the “can’t do wrong” system out of the box? If our mission is to launch grants 2.0 protocol, you might consider this scope creep. If you think the gitcoin mission is to help communities build and fund their shared needs, it is probably essential.

Evolution

Are we consistently re-evaluating and improving

Staying ahead of the “red team”
Setting metrics which aren’t subject to Goodhardts law
Aligned and focused internally
Accountable to each other
Have the needed context and info

I would really love very direct feedback from @lthrift (Product Director) @kevin.olsen (VP of eng) @brent (Passport PM) and @nategosselin (Grants PM) to see which specific points are not aligned here. Let’s have a productive conversation!

I am also asking for us to be involved in co-creating a future. I don’t understand why FDD was only involved in the grants 2.0 planning in discovery workshops where the product team for grants 2.0 learned what FDD was currently doing, but not in any future planning where we could have put these ideas forward and course corrected together at an earlier point in this process.

In fact, I put the proposal forward to CSDO that we needed to get on the ball, hire Sam and The Ready to discuss and share our plans before the budget proposals were due.

lefterisjp · May 25, 2022, 7:31pm

Daniel I understand that and it’s a sufficient explanation to me for having it private for now.

Try to work towards making it opensource eventually at some point soon, perhaps by separating the sensitive paremeterization out in a private repo.

DisruptionJoe · May 25, 2022, 7:43pm

Yes. This is the plan with the community model squad. We will figure out the Passport (dPoPP) usage patterns of known sybil accounts to identify features to use in creating the Gitcoin Passport Reader score which would be exported to dapps that don’t want to roll their own algo.

Unfortunately, we only have a couple seasons to do this while the Passport is being used on cGrants before the grant registry and round manager parts of the protocol are rolled out.

The idea being we can move from private repos using some PII data to open repos using only publicly available on chain data.

We know that the community values decentralization, privacy, consent, etc. We have had the same outcomes all along and have adjusted with the new plans for grants 2.0. At this point, the community has been cheering us on as we built the proper tools for the job, but they aren’t all finished.

We thought the timeline was to be able to launch by the time grants 2.0 launches.

Now, if we are “defunded” it is a catch-22. People can say, “see they didn’t build anything with the money”, but in reality we did a lot of research and iteration to understand the problem space and have started on building in the solution space. All along we were executing the necessary minimum to keep the rounds going.

The reason I shared the documents around Trust Builders and the info about the community model and SDD above is because the evidence is tangible. If we stop now, Gitcoin community will provide a grants 2.0 protocol with all its eggs in one basket on an untested sybil defense hypothesis and throw away its investment in other solutions that work today, against the advice of those directly working on the problem, even though those people are agreeing that it is a great hypothesis that we should test!

Please discuss with the data scientists and people working on the problem before making a decision.

It would also leave a 3/4 built skyscraper behind where people could say, “see, GitcoinDAO couldn’t provide a protocol without single points of failure and they wasted all their money on this thing”… but it might just be that finishing building that skyscraper would have removed the single points of failure. (skyscraper = the sybil work done so far)