Data Science @ GitcoinDAO

I think there is so much potential in investing in data science at Gitcoin.

This is informed by a few things:

  • Prior Experience. A few years ago, I was the Director of Engineering at a clean energy startup that set up a fairly robust data warehouse ETL/snowflake schema system to run advanced analytics on time series data. I’ve also been in a few different product-oriented positions at various web2 startups that had mission-critical ecommerce checkout flows, with A/B testing and marketing emails to optimize those funnels. This one time, when I was CTO of an online dating site (a double-sided marketplace, just like Gitcoin), I built a matching engine that matched users on 20 dimensions.
  • Per gitcoin.co/results, Gitcoin has helped 66,712 funders reach an audience of 292,817 earners. Gitcoin has facilitated 1,740,075 complete transactions to 10,247 unique earners. Looking at the 4 years of data at Gitcoin, particularly the Grants Rounds data, gives me a hunch that there are interesting opportunities in understanding it.

The objective of this thread is to start a conversation. What should the data science practice at Gitcoin look like?

Here are the data science opportunities I’m aware of at GitcoinDAO:

  • Product Analytics & Data Science
    • Responsible for understanding how users use the platform.
  • Marketing Analytics & Data Science
    • Responsible for understanding how to drive more core actions (like Grants checkouts)
  • Complex Systems Insights
    • Responsible for guiding the QF matching engine with deep analytical insights (perhaps one day even simulating agent-based contributor behaviour)
    • Responsible for publishing advanced analytics-based insights from our datasets. Here’s an example of what this could look like.
  • Fraud insights
    • Responsible for (Joe, correct me if I’m wrong) surfacing fraud on the Gitcoin Grants network (whether sybil or collusion) and partnering with Governance to remediate in a legitimate way.
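For readers less familiar with the QF matching engine mentioned above, here is a minimal sketch of the standard quadratic funding formula: a grant's raw match is the square of the sum of the square roots of its contributions, minus the contributions themselves, and the matching pool is then split in proportion to those raw values. The function name `qf_match`, the sample grants and the pool size are illustrative assumptions. Gitcoin's production engine is not described in this thread and may apply further adjustments that are not shown here.

```python
from math import sqrt


def qf_match(contributions, matching_pool):
    """Split a matching pool across grants with the basic QF formula.

    contributions: dict mapping a grant name to a list of amounts,
                   one amount per unique contributor (assumed input shape).
    matching_pool: total funds available for matching.
    """
    raw = {}
    for grant, amounts in contributions.items():
        # (sum of square roots)^2 minus the amount contributors already gave
        raw[grant] = sum(sqrt(a) for a in amounts) ** 2 - sum(amounts)

    raw_total = sum(raw.values())
    if raw_total == 0:
        # Nothing to match (e.g. every grant has a single contributor)
        return {grant: 0.0 for grant in raw}

    # Scale the raw matches so they add up to the pool
    return {grant: matching_pool * r / raw_total for grant, r in raw.items()}


# Hypothetical data: both grants raised 100 in total,
# but grant_a has 100 small contributors and grant_b has one large one.
example = {"grant_a": [1.0] * 100, "grant_b": [100.0]}
matches = qf_match(example, matching_pool=1000.0)
print(matches)
```

The example is chosen to show why the number of contributors matters more than the amount raised: the broadly supported grant receives the whole pool, while the single-donor grant receives no match.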

An assortment of tools are used in these practices at Gitcoin. Here are the ones I’m aware of:

  • Etherscan
  • Dune Analytics
  • The Graph
  • PostgreSQL
  • Google Analytics
  • Metabase
  • Google spreadsheets
  • Google presentations
  • Acquia
  • cadCAD
  • Machine Learning Tools (not sure which ones)

I’d welcome corrections from any workstream leads on the above. It is just my best approximation of the tools/roles as they currently stand in the DAO.

I’d be curious if people in the community would be interested in putting forward a proposal to the DAO to formalize a data science practice at GitcoinDAO (which currently resides in multiple different groups at varying levels of coordination)

I’d like to end on these questions:

  1. What should the data science practice at Gitcoin look like?
  2. If Data Science was an area of practice at Gitcoin, what would it look like?
  3. How could it span multiple workstreams or squads (teams) & cross-pollinate between them?

9 Likes
  1. Have a dedicated analytics person working with every team (MMM, DAOops). While central teams have their advantages, core teams will either always be stuck in a prioritization queue or forever unaware of how data can help, and won’t be as impactful as they could have been with data.
  2. It might also be fair for contributors to expect to know where their grants are being put to work. An opt-in visibility tool could also help set expectations. While we have FDD for flagging and preventing fraud, we might need a function to raise the bar on best practices for governance and fund utilization using data.
3 Likes

This would be so incredibly helpful. We are dramatically underinvested in this area. We have a wealth of incredible data that we can and should be using to better understand how we can better serve our stakeholders and increase our impact. MMM is just starting to scratch the surface here with work that we’re doing in the growth substream but we have a long way to go. Getting support from a DAO-wide data science team is essential to our work and our ability to fulfill our mission.

4 Likes

For the time being, I think a lot of work is being done in silos. And just as you say, more cross-pollination between the people who currently work with data could greatly benefit our products.

Data that I use day to day includes:
Traffic data
Grants performance (add to carts, checkouts)
Mobile performance
Monitoring traffic sources
Campaign monitoring (newsletters etc)

A powerful link would be to compare this data with grants data from Metabase to identify new opportunities for growth.

How do we structure all this? Maybe:

  1. A data scientist who can identify interesting data and create datasets based on our grants data, and do all the queries.
  2. A Google Analytics (or similar tool) wizard who monitors user behavior and runs A/B tests to continually improve our product.
  3. A product manager. 1 and 2 should work closely with the PM to understand their product better and identify pain points in the funnel and where there is potential to improve.

One pitfall we could encounter is creating fantastic datasets and insights but having no idea how to act on them. Identifying the right stakeholders and getting everyone on board will be important for the project’s success.

This also needs to align with the Web 3.0 ethos to collect as little data as possible to be able to do our job. I think transparency and inclusion of the GitcoinDAO would be crucial to be able to do this well.

6 Likes

I’d love to see us invest more in this area. My .02 gwei would be to separate the idea into two pieces:

  1. How do we build a scalable, modular, anti-fragile data backbone across GitcoinDAO?
  2. How do we encourage a culture of using data across the opportunities within the DAO?

In my mind, #1 should be a data-focused product team with the remit to build a scalable data warehouse/system that can make data more easily available to analyst-level data users. In other words, how do we knit all of our data sources together and let SQL users get value out of it (given they have the business context)?

#2 feels like building a group of analysts and scientists (i.e. more advanced data users) that can be embedded on specific workstreams. Maybe it lends itself to a guild model like the Gitcoin Product Collective?

4 Likes

it’s interesting you post about this, i was just thinking about the feasibility of launching an AI workstream, similar to moonshot collective.

this is what the matrix squad in FDD is working on btw.

i think this point should be wrestled with a little more. one interesting benefit of having a dedicated AI workstream would be that these services could be provided, with the goal of creating data unions/revenue streams (carefully of course to avoid negative incentivizations).

if we did have an AI workstream, it should play a supportive role, facilitating collaborations and acting as a service to the DAO.

this is another benefit, the ability to cross-correlate data for insights.

and for the data that we do collect, the reports should be openly available.

beautifully put.

3 Likes

Having Data Science as an Area of Practice, sometimes also called a Community of Practice, is a common pattern in larger web2 IT organizations. However, this DS-Area would not do any particular work for projects, OKRs or initiatives. Its goal is to reflect upon and improve its members’ craftsmanship by learning from each other and presenting or socialising at events. If they get really productive, they might develop and provide trainings for other Areas. For example: enabling developers so they learn from Data Scientists.

There is this idea of “Team Topologies” out there that could help us design and visualize how we work together. It could also help us discuss several options for what Data Science could look like and how it interacts with others.

3 Likes

@ivanmolto from Builderband helped the MMM Workstream create a Dune Dashboard with an overview of the GitcoinDAO governance and finances. These visuals greatly increase the transparency of our DAO - and they look super cool!
Feel free to reach out with feedback.

3 Likes

100% support this

Giving people access to quality data, with a data dictionary, and the ability to use tools like SQL and Python will probably drive them to generate insights on their own. Data-savvy contributors who want answers will look for them – they just might need the infrastructure/access to do so.

This is what AWS is really good at. Although it is a centralized technology, it makes it easy to keep your data somewhere people can find it and run more advanced analytics on it using tools like SQL and Jupyter Notebooks. It also allows for granular role-based access control. I don’t know what the decentralized alternative is but would love to

2 Likes

Great work here. I’d love to see the number of delegators each steward has delegating to them as well. Something like this:

6 Likes

The number of delegators has been added to the table. @ivanmolto has also been working hard on an additional deep dive into our Governor Alpha setup.

Some very interesting data can be found here: Dune Analytics

4 Likes

was just messing around with breadcrumbs.app tonight and was able to come up with this sankey diagram that allows ppl to follow the flow of GTC out of the timelock + to the workstreams + to their contributors. pretty neat!

ivan, would love to show u breadcrumbs.app sometime.

3 Likes

This is cool! We have discussed creating a subgraph for the Steward Health Card-project as a potential metric of engagement. The theory being that Stewards receiving GTC less than 4 hops from the multisig is a strong indicator of engagement. Especially if it’s occurring month after month.
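The hop-count idea above can be illustrated without a subgraph. Below is a minimal sketch, assuming the GTC transfers have already been exported as (sender, recipient) pairs: a breadth-first search from the multisig gives the smallest number of transfers needed to reach each address. The function name `hops_from_source`, the address labels and the sample transfers are hypothetical, and this is not the Steward Health Card implementation.

```python
from collections import deque


def hops_from_source(transfers, source):
    """Return the minimum number of transfers from `source` to each reachable address.

    transfers: iterable of (sender, recipient) pairs (assumed export format).
    """
    # Build a directed graph: sender -> set of recipients
    graph = {}
    for sender, recipient in transfers:
        graph.setdefault(sender, set()).add(recipient)

    # Breadth-first search visits addresses in order of hop distance
    hops = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for recipient in graph.get(current, ()):
            if recipient not in hops:
                hops[recipient] = hops[current] + 1
                queue.append(recipient)
    return hops


# Hypothetical transfer chain starting at the multisig
transfers = [
    ("multisig", "workstream"),
    ("workstream", "alice"),
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
]
hops = hops_from_source(transfers, "multisig")

# Stewards fewer than 4 hops from the multisig; "erin" never received GTC
stewards = ["alice", "dave", "erin"]
engaged = sorted(s for s in stewards if hops.get(s, float("inf")) < 4)
print(hops)
print(engaged)
```

To capture the "month after month" part of the theory, the same function could be run once per month on that month's transfers only, which is an extension not shown here.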

1 Like

Yes, of course. I would love it. I didn’t know about this tool. The result is visually outstanding!

2 Likes

this is great - I would love to use some of this for the workstream and treasury health dashboard to be added as DAO tools alongside the steward health cards

3 Likes

@ivanmolto and I have been looking into the correlation between GTC outflow from GitcoinDAO and GTC spot price. A common belief is that Workstreams compensating contributors with GTC would somehow negatively impact the price of GTC. We put this theory to the test through a number of queries in this Dune Dashboard to see if it holds any water.

In order to produce a meaningful analysis we couldn’t just track GTC outflow from Workstreams and compare that to GTC price. To accurately test this hypothesis we need to take one step further and track GTC outflow from contributors in relation to price.

A common conception about contributors receiving compensation in GTC is that they will immediately sell their tokens for other assets, causing a massive sell wall and negative price movement as a result.

After analyzing 269 unique contributors over time we can see that this is far from the truth. The data show that contributors of GitcoinDAO are likely to hold their GTC and the average amount of GTC held by this group is increasing. The first figure below shows the number of contributors who hold more than 0, 100 and 1000 GTC respectively. The second figure displays the amount of GTC held (x100) in total by this group as well as their average GTC holdings. These metrics are all increasing.
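The holder metrics described above are simple to compute once per-contributor balances are available. The following is a minimal sketch in Python rather than the actual Dune queries, which are not reproduced in this post. The function name `holder_stats`, the addresses and the balances are made-up assumptions for illustration.

```python
def holder_stats(balances, thresholds=(0, 100, 1000)):
    """Summarise GTC holdings for a group of contributors.

    balances: dict mapping a contributor address to its GTC balance
              at one point in time (assumed input shape).
    Returns (count of holders above each threshold, total held, average held).
    """
    counts = {
        threshold: sum(1 for balance in balances.values() if balance > threshold)
        for threshold in thresholds
    }
    total = sum(balances.values())
    average = total / len(balances) if balances else 0.0
    return counts, total, average


# Hypothetical snapshot of four contributor balances
balances = {"0xaaa": 0.0, "0xbbb": 50.0, "0xccc": 250.0, "0xddd": 1500.0}
counts, total, average = holder_stats(balances)
print(counts, total, average)
```

Running this on one snapshot per day or week would give the time series that the two figures plot.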

We could not find a correlation between GTC outflow from Workstreams and GTC price movement:

As we looked further into what could impact GTC we found a correlation between GTC and other governance tokens. In general, GTC seems to be following the movements of the market sector. The figure below shows the price movement of other governance tokens and GTC, relative to their price on May 25th 2021:

In conclusion, the hypothesis that sending GTC to Workstreams who then pay contributors would negatively impact GTC price cannot be proven through the data. GTC is, however, correlated with similar tokens in the sector.
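For anyone who wants to repeat this kind of comparison outside Dune, here is a minimal sketch of the two steps involved: indexing each price series to its value on a base date, and measuring the Pearson correlation between two series. It is not the method behind the dashboard, whose queries are not shown in this post. The function names `index_to_base` and `pearson` and all of the prices are illustrative assumptions.

```python
from math import sqrt


def index_to_base(prices):
    """Express each price relative to the first one (the base date)."""
    base = prices[0]
    return [price / base for price in prices]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Made-up prices for GTC and another governance token on the same four dates
gtc_prices = [10.0, 8.0, 6.0, 9.0]
other_prices = [200.0, 160.0, 120.0, 180.0]

gtc_indexed = index_to_base(gtc_prices)
other_indexed = index_to_base(other_prices)
correlation = pearson(gtc_indexed, other_indexed)
print(gtc_indexed, correlation)
```

A coefficient near 1 or -1 indicates that the series move together or in opposite directions, and a value near 0 indicates no linear relationship. The same `pearson` function could be applied to a series of GTC outflows and a series of price changes.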

We encourage you all to look through the dashboard and we welcome feedback on ways to improve these further. A couple of additional visuals and tweaks will be added to the dashboard in the coming days.

Enormous shoutout to @ivanmolto for the amazing job.

Our two previous dashboards can be found here:
GitcoinDAO: Governance & Financial overview
GitcoinDAO: Governor Alpha & Timelock

10 Likes

The same to you, @Fred. It has been amazing working side by side with you. I hope you all enjoy the insights!

2 Likes

Great to see some data! Thanks for taking the time and effort to put this together.

After analyzing 269 unique contributors over time we can see that this is far from the truth.

Does this mean the data set is limited to these 269 individuals over the entire time frame? So any contributor that may have joined in Feb 2022 would not be counted?

In conclusion; the hypothesis that sending GTC to Workstreams who then pay contributors would negatively impact GTC price can not be proven through the data.

Assuming the population is constant throughout the period, the conclusion is undeniable. But concerns right now are more forward looking – it’s easy to hold an appreciating asset, especially given most contributors have likely been sitting pretty thanks to the wider market during this time. It will be interesting to revisit this after each round and see how the trend develops in response to wider market turmoil.

1 Like

I could have clarified the “269” number a bit better.
269 is the current number of total contributors we are able to identify on-chain.

On May 25th 2021, when GitcoinDAO launched, we had 0 on-chain contributors. Now we’ve reached 269. A visual for the number of total contributors over time can potentially be added to the dashboard for clarity!

In my opinion we have seen some pretty big market turmoil in the last months but I absolutely agree, it will be very interesting to see how things evolve. Thanks!

This is great @Fred and @ivanmolto
It would also be great to see whether or not rounds exhibit any trend in the price of GTC!

3 Likes