Delivering here on my promise to share more details.
How to Calculate Cluster Match QF:
First, a quick review of simple QF:
- Sum the square roots of each individual’s contribution to a project
- Square that sum to get a per-project value
- Distribute the matching fund in proportion to the relative size of each project’s square (and enforce a matching cap so that no project takes too much of the pool by itself)
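The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_qf`, the `cap_fraction` parameter, and the example amounts are my own assumptions, and the cap here simply clips a project's match without redistributing the excess, which may not be how the cap is enforced in practice.

```python
from math import sqrt

def simple_qf(contributions, pool, cap_fraction=1.0):
    """contributions maps each project to a list of individual donation amounts."""
    # Steps 1 and 2: sum the square roots per project, then square that sum.
    squares = {
        project: sum(sqrt(amount) for amount in amounts) ** 2
        for project, amounts in contributions.items()
    }
    total = sum(squares.values())
    if total == 0:
        return {project: 0.0 for project in squares}
    # Step 3: split the pool proportionally, clipping at the cap.
    # Assumption: the clipped excess is left unallocated in this sketch.
    cap = cap_fraction * pool
    return {project: min(pool * s / total, cap) for project, s in squares.items()}

# Hypothetical example: both projects raise $4, but A has four donors and B has one.
example = {"A": [1, 1, 1, 1], "B": [4]}
uncapped = simple_qf(example, pool=100)
capped = simple_qf(example, pool=100, cap_fraction=0.5)
print(uncapped)
print(capped)
```

In this example project A's square is 16 and project B's is 4, so A receives four times B's match even though they raised the same amount. That is the broad-support effect QF is designed to reward.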
Next, cluster-match QF. Cluster-match QF orients matching funds around communities rather than individuals. The overall process is mostly the same, except that contributions are clustered together before their square roots are taken.
- Cluster donors based on their donation profile. A donation profile is defined as the set of decisions a donor made on each project: donate or don’t donate. Donors who made all the same decisions are clustered together.
- The contributions to a project from the same cluster are added together, as if they came from a single voting bloc. Then the square root of that total is taken.
- After that the process is the same: for each project, sum the square roots across all clusters, square the sum, and pay out the matching fund proportionally.
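To make the clustering step concrete, here is a small worked example in plain Python. The donors, projects, and amounts are hypothetical, chosen so the square roots come out whole, and the helper name `cluster_match_values` is my own. It is a sketch of the steps described above, not the production calculation.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical donors: alice and bob share a donation profile (A and B), carol gives only to A.
donations = {
    "alice": {"A": 2, "B": 8},
    "bob": {"A": 2, "B": 8},
    "carol": {"A": 9},
}

def cluster_match_values(donations):
    # Group donors by profile (the set of projects they gave to) and
    # add up each cluster's contributions per project.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for gifts in donations.values():
        profile = frozenset(p for p, amount in gifts.items() if amount > 0)
        for project, amount in gifts.items():
            if amount > 0:
                cluster_totals[profile][project] += amount
    # Take one square root per cluster per project, sum per project, then square.
    root_sums = defaultdict(float)
    for totals in cluster_totals.values():
        for project, amount in totals.items():
            root_sums[project] += sqrt(amount)
    return {project: root_sum ** 2 for project, root_sum in root_sums.items()}

values = cluster_match_values(donations)
print(values)
```

Here alice and bob count as one bloc, so project A gets (√4 + √9)² = 25 and project B gets (√16)² = 16. Under simple QF the same donations would give A about 33.97 and B exactly 32, so two donors with identical profiles earn less matching than two independent ones.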
In Code:
Thank you to @Joel_m for writing this Python function:
from math import sqrt

def donation_profile_clustermatch(donation_df):
    # Run cluster match, using donation profiles as the clusters,
    # i.e., everyone who donated to the same set of projects gets put under the same square root.
    # donation_df is expected to be a pandas DataFrame where rows are unique donors, columns are projects,
    # and entry i,j denotes user i's total donation to project j.

    # We'll store donation profiles as binary strings.
    # i.e., say there are four projects total. If an agent donated to project 0, project 1, and project 3,
    # they will be put in cluster "1101".
    # Here the indices 0,1,2,3 refer to the ordering in the input list of projects.
    projects = donation_df.columns
    clusters = {}  # maps each cluster to the total donation amounts coming from that cluster

    # Build up the cluster donation amounts.
    for (wallet, donations) in donation_df.iterrows():
        # Figure out what cluster the current user is in.
        c = ''.join('1' if donations[p] > 0 else '0' for p in projects)

        # Update that cluster's donation amounts (or initialize them if this is the first donor from that cluster).
        if c in clusters:
            for p in projects:
                clusters[c][p] += donations[p]
        else:
            clusters[c] = {p: donations[p] for p in projects}

    # Now do QF on the clustered donations: one square root per cluster per project.
    funding = {p: sum(sqrt(clusters[c][p]) for c in clusters) ** 2 for p in projects}
    return funding
More Numbers
Here are the calculation details including both matching formulas and pre/post squelching voter numbers and donation amounts.
The ‘base’ totals are the numbers after applying our basic rules: have a passport score over 20 and donate at least $1.
The ‘eligible’ totals are the numbers after applying our sybil squelching based on the rules stated above:
I’ll note that while pulling this data together I found a bug in how my data was being aggregated. I fixed it, and the fix changed the results. To me this underscored the importance of transparency. We need to move rapidly toward turning off post-round squelching and relying only on passport + better QF.