In which I pretend to know how to do statistics or data analysis, but really hope to spur people into pointing out what I'm doing wrong =)
Methodology:
1) Run two bwauths on the same box
2) Collect votes (and raw bwauth votes where available) for a month from all bwauths (including the duplicate one)
3) Stick the tuple (vote timestamp, relayid, bwauthid, bw value) into a database
4) Measure the percent difference in bandwidth for the same relay at the same vote time between two bwauths as abs(r1.bw - r2.bw) / ((r1.bw + r2.bw) / 2)
5) Average that value over all relays for a given vote time to get the overall percent difference for that vote time
6) Graph it for multiple bwauth pairs
7) Get a slightly different flavor of the above value by limiting it to relays with a bandwidth value of at least 100, to see if there are any differences
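A minimal Python sketch of steps 4, 5 and 7, working on plain tuples rather than the database. The function names, the example rows, and the choice to require both bwauths' values to meet the threshold are assumptions made for illustration; the repository's actual query may differ.

```python
from collections import defaultdict

# Made-up example rows: (vote_time, relay_id, bwauth_id, bw)
EXAMPLE_ROWS = [
    (1, "relayA", "auth_a", 100),
    (1, "relayA", "auth_b", 300),
    (1, "relayB", "auth_a", 50),
    (1, "relayB", "auth_b", 50),
    (2, "relayA", "auth_a", 200),  # only one bwauth measured it: skipped
]


def percent_difference(bw1, bw2):
    """Symmetric percent difference: |bw1 - bw2| / mean(bw1, bw2)."""
    mean = (bw1 + bw2) / 2
    if mean == 0:
        # Assumption: two zero measurements count as perfect agreement.
        return 0.0
    return abs(bw1 - bw2) / mean


def average_disagreement(rows, auth_a, auth_b, min_bw=0):
    """Return {vote_time: mean percent difference} for one bwauth pair.

    Only relays measured by both bwauths at the same vote time are used.
    Assumption: a relay passes the min_bw filter only if both values do.
    """
    # Group measurements by (vote_time, relay_id).
    by_key = defaultdict(dict)
    for vote_time, relay_id, bwauth_id, bw in rows:
        by_key[(vote_time, relay_id)][bwauth_id] = bw

    # Collect the per-relay differences for each vote time.
    diffs = defaultdict(list)
    for (vote_time, _relay_id), votes in by_key.items():
        if auth_a in votes and auth_b in votes:
            a, b = votes[auth_a], votes[auth_b]
            if a >= min_bw and b >= min_bw:
                diffs[vote_time].append(percent_difference(a, b))

    # Average over relays within each vote time.
    return {t: sum(v) / len(v) for t, v in diffs.items()}
```

Calling average_disagreement(EXAMPLE_ROWS, "auth_a", "auth_b") gives one value per vote time; passing min_bw=100 gives the "at least 100" flavor from step 7.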
Code:
Download and archive votes and raw bwauth votes: https://github.com/tomrittervg/bwauth-tools/blob/master/download_files.py
Cron it up so it runs every hour at 15 minutes after the hour.
MySQL Database schema: https://github.com/tomrittervg/bwauth-tools/blob/master/schema.sql
Process the files and put them in the database: https://github.com/tomrittervg/bwauth-tools/blob/master/files_to_database.py
Run the last query to get the data: https://github.com/tomrittervg/bwauth-tools/blob/master/queries.sql
Plot it: https://github.com/tomrittervg/bwauth-tools/blob/master/plot_data.py
Optionally run something like grep -vF '\N,\N,\N,\N,\N,\N,\N,\N' query.csv > non-blank-lines.csv first to drop the rows where every value is NULL, or change non-blank-lines.csv to query.csv in the script.
One month of data is 29,194,717 rows, or ~4GB, so you'll want to run the query and the plotting overnight.
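For readers who don't want to open queries.sql, here is a rough sketch of what a per-pair query of this kind can look like: a self-join on vote time and relay, then an average per vote time. It uses an in-memory SQLite database instead of MySQL, and the table name, column names, and example rows are all hypothetical. This is not the query from the repository.

```python
import sqlite3

# Hypothetical schema: one row per (vote time, relay, bwauth) measurement.
SCHEMA = """
CREATE TABLE vote (
    vote_time INTEGER,
    relay_id  TEXT,
    bwauth_id TEXT,
    bw        INTEGER
)
"""

# Self-join: pair up the two bwauths' values for the same relay at the
# same vote time, then average the percent difference per vote time.
PAIR_QUERY = """
SELECT r1.vote_time,
       AVG(ABS(r1.bw - r2.bw) / ((r1.bw + r2.bw) / 2.0)) AS pct_diff
FROM vote AS r1
JOIN vote AS r2
  ON r1.vote_time = r2.vote_time
 AND r1.relay_id = r2.relay_id
WHERE r1.bwauth_id = ?
  AND r2.bwauth_id = ?
  AND r1.bw >= ?
  AND r2.bw >= ?
  AND r1.bw + r2.bw > 0   -- avoid dividing by zero
GROUP BY r1.vote_time
ORDER BY r1.vote_time
"""


def pair_disagreement(conn, auth_a, auth_b, min_bw=0):
    """Return [(vote_time, mean percent difference), ...] for one pair."""
    params = (auth_a, auth_b, min_bw, min_bw)
    return conn.execute(PAIR_QUERY, params).fetchall()


def demo_connection():
    """Build an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO vote VALUES (?, ?, ?, ?)",
        [
            (1, "relayA", "auth_a", 100),
            (1, "relayA", "auth_b", 300),
            (1, "relayB", "auth_a", 50),
            (1, "relayB", "auth_b", 50),
            (2, "relayA", "auth_a", 200),  # no auth_b row: drops out of the join
        ],
    )
    return conn
```

At tens of millions of rows, a composite index covering the join columns (here vote_time and relay_id) would probably shorten the overnight run, though that is untested against the real data.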
Conclusions:
The data!! https://raw.githubusercontent.com/tomrittervg/bwauth-tools/master/data.png
x axis: unix epoch of the vote time
y axis: percent disagreement between maatuska's bwauth and the indicated bwauth
The two bwauths run from maatuska agree quite consistently. However, they show the only instance of DISagreement between the 'all' set of relays and the '>100' set.
maatuska and moria agree eerily closely.
two bwauths agree with each other by about the same amount consistently over time, although how much they agree depends on which two bwauths are being compared
the 'base' agreement, i.e. how closely two identical bwauths agree, is a percent difference of around 35%
there is not much difference between 'all relays' and 'relays > 100'
Follow Up:
Is my data correct? maatuska and moria being that close together is suspicious.
Compare moria<->longclaw, moria<->faravahar, and longclaw<->faravahar
Change the bandwidth limit to >1000 and see if that produces more or less disagreement
I can start applying bwauth patches and running the new bwauth code from the same vantage point, then use this methodology to determine whether the new code has altered the results. IF it has altered the results: are they better results or worse? Don't know!
I should measure the disagreements between bwauths when grouping relays by country.
Because I have raw vote data from maatuska and moria, I can test the hypothesis that relays fall between gaps in scanners. Maybe. I could probably get positive proof this occurs but maybe not proof it _doesn't_ occur.
-tom