[tor-dev] CAPTCHA Monitoring Project's Dashboard

Barkin Simsek barkin at nyu.edu
Tue Aug 11 20:40:24 UTC 2020

I messed up replying to my own thread because I receive the list
emails as a digest. Sorry for creating a duplicate thread. This email
is a reply to Georg's message in the original thread.

> Thanks for that wiki page. That's been really helpful. One thing I kept
> wondering while reading through the different graph descriptions with
> the default graph style in mind:
> So, what's the tick shown per day then? One consensus somehow picked or
> data for all consensuses for that day? Or...?

I initially wanted to have per-day data points on the graphs, but I
later realized that this complicates the calculations a lot (and
increases the probability of making a mistake), since some relays
show up and disappear, and their exit probabilities change,
throughout the day. So I decided to calculate the CAPTCHA rates for
each hourly consensus instead. There will be 24 data points per day
(one per consensus) and 30 days of history, so 24*30=720 data points
in a single graph. Since 720 labels like "2020-07-18 14:00" would
clutter the x-axis, I decided to put a tick for each whole day and
label only those ticks rather than every data point. I hope this
explanation makes it clearer. I will update the wiki and the example
graph there to reflect what I just explained.
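
To make the arithmetic concrete, here is a rough sketch of the x-axis
layout. It assumes one data point per hourly consensus and a tick at
the 00:00 consensus of each day; the start date and the function
names are made up for illustration and are not taken from the
project's code.

```python
from datetime import datetime, timedelta

HOURS_PER_DAY = 24  # one consensus per hour
HISTORY_DAYS = 30   # days of history shown in a single graph


def hourly_points(start, days=HISTORY_DAYS):
    """Return one timestamp per hourly consensus, beginning at `start`."""
    return [start + timedelta(hours=h) for h in range(days * HOURS_PER_DAY)]


def day_ticks(points):
    """Return (index, label) pairs for the points that fall on 00:00."""
    return [
        (i, p.strftime("%Y-%m-%d"))
        for i, p in enumerate(points)
        if p.hour == 0
    ]


# Hypothetical start date, chosen only for the example.
points = hourly_points(datetime(2020, 7, 18, 0, 0))
ticks = day_ticks(points)
```

With these assumptions, `points` holds 720 timestamps and `ticks`
holds 30 labeled positions, one every 24 data points.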

> How do you plan to show those graphs on the dashboard? Are they all
> shown in some order? Grouped by sections (maybe along those on the
> wiki)? Grouped by "importance"?

I plan to have a separate page for each section mentioned in the
wiki. In addition, I plan to have a "highlights" page that shows a
quick summary of all graphs. People visiting the dashboard will see
the highlights first and can then look at the more detailed graphs if
they are interested. I agree that there is a lot of information for a
visitor to digest, and making this as effortless as possible is one
of my top priorities.

> Somewhat related to that, I think we can try reducing the number of
> graphs, at least those which are greatly related.
> One thing that could be useful here
> is having a switch/buttons next to the graph giving the user the option
> to see "by exit relay age" for all CAPTCHAs, or for Cloudflare ones, or
> for Akamai ones or for... depending on the switch/button the users
> clicked on. The graph would update accordingly once clicked, showing the
> user choice. The default could be for all CAPTCHAs or essentially
> whatever we think is more important for us. That way you have related
> things grouped together AND you have less graphs shown per default on
> the dashboard, too, which might help not getting confused.

A few other people suggested this approach as well, and I like it! I
will add buttons to do what you suggested. On top of that, there will
be checkboxes next to the graphs so users can choose what to include,
and the graphs will update accordingly. The defaults will reflect
what we care about most. For example, both "Tor Browser" and "Firefox
over Tor" measurements could be used to calculate the CAPTCHA rates,
but only the "Tor Browser" checkbox will be checked by default, and
the user will be able to include "Firefox over Tor" and others in the
calculations.

This gives the user the ability to compare different options easily.
That said, I wonder whether having too many options creates another
problem for users. It might lead to confusion, and finding the right
balance is still an open issue.
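
As a rough sketch of the checkbox defaults described above: the rate
below is simply the percentage of selected measurements that hit a
CAPTCHA, which is an assumption for illustration and not the
dashboard's actual formula. The record fields, the function name and
the sample data are all made up.

```python
# Only this checkbox is assumed to be checked by default.
DEFAULT_METHODS = frozenset({"Tor Browser"})


def captcha_rate(measurements, methods=DEFAULT_METHODS):
    """Percent of the selected measurements that hit a CAPTCHA.

    Returns None when no measurement matches the selected methods.
    """
    selected = [m for m in measurements if m["method"] in methods]
    if not selected:
        return None
    hits = sum(1 for m in selected if m["captcha"])
    return 100.0 * hits / len(selected)


# Made-up sample records, not real measurement data.
sample = [
    {"method": "Tor Browser", "captcha": True},
    {"method": "Tor Browser", "captcha": False},
    {"method": "Firefox over Tor", "captcha": True},
    {"method": "Firefox over Tor", "captcha": True},
]

default_rate = captcha_rate(sample)
combined_rate = captcha_rate(sample, {"Tor Browser", "Firefox over Tor"})
```

Checking the "Firefox over Tor" box corresponds to passing both
method names, so the same function covers the default view and the
user's selection.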

Thank you for your feedback!

