-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 07/06/16 23:02, Isabela wrote:
Hello there,
Hi everyone,
This is a great idea Nima, thanks for putting it together.
Agreed.
I think before collecting data (which might take more work than we think, who knows) we should review the audience we are focusing on here and what stories we want to tell in the infographic, then list the data we should get.
Infographics are stories; we will start with the story that tells how to use it. Then there is another section with other stories. If our focus is on new users, or users who don't know PTs because they never had to use them, then listing some of these things might confuse more than help, or be too much information.
On 06/07/2016 09:11 AM, George Kadianakis wrote:
Nima Fatemi <nima@torproject.org> writes:
- List of PTs included in Tor Browser with a one-line summary
and significance of each
I think this is a great thing. Based on David and Linda's research, people tend to just pick the first option in the drop-down menu. This will help educate people about what the other options are. People also don't understand all the PT names listed there, or what the differences between them are :)
- List of PTs under development
I think this goes in the 'extra weight' package if our audience is the one I describe above.
- Stats + Total number of PT bridges
Just in case you don't go with a real-world example as Isabela suggests (which I think is a good idea), I can help with getting some data on total number of PT bridges. I think I'll need to write some code for this (though it feels like a familiar use case that I surely must have written code for in the past). Just tell me for which time period you'd want these numbers. But please be optimistic that you'll use these numbers, because I'd write this code specifically for you.
- Average number of PT clients per day + Maybe compare the
numbers with direct connections?
Well, we already have graphs and data for this on Tor Metrics. You should be able to get the numbers from there.
I think we should tell a story with this information; it is much easier for someone to digest and understand the meaning of numbers this way.
For instance, instead of just saying the average clients per day we could (also?) give a real-world case of a censorship event and the usage growth due to it.
- Links to PT page (on website and wiki)
Sounds good.
Specifically, here are some example statistics that I would find interesting:
- How many clients use BridgeDB?
- Which distributors are used most?
- Which PTs are asked most?
- What % of clients are using Tor vs connecting directly?
- What are the most popular client countries?
- How many unique bridges does BridgeDB give out daily/weekly?
These all sound like interesting questions.
- Did BridgeDB usage decrease with the introduction of meek?
This one might be a bit hard to answer without a time machine.
Some of the above statistics might also require some privacy-preserving obfuscation before publishing.
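To illustrate the kind of obfuscation meant here, below is a minimal sketch that rounds each count up to a multiple of a fixed bin size, so that exact values are never published. It assumes per-transport request counts have already been collected; the function names, the bin size of 8, and the sample counts are all hypothetical and are not taken from BridgeDB. Binning alone is not a complete privacy guarantee, so treat this as a starting point for a proposal rather than a finished design.

```python
def round_up_to_bin(count: int, bin_size: int = 8) -> int:
    """Round a non-negative count up to the next multiple of bin_size."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Ceiling division using integers only, then scale back up.
    return -(-count // bin_size) * bin_size


def obfuscate_counts(counts: dict[str, int], bin_size: int = 8) -> dict[str, int]:
    """Apply the same binning to every entry of a {label: count} mapping."""
    return {label: round_up_to_bin(n, bin_size) for label, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical per-transport request counts, for illustration only.
    raw = {"obfs4": 1234, "meek": 17, "fte": 0}
    print(obfuscate_counts(raw))
```

With the sample input above, 1234 becomes 1240, 17 becomes 24, and 0 stays 0. A larger bin size hides more detail at the cost of precision.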
Is it feasible and easy to extract (any of) the information George listed above? I need to have all the information we want to put on this infographic by the end of next week.
Wait, what? End of next week? This year or next? Or under the assumption that time machines exist?
I seriously hope there's no way to extract this information from existing data, because that would mean that BridgeDB logs all kinds of stuff about its users, and I strongly believe it doesn't do such a horrible thing.
And if you meant that we could write a small hack to BridgeDB to dump some usage information to disk which we would then include in this infographic, then no, I'm almost certain that we wouldn't do such a horrible thing.
We have processes for extracting new information about users in a privacy-preserving way. These processes include publishing a proposal, getting it reviewed on tor-dev@, implementing and testing statistics code, getting that publicly reviewed, publishing the resulting data, and only then using that data. This can take weeks if not months.
But I'll just hope you meant something else.
I'm still not sure why we have no statistics on BridgeDB. It seems like something that could point us to future research directions, and generally help us in strategizing against censors.
I've created https://trac.torproject.org/projects/tor/ticket/19332 for this.
Again, don't underestimate the effort there. And these things usually happen faster if somebody submits a patch.
I would do the same thing here: based on the audience, do we need all this?
Good point.
I am happy to join a session on IRC to look for what data/story we could tell in the infographic. I just can't do it today because of other work I said I would get done.
I can't commit to joining such a session, but if there's anything I can help with that we can extract from existing data, please let me know.
All the best, Karsten