[metrics-team] metrics-web detect script update and question

seamus tuohy stuohy at internews.org
Tue Nov 17 20:49:08 UTC 2015


Howdy Karsten,

Karsten Loesing <karsten at torproject.org> writes:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 16/11/15 22:57, seamus tuohy wrote:
>>
>> Hello,
>
> Hello Seamus,
>
>> Karsten Loesing <karsten at torproject.org> writes:
>>
>>> Great to see your interest in making the censorship detector
>>> better!
>>>
>>> So, I have been thinking about your plan to submit a pull request
>>> for the rewrite of the current functionality, and I think I'd
>>> want to suggest a different plan:
>>>
>>> How about you deploy your rewritten code on a minimal website
>>> that visualizes the output of your rewritten censorship detection
>>> script, possibly comparing it to other algorithms, and we link
>>> that website from the Metrics website?
>>>
>>
>> Sadly, my expertise is not in the statistical analysis, but in
>> open source software development. This is why I focused on making
>> the existing code cleaner and more cleanly documented and
>> structured. It would be a much more significant task for me to
>> compare the original algorithm to others.
>>
>>> Let me explain this plan a bit more: what we really want is a
>>> better censorship detection algorithm that doesn't produce as
>>> many false positives.  Your rewrite can be a great starting point
>>> for that.  But there's no need to merge code directly into
>>> Metrics until we're sure we found an algorithm we like better
>>> than the current one, and maybe that requires making two or three
>>> attempts to get it right.  For now, I'd rather want to add a link
>>> to your results.  We can always discuss replacing the script in
>>> Metrics with a new one later, but there are really no
>>> requirements other than that it can read a .csv in the provided
>>> format and write a new .csv in the expected format.
>>>
>>> If you're not sure what I mean by link, here are two examples
>>> for external links on Tor Metrics:
>>>
>>> https://metrics.torproject.org/oxford-anonymous-internet.html
>>>
>>> https://metrics.torproject.org/uncharted-data-flow.html
>>>
>>> Does that plan make sense to you?  It's really great that you're
>>> picking up this topic.  Thanks for that!
>>>
>>
>> I will keep the code available if anyone wants to use it as a base
>> to implement a better algorithm, but if this code will not serve
>> any functional purpose I see no value in putting any additional
>> work into it.
>
> That's not at all what I wanted to achieve here!  I do see this code
> serving a purpose when it runs on a dedicated website (which can
> probably run on a tiny VM) built to compare detection algorithms.  And
> this doesn't mean that you would have to implement those other
> algorithms yourself, that could easily be done by others.  To be
> clear, I think that your rewrite makes it more likely that others
> start working on different algorithms, so that's already a benefit.
>
> I'm just careful with adding this code directly to Tor Metrics yet,
> because that causes quite some overhead for you and me without
> providing an immediate benefit.  It's a non-trivial amount of work for
> me to review your code and make sure it does the exact same thing as
> the current code, because I didn't write the current code and only
> reviewed it once many years ago.
>

Thanks, that makes sense. With this in mind, I am going to get an axe out and
restructure aggressively so that the functions and flow make it easy to
implement other algorithms.
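
A rough sketch of what such a structure could look like, in Python. Everything
named here is an assumption made up for illustration and not the actual
metrics-web format or detection algorithm: the input columns (date, country,
users), the output column (flagged), the DETECTORS registry, and the
placeholder median_drop rule. The only point is that each algorithm is a plain
function over a list of daily counts, so adding one does not touch the
CSV-in/CSV-out plumbing.

```python
import csv
import io
from statistics import median
from typing import Callable, Dict, Iterable, List

# A detector maps a series of daily counts to one True/False flag per day.
Detector = Callable[[List[float]], List[bool]]

# Hypothetical registry: algorithm name -> detector function.
DETECTORS: Dict[str, Detector] = {}


def register(name: str) -> Callable[[Detector], Detector]:
    """Decorator that adds a detector to the registry under `name`."""
    def wrap(func: Detector) -> Detector:
        DETECTORS[name] = func
        return func
    return wrap


@register("median_drop")
def median_drop(counts: List[float], window: int = 7,
                ratio: float = 0.5) -> List[bool]:
    """Placeholder rule, NOT the Metrics algorithm: flag a day whose count
    is below `ratio` times the median of the previous `window` days."""
    flags = []
    for i, value in enumerate(counts):
        history = counts[max(0, i - window):i]
        # Only judge a day once a full window of history is available.
        flags.append(len(history) == window and value < ratio * median(history))
    return flags


def run(detector_name: str, source: Iterable[str]) -> str:
    """Read CSV lines (assumed columns: date, country, users), apply the
    named detector per country, and return the result as CSV text."""
    detect = DETECTORS[detector_name]
    by_country: Dict[str, List[Dict[str, str]]] = {}
    for row in csv.DictReader(source):
        by_country.setdefault(row["country"], []).append(row)

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["date", "country", "users", "flagged"],
        lineterminator="\n")
    writer.writeheader()
    for country, series in by_country.items():
        series.sort(key=lambda r: r["date"])  # ISO dates sort as strings
        flags = detect([float(r["users"]) for r in series])
        for row, flagged in zip(series, flags):
            writer.writerow({"date": row["date"], "country": country,
                             "users": row["users"], "flagged": int(flagged)})
    return out.getvalue()
```

With a layout along these lines, comparing algorithms would amount to calling
run() once per registered name on the same input file.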

On that note, the inputs of the current code rely on the output of other
scripts in metrics-web. If I am going to host this separately, I would
like to query the correct public API/directory/store directly so that it
can collect updated daily data automatically instead of manually. What
is the proper place to query?


> I would rather want to promote your website by adding a link to
> Metrics and by making a call for help on the Tor development mailing
> list.  Let me know if you'd want that.

Once I get something up and running, a call asking what restructuring
would make models easier to test and implement within this code base
would let me make any structural changes while I still have it in my head.

Best,
s2e


>
> Really hope that makes sense.
>
> All the best,
> Karsten
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
> Comment: GPGTools - http://gpgtools.org
>
> iQEcBAEBAgAGBQJWSwD6AAoJEJD5dJfVqbCrgMMIAKbm6mOEpPyMxxNcW8d9r0fD
> C+SGAuuTjfbnbpseNs0pyooUk0b7/kB4nWBi93lLSta0uaRqKDGBkYvzsfklOteq
> B+7He4HGtEzWPJuTRXNEW8YqYNhmSUV7bWy2q3TPRsOqRInAo/WFi1jHIlzk5iYm
> fp3aq7LiJX+hazAkUBxV/ACTiAMQlJ72tujXTfidnFdGUk4mv0OE89BZjl6SM0m8
> Z8fcPqCzIPykTJLXb3r1vm6Mmos8BROWCfaGY+b1YwTOGX4gzY674UqaScrqEFI9
> 7B8btLIDqxd7vUbxSlBoxKZ3NOFa8rGXmVaVEOTaB5M0X0dM4TcLyU950oj7gxs=
> =koe9
> -----END PGP SIGNATURE-----

--
seamus tuohy | Sr. Technologist - Internet Initiatives
stuohy at internews.org
Skype/XMPP on request
PGP: 36AC 272E B7CF EDD5 F907 E488 B619 3EC7 3CF0 7AA7
MiniLock: 2G3JmRWRYB3B7rthZqkzomcRe8GwJvPtSooA748XMsTBdf

INTERNEWS | Local Voices. Global Change.
www.internews.org | @internews

