On 2017-07-18 03:41, Roger Dingledine wrote:
On Mon, Jul 17, 2017 at 07:54:14PM -0400, Ian Goldberg wrote:
Any chance you (i.e. a script) could replace the IP address with HASH(IP||salt) for a randomly chosen salt that you don't know, and which is deleted when the 30 minutes are up, before you get access to the log file?
See https://www.eff.org/policy#cryptolog for how EFF does something similar. It looks like they use 24-hour intervals, and they do this all the time, but hopefully their cryptolog tool will be helpful if we opt to use it for the short term. https://github.com/efforg/cryptolog
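For illustration, here is a minimal Python sketch of the HASH(IP||salt) idea with a salt that is replaced every 30 minutes. It is not taken from cryptolog. The class name SaltedIPHasher, the choice of SHA-256, the 32-byte salt, and the injectable clock are all assumptions made for this example. The salt is held only in memory and dropped on rotation, but Python gives no guarantee that the old bytes are actually erased from RAM.

```python
import hashlib
import secrets
import time

ROTATION_SECONDS = 30 * 60  # the 30-minute window discussed above


class SaltedIPHasher:
    """Hypothetical helper: replaces an IP with SHA-256(IP || salt)."""

    def __init__(self, rotation_seconds=ROTATION_SECONDS, clock=time.monotonic):
        self._rotation = rotation_seconds
        self._clock = clock  # injectable so the rotation can be tested
        self._rotate()

    def _rotate(self):
        # Draw a fresh random salt; the previous one is dropped and never
        # written to disk, so old pseudonyms cannot be recomputed later.
        self._salt = secrets.token_bytes(32)
        self._expires = self._clock() + self._rotation

    def pseudonym(self, ip):
        # Rotate first if the current window has ended.
        if self._clock() >= self._expires:
            self._rotate()
        return hashlib.sha256(ip.encode("utf-8") + self._salt).hexdigest()
```

Within one window the same address always maps to the same pseudonym, so repeated fetches can still be grouped. Across windows the pseudonyms are unlinkable unless someone kept the salt.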
As answered to Ian, I'd like to keep this simple and leave out IP addresses for now.
Also, teor's question about partial downloads is a really good one: there are many "download accelerators" out there that fetch the first 5 kbytes of the file or something and then stop and do it again, over and over. In theory our current logs should be able to help there, since they should record how many bytes were fetched.
As answered to teor, we should count these download accelerators as 1 code 200 request and n code 206 requests, and since we only consider code 200 requests, we'd count them as 1 download.
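A minimal Python sketch of that counting rule follows. It assumes the logs follow Apache's Common Log Format, with the quoted request line followed by the status code and byte count. The function name count_downloads, the regular expression, and the sample path /dist/example.exe are made up for this example. It filters on the status code only, as described above.

```python
import re
from collections import Counter

# Matches the quoted request line followed by status code and size,
# e.g. "GET /path HTTP/1.1" 200 12345
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def count_downloads(lines):
    """Count one download per code 200 response; ignore 206 and the rest."""
    downloads = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "200":
            downloads[match.group("path")] += 1
    return downloads


# Hypothetical log excerpt: one full request plus three ranged requests,
# as a download accelerator might produce.
sample = [
    '0.0.0.0 - - [18/Jul/2017:00:00:00 +0000] "GET /dist/example.exe HTTP/1.1" 200 52428800',
    '0.0.0.0 - - [18/Jul/2017:00:00:00 +0000] "GET /dist/example.exe HTTP/1.1" 206 5120',
    '0.0.0.0 - - [18/Jul/2017:00:00:00 +0000] "GET /dist/example.exe HTTP/1.1" 206 5120',
    '0.0.0.0 - - [18/Jul/2017:00:00:00 +0000] "GET /dist/example.exe HTTP/1.1" 206 5120',
]
print(count_downloads(sample))
```

For the sample, the one code 200 line is counted and the three code 206 lines are skipped, so the file is recorded as a single download.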
All the best, Karsten
And for those wondering about our current logging approach, see
https://trac.torproject.org/projects/tor/ticket/20928
http://lists.spi-inc.org/pipermail/spi-general/2016-December/003645.html
https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/tree/modules/apache2/f...
--Roger
tor-project mailing list tor-project@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-project