[tor-dev] tor relay process health data for operators (controlport)

Mon Feb 4 06:35:35 UTC 2019

On February 3, 2019 10:19:00 PM UTC, nusenu <nusenu-lists at riseup.net> wrote:
>> Thanks for this email. I think exporting more metrics on the control port
>> is a great idea. I wanted to have that for a while
>
>Great to hear that so we have a realistic chance it
>gets actually implemented :)
>
>
>> There are safety questions to ask ourselves here before blindly
>> exporting many stats.
>
>Sure.
>
>> Exporting many stats to the control port unfortunately means that all
>> relay operators can possibly create fancy graphs
>
>making non-public graphs and alerts is the goal
>
>> and make them public
>
>public graphs should result in the rejection of
>affected relays.
>I'll be submitting a few to bad-relays@ soon
>since enn.lu apparently does not care when asked to
>remove their public stats and xml data.
> 
>> which, depending on the stat, can be harmful.
>> 
>> Furthermore, graphing stats can also mean that over time the relay
>> operator stores historical data of everything that happened within the
>> relay and that can be used in many ways to pull off attacks (ex:
>> subpoena to access such a database by LE).
>
>yes, acceptable / unacceptable retention times and granularity
>should be defined and documented.
>I'd propose a max. retention time of two weeks.
>
>
>> The Heartbeat log has a minimum of 30 minutes period but a default of 6
>> hours.
>
>current tor has no restrictions on Heartbeat granularity, you can
>ask tor to write the data to the logs every other second by issuing 
>"SIGNAL HEARTBEAT"
>on the control port.
>
>
>> Whatever stats we would end up exporting, I strongly think that
>> keeping delays like that is a strong requirement because we would sort
>> of "bin" those aggregated stats by a "long enough period" instead of
>> having a very fine grained stream of stats that would make it trivial to
>> spot spikes down to the minute.
>
>30 or 60 minutes granularity seems reasonable
>
>
>> Some of the stats below are safe in my opinion like the memory usage but
>> most of them need to be looked at in terms of safety
>
>yes please

Here's another design that preserves user privacy:
* add noise to every logged statistic (to protect usage in the current period)
* round every logged statistic (to protect average usage over multiple periods)

If we add enough noise to protect most users, then we will have privacy by design.
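
As a rough illustration of the two steps above, here is a minimal Python sketch. Everything in it is an assumption for discussion, not tor code: the function names, the choice of Laplace noise, and the noise_scale and bin_size parameters are all hypothetical. How much noise is "enough" still needs a proper analysis for each statistic.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate(value, noise_scale, bin_size, rng=None):
    """Add noise to a raw counter, then round it to a multiple of bin_size.

    noise_scale and bin_size are hypothetical tuning knobs; safe values
    would have to be chosen per statistic.
    """
    if rng is None:
        # Use the OS entropy source so the noise is not predictable.
        rng = random.SystemRandom()
    # Step 1: noise protects usage in the current period.
    noisy = value + laplace_noise(noise_scale, rng)
    # Step 2: rounding protects average usage over multiple periods.
    rounded = bin_size * round(noisy / bin_size)
    # Clamp at zero so a small counter never reports a negative value.
    # This biases small values upwards.
    return max(0, rounded)


if __name__ == "__main__":
    # Example: a counter of 1234 events, reported in bins of 64.
    print(obfuscate(1234, noise_scale=50.0, bin_size=64))
```

The noise is added before rounding, so the published value is always a multiple of the bin size and never exposes the raw counter directly.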

We should still teach operators why detailed stats are bad for users, and have rules about retention periods. But these rules won't be as critical as they are now, because they will only be needed for edge cases (like a single client that uses most of a relay, which we can't hide very well, no matter what we do).

Adding noise will be easier once PrivCount is implemented. Until then, we'll need to rely on the retention rules you are suggesting.

T

--
teor