[metrics-team] How to interpret written/read bytes per second in Relay Search?

Tue Oct 13 01:25:25 UTC 2020

On Wed, Mar 25, 2020 at 01:01:37PM -0600, David Fifield wrote:
> On Wed, Mar 25, 2020 at 09:42:48AM +0100, Karsten Loesing wrote:
> > > 2. When I look at the graphs of some default bridges, I see the written
> > >    and read numbers being almost always equal.
> > >    https://metrics.torproject.org/rs.html#details/5F161D2E5713C93F16FEEDD63178E37208AA78DF
> > >    https://metrics.torproject.org/rs.html#details/8F4541EEE3F2306B7B9FEF1795EC302F6B84DAE8
> > >    When I look at moria1, a directory authority, I see written being
> > >    much greater than read.
> > >    https://metrics.torproject.org/rs.html#details/9695DFC35FFEB861329B9F1AB04C46397020CE31
> > >    What accounts for the equality in some cases and the inequality in
> > >    others? What could explain the divergence in the case of the
> > >    Snowflake bridge?
> > 
> > The inequality in case of directory authorities is very likely due to
> > directory requests. Requesting a consensus takes just a few dozen bytes,
> > but responding with a consensus takes about 2.4 MiB or something like
> > 0.5 MiB when compressed.
> > 
> > I can only speculate about the Snowflake bridge. When looking at the 5
> > years graph on Relay Search it seems like the increase in read bytes is
> > not that unusual. It's the divergence from written bytes that hasn't
> > happened for a while. But if you look at late 2017 there has been a time
> > when read bytes outnumbered written bytes.
> 
> Here's a hypothesis. For typical guards/bridges,
> 	written = written to client + written to middle node
> 	read    = read from client  + read from middle node
> And because input/output is conserved in the absence of other transfers,
> 	written to client = read from middle node
> 	read from client  = written to middle node
> We have equality in the usual case:
> 	written = read from middle node + written to middle node
> 	        = read from client + written to client
> 	        = read
> Another way to put it is in terms of the client's upload and download:
> 	written = download + upload = upload + download = read
> 
> That explains why the two graph lines are almost equal for most bridges.
> 
> But because of a bug in Snowflake proxies that causes them not to extract
> the correct client IP address, the Snowflake bridge currently counts
> bytes for only about 25% of client connections (https://bugs.torproject.org/33157#comment:10).
> A large fraction of bytes to/from the client are not being counted, but
> bytes to/from the middle node are being counted as usual. Ignoring any
> possible correlation between which connections have a bogus client IP
> address and the number of bytes transferred per connection, we have
> something more like
> 	written = 0.25 * written to client + written to middle node
> 	read    = 0.25 * read from client  + read from middle node
> and
> 	written to client = 0.25 * read from middle node
> 	read from client  = 0.25 * written to middle node
> so
> 	written = (0.25 * 0.25 * read from middle node) + written to middle node
> 	read    = (0.25 * 0.25 * written to middle node) + read from middle node
> 
> Now the ratio of written/read depends on how much the client uploads
> versus how much it downloads. There's no reason why these should be
> equal.
> 	written to client = 0.25 * download = 0.25 * read from middle node
> 	read from client  = 0.25 * upload   = 0.25 * written to middle node
> 	written = 0.25 * 0.25 * download + upload
> 	read    = 0.25 * 0.25 * upload + download
> 
> Still ignoring correlation, we could recover the quantity upload +
> download by adding written and read and dividing by a number that
> depends on the fraction of bogus IP addresses:
> 	written + read
> 	= 0.25 * 0.25 * download + upload + 0.25 * 0.25 * upload + download
> 	= 1.0625 * (upload + download)
> 	upload + download = (written + read) / 1.0625
> 
> To imagine what the graph would look like if we were actually accounting
> for all client bytes, we approximately just have to add the two lines
> together.
> 
> If my guess is correct, then it accounts for the divergence in the
> written and read graphs, if we additionally assume that before
> 2020-02-19 either 100% of clients were not affected by the IP address
> reporting bug, or the number of Snowflake clients was negligible and the
> graphs only reflect some inter-relay traffic that also happens to
> conserve input/output.
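
The arithmetic in the quoted hypothesis can be sketched in a few lines of
Python. This is only an illustration: it follows the quoted formulas as
written (the counted fraction is applied twice to client-side bytes), the
function names are made up for this example, and the 100/900 upload/download
split is an arbitrary assumption, not a measurement.

```python
def reported_bytes(upload, download, counted_fraction):
    """Return (written, read) as predicted by the quoted model.

    counted_fraction is the share of client connections whose bytes are
    counted (0.25 in the quoted message). Hypothetical helper.
    """
    k = counted_fraction * counted_fraction
    written = k * download + upload
    read = k * upload + download
    return written, read


def recover_total(written, read, counted_fraction):
    """Recover upload + download from the reported written and read totals."""
    return (written + read) / (1 + counted_fraction ** 2)


# Assumed traffic mix: the client downloads nine times what it uploads.
written, read = reported_bytes(upload=100.0, download=900.0, counted_fraction=0.25)
print(written, read)                       # 156.25 906.25
print(recover_total(written, read, 0.25))  # 1000.0
```

With a download-heavy mix like this one, the model predicts read well above
written, which is the direction of the divergence in the graph.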

On 2020-10-05, Cecylia deployed a fix for bug #33157 to improve client
IP address extraction. The number of client connections with USERADDR
rose from about 25% to about 95%. The result is that the read/written
bwhist graphs are now close to equal.
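
As a rough check, plugging 25% and 95% into the formulas from the quoted
hypothesis shows why the two lines should converge. This is a sketch under
the same assumptions as the quoted model; the helper name and the 100/900
upload/download split are hypothetical.

```python
def written_read_ratio(upload, download, counted_fraction):
    """Ratio written/read under the quoted model (hypothetical helper)."""
    k = counted_fraction * counted_fraction
    written = k * download + upload
    read = k * upload + download
    return written / read


# Assumed traffic mix: the client downloads nine times what it uploads.
before_fix = written_read_ratio(100.0, 900.0, 0.25)
after_fix = written_read_ratio(100.0, 900.0, 0.95)
print(f"before fix: {before_fix:.2f}")  # about 0.17
print(f"after fix:  {after_fix:.2f}")   # about 0.92
```

The exact values depend on the assumed traffic mix, but the ratio moves
toward 1 as the counted fraction approaches 100%.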

I expected the number-of-clients graph to increase at the same time, but
it did not. I suppose the client count does not depend on USERADDR,
contrary to what I had thought.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/33157#note_2710701
https://metrics.torproject.org/rs.html#details/5481936581E23D2D178105D44DB6915AB06BFB7F