[metrics-team] Are bandwidth charts double counting?

teor teor2345 at gmail.com
Tue Oct 17 15:20:53 UTC 2017


On 17 Oct 2017, at 10:48, Tom Ritter <tom at ritter.vg> wrote:

>>> The other question I had, that I don't think we are able to calculate,
>>> is "How many connections does the Tor Network produce".  Obviously it
>>> handwaves over 'connection', but for the browser scenario I'd say
>>> 'connections to first party domains'.
>>> 
>>> exit_streams_opened might be the best measurement to accomplish
>>> something very similar though right? It'd be unique connections to
>>> third party domains (for ports 443 and 80) instead of first party
>>> domains, but that's pretty close.
>>> 
>>> How would I go about calculating it? Is it as simple as summing this
>>> field across all the extra info descriptors for a given time period?
>> 
>> Fine question. We don't have such statistics yet. But let's see what we
>> could do with existing data.
>> 
>> First, I'm not sure what exactly you mean by first party domains and
>> third party domains.
> 
> example.com includes resources from a.com and b.com
> 
> When we load this in Tor Browser we will produce 1 circuit and 3 streams.
> 
> I guess ideally I'd like to know "How many circuits are opened" (over
> time) as this would tell us something about capacity. If we
> anticipate adding a 'thing' to the tor network that would generate
> an additional 10 circuits/second, and we currently handle 5
> circuits/second, even if we didn't put much bandwidth on these
> circuits, we would be tripling <something> in the network that could
> cause bottlenecks or problems.
> 
> But 'streams' is, at least, an upper bound for circuits, you can't
> have more circuits than streams.

This is only true if each circuit is used for at least one stream.

A circuit can have 0 streams if it is:
* an unused multi-directory bootstrapping circuit (clients only)
* an unused preemptive circuit
* a circuit build time measurement circuit
* an introduction point circuit (clients or services to intro points)
* a rendezvous point circuit (clients or services to rend points)
* a bandwidth self-measurement circuit (relays only)
* an ORPort reachability check circuit (relays only)
And I've probably missed at least one case here.

Also, does a stream refused due to an exit policy count as an
"opened" stream?
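
On the summing question quoted above, here is a rough Python sketch of
what "sum the field across descriptors" could look like. It assumes the
extra-info line has the form
"exit-streams-opened 80=400,443=1200,other=52" (my reading of dir-spec);
the function names and sample lines are made up for illustration. Only
exits that publish exit port statistics report this line, so any total
is an approximation over the reporting relays.

```python
from collections import Counter

# Keyword of the extra-info line (assumed format: "port=N,port=N,...,other=N").
PREFIX = "exit-streams-opened "


def parse_streams_opened(line: str) -> Counter:
    """Parse one 'exit-streams-opened' line into per-port counts."""
    counts = Counter()
    if not line.startswith(PREFIX):
        return counts  # not the line we are looking for
    for item in line[len(PREFIX):].strip().split(","):
        if not item:
            continue
        port, _, value = item.partition("=")
        counts[port] += int(value)
    return counts


def sum_streams_opened(lines) -> Counter:
    """Add up per-port counts over many descriptor lines."""
    total = Counter()
    for line in lines:
        total.update(parse_streams_opened(line))
    return total


# Hypothetical sample lines, not real descriptor data.
sample = [
    "exit-streams-opened 80=400,443=1200,other=52",
    "exit-streams-opened 80=100,443=300,other=8",
]
totals = sum_streams_opened(sample)
print(totals["80"] + totals["443"])
```

A real version would also have to pick descriptors whose statistics
interval falls inside the time period of interest.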

We don't know the proportion of circuits without streams to
circuits with streams. In theory, if most clients are long-lived,
and most traffic is successful exit traffic, then these other
categories should only be a few percent of the circuits.
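
To make the bound concrete, here is a minimal Python sketch. The
function name and the numbers are hypothetical, not measurements. It
assumes a fraction f of all circuits carry no streams, so the stream
count only bounds the remaining (1 - f) share of circuits.

```python
def max_total_circuits(streams: int, zero_stream_fraction: float) -> float:
    """Upper bound on total circuits, given a stream count.

    Every circuit that carries streams carries at least one, so
    circuits_with_streams <= streams. If a fraction f of all circuits
    carry no streams, then total_circuits <= streams / (1 - f).
    """
    if not 0.0 <= zero_stream_fraction < 1.0:
        raise ValueError("zero_stream_fraction must be in [0, 1)")
    return streams / (1.0 - zero_stream_fraction)


# Hypothetical numbers, for illustration only.
print(max_total_circuits(1000, 0.0))   # no stream-less circuits
print(max_total_circuits(1000, 0.05))  # a few percent stream-less circuits
```

With f unknown, the stream count alone gives no upper bound on the total
number of circuits.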

But wouldn't it be nice to measure it? :-)

T

