[tor-dev] Scaling tor for a global population

Prateek Mittal pmittal at princeton.edu
Tue Sep 30 11:29:47 UTC 2014

Thanks isis. We worked on PIR-Tor a while ago, so my recollection could be
a bit rusty, but here are some thoughts.

1) From a performance perspective, the best results are obtained by using
the information-theoretic variant of PIR-Tor (ITPIR) (as opposed to the
computational variant). As you correctly point out, we considered the three
guard relays to be the PIR directory servers, with the assumption that not
all three collude. (Our argument for the non-colluding assumption was that
if all three guards are malicious, then the security offered by Tor is
terrible already.) Note than ITPIR necessarily requires more than one
server. Given Tor's move to a single guard design, the first challenge is
to figure out alternative PIR servers for the client.

(I havn't thought much about this challenge, but it seems that the single
guard could be one server, and two other relays (maybe with guard flags)
could play the role of PIR directory servers. I'll have to think more about
this, but it is possible that this design continues to provide similar
security properties to Tor)

2) Even outside the context of PIR, there are several advantages to having
some structure in the database, with fine grained signatures on individual
relay/micro descriptors (relays could also be grouped into blocks, with
signatures at the block level). For example, if the Tor network were to use
2 hop circuits, then clients would have close to 0 (or at least
significantly less than 1%) overhead for maintaining network state.

Why? Because the first hop (guard) is fixed, fetching the directory to
select the guard is only a *one-time* operation; note that much of the
directory overhead in Mike Perry's analysis comes from *periodic fetches*.
Secondly, clients do not need to use PIR to fetch the exit relay in this
setting, since guards know the identity of exits in a two hop design anyway
(bandwidth overhead of fetching a single exit is simply the size of the
descriptor/micro-descriptor -- resulting in *several orders of magnitude*
bandwidth savings). (This optimization was referred to in the paper in the
context of fetching middle relays)

(Of course there are anonymity implications of moving to two hop designs.
Perhaps the most significant concern is easy guard identification, but
perhaps an additional 100 million clients + move to a single guard reduces
these concerns. The bandwidth savings could be very significant, specially
in the context of mobile clients, which otherwise might use hundreds of MB
per month just to maintain network state. Anyway, I digress.)

Please find some comments about your previous mail inline (below).

On Tue, Sep 30, 2014 at 1:04 AM, isis <isis at torproject.org> wrote:

>   1. The authors assume that a client's three [sic] Guard Nodes will be the
>      directory mirrors for that client, accepting PIR queries for
> descriptors.
Right, please see (1) above.

>   3. There is zero handling of colluding directory mirrors.
This seems incorrect? We used the Percy open source library to perform PIR,
which does handle collusion among mirrors (to the best of my recollection).
The parameters can be set such that one honest server is sufficient for

>   4. The *increase* to descriptor size is not well factored in to the
> analysis.
>     4.b. All of the Directory Authorities would need to sign each and every
>           descriptor by itself. This would result in the current
>           microdescriptors, which are sized from roughly 900 bytes to some
> of
>           the larger ones which are 2.5 KB, increasing in size by roughly
> 4 KB
>           (that's the size of the DirAuth signatures on the current
> consensus).
I am not sure which signature scheme you are considering. In the paper, we
talk about threshold BLS signatures, which have only 22 byte overhead.
(This might increase for better security guarantees, but 4KB overhead seems
very conservative.)

> While one of the benefits of PIR would be that clients would no longer
> need to
> request the descriptors for all relays listed in the consensus, this
> benefit
> actually doesn't help as much as it would seem, because switching to a PIR
> scheme would require a client to make three of these roughly O(146449)-bit
> requests, every ten minutes or so, when the client wants a new circuit.
Right, clients would roughly need 18 middle/exit relays (using PIR queries)
over a three hour interval to build a Tor circuit every 10 minutes. My
recollection is that even in 2011, PIR seemed more efficient (we estimated
a 12KB overhead per PIR query) than a full network fetch every three hours
(12*18 = 216KB with PIR vs several MBs without PIR). Though these numbers
might change depending on implementation details (including the use of
micro-descriptors), the benefits would only improve with growing network


Prateek Mittal
Assistant Professor
Princeton University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20140930/e709de78/attachment.html>

More information about the tor-dev mailing list