commit 414f29fd2f2083e44908ceebda44955cb22f0ddb
Author: Nick Mathewson <nickm(a)torproject.org>
Date: Wed Mar 2 01:02:36 2011 -0500
Move xxx-bridge-disbursement to ideas/old: bridgedb spec supersedes it.
---
proposals/ideas/old/xxx-bridge-disbursement.txt | 174 +++++++++++++++++++++++
proposals/ideas/xxx-bridge-disbursement.txt | 174 -----------------------
2 files changed, 174 insertions(+), 174 deletions(-)
diff --git a/proposals/ideas/old/xxx-bridge-disbursement.txt b/proposals/ideas/old/xxx-bridge-disbursement.txt
new file mode 100644
index 0000000..6c9a3c7
--- /dev/null
+++ b/proposals/ideas/old/xxx-bridge-disbursement.txt
@@ -0,0 +1,174 @@
+
+How to hand out bridges.
+
+Divide bridges into 'strategies' as they come in. Do this uniformly
+at random for now.
+
+For each strategy, we'll hand out bridges in a different way to
+clients. This document describes two strategies: email-based and
+IP-based.
+
+0. Notation:
+
+ HMAC(k,v) : an HMAC of v using the key k.
+
+ A|B: The string A concatenated with the string B.
+
+
+1. Email-based.
+
+ Goal: bootstrap based on one or more popular email service's sybil
+ prevention algorithms.
+
+
+ Parameters:
+ HMAC -- an HMAC function
+ P -- a time period
+ K -- the number of bridges to send in a period.
+
+ Setup: Generate two nonces, N and M.
+
+ As bridges arrive, put them into a ring according to HMAC(N,ID)
+ where ID is the bridges's identity digest.
+
+ Divide time into divisions of length P.
+
+ When we get an email:
+
+ If it's not from a supported email service, reject it.
+
+ If we already sent a response to that email address (normalized)
+ in this period, send _exactly_ the same response.
+
+ If it is from a supported service, generate X = HMAC(M,PS|E) where E
+ is the lowercased normalized email address for the user, and
+ where PS is the start of the currrent period. Send
+ the first K bridges in the ring after point X.
+
+ [If we want to make sure that repeat queries are given exactly the
+ same results, then we can't let the ring change during the
+ time period. For a long time period like a month, that's quite a
+ hassle. How about instead just keeping a replay cache of addresses
+ that have been answered, and sending them a "sorry, you already got
+ your addresses for the time period; perhaps you should try these
+ other fine distribution strategies while you wait?" response? This
+ approach would also resolve the "Make sure you can't construct a
+ distinct address to match an existing one" note below. -RD]
+
+ [I think, if we get a replay, we need to send back the same
+ answer as we did the first time, not say "try again."
+ Otherwise we need to worry that an attacker can keep people
+ from getting bridges by preemtively asking for them,
+ or that an attacker may force them to prove they haven't
+ gotten any bridges by asking. -NM]
+
+ [While we're at it, if we do the replay cache thing and don't need
+ repeatable answers, we could just pick K random answers from the
+ pool. Is it beneficial that a bridge user who knows about a clump of
+ nodes will be sharing them with other users who know about a similar
+ (overlapping) clump? One good aspect is against an adversary who
+ learns about a clump this way and watches those bridges to learn
+ other users and discover *their* bridges: he doesn't learn about
+ as many new bridges as he might if they were randomly distributed.
+ A drawback is against an adversary who happens to pick two email
+ addresses in P that include overlapping answers: he can measure
+ the difference in clumps and estimate how quickly the bridge pool
+ is growing. -RD]
+
+ [Random is one more darn thing to implement; rings are already
+ there. -NM]
+
+ [If we make the period P be mailbox-specific, and make it a random
+ value around some mean, then we make it harder for an attacker to
+ know when to try using his small army of gmail addresses to gather
+ another harvest. But we also make it harder for users to know when
+ they can try again. -RD]
+
+ [Letting the users know about when they can try again seems
+ worthwhile. Otherwise users and attackers will all probe and
+ probe and probe until they get an answer. No additional
+ security will be achieved, but bandwidth will be lost. -NM]
+
+ To normalize an email address:
+ Start with the RFC822 address. Consider only the mailbox {???}
+ portion of the address (username@domain). Put this into lowercase
+ ascii.
+
+ Questions:
+ What to do with weird character encodings? Look up the RFC.
+
+ Notes:
+ Make sure that you can't force a single email address to appear
+ in lots of different ways. IOW, if nickm(a)freehaven.net and
+ NICKM(a)freehaven.net aren't treated the same, then I can get lots
+ more bridges than I should.
+
+ Make sure you can't construct a distinct address to match an
+ existing one. IOW, if we treat nickm@X and nickm@Y as the same
+ user, then anybody can register nickm@Z and use it to tell which
+ bridges nickm@X got (or would get).
+
+ Make sure that we actually check headers so we can't be trivially
+ used to spam people.
+
+
+2. IP-based.
+
+ Goal: avoid handing out all the bridges to users in a similar IP
+ space and time.
+
+ Parameters:
+
+ T_Flush -- how long it should take a user on a single network to
+ see a whole cluster of bridges.
+
+ N_C
+
+ K -- the number of bridges we hand out in response to a single
+ request.
+
+ Setup: using an AS map or a geoip map or some other flawed input
+ source, divide IP space into "areas" such that surveying a large
+ collection of "areas" is hard. For v0, use /24 address blocks.
+
+ Group areas into N_C clusters.
+
+ Generate secrets L, M, N.
+
+ Set the period P such that P*(bridges-per-cluster/K) = T_flush.
+ Don't set P to greater than a week, or less than three hours.
+
+ When we get a bridge:
+
+ Based on HMAC(L,ID), assign the bridge to a cluster. Within each
+ cluster, keep the bridges in a ring based on HMAC(M,ID).
+
+ [Should we re-sort the rings for each new time period, so the ring
+ for a given cluster is based on HMAC(M,PS|ID)? -RD]
+
+ When we get a connection:
+
+ If it's http, redirect it to https.
+
+ Let area be the incoming IP network. Let PS be the current
+ period. Compute X = HMAC(N, PS|area). Return the next K bridges
+ in the ring after X.
+
+ [Don't we want to compute C = HMAC(key, area) to learn what cluster
+ to answer from, and then X = HMAC(key, PS|area) to pick a point in
+ that ring? -RD]
+
+
+ Need to clarify that some HMACs are for rings, and some are for
+ partitions. How rings scale is clear. How do we grow the number of
+ partitions? Looking at successive bits from the HMAC output is one way.
+
+3. Open issues
+
+ Denial of service attacks
+ A good view of network topology
+
+at some point we should learn some reliability stats on our bridges. when
+we say above 'give out k bridges', we might give out 2 reliable ones and
+k-2 others. we count around the ring the same way we do now, to find them.
+
diff --git a/proposals/ideas/xxx-bridge-disbursement.txt b/proposals/ideas/xxx-bridge-disbursement.txt
deleted file mode 100644
index 6c9a3c7..0000000
--- a/proposals/ideas/xxx-bridge-disbursement.txt
+++ /dev/null
@@ -1,174 +0,0 @@
-
-How to hand out bridges.
-
-Divide bridges into 'strategies' as they come in. Do this uniformly
-at random for now.
-
-For each strategy, we'll hand out bridges in a different way to
-clients. This document describes two strategies: email-based and
-IP-based.
-
-0. Notation:
-
- HMAC(k,v) : an HMAC of v using the key k.
-
- A|B: The string A concatenated with the string B.
-
-
-1. Email-based.
-
- Goal: bootstrap based on one or more popular email service's sybil
- prevention algorithms.
-
-
- Parameters:
- HMAC -- an HMAC function
- P -- a time period
- K -- the number of bridges to send in a period.
-
- Setup: Generate two nonces, N and M.
-
- As bridges arrive, put them into a ring according to HMAC(N,ID)
- where ID is the bridges's identity digest.
-
- Divide time into divisions of length P.
-
- When we get an email:
-
- If it's not from a supported email service, reject it.
-
- If we already sent a response to that email address (normalized)
- in this period, send _exactly_ the same response.
-
- If it is from a supported service, generate X = HMAC(M,PS|E) where E
- is the lowercased normalized email address for the user, and
- where PS is the start of the currrent period. Send
- the first K bridges in the ring after point X.
-
- [If we want to make sure that repeat queries are given exactly the
- same results, then we can't let the ring change during the
- time period. For a long time period like a month, that's quite a
- hassle. How about instead just keeping a replay cache of addresses
- that have been answered, and sending them a "sorry, you already got
- your addresses for the time period; perhaps you should try these
- other fine distribution strategies while you wait?" response? This
- approach would also resolve the "Make sure you can't construct a
- distinct address to match an existing one" note below. -RD]
-
- [I think, if we get a replay, we need to send back the same
- answer as we did the first time, not say "try again."
- Otherwise we need to worry that an attacker can keep people
- from getting bridges by preemtively asking for them,
- or that an attacker may force them to prove they haven't
- gotten any bridges by asking. -NM]
-
- [While we're at it, if we do the replay cache thing and don't need
- repeatable answers, we could just pick K random answers from the
- pool. Is it beneficial that a bridge user who knows about a clump of
- nodes will be sharing them with other users who know about a similar
- (overlapping) clump? One good aspect is against an adversary who
- learns about a clump this way and watches those bridges to learn
- other users and discover *their* bridges: he doesn't learn about
- as many new bridges as he might if they were randomly distributed.
- A drawback is against an adversary who happens to pick two email
- addresses in P that include overlapping answers: he can measure
- the difference in clumps and estimate how quickly the bridge pool
- is growing. -RD]
-
- [Random is one more darn thing to implement; rings are already
- there. -NM]
-
- [If we make the period P be mailbox-specific, and make it a random
- value around some mean, then we make it harder for an attacker to
- know when to try using his small army of gmail addresses to gather
- another harvest. But we also make it harder for users to know when
- they can try again. -RD]
-
- [Letting the users know about when they can try again seems
- worthwhile. Otherwise users and attackers will all probe and
- probe and probe until they get an answer. No additional
- security will be achieved, but bandwidth will be lost. -NM]
-
- To normalize an email address:
- Start with the RFC822 address. Consider only the mailbox {???}
- portion of the address (username@domain). Put this into lowercase
- ascii.
-
- Questions:
- What to do with weird character encodings? Look up the RFC.
-
- Notes:
- Make sure that you can't force a single email address to appear
- in lots of different ways. IOW, if nickm(a)freehaven.net and
- NICKM(a)freehaven.net aren't treated the same, then I can get lots
- more bridges than I should.
-
- Make sure you can't construct a distinct address to match an
- existing one. IOW, if we treat nickm@X and nickm@Y as the same
- user, then anybody can register nickm@Z and use it to tell which
- bridges nickm@X got (or would get).
-
- Make sure that we actually check headers so we can't be trivially
- used to spam people.
-
-
-2. IP-based.
-
- Goal: avoid handing out all the bridges to users in a similar IP
- space and time.
-
- Parameters:
-
- T_Flush -- how long it should take a user on a single network to
- see a whole cluster of bridges.
-
- N_C
-
- K -- the number of bridges we hand out in response to a single
- request.
-
- Setup: using an AS map or a geoip map or some other flawed input
- source, divide IP space into "areas" such that surveying a large
- collection of "areas" is hard. For v0, use /24 address blocks.
-
- Group areas into N_C clusters.
-
- Generate secrets L, M, N.
-
- Set the period P such that P*(bridges-per-cluster/K) = T_flush.
- Don't set P to greater than a week, or less than three hours.
-
- When we get a bridge:
-
- Based on HMAC(L,ID), assign the bridge to a cluster. Within each
- cluster, keep the bridges in a ring based on HMAC(M,ID).
-
- [Should we re-sort the rings for each new time period, so the ring
- for a given cluster is based on HMAC(M,PS|ID)? -RD]
-
- When we get a connection:
-
- If it's http, redirect it to https.
-
- Let area be the incoming IP network. Let PS be the current
- period. Compute X = HMAC(N, PS|area). Return the next K bridges
- in the ring after X.
-
- [Don't we want to compute C = HMAC(key, area) to learn what cluster
- to answer from, and then X = HMAC(key, PS|area) to pick a point in
- that ring? -RD]
-
-
- Need to clarify that some HMACs are for rings, and some are for
- partitions. How rings scale is clear. How do we grow the number of
- partitions? Looking at successive bits from the HMAC output is one way.
-
-3. Open issues
-
- Denial of service attacks
- A good view of network topology
-
-at some point we should learn some reliability stats on our bridges. when
-we say above 'give out k bridges', we might give out 2 reliable ones and
-k-2 others. we count around the ring the same way we do now, to find them.
-