[tor-commits] [torspec/master] Close proposal 166 and make xxx-geoip-survey-plan obsolete

2 Mar 2011

commit 6501e1e80a6eb44aa1ff089ced2870b6728865a8
Author: Nick Mathewson <nickm@torproject.org>
Date:   Wed Mar 2 11:20:33 2011 -0500

    Close proposal 166 and make xxx-geoip-survey-plan obsolete
    
    Karsten confirms that 166 is implemented, and xxx-geoip-survey-plan is
    superseded by this tech report:
    
     https://metrics.torproject.org/papers/countingusers-2010-11-30.pdf
---
 proposals/000-index.txt                       |    4 +-
 proposals/166-statistics-extra-info-docs.txt  |    2 +-
 proposals/ideas/old/xxx-geoip-survey-plan.txt |  137 +++++++++++++++++++++++++
 proposals/ideas/xxx-geoip-survey-plan.txt     |  137 -------------------------
 4 files changed, 140 insertions(+), 140 deletions(-)

diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index 48ec6a8..91c2f27 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -86,7 +86,7 @@ Proposals by number:
 163  Detecting whether a connection comes from a client [OPEN]
 164  Reporting the status of server votes [OPEN]
 165  Easy migration for voting authority sets [OPEN]
-166  Including Network Statistics in Extra-Info Documents [ACCEPTED]
+166  Including Network Statistics in Extra-Info Documents [CLOSED]
 167  Vote on network parameters in consensus [CLOSED]
 168  Reduce default circuit window [OPEN]
 169  Eliminate TLS renegotiation for the Tor connection handshake [SUPERSEDED]
@@ -137,7 +137,6 @@ Proposals by status:
    140  Provide diffs between consensuses [for 0.2.2.x]
    147  Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
    157  Make certificate downloads specific [for 0.2.1.x]
-   166  Including Network Statistics in Extra-Info Documents [for 0.2.2]
    172  GETINFO controller option for circuit information
    173  GETINFO Option Expansion
    174  Optimistic Data for Tor: Server Side
@@ -179,6 +178,7 @@ Proposals by status:
    148  Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
    150  Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
    152  Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
+   166  Including Network Statistics in Extra-Info Documents [for 0.2.2]
    167  Vote on network parameters in consensus [in 0.2.2]
  SUPERSEDED:
    112  Bring Back Pathlen Coin Weight
diff --git a/proposals/166-statistics-extra-info-docs.txt b/proposals/166-statistics-extra-info-docs.txt
index ab2716a..8b0c6a1 100644
--- a/proposals/166-statistics-extra-info-docs.txt
+++ b/proposals/166-statistics-extra-info-docs.txt
@@ -3,7 +3,7 @@ Title: Including Network Statistics in Extra-Info Documents
 Author: Karsten Loesing
 Created: 21-Jul-2009
 Target: 0.2.2
-Status: Accepted
+Status: Closed
 
 Change history:
 
diff --git a/proposals/ideas/old/xxx-geoip-survey-plan.txt b/proposals/ideas/old/xxx-geoip-survey-plan.txt
new file mode 100644
index 0000000..49c6615
--- /dev/null
+++ b/proposals/ideas/old/xxx-geoip-survey-plan.txt
@@ -0,0 +1,137 @@
+
+
+Abstract
+
+   This document explains how to tell about how many Tor users there
+   are, and how many there are in which country.  Statistics are
+   involved.
+
+Motivation
+
+   There are a few reasons we need to keep track of which countries
+   Tor users (in aggregate) are coming from:
+
+      - Resource allocation.  Knowing about underserved countries with
+        lots of users can let us know about where we need to direct
+        translation and outreach efforts.
+
+      - Anticensorship.  Sudden drops in usage on a national basis can
+        indicate the arrival of a censorious firewall.
+
+      - Sponsor outreach and self-evaluation.  Many people and
+        organizations who are interested in funding The Tor Project's
+        work want to know that we're successfully serving parts of the
+        world they're interested in, and that efforts to expand our
+        userbase are actually succeeding.  So do we.
+
+Goals
+
+   We want to know approximately how many Tor users there are, and which
+   countries they're in, even in the presence of a hypothetical
+   "directory guard" feature.  Some uncertainty is okay, but we'd like
+   to be able to put a bound on the uncertainty.
+
+   We need to make sure this information isn't exposed in a way that
+   helps an adversary.
+
+Methods for current clients:
+
+   Every client downloads network status documents.  There are
+   currently three methods (one hypothetical) for clients to get them.
+      - 0.1.2.x clients (and earlier) fetch a v2 networkstatus
+        document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30
+        minutes].
+
+      - 0.2.0.x clients fetch a v3 networkstatus consensus document
+        at a random interval between when their current document is no
+        longer freshest, and when their current document is about to
+        expire.
+
+        [In both of the above cases, clients choose a running
+        directory cache at random with odds roughly proportional to
+        its bandwidth.  If they're just starting, they know a XXXX FIXME -NM]
+
+      - In some future version, clients will choose directory caches
+        to serve as their "directory guards" to avoid profiling
+        attacks, similarly to how clients currently start all their
+        circuits at guard nodes.
+
+    We assume that a directory cache can tell which of these three
+    categories a client is in by the format of its status request.
+
+    A directory cache can be made to count distinct client IP
+    addresses that make a certain request of it in a given timeframe,
+    and total requests made to it over that timeframe.  For the first
+    two cases, a cache can get a picture of the overall
+    number and countries of users in the network by dividing the IP
+    count by the probability with which they (as a cache) would be
+    chosen.  Assuming that our listed bandwidth is such that we expect
+    to be chosen with probability P for any given request, and we've
+    been counting IPs for long enough that we expect the average
+    client to have made N requests, they will have visited us at least
+    once with probability P' = 1-(1-P)^N, and so we divide the IP
+    counts we've seen by P' for our estimate.  To estimate total
+    number of clients of a given type, determine how many requests a
+    client of that type will make over that time, and assume we'll
+    have seen P of them.
+
+    Both of these numbers are useful: the IP counts will give the
+    total number of IPs connecting to the network, and the request
+    counts will give the total number of users on the network at any
+    given time.
+
+    Notes:
+       - [Over H hours, the N for V2 clients is 2*H, and the N for V3
+         clients is currently around H/2 or H/3.]
+
+       - (We should only count requests that we actually intend to answer;
+         503 requests shouldn't count.)
+
+       - These measurements should also be taken at a directory
+         authority if possible: their picture of the network is skewed
+         by clients that fetch from them directly.  These clients,
+         however, are all the clients that are just bootstrapping
+         (assuming that the fallback-consensus feature isn't yet used
+         much).
+
+       - These measurements also overestimate the V2 download rate if
+         some downloads fail and clients retry them later after backing
+         off.
+
+Methods for directory guards:
+
+    If directory guards are in use, directory guards get a picture of
+    all those users who chose them as a guard when they were listed
+    as a good choice for a guard, and who are also on the network
+    now.  The cleanest data here will come from nodes that were listed
+    as good new-guards choices for a while, and have not been so for a
+    while longer (to study decay rates); nodes that have been listed
+    as good new-guard choices consistently for a long time (to get a
+    sample of the network); and nodes that have been listed as good
+    new-guard choices only recently (to get a sample of new users and
+    users whose guards have died out.)
+
+    Since directory guards are currently unspecified, we'll need to
+    make some guesses about how they'll turn out to work.  Here are
+    a couple of approaches that could work.
+       - We could have clients pick completely new directory guards on
+         a rolling basis every two months or so.  This would ensure
+         that staying as a guard for a while would be sufficient to
+         see a sample of users.  This is potentially advantageous for
+         load-balancing the network as well, though it might lose some
+         of the benefits of directory guard.  We need to quantify the
+         impact of this; it might not actually make stuff worse in
+         practice, if most guards don't stay good guards for a month
+         or two.
+
+       - We could try to collect statistics at several directory
+         guards and combine their statistics, but we would need to make
+         sure that for all time, at least one of the directory guards
+         had been recommended as a good choice for new guards.  By
+         looking at new-IP rates for guards, we could get an idea of
+         user uptake; by looking at old-IP decay rates, we could get
+         an idea of turnover.  This approach would entail significant
+         complexity, and we'd probably need to record more information
+         than we'd really like to.
+
+
diff --git a/proposals/ideas/xxx-geoip-survey-plan.txt b/proposals/ideas/xxx-geoip-survey-plan.txt
deleted file mode 100644
index 49c6615..0000000
--- a/proposals/ideas/xxx-geoip-survey-plan.txt
+++ /dev/null
@@ -1,137 +0,0 @@
-
-
-Abstract
-
-   This document explains how to tell about how many Tor users there
-   are, and how many there are in which country.  Statistics are
-   involved.
-
-Motivation
-
-   There are a few reasons we need to keep track of which countries
-   Tor users (in aggregate) are coming from:
-
-      - Resource allocation.  Knowing about underserved countries with
-        lots of users can let us know about where we need to direct
-        translation and outreach efforts.
-
-      - Anticensorship.  Sudden drops in usage on a national basis can
-        indicate the arrival of a censorious firewall.
-
-      - Sponsor outreach and self-evaluation.  Many people and
-        organizations who are interested in funding The Tor Project's
-        work want to know that we're successfully serving parts of the
-        world they're interested in, and that efforts to expand our
-        userbase are actually succeeding.  So do we.
-
-Goals
-
-   We want to know approximately how many Tor users there are, and which
-   countries they're in, even in the presence of a hypothetical
-   "directory guard" feature.  Some uncertainty is okay, but we'd like
-   to be able to put a bound on the uncertainty.
-
-   We need to make sure this information isn't exposed in a way that
-   helps an adversary.
-
-Methods for current clients:
-
-   Every client downloads network status documents.  There are
-   currently three methods (one hypothetical) for clients to get them.
-      - 0.1.2.x clients (and earlier) fetch a v2 networkstatus
-        document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30
-        minutes].
-
-      - 0.2.0.x clients fetch a v3 networkstatus consensus document
-        at a random interval between when their current document is no
-        longer freshest, and when their current document is about to
-        expire.
-
-        [In both of the above cases, clients choose a running
-        directory cache at random with odds roughly proportional to
-        its bandwidth.  If they're just starting, they know a XXXX FIXME -NM]
-
-      - In some future version, clients will choose directory caches
-        to serve as their "directory guards" to avoid profiling
-        attacks, similarly to how clients currently start all their
-        circuits at guard nodes.
-
-    We assume that a directory cache can tell which of these three
-    categories a client is in by the format of its status request.
-
-    A directory cache can be made to count distinct client IP
-    addresses that make a certain request of it in a given timeframe,
-    and total requests made to it over that timeframe.  For the first
-    two cases, a cache can get a picture of the overall
-    number and countries of users in the network by dividing the IP
-    count by the probability with which they (as a cache) would be
-    chosen.  Assuming that our listed bandwidth is such that we expect
-    to be chosen with probability P for any given request, and we've
-    been counting IPs for long enough that we expect the average
-    client to have made N requests, they will have visited us at least
-    once with probability P' = 1-(1-P)^N, and so we divide the IP
-    counts we've seen by P' for our estimate.  To estimate total
-    number of clients of a given type, determine how many requests a
-    client of that type will make over that time, and assume we'll
-    have seen P of them.
-
-    Both of these numbers are useful: the IP counts will give the
-    total number of IPs connecting to the network, and the request
-    counts will give the total number of users on the network at any
-    given time.
-
-    Notes:
-       - [Over H hours, the N for V2 clients is 2*H, and the N for V3
-         clients is currently around H/2 or H/3.]
-
-       - (We should only count requests that we actually intend to answer;
-         503 requests shouldn't count.)
-
-       - These measurements should also be taken at a directory
-         authority if possible: their picture of the network is skewed
-         by clients that fetch from them directly.  These clients,
-         however, are all the clients that are just bootstrapping
-         (assuming that the fallback-consensus feature isn't yet used
-         much).
-
-       - These measurements also overestimate the V2 download rate if
-         some downloads fail and clients retry them later after backing
-         off.
-
-Methods for directory guards:
-
-    If directory guards are in use, directory guards get a picture of
-    all those users who chose them as a guard when they were listed
-    as a good choice for a guard, and who are also on the network
-    now.  The cleanest data here will come from nodes that were listed
-    as good new-guards choices for a while, and have not been so for a
-    while longer (to study decay rates); nodes that have been listed
-    as good new-guard choices consistently for a long time (to get a
-    sample of the network); and nodes that have been listed as good
-    new-guard choices only recently (to get a sample of new users and
-    users whose guards have died out.)
-
-    Since directory guards are currently unspecified, we'll need to
-    make some guesses about how they'll turn out to work.  Here are
-    a couple of approaches that could work.
-       - We could have clients pick completely new directory guards on
-         a rolling basis every two months or so.  This would ensure
-         that staying as a guard for a while would be sufficient to
-         see a sample of users.  This is potentially advantageous for
-         load-balancing the network as well, though it might lose some
-         of the benefits of directory guard.  We need to quantify the
-         impact of this; it might not actually make stuff worse in
-         practice, if most guards don't stay good guards for a month
-         or two.
-
-       - We could try to collect statistics at several directory
-         guards and combine their statistics, but we would need to make
-         sure that for all time, at least one of the directory guards
-         had been recommended as a good choice for new guards.  By
-         looking at new-IP rates for guards, we could get an idea of
-         user uptake; by looking at old-IP decay rates, we could get
-         an idea of turnover.  This approach would entail significant
-         complexity, and we'd probably need to record more information
-         than we'd really like to.
-
-
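
Note on the estimator in the moved file's "Methods for current clients"
section: the correction P' = 1-(1-P)^N can be sketched in a few lines of
Python.  This is only an illustration under the document's own
assumptions (each request picks this cache independently with
probability P, and the average client makes N requests in the counting
window).  The function names are hypothetical and do not correspond to
anything in the Tor source.

```python
def visit_probability(p: float, n: float) -> float:
    """Chance that a client making n requests hits this cache at least once.

    Implements P' = 1 - (1 - P)^N from the survey plan. Assumes each
    request selects this cache independently with probability p.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - p) ** n


def estimate_total_ips(observed_ips: int, p: float, n: float) -> float:
    """Scale the distinct-IP count seen at one cache up to the whole network."""
    return observed_ips / visit_probability(p, n)


def v2_requests(hours: float) -> float:
    """N for v2 clients over `hours` hours: one fetch every 30 minutes."""
    return 2.0 * hours


if __name__ == "__main__":
    # Hypothetical numbers: a cache chosen for 1% of requests counts
    # 5000 distinct v2 client IPs over 24 hours.
    n = v2_requests(24)
    print(round(estimate_total_ips(5000, 0.01, n)))
```

The v3 case would use a smaller N (the notes in the file suggest roughly
H/2 or H/3 over H hours), so the same count at the same cache scales up
to a larger estimate.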

    
