Filename: xxx-bridgedb-learns-ipv6.txt
Title: BridgeDB Learns IPv6
Author: Aaron Gibson
Created: 5 Dec 2011
Status: Draft

Overview:

   This document outlines what we'll do to make BridgeDB fully support
   IPv6 bridges, and fully support IPv6 with the email, https, and
   bucket distributors.

Motivation:

   IPv6 bridges need a BridgeDB too.

What needs to change:

   There are two main tasks that must be completed for BridgeDB to
   support IPv6.

   1. BridgeDB must be able to parse IPv6 addresses from router
      descriptors. (Currently, BridgeDB does not recognize the
      or-address line described in 186-multiple-orports.txt.)

   2. BridgeDB must decide how to hand out IPv6 addresses.
      (Currently, BridgeDB distributors are not IPv6 aware, and
      provide no support for distinguishing bridges by address class.)

1. BridgeDB learns to parse or-address

   BridgeDB must learn how to parse the new or-address line from
   server descriptors. The new or-address line allows a router to
   specify a list of addresses and ports or port-ranges.

   Here is the or-address specification (see: 186-multiple-orports.txt)

      or-address SP ADDRESS ":" PORTLIST NL
      ADDRESS = IPV6ADDR | IPV4ADDR
      IPV6ADDR = an ipv6 address, surrounded by square brackets.
      IPV4ADDR = an ipv4 address, represented as a dotted quad.
      PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
      PORTSPEC = PORT | PORT "-" PORT
      PORT = a number between 1 and 65535 inclusive.

   BridgeDB must now comprehend and store multiple listening addresses
   and ports. BridgeDB currently assumes that each bridge has only one
   listen address.

   BridgeDB must be modified to take one of the following approaches:

      a. Treat each ADDRESS:PORT combination as a separate bridge entity
      b. Display a subset of each bridge's ADDRESS:PORT entries in a
         response
      c. Display all of each bridge's ADDRESS:PORT entries in a response

   Given any address of the bridge you can learn its fingerprint, and
   use that to look up its descriptor at tonga and learn the rest of the
   addresses.
   So counting a bridge with 5 addresses as 5 bridges makes it more
   likely to get blocked by a smart adversary, but more useful against a
   weaker adversary.
   #XXX: thanks arma!
   # any other thoughts here?

   Option c. seems to be the simplest to implement.

   BridgeDB should be able to mark specific IP:port pairs as blocked,
   and indicate where each pair is blocked (e.g. by country code). This
   requirement is complicated by the fact that or-address may specify a
   range of listening ports.

   How are IPv6 Addresses stored in BridgeDB?

      IPv6 Addresses are stored as strings, the same way as IPv4
      addresses.
      #XXX: is this better than using the ipaddr.IPAddress class?

   How are Bridges differentiated by address class?

      Bridges are differentiated by the string representation of their
      IP address. When BridgeDB needs to make a distinction between IP
      address classes, the python module ipaddr-py
      (https://code.google.com/p/ipaddr-py/) will be used to determine
      address class.

2. BridgeDB learns how to selectively distribute IPv6 bridges

   BridgeDB's 3 distributors must be able to selectively provide IPv4
   bridges, IPv6 bridges, or both to clients. Deployment of these
   distributors must take care to mitigate reachability issues due
   partly to the ongoing transition from IPv4 to IPv6.

   [One such issue is a client that has IPv6 support on its local
   network but whose ISP does not; such a client may try to reach the
   IPv6 address specified by an AAAA record and fail to connect.]

   The 3 distributor types that BridgeDB currently features are:

   1. HTTPS Distributor

      The HTTPS distributor must be able to selectively offer both IPv4
      and IPv6 bridges to its clients, and BridgeDB must support both
      IPv4 and IPv6 connections.

      #XXX the twisted framework does not currently support IPv6. However,
      # it is possible to place BridgeDB behind a forwarding proxy such as
      # apache or nginx, which will pass the client address to BridgeDB in
      # the X-Forwarded-For header.
      # The BridgeDB HTTPS distributor must be able to parse the
      # X-Forwarded-For header for both IPv4 and IPv6 addresses.
      # Additionally, BridgeDB should eventually support IPv6 natively when
      # the twisted framework provides adequate IPv6 support.

      How does BridgeDB determine whether to respond with IPv4 or IPv6
      bridges?

         Users select IPv4 or IPv6 bridges by visiting different URLs. An
         informational message added to the BridgeDB response will
         include the other URL. Two separate BridgeDB instances are run,
         one for each URL.

         Alternatively, IPv6 bridges could be requested by visiting
         bridges.tpo/ipv6 or a similar URL path scheme.

      Proposed Additional Hostname For IPv6 Bridges

         BridgeDB shall listen for requests on two different hostnames,
         bridges.torproject.org and bridgesv6.torproject.org.

      DNS Configuration Details

         bridges.torproject.org shall not have an AAAA record until the
         addition of the record is determined to be sound.
         bridgesv6.torproject.org shall have both an AAAA and an A record.
         This is to avoid the confused-client scenario described above.

      How does BridgeDB know which URL was requested?

         This section describes how BridgeDB could be modified to support
         requests to both bridges.torproject.org and
         bridgesv6.torproject.org with a single BridgeDB instance.

         A single BridgeDB instance could handle requests to both
         bridges.torproject.org and bridgesv6.torproject.org by checking
         the Host header sent by the browser.

         The Host header is optional (HTTP/1.1 requires it, but HTTP/1.0
         does not). In order to expose the selector explicitly, BridgeDB
         must check the query string for the following parameters:

            q=ipv4 - Request IPv4 bridges.
            q=ipv6 - Request IPv6 bridges.

         Parameters may be repeated to select multiple classes, e.g.

            q=ipv4&q=ipv6 - Request both IPv4 and IPv6 bridges.

         When no parameters are set, BridgeDB must by default return
         addresses of the same class as the client. This default may
         promote IPv6 use where possible.

      How does someone end up at bridgesv6.torproject.org?

         BridgeDB should include a message at the end of its response.
         e.g.
            "Get IPv4 bridges from https://bridges.torproject.org"
            "Get IPv6 bridges from https://bridgesv6.torproject.org"
            "You must have IPv6 for these bridges to work."
         #XXX: will users understand what this means?

      How does IPv6 affect address datamining of https distribution?

         A user may be allocated a /128, or a /64.
         An adversary may control a /32 or perhaps larger.

         Proposal: Enable reCAPTCHA support by default.

      How do IPv6 addresses work with the IPBasedDistributor?

         #XXX: I need feedback on this
         # do we use all 128 bits here?
         # upper N bits? lower N bits? random or specific N bits?

      How are IPv6 Bridges actually distinguished?

         A new type of BridgeSplitter (sort of like a BridgeHolder) is
         devised; its function is to split bridges into different rings
         determined by a filter function. The filtering mechanism here is
         similar to BridgeDB's ipCategories implementation; the
         difference is that both the filters and ring names are specified
         at instance construction.

         A BridgeSplitter instance should be constructed by passing lists
         of tuples (ringName, filterFunction) to the constructor.

         This feature allows for dynamically creating filtered
         BridgeRings, which would prove useful for constructing more
         complex filters, for example, filtering by both address class
         and reachability from specific countries.

         This implementation could increase the rate at which bridges are
         detected and blocked, although the rate could be controlled by
         the frequency with which BridgeDB updates its knowledge of
         blocked bridges.

         #XXX: I have some concern about the performance of
         # filtering bridges dynamically for each response. BridgeDB should
         # learn to cache recently used dynamic filters so that the impact of
         # popular requests will be reduced, and BridgeDB does not have to
         # pre-compute or identify which types of requests will be popular.

         The implementation could look similar to the current 'subring'
         implementation or the current 'ipCategories' implementation.
         Both features are implemented using subrings that hold a subset
         of the parent ring's bridges, the subset being defined by a
         filtering function.

         An accompanying Distributor based on the existing
         IPBasedDistributor shall be designed to use the above
         BridgeSplitter so that sorted Bridges are selectable by address
         type.

         Because a bridge may now have both IPv6 and IPv4 addresses,
         BridgeDB needs to take one of the following approaches when only
         a single address class is requested:

            a. filter addresses of the other address class from the
               response
            b. include the other addresses in the response

   2. Email Distributor

      The Email Distributor must accept additional new commands parsed
      from the subject or a single line in the body of an email message.

         ipv4 - request IPv4 bridges.
         ipv6 - request IPv6 bridges.

      The default action may be set in bridgedb.conf with the option
      EMAIL_DEFAULT_ADDRESS_CLASS, which must be one of 'ipv6' or 'ipv4'.
      If the option is not given in the config,
      EMAIL_DEFAULT_ADDRESS_CLASS shall default to 'ipv4'.

      Similar to the IPBasedDistributor, BridgeDB must determine how the
      response should accommodate bridges with both address classes.

   3. Unassigned Distributor and Buckets

      BridgeDB must provide a selector to choose between exporting IPv4,
      IPv6, or both types of bridges.

      BridgeDB currently provides a way to export bucket files with
      unallocated bridges. The existing syntax provides no mechanism to
      differentiate by address class.

      Proposed new FILE_BUCKETS syntax:

         A dictionary of dictionaries with the following acceptable keys
         and values.

         'filename_prefix' shall be a unique string used as the output
         filename prefix. This string is also the key to a dictionary
         that contains the following key/values:

            'address-class' : either 'ipv6' or 'ipv4'
            'number'        : an integer > 0

         Users may wish to provide descriptive names, e.g.
         FILE_BUCKETS = {
            'filename_prefix': {'address-class': 'ipv6', 'number': 10},
            'descriptive_name': {'address-class': 'ipv6', 'number': 10},
         }

      Future BridgeDB enhancements may expand these options to include
      other filters.
      #XXX: e.g. buckets of bridges 'not-blocked-in'
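
      As an illustration only, the proposed FILE_BUCKETS syntax could be
      validated and applied roughly as sketched below. This is not
      BridgeDB code. The function names (validate_file_buckets,
      address_class, fill_buckets), the bucket names, and the policy of
      handing each bridge to the first matching bucket are all
      assumptions made for the example. The sketch also uses Python 3's
      standard ipaddress module in place of ipaddr-py, and assumes that
      unallocated bridges arrive as (address, port) tuples with IPv6
      addresses in square brackets.

```python
import ipaddress

# Hypothetical configuration in the proposed syntax; names are made up.
FILE_BUCKETS = {
    'ipv6_bucket': {'address-class': 'ipv6', 'number': 2},
    'ipv4_bucket': {'address-class': 'ipv4', 'number': 1},
}

VALID_ADDRESS_CLASSES = ('ipv4', 'ipv6')


def validate_file_buckets(buckets):
    """Raise ValueError if `buckets` does not follow the proposed syntax."""
    for prefix, options in buckets.items():
        if not isinstance(prefix, str) or not prefix:
            raise ValueError("filename prefix must be a non-empty string")
        if options.get('address-class') not in VALID_ADDRESS_CLASSES:
            raise ValueError(
                "%s: 'address-class' must be 'ipv4' or 'ipv6'" % prefix)
        number = options.get('number')
        # bool is a subclass of int, so reject it explicitly.
        if (isinstance(number, bool) or not isinstance(number, int)
                or number <= 0):
            raise ValueError("%s: 'number' must be an integer > 0" % prefix)


def address_class(address):
    """Return 'ipv4' or 'ipv6' for an address string (brackets allowed)."""
    version = ipaddress.ip_address(address.strip('[]')).version
    return 'ipv6' if version == 6 else 'ipv4'


def fill_buckets(buckets, unallocated):
    """Map each filename prefix to the bridge lines chosen for it.

    `unallocated` is a list of (address, port) tuples. Each bridge is
    handed to at most one bucket, in the order the buckets are listed
    (an assumed policy, not one taken from the proposal).
    """
    validate_file_buckets(buckets)
    remaining = list(unallocated)
    result = {}
    for prefix, options in buckets.items():
        wanted = options['address-class']
        chosen = [b for b in remaining if address_class(b[0]) == wanted]
        chosen = chosen[:options['number']]
        for bridge in chosen:
            remaining.remove(bridge)
        result[prefix] = ["%s:%d" % bridge for bridge in chosen]
    return result
```

      With the sample configuration above, a pool containing three IPv6
      bridges and two IPv4 bridges would yield two lines for
      'ipv6_bucket' and one for 'ipv4_bucket'; the remaining bridges
      stay unallocated.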