[tor-bugs] #7520 [BridgeDB]: Design and implement a social distributor for BridgeDB

Sun Jun 16 14:56:39 UTC 2013

#7520: Design and implement a social distributor for BridgeDB
-------------------------+--------------------------------------------------
 Reporter:  aagbsn       |          Owner:     
     Type:  enhancement  |         Status:  new
 Priority:  normal       |      Milestone:     
Component:  BridgeDB     |        Version:     
 Keywords:  SponsorZ     |         Parent:     
   Points:               |   Actualpoints:     
-------------------------+--------------------------------------------------
Changes (by isis):

 * cc: isis@… (added)

Comment:

 *First, what I've researched thus far. Later, I will respond to the last
 few comments. My eye cannot handle this much glaring computer screen
 right now.*

 I spent a lot of time at first thinking about PoW schemes to protect
 BridgeDB from a malicious user requesting bridges and then burning them.
 To see notes on all the research I did on that, see Appendix A. The
 research is irrelevant, because an email from phw:

 > And I also wasted^Wspent quite some time thinking about PoW schemes for
 > scanning resistance and bridge distribution.  I came to the conclusion
 > that the bridge churn rate might not be high enough for it to make
 > sense.  I have some details in Section 4.1.1 here:
 > http://www.cs.kau.se/philwint/pdf/scramblesuit2013.pdf Let me know if
 > you have some thoughts; I'd love to chat more about this.

 and reading phw's ScrambleSuit paper [0], mentioned above, convinced me
 that PoW schemes cannot ever be made to work.

 I read the rBridge [12] and Proximax [13] papers, and I must agree with
 the rBridge authors that Proximax's granularity of categories for
 classifying whether or not a distribution tree is "infected" with
 malicious users is not fine-grained enough to have results as good as
 rBridge's (see the "Comparison with Proximax" section in [12]).

 Also, asn and I spoke very briefly on IRC about possible implementations
 of rBridge. asn thinks the crypto needs to be more widely reviewed. I
 agree, mostly... though I would note that some of the simpler
 propositions for privacy preservation in Section 5 of the rBridge paper,
 such as the Pedersen secrets [14] (essentially a "newer" version of
 Shamir's secret sharing algorithm, where "newer" means "1992"), are
 pretty well-established. However, I would need a bit of time to read up
 on the Oblivious Transfer (OT) scheme utilised [15] -- which most of the
 rBridge privacy preservation depends upon -- as well as that paper's
 authors' updates to their protocol [16][17], and more recently published
 articles on OT ([18] for one). First, I'm not a "real cryptographer".
 Second, this will take me a while, due to my recent injury. TTS doesn't
 do well with LaTeX algorithms.

 Third, that's beside the point. BridgeDB doesn't currently preserve
 privacy (as far as I can tell), and if it were compromised all the
 bridges would be leaked anyway. But in the meantime, without implementing
 Section 5 of the rBridge paper, we could implement the rest, and then
 perhaps have a real proposal to a funder later for implementing the
 privacy-preserving features, possibly including some way to incentivize
 real cryptographers to take a look at the proofs in Sections 5 and 6 of
 rBridge.

 I have also investigated, somewhat, other possible solutions (though not
 tailored to distribution of Tor bridges); these could perhaps use further
 research if it is decided that rBridge is non-optimal. One was using
 Mozilla's Persona [19][20] to implement blinded bridge-user account
 authentication (mostly because mikeperry has been praising Persona on
 tor-dev and got me interested in it). Another was IBM Research's IDEmix
 [21][22][23], though I put off looking into that more because there
 seemed to be easier, faster paths to improving BridgeDB (namely rBridge).
 Of the two, IDEmix seems more privacy preserving, as tokens are
 completely blinded, as opposed to Persona, where the centralised Persona
 server can see which identity (based on email address) is logging in to
 which service, from where (meaning which webpage, etc.), and when. Other
 papers which have been mentioned, and which I haven't yet gotten to
 reading, are BridgeHerder [24] and a "Kaleidoscope" paper which sysrqdb
 mentions.

 Concerning Metrics:
 -------------------

 Eventually, I think it will be necessary to conduct further research to
 determine if rBridge's privacy-preservation scheme for bridge users'
 identities is something we wish to implement, or if we should continue
 looking at any of the alternatives listed above, or others.

 I think this research, and the implementation of some type of
 privacy-preservation scheme, will be necessary for several reasons,
 mostly concerning safely obtaining metrics which would enable us to
 improve bridge distribution, uptime, and bridge user connectivity (though
 also out of concern at having the compromise of BridgeDB be a single
 point of failure for users in censored regions to obtain internet
 access):

   1) There isn't currently a way to safely and accurately get metrics for
   patterns in which bridges are getting burned (sysrqdb also makes this
   point in
   [https://trac.torproject.org/projects/tor/ticket/7520#comment:12 comment #12]).
   If there were a way to take these metrics, while still preserving the
   anonymity of the client, then we could track which client is using what
   without knowing who the client is.

   2) There isn't an easy way for clients, if/when they do connect to the
   Tor network, to report which bridges they had previously tried which did
   not work for them. Since bridges already can collect the geolocational
   data on connecting clients, it would be nice to have the ability for a
   client to say which bridges are unreachable, without giving away (to the
   bridge currently in use, as it might be malicious) any information on
   which bridges were previously tried (though it is probably safe to
   assume that if the other bridges are blocked and the current, malicious
   bridge is colluding with the censor, then the censor has already blocked
   the other bridges tried and thus already knows about them).

 Payment-based Bridge Distribution Schemes:
 ------------------------------------------

 Tor developers and assistants should not have to spend their time doing
 sysadmin work, and bridges should be easily runnable by third parties.
 aagbsn mentioned an idea to set up a "cloud bridge management" system
 where a provider would have a simple interface for deploying new bridges
 and taking anonymous payments, and users could subscribe to N bridges,
 helping to pay for the cost of running the bridges. [11] I don't
 especially like this idea; see Appendix A for why I think it doesn't help
 much.

 If "Concerning Metrics" #2 were solved, a provider could receive reports
 from clients' connections to the other bridges which would tell the
 provider that a bridge that the provider runs is unreachable, which the
 provider could then corroborate with reports from other users in the same
 country/region. There could even be an alert system, i.e. "email me if N
 clients from the same country report that the same bridge is down", or
 possibly/eventually automation of deploying a new/replacement bridge
 instance.
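
 A minimal sketch of the "email me if N clients from the same country
 report that the same bridge is down" rule, in Python since that is what
 BridgeDB is written in. The class name, method name, and the opaque
 client token are assumptions made for illustration; nothing like this
 exists in BridgeDB today, and the sketch presumes that the anonymous
 reporting problem in "Concerning Metrics" #2 has already been solved.

```python
from collections import defaultdict


class UnreachableReportTracker:
    """Hypothetical helper: counts distinct reporters per (bridge, country)."""

    def __init__(self, threshold):
        # N in "email me if N clients from the same country report ..."
        self.threshold = threshold
        # (bridge_id, country) -> set of opaque, unlinkable client tokens
        self._reporters = defaultdict(set)
        # keys for which an alert has already been raised
        self._alerted = set()

    def report(self, bridge_id, country, client_token):
        """Record one report.

        Returns True exactly once per (bridge, country): at the moment the
        number of distinct reporting clients first reaches the threshold.
        """
        key = (bridge_id, country)
        self._reporters[key].add(client_token)
        if len(self._reporters[key]) >= self.threshold and key not in self._alerted:
            self._alerted.add(key)
            return True
        return False


if __name__ == "__main__":
    tracker = UnreachableReportTracker(threshold=3)
    for token in ("t1", "t2", "t3"):
        if tracker.report("bridge-A", "CN", token):
            print("alert: bridge-A looks blocked in CN")
```

 Counting distinct tokens rather than raw reports keeps a single client
 from triggering the alert alone; whether such tokens can be issued
 without deanonymising clients is exactly the open question above.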

 WOT / contact list schemes:
 ---------------------------

   1) Using the PGP WOT seems like low-hanging fruit. Sending an email with
   a key in the strong set could possibly be a useful test for clients
   requesting bridges. However, it would be possible to generate a lot of
   keys, have them all sign each other, and then upload them to the key
   servers. It would be a good idea to reduce the incentive for a censor to
   take such actions. Requiring at least one of the signatures to have been
   made before this PGP-WOT-client-authentication system is deployed is one
   way to do it, though obviously rather exclusionary. Another way might be
   to require a trust path to a Tor developer, or other whitelisted
   parties, although this is exclusive and problematic for obvious reasons.

   2) Another option would be to have a way to hook into a Google account,
   to provide clients with the ability to send bridge "invites" to people
   on their contact list. This does not sound very sustainable, as every
   time a social network provider updated their API, the code for BridgeDB
   would need to change. Also, many censoring countries are moving towards
   having their own alternatives to US-based services, such as Weibo and
   Baidu in China, and VKontakte in Russia.
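
 The two PGP WOT checks from item 1 can be sketched as follows. This is
 an illustration only: the function names, the `signed_by` mapping (key
 ID to the set of key IDs that signed it), and the hop limit are all
 assumptions, and no keyserver or OpenPGP library is involved. Note that
 the creation time inside an OpenPGP signature is chosen by the signer,
 so the timestamps passed to the first check would have to come from a
 source a censor cannot backdate.

```python
from collections import deque
from datetime import datetime, timezone


def has_pre_deployment_signature(signature_times, cutoff):
    """True if at least one signature on the key predates the cutoff.

    signature_times: iterable of timezone-aware datetimes, assumed to come
    from a source that cannot be backdated by the key owner.
    """
    return any(ts < cutoff for ts in signature_times)


def has_trust_path(signed_by, key_id, trusted, max_hops=3):
    """Breadth-first search from key_id back through its signers.

    Returns True if some whitelisted key in `trusted` is reachable within
    max_hops signatures. A cluster of freshly generated keys that only
    sign each other never reaches a trusted key.
    """
    seen = {key_id}
    queue = deque([(key_id, 0)])
    while queue:
        current, hops = queue.popleft()
        if current in trusted:
            return True
        if hops == max_hops:
            continue
        for signer in signed_by.get(current, ()):
            if signer not in seen:
                seen.add(signer)
                queue.append((signer, hops + 1))
    return False


if __name__ == "__main__":
    signed_by = {
        "client": {"friend"},
        "friend": {"tor-dev"},
        "sybil1": {"sybil2"},
        "sybil2": {"sybil1"},
    }
    print(has_trust_path(signed_by, "client", {"tor-dev"}))
    print(has_trust_path(signed_by, "sybil1", {"tor-dev"}))
    cutoff = datetime(2013, 7, 1, tzinfo=timezone.utc)  # hypothetical date
    print(has_pre_deployment_signature(
        [datetime(2011, 3, 5, tzinfo=timezone.utc)], cutoff))
```

 Both checks share the drawback already noted: they exclude new users
 who have no old signatures and no path to a whitelisted key.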

 Questions:
   - Do Chinese social networks use OpenID?

 ============================================================================

 What to do ASAP:

   1) Go through BridgeDB's codebase and assess the difficulty and time
   requirements for implementing rBridge, Sections 1-4 only.

   2) Write a proposal for fixing/improving/maintaining BridgeDB.

      It should include:
      - Solutions and time requirements for the points in asn's list of
        concerns about BridgeDB
      - Better sanitisation of BridgeDB's logs #3797, #4771
      - Improving BridgeDB's knowledge of current Tor exit nodes (so that
        they aren't used to obtain bridges) #4405, #7750
      - Improving the email and website interfaces, including better
        commands/responses for email distribution #8705, #1562, #1610,
        #3061, #3573, #5851, #6125, #7296, #5655, #8616
      - Modifying the format of BridgeDB's response to provide extra fields
        for transport protocols like ScrambleSuit and obfsproxy (not sure
        if the latter was already implemented) #5119, #9013
      - Improving the documentation for BridgeDB code for future
        maintainers.

   3) Hopefully someone funds it.

   4) Do things.

 What to do long term:

   1) Evaluate and discuss the usefulness of directions for further
   research as outlined above.

   2) Do that research, rinse, repeat.

 ============================================================================

 Appendix A
 ----------

 I re-read the old Rivest, Shamir, and Wagner paper [1] on time-lock
 puzzles, as well as the scrypt paper [2] and some documentation on
 blockchain computation in bitcoin, trying to brainstorm a working PoW
 scheme. I also looked into automated CPU-scaling for MPI clusters [3],
 the latest custom ASIC hardware for SHA256 hash digest computation
 [4][5][6], and the current state of CMOS-based and single-electron
 transistors and their efficiency for carrying out the four steps for AES
 encryption [7][8][9][10], to try to get an idea of what the actual
 economic and time costs would be for a determined attacker. I think that
 a single determined attacker with roughly my capability levels, roughly
 three months of development time, and roughly $10,000 USD could break any
 of the PoW schemes which remain usable for Tor bridge clients. They might
 need more funding for hardware if we picked a work function which doesn't
 already have specialized chips/software in production for the underlying
 computations -- but nevertheless, this is a rather low bar, and it
 wouldn't buy us much in terms of decreasing the bridge enumeration rate.
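
 For readers unfamiliar with the kind of scheme being ruled out, here is
 a minimal hashcash-style client puzzle over SHA256. It is only a sketch
 of the general technique, not any specific scheme evaluated above; the
 function names and the 8-byte nonce encoding are assumptions made for
 illustration.

```python
import hashlib
import os


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge, nonce):
    # Hash of the server's challenge followed by an 8-byte big-endian nonce.
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge, difficulty):
    """Client side: brute-force a nonce; about 2**difficulty hashes on average."""
    nonce = 0
    while leading_zero_bits(_digest(challenge, nonce)) < difficulty:
        nonce += 1
    return nonce


def verify(challenge, nonce, difficulty):
    """Server side: a single hash checks the client's work."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty


if __name__ == "__main__":
    challenge = os.urandom(16)  # fresh per bridge request
    nonce = solve(challenge, difficulty=16)
    print(verify(challenge, nonce, difficulty=16))
```

 The difficulty has to stay low enough for an ordinary bridge client's
 CPU, while an attacker pays the same per-hash price on far cheaper
 hardware, which is the asymmetry described in the paragraph above.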

 Incidentally, I also read a PETS 2008 paper titled "PAR: Payment for
 Anonymous Routing" [11]. I was thinking about actual payment systems
 because aagbsn had mentioned at some point an idea to create an easy
 mechanism for outside entities to set up a sort of systems management
 system for a bunch of TorCloud bridge instances and take anonymous
 payment from users for private bridges. While I admit that I have no idea
 how "e-cash", as mentioned in the PAR paper, works, I know that the
 scheme they described would be completely traceable with bitcoin. Plus,
 our goal is to withstand attacks by government-level adversaries who are
 less economically-, CPU-, and memory-bounded than bridge users. Payment
 mechanisms would only allow the rich to subvert the "protections"
 afforded to bridges.

 ScrambleSuit is really interesting, has an elegantly simple morphing
 algorithm, and provides protection against timing and replay attacks.
 However, I wanted the PoW so that obtaining bridges through the current
 distribution mechanisms would be more expensive. At present, as aagbsn
 noted at one point (in person, not on this thread), "$20k could buy
 enough googlemail addresses to get all the bridges out of the email
 distributor". Since the PoW scheme isn't going to work, I'm trying to
 find a better way to *distribute* the bridge addresses -- which now also
 potentially includes ScrambleSuit's master shared secret.

 References:
 -----------
 [0] http://www.cs.kau.se/philwint/pdf/scramblesuit2013.pdf
     "ScrambleSuit: A Polymorph Network Protocol to Circumvent Censorship"
 [1] http://people.csail.mit.edu/rivest/RivestShamirWagner-timelock.ps
     "Time-lock Puzzles and Timed-release Crypto"
 [2] https://www.tarsnap.com/scrypt/scrypt.pdf
     "Stronger Key Derivation via Sequential Memory-Hard Functions"
 [3] https://datawrangling.s3.amazonaws.com/elasticwulf_pycon_talk.pdf
     "MPI Cluster Programming with Python and Amazon EC2"
 [4] http://www.butterflylabs.com/faq/
     see "What is 'Hosting' and how do I setup my order to be 'Hosted'?"
 [5] https://en.bitcoin.it/wiki/Mining_hardware_comparison#ASIC
 [6] http://electronics.stackexchange.com/questions/7042/how-much-does-it-cost-to-have-a-custom-asic-made
 [7] http://www.ecs.umass.edu/~jiazhao/custom_aes_liang_li.pdf
     "A Full-custom Design of AES SubByte Module with Signal Independent
     Power Consumption"
 [8] http://arxiv.org/pdf/1203.4811.pdf
     "Few Electron Limit of n-type Metal Oxide Semiconductor Single
     Electron Transistors"
 [9] Closing the Power Gap Between ASIC and Custom: Tools and Techniques
     for Low Power Design. Chinnery, David. Keutzer, Kurt William. pp.115-
     http://books.google.com/books?id=Pektbnxx6G4C&pg=PA115&lpg=PA115&ots=dqirO074Fj&dq=custom+asic+aes
 [10] https://en.wikipedia.org/wiki/Tensilica_Instruction_Extension
     "Tensilica Instruction Extension"
 [11] http://cs.gmu.edu/~astavrou/research/Par_PET_2008.pdf
     "PAR: Payment for Anonymous Routing"
 [12] http://www-users.cs.umn.edu/~hopper/rbridge_ndss13.pdf
     "rBridge: User Reputation based Tor Bridge Distribution with Privacy
     Preservation"
 [13] http://cseweb.ucsd.edu/~klevchen/mml-fc11.pdf
     "Proximax: A Measurement Based System for Proxies Dissemination"
 [14] http://www.cs.huji.ac.il/~ns/Papers/pederson91.pdf
     "Non-interactive and information-theoretic secure verifiable secret
     sharing"
 [19] https://github.com/mozilla/id-specs/blob/prod/browserid/index.md
     "Specifications related to Mozilla's Identity Effort."
 [20] https://current.trovebox.com
     A website using Mozilla's Persona for authentication.
 [21] http://www.zurich.ibm.com/security/idemix/
     "IBM Research: Identity Mixer"
 [22] https://idemix.wordpress.com/2009/09/29/pub-anonymous-credentials/
     Papers related to anonymous authentication in IDEmix.
 [23] http://domino.research.ibm.com/library/cyberdig.nsf/papers/EEB54FF3B91C1D648525759B004FBBB1/$File/rz3730_revised.pdf
     IDEmix specification

 Not Yet Read:
 -------------
 [15] http://www.pinkas.net/PAPERS/effot.ps
     "Efficient Oblivious Transfer Protocols"
 [16] http://logic.pdmi.ras.ru/ics/papers/ot.pdf
     "Computationally Secure Oblivious Transfer"
 [17] http://www.pinkas.net/
     "Homepage of Benny Pinkas"; includes a bibliography of this
     cryptographer's papers
 [18] http://www.iacr.org/archive/asiacrypt2002/25010142/25010142.pdf
     "Efficient Oblivious Transfer in the Bounded-Storage Model"
 [24] https://trac.torproject.org/projects/tor/ticket/7207
     "BridgeHerder: A tool to manage bridges"

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7520#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online