Hello all,
Recently a decentralized onion-routing based Bittorrent client known as Tribler has made rounds through the clickbait garbage^w^wtech journalism sites. Since protocol design is a research interest of mine, I did some casual analysis based off the documentation and publicly available source code.
Disclaimer: Analysis was casual and not massively in-depth though I believe it to be correct. Known issues are highlighted as such, though there is a decent amount of new material. References to how Tor approaches certain design problems are provided as appropriate.
TLDR: Do not use this if your threat model includes active attackers, adversaries versed in cryptography, those with lots of money, or if you wish to be anonymous.
Material covered:
* The protocol documentation located at https://github.com/Tribler/tribler/wiki/Anonymous-Downloading-and-Streaming-...
* The source code of the "devel" branch https://github.com/Tribler/tribler/tree/devel/Tribler as of commit a98db38f60550dd304d1e329e16a41a683666c15
Protocol flaws:
* There is no built in safeguard against creating circuits of arbitrary lengths. The subset of the Tor protocol the Tribler designers implemented is lacking a RELAY_EARLY type construct.
Active adversaries can exploit this flaw to create substantial load on the Tribler network, and mount certain kinds of attacks.
See: * Proposal 110 " Avoiding infinite length circuits" (https://gitweb.torproject.org/torspec.git/tree/proposals/110-avoid-infinite-...) * Evans, S., Dingledine, R., Grothoff, C. "A Practical Congestion Attack on Tor Using Long Paths" (http://grothoff.org/christian/tor.pdf)
* The symmetric encryption is ECB-AES128, without authentication.
The former is a documented flaw in the specification and is claimed that it is "for testing", with what looks like a skeletal attempt to use OFB-AES128 present in the code.
ECB should not go on the wire ever and is only useful as a building block for other modes. Even as a "for testing" this is rather sloppy, and is something that should have been addressed way before real users start using the protocol.
The skeletal OFB-AES128 implementation included in dead code is not any better as it reuses a hard coded initialization vector for every single encryption operation. IV reuse within a given key across encryption operations breaks the confidentiality of the mode.
The lack of authentication is flat out inexcusable for reasons that should be obvious.
It is worth noting that this is one of the kludgier bits of the Tor wire protocol currently (see the 'digest' and 'recognized' fields in each RELAY cell, and proposals/ideas/xxx-new-crypto-sketch.txt), so improvements over "what Tor does" are possible and recommended.
Implementation flaws:
(As an aside, it's somewhat difficult to sort through which parts of the cryptography code are actually used and which are not.)
* When gmpy is not present, Diffie-Hellman keys are generated with python's random.randint(). This is issue #874, and it should be evident why this is incorrect, and utterly terrible.
The fact that developer comments regarding this bug note that this fallback is present only because gmpy did not work on Android should raise massive red flags in light of the next issue.
* When gmpy *is* installed, Diffie-Hellman keys are generated with gmpy's "rand()".
while dh_secret >= DIFFIE_HELLMAN_MODULUS or dh_secret < 2: dh_secret = rand("next", DIFFIE_HELLMAN_MODULUS)
As far as I can tell, gmpy's rand() is either unseeded or is seeded via the following idiom that is scattered throughout the code (I suspect it is the latter, due to the Paillier cryptosystem code being cross referenced from other modules, but it's hard to tell):
# The definition of StrongRandom changes depending on the file. # But every file that uses the construct illustrated here pulls it # in from optional_crypto, which looks like: # try: # raise ImportError() # # Dead code, this would be slightly less horrific, maybe. # from Crypto.Random.random import StrongRandom # except ImportError: # # ... Words can not describe how sad I am. # from random import Random as StrongRandom from optional_crypto import StrongRandom from sys import maxint
rand('init', 128) rand('seed', StrongRandom().randint(0, maxint))
The unseeded case results in a LCG of the form:
x = 1 // Yes, a hard coded seed. x = (29CF535 * x + 1) % (2^32)
It should be obvious as to why this is totally broken.
The seeded case results in a LCG of the form:
x = (48A74F367FA7B5C8ACBB36901308FA85 * x + 1) % (2^128)
This is fatally broken and insecure on several fronts.
a) "StrongRandom" is Mersenne Twister seeded once via os.urandom(). What little "StrongRandom" actually resembles something that is "Strong" and "Random" is due to the Python developers, though in practice it is neither.
b) The GMP LCG is seeded with 32/64 bits of "entropy", where "entropy" is Mersenne Twister output.
c) GMP's LCG is used for key generation.
* How not to do Diffie-Hellman:
key = pow(dh_received, dh_secret, DIFFIE_HELLMAN_MODULUS)
This is relatively minor since recovering the secret key is trivial via PRNG attacks, so the fact that the modular exponentiation is not constant time matters less.
* How not to do RSA:
def rsa_encrypt(key, element): assert isinstance(element, (long, int)), type(element) _element = mpz(element) return long(pow(_element, key.e, key.n))
def rsa_decrypt(key, cipher): assert isinstance(cipher, long), type(cipher) _cipher = mpz(cipher) return long(pow(_cipher, key.d, key.n))
RSA being implemented without blinding, using GMP's modular exponentiation, and without OAEP is something I pray that I never see again.
Things not covered:
* How the intro point/rendezvous system works, including peer discovery and etc (Assuming it is trivial to obtain lots of ECElgamal keys/IP address:Port combinations of endpoints, it appears that it is possible to use the CREATE/CREATED response to mount an UDP amplification attack. But I did not check how easy this would be to mount in practice.)
* The bittorrent bits.
* The "dispersy" component beyond "ok, at least the ECElgamal key generation uses M2Crypto, assuming the horribly broken keygen code (community.privatesemantic.crypto.ecelgamal.ecelgamal_init()) is dead like I think it is".
Recommendations:
* For users, "don't". Cursory analysis found enough fundamental flaws, and secure protocol design/implementation errors that I would be reluctant to consider this secure, even if the known issues were fixed. It may be worth revisiting in several years when the designers obtain more experience, and a thorough third party audit of the improved code and design has been done.
* For the developers:
* Use a CSPRNG when doing key generation. Neither GMP's LCG nor Mersenne Twister are CSPRNGs, even if they are seeded from "StrongRandom" that actually is a "StrongRandom" (rather than a horrible misnomer).
* Add a construct similar to RELAY_EARLY.
* Add authenticated encryption to cell payload. This will break wire compatibility.
* Use constant time modular exponentiation. As a matter of fact, don't include your own implementations of ECElGamal, RSA, and Diffie-Hellman. Use well tested/known secure library code instead.
* Blind RSA operations, use padding as appropriate.
* Fix the symmetric encryption somehow, after obtaining a in-depth understanding of what things break security of the mode you wish to use in various ways (Eg: IV/Nonce reuse).
See: http://web.cs.ucdavis.edu/~rogaway/papers/modes.pdf
* Instead of a DH + ElGamal handshake, consider Ntor. Even with GMPY, doing modular exponentiation is relatively CPU intensive, and there is the threat of CPU exhaustion attacks here.
* Clean out all the dead code, after printing out copies to use to scare small children.
Eg: Tribler/community/privatesemantic/crypto/elgamal.py
Most of the fixes require major revisions to the wire protocol. As it appears that there is no versioning, how that will be done is left as an exercise for the student.
Alternatively, rebase the system on I2P.
Regards,