Hi all!
Over the last few weeks I have been busy working on my GSoC project, which
is about reducing the RTT of preemptively built circuits.
There is now a single script called "rttprober"[0] that depends on a
patched[1] Tor client running a certain configuration[2]. The goal is to
measure the RTTs of Tor circuits. It takes a few parameters as input: an
authenticated Stem Tor controller for communication with the Tor client, the
number of circuits to probe, the number of probes to take for each circuit,
and the number of circuits that should be probed concurrently. It outputs a
tar file containing lzo-compressed serialized data with detailed node
information, all circuit and stream events involved, and the circuit build
time for further analysis.
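To give an idea of the mechanics, here is a minimal sketch (not the actual
rttprober code; the probing logic is much simplified and the names are mine)
of timing circuit builds via an authenticated Stem controller:

import time
from stem.control import Controller

def time_circuit_build(controller):
    """Build a fresh circuit and return how long it took, in seconds."""
    start = time.time()
    # Blocks until the circuit reaches the BUILT state (or raises on failure).
    circuit_id = controller.new_circuit(await_build=True)
    elapsed = time.time() - start
    controller.close_circuit(circuit_id)
    return elapsed

with Controller.from_port(port=9051) as controller:
    controller.authenticate()
    build_times = [time_circuit_build(controller) for _ in range(10)]
    print(build_times)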
Since the RTT measurements are run in parallel with very short locks, it is
important not to overload Tor nodes. Therefore, a single node is never probed
more than once at a time.
A first analysis of the measurements taken so far supports the original
assumption that a Fréchet distribution fits both the circuit build times[3]
and the round-trip times[4].
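For reference, such a fit can be checked along these lines with SciPy (a
sketch, not my actual analysis code; SciPy exposes the Fréchet distribution as
scipy.stats.invweibull, and the input file name is made up):

import numpy as np
from scipy import stats

# One measured circuit build time (in seconds) per line; hypothetical file.
build_times = np.loadtxt("circuit_build_times.txt")

# Maximum-likelihood fit of the Fréchet (inverse Weibull) parameters.
shape, loc, scale = stats.invweibull.fit(build_times)
print("shape=%.3f loc=%.3f scale=%.3f" % (shape, loc, scale))

# Sanity-check the fit against the empirical data, e.g. with a K-S test.
d_stat, p_value = stats.kstest(build_times, "invweibull", args=(shape, loc, scale))
print("KS statistic=%.3f, p-value=%.3f" % (d_stat, p_value))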
I will continue gathering and analyzing measurement data and will hopefully be
able to draw some conclusions from that.
Best,
Robert
[0] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/rttprober.py?at=master
[1] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/patches?at=master
[2] https://bitbucket.org/ra_/tor-rtt/src/1127f6936086664981fc55b4dbc82b1570714140/torrc?at=master
[3] http://postimg.org/image/je8k5yydt/
[4] http://postimg.org/image/ktk90vxm7/
I'd like to improve my Haskell skills. Are there any opportunities?
I've been told there is at least one Haskell project that is no longer
maintained. (For example, this page [1] mentions TorDNSEL, which
was replaced by TorBEL.)
[1] https://www.torproject.org/getinvolved/volunteer.html.en
Hi Kevin,
I tried the bundles in https://kpdyer.com/fte/ .
For some reason, when I fire up 'start-tor-browser', the 'fte_relay' listener
never binds on '127.0.0.1:8079' (as the torrc expects it to). Hence Tor fails
to bootstrap and simply says:
"The connection to the SOCKS5 proxy server at 127.0.0.1:8079 just failed.
Make sure that the proxy server is up and running."
When I run 'start-tor-browser' I'm getting the following message in stdout:
"""
./bin/fte_relay: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not
found (required by /tmp/_MEI8IBYIj/libz.so.1)
"""
Could this be why fte_relay never sets up a listener?
Also, if I try to manually invoke fte_relay by doing:
./bin/fte_relay --mode client --server_ip 128.105.214.241 --server_port 8080
I get the same error.
Furthermore, I can't get ./bin/fte_relay to give me any kind of usage
information.
Any ideas?
Thanks!
Hi Karsten,
(I'm not sure whom to CC and whom not to; I have a couple of fairly specific
technical questions / comments, which, again, I should have delved into
earlier, but then again, maybe the scope of the tor-dev mailing list
includes such cases.)
@tor-dev: This is in regard to the Searchable metrics archive project,
intersected with Onionoo stuff.
I originally wanted to ask two questions, but by the time I reached the
middle of the email, I wasn't sure anymore whether they were questions, so I
think these are simply my comments / observations about a couple of
specific points, and I just want to make sure I'm making general sense. :)
So it turns out that it is very much possible to avoid Postgres sequential
scans for the metrics search backend/database - that is, to construct
queries/cases which are best executed using indexes (the query planner thinks
so, and it makes sense as far as I can tell) and whatever efficient (hm,
< O(n)?) search algorithms are deemed best. The two things that seem
potentially problematic to me are:
- when doing ORDER BY, making sure that the resulting SELECT covers / would
potentially return a relatively small number of rows (from what I've been
reading / trying out, whenever a SELECT may cover more than ~10-15% of all
the table's rows, a sequential scan is preferred, as using indexes would
result in even more disk I/O, whereas the sequential scan can read larger
chunks of data from disk because it is, well, sequential). This means that
doing "ORDER BY nickname" (or fingerprint, or any other non-unique column
that covers a relatively small part of the network status table) is much
faster than doing e.g. "ORDER BY validafter" (where validafter refers to the
consensus document's "valid after" field) - which makes sense, of course,
when you think about it (ordering a massive number of rows, even with a
proper LIMIT etc., is insane). We construct separate indexes (e.g. on
'fingerprint', even though it is part of a composite (validafter +
fingerprint) primary key), and all seems to be well.
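To make this concrete, here is a small sketch (table/column names as above;
the connection string is a placeholder) of creating such a separate index and
then asking the planner, via EXPLAIN, whether an ordered and limited SELECT
is served by an index scan or a sequential scan:

from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost/tordir")  # placeholder DSN

with engine.begin() as conn:
    # A separate index on the second column of the composite primary key,
    # so that "ORDER BY fingerprint" can be answered from the index.
    conn.execute(text("CREATE INDEX statusentry_fingerprint_idx "
                      "ON statusentry (fingerprint)"))

    # Ask the planner how it would execute the ordered, limited query.
    plan = conn.execute(text("EXPLAIN SELECT fingerprint, validafter "
                             "FROM statusentry ORDER BY fingerprint LIMIT 100"))
    for line in plan:
        print(line[0])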
I've been looking into the Onionoo implementation, particularly into
ResourceServlet.java [1], to see how ordering etc. is done there. I'm new
to that code, but am I right in saying that, as of now, the results (at
least for /summary and /details) are generally unsorted (except for the
possibility of "order by consensus_weight"), with "valid_after" fields
appearing in an unordered manner? This of course makes sense in this case,
as Onionoo is able to return *all* the results for given search criteria
(or, if none are given, all available results) at once.
The obvious problem with the archival metrics search project (argh, we are
in need of a decent name, I daresay :) maybe I'll think of something.. no
matter) is then, of course, the fact that we can't return all results at
once. I've so far been assuming that it would therefore make sense to
return them, whenever possible (and if not requested otherwise, once an
"order" parameter is implemented), in "valid_after" descending order. I
suppose this makes sense? It would be ideal, methinks. So far, it seems
that we can do that, as long as we have a WHERE clause that is restrictive
enough. For example, Postgres tells me that the following query works out
nicely in terms of efficiency / query plan:
select * from (
  select distinct on (fingerprint) fingerprint, validafter, nickname
  from statusentry
  where nickname like 'moria1'
  order by fingerprint, validafter desc
) as subq
order by validafter desc limit 100;

(The double ORDER BY is needed, as Postgres' DISTINCT ON requires an ORDER BY
whose leftmost criterion matches the DISTINCT ON (x) expression.)
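Since later in this email I mention wanting to outsource the relational logic
to the ORM, here is roughly the same query expressed with SQLAlchemy (a
sketch; the StatusEntry mapped class and its import path are assumptions on
my part, and SQLAlchemy's .distinct(column) emits Postgres' DISTINCT ON):

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from torsearch.models import StatusEntry  # hypothetical import path for the mapped class

engine = create_engine("postgresql://user:password@localhost/tordir")  # placeholder DSN
session = sessionmaker(bind=engine)()

# Inner query: newest entry per fingerprint for the given nickname.
subq = (session.query(StatusEntry.fingerprint,
                      StatusEntry.validafter,
                      StatusEntry.nickname)
        .distinct(StatusEntry.fingerprint)              # DISTINCT ON (fingerprint)
        .filter(StatusEntry.nickname.like('moria1'))
        .order_by(StatusEntry.fingerprint, StatusEntry.validafter.desc())
        .subquery())

# Outer query: order the de-duplicated rows by validafter and limit them.
rows = (session.query(subq.c.fingerprint, subq.c.validafter, subq.c.nickname)
        .order_by(subq.c.validafter.desc())
        .limit(100)
        .all())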
Now, you mentioned / we talked about what the (large, archival metrics)
backend should do about limiting/cutoff - especially if no search criteria
are specified, for example. Ideally, it might return a top-most list of
statuses (which in /details include info from server descriptors), sorted
by "last seen" / "valid after"? The thing is, querying a database for 100
(or 1000) items with no ORDER BY is really cheap; introducing ORDER BYs
which would still produce tons of results is considerably less so. I'm now
looking into this (and you did tell me this, i.e. that, I now think, a
large part of DB/backend robustness gets tested at these particular points;
this should have been more obvious to me).
But in any case, I have the question: what *should* the backend return when
no search parameters/filters are specified, or only very loose ones
(nickname LIKE "a") are given?
What would be cheap / doable: ORDER BY fingerprint (or digest (== hashed
descriptor), etc.). It *might* (well, should) work to order by fingerprint,
limit the results, and *then* reorder by validafter - with no guarantee that
the topmost results would have the highest absolute validafter values. I
mean, Onionoo is doing this kind of reordering / limiting itself, but that
makes sense, as it can handle all the data at once (or I'm missing something,
i.e. the file I linked to already interacts with a subset of the data;
granted, I haven't thoroughly read through it). In our/this case, it makes
sense to try and outsource all the relational logic to the ORM (but of
course, if it turns out we can cheaply get 100 arbitrary results and more
easily juggle them ourselves / on the (direct) Python side, then sure).
But would such arbitrarily returned results make sense? They would look just
like Onionoo results, only a (small) subset of them.
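To make the order-then-reorder idea concrete, a rough sketch (same assumed
session and StatusEntry model as in the sketch above):

# Step 1: cheap, index-friendly ordering and limiting on the database side.
rows = (session.query(StatusEntry)
        .order_by(StatusEntry.fingerprint)
        .limit(100)
        .all())

# Step 2: reorder the small result set by validafter on the Python side.
# Note: there is no guarantee that these are the globally newest entries.
rows.sort(key=lambda row: row.validafter, reverse=True)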
Ramble #2: Onionoo is doing more or less direct (well, compiled, so
efficient) regexps on fingerprint, nickname, etc. By default, "LIKE
%given_nickname_part%" is again (sequentially) expensive; Postgres does
offer full text search extensions (additional tables would need to be built,
etc.), and I think it makes sense to do this; it would cover all our "can
supply a substring of :param" bases. I'll look into this. (It's also
possible to construct functional indexes, e.g. with a regexp, but one needs
to know the template - I wonder if using Onionoo's regular expressions would
work / make sense - I'll see.) For now, LIKE / = with an exact string is
very nice, but of course the whole idea is that it should be possible to
supply substrings. Of note is the fact that in this case, LIKE %substring%
is O(n) in the sense that query time correlates with row count, afaict. As
of now, I think full text search extensions would solve the problem, even if
they may look a bit like overkill at first.
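One candidate for the substring case (a sketch; whether this or the full text
search machinery wins still needs to be measured, and the table/column names
are the assumed ones from above) is Postgres' pg_trgm extension, whose
trigram GIN indexes let the planner answer LIKE '%substring%' queries
without a sequential scan:

from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:password@localhost/tordir")  # placeholder DSN

with engine.begin() as conn:
    conn.execute(text("CREATE EXTENSION IF NOT EXISTS pg_trgm"))
    # A trigram index on the nickname column; the planner can use it for
    # LIKE/ILIKE patterns with leading and trailing wildcards.
    conn.execute(text("CREATE INDEX statusentry_nickname_trgm_idx "
                      "ON statusentry USING gin (nickname gin_trgm_ops)"))

    result = conn.execute(text("SELECT DISTINCT fingerprint, nickname "
                               "FROM statusentry "
                               "WHERE nickname LIKE :pattern LIMIT 100"),
                          {"pattern": "%moria%"})
    for fingerprint, nickname in result:
        print(fingerprint, nickname)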
End of an email without a proper direction/point. :)
[1]:
https://gitweb.torproject.org/onionoo.git/blob/HEAD:/src/org/torproject/oni…
Dear tor-devs,
is anyone here up for a coding task that could help us further research
performance improvements of the N23 design?
The situation is that we already have a branch (n23-5 in arma's public
repository), but it's based on 0.2.4.3-alpha-dev and needs to be rebased
to current master.
In theory, it's as simple as the following steps:
$ git clone https://git.torproject.org/tor.git
$ cd tor/
$ git remote add arma https://git.torproject.org/arma/tor.git
$ git fetch arma
$ git checkout -b n23-5 arma/n23-5
$ git fetch origin
$ git rebase origin/master
(clean up the mess)
$ git add <files with resolved conflicts>
$ git rebase --continue
(back to the clean-up-the-mess step until git is happy)
$ git push <your public remote> n23-5
Bonus points if you make sure the branch compiles with gcc warnings
enabled, appeases make check-spaces, and runs peacefully in a private
Chutney network.
Unfortunately, the n23-5 branch touches a few places in the tor code
that have been refactored in current master, including Andrea's
connection/channel rewrite. It might be necessary to dive into the
channel thing in order to get this rebase right.
Once we have a refactored n23-5 branch, I'll try to simulate it in Shadow.
For a tiny bit of context, this is for our sponsor F item 13:
https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorF/Year3
I'm asking here, because the usual suspects are already overloaded with
other stuff. As usual, I guess.
Thanks,
Karsten
Hi all,
To test the Tor program, I thought an independent implementation might
help. I started writing TorPylle with that in mind.
The purpose is NOT to provide a secure or robust implementation that
could be an alternative to Tor.
It relies on Scapy (http://www.secdev.org/projects/scapy/) and is
supposed to be used more or less the same way.
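For readers who haven't used Scapy, typical usage looks like the following
sketch (this is plain Scapy, only to illustrate the style of interface; it
does not use TorPylle's own classes):

from scapy.all import IP, TCP, sr1

# Build a packet layer by layer and send it, waiting for a single answer.
probe = IP(dst="192.0.2.1") / TCP(dport=443, flags="S")
reply = sr1(probe, timeout=2)
if reply is not None:
    reply.show()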
The code is here: https://github.com/cea-sec/TorPylle and includes an
example file.
This is at an early development stage. Comments, fixes and questions welcome!
Pierre
Hello,
I have been talking about this in #tor-dev for a while (and pestering
people with questions regarding some of the more nuanced aspects of
writing a pluggable transport, thanks to nickm, mikeperry and asn for
their help), and finally have what I would consider a pre-alpha for the
PT implementation.
obfsproxyssh is a pluggable transport that uses the SSH wire protocol to
hide Tor traffic. It uses libssh2 and interacts with a real sshd located
on the bridge side. Behaviorally, it is identical to a user sshing to a
host, authenticating with an RSA public/private key pair, and opening a
direct-tcp channel to the ORPort of the bridge.
It is aimed more at non-technical users (anyone with an account on a bridge
can create a tunnel of their own using existing ssh clients), and thus can
be configured entirely through the torrc.
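Conceptually, the client side does the equivalent of the following
Python/paramiko sketch (obfsproxyssh itself is C and uses libssh2; the host,
user, key path and ports below are placeholders):

import paramiko

BRIDGE_HOST = "bridge.example.com"   # placeholder bridge address
BRIDGE_SSH_PORT = 22
OR_PORT = 6969                       # the bridge's ORPort, reached via the tunnel

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(BRIDGE_HOST, port=BRIDGE_SSH_PORT, username="bridgeuser",
               key_filename="/path/to/id_rsa")

# Open a direct-tcpip channel to the ORPort, just like ordinary ssh local
# port forwarding would.
transport = client.get_transport()
channel = transport.open_channel("direct-tcpip",
                                 dest_addr=("127.0.0.1", OR_PORT),
                                 src_addr=("127.0.0.1", 0))
# Tor's TLS traffic is then relayed back and forth over this channel.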
It still needs a bit of work before it is ready for deployment but the
code is at the point where I can use it for casual web browsing, so if
people are interested, I have put a snapshot online at
http://www.schwanenlied.me/yawning/obfsproxyssh-20130627.tar.gz
Note that it is still at the "I got it working today" state, so some
parts may be a bit rough around the edges, and more than likely a few
really dumb bugs lurk unseen.
Things that still need to be done (in rough order of priority):
* It needs to scrub IP addresses in logs.
* I need to compare libssh2's KEX phase with that of popular ssh clients
(for the initial "production" release, more than likely PuTTY). It currently
also sends a rather distinctive banner, since I'm not sure which client(s)
it will end up mimicking.
* I need to come up with a solution for server side sshd logs. What I
will probably end up doing is writing a patch for OpenSSH to disable
logging for select users.
* In a peculiar oversight, OpenSSH doesn't have a way to disable reverse
ssh tunnels (e.g. "PermitOpen 127.0.0.1:6969" still allows clients to
listen on that port). Not a big deal if Tor starts up before any clients
connect, but I'll probably end up writing another patch for this.
* Someone needs to test it on Windows. All of my external dependencies
are known to work, so the porting effort should be minimal (Famous last
words).
* The code needs to scrub the private key as soon as a connection
succeeds in authenticating instead of holding onto it. Probably not a
big deal since anyone that can look at the PT's heap can also look at
the bridge line in the torrc.
Nice to haves:
* Write real Makefiles instead of using CMake (I was lazy).
src/CMakeLists.txt currently needs to be edited for anyone compiling it.
* It currently uses unencrypted RSA keys. libssh2 supports ssh-agent
(on all of the relevant platforms), so key management can be handled that
way. I do not think there is currently a mechanism for Tor to query the
user for a passphrase and pass it to the PT, but if one gets added,
supporting it would also be easy on my end.
* The code does not handle IPv6 since it uses SOCKS 4 instead of 5.
When Tor gets a way to pass arguments to PTs that are > 510 bytes, I
will change this.
* libssh2 needs a few improvements going forward (In particular it does
not support ECDSA at all).
* Code for the bridge side that makes the tunnel speak the managed PT
server transport protocol would be nice (for rate limiting).
* libssh2_helpers.c should go away one day. Not sure why the libssh2
maintainers haven't merged the patch that I based the code in there on.
Things that need to be done on the Tor side of things:
* 0.2.4.14-alpha does not have the PT argument parsing code, so this
requires a build out of git to function.
* The code currently in Git fails to parse bridge lines with arguments
that can't be passed via SOCKS 5 (size restriction). The PT tarball has
a crude patch that removes the check, but the config file parser needs
to be changed.
* The Tor code currently in Git likes to forget PT arguments. asn was
kind enough to provide me with a patch that appears to fix this (though
the PT has a workaround for when it encounters this situation), but
moving forward a formal fix would be ideal.
* All the PT-related cleverness in the world won't do much against active
probing if there is an ORPort exposed on the bridge. Tor should be able
to handle "ORPort 127.0.0.1:6969" (it may currently work, I'm not sure;
there should be a way to disable the reachability check, if only to
reduce log spam).
Open questions:
* Is this useful?
* Is it worth biting the bullet and rewriting this to use Twisted Conch
instead of being a C app?
* Would it be simpler to write a wrapper around existing ssh clients?
(Probably not.)
* How should people handle distributing bridge information?
* How should the case of the private key to a given bridge getting
compromised be handled? (Correctly configured, all that someone who obtains
the key would be able to do is talk to the OR port, so it's not a security
issue, but it does open the bridge to being blocked.)
* Does Tor tunneled over SSH look distinctive? No effort is made to
change the traffic signature, though this can be added if needed.
The tarball contains a more detailed README explaining how to set it up
and how it works. obfsproxyssh_client.c has a more in-depth TODO list
as a large comment near the top of the file.
Comments and feedback will be appreciated.
Regards,
--
Yawning Angel
Does anyone know whether the Nexus 4 baseband processor has r/w access to
system memory? The firmware doesn't seem to be loaded at boot, so I presume
it's entirely out of reach for reversing?
Hi everyone,
Some clean-ish working code is finally available online [1] (the old PoC
code has been moved to [2]); I'll be adding more soon, but this part does
what it's supposed to do, i.e.:
- archival data import (download, mapping to ORM via Stem, efficiently
avoiding re-import of existing data via Stem's persistence path, etc.);
what's left for this part is a nice and simple rsync+cron setup to be able
to continuously download and import new data (via Metrics archive's
'recent')
- data models and Stem <-> ORM <-> database mapping for descriptors,
consensuses and network statuses contained in consensuses
- models can be easily queried via SQLAlchemy's ORM; Karsten suggested
that an additional 'query layer' / internal API is not needed until there's
an actual need for it (i.e., my plan was to provide an additional query API
abstracted from the ORM (which is itself built on top of database/SQL/Python
classes), and to build a backend on top of it, as a neat client of that API
as it were; I had some simple and ugly PoCs that are now pushed out of the
priority queue until needed (if ever))
- one example of how this querying (directly atop the ORM) works is
provided: a simple (very partial) Onionoo protocol implementation for
/summary and /details, including ?search, ?limit and ?offset (a small usage
sketch follows right after this list). Querying takes place over all
NetworkStatuses. This is new in the sense that it uses the ORM directly. If
there is a need to formulate SQL queries more directly, we'll do that as well.
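As that small usage sketch (the base URL is a placeholder for wherever the
backend is running; the JSON field names follow the Onionoo protocol), the
/details endpoint can be exercised like this:

import json
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # placeholder for the torsearch backend

url = BASE_URL + "/details?search=moria1&limit=10&offset=0"
data = json.loads(urlopen(url).read().decode("utf-8"))

for relay in data.get("relays", []):
    print(relay.get("nickname"), relay.get("fingerprint"))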
During the Tor developer meetings in Munich, we talked over the existing &
proposed parts of the system with Karsten. I will be focusing on making sure
the Onionoo-like backend (which is being extended) is stable and efficient.
I'm still looking into database optimization (with Karsten's advice); an
efficient backend for the majority of all the archival data available would
be a great deliverable in itself, and hopefully we can achieve at least that.
I might do well to try and document the database iterations and development,
as a lot of the thinking now resides in a kind of 'black box' of DB spec,
which does not produce code.
The large Postgres datasets reside on a server I'm managing; I'm working on
exposing the Onionoo-like API for public queries, and I'm now doing some
simple higher-level benchmarking (simulating multiple clients requesting
different data at once, etc.). I might need to move the datasets to yet
another server (again), but maybe not; it's easy to blame things on limited
CPU/memory resources. :)
Kostas.
[1]: https://github.com/wfn/torsearch
[2]: https://github.com/wfn/torsearch-poc