[tor-dev] [GSoC] BridgeDB Twitter Distributor report
israel.leiva at usach.cl
Mon Aug 11 01:25:10 UTC 2014
I've taken the liberty to hack your code for the GSoC project I'm working
on: Revamp GetTor. One of the ideas in mind was to give links for the
bundles via Twitter. Thankfully, your code made things a lot easier for me!
:-P I've just made some changes to see if I could accomplish what I was
looking for, nothing big. One of the things I did was to add a class
Messages in twitter_bot.py to handle messages in various languages with
i18n, I hope you find it useful in case you have considered translated
That said, I'd like to discuss some issues about creating a Twitter bot
that I think affect the projects we're working on.
1) According to Twitter's "Automation rules and best practices" guide,
in the section on "Automated following and un-following", if I understand
it correctly, applications using twidibot will be suspended, as the current
behavior is to automatically follow and un-follow users. Similar issues are
mentioned in "Following rules and best practices" and "Twitter rules".
Are you aware of this? If so, what's the plan?
2) I'm not very familiar with bridges, but from what I understand, one of
the reasons to use obfuscated bridges is to hide the fact that you're using
Tor. With the current behaviour of twidibot (both for BridgeDB and GetTor
Twitter distributors), a malicious user could watch the bot's Twitter account,
learn which users the bot started following and then un-following, and thus
identifying all users that asked for bridges/bundles. If you're using an
account with your real name, this could get you in trouble in places where
using software to avoid censorship is prohibited. I think users should be
warned about this. Have you considered this case, or am I just too paranoid?
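
To make the concern in 2) concrete, here is a minimal sketch of what an observer would need to do, assuming they can periodically fetch the bot's public following list by some means. The function name and the sample data are hypothetical; nothing here uses the Twitter API or twidibot code.

```python
def infer_requesters(snapshots):
    """Return accounts that were in the bot's following list and later vanished.

    `snapshots` is an iterable of sets of usernames, ordered by time.
    (Hypothetical input: how the snapshots are collected is left out.)
    """
    requesters = set()
    previous = set()
    for current in snapshots:
        current = set(current)
        # Anyone followed in the previous snapshot but missing now was
        # followed and then un-followed, i.e. probably made a request.
        requesters |= previous - current
        previous = current
    return requesters


# Made-up sample data: "alice" is followed permanently,
# "bob" and "carol" are followed briefly and then dropped.
snapshots = [
    {"alice"},
    {"alice", "bob"},
    {"alice"},
    {"alice", "carol"},
    {"alice"},
]
observed = infer_requesters(snapshots)
```

A plain set difference between consecutive snapshots is enough, which is why I think the follow/un-follow pattern leaks who asked for bridges or bundles.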
I'll be glad to hear what you and others think about 1) and 2).
2014-07-13 14:00 GMT-04:00 Kostas Jakeliunas <kostas at jakeliunas.com>:
> Hi all,
> preferring existing code over shiny code and being mad late, I
> * (re)wrote a simple but working churn control mechanism, which uses
> * a general persistable storage system:
> * in particular, the bot now has a central storage controller
> which takes care of storage handlers which, in turn, may be of
> different varieties. Each variety knows how to handle its own kind of
> storage containers (simple objects with data as attributes). Some of
> them may be persistable, others necessarily ephemeral (wipe data on
> * right now we only make use of simple
> pickle-dump-to-file-and-gzip persistable storage; we use it for churn
> control and for challenge responses; everything is self-contained so
> to speak;
> * we hash the user twitter handles (unique usernames / screen
> names) and round up bridges-last-given-at timestamps;
> * we handle bot shutdown by catching the appropriate signal (then
> properly closing down the twitter stream listener and asking the
> storage controller to close down the handlers);
> * we use the storage system in the core bot via a general "bot
> state" object (which is itself oblivious to how storage is actually
> * wrote a simple and generic challenge-response system (which
> makes use of the persistent storage);
> * instead of doing something very smart, we use a general CR
> system which takes care of particular challenge-responses; the general
> CR is usable as-is; the particular CR objects can be easily subclassed
> (and that's what we do now);
> * the current mock/bogus CR system that is in place (for testing
> etc.) is a naive text-based question-answer CR, which asks the users
> to add the number of characters in their twitter username to a given
> verbal/English-word number;
> * I should now finish up with ``BridgeRequest``s, which are the
> proper way to handle bridge requests in the bot while doing
> challenge-responses (the current interaction between the core bot and
> the CR system will lead / has been leading nowhere);
> * also, there's a question to be had whether the cached (and
> hashed) answers to CRs should be persisted to storage (if the bot gets
> shut down while some challenges are pending) in the first place.
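
A minimal sketch of the churn-control storage described in the list above, assuming a single pickle-and-gzip file, SHA-256 for hashing the handles, and a one-hour rounding granularity. The class and function names, the hash choice, the lowercasing of handles, and the intervals are all assumptions for illustration, not what twidibot actually does.

```python
import gzip
import hashlib
import math
import os
import pickle
import tempfile

ROUND_TO = 3600  # assumed rounding granularity, in seconds


def hash_handle(handle):
    # Store only a digest of the (case-folded) handle, never the handle itself.
    return hashlib.sha256(handle.lower().encode("utf-8")).hexdigest()


def round_up(timestamp, granularity=ROUND_TO):
    # Round up to the next multiple of `granularity` to blur exact request times.
    return math.ceil(timestamp / granularity) * granularity


class ChurnStore:
    """Hypothetical persistable container: hashed handle -> rounded timestamp."""

    def __init__(self, path):
        self.path = path
        self.last_given = {}
        if os.path.exists(path):
            with gzip.open(path, "rb") as fh:
                self.last_given = pickle.load(fh)

    def record(self, handle, timestamp):
        self.last_given[hash_handle(handle)] = round_up(timestamp)

    def may_request(self, handle, now, min_interval=86400):
        last = self.last_given.get(hash_handle(handle))
        return last is None or now - last >= min_interval

    def close(self):
        # Called on shutdown, e.g. from a signal handler.
        with gzip.open(self.path, "wb") as fh:
            pickle.dump(self.last_given, fh)
```

Usage would be along the lines of `store = ChurnStore(path)`, then `store.record(handle, time.time())` after giving out bridges, and `store.close()` when the shutdown signal arrives. Note that an unsalted hash of a short public username is easy to brute-force, so the hashing only protects against casual inspection of the file.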
> I've been unable to find or to come up with a concept of a
> user-friendly *text-based* CR that would stand against any kind of
> thief who is able to create lots of Twitter users and to write
> twenty-line scripts solving any text-based challenges/questions
> presented. Either it will be a difficult problem that is more
> easily solved by a computer than by a human (hence unfeasible
> general-UX-wise), or it will be so "symmetrical" in the sense that one
> only has to view the source (if even that) to come up with a script
> trivially solving the challenge presented.
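
The "symmetrical" problem can be sketched with the naive question-answer CR mentioned earlier (add the length of the username to an English-word number). The challenge wording, the word table, and the function names below are assumptions for illustration; the point is only that the solver is as short as the checker.

```python
# Assumed vocabulary of English number words the bot might use.
WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def make_challenge(word):
    # Bot side: build the question text (hypothetical wording).
    return "Add the number of characters in your username to %s." % word


def expected_answer(username, word):
    # Bot side: the answer the bot would check against.
    return len(username) + WORDS[word]


def solve(username, challenge_text):
    # Attacker side: find the number word in the text and do the same sum.
    for token in challenge_text.lower().replace(".", " ").split():
        if token in WORDS:
            return len(username) + WORDS[token]
    raise ValueError("no number word found in challenge")


challenge = make_challenge("seven")
answer = solve("wfn", challenge)
```

Anyone who can read the challenge format can write `solve`, so this kind of CR slows down only those who are not scripting their requests.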
> Hence I've been slowly moving on with the
> captcha-over-twitter-direct-messages idea, which is not pretty, but
> which would at least ensure that we don't give up bridges more easily
> than in, say, the current IPDistributor.
> : https://github.com/wfn/twidibot/compare/master...churn_rewrite
> : https://github.com/wfn/twidibot/compare/churn_rewrite...simple_cr2
>  it's quite hard to find anything of use in the "chatroom problem"
> / "text-based challenge response" area. Basically, it would be great
> to have a "reverse Turing test" that is not about captcha/OCR. I
> realize this is in itself a very ambitious topic.
> : some context on early CAPTCHAs / precursors (have been trying to
> familiarize myself with the general area),
> 0x0e5dce45 @ pgp.mit.edu