<div dir="ltr"><div><div><div><div>Hi Kostas.<br><br></div>I've taken the liberty of hacking on your code for the GSoC project I'm working on: Revamp GetTor. One of the ideas I had in mind was to give out links to the bundles via Twitter. Thankfully, your code made things a lot easier for me! :-P I've just made some changes to see if I could accomplish what I was looking for, nothing big. One of the things I did was to add a Messages class in twitter_bot.py [0] to handle messages in various languages with i18n. I hope you find it useful in case you have considered translated messages too.<br>


<br></div>That said, I'd like to discuss some issues with creating a Twitter bot that I think affect the projects we're both working on.<br><br></div>1) According to Twitter's "Automation rules and best practices" guide [1], in the section "Automated following and un-following", if I understand correctly, applications using twidibot will be suspended, as the current behavior is to automatically follow and un-follow users. Similar issues are mentioned in "Following rules and best practices" [2] and "Twitter rules" [3]. Are you aware of this? If so, what's the plan?<br>


<br></div>2) I'm not very familiar with bridges, but from what I understand, one of the reasons to use obfuscated bridges is to hide the fact that you're using Tor. With the current behaviour of twidibot (for both the BridgeDB and GetTor Twitter distributors), a malicious user could follow the bot's Twitter accounts, learn which users the bot followed and then un-followed, and thus identify all users who asked for bridges/bundles. If you're using an account with your real name, this could get you in trouble in places where using software to avoid censorship is prohibited. I think users should be warned about this. Have you considered this case, or am I just too paranoid?<br>


<div><div><div><br></div><div>I'll be glad to hear what you and others think about 1) and 2).<br></div><div><br>[0] <a href="https://github.com/ileiva/twidibot/blob/master/twidibot/twitter_bot.py" target="_blank">https://github.com/ileiva/twidibot/blob/master/twidibot/twitter_bot.py</a><br>


[1] <a href="https://support.twitter.com/articles/76915#" target="_blank">https://support.twitter.com/articles/76915#</a><br>[2] <a href="http://support.twitter.com/articles/68916-following-rules-and-best-practices" target="_blank">http://support.twitter.com/articles/68916-following-rules-and-best-practices</a><br>


[3] <a href="http://support.twitter.com/articles/18311-the-twitter-rules" target="_blank">http://support.twitter.com/articles/18311-the-twitter-rules</a><br><br><div><div><div class="gmail_extra"><br><div class="gmail_quote">

2014-07-13 14:00 GMT-04:00 Kostas Jakeliunas <span dir="ltr">&lt;<a href="mailto:kostas@jakeliunas.com" target="_blank">kostas@jakeliunas.com</a>&gt;</span>:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>

<br>

preferring existing code over shiny code and being mad late, I<br>

<br>

  * (re)wrote a simple but working churn control mechanism[1], which uses<br>

<br>

  * a general persistable storage system:<br>

<br>

    * in particular, the bot now has a central storage controller<br>

which takes care of storage handlers which, in turn, may be of<br>

different varieties. Each variety knows how to handle its own kind of<br>

storage containers (simple objects with data as attributes). Some of<br>

them may be persistable, others necessarily ephemeral (wipe data on<br>

close);<br>

    * right now we only make use of simple<br>

pickle-dump-to-file-and-gzip persistable storage; we use it for churn<br>

control and for challenge responses; everything is self-contained so<br>

to speak;<br>

    * we hash the user twitter handles (unique usernames / screen<br>

names) and round up bridges-last-given-at timestamps;<br>

    * we handle bot shutdown by catching the appropriate signal (then<br>

properly closing down the twitter stream listener and asking the<br>

storage controller to close down the handlers);<br>

    * we use the storage system in the core bot via a general "bot<br>

state" object (which is itself oblivious to how storage is actually<br>

implemented);<br>

<br>

  * wrote a simple and generic challenge-response system[2] (which<br>

makes use of the persistent storage);<br>

    * instead of doing something very smart, we use a general CR<br>

system which takes care of particular challenge-responses; the general<br>

CR is usable as-is; the particular CR objects can be easily subclassed<br>

(and that's what we do now);<br>

    * the current mock/bogus CR system that is in place (for testing<br>

etc.) is a naive text-based question-answer CR, which asks the users<br>

to add the number of characters in their twitter username to a given<br>

verbal/English-word number;<br>

    * I should now finish up with ``BridgeRequest``s, which are the<br>

proper way to handle bridge requests in the bot while doing<br>

challenge-responses (the current interaction between the core bot and<br>

the CR system will lead / has been leading nowhere);<br>

    * also, there's a question to be had whether the cached (and<br>

hashed) answers to CRs should be persisted to storage (if the bot gets<br>

shut down while some challenges are pending) in the first place.<br>

<br>
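As a rough illustration of the "hash the handles and round up the timestamps" idea in the list above, here is a minimal sketch. It is not taken from the twidibot code (that is in [1]); the class name ChurnControl, the salt, the one-hour rounding granularity and the 24-hour minimum interval are all assumptions made up for this example.<br>
<pre>
```python
import hashlib
import time

ROUND_TO = 3600           # assumed rounding granularity, in seconds
MIN_INTERVAL = 24 * 3600  # assumed minimum time between two hand-outs


def hash_handle(handle, salt=b"example-salt"):
    """Hash a Twitter handle so the plain handle is never stored.

    Handles are case-insensitive, so normalise to lower case first.
    The salt value here is a placeholder, not a real configuration.
    """
    data = salt + handle.lower().encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def round_up(timestamp, granularity=ROUND_TO):
    """Round a timestamp up to the next multiple of `granularity`."""
    return -(-int(timestamp) // granularity) * granularity


class ChurnControl:
    """Hypothetical churn control keyed on hashed handles."""

    def __init__(self, min_interval=MIN_INTERVAL):
        self.min_interval = min_interval
        self.last_given = {}  # hashed handle -> rounded-up timestamp

    def may_serve(self, handle, now=None):
        """Return True if this user has not been served too recently."""
        now = time.time() if now is None else now
        last = self.last_given.get(hash_handle(handle))
        return last is None or now - last >= self.min_interval

    def record(self, handle, now=None):
        """Remember (coarsely) when this user was last given bridges."""
        now = time.time() if now is None else now
        self.last_given[hash_handle(handle)] = round_up(now)
```
</pre>
Because only a hash and a coarse timestamp are kept, a leaked storage file would not directly list who asked for bridges or exactly when, although anyone who knows the salt could still test individual handles against it.<br>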

I've been unable to find[3] or to come up with a concept of a<br>

user-friendly *text-based* CR that would stand against any kind of<br>

thief who is able to create lots of Twitter users and to write<br>

twenty-line scripts solving any text-based challenges/questions<br>

presented. Either it will be a difficult problem that is<br>

more easily solved by a computer than by a human (hence unfeasible<br>

general-UX-wise), or it will be so "symmetrical" in the sense that one<br>

only has to view the source (if even that) to come up with a script<br>

trivially solving the challenge presented.<br>
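<br>
To make the "symmetry" concrete, here is a small sketch of a text challenge in the spirit of the mock CR described above (username length plus a number given as an English word), together with a solver. This is not the code from [2]; the function names, the wording of the question and the zero-to-ten word list are assumptions chosen for the example.<br>
<pre>
```python
import random

# Assumed vocabulary: the challenge number is spelled out as a word.
WORDS = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]


def make_challenge(rng=random):
    """Return (question text, number) for a fresh challenge."""
    n = rng.randrange(len(WORDS))
    question = ("Add the number of characters in your username to %s."
                % WORDS[n])
    return question, n


def check_answer(username, n, answer):
    """Check the reply against len(username) + n."""
    return str(answer).strip() == str(len(username) + n)


def solve(username, question):
    """What an attacker's script would do: read the last word, add."""
    word = question.rstrip(".").split()[-1]
    return len(username) + WORDS.index(word)
```
</pre>
The solver is shorter than the challenge generator, and it needs nothing beyond the question text itself, which is exactly why a scheme like this cannot hold up against someone scripting many accounts.<br>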

<br>

Hence I've been slowly moving on with the<br>

captcha-over-twitter-direct-messages idea, which is not pretty, but<br>

which would at least ensure that we don't give up bridges more easily<br>

than in, say, the current IPDistributor.<br>

<br>

[1]: <a href="https://github.com/wfn/twidibot/compare/master...churn_rewrite" target="_blank">https://github.com/wfn/twidibot/compare/master...churn_rewrite</a><br>

[2]: <a href="https://github.com/wfn/twidibot/compare/churn_rewrite...simple_cr2" target="_blank">https://github.com/wfn/twidibot/compare/churn_rewrite...simple_cr2</a><br>

<br>

[3] it's quite hard to find anything of use in the "chatroom problem"<br>

/ "text-based challenge response" area. Basically, it would be great<br>

to have a "reverse Turing test"[4] that is not about captcha/OCR. I<br>

realize this is in itself a very ambitious topic.<br>

[4]: some context on early CAPTCHAs / precursors (have been trying to<br>

familiarize myself with the general area),<br>

<a href="http://www2.parc.com/istl/projects/captcha/docs/pessimalprint.pdf" target="_blank">http://www2.parc.com/istl/projects/captcha/docs/pessimalprint.pdf</a><br>

<br>

--<br>

<br>

Kostas.<br>

<br>

0x0e5dce45 @ <a href="http://pgp.mit.edu" target="_blank">pgp.mit.edu</a><br>


</blockquote></div><br><br clear="all"><br>-- <br><div dir="ltr">israel</div>

</div></div></div></div></div></div></div>