commit bd2dd1ea00e79dcf60417f160d65ef062b0eef86
Author: ilv ilv@users.noreply.github.com
Date:   Fri May 16 02:09:36 2014 -0400

    Initial version for GetTor new spec
---
 spec/design/blacklist.txt |  74 ++++++++++++++++++++++++++++
 spec/design/core.txt      | 107 ++++++++++++++++++++++++++++++++++++++++
 spec/design/smtp.txt      | 118 +++++++++++++++++++++++++++++++++++++++++++++
 spec/design/twitter.txt   |  99 +++++++++++++++++++++++++++++++++++++
 spec/overview.txt         |  94 ++++++++++++++++++++++++++++++++++++
 5 files changed, 492 insertions(+)

diff --git a/spec/design/blacklist.txt b/spec/design/blacklist.txt
new file mode 100644
index 0000000..0cabaa2
--- /dev/null
+++ b/spec/design/blacklist.txt
@@ -0,0 +1,74 @@
+        Google Summer of Code 2014 GetTor Revamp - Blacklist module
+            Author: Israel Leiva - israel.leiva@usach.cl
+            Last update: 2014-05-16
+            Version: 0.01
+            Changes: First version
+
+1. Preface
+
+    Since GetTor was created it has been a collection of functions and
+    classes separated in various modules. As its main purpose was
+    to serve files over SMTP, almost all current files have SMTP-related
+    procedures, including black and white lists. The proposed design for
+    the blacklist module intends to separate GetTor services from the
+    blacklist procedures.
+
+2. Blacklist module
+
+    The main functionalities the blacklist module should provide are:
+
+    * Check if a given entry is blacklisted for a given service.
+    * Add/update/remove entries.
+    * Provide standard criteria to prevent flooding.
+
+3. Design
+
+    The new design should consist of the following files, directories and
+    functions:
+
+    * lib/gettor/blacklist/: Directory for storing blacklisted users of
+      different services.
+
+        ----- service1.blacklist: users blocked for service1
+        ----- service2.blacklist: users blocked for service2
+
+    * lib/gettor/blacklist.py: Blacklist module of GetTor.
+
+        isBlacklisted(address, provider)
+            Check if the given address is in the blacklist of provider and
+            if it meets certain restrictions. For example, it could
+            check when the entry was last updated on the blacklist.
+
+
+4. Roadmap
+
+    A possible example of how the blacklist module could work:
+
+      a. A service receives a request and calls the blacklist module.
+      b. The blacklist module checks the blacklist for that service.
+      c. If the (hashed) user is present in the blacklist, it checks when
+         the entry was last updated. If this date is more than X days
+         ago, it updates the entry with the current date and returns false.
+         If not, it returns true. If the user is not present in the file, it
+         adds the user with the current date and returns false.
+
+
+5. Discussion
+
+5.1 Mistakes
+
+    Mistakes concerning the package requested should be considered. If I
+    send a message asking for a Linux 'es' package but I write 'en' instead
+    of 'es', should I be able to ask again? Or wait until I'm no longer
+    blacklisted?
+
+5.2 Users
+
+    The method presented above (Roadmap) should consider a weekly or monthly
+    clean-up of the list.
+
+5.3 SorryMessage
+
+    Maybe when a user tries to send too many requests we could send them
+    a message saying that they won't be able to ask for packages in the next
+    X days.
diff --git a/spec/design/core.txt b/spec/design/core.txt
new file mode 100644
index 0000000..447c93b
--- /dev/null
+++ b/spec/design/core.txt
@@ -0,0 +1,107 @@
+        Google Summer of Code 2014 GetTor Revamp - Core module
+            Author: Israel Leiva - israel.leiva@usach.cl
+            Last update: 2014-05-16
+            Version: 0.01
+            Changes: First version
+
+1. Preface
+
+    Since GetTor was created it has been a collection of functions and
+    classes separated in various modules. As its main purpose was
+    to serve files over SMTP, almost all current files have SMTP-related
+    procedures, including address normalization, message composition, etc.
+    The proposed design for the core module intends to separate the GetTor
+    main functionalities which are independent of the service that
+    transports the bundles.
+
+2. Core module
+
+    The main functionalities the core module should provide are:
+
+    * Receive a request with OS version, architecture, bundle
+      language, and respond with the respective links.
+    * Generate links, per request or on demand, depending on whether the
+      former is accepted as part of the new design.
+    * Log anonymous transactions.
+
+3. Design
+
+    The new design should consist of the following files, directories and
+    functions:
+
+    * core.conf: Configuration values, e.g. base directory.
+
+    * providers/: Directory for providers configuration.
+
+        ----- providersList.txt: list of valid providers.
+        ----- provider1.conf: configuration for provider1.
+        ----- provider2.conf: configuration for provider2.
+
+      All this data is added manually.
+
+    * mirrors.txt: Contains official mirrors. One per line. Added manually.
+
+    * logs/: Directory for logs. Added automatically.
+
+        ----- yyyy-mm-dd.log: daily log of requests.
+
+    * lib/gettor/core.py: Core module of GetTor.
+
+        getLinks(os_version, arch, locale)
+            Returns links for os_version (in both archs) in locale language.
+            This will read the providers list and call __generateLinks()
+            for each one of them, and then call __getMirrors().
+
+            Example: getLinks('OSX', 'en')
+
+        __generateLinks(options, provider)
+            Generate links for a specific provider according to the options
+            received (os_version, locale). This will try to import the
+            provider module and call the uploadBundle function.
+
+            Example (within the module): __generateLinks(options, 'dropbox')
+
+        __getMirrors()
+            Obtains mirrors from mirrors.txt.
+
+        __logRequest(options)
+            Log information about the request for future stats (e.g. which
+            OS from which service is the most requested).
+
+    * lib/gettor/providers/provider.py: There should be one module per
+      provider with the uploadBundle public function. There should be
+      at least three modules at the end of GSoC: dropbox.py, drive.py,
+      github.py
+
+        uploadBundle(options)
+            Calls the provider internal functions to upload the required
+            bundle according to the options received. These internal
+            functions will depend solely on the API requirements of
+            the provider.
+
+4. Roadmap
+
+    An example of how the core module works:
+
+      a. The SMTP service receives a request.
+      b. The SMTP service calls getLinks() with the options sent by the user.
+      c. getLinks() calls __generateLinks() and then __getMirrors().
+      d. getLinks() constructs a message with the information obtained.
+      e.
+         getLinks() calls __logRequest().
+      f. getLinks() returns the message previously constructed.
+      g. The SMTP service builds a message with the links and sends it.
+
+5. Discussion
+
+5.1 Cache
+
+    The above design was conceived for per-request link generation. Another
+    way of doing this would be to maintain a cache of generated links and
+    call __generateLinks() depending on the cache's last modified time.
+    When reading links from this cache, we should check that the given
+    links still exist.
+
+5.2 Logs
+
+    Should we maintain separate logs for successful and failed requests?
diff --git a/spec/design/smtp.txt b/spec/design/smtp.txt
new file mode 100644
index 0000000..e86e864
--- /dev/null
+++ b/spec/design/smtp.txt
@@ -0,0 +1,118 @@
+        Google Summer of Code 2014 GetTor Revamp - SMTP module
+            Author: Israel Leiva - israel.leiva@usach.cl
+            Last update: 2014-05-16
+            Version: 0.01
+            Changes: First version
+
+1. Preface
+
+    Since GetTor was created it has been a collection of functions and
+    classes separated in various modules. As its main purpose was
+    to serve files over SMTP, almost all current files have SMTP-related
+    procedures, including address normalization, message composition, etc.
+    The proposed design for the SMTP module intends to separate the GetTor
+    main functionalities from the services, in this case, SMTP.
+
+2. SMTP module
+
+    The main functionalities the SMTP module should provide are:
+
+    * Receive requests via mail.
+    * Identify user instructions, such as asking for help or for a specific
+      bundle (OS version, language).
+    * Get the required links from the core module.
+    * Send different types of answers to the user.
+    * Manage blacklists to avoid flooding.
+    * Log requests for stats (anonymous).
+
+3. Design
+
+    The new design should consist of the following files, directories and
+    functions:
+
+    * lib/gettor/services/smtp.py: SMTP module of GetTor.
+
+        __init__(configuration path)
+            Creates an object according to the configuration values.
+
+        processEmail(email object)
+            Process emails received (by forwarding).
+
+        __parseEmail(email object)
+            Parse the raw email sent by processEmail(). Check for multi-part
+            emails and then parse the text part. It also tries to get the
+            locale information from the user's address.
+
+        __parseText(email object)
+            Parse the text part of an email looking for the package requested.
+
+        __getFrom(email object)
+            Returns the from address of an email object.
+
+        __getLocale(address)
+            Tries to get the locale information from an email address.
+
+        __checkBlacklist(address)
+            Check if the given address is blacklisted by comparing the
+            hashed address. If the address is not present, it's added. If
+            present, check the date when it was added. Yet to be defined:
+            how many days will be considered for blacklisting, or whether
+            another method will be used. For this it uses the blacklist module.
+
+        __sendReply(address)
+            Sends a reply to the user with the links required. It asks
+            the core module for the links.
+
+        __sendDelayAlert(address)
+            If enabled (in the configuration), sends a delay message to the
+            user letting them know that the links are on their way.
+
+        __sendHelp(address)
+            Sends a message to the user with help instructions.
+
+        __createEmail(from, to, subject, body)
+            Creates an email object.
+
+        __logRequest(options)
+            Log information about the request for future stats (e.g. which
+            OS and language are the most requested). If this is called
+            after a failure, a copy of the email should be stored.
+
+    * BASE_DIR/logs/: Directory for logs. BASE_DIR should be in the
+      configuration file.
+
+        ----- yyyy-mm-dd.log: daily log of requests.
+
+
+4. Roadmap
+
+    An example of how the SMTP module should work:
+
+      a. The SMTP service receives a request (via forwarding).
+      b. The email sender is checked for blacklisting (comparing hashes).
+      c.
+         The email is parsed to obtain the requested package and locale.
+      d. If the email was asking for help, a help reply is sent.
+      e. If the email was invalid, the process stops. This failure is
+         logged, along with the email that triggered it.
+      f. If the email was valid and the delay alert is set, then a reply
+         informing that the links are on their way is sent.
+      g. If the email was valid, the SMTP service asks the core module
+         for the links.
+      h. After that, a reply is sent to the user.
+      i. In all cases the request is logged (with no user information).
+
+
+5. Discussion
+
+5.1 Email forwarding
+
+    Are we going to support forwarding emails as ForwardPackage() did in
+    the old GetTor?
+
+5.2 Blacklist sublists
+
+    With fewer types of requests now (two if no forwarding is added), is it
+    necessary to create sublists per request type to blacklist an email
+    address? Or should we blacklist it if it floods anything?
+
diff --git a/spec/design/twitter.txt b/spec/design/twitter.txt
new file mode 100644
index 0000000..46a178a
--- /dev/null
+++ b/spec/design/twitter.txt
@@ -0,0 +1,99 @@
+        Google Summer of Code 2014 GetTor Revamp - Twitter module
+            Author: Israel Leiva - israel.leiva@usach.cl
+            Last update: 2014-05-16
+            Version: 0.01
+            Changes: First version
+
+1. Preface
+
+    Since GetTor was created it has been a collection of functions and
+    classes separated in various modules. As its main purpose was
+    to serve files over SMTP, almost all current files have SMTP-related
+    procedures, including address normalization, message composition, etc.
+    The proposed design for the Twitter module intends to separate the GetTor
+    main functionalities from the services, in this case, Twitter.
+
+2. Twitter module
+
+    The main functionalities the Twitter module should provide are:
+
+    * Receive requests via direct messages.
+    * Identify user instructions, such as asking for help or for a specific
+      bundle (OS version, language).
+    * Get the required links from the core module.
+    * Send different types of answers to the user.
+    * Split answers to fit Twitter's format.
+    * Manage blacklists to avoid flooding.
+    * Log requests for stats (anonymous).
+
+3. Design
+
+    The new design should consist of the following files, directories and
+    functions:
+
+    * lib/gettor/services/Twitter.py: Twitter module of GetTor.
+
+        __init__(configuration path)
+            Creates an object according to the configuration values.
+
+        processDM(message)
+            Process direct messages received.
+
+        __parseDM(message)
+            Parse the direct message received. Check for the package requested
+            and the locale information.
+
+        __getUser(message)
+            Gets the user from the message sent.
+
+        __checkBlacklist(user)
+            Check if the given user is blacklisted by comparing the
+            hashed user. Yet to be defined: how many days will be considered
+            for blacklisting, or whether another method will be used. For
+            this it uses the blacklist module.
+
+        __sendReply(user)
+            Sends a reply to the user with the links required. It asks
+            the core module for the links and then splits them.
+
+        __sendHelp(user)
+            Sends a message to the user with help instructions.
+
+        __splitMessage(message)
+            Receives the links message and splits it according to Twitter's
+            format.
+
+        __CheckNewFollowers()
+            In order to ask for links the user has to follow the GetTor
+            account. The Twitter module will be constantly checking for
+            new followers and following them back.
+
+        __FollowUser(user)
+            Follow the given user.
+
+        __logRequest(options)
+            Log information about the request for future stats (e.g. which
+            OS and language are the most requested). If this is called
+            after a failure, a copy of the DM should be stored.
+
+    * BASE_DIR/logs/: Directory for logs. BASE_DIR should be in the
+      configuration file.
+
+        ----- yyyy-mm-dd.log: daily log of requests.
+
+
+4. Roadmap
+
+    An example of how the Twitter module should work:
+
+      a. The Twitter account receives a DM.
+      b.
+         The Twitter service checks whether the message is valid and whether
+         the user is blacklisted, then tries to obtain the package and locale.
+      c. The Twitter service asks the core module for the links, then it
+         splits the message received to fit Twitter's format.
+      d. One or more DMs are sent back to the user.
+      e. For all this, the user must follow the GetTor account. The Twitter
+         service will be constantly checking for new followers and following
+         them back.
+
diff --git a/spec/overview.txt b/spec/overview.txt
new file mode 100644
index 0000000..435d256
--- /dev/null
+++ b/spec/overview.txt
@@ -0,0 +1,94 @@
+        Google Summer of Code 2014 GetTor Revamp - Overview
+            Author: Israel Leiva - israel.leiva@usach.cl
+            Last update: 2014-05-16
+            Version: 0.01
+            Changes: First version
+
+1. Background
+
+    GetTor was created as a program for serving Tor and related files over
+    SMTP, thus avoiding direct and indirect censorship of Tor's software,
+    in particular, the Tor Browser Bundle (TBB). Users interact with GetTor
+    by sending emails to a specific email address. In the past, after the
+    user specified their OS and language, GetTor would send an attachment
+    with the required package. This worked until the bundles were too large
+    to be sent as attachments by most email providers. In order to fix this
+    GetTor started to send links instead.
+
+2. Current status
+
+    The GetTor status can be summarized in the following points:
+
+    * Emails are sent to gettor@torproject.org
+    * The GetTor reply contains: TBB links, signatures (with text guides
+      for verification), mirrors, support instructions in six languages.
+    * Dropbox links are sent to download the TBB and signatures.
+    * Users cannot ask for packages in their language.
+    * English-only replies are sent.
+    * Every email sent to GetTor gets a reply with the same information;
+      there is no recognition of instructions.
+    * Link generation is not fully automated.
+    * All code is written in Python. Various parts are not currently used.
+    * Current repositories are [0] and [1].
+
+3. Proposal
+
+    The accepted proposal [2] for Google Summer of Code (GSoC) 2014 proposes
+    rewriting the current GetTor using a modular design, with a core module
+    that handles the main GetTor functionalities, and several other modules,
+    one for each service (e.g. SMTP), which can interact with the core and
+    send replies to the users. Three modules will be developed for the
+    purposes of GSoC: SMTP, Twitter, Skype|XMPP.
+
+3.1. Goals
+
+    The main goals of this proposal are the following:
+
+    * Provide old GetTor functionalities, such as replies in several
+      languages and recognition of user instructions.
+    * Send less information in each reply.
+    * Support more providers for uploading the TBB packages.
+    * Automate link generation.
+    * Clearer, modular and well-documented code.
+    * Possibility to create new modules for other common services.
+
+3.2. Design
+
+    Preliminary designs for the core module and the services can be found
+    in the design/ folder. All services consider creating a Python script
+    to add the logic for using them. For example, there should be a script
+    that receives the emails and uses the SMTP module. For simplicity,
+    I've tried to specify mostly the main functions of every module; there
+    are some functions, like opening and writing files, that were not
+    considered in this preliminary phase.
+
+4. Discussion
+
+4.1. Skype
+
+    My co-mentor for GSoC, Nima, has publicly rejected the idea of creating
+    a module for Skype and proposed to implement one for XMPP instead.
+    I chose Skype for its popularity, but I have no other main reason
+    to maintain this option, so it's probable that the XMPP transport will
+    be implemented.
+
+4.2. Storing links
+
+    My original proposal considered the fact that links could be stored
+    somewhere with restricted access, ideally a git repository.
+    Nima mentioned that ideally the links shouldn't be stored. Maybe this idea
+    could be used only for the mirrors and providers configuration (see
+    core module design).
+
+4.3. Generating unique URLs
+
+    Nima mentioned that unique URLs could be generated for each request,
+    and in case the user doesn't have access to SSL, these links could be
+    served and later deleted or recycled. I like this idea.
+
+5. References
+
+    [0] https://gitweb.torproject.org/gettor.git
+    [1] https://gitweb.torproject.org/user/sukhbir/gettor.git
+    [2] https://ileiva.github.io/gettor_proposal.html
+
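
Note on the blacklist Roadmap (spec/design/blacklist.txt, steps b and c): the check described there might be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the module's implementation: the name is_blacklisted, the in-memory dict standing in for the per-service .blacklist file, the SHA-256 hash, and the 7-day default for "X days" are all hypothetical and do not come from the spec.

```python
import hashlib
from datetime import datetime, timedelta


def is_blacklisted(address, entries, max_age_days=7, now=None):
    """Sketch of Roadmap steps b and c of the blacklist spec.

    entries maps a hashed address to the datetime of its last update.
    In the spec this data would live in a per-service .blacklist file;
    a dict is used here only to keep the example self-contained.
    max_age_days stands in for the undefined "X days" (assumed value).
    """
    now = now if now is not None else datetime.now()
    # Only the hash of the address is stored (hash choice is an assumption).
    key = hashlib.sha256(address.encode("utf-8")).hexdigest()

    last_update = entries.get(key)
    if last_update is None:
        # User not present: add with the current date, not blacklisted.
        entries[key] = now
        return False
    if now - last_update > timedelta(days=max_age_days):
        # Entry older than X days: refresh it, not blacklisted.
        entries[key] = now
        return False
    # Entry is recent: the user is considered blacklisted.
    return True
```

A service would call this once per incoming request and skip the reply whenever it returns True. Note that, as in the Roadmap, a blocked request does not refresh the entry, so the user becomes eligible again X days after the last accepted request.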