Hi everyone!
My name is Ismael and I'm a French student in information security. I will be working on the Ahmia search engine for this year's Google Summer of Code.
Before going back to school, I was working as a "full-stack" developer for a web agency. This is why I have a little experience with the tech I will be using for this project. I have also become more conscious of free software and privacy over the past couple of months, which is why I'm really happy to contribute to the Tor Project.
I will be working from 23 May to 21 August with my mentors Juha Nurmi (numes) and George Kadianakis (asn). I plan to send bi-weekly reports to tor-dev@.
I have several major goals:
Review code and infrastructure
This is the part where I focus on code quality, test cases, automation and fixing bugs. I want to make existing features work better, faster, easier.
Improve the search experience
It all starts with indexing more information about hidden services:
- Browse links on the same hidden service and treat them as different search results
- Get a screenshot of the page and display it on search results
- Use statistics (popularity, backlinks) to give search results a score and sort them
Then, the search engine should accept commands (think DuckDuckGo's !bangs) or parameters (think Google's site:*). Finally (and I'm not sure I will have the time to implement this), consider using natural language processing techniques to better understand the search query or a page's title, description or content.
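To make the command and scoring ideas above a bit more concrete, here is a rough Python sketch of how a query could be split into !bangs, parameters and plain terms, and how a result could be given a score. Everything in it is a hypothetical illustration and not Ahmia's current code: the names `parse_query`, `ParsedQuery` and `score`, the set of known parameters, the weights and the log-damped formula are all assumptions of mine.

```python
import math
from dataclasses import dataclass, field

# Assumption: only these parameter names are recognised; anything else
# containing ":" (for example a pasted URL) is kept as a plain term.
KNOWN_PARAMS = {"site"}


@dataclass
class ParsedQuery:
    terms: list = field(default_factory=list)   # plain search words
    bangs: list = field(default_factory=list)   # "!ddg" is stored as "ddg"
    params: dict = field(default_factory=dict)  # "site:x.onion" -> {"site": "x.onion"}


def parse_query(raw):
    """Split a raw query string into bangs, known parameters and plain terms."""
    parsed = ParsedQuery()
    for token in raw.split():
        key, sep, value = token.partition(":")
        if token.startswith("!") and len(token) > 1:
            parsed.bangs.append(token[1:].lower())
        elif sep and value and key.lower() in KNOWN_PARAMS:
            parsed.params[key.lower()] = value
        else:
            parsed.terms.append(token)
    return parsed


def score(text_relevance, popularity, backlinks, w_pop=0.3, w_links=0.2):
    """Add log-damped popularity and backlink counts to a text relevance value.

    The weights are arbitrary placeholders; log1p keeps very popular pages
    from completely drowning out textual relevance.
    """
    return (text_relevance
            + w_pop * math.log1p(popularity)
            + w_links * math.log1p(backlinks))


if __name__ == "__main__":
    q = parse_query("!ddg privacy tools site:example.onion")
    print(q.bangs, q.terms, q.params)
    print(score(1.0, popularity=10, backlinks=5))
```

In practice the scoring would more likely live inside the search backend than in application code; the sketch only shows the shape of the idea.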
Statistics
What should we be collecting? Can it improve search results? How can we share or visualize the data?
You can read my complete project proposal on Google Docs [1] or download it as a PDF [2]. I'm open to suggestions and questions. You can reach me by email or on IRC (nick: zma).
Cheers, Ismael
[1] https://goo.gl/AT37hA
[2] http://dl.free.fr/f6EMq8vOu