[tor-dev] Incorporating your torsearch changes into Onionoo
kostas at jakeliunas.com
Fri Oct 25 13:29:45 UTC 2013
On Wed, Oct 23, 2013 at 2:32 PM, Karsten Loesing <karsten at torproject.org> wrote:
> On 10/11/13 4:05 PM, Kostas Jakeliunas wrote:
> Oops! Sorry for the delay in responding! Responding now.
> > On Fri, Oct 11, 2013 at 12:00 PM, Karsten Loesing <karsten at torproject.org> wrote:
> >> Hi Kostas,
> >> should we move this thread to tor-dev@?
> > Hi Karsten!
> > sure.
> >> From our earlier conversation about your GSoC project:
> >>> In particular, we should discuss how to integrate your project into
> >>> Onionoo. I could imagine that we:
> >>> - create a database on the Onionoo machine;
> >>> - run your database importer cronjob right after the current Onionoo
> >>> cronjob;
> >>> - make your code produce statuses documents and store them on disk,
> >>> similar to details/weights/bandwidth documents;
> >>> - let the ResourceServlet use your database to return the
> >>> fingerprints to return documents for; and
> >>> - extend the ResourceServlet to support the new statuses documents.
> >>> Maybe I'm overlooking something and you have a better plan? In any
> >>> case, we should take the path that implies writing as little code as
> >>> possible to integrate your code in Onionoo.
> >> Let me know what you think!
> > Sounds good. Responding to particular points:
> >> - create a database on the Onionoo machine;
> >> - run your database importer cronjob right after the current Onionoo
> >> cronjob;
> > These should be no problem and make perfect sense. It's always best to
> > use raw SQL table creation routines to make sure the database looks
> > exactly like the one on the dev machine, I guess (cf. using SQLAlchemy
> > to do that, which I did before).
> > Current SQL script to do that is at [1]. I'll look over it. For example,
> > I'd (still) like to generate some plots showing the chances of two
> > fingerprints having the same substring (this is for the intermediate
> > fingerprint table). One axis would be substring length, the other would
> > be the probability in (fractions of a) percent. As of now, we still use
> > substr(fingerprint, 0, 12), and it is reflected in the schema.
> > Overall though, no particular snags here.
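The numbers behind such a plot can be roughed out with the birthday bound. This is only a sketch under stated assumptions: fingerprints are treated as uniformly random hex strings, the function names are made up for illustration, and the relay count is a placeholder, not a measured figure. Note also that if the database follows PostgreSQL's 1-based string positions, substr(fingerprint, 0, 12) yields the first 11 characters rather than 12, so the prefix length is left as a parameter here.

```python
import math


def collision_probability(n_fingerprints: int, prefix_len: int) -> float:
    """Chance that at least two of n uniformly random hex fingerprints
    share the same first `prefix_len` hex characters (birthday bound)."""
    if n_fingerprints < 2:
        return 0.0
    buckets = 16 ** prefix_len  # number of distinct hex prefixes
    if n_fingerprints > buckets:
        return 1.0  # pigeonhole: a collision is certain
    # log P(no collision) = sum_{i=0}^{n-1} log(1 - i/buckets)
    log_no_collision = sum(
        math.log1p(-i / buckets) for i in range(n_fingerprints)
    )
    # 1 - exp(x), computed via expm1 to keep precision for tiny probabilities
    return -math.expm1(log_no_collision)


def collision_table(n_fingerprints: int, prefix_lengths):
    """Return (prefix_len, probability in percent) pairs, ready for plotting."""
    return [
        (k, 100.0 * collision_probability(n_fingerprints, k))
        for k in prefix_lengths
    ]


if __name__ == "__main__":
    # ASSUMPTION: 100,000 distinct fingerprints is a placeholder value.
    for length, percent in collision_table(100_000, range(6, 17)):
        print(f"prefix length {length:2d}: {percent:.6f} %")
```

The x axis would be the prefix length and the y axis the percentage column; real fingerprint data from the database could then be compared against this idealized curve.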
> I don't follow. But before we get into details here, I must admit that
> I was too optimistic about running your code on the current Onionoo
> machine. I ran a few benchmark tests on it last week to compare it to
> new hardware, and those tests almost made it fall over. We should not
> even think about adding new load to the current machine.
> New plan: can you run an Onionoo instance with your changes on a
> different machine? (If you need anything from me, like a tarball of the
> status/ and out/ directories, I'm happy to provide them to you.) I
> think we should run this instance for a while to see how reliable it is.
> And once we're confident enough, we'll likely have new hardware for the
> new Onionoo, so that we can move it there.
This sounds like a very good idea. Ok, I can try and do this. Sorry for
delaying my response as well, I'll try and follow up with what I need (if
> >> - make your code produce statuses documents and store them on disk,
> >> similar to details/weights/bandwidth documents;
> > Right, so if we are planning to support all V3 network statuses for all
> > fingerprints, how are we to store all the status documents? The idea is
> > to preprocess and serve static JSON documents, correct (as in the current
> > Onionoo)? (cf. the idea of simply caching documents: if we serve a
> > particular status document, it gets cached, and depending on the query
> > parameters (date range restriction, e.g.) it may be set not to expire at
> > all.)
> > Or should we try and actually store all the statuses (the condensed
> > document version, of course)?
> Let's do it as the current Onionoo does it. This code does not exist,
I've done some small-scale testing on a local system, and the Onionoo way
seems plausible, since the generation of all the old(er) status etc.
documents needs to happen only once. (Obvious, but I now understand this
means the number of resulting status documents and their size is not such
a big deal after all.) I don't have good code for it yet.
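A minimal sketch of the "generate once, serve statically" idea follows. Everything in it is an assumption for illustration: the out/statuses/<fingerprint> layout, the function name, and the shape of the JSON document are hypothetical and not taken from Onionoo or torsearch.

```python
import json
from pathlib import Path


def write_status_document(out_dir, fingerprint, entries, overwrite=False):
    """Write one condensed statuses document per fingerprint.

    Documents for older statuses never change, so an existing file is
    left alone unless `overwrite` is set (e.g. for still-running relays).
    """
    # ASSUMPTION: hypothetical layout out/statuses/<FINGERPRINT>.
    path = Path(out_dir) / "statuses" / fingerprint.upper()
    if path.exists() and not overwrite:
        return path  # already generated in an earlier run
    path.parent.mkdir(parents=True, exist_ok=True)
    document = {"fingerprint": fingerprint.upper(), "entries": entries}
    # Write to a temporary file first, then rename, so a reader never
    # sees a half-written document.
    tmp_path = path.with_name(path.name + ".tmp")
    tmp_path.write_text(json.dumps(document, sort_keys=True))
    tmp_path.replace(path)
    return path
```

With this shape, a cronjob run after the first one only pays for new or changed fingerprints, which is why the total number and size of older documents matters less than it first appears.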
> >> - let the ResourceServlet use your database to return the
> >> fingerprints to return documents for; and
> >> - extend the ResourceServlet to support the new statuses documents.
> > Sounds good. I assume you are very busy with other things as well, so
> > ideally maybe you had in mind that I could try and do the Java part? :)
> > Though, since you are much more familiar with (your own) code, you could
> > probably do it faster than me. Not sure.
> > Any particular technical issues/nuances here (re: ResourceServlet)?
> Can you give it a try? Happy to help with specific questions about
> ResourceServlet, and I'll try hard to reply faster this time. Again,
> sorry for the delay!
Okay! I've been tinkering a bit, actually. Will see if I can produce
something decent and reliable.
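For orientation, the database lookup step (ask the database which fingerprints match, then return the documents for those fingerprints) might look roughly like this. It is a sketch in Python rather than the Java of ResourceServlet, sqlite3 stands in for the real database, and the table and column names are hypothetical, not the torsearch schema.

```python
import sqlite3
import string


def create_schema(conn):
    # ASSUMPTION: a hypothetical one-column table, not the real schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relay (fingerprint TEXT PRIMARY KEY)"
    )


def matching_fingerprints(conn, search, limit=10):
    """Return up to `limit` fingerprints starting with the hex string `search`."""
    if not search or any(c not in string.hexdigits for c in search):
        raise ValueError("search term must be a non-empty hex string")
    rows = conn.execute(
        "SELECT fingerprint FROM relay "
        "WHERE fingerprint LIKE ? ORDER BY fingerprint LIMIT ?",
        (search.upper() + "%", limit),
    )
    return [row[0] for row in rows]


def document_paths(fingerprints, out_dir="out"):
    """Map fingerprints to the static documents a servlet would return."""
    # ASSUMPTION: hypothetical out/statuses/<FINGERPRINT> layout.
    return [f"{out_dir}/statuses/{fp}" for fp in fingerprints]
```

The servlet would then only need the list of fingerprints from the database and could keep serving the pre-generated files from disk, which keeps the amount of new code small.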
> > [1]: https://github.com/wfn/torsearch/blob/master/db/db_create.sql