[tor-dev] Python ExoneraTor

Damian Johnson atagar at torproject.org
Tue Jun 10 03:41:12 UTC 2014


>> let me make one remark about optimizing Postgres defaults: I wrote quite
>> a few database queries in the past, and some of them perform horribly
>> (relay search) whereas others perform really well (ExoneraTor).  I
>> believe that the majority of performance gains can be achieved by
>> designing good tables, indexes, and queries.  Only as a last resort
>> should we consider optimizing the Postgres defaults.
>>
>> You realize that a searchable descriptor archive focuses much more on
>> database optimization than the ExoneraTor rewrite from Java to Python
>> (which would leave the database untouched)?
>
> Are other datastore models such as Splunk or MongoDB useful?
> [Splunk has a free yet proprietary limited binary... those having
> historical woes and takebacks, mentioned just for example here.]

Earlier I mentioned the idea of Dynamo. Unless I'm mistaken, this lends
itself pretty naturally to using addresses as the hash key and
descriptor dates as the range key. Lookups would then be O(log(n)),
where n is the total number of descriptors an address has published
(that is to say, very, very quick).
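
To make the hash key / range key idea concrete, here is a rough
in-memory sketch. It is not Dynamo or Boto code, and the class and
method names (DescriptorIndex, put, query) are made up for
illustration. It only assumes that publication dates are stored as
ISO-style strings, so that string order matches chronological order.
A dict stands in for the hash key and a sorted list with binary search
stands in for the range key:

```python
import bisect
from collections import defaultdict


class DescriptorIndex:
    """Hypothetical in-memory stand-in for a hash key + range key table."""

    def __init__(self):
        # Hash key (relay address) -> sorted list of range keys (dates).
        self._dates = defaultdict(list)
        # (address, date) -> descriptor payload.
        self._items = {}

    def put(self, address, published, descriptor):
        """Store a descriptor; an existing (address, date) is overwritten."""
        key = (address, published)
        if key not in self._items:
            # Keeps the per-address date list sorted. Insertion into a
            # Python list is O(n); only the lookup below is O(log(n)).
            bisect.insort(self._dates[address], published)
        self._items[key] = descriptor

    def query(self, address, start, end):
        """Return descriptors for address published within [start, end]."""
        dates = self._dates.get(address, [])
        # Two binary searches locate the date range: O(log(n)) in the
        # number of descriptors this address has published.
        lo = bisect.bisect_left(dates, start)
        hi = bisect.bisect_right(dates, end)
        return [self._items[(address, d)] for d in dates[lo:hi]]


index = DescriptorIndex()
index.put("198.51.100.7", "2014-06-01 00:00:00", "descriptor-a")
index.put("198.51.100.7", "2014-06-02 00:00:00", "descriptor-b")
print(index.query("198.51.100.7", "2014-06-01 12:00:00", "2014-06-03 00:00:00"))
```

The cost of a query is the two binary searches plus the number of
matching descriptors returned, regardless of how many other addresses
are in the table.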

This would be a fun project to give Boto a try. *sigh*... there really
should be more hours in the day...

Cheers! -Damian

