[tor-dev] prop224: HSDir caches question with OOM

s7r s7r at sky-ip.org
Sat Apr 16 17:30:29 UTC 2016


On 4/16/2016 4:11 PM, David Goulet wrote:
>> A third alternative is that we can iterate through each time period:
>> Set K to the oldest expected descriptor age in hours, minus 1 hour
>> Deallocate all entries from Cache A that are older than K hours
>> Deallocate all entries from Cache B that are older than K hours
>> Set K to K - 1 and repeat this process
>> This algorithm is O(Kn), which is ok as long as K is small.
>> This carries a slight risk of over-deallocating cache entries. Which is OK at OOM time.
>> I like this one, because it's simple, performant, and doesn't need any extra memory allocations.
> I do also like this one. It's pretty simple and efficient.
> Now there is a fourth alternative that Yawning proposed in #tor-dev yesterday,
> which is to always target our v2 cache first in the OOM handling, that is,
> clean the v2 cache first and only then, if we have to, go to the v3 cache. It
> would be a "v3 is much more important than v2" kind of incentive.
> As he described it, it's a bit like our tap vs ntor situation under pressure:
> we prioritize ntor and drop tap if needed.
> I'm still quite _unsure_ about this. The v3 will bring more memory pressure
> with this second HSDir cache. And my intuition is that most users won't switch
> directly to v3 but will probably have a migration path from v2 to v3 like
> having the v2 onion on for X months before discontinuing it.
> So losing reachability because we decide to drop v2 first might not be
> desirable. But then also, how often is an HSDir OOM triggered... ?
> Anyway, right now I'm leaning towards your approach, teor, of just using the
> time-period.
> More eyes on this would be great :).
> Cheers!
> David

I agree that teor's O(Kn) approach is the best one in terms of
performance (no additional memory allocations), simplicity and efficacy.
The O(Kn) algorithm clears entries based only on their expiration time.
It does not favor either the v2 or the v3 cache, which is good, given
that we do not know how long HS operators will need to upgrade their
services to prop224.
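
A rough sketch of how I read that sweep, in Python for brevity rather
than tor's C. Descriptor, oom_clean and the dict-based caches are
made-up names for illustration, not anything in the tor codebase, and
stopping once K drops below 0 is my own assumption, since the thread
does not say when the loop ends.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical cache entry: age in whole hours and size in bytes."""
    age_hours: int
    size: int


def oom_clean(cache_a, cache_b, bytes_needed, max_age_hours):
    """Sweep both caches oldest-first until bytes_needed has been freed.

    cache_a and cache_b map an opaque key to a Descriptor (think v2 and v3).
    Returns the number of bytes actually freed.
    """
    freed = 0
    k = max_age_hours - 1  # oldest expected descriptor age, minus 1 hour
    # Assumption: stop once k drops below 0, so 0-hour-old entries survive.
    while freed < bytes_needed and k >= 0:
        for cache in (cache_a, cache_b):
            # Collect keys first so the dict is not mutated while iterating.
            stale = [key for key, d in cache.items() if d.age_hours > k]
            for key in stale:
                freed += cache.pop(key).size
        k -= 1
    return freed
```

Each pass scans both caches in full before the freed total is checked
again, so it can free more than was asked for. That is the slight
over-deallocation teor mentioned. Neither cache is treated differently:
only the age of an entry decides whether it goes.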

Prioritizing ntor over tap was a good measure, but the threat model
there was different: we were trying to ensure that new clients using
ntor got resources from relays ahead of non-updated botnet zombies
using tap. In the current situation we care about the v2 and v3 HS
caches equally, for an unknown period of time which might not be
short, so we shouldn't penalize v2 in any way.

This needs to be covered regardless of how often an HSDir has its OOM
handler triggered. I don't think we should assume it's hard to flood
HSDirs with descriptors until the memory is full.

Now that HSDirs will need to handle two caches, is 20% of the total
memory allocated for HS descriptors still a good value? What harm would
increasing it to, let's say, 25% do?
