[tor-dev] prop224: HSDir caches question with OOM

Tim Wilson-Brown - teor teor2345 at gmail.com
Sat Apr 16 02:49:44 UTC 2016


> On 16 Apr 2016, at 05:47, David Goulet <dgoulet at ev0ke.net> wrote:
> 
> Hi!
> 
> (For the following, I'm only talking about HS directory.)
> 
> Here is my conundrum. I was working on plugging the new cache for proposal 224
> into our OOM handler, and I stumbled upon a fun problem.
> 
> We plan to run both the current and the new HS protocol in parallel. This means
> two caches: one for version 2 (current) and one for version 3 (prop224). So far
> so good.
> 
> However, here is the problem. Right now, our OOM handler is called in
> cell_queues_check_size(). The heuristic is basically:
> 
>      /* If we're spending over 20% of the memory limit on hidden service
>       * descriptors, free them until we're down to 10%.  */
> 
> Considering we now have _two_ caches, we somehow have to clean up both of them
> "evenly": if we have to remove 20 KB, we either remove 10 KB from each (if
> possible) or we clean them by entry expiry time instead, regardless of
> version.
> 
> I think the latter is what we want, but it complicates things a bit since the
> two caches contain _different_ data structures indexed by different keys
> (DIGEST vs DIGEST256). One option here is to go over all entries in both
> caches, create temporary objects, and put them in a list sorted by expiry
> time. Those "temp objects" would contain something like: 1) cache version,
> 2) expiry time, and 3) index key. Once done, we go over that list and delete
> entries until we reach the number of bytes we need to remove.
> 
> This could work, since we would only do it when we are at 20% of memory, but
> it still requires allocating some small objects in order to _free_ memory...
> Does someone maybe have something _better_ to propose? :)
> 
> (Keep in mind that minimal changes to the current HS code are preferable. I
> know that unifying the two caches somehow would help, but that's a lot more
> work! Also, the new cache for prop224 will support versioning, which our
> current one doesn't; it's baked for v2.)
> 
> I'm asking because this will require some refactoring, and I want to make sure
> we are in agreement on how to proceed so it doesn't end up being the wrong
> solution.
> 
> Basically, my goal is to move the OOM handling for both v2 and v3 into one
> single function called hs_cache_handle_oom(), from which we'll do the
> gymnastics described above. After that, we'll stop calling
> rend_cache_clean_v2_descs_as_dir() explicitly in cell_queues_check_size().

Having code that is only executed under rare conditions is error-prone.
Allocating memory at OOM time is error-prone.

We could add cache entries to the unified list when they are created, rather than at OOM time.
This also has the benefit of being faster than creating and sorting a large list at OOM time.
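As a rough sketch of that idea (the struct and function names below are
illustrative, not existing Tor code): every descriptor added to either cache
would also add a small record to one shared list kept sorted by expiry time,
so the OOM handler only ever has to walk the head of that list.

  #include <stddef.h>
  #include <time.h>

  /* Hypothetical unified index entry; not an existing Tor structure. */
  typedef struct oom_index_entry_t {
    int cache_version;                /* 2 or 3 */
    time_t expiry;                    /* descriptor expiry time */
    const char *key;                  /* DIGEST or DIGEST256 key into its cache */
    struct oom_index_entry_t *next;
  } oom_index_entry_t;

  /* Insert an entry in expiry order, so the head is always the oldest.
   * Called whenever a descriptor is added to either cache. */
  static void
  oom_index_insert(oom_index_entry_t **head, oom_index_entry_t *ent)
  {
    oom_index_entry_t **p = head;
    while (*p && (*p)->expiry <= ent->expiry)
      p = &(*p)->next;
    ent->next = *p;
    *p = ent;
  }

The bookkeeping cost is that entries also have to be unlinked whenever a
descriptor is removed for any other reason (expiry, replacement), not just at
OOM time.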

Alternatively, we could iterate through the two caches like this:
Find ta, the oldest expiry time in Cache A
Find tb, the oldest expiry time in Cache B
Set Cache X to the cache with the oldest descriptor
Set tx to the minimum expiry time from ta and tb
Set ty to the maximum expiry time from ta and tb
While tx <= ty and we are above the desired memory limit
  Deallocate the oldest descriptor from Cache X
  Set tx to the oldest expiry time in Cache X
If we are still above the desired memory limit, repeat this entire process

We should keep each cache's entries in a time-sorted list as they are created; otherwise, finding the oldest descriptor each round makes this algorithm O(n^2).
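Here is a minimal sketch of that loop, slightly simplified so that the two
oldest expiry times are re-compared on every eviction (which produces the same
oldest-first order). All the cache_v*_ helpers are placeholders for whatever
lookup/eviction functions the v2 and v3 caches end up exposing; none of them
exist in Tor today.

  #include <stddef.h>
  #include <time.h>

  /* Placeholder helpers, one set per cache; a return of 0 from the
   * *_oldest_expiry() functions means that cache is empty. */
  time_t cache_v2_oldest_expiry(void);
  time_t cache_v3_oldest_expiry(void);
  size_t cache_v2_evict_oldest(void);   /* returns bytes freed */
  size_t cache_v3_evict_oldest(void);   /* returns bytes freed */

  /* Evict the globally-oldest descriptors, alternating between the caches,
   * until we have freed bytes_to_remove or both caches are empty. */
  static size_t
  hs_cache_handle_oom_sketch(size_t bytes_to_remove)
  {
    size_t freed = 0;
    while (freed < bytes_to_remove) {
      time_t ta = cache_v2_oldest_expiry();
      time_t tb = cache_v3_oldest_expiry();
      if (ta == 0 && tb == 0)
        break;                          /* both caches are empty */
      if (tb == 0 || (ta != 0 && ta <= tb))
        freed += cache_v2_evict_oldest();
      else
        freed += cache_v3_evict_oldest();
    }
    return freed;
  }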

A third alternative is to iterate through each time period:
Set K to the oldest expected descriptor age in hours, minus 1 hour
Deallocate all entries from Cache A that are older than K hours
Deallocate all entries from Cache B that are older than K hours
Set K to K - 1 and repeat this process

This algorithm is O(Kn), which is fine as long as K is small.
It carries a slight risk of over-deallocating cache entries, which is acceptable at OOM time.
I like this one because it's simple, performant, and doesn't need any extra memory allocations.
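A minimal sketch of that sweep, assuming a maximum descriptor age of 72 hours
and per-cache "free everything older than this cutoff" helpers (both the
constant and the helpers are illustrative placeholders, not existing Tor code):

  #include <stddef.h>
  #include <time.h>

  #define MAX_DESC_AGE_HOURS 72   /* assumed oldest expected descriptor age */

  /* Placeholder helpers: free every entry older than the cutoff, and
   * report the current total allocation across both caches. */
  void cache_v2_free_older_than(time_t cutoff);
  void cache_v3_free_older_than(time_t cutoff);
  size_t hs_cache_total_allocation(void);

  /* Sweep hour buckets from oldest to newest, stopping as soon as we are
   * back under the desired memory target. */
  static void
  hs_cache_handle_oom_sweep(time_t now, size_t mem_target)
  {
    int k;
    for (k = MAX_DESC_AGE_HOURS - 1; k >= 1; k--) {
      time_t cutoff = now - (time_t)k * 3600;
      cache_v2_free_older_than(cutoff);
      cache_v3_free_older_than(cutoff);
      if (hs_cache_total_allocation() <= mem_target)
        return;
    }
  }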

Tim

Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP 968F094B
ricochet:ekmygaiu4rzgsk6n




