Tim Wilson-Brown - teor teor2345@gmail.com writes:
[ text/plain ]
On 16 Apr 2016, at 05:47, David Goulet dgoulet@ev0ke.net wrote:
Hi!
(For the following, I'm only talking about HS directory.)
Here is my conundrum. I was working on plugging the new cache for proposal 224 into our OOM. I stumbled upon a fun problem.
<snip>
We should add entries to a time-sorted list when they are created, otherwise this algorithm is O(n^2).
A third alternative is that we can iterate through each time period: Set K to the oldest expected descriptor age in hours, minus 1 hour Deallocate all entries from Cache A that are older than K hours Deallocate all entries from Cache B that are older than K hours Set K to K - 1 and repeat this process
This algorithm is O(Kn), which is ok as long as K is small. This carries a slight risk of over-deallocating cache entries. Which is OK at OOM time. I like this one, because it's simple, performant, and doesn't need any extra memory allocations.
I feel this is a reasonable approach here. It will probably have to be combined with time-sorted lists as teor suggested, otherwise each of those steps will require walking through the whole cache every time.
BTW, before it says "Set K to K - 1 and repeat this process" we should also check whether our memory issue has been addressed and if so then we can exit, right?
In that case, maybe we can add the same check between these two checks:
Deallocate all entries from Cache A that are older than K hours Deallocate all entries from Cache B that are older than K hours
so that maybe sometimes we can get away with just removing stuff from the v2 cache.
Are there any other issues with the above approach?
PS: We could also in theory imagine writing those descriptors on disk, and only caching the ones that we have received requests for lately. This way we would not utilize our RAM that much...