Sending many requests is not the optimal strategy. One high-effort request would have exactly the same chance of being selected as many low-effort requests with the same total effort. The difference is that with many requests, the client wouldn't know which rendezvous would be selected, so he'd have to waste resources on opening many circuits.
Is solving for a high-effort solution any different from solving for a lower-effort one? If not, many lower-effort solutions will be found along the way while working toward a high-effort solution, and nothing stops a client from making requests with those lower-effort solutions as well. I can expect to find 2 solutions of effort E/2 for every solution of effort E. If I make 3 requests with those solutions, my chance of succeeding doubles, at the cost of 3 times the server's verification effort.
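To put rough numbers on that, here is a quick simulation in C; the xorshift PRNG is only a stand-in for the PoW hash, and "effort E" is taken to mean that the 64-bit hash falls below 2^64/E:

    #include <stdint.h>
    #include <stdio.h>

    /* xorshift64: stand-in for the PoW hash output. */
    static uint64_t state = 88172645463325252ULL;
    static uint64_t hash64(void) {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
    }

    int main(void) {
        const uint64_t E = 1 << 20;
        const uint64_t thr_full = UINT64_MAX / E;       /* effort >= E   */
        const uint64_t thr_half = UINT64_MAX / (E / 2); /* effort >= E/2 */
        const unsigned trials = 1000;
        unsigned long half_or_better = 0;

        for (unsigned t = 0; t < trials; t++) {
            for (;;) {
                uint64_t h = hash64();
                if (h < thr_half)
                    half_or_better++;   /* a usable effort-E/2 solution */
                if (h < thr_full)
                    break;              /* found the effort-E solution */
            }
        }
        /* Prints roughly 2.0: about two solutions meeting effort E/2 show
           up for every one meeting effort E. */
        printf("E/2 solutions per E solution: %.2f\n",
               (double)half_or_better / trials);
        return 0;
    }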
One way is to include the target effort in the request, and to use both the server-provided nonce and the target effort as the x in H(x). Then the server only checks that the real effort comes out no less than the target effort, but uses the target effort for everything else.
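A minimal sketch of that check in C; the function names, the FNV stand-in for the real PoW hash, and expressing effort as leading zero bits are all assumptions for illustration:

    #include <stdbool.h>
    #include <stdint.h>

    /* Stand-in for the real PoW hash: FNV-1a over seed || target || nonce.
       The important part is that the client's claimed target effort is
       bound into the hash input. */
    static uint64_t pow_hash(const uint8_t seed[32], uint32_t target_effort,
                             uint64_t nonce) {
        uint64_t h = 0xcbf29ce484222325ULL;
        for (int i = 0; i < 32; i++) { h ^= seed[i];                           h *= 0x100000001b3ULL; }
        for (int i = 0; i < 4;  i++) { h ^= (target_effort >> (8 * i)) & 0xff; h *= 0x100000001b3ULL; }
        for (int i = 0; i < 8;  i++) { h ^= (nonce >> (8 * i)) & 0xff;         h *= 0x100000001b3ULL; }
        return h;
    }

    /* Effort measured here as leading zero bits of the hash. */
    static uint32_t effort_bits(uint64_t h) {
        return h ? (uint32_t)__builtin_clzll(h) : 64;
    }

    /* Accept only if the real effort reaches the claimed target; the
       caller then uses target_effort (not the possibly higher real
       effort) for queue position and everything else. */
    static bool pow_verify(const uint8_t seed[32], uint32_t target_effort,
                           uint64_t nonce) {
        return effort_bits(pow_hash(seed, target_effort, nonce)) >= target_effort;
    }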
Anyway, I suggest trying a priority queue first and seeing how it works in practice. To allow efficient insertion and trimming, the queue can be implemented as a red-black tree.
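For instance, with the BSD sys/tree.h red-black tree macros (available via libbsd on Linux); the struct layout, the seq tie-breaker and the size accounting are made-up illustration, not an actual implementation:

    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/tree.h>

    struct rend_req {
        uint32_t effort;              /* verified target effort */
        uint64_t seq;                 /* arrival counter, breaks ties */
        RB_ENTRY(rend_req) node;
        /* rendezvous material would go here */
    };

    static int req_cmp(struct rend_req *a, struct rend_req *b) {
        if (a->effort != b->effort)
            return a->effort < b->effort ? -1 : 1;
        return a->seq < b->seq ? -1 : (a->seq > b->seq);
    }

    RB_HEAD(req_queue, rend_req);
    RB_GENERATE(req_queue, rend_req, node, req_cmp)

    /* Insertion is O(log n).  When the queue is over its size limit,
       trimming drops the lowest-effort entry, also O(log n); the service
       pops RB_MAX first whenever it has spare capacity. */
    static void enqueue(struct req_queue *q, size_t *len, size_t max_len,
                        struct rend_req *r) {
        RB_INSERT(req_queue, q, r);
        if (++*len > max_len) {
            struct rend_req *low = RB_MIN(req_queue, q);
            RB_REMOVE(req_queue, q, low);
            free(low);                /* may be r itself if it ranked lowest */
            --*len;
        }
    }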
As for trimming the priority queue, we don't have to use a heap. We can compress the effort into maybe 7 bits and then store the requests in 128 arrays. Trimming is then just freeing an array. The compression can be something like floating point:

~clz(POW_NONCE) << 1 | (POW_NONCE >> (126 - clz(POW_NONCE))) & 1

That is, take the number of leading zeros plus the single bit immediately to the right of the leftmost 1 bit, then complement the first part to preserve the ordering. We can expect the number of leading zeros to be less than 64, so this fits in 7 bits. A decrement of this value means about 1.3 - 1.5 times more work, which should be fine-grained enough.
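In C that could look as follows, assuming the 128-bit PoW value is handled as two 64-bit halves and that the rare case of 64 or more leading zeros is simply clamped:

    #include <stdint.h>

    static unsigned clz128(uint64_t hi, uint64_t lo) {
        if (hi) return (unsigned)__builtin_clzll(hi);
        if (lo) return 64 + (unsigned)__builtin_clzll(lo);
        return 128;
    }

    /* 7-bit bucket index: 6 bits of complemented leading-zero count plus
       the bit just right of the leftmost 1, i.e. half-bit (~1.4x work)
       steps.  Smaller index = more work, the same ordering as the raw
       128-bit value. */
    static unsigned effort_bucket(uint64_t hi, uint64_t lo) {
        unsigned zeros = clz128(hi, lo);
        if (zeros > 63)
            zeros = 63;                      /* assumed to be rare */
        unsigned pos = 126 - zeros;          /* bit right of the leading 1 */
        unsigned next_bit = (pos >= 64) ? (unsigned)((hi >> (pos - 64)) & 1)
                                        : (unsigned)((lo >> pos) & 1);
        return (((~zeros) & 0x3f) << 1) | next_bit;   /* 0..127 */
    }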
What you are describing is called a hashtable. The question is: what happens when one of the arrays gets filled up? You would have to discard all additional requests coming into that bucket. With your construction, it's very likely that most requests would end up in just a few buckets and the rest would remain empty (e.g. all buckets for more than 40 leading zeroes).
I was thinking of dynamically sized arrays. Anyway, my point was that we don't need to compare 128-bit solutions.
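A sketch of what I had in mind, using the 7-bit bucket index from the compression above (bucket 0 holds the highest-effort requests); the names are made up and the request payload is elided:

    #include <stdlib.h>
    #include <string.h>

    #define N_BUCKETS 128

    /* One dynamically sized array per compressed effort value; the queue
       owns the requests once they are pushed. */
    struct bucket { void **items; size_t len, cap; };
    static struct bucket buckets[N_BUCKETS];

    static void bucket_push(unsigned idx, void *req) {
        struct bucket *b = &buckets[idx];
        if (b->len == b->cap) {
            size_t ncap = b->cap ? 2 * b->cap : 8;
            void **p = realloc(b->items, ncap * sizeof(*p));
            if (!p)
                return;          /* out of memory: caller keeps ownership */
            b->items = p;
            b->cap = ncap;
        }
        b->items[b->len++] = req;
    }

    /* Serve the highest-effort requests first; no 128-bit comparisons,
       just a scan over 128 bucket lengths. */
    static void *bucket_pop(void) {
        for (unsigned i = 0; i < N_BUCKETS; i++)
            if (buckets[i].len)
                return buckets[i].items[--buckets[i].len];
        return NULL;
    }

    /* Trim under memory pressure: free the whole lowest-effort bucket. */
    static void bucket_trim(void) {
        for (unsigned i = N_BUCKETS; i-- > 0; ) {
            if (buckets[i].len) {
                for (size_t j = 0; j < buckets[i].len; j++)
                    free(buckets[i].items[j]);
                free(buckets[i].items);
                memset(&buckets[i], 0, sizeof(buckets[i]));
                return;
            }
        }
    }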