I got a notification from AWS that our SQS rendezvous service exceeded the free-tier usage this month with over 1,000,000 SQS API requests. This is in some sense exciting news, because it shows that the rendezvous channel is effective and getting some use.
It does, however, mean that we will have to start paying for the service. The current billing period falls on month boundaries, from January 1st to January 31st. The budget action fired on January 9th, which is pretty early in the month. Looking at our broker metrics[0,1], there were approximately 38,608 client polls using SQS. That's approximately 2.5 requests per poll, which is about what I'd expect.
It's reassuring to know that the budget actions work. I'm going to set them a little higher, to something I can reasonably afford. I don't yet know how the cost will scale with the number of polls, or how that will compare with the cost of domain-fronted requests.
[0] https://snowflake-broker.torproject.net/metrics
[1] https://metrics.torproject.org/collector.html#type-snowflake-stats
On 2025-01-16 10:48, Cecylia Bocovich via anti-censorship-team wrote:
It does, however, mean that we will have to start paying for the service. The current billing period falls on month boundaries, from January 1st to January 31st. The budget action fired on January 9th, which is pretty early in the month. Looking at our broker metrics[0,1], there were approximately 38,608 client polls using SQS. That's approximately 2.5 requests per poll, which is about what I'd expect.
There was an error in my calculation. That should be 25 requests/poll. This might be a little high, but I forgot about the periodic clean up tasks that the broker does. This is probably still accurate for faithful usage of the rendezvous channel. It is possible we could make this more efficient.
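For reference, the corrected ratio can be reproduced from the two figures quoted above. This is only a back-of-the-envelope sketch: it takes the 1,000,000 free-tier threshold as the request count (the real count was somewhat higher), and the variable names are just for illustration.

```python
# Rough requests-per-poll estimate from the figures in this thread.
# Assumption: the request count is taken as the 1,000,000 threshold;
# the real number was "over" that, so this is a lower bound.
sqs_requests = 1_000_000
client_polls = 38_608

requests_per_poll = sqs_requests / client_polls
print(f"{requests_per_poll:.1f} requests per poll")  # prints "25.9 requests per poll"
```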
On 2025-01-16 11:02, Cecylia via anti-censorship-team wrote:
There was an error in my calculation. That should be 25 requests/poll. This might be a little high, but I forgot about the periodic clean up tasks that the broker does. This is probably still accurate for faithful usage of the rendezvous channel. It is possible we could make this more efficient.
Looking at the source code, the client[0] makes a minimum of 3 requests per poll: one to send a message to the broker queue, one to get its single-use queue URL, and one to receive a message from that single-use queue. But if the queue hasn't been created or the answer hasn't been sent, it will keep retrying those requests, up to a maximum of 11 requests total per poll attempt (only poll successes are counted in the metrics).
The broker[1] makes 3 requests at poll time: to receive a message from its queue, create the client queue, and send the response. During cleanup, the broker makes a delete request once per queue, but not until the queue has timed out. Since the timeout on this is 2 minutes and cleanup happens once every 30 seconds, we are calling the GetQueueAttributes request approximately 4 times per poll. So the broker is making approximately 7 requests per successful poll.
The broker's periodic cleanup tasks fire once every 30 seconds and make some number of requests to get the list of all queues (there's a max of 1,000 queues per call, which we probably weren't exceeding). At one request per run, that is 2,880 requests per day, which over the course of 9 days totals 25,920 requests. That is still pretty low.
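The cleanup figure can be checked the same way. This sketch assumes exactly one list-queues request per cleanup run (that is, the 1,000-queue page limit is never hit); the constant names are made up for the example.

```python
# Periodic cleanup requests over the first 9 days of the billing period.
# Assumption: one list request per cleanup run (no pagination needed).
CLEANUP_INTERVAL_SECONDS = 30
DAYS_ELAPSED = 9
SECONDS_PER_DAY = 24 * 60 * 60

cleanup_runs = DAYS_ELAPSED * SECONDS_PER_DAY // CLEANUP_INTERVAL_SECONDS
share_of_total = cleanup_runs / 1_000_000  # relative to the free-tier threshold

print(cleanup_runs)             # prints 25920
print(f"{share_of_total:.1%}")  # prints 2.6%
```

So the list calls account for only a few percent of the total, which is why they don't explain the gap on their own.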
If I did this accounting right, we should be making a maximum of 18 requests per client poll *attempt* (11 from the client plus 7 from the broker). The higher than expected number could indicate that some polls are not successful and a client has to try the rendezvous method multiple times. We do have an open issue regarding SQS[2]. In any case, 18 is not so far off 25 as to indicate an attack.
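Putting the per-poll accounting in one place, as a rough sketch only. The numbers are the ones given in this message; the broker total follows the figure of 7 used above (3 at poll time plus about 4 GetQueueAttributes calls), so the one-off delete per queue is not counted separately here. All names are illustrative.

```python
# Per-poll request accounting, using the counts from this message.
CLIENT_MIN = 3          # send, get queue URL, receive
CLIENT_MAX = 11         # with retries, per poll attempt
BROKER_AT_POLL = 3      # receive, create client queue, send response

QUEUE_TIMEOUT_SECONDS = 120
CLEANUP_INTERVAL_SECONDS = 30
# Roughly one GetQueueAttributes call per cleanup pass until the queue times out.
broker_attribute_checks = QUEUE_TIMEOUT_SECONDS // CLEANUP_INTERVAL_SECONDS

broker_total = BROKER_AT_POLL + broker_attribute_checks
min_per_attempt = CLIENT_MIN + broker_total
max_per_attempt = CLIENT_MAX + broker_total

print(broker_total, min_per_attempt, max_per_attempt)  # prints 7 10 18
```

With roughly 25 requests observed per counted poll, the excess over 18 would be consistent with some attempts failing and being retried, since only successful polls show up in the metrics.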
[0] https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
[1] https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...
[2] https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowf...