[metrics-team] OnionStats

Anathema anathema at anche.no
Thu Jun 2 23:12:50 UTC 2016


On 31/05/2016 16:30, Karsten Loesing wrote:

> Request statistics (2016-05-31 14:50:00, 3600 s):
> Total processed requests: 896066
> Most frequently requested resource: details (894217), summary (1671),
> bandwidth (90)
> Most frequently requested parameter combinations: [lookup, fields]
> (891349), [flag, type, running] (1970), [] (1537)
> Matching relays per request: .500<2, .900<2, .990<2, .999<16384
> Matching bridges per request: .500<1, .900<1, .990<1, .999<8192
> Written characters per response: .500<256, .900<512, .990<512,
> .999<2097152
> Milliseconds to handle request: .500<8, .900<128, .990<2048, .999<4096
> Milliseconds to build response: .500<4, .900<64, .990<1024, .999<8192
> 
> Would you be able to set this up on a powerful and probably not as
> cheap VPS and hammer it with loads of requests?

So I dug a little into the topic and managed to benchmark the API.
I followed the approach outlined here:
- http://tech.opentable.co.uk/blog/2014/02/28/api-benchmark/
- https://github.com/matteofigus/api-benchmark

I created a project and ran some tests. You can download the full
package here: http://138.201.90.124/api_bench.zip. I recommend taking a
look at the report.html file inside the *_reports folders.

If you want, you can run your own tests: just change the parameters in
the *.json files and run:

$ grunt benchmark
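
In case it helps to reproduce this outside of Grunt, here's roughly what
the same run looks like when calling the api-benchmark library directly.
This is only a sketch: the base URL, routes and option values are
placeholders (not my actual *.json config), and the option names should
be double-checked against the api-benchmark docs.

// Sketch: measure a couple of routes with api-benchmark directly.
// Base URL, routes and option values are placeholders, not my real config.
var fs = require('fs');
var apiBenchmark = require('api-benchmark');

var services = {
  onionstats: 'http://localhost:8080/'   // placeholder base URL
};

var routes = {
  details: 'details?limit=1000',         // placeholder routes
  summary: 'summary?limit=1000'
};

var options = {
  runMode: 'parallel',                   // fire requests concurrently
  maxConcurrentRequests: 20,             // ~20 concurrent users
  minSamples: 1000                       // ~1000 requests per route
};

apiBenchmark.measure(services, routes, options, function (err, results) {
  if (err) { return console.error(err); }

  // Console stats, plus the same kind of HTML report you find in the
  // *_reports folders.
  console.log(results);
  apiBenchmark.getHtml(results, function (err, html) {
    if (err) { return console.error(err); }
    fs.writeFileSync('report.html', html);
  });
});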

I monitored the VPS resources and found that the server can easily
handle a decent amount of traffic.

Here's a quick explanation of the test suites:
- all the tests were set up to issue at most 1000 requests, batched in
groups of 20 parallel requests (simulating 20 concurrent users making
1000 requests in total).

- the response time depends on the amount of data returned. In
'wide_test' the maximum number of results returned was 2000; in
'wide_test_limit' I lowered that value to 1000. As the graphs show,
this roughly halved the response time.

- Another improvement was to purge unnecessary fields from the results,
which let me optimize the elasticsearch query (see the sketch after
this list).

- I have the original test graphs from before all these improvements,
but only for the local instance; I did not include them in the zip
file.
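
To illustrate the field-purging point above, here's the rough idea
using the Node.js elasticsearch client. The index name, field names,
host and filter are made up for illustration; the real query lives in
the backend code.

// Sketch of the "purge unnecessary fields" idea: ask Elasticsearch to
// return only the fields the response actually needs, instead of whole
// documents. Index name, field names and host are made up.
var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({ host: 'localhost:9200' });

client.search({
  index: 'onionstats',               // placeholder index name
  body: {
    _source: ['nickname', 'fingerprint', 'running'],  // only these fields come back
    size: 1000,                      // cap the number of hits (cf. wide_test_limit)
    query: {
      term: { running: true }        // placeholder filter
    }
  }
}, function (err, response) {
  if (err) { return console.error(err); }
  console.log(response.hits.hits.length + ' matching documents');
});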

Conclusion:
- Based on the preliminary tests done and reported here, I'm quite
confident that the hardware specs of the VPS are good enough to handle
a decent amount of traffic.

- I'm also confident that the code can be optimized further to obtain
better results.

- I'm quite sure that a distributed environment (e.g. two elasticsearch
servers) would drastically improve performance, leaving us room to
integrate the CollecTor data and make it available in a reasonable
amount of time.

- I did not monitor the backend code.

Next steps:
- monitoring the code, looking for non-optimized blocks

- monitoring API usage: I'm interested in answering the "Milliseconds
to handle request" and "Milliseconds to build response" questions (see
the sketch after this list)

- running heavier and more complex tests
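
For the timing questions in the second point, something along these
lines would be a starting point. This assumes an Express-style Node.js
backend, which may well not match how the code is actually structured;
the route and logging details are placeholders.

// Sketch: log per-request handling time in milliseconds.
// Assumes an Express app; route and logging details are placeholders.
var express = require('express');
var app = express();

app.use(function (req, res, next) {
  var start = process.hrtime();
  res.on('finish', function () {
    var diff = process.hrtime(start);
    var ms = diff[0] * 1000 + diff[1] / 1e6;
    console.log('%s %s handled in %s ms', req.method, req.originalUrl, ms.toFixed(1));
  });
  next();
});

app.get('/details', function (req, res) {
  // ... build and send the response here ...
  res.json({});
});

app.listen(8080);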

Curious to read your thoughts,
Regards

-- 
Anathema

+--------------------------------------------------------------------+
|GPG/PGP KeyID: CFF94F0A available on http://pgpkeys.mit.edu:11371/  |
|Fingerprint: 80CE EC23 2D16 143F 6B25  6776 1960 F6B4 CFF9 4F0A     |
|                                                                    |
|https://keybase.io/davbarbato                                       |
+--------------------------------------------------------------------+

