Hi,
I am trying to identify bottlenecks at an entry guard when it is used to create a large number of circuits. To do this, I started a set of clients that all use a single entry guard and build 5K-10K circuits in total. At this scale the clients observe timeouts (while waiting for the CREATED cell) during circuit construction and consider the entry guard to be down.
I control the entry guard chosen in this experiment, and no other client should be using this relay as an entry guard because it does not have the Guard flag in the consensus.
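For concreteness, a circuit-building driver along these lines can be sketched with stem's control-port API; the fingerprints, control port, and circuit count below are placeholders rather than the actual values from my setup, and the client-side pinning could equivalently be done in torrc with "UseEntryGuards 1" and "EntryNodes <guard fingerprint>":

    # Sketch: repeatedly build 3-hop circuits whose first hop is the
    # controlled guard, counting builds that fail or time out.
    import stem
    from stem.control import Controller

    GUARD_FP  = 'AAAA...'   # placeholder fingerprint of the controlled guard
    MIDDLE_FP = 'BBBB...'   # placeholder middle relay
    EXIT_FP   = 'CCCC...'   # placeholder exit relay
    N_CIRCUITS = 5000

    failures = 0
    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        for i in range(N_CIRCUITS):
            try:
                # Build the circuit through the pinned guard and block
                # until tor reports it as BUILT or FAILED/CLOSED.
                circ_id = controller.new_circuit(
                    [GUARD_FP, MIDDLE_FP, EXIT_FP], await_build=True)
                controller.close_circuit(circ_id)
            except stem.CircuitExtensionFailed:
                # Covers the circuits that never complete the handshake,
                # e.g. the "waiting for CREATED cell" timeouts above.
                failures += 1

    print('%d of %d circuit builds failed' % (failures, N_CIRCUITS))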
Lack of bandwidth at the entry guard does not appear to be the cause of the circuit timeouts: it consumed 80-90 KB/s while its RelayBandwidthRate is set to 400 KB/s. CPU is not the bottleneck either, as utilization at the entry guard stayed below 10% throughout the experiment.
Can anyone give me pointers on why these circuits time out? And what would be an effective way to verify that a suspected cause is indeed the bottleneck?
I tried profiling the entry guard with callgrind, but that is extremely slow. In the meantime I will build a statically linked tor instance so that gprof produces meaningful results.
The motivation for this work is to enable a client to predict how many circuits it can create without degrading the performance of its entry guards. If a circuit is not built within the timeout period, the client perceives the entry guard to be down and starts using another guard node. So if the client keeps creating an excessive number of circuits, it is more likely to end up using a malicious (and resourceful) entry guard. The prediction model would help the client stay anonymous by keeping it from unnecessarily switching guard nodes.
Thanks, Abhishek