On Wed, Feb 17, 2016 at 8:29 AM, George Kadianakis <desnacked@riseup.net> wrote:
Hello there,

I'm not sure what kind of statistics we get out of the current guard simulator.

The simulation creates a network of 1000 relays (all guards), each with 96% reliability, and, in simulated time:

- every 20 seconds: creates a new circuit
- every 2 minutes: updates each node's connectivity based on its reliability
- every 20 minutes: removes some relays from the network and adds new ones

By default, we recreate the client (OP) every 2 minutes (which makes it bootstrap again, and so on). We can also configure the simulator to use a long-lived client, in which case it fetches a new consensus every hour.

We can also run the simulation under several network scenarios: fascist firewall, flaky network, evil network, sniper network, down network, and a scenario that switches between these networks. See --help and [1] for an explanation of the terms.

Each simulation runs for 30 hours of simulated time, for a total of 5400 circuits. Time is discrete, in increments of 20 seconds, and everything in the simulation happens at no cost to simulated time. We are experimenting with adding a time cost to connections (2 seconds for a successful one, 4 seconds for a failed one) just to get a feeling for how it would affect the algorithms.
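To make the schedule concrete, here is a minimal sketch of the discrete clock described above. It is not the simulator's code: the constant and function names are made up for illustration, and it assumes every event also fires at time 0.

```python
TICK = 20                      # seconds per simulation step
CIRCUIT_EVERY = 20             # build a new circuit
CONNECTIVITY_EVERY = 2 * 60    # update node connectivity
CHURN_EVERY = 20 * 60          # remove and add relays
DURATION = 30 * 60 * 60        # 30 simulated hours, in seconds


def count_events(duration=DURATION, tick=TICK):
    """Walk the discrete clock and count how often each event fires."""
    counts = {"circuits": 0, "connectivity_updates": 0, "churn_rounds": 0}
    for now in range(0, duration, tick):
        if now % CIRCUIT_EVERY == 0:
            counts["circuits"] += 1
        if now % CONNECTIVITY_EVERY == 0:
            counts["connectivity_updates"] += 1
        if now % CHURN_EVERY == 0:
            counts["churn_rounds"] += 1
    return counts
```

With these numbers, 30 hours at one circuit per 20-second tick gives the 5400 circuits mentioned above.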

We currently have the following metrics:

- success rate
- avg bandwidth capacity
- exposure to guards (how many different guards we connected to) over time (after hours 1, 15, and 30)
- number of guards we tried until the first successful circuit
- time until the first successful circuit is built

A successful circuit is one for which the algorithm found a guard AND we managed to connect to that guard.
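As a rough illustration of how these metrics follow from that definition, here is a sketch that computes them from a list of per-circuit records. The CircuitAttempt record and the function names are hypothetical, not taken from the simulator, and exposure is assumed to count only guards we actually connected to.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CircuitAttempt:
    time: int               # simulated seconds since the start of the run
    guard: Optional[str]    # guard picked by the algorithm, None if none was found
    connected: bool         # whether connecting to that guard worked


def is_success(a: CircuitAttempt) -> bool:
    # Success means the algorithm found a guard AND we connected to it.
    return a.guard is not None and a.connected


def success_rate(attempts: List[CircuitAttempt]) -> float:
    if not attempts:
        return 0.0
    return sum(is_success(a) for a in attempts) / len(attempts)


def exposure_after(attempts: List[CircuitAttempt], hours: int) -> int:
    """Number of distinct guards connected to within the first `hours` hours."""
    cutoff = hours * 3600
    return len({a.guard for a in attempts if is_success(a) and a.time <= cutoff})


def time_to_first_success(attempts: List[CircuitAttempt]) -> Optional[int]:
    for a in attempts:
        if is_success(a):
            return a.time
    return None


def guards_tried_until_first_success(attempts: List[CircuitAttempt]) -> Optional[int]:
    """Distinct guards tried up to and including the first successful circuit."""
    tried = set()
    for a in attempts:
        if a.guard is not None:
            tried.add(a.guard)
        if is_success(a):
            return len(tried)
    return None
```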

In general, we are interested in security and performance. For security we are
trying to minimize our exposure to the network. For performance, we want to
minimize our downtime when our current guard becomes unreachable or after our
network comes back up.

Here are some concrete statistics that we could gather in the simulator:

Security statistics:
         - Number of unique guards we connected to during the course of the simulation.

We have this as "exposure after 30 hours".
 
         - Time spent connected to lower priority guards while a primary guard was online.
         - Time spent connected to lower priority guards while a higher priority guard was online and the network was up.

We don't have these. I'm also not sure how we should detect network conditions: we can either try to infer them from the algorithm's behaviour or look at which network scenario is in use at that moment.
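If we go with the second option and let the simulator report ground truth, the "lower priority guard while a higher priority guard was online and the network was up" statistic could be sketched like this. All of it is an assumption for illustration: the per-tick sample format, the function name, and the idea that the simulator can expose real reachability are not in the current code.

```python
from typing import Iterable, List, Optional, Set, Tuple

TICK = 20  # seconds per simulation step

# One sample per tick: (guard in use, guards really reachable, network really up)
Sample = Tuple[Optional[str], Set[str], bool]


def time_on_lower_priority(samples: Iterable[Sample],
                           priority: List[str],
                           tick: int = TICK) -> int:
    """Seconds spent on a guard while a higher-priority one was reachable.

    `priority` lists guards from most to least preferred. A guard missing
    from the list is treated as lower priority than every listed guard.
    """
    rank = {guard: i for i, guard in enumerate(priority)}
    total = 0
    for current, reachable, network_up in samples:
        if current is None or not network_up:
            continue
        better = priority[:rank.get(current, len(priority))]
        if any(guard in reachable for guard in better):
            total += tick
    return total
```

The same loop, with "primary guard" in place of "higher-priority guard", would give the first statistic.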
 
Performance statistics:
         - Time spent cycling through guards.
         - Time spent cycling through guards while network is up.

Since simulated time is stopped while we're choosing guards, we have to come up with a different metric for this. It also requires detecting the network type.
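One possible proxy, sketched here under the assumption that we record the outcome of every connection attempt made while picking a guard for a circuit, is to charge each attempt the experimental costs mentioned earlier (2 seconds for a success, 4 for a failure). The names are hypothetical.

```python
SUCCESS_COST = 2  # seconds charged for a successful connection
FAILURE_COST = 4  # seconds charged for a failed connection


def cycling_cost(outcomes):
    """Estimated seconds spent cycling through guards for one circuit.

    `outcomes` holds one boolean per connection attempt, in order:
    True for a successful connection, False for a failed one.
    """
    return sum(SUCCESS_COST if ok else FAILURE_COST for ok in outcomes)
```

For example, two failed attempts followed by a successful one would count as 4 + 4 + 2 = 10 seconds of cycling.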

         - Time spent on dystopic mode.
         - Time spent on dystopic mode while the network was utopic.

These should be easy as long as we have defined how to detect the network type.
 
Is it possible to collect those statistics? I'm curious to learn how the
current guard algorithm compares to the new prop259 on those aspects.

We have tooling to generate graphs with success rate and exposure taken from a round of ~500 simulations. I can send them to you when they finish running ;)
 
What other stats are important here you think?

We have discussed counting how many network connections we make over time. For now, we have been comparing success rate and exposure.

I guess we can add these stats; we just need to come up with an approach to determine the network condition.

All the code is in https://github.com/twstrike/tor_guardsim (branch develop).

1 - doc/stuff-to-test.txt

--
Reinaldo de Souza Jr | Software Developer
ThoughtWorks | www.thoughtworks.com

GPG: EF84 6530 67A5 1559 5554  D8B2 954A 6BEF AF74 ACD7