On 9/23/13 12:53 AM, Sathyanarayanan Gunasekaran wrote:
Hi,
I have some comments on the updated pdf -
Thanks! Much appreciated!
It should be easy for a user to implement or install an experiment that isn’t bundled with the core distribution. Ideally, installing an experiment should be as simple as unzipping a folder or config file into an experiments folder.
I don't understand how this will work when users just apt-get install torperf. Ideally if someone writes a good experiment, they should send the patches upstream and get it merged, and then we update torperf to include those tests and then the users just update torperf with their package managers.
I agree with you that this is a rather unusual requirement and that adding new experiments to Torperf is the better approach. That's why the paragraph said "should" and "ideally". I added your concerns to the design document to make this clearer. (Maybe we should mark requirements as either "must-do", "should-do", or "could-do"?)
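To make the "unzip into an experiments folder" idea a bit more concrete, here is a minimal sketch of how discovery could work. Everything in it is an assumption for illustration only: the function name, the one-subfolder-per-experiment layout, and the experiment.json file name are not taken from the design document.

```python
import json
from pathlib import Path


def discover_experiments(experiments_dir):
    """Return {experiment_name: config} for every subfolder of
    experiments_dir that contains an experiment.json file.

    The layout (one subfolder per experiment, configured by a file
    named experiment.json) is a hypothetical convention.
    """
    found = {}
    for config_path in sorted(Path(experiments_dir).glob("*/experiment.json")):
        with config_path.open() as config_file:
            # The folder name doubles as the experiment name.
            found[config_path.parent.name] = json.load(config_file)
    return found
```

With this layout, a packaged install and a user-supplied experiment would look the same to the service: both are just folders under the experiments directory.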
It should be possible to run different experiments with different tor versions or binaries in the same Torperf service instance.
I don't think we need this now. I'm totally ok with having users run different torperf instances for different tor versions.
Running multiple Torperf instances has disadvantages that I'm not sure how to work around. For example, we want a single web server listening on port 80 for all experiments and for providing results.
Why do you think it's hard to run different tor versions or binaries in the same Torperf service instance?
It might be beneficial to provide a mechanism to download and verify the signature of new tor versions as they are released. The user could specify if they plan to test stable, beta or alpha versions of tor with their Torperf instance.
IMHO, torperf should just measure performance, not download Tor or verify signatures. We have good package managers that do that already.
Ah, we don't just want to measure packaged tors. We might also want to measure older versions which aren't contained in package repositories anymore, and we might want to measure custom branches with performance tweaks. Not sure if we actually want to verify signatures of tor versions.
I think we should take Shadow's approach (or something similar). Shadow can download a user-defined tor version ('--tor-version'), or it can build a local tor path ('--tor-prefix'):
https://github.com/shadow/shadow/blob/master/setup#L109
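A minimal sketch of what a similar interface could look like for Torperf, reusing Shadow's two option names. The function name, the "torperf-setup" program name, the fallback to a system tor, and making the two options mutually exclusive are all assumptions here, not something Shadow or the design document prescribes.

```python
import argparse


def parse_tor_source(argv):
    """Decide where the tor binary for an experiment comes from.

    Returns a (kind, value) tuple:
      ("local", path)       build or use tor from a local path
      ("download", version) fetch the given tor version
      ("system", None)      fall back to whatever tor is installed
    """
    parser = argparse.ArgumentParser(prog="torperf-setup")  # hypothetical name
    # Assumption: a user picks either a version or a local path, not both.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--tor-version", help="tor version to download")
    group.add_argument("--tor-prefix", help="path to a local tor source tree")
    args = parser.parse_args(argv)

    if args.tor_prefix is not None:
        return ("local", args.tor_prefix)
    if args.tor_version is not None:
        return ("download", args.tor_version)
    return ("system", None)
```

Old releases and custom branches with performance tweaks would both fit this scheme: the former via a version string, the latter via a local path.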
Do you see any problems with this?
A Torperf service instance should be able to accumulate results from its own experiments and remote Torperf service instances
Torperf should not accumulate results from remote Torperf service instances. If by "accumulate", you mean read another file from /results which the *user* has downloaded, then yes. Torperf shouldn't *download* result files from remote instances.
Why not? The alternative is to build another tool that downloads result files from remote instances. That's what we do right now (see footnote: "For reference, the current Torperf produces measurement results which are re-formatted by metrics-db and visualized by metrics-web with help of metrics-lib. Any change to Torperf triggers subsequent changes to the other three codebases, which is suboptimal.")
The new Torperf should come with an easy-to-use library to process its results
Torperf results should just be JSON (or similar) files that already have libraries, and we shouldn't invent a new result format and write a library for it.
Yes, that's what I mean. If you understood this differently, can you rephrase the paragraph?
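As an illustration of how little library code JSON results would need, here is a sketch that computes a median completion time with nothing but the Python standard library. The one-JSON-object-per-line layout and the field names ("seconds_to_complete", "did_timeout") are made up for this example; the actual result format is still to be defined.

```python
import json
import statistics


def median_completion_time(result_lines):
    """Return the median "seconds_to_complete" over all results that
    did not time out. Each element of result_lines is one JSON object.

    Raises statistics.StatisticsError if no request completed.
    """
    times = []
    for line in result_lines:
        result = json.loads(line)
        if not result["did_timeout"]:
            times.append(result["seconds_to_complete"])
    return statistics.median(times)


# Hypothetical results, one JSON document per line.
example_lines = [
    json.dumps({"seconds_to_complete": 2.0, "did_timeout": False}),
    json.dumps({"seconds_to_complete": 4.0, "did_timeout": False}),
    json.dumps({"seconds_to_complete": 60.0, "did_timeout": True}),
]
```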
request scheduler: Start new requests following a previously configured schedule.
request runner: Handle a single request from creation over various possible sub states to timeout, failure, or completion.
These are experiment specific. Some tests may not even need to do requests. No need for these to be a part of torperf.
I'm thinking about how we can reduce code duplication as much as possible. The experiments in the design document all make requests, so it would be beneficial for them to have Torperf schedule and handle their requests. If an experiment doesn't have the notion of a request, it doesn't have to use the request scheduler or runner. But how would such an experiment work? Do you have an example?
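A rough sketch of the kind of shared code this is about, assuming a fixed-interval schedule and a runner that only classifies the outcome of a request. All names are hypothetical. Note the simplification: the runner measures elapsed time after the call returns and does not abort a request that is still in progress, which a real runner would have to do.

```python
import enum
import time


class RequestState(enum.Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    TIMED_OUT = "timed_out"


def schedule_start_times(first_start, interval, count):
    """Return the start times of `count` requests spaced `interval`
    seconds apart, beginning at `first_start`."""
    return [first_start + i * interval for i in range(count)]


def run_request(fetch, timeout, clock=time.monotonic):
    """Call fetch() once and classify the outcome.

    fetch is supplied by the experiment, so the runner itself knows
    nothing about what is being requested.
    """
    started = clock()
    try:
        fetch()
    except Exception:
        return RequestState.FAILED
    # Simplification: the call is not interrupted, only classified.
    if clock() - started > timeout:
        return RequestState.TIMED_OUT
    return RequestState.COMPLETED
```

An experiment would only provide the fetch callable and a schedule; an experiment without requests could ignore both functions.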
results database: Store request details, retrieve results, periodically delete old results if configured.
Not sure if we really need a database. These tests look pretty simple to me.
Rephrased to data store. I still think a database makes sense here, but this is not a requirement. As long as we can store, retrieve, and periodically delete results, everything's fine.
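For illustration, the three operations fit in a few lines on top of SQLite, which ships with Python. This is only one possible backing store, chosen for the sketch; the class name, method names, and table layout are assumptions.

```python
import json
import sqlite3


class ResultStore:
    """Store, retrieve, and expire results, each kept as a JSON document."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS results "
            "(finished REAL NOT NULL, body TEXT NOT NULL)"
        )

    def store(self, finished, result):
        """Save one result with its completion time (seconds since epoch)."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO results VALUES (?, ?)",
                (finished, json.dumps(result)),
            )

    def retrieve(self, since=0.0):
        """Return all results finished at or after `since`, oldest first."""
        rows = self.conn.execute(
            "SELECT body FROM results WHERE finished >= ? ORDER BY finished",
            (since,),
        )
        return [json.loads(body) for (body,) in rows]

    def delete_older_than(self, cutoff):
        """Delete results finished before `cutoff`; return how many."""
        with self.conn:
            cursor = self.conn.execute(
                "DELETE FROM results WHERE finished < ?", (cutoff,)
            )
        return cursor.rowcount
```

The periodic deletion would then be a matter of calling delete_older_than() from whatever scheduling mechanism the service already has.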
Again, thanks a lot for your input!
Updated PDF:
https://people.torproject.org/~karsten/volatile/torperf2.pdf
All the best, Karsten