-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hey Tim,
The git README says I may need to set some environment variables, but the error message is about a different env var:
    cory@Nulix ~/tor/tor =) make test-network
    make all-am
    make[1]: Entering directory `/home/cory/tor/tor'
    make[1]: Leaving directory `/home/cory/tor/tor'
    ./src/test/test-network.sh
    test-network.sh: missing 'chutney' in CHUTNEY_PATH (~/tor/chutney/)
    make: *** [test-network] Error 1
I also tried modifying my .bashrc so that CHUTNEY_PATH pointed to the chutney executable itself, but no luck. Do I need to set the chutney path?
- - Cory
On 4 Jul 2015, at 05:37 , Cory Pruce corypruce@gmail.com wrote:
> Hey Tim,
> I see that I may need to set some environment variables as told by the git readme but the error message is on a different env var:
>     cory@Nulix ~/tor/tor =) make test-network
>     make all-am
>     make[1]: Entering directory `/home/cory/tor/tor'
>     make[1]: Leaving directory `/home/cory/tor/tor'
>     ./src/test/test-network.sh
>     test-network.sh: missing 'chutney' in CHUTNEY_PATH (~/tor/chutney/)
>     make: *** [test-network] Error 1
> I also tried modifying my .bashrc so that the CHUTNEY_PATH pointed to the executable but no luck. Do I need to set the chutney path?
The CHUTNEY_PATH variable needs to point to a directory containing a chutney executable.
So you seem to have it right the first time. What do you get when you run:
    ls -l ~/tor/chutney/
I would expect to see an executable script called "chutney" listed in that directory, along with the other chutney distribution files. The test-network.sh script is complaining that the "chutney" script is missing.
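A minimal sketch of that check, assuming a POSIX shell. The helper name check_chutney_path is made up for illustration and is not part of tor or chutney; it only mirrors what the error message says is being looked for (an executable named "chutney" inside the directory). One thing worth ruling out: the error message prints a literal "~", and a tilde inside quotes (for example export CHUTNEY_PATH="~/tor/chutney/") is not expanded by the shell, so "$HOME/tor/chutney" is the safer spelling.

```shell
#!/bin/sh
# Hypothetical helper (not part of tor or chutney): succeeds only if DIR
# exists and contains an executable file named "chutney".
check_chutney_path() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -x "$dir/chutney" ]; then
        echo "no executable 'chutney' in: $dir" >&2
        return 1
    fi
    echo "ok: $dir/chutney"
}

# Example (the path is an assumption taken from the error message):
# export CHUTNEY_PATH="$HOME/tor/chutney"
# check_chutney_path "$CHUTNEY_PATH"
```

If the function reports "not a directory" for a path that looks right, an unexpanded "~" in the variable is a likely suspect.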
Tim
Tim Wilson-Brown (teor)
teor2345 at gmail dot com pgp ABFED1AC https://gist.github.com/teor2345/d033b8ce0a99adbc89c5
teor at blah dot im OTR D5BE4EC2 255D7585 F3874930 DB130265 7C9EBBC7
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 07/04/2015 12:11 AM, teor wrote:
> The CHUTNEY_PATH variable needs to point to a directory containing a chutney executable.
> So you seem to have it right the first time. What do you get when you run: ls -l ~/tor/chutney/
> I would expect to see an executable script called "chutney" listed in that directory, along with the other chutney distribution files. The test-network.sh script is complaining that the "chutney" script is missing.
I got it :-) Thanks!
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
One more thing for right now: how should I do benchmarks with chutney? Should I measure the average time it takes to complete the make test-network command?
- - Cory
On 5 Jul 2015, at 05:45 , Cory Pruce corypruce@gmail.com wrote:
> One more thing for right now: how should I do benchmarks with chutney? Should I measure the average time it takes to complete the make test-network command?
make test-network is dominated by the 25 second delay waiting for the Tor test network to bootstrap. So it's not going to help much.
I'm working on a chutney branch to measure bandwidth on "chutney verify", but it doesn't have any command-line arguments yet (it's all constants in the code). I'll see if I can pull it into shape today. https://trac.torproject.org/projects/tor/ticket/14175
Even with these bandwidth measurement changes, there's something else to think about:
chutney will measure the combined throughput of 4-5 tor instances, and 4n to 5n cpuworker threads, where n is the number of cores on your machine. But this isn't the performance you're interested in for multithreaded crypto changes - you want to know how a single instance + n cpuworker threads performs. (A chutney test network is far *more* parallel than a typical tor relay.)
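To put rough numbers on that, here is a small sketch. It assumes getconf _NPROCESSORS_ONLN is available to report the core count (it works on Linux and macOS, but POSIX does not guarantee it), and the function name is made up. It only multiplies out the 4n to 5n range described above.

```shell
#!/bin/sh
# Hypothetical helper: print the cpuworker thread range for a chutney
# network of 4-5 tor instances on a machine with N cores (4n to 5n).
cpuworker_range() {
    n=$1
    echo "$((4 * n)) to $((5 * n)) cpuworker threads on $n cores"
}

# Core count via getconf (an assumption: works on Linux and macOS);
# fall back to 1 if it is unavailable.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
cpuworker_range "$cores"
```

Compare the printed range with the n threads of a single relay to see how much more parallel the test network is.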
To get an accurate benchmark, you could run one tor instance per machine, or, at the very least, run the client on a slow machine, and everything else on a fast machine, so that the client's multithreaded crypto is the limiting factor. But this seems like a lot of work, and I'm not sure how much accuracy you'll gain.
As a first step, you could minimise the number of tor instances, which might make multithreading improvements easier to measure. You'll find the basic-min network helpful for this:
    ./src/test/test-network.sh --flavour basic-min
Then check if you're using ~100% of all cores when you push large amounts (100MB+) of data through the network using #14175 (when it's done!). If you're not using 100%, then you'll be able to see any multithreaded improvements when you run the test again. If you are seeing 100% usage already, get more cores or more machines, and re-run the tests.
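One way to check per-core usage from a script, as a rough sketch: it assumes Linux and the usual /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal), and the function name is made up. Interactive tools such as top or htop with a per-core view show the same thing.

```shell
#!/bin/sh
# Hypothetical helper: given two samples of one /proc/stat "cpuN" line,
# print the percentage of time that CPU was busy between the samples.
# Fields after the label: user nice system idle iowait irq softirq steal ...
cpu_busy_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            idle = $5 + $6          # idle + iowait
            if (NR == 1) { t0 = total; i0 = idle }
            else         { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print 0; exit }
            printf "%d\n", 100 * (dt - (i1 - i0)) / dt
        }'
}

# Linux-only usage: sample every core, wait a second, sample again.
if [ -r /proc/stat ]; then
    before=$(grep '^cpu[0-9]' /proc/stat)
    sleep 1
    after=$(grep '^cpu[0-9]' /proc/stat)
    echo "$before" | while read -r line; do
        name=${line%% *}
        line2=$(echo "$after" | grep "^$name ")
        echo "$name: $(cpu_busy_percent "$line" "$line2")% busy"
    done
fi
```

Run it while data is flowing through the test network; if every core reports close to 100%, the machine is already saturated and a multithreading improvement will be hard to see.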
Let me know how you go with this.
You could also modify tor to use single-hop connections, then measure single-hop bandwidth, by making a 1-hop connection and pushing data through it. There won't be as much client crypto as the 3 or 4-hop scenario; and you'll still have the client and destination on the one machine, unlike the single relay real-world scenario. But it could be closer to real-world multithreaded performance, as you'll only be measuring 2n threads. (Ideally, you want to measure n threads.)
You must *never* use a tor binary built like this on the public tor network, as it has no anonymity.
To make tor use 1-hop circuits for everything, change DEFAULT_ROUTE_LEN to 1 in or.h
End dire warning about loss of anonymity.
Of course, 1-hop circuits might hide some subtle multithreading bugs, as there's less crypto happening overall. So please test the correctness of your code with DEFAULT_ROUTE_LEN 3 as well.
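Switching between the two values can be scripted. This is only a sketch: the helper name is made up, the header path in the example is an assumption (the text above only says "or.h"), and it assumes the constant is defined on a single "#define DEFAULT_ROUTE_LEN <number>" line. The same warning applies: a binary built with a route length of 1 is for local test networks only.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the DEFAULT_ROUTE_LEN define in a header.
# FOR LOCAL TEST NETWORKS ONLY: a 1-hop build has no anonymity.
set_route_len() {
    header=$1
    len=$2
    tmp=$(mktemp) || return 1
    # Replace only the number on the "#define DEFAULT_ROUTE_LEN" line.
    sed "s/^\(#define[[:space:]]\{1,\}DEFAULT_ROUTE_LEN[[:space:]]\{1,\}\)[0-9]\{1,\}/\1$len/" \
        "$header" > "$tmp" && cat "$tmp" > "$header"
    rm -f "$tmp"
    # Show the resulting line so the change can be confirmed by eye.
    grep 'DEFAULT_ROUTE_LEN' "$header"
}

# Example (the header path is an assumption):
# set_route_len src/or/or.h 1    # benchmark build
# set_route_len src/or/or.h 3    # 3-hop build for correctness tests
```

Rebuild tor after each change, and keep the two binaries clearly separated so the 1-hop build never leaves the test machine.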
Give me 8 hours or so to work on #14175, I'll try and get it into a usable state.
Tim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 07/04/2015 06:19 PM, teor wrote:
> make test-network is dominated by the 25 second delay waiting for the Tor test network to bootstrap. So it's not going to help much.
> I'm working on a chutney branch to measure bandwidth on "chutney verify", but it doesn't have any command-line arguments yet (it's all constants in the code). I'll see if I can pull it into shape today.
Awesome I'm excited to try it out :D
> Even with these bandwidth measurement changes, there's something else to think about:
> chutney will measure the combined throughput of 4-5 tor instances, and 4n - 5n cpuworker threads, where n is the number of cores on your machine. But this isn't the performance you're interested in for multithreaded crypto changes - you want to know how a single instance + n cpuworker threads performs. (A chutney test network is far *more* parallel than a typical tor relay.)
> To get an accurate benchmark, you could run one tor instance per machine, or, at the very least, run the client on a slow machine, and everything else on a fast machine, so that the client's multithreaded crypto is the limiting factor. But this seems like a lot of work, and I'm not sure how much accuracy you'll gain.
> As a first step, you could minimise the number of tor instances, which might make multithreading improvements easier to measure.
> You'll find the basic-min network helpful for this: ./src/test/test-network.sh --flavour basic-min
> Then check if you're using ~100% of all cores when you push large amounts (100MB+) of data through the network using #14175 (when it's done!) If you're not using 100%, then you'll be able to see any multithreaded improvements when you run the test again. If you are seeing 100% usage already, get more cores or more machines, and re-run the tests.
> Let me know how you go with this.
I too was thinking of how I would set up the test environment and this is probably more than I could have come up with in days! I'm all ready to check the cpu usage when you pump out the update.
> You could also modify tor to use single-hop connections, then measure single-hop bandwidth, by making a 1-hop connection and pushing data through it. There won't be as much client crypto as the 3 or 4-hop scenario; and you'll still have the client and destination on the one machine, unlike the single relay real-world scenario. But it could be closer to real-world multithreaded performance, as you'll only be measuring 2n threads. (Ideally, you want to measure n threads.)
> You must *never* use a tor binary built like this on the public tor network, as it has no anonymity.
> To make tor use 1-hop circuits for everything, change DEFAULT_ROUTE_LEN to 1 in or.h
> End dire warning about loss of anonymity.
> Of course, 1-hop circuits might hide some subtle multithreading bugs, as there's less crypto happening overall. So please test the correctness of your code with DEFAULT_ROUTE_LEN 3 as well.
Done. I changed it to start with 1 and I will try 3 as well. If I shouldn't test this out on the public tor network, should I set up everything locally?
> Give me 8 hours or so to work on #14175, I'll try and get it into a usable state.
> Tim
Dude, thanks a bunch for your help. I'm really excited to start :D I'm going to read through the initial design and the code to see what functions/structures/constants/etc. need to be changed. Let me know when you release #14175 and I'll be happy to be the test guinea pig =-)
- - Cory
On 6 Jul 2015, at 03:20 , Cory Pruce corypruce@gmail.com wrote:
> On 07/04/2015 06:19 PM, teor wrote:
>> You could also modify tor to use single-hop connections, then measure single-hop bandwidth, by making a 1-hop connection and pushing data through it. There won't be as much client crypto as the 3 or 4-hop scenario; and you'll still have the client and destination on the one machine, unlike the single relay real-world scenario. But it could be closer to real-world multithreaded performance, as you'll only be measuring 2n threads. (Ideally, you want to measure n threads.)
>> You must *never* use a tor binary built like this on the public tor network, as it has no anonymity.
>> To make tor use 1-hop circuits for everything, change DEFAULT_ROUTE_LEN to 1 in or.h
>> End dire warning about loss of anonymity.
>> Of course, 1-hop circuits might hide some subtle multithreading bugs, as there's less crypto happening overall. So please test the correctness of your code with DEFAULT_ROUTE_LEN 3 as well.
> Done. I changed it to start with 1 and I will try 3 as well. If I shouldn't test this out on the public tor network, should I set up everything locally?
Yes, and testing locally gives you much better control of all the factors which affect performance.
>> Give me 8 hours or so to work on #14175, I'll try and get it into a usable state.
> Dude, thanks a bunch for your help. I'm really excited to start :D I'm going to read through the initial design and the code to see what functions/structures/constants/etc. need to be changed. Let me know when you release #14175 and I'll be happy to be the test guinea pig =-)
Yeah, that was 8 optimistic, uninterrupted hours. Otherwise known as "fantasy" hours.
Give me a day or two to fit the work in.
Tim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
> Yes, and testing locally gives you much better control of all the factors which affect performance.
Can I set everything up using VMs or maybe just a single relay?
> Yeah, that was 8 optimistic, uninterrupted hours. Otherwise known as "fantasy" hours.
> Give me a day or two to fit the work in.
Haha I figured :p no worries, just let me know.
- - Cory
On 6 Jul 2015, at 04:36 , Cory Pruce corypruce@gmail.com wrote:
>> Yes, and testing locally gives you much better control of all the factors which affect performance.
> Can I set everything up using VMs or maybe just a single relay?
Well, your ideal scenario is a separate box / VM, with at least 2 separate cores, for each of the tor instances you're pushing data through (2-3 with a path length of 1, 4-5 with a path length of 3, as you'll need to account for cannibalization).
However, anything you do to approach this ideal scenario is an improvement.
And running everything on the same box / VM should still give you some idea, as long as CPU usage on all CPUs isn't ~100%.
Tim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
> Well, your ideal scenario is a separate box / VM, with at least 2 separate cores, for each of the tor instances you're pushing data through (2-3 with a path length of 1, 4-5 with a path length of 3, as you'll need to account for cannibalization).
> However, anything you do to approach this ideal scenario is an improvement.
My first thought was that this is a job for my Raspberry Pi, but it is only single-core. In time I can definitely find the nodes for the test.
> And running everything on the same box / VM should still give you some idea, as long as CPU usage on all CPUs isn't ~100%.
Haha I guess this is a "we'll wait and see" situation. Let me know if there is anything I can do for chutney as well
- - Cory
On 6 Jul 2015, at 11:16 , Cory Pruce corypruce@gmail.com wrote:
>> And running everything on the same box / VM should still give you some idea, as long as CPU usage on all CPUs isn't ~100%.
> Haha I guess this is a "we'll wait and see" situation. Let me know if there is anything I can do for chutney as well
Well, you could test my latest branches for #14175:
https://trac.torproject.org/projects/tor/ticket/14175#comment:8
There's a branch which modifies src/test/test-network.sh in tor, and another with the performance measurement code in chutney.
The command-line arguments/environmental variables are in the ticket, and I've modified the chutney README to include performance testing.
Let me know if anything isn't clear.
Tim
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
> Well, you could test my latest branches for #14175:
Hey Tim, I got the branch of chutney and tor and made sure that the commands you run in the comments of the issue exist. What do you think would be a good way to start testing? Begin with a static analysis of the code? Verify that the bandwidth is correct? Let me know what you think is important/feasible.
- - Cory
On 10 Jul 2015, at 09:47 , Cory Pruce corypruce@gmail.com wrote:
>> Well, you could test my latest branches for #14175:
> Hey Tim, I got the branch of chutney and tor and made sure that the commands you run in the comments of the issue exist. What do you think would be a good way to start testing? Begin with a static analysis of the code?
If you can read Python and shell script, then checking I haven't made any obvious coding errors in my changes would help. But that might require becoming familiar with the codebase - which may take some effort.
The diffs are here, or you can use git diff:
https://github.com/teor2345/chutney/compare/feature14175-chutney-performance...
https://github.com/teor2345/tor/compare/feature14175-chutney-performance-v2
Also, I was only using Python 2, so I might have accidentally introduced some incompatibilities with Python 3.
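A quick, partial way to look for Python 3 breakage is to parse every .py file with python3. The sketch below assumes python3 is installed; the function name and the example directory are made up. It only catches syntax-level problems (a Python 2 print statement, for instance), not runtime differences such as bytes versus str handling.

```shell
#!/bin/sh
# Hypothetical helper: parse every .py file under DIR with python3 and
# print "FAIL: <file>" for each one that does not parse.
py3_syntax_check() {
    command -v python3 >/dev/null 2>&1 || {
        echo "python3 not found" >&2
        return 2
    }
    find "$1" -name '*.py' | while IFS= read -r f; do
        # ast.parse only parses; it does not run or write anything.
        if ! python3 -c 'import ast, sys; ast.parse(open(sys.argv[1], "rb").read(), sys.argv[1])' \
                "$f" </dev/null 2>/dev/null; then
            echo "FAIL: $f"
        fi
    done
}

# Example (the checkout location is an assumption):
# py3_syntax_check "$HOME/tor/chutney"
```

No output means every file parsed; anything listed is worth opening in the diff first.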
> Verify that the bandwidth is correct?
Since it's the localhost, CPU-limited, massively-parallel bandwidth, there's no "correct" value. I'm not even sure what sane values are, but we'll get an idea once people start competing for the biggest numbers.
> Let me know what you think is important/feasible.
Does it run? When you make performance improvements, does the bandwidth increase? (Or, far more easily: when you deliberately slow down the code, does the bandwidth tank?)
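Since there is no "correct" absolute value, the useful number is the relative change between two runs on the same machine. A minimal sketch, assuming you have noted one bandwidth figure per run in the same unit; the function name and the figures in the example are made up.

```shell
#!/bin/sh
# Hypothetical helper: percentage change from a baseline bandwidth figure
# to a new one (any unit, as long as both runs use the same one).
bw_change_percent() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        if (base <= 0) { print "baseline must be positive"; exit 1 }
        printf "%+.1f%%\n", 100 * (new - base) / base
    }'
}

# Made-up figures, purely to show the call:
# bw_change_percent 200 150    # run with deliberately slowed-down code
```

A clearly negative result for a deliberately slowed-down build suggests the measurement is sensitive enough to show real improvements too.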
Tim
On 10 Jul 2015, at 11:35 , teor teor2345@gmail.com wrote:
> On 10 Jul 2015, at 09:47 , Cory Pruce corypruce@gmail.com wrote:
>>> Well, you could test my latest branches for #14175:
>> Hey Tim, I got the branch of chutney and tor and made sure that the commands you run in the comments of the issue exist. What do you think would be a good way to start testing? Begin with a static analysis of the code?
> If you can read Python and shell script, then checking I haven't made any obvious coding errors in my changes would help. But that might require becoming familiar with the codebase - which may take some effort.
> The diffs are here, or you can use git diff:
> https://github.com/teor2345/chutney/compare/feature14175-chutney-performance...
> https://github.com/teor2345/tor/compare/feature14175-chutney-performance-v2
This commit contains all the tor changes: https://github.com/teor2345/tor/commit/128d4a68967fb2986965e9f8443f362a9dc20...
(The github comparison picked the wrong base branch, so it's huge and unhelpful.)
Tim Wilson-Brown (teor)
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
> If you can read Python and shell script, then checking I haven't made any obvious coding errors in my changes would help. But that might require becoming familiar with the codebase - which may take some effort.
> The diffs are here, or you can use git diff:
> https://github.com/teor2345/chutney/compare/feature14175-chutney-performance...
> https://github.com/teor2345/tor/compare/feature14175-chutney-performance-v2
Awesome, yup I can read those languages. I'll hopefully wrap my head around chutney's and tor's code in a reasonable amount of time. Gotta start somewhere =-)
> When you make performance improvements, does the bandwidth increase? (Or, far more easily: when you deliberately slow down the code, does the bandwidth tank?)
Sounds like a good idea to me!
Thanks again,
Cory
On Sun, 05 Jul 2015 10:20:50 -0700 Cory Pruce corypruce@gmail.com wrote:
> Dude, thanks a bunch for your help. I'm really excited to start :D I'm going to read through the initial design and the code to see what functions/structures/constants/etc. need to be changed. Let me know when you release #14175 and I'll be happy to be the test guinea pig =-)
Maybe look at these for a start. It's something that's on my TODO list, but I wouldn't complain if someone else happens to do it before I do, and it would help HS scalability considerably[0].
https://trac.torproject.org/projects/tor/ticket/13737
https://trac.torproject.org/projects/tor/ticket/13738
(If you happen to be more interested in making non-HS use cases faster, then look elsewhere. :P)
Regards,