On 9/22/20 07:10, George Kadianakis wrote:
George Kadianakis desnacked@riseup.net writes:
tevador tevador@gmail.com writes:
Hi all,
Hello,
I have pushed another update to the PoW proposal here:
https://github.com/asn-d6/torspec/tree/pow-over-intro
I also (finally) merged it upstream to torspec as proposal #327:
https://github.com/torproject/torspec/blob/master/proposals/327-pow-over-int...
The most important improvements are:
- Add tevador as an author.
- Update PoW algorithms based on tevador's Equix feedback.
- Update effort estimation algorithm based on tevador's simulation.
- Include hybrid attack section.
- Remove a bunch of blocker tags.
Two things I'd like to work more on:
- I'd like people to take tevador's Equix PoW function and run it on their boxes and post back benchmarks of how it performed.
I shared some results privately with George and he suggested including the list. Results below.
Particularly so if you have a GPU-enabled box, so that we can get some benchmarks from GPUs as well. That will help us tune the proposal even more.
For anyone else following along or also contributing benchmarks, George clarified for me that the equix benchmark isn't capable of utilizing the GPU.
My results:
First results are on my w530, i7, 4 core (hyperthreaded to 8) laptop (with moderate activity in the background).
I stumbled across a weird artifact when using more threads than processors: the benchmark reports solutions/sec continuing to increase linearly with #threads. The wall-clock time for the benchmark itself (measured with `time`) shows the expected trend, though: linear scaling only up to 4 threads (the number of physical cores), a small gain at 8 (using the hyperthreaded virtual cores), and no improvement past that.
Further below are results on my pinephone.

$ time ./equix-bench --threads 1
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ...
1.910000 solutions/nonce
227.714446 solutions/sec. (1 thread)
20301.439170 verifications/sec. (1 thread)

real 0m4.242s
user 0m4.230s
sys 0m0.012s

$ time ./equix-bench --threads 2
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ...
1.910000 solutions/nonce
450.100153 solutions/sec. (2 threads)
17925.519934 verifications/sec. (1 thread)

real 0m2.184s
user 0m4.294s
sys 0m0.004s

$ time ./equix-bench --threads 4
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ...
1.910000 solutions/nonce
876.343564 solutions/sec. (4 threads)
18863.079719 verifications/sec. (1 thread)

real 0m1.154s
user 0m4.400s
sys 0m0.012s

$ time ./equix-bench --threads 8
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ...
1.910000 solutions/nonce
1089.198671 solutions/sec. (8 threads)
17808.857809 verifications/sec. (1 thread)

real 0m0.981s
user 0m7.019s
sys 0m0.052s

$ time ./equix-bench --threads 16
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 16) ...
1.910000 solutions/nonce
2183.232035 solutions/sec. (16 threads)
18936.014118 verifications/sec. (1 thread)

real 0m1.025s
user 0m7.021s
sys 0m0.032s

$ time ./equix-bench --threads 32
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 32) ...
1.910000 solutions/nonce
4397.259598 solutions/sec. (32 threads)
17754.229411 verifications/sec. (1 thread)

real 0m1.026s
user 0m6.961s
sys 0m0.049s

$ cat /proc/cpuinfo
<snip>
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i7-3740QM CPU @ 2.70GHz
stepping        : 9
microcode       : 0x21
cpu MHz         : 1856.366
cache size      : 6144 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips        : 5387.48
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
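
To make the artifact in the laptop runs concrete, here is a rough cross-check in Python. It assumes the whole run is dominated by solving, so that 500 nonces times 1.91 solutions/nonce divided by the `real` time approximates the true solve rate. That is my assumption, not something the benchmark states: `real` also covers startup and whatever else the benchmark does, so the implied rate is at best a slight underestimate. The names `RUNS`, `wall_clock_rate`, and `cross_check` are made up for this sketch.

```python
# Cross-check of the reported solutions/sec against wall-clock time.
# Numbers are copied from the laptop runs above.
NONCES = 500                 # the benchmark solves nonces 0-499
SOLUTIONS_PER_NONCE = 1.91   # printed by every run

# threads -> (reported solutions/sec, `real` seconds from `time`)
RUNS = {
    1: (227.714446, 4.242),
    2: (450.100153, 2.184),
    4: (876.343564, 1.154),
    8: (1089.198671, 0.981),
    16: (2183.232035, 1.025),
    32: (4397.259598, 1.026),
}


def wall_clock_rate(real_seconds):
    """Solutions/sec implied by total wall-clock time (assumes solving dominates)."""
    return NONCES * SOLUTIONS_PER_NONCE / real_seconds


def cross_check(runs):
    """Return {threads: (reported, implied, reported / implied)}."""
    result = {}
    for threads, (reported, real_seconds) in runs.items():
        implied = wall_clock_rate(real_seconds)
        result[threads] = (reported, implied, reported / implied)
    return result


if __name__ == "__main__":
    for threads, (reported, implied, ratio) in cross_check(RUNS).items():
        print(f"{threads:>2} threads: reported {reported:7.1f}/s, "
              f"wall-clock {implied:6.1f}/s, ratio {ratio:.2f}")
```

Under that assumption the reported and implied rates agree to within roughly 12% up to 8 threads, while at 16 and 32 threads the reported figure is about 2.3x and 4.7x the wall-clock-implied one, which matches the "keeps scaling with #threads" artifact described above.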

Similar behavior on the (4-core aarch64) pinephone:

$ time ./equix-bench --threads 1
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 1) ...
1.910000 solutions/nonce
23.920219 solutions/sec. (1 thread)
4477.199102 verifications/sec. (1 thread)

real 0m 40.35s
user 0m 40.12s
sys 0m 0.01s

$ time ./equix-bench --threads 2
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 2) ...
1.910000 solutions/nonce
47.683428 solutions/sec. (2 threads)
4384.937853 verifications/sec. (1 thread)

real 0m 20.45s
user 0m 40.20s
sys 0m 0.06s

$ time ./equix-bench --threads 4
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 4) ...
1.910000 solutions/nonce
94.149494 solutions/sec. (4 threads)
4359.695415 verifications/sec. (1 thread)

real 0m 10.47s
user 0m 40.71s
sys 0m 0.08s

$ time ./equix-bench --threads 8
Solving nonces 0-499 (interpret: 0, hugepages: 0, threads: 8) ...
1.910000 solutions/nonce
188.808873 solutions/sec. (8 threads)
4348.479398 verifications/sec. (1 thread)

real 0m 10.50s
user 0m 40.61s
sys 0m 0.07s

$ cat /proc/cpuinfo
<snip>
processor       : 3
BogoMIPS        : 48.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0xd03
CPU revision    : 4
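
For the pinephone, the scaling can be read straight off the `real` times. The sketch below computes speedup relative to the 1-thread run and a per-core efficiency, capping the divisor at the 4 physical cores. The names `REAL_SECONDS`, `speedup`, and `efficiency` are made up for illustration, and treating `real` as pure solve time is again my assumption.

```python
# Parallel speedup and efficiency from the pinephone `real` times above.
CORES = 4  # 4-core aarch64, no SMT

# threads -> `real` seconds reported by `time`
REAL_SECONDS = {1: 40.35, 2: 20.45, 4: 10.47, 8: 10.50}


def speedup(real_seconds, baseline_threads=1):
    """Wall-clock speedup of each run relative to the baseline run."""
    base = real_seconds[baseline_threads]
    return {threads: base / secs for threads, secs in real_seconds.items()}


def efficiency(real_seconds, cores=CORES):
    """Speedup divided by the number of threads that can truly run in parallel."""
    return {
        threads: s / min(threads, cores)
        for threads, s in speedup(real_seconds).items()
    }


if __name__ == "__main__":
    sp = speedup(REAL_SECONDS)
    eff = efficiency(REAL_SECONDS)
    for threads in sorted(REAL_SECONDS):
        print(f"{threads} threads: speedup {sp[threads]:.2f}x, "
              f"efficiency {eff[threads]:.2f}")
```

By this measure the 4-thread run is a bit under 4x faster than the 1-thread run, and going to 8 threads gives no further wall-clock gain, even though the reported solutions/sec doubles from 94 to 189.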