[tor-commits] [sbws/master] Replace v3bw-into-xy bash script with python script

pastly at torproject.org
Tue Jun 26 15:36:49 UTC 2018


commit 09691a0fe7b3809f2cdacd7713c2b37668c6c93b
Author: Matt Traudt <sirmatt at ksu.edu>
Date:   Thu Jun 14 22:23:22 2018 -0400

    Replace v3bw-into-xy bash script with python script
    
    GH: ref #182
---
 CHANGELOG.md                  |  2 ++
 scripts/tools/v3bw-into-xy.py | 53 +++++++++++++++++++++++++++++++++++++++++++
 scripts/tools/v3bw-into-xy.sh | 26 ---------------------
 3 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3a79c02..6e40965 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,6 +13,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
   stored in `v3bw` directory, named `YYmmdd_HHMMSS.v3bw`, and previously
 generated ones are kept. A `latest.v3bw` symlink is updated. (GH#179 GHPR#190)
 - Code refactoring in the v3bw classes and generation area
+- Replace v3bw-into-xy bash script with python script to handle a more complex
+  v3bw file format (GH#182)
 
 ## [0.4.1]
 
diff --git a/scripts/tools/v3bw-into-xy.py b/scripts/tools/v3bw-into-xy.py
new file mode 100755
index 0000000..fda5d74
--- /dev/null
+++ b/scripts/tools/v3bw-into-xy.py
@@ -0,0 +1,53 @@
+#!/usr/bin/env python3
+import sys
+import re
+# File: v3bw-into-xy.py
+# Author: Matt Traudt
+# License: CC0
+#
+# Takes one or more v3bw files as arguments.
+#
+# Looks for lines that contain actual data. That means most of them, since most
+# of them contain "node_id=" and those are the ones that are interesting.
+#
+# Extract the fingerprint and bandwidth values for each of those lines and put
+# them on stdout, one per line. Effectively, after ignoring other lines, this:
+#     node_id=$AAAA...AAAA bw=12345
+# becomes this:
+#     AAAA...AAAA 12345
+#
+# NOTE: If you specify more than one v3bw file, this will do NOTHING to tell
+# you when the output from one file stops and the next begins.
+#
+# With v1.1.0 of the v3bw file format, we no longer know if node_id or bw is
+# first in the line. Hence two regular expressions and searching for the
+# matched item that has 40 chars (the fingerprint).
+
+
+def main():
+    re1 = re.compile(r'.*node_id=\$?([\w]+).* bw=([\d]+).*')
+    re2 = re.compile(r'.*bw=([\d]+).* node_id=\$?([\w]+)')
+    for fname in sys.argv[1:]:
+        with open(fname, 'rt') as fd:
+            for line in fd:
+                if 'node_id' not in line:
+                    continue
+                match = re1.match(line) or re2.match(line)
+                if not match:
+                    continue
+                items = match.groups()
+                assert len(items) == 2
+                s = '{} {}\n'
+                if len(items[0]) == 40:
+                    s = s.format(*items)
+                else:
+                    s = s.format(*items[::-1])
+                sys.stdout.write(s)
+    return 0
+
+
+if __name__ == '__main__':
+    try:
+        sys.exit(main())
+    except (KeyboardInterrupt, BrokenPipeError):
+        pass
diff --git a/scripts/tools/v3bw-into-xy.sh b/scripts/tools/v3bw-into-xy.sh
deleted file mode 100755
index 059739d..0000000
--- a/scripts/tools/v3bw-into-xy.sh
+++ /dev/null
@@ -1,26 +0,0 @@
-#!/usr/bin/env bash
-# File: v3bw-into-xy.sh
-# Author: Matt Traudt
-# License: CC0
-#
-# Takes one or more v3bw files as arguments.
-#
-# Looks for lines that contain actual data. That means most of them, since most
-# of them start with "node_id=" and those are the ones that are interesting.
-#
-# Extract the fingerprint and bandwidth values for each of those lines and put
-# them on stdout, one per line. Effectively, after ignoring other lines, this:
-#     node_id=$AAAA...AAAA bw=12345
-# becomes this:
-#     AAAA...AAAA 12345
-#
-# NOTE: If you specify more than v3bw file, this will do NOTHING to tell you
-# when the output from one file stops and the next begins
-set -e
-while [ "$1" != "" ]
-do
-    grep '^node_id=' "$1" |
-	    sed -r 's|^node_id=([$A-Z0-9]+) bw=([0-9]+).*$|\1 \2|' |
-	    sed 's|\$||g'
-    shift
-done
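
Since the v3bw 1.1.0 format no longer fixes the order of node_id and bw on a line, one alternative to the two-regex approach in the patch is to split each line into key=value pairs and look both keys up by name. The sketch below is not the committed code: the function name v3bw_line_to_xy is hypothetical, and it assumes that pairs are separated by whitespace and that values contain no spaces.

```python
def v3bw_line_to_xy(line):
    """Return 'FINGERPRINT BW' for a relay line, or None for any other line.

    Assumption (not taken from the patch): key=value pairs are separated
    by whitespace and the values themselves contain no whitespace.
    """
    # Build a dict from every whitespace-separated token that has an '='.
    pairs = dict(
        item.split('=', 1) for item in line.split() if '=' in item
    )
    # Header lines and other non-relay lines lack one or both keys.
    if 'node_id' not in pairs or 'bw' not in pairs:
        return None
    # The fingerprint may or may not carry a leading '$'.
    fingerprint = pairs['node_id'].lstrip('$')
    return '{} {}'.format(fingerprint, pairs['bw'])
```

Calling it on a line with node_id first or with bw first gives the same "fingerprint bandwidth" string, so no length check on the captured groups is needed. Unlike the regular expressions in the patch, it does not validate that bw is numeric.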




