[tor-commits] [stem/master] Explain why archives contain few microdescriptors

atagar at torproject.org
Sat Aug 17 20:44:27 UTC 2019


commit e66d29ac772ae1143e832c0e38b96a968b80735f
Author: Damian Johnson <atagar at torproject.org>
Date:   Sat Aug 10 14:15:39 2019 -0700

    Explain why archives contain few microdescriptors
    
    Thanks to Karsten for the explanation! I'm probably not the only person who
    will be confused about why CollecTor contains so few microdescriptors on an
    hourly basis, so this explains it in our pydocs.
---
 stem/descriptor/collector.py       | 11 ++++++++++-
 test/integ/descriptor/collector.py |  3 ---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/stem/descriptor/collector.py b/stem/descriptor/collector.py
index 3867fa3d..85eafec6 100644
--- a/stem/descriptor/collector.py
+++ b/stem/descriptor/collector.py
@@ -450,7 +450,16 @@ class CollecTor(object):
   def get_microdescriptors(self, start = None, end = None, cache_to = None, timeout = None, retries = 3):
     """
     Provides microdescriptors published during the given time range,
-    sorted oldest to newest.
+    sorted oldest to newest. Unlike server/extrainfo descriptors,
+    microdescriptors change very infrequently...
+
+    ::
+
+      "Microdescriptors are expected to be relatively static and only change
+      about once per week." -dir-spec section 3.3
+
+    CollecTor archives only contain microdescriptors that *change*, so hourly
+    tarballs often contain very few.
 
     :param datetime.datetime start: time range to begin with
     :param datetime.datetime end: time range to end with
diff --git a/test/integ/descriptor/collector.py b/test/integ/descriptor/collector.py
index db57df5e..d38ce7d1 100644
--- a/test/integ/descriptor/collector.py
+++ b/test/integ/descriptor/collector.py
@@ -56,9 +56,6 @@ class TestCollector(unittest.TestCase):
   def test_downloading_microdescriptors(self):
     recent_descriptors = list(stem.descriptor.collector.get_microdescriptors(start = RECENT))
 
-    # TODO: I'm unsure why these counts differ so much from server/extrainfo
-    # descriptors. Checking with Karsten.
-
     if not (300 < len(recent_descriptors) < 800):
       self.fail('Downloaded %i descriptors, expected 300-800' % len(recent_descriptors))  # 23 on 8/7/19
 
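The docstring change above says archives only hold microdescriptors that *change*. The following is a minimal sketch of that idea, not CollecTor's actual implementation: the function name `newly_archived` and the digest strings are hypothetical, and the model simply assumes an hourly archive contains only digests that no earlier hour has already stored.

```python
def newly_archived(hourly_snapshots):
  """
  Toy model (an assumption, not CollecTor's code): for each hourly snapshot
  of microdescriptor digests, return only the digests not seen in any
  earlier hour.
  """

  seen = set()
  per_hour = []

  for snapshot in hourly_snapshots:
    fresh = set(snapshot) - seen  # digests that first appear this hour
    per_hour.append(fresh)
    seen |= fresh

  return per_hour


# Hypothetical digests: three relays, one of which changes in the third hour.
snapshots = [
  {'md-a1', 'md-b1', 'md-c1'},
  {'md-a1', 'md-b1', 'md-c1'},
  {'md-a1', 'md-b2', 'md-c1'},
]

counts = [len(fresh) for fresh in newly_archived(snapshots)]
print(counts)  # [3, 0, 1]
```

Under this model, only the first hour carries every microdescriptor and later hours carry just the ones that changed. That would be consistent with a call such as `stem.descriptor.collector.get_microdescriptors(start = RECENT)` from the integ test returning far fewer entries than the equivalent server or extrainfo descriptor download.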




