[metrics-team] server-descriptor not readable (2018-02-08-11-05-00-server-descriptors has >30k descriptors)

Karsten Loesing karsten at torproject.org
Fri Feb 9 11:41:57 UTC 2018


On 2018-02-09 10:44, Karsten Loesing wrote:
> On 2018-02-09 10:26, Katharina Haselhorst wrote:
>> Hi Karsten,
>>> I just gave this a try and was able to parse the file in a bit under 20
>>> minutes. I set -Xmx6g, though it might also work with less heap space.
>>
>> Thx for trying out!
>>
>>> So, this is a known limitation of metrics-lib which cannot handle large
>>> descriptor files very well. We're tracking this bug here, just in case
>>> you want to follow along:
>>
>> I see. Does this limitation only apply to large files or also to large
>> archives (with many smaller files)?
> 
> Just to large files. Archives with many small files are not affected.

If you're interested in trying out an unreviewed, unreleased metrics-lib
version with a possible fix, here's a development version that you could
use:

https://people.torproject.org/~karsten/volatile/metrics-lib-2.1.1-dev.jar

https://people.torproject.org/~karsten/volatile/metrics-lib-2.1.1-dev.jar.asc

The main change is that metrics-lib will start providing descriptors
from a large descriptor file almost immediately, as they are parsed,
rather than handing over all of them at once when it has finished
reading the file.

Parsing a descriptor file with 30,000 descriptors will still take its
time, though. This has not changed with this patch.
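
To illustrate the difference between the two behaviors, here is a
minimal, self-contained sketch. It is not the metrics-lib code and does
not use the metrics-lib API; the class and method names are made up for
this example. It assumes that each server descriptor starts with a line
beginning with "router " and, for simplicity, ignores annotation lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; this is not metrics-lib code. */
final class DescriptorSplitSketch {

  /** Assumption: every descriptor starts with a line beginning with "router ". */
  static final String START = "router ";

  /** Eager variant: reads the whole input before returning anything. */
  static List<String> readAll(BufferedReader in) {
    List<String> all = new ArrayList<>();
    lazily(in).forEachRemaining(all::add);
    return all;
  }

  /** Lazy variant: yields a descriptor as soon as the next one begins. */
  static Iterator<String> lazily(BufferedReader in) {
    return new Iterator<String>() {
      private String pendingStart; // first line of the next descriptor, already read
      private String next;         // complete descriptor waiting to be returned
      private boolean done;        // true once the input is exhausted

      @Override
      public boolean hasNext() {
        if (next == null && !done) {
          next = readOne();
        }
        return next != null;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String result = next;
        next = null;
        return result;
      }

      /** Reads lines up to, but not including, the next "router " line. */
      private String readOne() {
        StringBuilder sb = new StringBuilder();
        if (pendingStart != null) {
          sb.append(pendingStart).append('\n');
          pendingStart = null;
        }
        try {
          String line;
          while ((line = in.readLine()) != null) {
            if (line.startsWith(START) && sb.length() > 0) {
              pendingStart = line; // belongs to the following descriptor
              return sb.toString();
            }
            sb.append(line).append('\n');
          }
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
        done = true;
        return sb.length() > 0 ? sb.toString() : null;
      }
    };
  }
}
```

With lazily(), a caller can process the first descriptor after only a
few lines have been read and never needs to hold all 30,000 in memory,
whereas readAll() returns nothing until the whole file has been
consumed. The total reading time is the same in both cases.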

If this patch passes the review process, it will be part of version
2.2.0, probably some time next week.

>> Regards, Kathi

All the best,
Karsten
