Hello CollecTor data consumers,
Note: if you don't know what CollecTor is, or if you have never used CollecTor data before, you can safely stop reading now and enjoy your Friday. But in case you're now curious what CollecTor is, feel free to take a look:
https://collector.torproject.org/
So, we're about to remove a sanity check in CollecTor where we'd feed relay descriptors into metrics-lib and only serve them if they pass the parsing step. This step has so far eliminated descriptors with non-ASCII characters in unexpected places, like:
dirreq-v3-ips us=192,ru=144,de=136,??=112,fr=64,gb=56,br=40,it=40,es=24,in=24,jp=24,pl=24,ar=16,at=16,au=16,ca=16,ch=16,cn=16,cz=16,ir=16,kr=16,mx=16,nl=16,ua=16,za=16,^Wg=8,ae=8,ao=8,be=8,bg=8,bj=8,bo=8,bs=8,bw=8,by=8,cd=8,cl=8,cm=8,co=8,cr=8,cu=8,dd=8,dk=8,du=8,dz=8,ec=8,eg=8,fi=8,gf=8,gh=8,gr=8,gt=8,hk=8,hn=8,hr=8,hu=8,id=8,ie=8,il=8,i�=8,ke=8,kz=8,lk=8,lt=8,lv=8,ly=8,ma=8,md=8,mk=8,mu=8,my=8,nc=8,ng=8,ni=8,no=8,nz=8,pa=8,pe=8,ph=8,pk=8,pr=8,pt=8,ro=8,rs=8,se=8,sg=8,sk=8,sn=8,sy=8,th=8,tn=8,tr=8,tw=8,ug=8,ve=8,vn=8,�p=8
Very soon (next week?), CollecTor will serve those descriptors, just like the Tor directory authorities serve them.
If you're obtaining and processing descriptors from CollecTor, you'll have to handle them somehow, because your parser will now see them. Of course, if you're using metrics-lib, nothing changes, and I hear Stem discards descriptors with non-ASCII characters in a line like the one above, too.
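If you parse descriptors with your own code, here is a minimal Python sketch of one way to cope. It is not what metrics-lib or Stem do internally; the function names and the sample line are made up for illustration, and it assumes you read descriptors as raw bytes rather than as text.

```python
def is_ascii_only(raw: bytes) -> bool:
    """Return True if every byte of the descriptor is 7-bit ASCII."""
    return all(b < 0x80 for b in raw)


def non_ascii_lines(raw: bytes):
    """Yield (line_number, line) for each line containing a non-ASCII byte."""
    for number, line in enumerate(raw.split(b"\n"), start=1):
        if any(b >= 0x80 for b in line):
            yield number, line


def decode_leniently(raw: bytes) -> str:
    """Decode as ASCII, replacing each non-ASCII byte with U+FFFD."""
    return raw.decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hypothetical, shortened descriptor line with one stray byte (0xf2).
    sample = b"dirreq-v3-ips us=192,i\xf2=8,ke=8\n"

    if is_ascii_only(sample):
        print("descriptor is clean")
    else:
        # Option 1: report the affected lines and skip the descriptor.
        for number, line in non_ascii_lines(sample):
            print("non-ASCII byte in line", number)
        # Option 2: keep going with a lenient decode.
        text = decode_leniently(sample)
        print(text)
```

Whether you skip such descriptors or decode them leniently depends on your use case. Skipping matches what CollecTor has effectively done for you so far.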
If you want, you can find more details on the ticket:
https://trac.torproject.org/projects/tor/ticket/19170
All the best,
Karsten