Hi,
I'm currently writing a follow-up blog post to [1] about a large scale malicious tor exit relay operator that ran more than 23% of the Tor network's exit capacity (May 2020) before (some of) it was reported to the bad-relays team and subsequently removed from the network by the Tor directory authorities. After the initial removal the malicious actor quickly restored its activities and was back at >20% of the Tor network's exit capacity within weeks (June 2020).
[1] https://medium.com/@nusenu/the-growing-problem-of-malicious-relays-on-the-to...
To prevent this from happening over and over again I'm proposing two simple but to some extent effective relay requirements to make malicious relay operations more expensive, time-consuming, less sustainable and more risky for such actors:
a) require a verified email address for the exit or guard relay flag. (automated verification, many relays)
b) require a verified physical address for large operators (>=0.5% exit or guard probability) (manual verification, low number of operators). It is not required that the address is public or stored after it has been verified. For details see below [2].
0.5% exit probability is currently about 500-600 Mbit/s of advertised bandwidth.
Q: How many operators would be affected by the physical address verification requirement if we use 0.5% as a threshold? A: About 30 operators.
There are currently about 18 exit [3] and 12 guard operators that run >0.5% exit/guard capacity if we ignore the fresh exit groups from 2020. Most exit operators (14 out of these 18) are organizations with public addresses or have their address published in WHOIS anyway.
At the end of the upcoming blog post I'd like to give people some idea as to how much support this proposal has.
Please let me know if you find this idea to limit attackers useful, especially if you are a long term relay operator, one of the 30 operators running >=0.5% exit/guard capacity, a Tor directory authority operator or part of The Torproject.
thanks for your support to fight malicious tor relays! nusenu
Hi nusenu,
On Sun, 2020-07-05 at 18:35 +0200, nusenu wrote:
https://medium.com/@nusenu/the-growing-problem-of-malicious-relays-on-the-to...
This was an interesting article, thanks for sharing. It's sad to read that you and the anonymous Tor core member weren't heard on the bad-relays list. Did you ever get a reply from a directory authority? I would assume that the dir auths are very keen to discuss this subject.
To prevent this from happening over and over again I'm proposing two simple but to some extent effective relay requirements to make malicious relay operations more expensive, time-consuming, less sustainable and more risky for such actors:
a) require a verified email address for the exit or guard relay flag. (automated verification, many relays)
Wouldn't that be a hurdle for a lot of relay operators? I can imagine that many operators (of smaller relays) don't publish contact information for privacy reasons. Of course, in your proposal that information would only be shared with the directory authorities, but do we have any numbers on how many relay operators are okay with this?
You mention that the current defenses are inadequate for protection against slowly-added dispersed relays that are run by malicious actors. Wouldn't introducing this requirement prevent some relay operators with good intentions from serving exit or guard relays?
Furthermore, do we have any information on how much more difficult it would become to perform a sybil attack if your proposal is implemented? Assuming that this is something that can be somewhat accurately measured.
b) require a verified physical address for large operators (>=0.5% exit or guard probability) (manual verification, low number of operators). It is not required that the address is public or stored after it has been verified. For details see below [2].
0.5% exit probability is currently about 500-600 Mbit/s of advertised bandwidth.
That seems reasonable. I currently co-run an exit relay that has just under 0.1% probability and would be okay with sharing my physical address with the directory authorities, especially if my probability would be higher.
However, 0.5% seems like an arbitrary number to me. Can you provide some background information on how you got to this percentage? Is there maybe some way to calculate a malicious relay operator's deanonymisation success rate?
Q: How many operators would be affected by the physical address verification requirement if we use 0.5% as a threshold? A: About 30 operators.
There are currently about 18 exit [3] and 12 guard operators that run >0.5% exit/guard capacity if we ignore the fresh exit groups from 2020. Most exit operators (14 out of these 18) are organizations with public addresses or have their address published in WHOIS anyway.
Please don't assume that all big relay operators are happy with sharing their physical address because most of them already do. Maybe we can poll the big relay operators to find out if they are okay with this? (I don't know if all of them are represented on this list)
Edit: looks like niftybunny is already on this.
At the end of the upcoming blog post I'd like to give people some idea as to how much support this proposal has.
Great, I'm looking forward to this. It's a good thing to publicly discuss proposals like these.
Please let me know if you find this idea to limit attackers useful, especially if you are a long term relay operator, one of the 30 operators running >=0.5% exit/guard capacity, a Tor directory authority operator or part of The Torproject.
It is definitely an interesting idea, one that I have not thought of at least. But I'm not sure if it would be effective at preventing what it tries to prevent. Ultimately, the best solution for the sybil-attacks-are-easy problem is simple: we need more bandwidth provided by relays from operators with good intentions.
Imre
Thanks for all the positive off-list feedback so far!
a) require a verified email address for the exit or guard relay flag. (automated verification, many relays)
Wouldn't that be a hurdle for a lot of relay operators? I can imagine that many operators (of smaller relays) don't publish contact information for privacy reasons.
I believe you can have a valid ContactInfo and privacy.
Of course, in your proposal that information would only be shared with the directory authorities
That is not necessarily the case if the ContactInfo field is used without encryption; basically, it is not specified yet.
but do we have any numbers on how many relay operators are okay with this?
I can only give you numbers based on the current tor network data (but that is not an answer to your question since it does not reveal anything about the operator's intention)
~71% of tor's guard capacity has a non-empty ContactInfo. About 700 guard relays have no ContactInfo set and are older than 1 month.
~89% of tor's exit capacity has a non-empty ContactInfo. Only about 60 exit relays have no ContactInfo set and are older than 1 month.
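For anyone who wants to reproduce numbers like these: a minimal sketch, assuming the public Onionoo "details" documents and their "contact", "guard_probability" and "exit_probability" fields (the exact percentages will obviously differ depending on when you run it):

    import requests

    # fetch contact and probability fields for all running relays from Onionoo
    resp = requests.get("https://onionoo.torproject.org/details",
                        params={"running": "true",
                                "fields": "contact,guard_probability,exit_probability"})
    relays = resp.json()["relays"]

    def contactinfo_coverage(prob_field):
        # share of the probability mass carried by relays with a non-empty ContactInfo
        total = sum(r.get(prob_field, 0) for r in relays)
        with_contact = sum(r.get(prob_field, 0) for r in relays if r.get("contact"))
        return with_contact / total if total else 0.0

    print("guard capacity with ContactInfo: %.1f%%" % (100 * contactinfo_coverage("guard_probability")))
    print("exit capacity with ContactInfo:  %.1f%%" % (100 * contactinfo_coverage("exit_probability")))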
b) require a verified physical address for large operators (>=0.5% exit or guard probability) (manual verification, low number of operators). It is not required that the address is public or stored after it has been verified. For details see below [2].
However, 0.5% seems like an arbitrary number to me. Can you provide some background information on how you got to this percentage? Is there maybe some way to calculate a malicious relay operator's deanonymisation success rate?
The reasoning behind the specific threshold will be covered in more detail in the upcoming blog post.
Q: How many operators would be affected by the physical address verification requirement if we use 0.5% as a threshold? A: About 30 operators.
There are currently about 18 exit [3] and 12 guard operators that run >0.5% exit/guard capacity if we ignore the fresh exit groups from 2020. Most exit operators (14 out of these 18) are organizations with public addresses or have their address published in WHOIS anyway.
Please don't assume that all big relay operators are happy with sharing their physical address because most of them already do. Maybe we can poll the big relay operators to find out if they are okay with this? (I don't know if all of them are represented on this list)
In fact, my initial email went to many operators (after the mailing list was not happy with so many recipients I resent it to the list without the others in To, so unfortunately you no longer see the full list of recipients), but yes, that is the point of this email - getting feedback from operators, especially from big ones. A few have replied already.
It is definitely an interesting idea, one that I have not thought of at least. But I'm not sure if it would be effective at preventing what it tries to prevent.
Yes, that is basically the key question. Since there appears to be a lot of money involved in running malicious relays, they certainly have enough money to buy some office services in some random place and get a physical address verified, but one of the other factors of the proposal is the additional time required for an attacker to go through the process, and the fact that it can no longer be completely automated.
kind regards, nusenu
Thanks for the detailed reply, nusenu. Looks like you thought this through really well.
It would be nice if Tor core people would chip in on this as well! @arma, @teor maybe?
See my further comments inline.
On Sun, 2020-07-05 at 22:50 +0200, nusenu wrote:
I believe you can have a valid ContactInfo and privacy.
I do too, but I hope that prospective operators think so as well.
Of course, in your proposal that information would only be shared with the directory authorities
That is not necessarily the case if the ContactInfo field is used without encryption; basically, it is not specified yet.
but do we have any numbers on how many relay operators are okay with this?
I can only give you numbers based on the current tor network data (but that is not an answer to your question since it does not reveal anything about the operator's intention)
~71% of tor's guard capacity has a non-empty ContactInfo. About 700 guard relays have no ContactInfo set and are older than 1 month.
~89% of tor's exit capacity has a non-empty ContactInfo. Only about 60 exit relays have no ContactInfo set and are older than 1 month.
Those numbers look encouraging to me. It's good to see that most operators are doing things the right way, i.e. being reachable in case something happens to their relay. Still not 100% though.
The reasoning behind the specific threshold will be covered in more detail in the upcoming blog post.
Now you're making me really curious.
In fact, my initial email went to many operators (after the mailing list was not happy with so many recipients I resent it to the list without the others in To, so unfortunately you no longer see the full list of recipients), but yes, that is the point of this email - getting feedback from operators, especially from big ones. A few have replied already.
That's great! Let's see what they think.
It is definitely an interesting idea, one that I have not thought of at least. But I'm not sure if it would be effective at preventing what it tries to prevent.
Yes, that is basically the key question. Since there appears to be a lot of money involved in running malicious relays, they certainly have enough money to buy some office services in some random place and get a physical address verified, but one of the other factors of the proposal is the additional time required for an attacker to go through the process, and the fact that it can no longer be completely automated.
It would be very interesting to know who pays for that. If we figure that out, then maybe we can persuade them to donate that money to the Tor Project instead. \s
Imre
While I fully support the direction here, I do wonder whether there isn't also other information that could be used. E.g. in bitcoin-land we have persistent issues with anti-privacy services operating large numbers of relays all on three ASNs. In the future, we'll likely be shipping a compressed netblock -> primary ASN table (where we map stub ASNs that always transit one upstream to their upstream) with the software to limit connections to the same networks. Last I heard, Tor's sybil attackers did something similar, so at least forcing them to establish business relationships with many hosting providers by limiting the percentage of relays in a given ASN may be somewhat limiting. If nothing else, it also captures another useful trait: from a traffic correlation perspective, you don't want to just bounce your traffic across five hosts in different OVH datacenters.
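(A rough sketch of what such a per-ASN view could look like for Tor, based on Onionoo data; the AS field name has varied between "as" and "as_number" over time, so treat this as illustrative only:)

    import requests
    from collections import defaultdict

    # sum exit probability per autonomous system to see how concentrated exit capacity is
    relays = requests.get("https://onionoo.torproject.org/details",
                          params={"running": "true", "flag": "Exit"}).json()["relays"]

    per_asn = defaultdict(float)
    for relay in relays:
        asn = relay.get("as") or relay.get("as_number") or "unknown"
        per_asn[asn] += relay.get("exit_probability", 0)

    # the ten most exit-heavy ASNs
    for asn, prob in sorted(per_asn.items(), key=lambda kv: kv[1], reverse=True)[:10]:
        print("%-12s %5.2f%%" % (asn, 100 * prob))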
Matt
On Sun, 5 Jul 2020 at 17:36, nusenu nusenu-lists@riseup.net wrote:
Hi,
I'm currently writing a follow-up blog post to [1] about a large scale malicious tor exit relay operator that ran more than 23% of the Tor network's exit capacity (May 2020) before (some of) it was reported to the bad-relays team and subsequently removed from the network by the Tor directory authorities. After the initial removal the malicious actor quickly restored its activities and was back at >20% of the Tor network's exit capacity within weeks (June 2020).
[1] https://medium.com/@nusenu/the-growing-problem-of-malicious-relays-on-the-to...
To prevent this from happening over and over again I'm proposing two simple but to some extent effective relay requirements to make malicious relay operations more expensive, time-consuming, less sustainable and more risky for such actors:
a) require a verified email address for the exit or guard relay flag. (automated verification, many relays)
free email addresses are cheap, but I guess it would give another correlation if they all use the same free email provider.
b) require a verified physical address for large operators (>=0.5% exit or guard probability) (manual verification, low number of operators). It is not required that the address is public or stored after it has been verified. For details see below [2].
0.5% exit probability is currently about 500-600 Mbit/s of advertised bandwidth.
I am not convinced it would help against large scale attacks. Running 50 relays is not much, and if each was providing 0.49% of capacity that would give them 24.5%... I would expect that an attacker would create more relays than that, and unless there is a good way to find out that this is a single entity, they will all be well below 0.5%.
Q: How many operators would be affected by the physical address verification requirement if we use 0.5% as a threshold? A: About 30 operators.
There are currently about 18 exit [3] and 12 guard operators that run >0.5% exit/guard capacity if we ignore the fresh exit groups from 2020. Most exit operators (14 out of these 18) are organizations with public addresses or have their address published in WHOIS anyway.
At the end of the upcoming blog post I'd like to give people some idea as to how much support this proposal has.
Please let me know if you find this idea to limit attackers useful, especially if you are a long term relay operator, one of the 30 operators running >=0.5% exit/guard capacity, a Tor directory authority operator or part of The Torproject.
thanks for your support to fight malicious tor relays! nusenu -- https://mastodon.social/@nusenu
[2] Physical address verification procedure could look like this:

- The Torproject publishes a central registry of trusted entities that agreed to verify addresses of large scale operators.
- The registry is broken down by area so no central entity needs to see all addresses or is in a position to block all submissions (even though the number of physical address verifications is expected to stay below 50 for the time being). Examples could be:
  Riseup.net: US, ...
  Chaos Computer Club (CCC): DE, ...
  DFRI: SE, ...
  (these organizations host Tor directory authorities)
- Relay operators that would like to run more than 0.5% guard/exit fraction select their respective area and contact the entity to initiate verification.
- Before sending an address verification request the operator verifies that they meet the following requirements:
  - the oldest relay is not younger than two months (https://community.torproject.org/relay/community-resources/swag/)
  - all relays have a proper MyFamily configuration
  - relays include the verified email address and PGP key fingerprint in the relay's ContactInfo
  - at least one of their relays gained the exit or guard flag
  - they have a sustained bandwidth usage of at least 100 Mbit/s (accumulated)
  - they intend to run the capacity for at least 4 months
- Upon receiving a request, the verification entity checks the above requirements in addition to:
  - the relay(s) are currently running
  - the address is in the entity's area
- A random string is generated by the address verification entity and sent with the welcome t-shirt (if requested) to the operator.
- After sending the token the address is deleted.
- Upon receiving the random string the operator sends it back via email to the verification entity, signing the email with the PGP key mentioned in the relay's ContactInfo.
- The verification entity compares the received string with the generated and mailed string.
- If the strings match, the entity sends the relay fingerprint to the directory authority list to unlock the cap for the operator.

After this one-time procedure the operator can add more relays as long as they are in the same family as the approved relay (no new verification needed).
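A minimal sketch of the challenge/response step, to make the data-minimisation point concrete (names and storage are made up; the idea is that only the token-to-family mapping is kept by the verification entity, never the postal address):

    import secrets, hmac

    pending = {}  # family fingerprint -> outstanding token, kept by the verification entity

    def new_challenge(family_fingerprint):
        # random token that gets printed into the letter; the postal address is
        # used once for mailing and then deleted, only this mapping is stored
        token = secrets.token_urlsafe(16)
        pending[family_fingerprint] = token
        return token

    def verify_reply(family_fingerprint, token_from_email):
        # assumes the reply email's PGP signature was already checked against the
        # key listed in the relay's ContactInfo
        expected = pending.pop(family_fingerprint, None)
        return expected is not None and hmac.compare_digest(expected, token_from_email)

    # usage: print the token into the letter, later compare the emailed reply;
    # on success the family fingerprint is reported to the directory authorities
    token = new_challenge("EXAMPLEFAMILYFINGERPRINT")
    print(verify_reply("EXAMPLEFAMILYFINGERPRINT", token))  # True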
[3]
exit operators running >=0.5% of exit probability without exit groups from 2020 (onionoo data as of 2020-07-03)
- abuse-contact@to-surf-and-protect.net: 22.73
- Foundation for Applied Privacy: 6.05
- John L. Ricketts, PhD <john AT quintex dot com>: 5.6
- F3 Netze abuse@f3netze.de: 5.47
- https://www.torservers.net: 3.89
- Nicholas Merrill <nick AT calyx dot com>: 2.48
- https://www.digitale-gesellschaft.ch/abuse/: 1.74
- Accessnow.org <abuse .AT. accessnow .DOT. org>: 1.71
- Hart voor Internetvrijheid: 1.63
- <zwiebeln at online de>: 1.44
- TNinja <abuse-team at tor dot ninja>: 1.43
- tech@emeraldonion.org: 1.31
- Frenn vun der Enn FVDE: 1
- https://www.artikel5ev.de/torcontact/: 0.74
- Nos oignons: 0.71
- tor-abuse<at>mailbox<dot>org: 0.67
- abuse-node49 AT posteo DOT de: 0.57
- tor-operator@privateinternetaccess.com: 0.52
14 out of these 18 operators have their address already publicly listed because they are registered organizations or similar.
Pascal Terjan:
I am not convinced it would help against large scale attacks. Running 50 relays is not much, and if each was providing 0.49% of capacity that would give them 24.5%... I would expect that an attacker would create more relays than that, and unless there is a good way to find out that this is a single entity, they will all be well below 0.5%.
Yes, they will try to circumvent thresholds by pretending to not be a group. The good thing is that this requires additional resources and time on the attacker side to hide the fact that they are adding many relays without triggering certain detections.
kind regards, nusenu
nusenu nusenu-lists@riseup.net wrote:
Pascal Terjan:
I am not convinced it would help against large scale attacks. Running 50 relays is not much, and if each was providing 0.49% of capacity that would give them 24.5%... I would expect that an attacker would create more relays than that, and unless there is a good way to find out that this is a single entity, they will all be well below 0.5%.
Yes, they will try to circumvent thresholds by pretending to not be a group. The good thing is that this requires additional resources and time on the attacker side to hide the fact that they are adding many relays without triggering certain detections.
Your proposed method of delaying the problem would impose a labor burden on the tor project as well and would be slow to react to changes. Why would an automated solution not work? For example, if the directory authorities calculate the traffic percentages every hour or so or even every several hours, then why not just remove a Guard or Exit flag from any guard or exit exceeding the publicized percentage? That would be a fast reaction and would not depend upon multiple human actions. You might also implement a "repeat offender" policy, whereby if the authorities lifted a relay's Exit flag more than n times within a month, a BadExit flag would be applied in addition, which then (and only then) would require the operator to contact the tor project about it.
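(To make that concrete, a toy sketch of such a policy; this is not how the directory authorities are implemented, and the threshold and strike count are placeholders:)

    from collections import Counter

    THRESHOLD = 0.005           # e.g. 0.5% exit probability
    MAX_STRIKES_PER_MONTH = 3   # the "n" in the repeat-offender rule

    strikes = Counter()         # fingerprint -> number of Exit flag removals this month

    def evaluate(relay):
        # relay: {"fingerprint": str, "exit_probability": float, "flags": set}
        if "Exit" in relay["flags"] and relay["exit_probability"] > THRESHOLD:
            relay["flags"].discard("Exit")
            strikes[relay["fingerprint"]] += 1
            if strikes[relay["fingerprint"]] > MAX_STRIKES_PER_MONTH:
                relay["flags"].add("BadExit")   # only now would the operator have to contact the tor project
        return relay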
Scott Bennett, Comm. ASMELG, CFIAG ********************************************************************** * Internet: bennett at sdf.org *xor* bennett at freeshell.org * *--------------------------------------------------------------------* * "A well regulated and disciplined militia, is at all times a good * * objection to the introduction of that bane of all free governments * * -- a standing army." * * -- Gov. John Hancock, New York Journal, 28 January 1790 * **********************************************************************
Scott Bennett:
Your proposed method of delaying the problem would impose a labor burden on the tor project as well
If we assume that malicious relay activity is impacted, I'd assume that the time saved by the proposal might well outweigh the time spent on bad-relays@.
After implementation the proposal does not require resources from The Torproject besides publishing the registry.
Why would an automated solution not work?
I believe the email verification can be automated completely. The mailing of letters can also be automated, but if - let's say 10 - letters per year are sent, I'm not sure it is worth it.
That would be a fast reaction and would not depend upon multiple human actions.
There is no human interaction involved in the proposal to enforce a cap. The cap would be "on by default" and lifted after verification is passed.
You might also implement a "repeat offender" policy, whereby if the authorities lifted a relay's Exit flag more than n times within a month, a BadExit flag would be applied in addition, which then (and only then) would require the operator to contact the tor project about it.
Malicious actors usually come back with new relays (new keys, new IPs) after they got caught.
Hi nusenu
Thank you for your encouraging efforts to keep things safe.
Am 05.07.2020 um 18:35 schrieb nusenu:
To prevent this from happening over and over again I'm proposing two simple but to some extent effective relay requirements to make malicious relay operations more expensive, time-consuming, less sustainable and more risky for such actors
Is the issue real or not? Any answer to that question does not contradict a substantial method. Right, the proposed measure is not aimed at sneak-in attackers, but it buys time to detect and tackle sudden issues. Let's move forward. I hear you.
-- Cheers, Felix
I have nothing against this proposal, although I'm not sure it would be that effective. In particular, how does it make relay operations 'less sustainable' or 'more risky'?
@Imre Jonk: why would you want - and why should you have - a higher probability? It sounds to me like the ideal case is an infinite number of independent exits, each with an almost-zero probability.
C
Charly Ghislain:
I have nothing against this proposal, although I'm not sure it would be that effective. In particular, how does it make relay operations 'less sustainable' or 'more risky'?
I assume you mean "make _malicious_ relay operations 'less sustainable' ..".
It would be less sustainable because they would have to run relays for longer before they can start exploiting tor users. And "more risky" because physical letter delivery requires them to pay someone and hiding money trails is usually harder than hiding the origin of someone creating a random email address.
I would not be very happy to be required to give away personal identifying information even if it's a "trusted entity".
Even if Tor is focused on offering anonymity to its users and not necessarily to its relay operators a move towards this by an organisation that supports privacy wherever they can would seem like a strange idea to me.
I remember that I suggested the idea of an email-based verification in one of my previous emails, and I still think that would not be a bad idea. But legally it might make a difference whether the torproject (or maybe even each of the authority operators itself) only has the ability to reject relays, or whether they have the ability to decide who is allowed to join.
Your blog post mentions that you found relays doing something odd that the official Tor software is not able to do.
So if that's the only reason for this whole email, then I don't see a reason for any of this, because relays "doing something odd" should not be able to be part of the Tor network.
In your blog post you talk about malicious relays, which as far as I understand are relays that are in a position to perform end-to-end correlation attacks.
Are these attacks really something to worry about in real life, or is it just a fear because they are theoretically possible?
I mean, even if a malicious entity is adding relays, how big is the risk in real life that they can hurt you when they at least cannot "do something odd" anymore?
Onion services, for example, are an answer to that.
Generally I think it's a rather strange move that the opinion has changed to "let's cripple operators who want to contribute more" instead of trying to encourage more people to run relays so that the percentage of the other operators decreases by itself.
The number of Tor relays seems to have reached its peak (for now, maybe), so I think crippling the existing operators is not necessarily the best way to go, and when someone wants to add more relays they should be able to do that without introducing a two-class system of verified and unverified operators.
On Sun, Jul 05, 2020 at 06:35:32PM +0200, nusenu wrote:
To prevent this from happening over and over again I'm proposing two simple but to some extend effective relay requirements to make malicious relay operations more expensive, time consuming, less sustainable and more risky for such actors:
a) require a verified email address for the exit or guard relay flag. (automated verification, many relays)
b) require a verified physical address for large operators (>=0.5% exit or guard probability) (manual verification, low number of operators).
Thanks Nusenu!
I like the general goals here.
I've written up what I think would be a useful building block: https://gitlab.torproject.org/tpo/metrics/relay-search/-/issues/40001
------------------------------------------------------------------------
Three highlights from that ticket that tie into this thread:
(A) Limiting each "unverified" relay family to 0.5% doesn't by itself limit the total fraction of the network that's unverified. I see a lot of merit in another option, where the total (global, network-wide) influence from relays we don't "know" is limited to some fraction, like 50% or 25%.
(B) I don't know what you have in mind with verifying a physical address (somebody goes there in person? somebody sends a postal letter and waits for a response?), but I think it's trying to be a proxy for verifying that we trust the relay operator, and I think we should brainstorm more options for achieving this trust. In particular, I think "humans knowing humans" could provide a stronger foundation.
More generally, I think we need to very carefully consider the extra steps we require from relay operators (plus the work they imply for ourselves), and what security we get from them. Is verifying that each relay corresponds to some email address worth the higher barrier in being a relay operator? Are there other approaches that achieve a better balance? The internet has a lot of experience now on sybil-resistance ideas, especially on ones that center around proving online resources (and it's mostly not good news).
(C) Whichever mechanism(s) we pick for assigning trust to relays, one gap that's been bothering me lately is that we lack the tools for tracking and visualizing which relays we trust, especially over time, and especially with the amount of network churn that the Tor network sees. It would be great to have an easier tool where each of us could assess the overall network by whichever "trust" mechanisms we pick -- and then armed with that better intuition, we could pick the ones that are most ready for use now and use them to influence network weights.
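(To make point (A) a bit more concrete, a back-of-the-envelope sketch of capping the "unknown" share of the total selection weight; purely illustrative, a real version would have to fit into the existing consensus-weight and bandwidth-weight machinery:)

    def cap_unknown(weights, known, max_unknown_fraction=0.25):
        # weights: {fingerprint: consensus weight}, known: set of "known" fingerprints
        total = sum(weights.values())
        unknown_total = sum(w for fp, w in weights.items() if fp not in known)
        if total == 0 or unknown_total / total <= max_unknown_fraction:
            return dict(weights)  # already within the allowed share
        # scale unknown relays down so they end up at exactly the allowed share
        known_total = total - unknown_total
        allowed_unknown = known_total * max_unknown_fraction / (1 - max_unknown_fraction)
        scale = allowed_unknown / unknown_total
        return {fp: (w if fp in known else w * scale) for fp, w in weights.items()}

    # example: three known relays plus a large unknown family; the unknown part
    # ends up weighted at 25% of the total instead of ~77%
    w = {"K1": 100, "K2": 100, "K3": 100, "U1": 500, "U2": 500}
    print(cap_unknown(w, known={"K1", "K2", "K3"}))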
------------------------------------------------------------------------
At the same time, we need to take other approaches to reduce the impact of and incentives for having evil relays in the network. For example:
(1) We need to finish getting rid of v2 onion services, so we stop the stupid arms race with threat intelligence companies who run relays in order to get the HSDir flag in order to scrape legacy onion addresses.
(2) We need to get rid of http and other unauthenticated internet protocols: I've rebooted this ticket: https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/19850 with a suggestion of essentially disabling http connections when the security slider is set to 'safer' or 'safest', to see if that's usable enough to eventually make it the default in Tor Browser.
(3) We need bandwidth measuring techniques that are more robust and harder to game, e.g. the design outlined in FlashFlow: https://arxiv.org/abs/2004.09583
--Roger
I've written up what I think would be a useful building block: https://gitlab.torproject.org/tpo/metrics/relay-search/-/issues/40001
thanks, I'll reply here since I (and probably others) can not reply there.
Three highlights from that ticket that tie into this thread:
(A) Limiting each "unverified" relay family to 0.5% doesn't by itself limit the total fraction of the network that's unverified. I see a lot of merit in another option, where the total (global, network-wide) influence from relays we don't "know" is limited to some fraction, like 50% or 25%.
I like it (it is even stricter than what I proposed). You are basically saying the "known" pool should always control a fixed (or minimal?) portion - let's say 75% - of the entire network no matter what capacity the "unknown" pool has, but it doesn't address the key question: How do you specifically define "known" and how do you verify entities before you move them to the "known" pool?
(B) I don't know what you have in mind with verifying a physical address (somebody goes there in person? somebody sends a postal letter and waits for a response?)
The process is outlined at the bottom of my first email in this thread (short: a random challenge sent to an address in a letter which is returned via email).
but I think it's trying to be a proxy for verifying that we trust the relay operator,
"trust" is a strong word. I wouldn't call them 'trusted' just because they demonstrated their ability to pay someone to scan letters send to a physical address.
I would describe it more as a proxy for "less likely to be a random opportunistic attacker exploiting tor users with zero risks for themselves".
and I think we should brainstorm more options for achieving this trust. In particular, I think "humans knowing humans" could provide a stronger foundation.
I'm all ears for better options, but at some point I'd like to see some actual improvement in practice.
I would hate to be in the same situation a year from now because we are still discussing the perfect solution.
More generally, I think we need to very carefully consider the extra steps we require from relay operators (plus the work they imply for ourselves), and what security we get from them.
I agree.
(C) Whichever mechanism(s) we pick for assigning trust to relays, one gap that's been bothering me lately is that we lack the tools for tracking and visualizing which relays we trust, especially over time,
and especially with the amount of network churn that the Tor network sees. It would be great to have an easier tool where each of us could assess the overall network by whichever "trust" mechanisms we pick -- and then armed with that better intuition, we could pick the ones that are most ready for use now and use them to influence network weights.
reminds me of an atlas feature request for family level graphs https://trac.torproject.org/projects/tor/ticket/23509 https://lists.torproject.org/pipermail/tor-relays/2017-September/012942.html
I'm generating some timeseries graphs now to see what exit fraction (stacked) is managed over time (past 6 months) by https://torservers.net/partners.html and those mentioned at the bottom of https://lists.torproject.org/pipermail/tor-relays/2020-January/018022.html plus some custom additions for operators I have had contact with before. Spoiler: it used to be >50% until some malicious actor came along and reduced it to <50%.
Seeing their usual fraction over time can be used as an input when deciding what fixed fraction should always be managed by them.
At the same time, we need to take other approaches to reduce the impact and incentives for having evil relays in the network. For examples:
(1) We need to finish getting rid of v2 onion services, so we stop the stupid arms race with threat intelligence companies who run relays in order to get the HSDir flag in order to scrape legacy onion addresses.
outlined, planned and announced (great): https://blog.torproject.org/v2-deprecation-timeline
(2) We need to get rid of http and other unauthenticated internet protocols:
This is something browser vendors will tackle for us I hope, but it will not be anytime soon.
kind regards, nusenu
On Tue, Jul 07, 2020 at 01:01:12AM +0200, nusenu wrote:
https://gitlab.torproject.org/tpo/metrics/relay-search/-/issues/40001
thanks, I'll reply here since I (and probably others) can not reply there.
Fwiw, anybody who wants a gitlab account should just ask for one. Don't be shy. :)
The instructions for asking are here: https://gitlab.torproject.org/users/sign_in
(A) Limiting each "unverified" relay family to 0.5% doesn't by itself limit the total fraction of the network that's unverified. I see a lot of merit in another option, where the total (global, network-wide) influence from relays we don't "know" is limited to some fraction, like 50% or 25%.
I like it (it is even stricter than what I proposed). You are basically saying the "known" pool should always control a fixed (or minimal?) portion - let's say 75% - of the entire network no matter what capacity the "unknown" pool has
Right.
but it doesn't address the key question: How do you specifically define "known" and how do you verify entities before you move them to the "known" pool?
Well, the first answer is that these are two separate mechanisms, which we can consider almost independently:
* One is dividing the network into known and unknown relays, where we reserve some minimum fraction of attention for the known relays. Here the next steps are to figure out how to do load balancing properly with this new parameter (mainly a math problem), and to sort out the logistics for how to label the known relays so directory authorities can assign weights properly (mainly coding / operator ux).
* Two is the process we use for deciding if a relay counts as known. My suggested first version here is that we put together a small team of Tor core contributors to pool their knowledge about which relay operators we've met in person or otherwise have a known social relationship with.
One nice property of "do we know you" over "do you respond to mail at a physical address" is that the thing you're proving matters into the future too. We meet people at relay operator meetups at CCC and Fosdem and Tor dev meetings, and many of them are connected to their own local hacker scenes or other local communities. Or said another way, burning your "was I able to answer a letter at this fake address" effort is a different tradeoff than burning your "was I able to convince a bunch of people in my local and/or international communities that I mean well?"
I am thinking back to various informal meetings over the years at C-base, Hacking At Random, Defcon, etc. The "social connectivity" bond is definitely not perfect, but I think it is the best tool available to us, and it provides some better robustness properties compared to more faceless "proof of effort" approaches.
That said, on the surface it sure seems to limit the diversity we can get in the network: people we haven't met in Russia or Mongolia or wherever can still (eventually, postal service issues aside) answer a postal letter, whereas it is harder for them to attend a CCC meetup. But I think the answer there is that we do have a pretty good social fabric around the world, e.g. with connections to OTF fellows, the communities that OONI has been building, etc, so for many places around the world, we can ask people we know there for input.
And it is valuable for other reasons to build and strengthen these community connections -- so the incentives align.
Here the next step is to figure out the workflow for annotating relays. I had originally imagined some sort of web-based UI where it leads me through constructing and maintaining a list of fingerprints that I have annotated as 'known' and a list annotated as 'unknown', and it shows me how my lists have been doing over time, and presents me with new not-yet-annotated relays.
But maybe a set of scripts, that I run locally, is almost as good and much simpler to put together. Especially since, at least at first, we are talking about a system that has on the order of ten users.
One of the central functions in those scripts would be to sort the annotated relays by network impact (some function of consensus weight, bandwidth carried, time in network, etc), so it's easy to identify the not-yet-annotated ones that will mean the biggest shifts. Maybe this ordered list is something we can teach onionoo to output, and then all the local scripts need to do is go through each relay in the onionoo list, look them up in the local annotations list to see if they're already annotated, and present the user with the unannotated ones.
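A bare-bones local version of that loop could look like the following sketch (the annotations file format is made up; the sorting is delegated to Onionoo's "order" parameter):

    import requests

    def load_annotations(path="annotations.txt"):
        # one "FINGERPRINT known|unknown" pair per line (made-up local format)
        annotations = {}
        try:
            with open(path) as f:
                for line in f:
                    fingerprint, label = line.split()
                    annotations[fingerprint] = label
        except FileNotFoundError:
            pass
        return annotations

    annotations = load_annotations()
    relays = requests.get("https://onionoo.torproject.org/details",
                          params={"running": "true", "order": "-consensus_weight",
                                  "fields": "fingerprint,nickname,consensus_weight_fraction,contact"}
                          ).json()["relays"]

    # show the highest-impact relays that are not annotated yet
    for relay in relays:
        if relay["fingerprint"] not in annotations:
            print("%s %-19s %6.3f%%  %s" % (relay["fingerprint"], relay.get("nickname", ""),
                                            100 * relay.get("consensus_weight_fraction", 0),
                                            relay.get("contact", "")))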
To avoid centralizing too far, I could imagine some process that gathers the current annotations from the several people who are maintaining them, and aggregates them somehow. The simplest version of aggregation is "any relay that anybody in the group knows counts as known", but we could imagine more complex algorithms too.
And lastly, above I said we can consider the two mechanisms "almost independently" -- the big overlap point is that we need to better understand what fraction of the network we are considering "known", and make sure to not screw up the load balancing / performance of the network too much.
(2) We need to get rid of http and other unauthenticated internet protocols:
This is something browser vendors will tackle for us I hope, but it will not be anytime soon.
Well, we could potentially tackle it sooner than the mainstream browser vendors. See https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/19850#no... where maybe (I'm not sure, but maybe) https-everywhere has a lot of the development work already done.
--Roger
On 7/8/20 12:35 PM, Roger Dingledine wrote:
- One is dividing the network into known and unknown relays, where we
reserve some minimum fraction of attention for the known relays. Here the next steps are to figure out how to do load balancing properly with this new parameter (mainly a math problem), and to sort out the logistics for how to label the known relays so directory authorities can assign weights properly (mainly coding / operator ux).
Which boils down to "subjective" versus "objective" criteria for a weight distribution algorithm.
Which is fine as long as the maths behind it is publicly available and understandable.
This would enable a relay operator to verify/falsify it.
On Wed, Jul 08, 2020 at 05:22:44PM +0200, Toralf Förster wrote:
On 7/8/20 12:35 PM, Roger Dingledine wrote:
- One is dividing the network into known and unknown relays, where we
reserve some minimum fraction of attention for the known relays. Here the next steps are to figure out how to do load balancing properly with this new parameter (mainly a math problem), and to sort out the logistics for how to label the known relays so directory authorities can assign weights properly (mainly coding / operator ux).
Which boils down to "subjective" versus "objective" criteria for a weight distribution algorithm.
Which is fine as long as the maths behind it is publicly available and understandable.
This would enable a relay operator to verify/falsify it.
I'm pretty sure the maths behind it will be public. For the "understandable" part, I guess there can be a double explanation, one for the mathematicians/CS that know/understand the maths, and a popularisation of this explanation for those who don't understand the maths.
Cheers,
Roger Dingledine:
but it doesn't address the key question: How do you specifically define "known" and how do you verify entities before you move them to the "known" pool?
Well, the first answer is that these are two separate mechanisms, which we can consider almost independently:
- One is dividing the network into known and unknown relays, where we
reserve some minimum fraction of attention for the known relays. Here the next steps are to figure out how to do load balancing properly with this new parameter (mainly a math problem), and to sort out the logistics for how to label the known relays so directory authorities can assign weights properly (mainly coding / operator ux).
- Two is the process we use for deciding if a relay counts as known. My
suggested first version here is that we put together a small team of Tor core contributors to pool their knowledge about which relay operators we've met in person or otherwise have a known social relationship with.
What does the verification process to become "known" look like? Tor core people handing out printed tokens to people who are able to attend one of your preferred conferences, or what do you have in mind specifically?
Here the next step is to figure out the workflow for annotating relays. I had originally imagined some sort of web-based UI where it leads me through constructing and maintaining a list of fingerprints that I have annotated as 'known' and a list annotated as 'unknown', and it shows me how my lists have been doing over time, and presents me with new not-yet-annotated relays.
Let's annotate on a family (and not relay) level.
If we had verified contacts, we could avoid MyFamily and use the verified contact only.
As a starting point you can use the family listings on this page: https://nusenu.github.io/OrNetStats/
One of the central functions in those scripts would be to sort the annotated relays by network impact
OrNetRadar provides family lists sorted by consensus weight fraction, guard probability and exit probability. These lists also contain the date when the family joined.
You can even directly spot the likely malicious exits that are currently recovering from the last attempt to get rid of them, especially since you know the specific date when the dir auths' last attempt to get rid of them was.
regards, nusenu
Hi all,
On Mon, Jul 06, 2020 at 07:07:19AM -0400, Roger Dingledine wrote:
Three highlights from that ticket that tie into this thread:
(A) Limiting each "unverified" relay family to 0.5% doesn't by itself limit the total fraction of the network that's unverified. I see a lot of merit in another option, where the total (global, network-wide) influence from relays we don't "know" is limited to some fraction, like 50% or 25%.
That's a great idea, but how do you say you "know" a relay? And in this case, I guess that this number should stay low. Here we have a case of 20% exit probability by this group of nodes, which is already huge. And we don't necessarily know whether they also run a part of the entry nodes.
(B) I don't know what you have in mind with verifying a physical address (somebody goes there in person? somebody sends a postal letter and waits for a response?), but I think it's trying to be a proxy for verifying that we trust the relay operator, and I think we should brainstorm more options for achieving this trust. In particular, I think "humans knowing humans" could provide a stronger foundation.
More generally, I think we need to very carefully consider the extra steps we require from relay operators (plus the work they imply for ourselves), and what security we get from them. Is verifying that each relay corresponds to some email address worth the higher barrier in being a relay operator? Are there other approaches that achieve a better balance? The internet has a lot of experience now on sybil-resistance ideas, especially on ones that center around proving online resources (and it's mostly not good news).
Two points here:
- We should give a read to the sybil-resistance literature to see what exists and how it could be adapted to Tor. I know that a lot of work has already been done, but maybe some extended defenses are required at this point.
- A suggestion would be to build a web of trust between relay operators, using, I don't know, PGP or something like this, and organize signing parties at hackers/Torproject/dev events. For example, I'm at FOSDEM in Brussels every year, and I attend the Tor birds-of-a-feather session when it exists. However, it would prevent people who want to keep complete anonymity from contributing to Tor, which is a bad point. Maybe this lights something up in your brains?
(C) Whichever mechanism(s) we pick for assigning trust to relays, one gap that's been bothering me lately is that we lack the tools for tracking and visualizing which relays we trust, especially over time, and especially with the amount of network churn that the Tor network sees. It would be great to have an easier tool where each of us could assess the overall network by whichever "trust" mechanisms we pick -- and then armed with that better intuition, we could pick the ones that are most ready for use now and use them to influence network weights.
At the same time, we need to take other approaches to reduce the impact and incentives for having evil relays in the network. For examples:
(1) We need to finish getting rid of v2 onion services, so we stop the stupid arms race with threat intelligence companies who run relays in order to get the HSDir flag in order to scrape legacy onion addresses.
Good point. Do you think speeding up the process is possible? The deadline is more than one year from now, which seems a pretty long time. Or maybe it is to synchronize with new versions of the Linux/BSD distributions?
(2) We need to get rid of http and other unauthenticated internet protocols: I've rebooted this ticket: https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/19850 with a suggestion of essentially disabling http connections when the security slider is set to 'safer' or 'safest', to see if that's usable enough to eventually make it the default in Tor Browser.
+1, nothing more to say.
(3) We need bandwidth measuring techniques that are more robust and harder to game, e.g. the design outlined in FlashFlow: https://arxiv.org/abs/2004.09583
I have seen that there is a proposal, and a thread on tor-dev that died in April (lockdown maybe?); maybe we should restart the discussions around this technique?
Hi,
Great to see that the Tor network can lose ~20% capacity without any impact on the performance of the whole network. So can we get rid of the non-dual-stack relays now too?
Hi,
Overall supportive of ways to manage bad relays. I switched to bridges because the MyFamily nonsense is too much of a burden to maintain (even with the hacks available on the tor wiki). If you can detect the "bad relays", why not simply flag them and move on? I read this to mean that nusenu and tor have a different definition of "bad relay" and those with the ability to ban are not as ready to destroy 20% of exit capacity.
A few concerns about the proposed plans. Putting a validated email address in a public field is a concern: it becomes trivial to scrape the address and spam the relay operator. This is already a problem for me (2,500 spam emails in the past week). Potentially, someone is targeting relay operator contact info. I only use this email for Tor relays (and posting to this list). I believe this is my first post to the list, so the only ways someone could find my address are from my public relay, or if Tor's mailing list system is compromised. Alternatively, Riseup is compromised, because "bad-relays@riseup.net" emailed the address from my public relay contact info field.
Requiring PGP/GPG is silly. It is a failed system and is easily exploited to find all connections in a social network map. Even the US EFF wants you to stop using it [1]. The system was exploitable for a decade before users noticed. One can be sure governments exploited this heavily.
Physical address verification is unacceptable. Not only would Tor possibly know a mailing address, some third-party organization would also know it (Riseup, CCC, DFRI). Under the GDPR, I would want to know their data handling practices and then ask them to remove any of my data. With this scenario, we are all a single legal request away from a government agency having all of this data. I understand the USA and EU abuse this system constantly with secret requests. Police and intelligence agencies already have thousands of idle shelf companies waiting to be used. All this requirement does is kick out private citizens and hand the Tor network to large entities.
This returns me to the original question: if "bad relays" are already detected, then why not simply enforce bans against those relays? You are already actively managing the capacity of the network by kicking out relays running Tor releases deemed too old or bad in some way.
1. https://www.eff.org/deeplinks/2018/05/attention-pgp-users-new-vulnerabilitie...
On 09.07.2020 00:20, Jonas wrote:
If you can detect the "bad relays", why not simply flag them and move on?
I agree with you about publicizing bad relays and blocking them faster. Personally, I blocked some exits in my Tor Browser, e.g. these expensive high-bandwidth relays (unnamed & without mail contact): https://metrics.torproject.org/rs.html#toprelays
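For anyone who wants to do the same, a minimal sketch of what such client-side blocking can look like in the torrc that Tor Browser uses (the fingerprints below are placeholders, not the relays linked above):

    # never build circuits that exit through these relays
    ExcludeExitNodes $FP_BADEXIT1,$FP_BADEXIT2
    # uncomment to treat the exclusion as a hard requirement
    # StrictNodes 1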
A few concerns about the proposed plans. Putting a validated email address in a public field is a concern. It becomes trivial to scrape the address and spam the relay operator. Personally, this is a problem for now (2,500 spam emails in the past week).
However, the validation email address only needs to be available for a short time. Many providers require that you have an abuse address for an exit server anyway. My email address is not obfuscated and I hardly get any spam. And when I do get some, I will change it. ;-) https://metrics.torproject.org/rs.html#search/TorOrDie4privacyNET (greylisting, amavisd & SpamAssassin can help)
Require PGP/GPG is silly. It is a failed system and is easily exploited to find all connections in a social network map. Even the US EFF wants you to stop using it[1]. The system was exploitable for a decade before users noticed.
PGP/GPG should be used here for verification, not for encryption. Every Debian or GitHub package is GPG signed.
With this scenario, we are all a single legal request away from a government agency having all of this data. I understand the USA and EU abuses this system constantly with secret requests. Police and intelligence agencies already have thousands of idle shelf companies waiting to be used.
I am sure that they already have direct access to WHOIS address owner data. And they will have had the customer address lists of the large providers (Hetzner, OVH and Online S.a.s.) for a long time. Old rule: 'follow the money'. Anyone who does not use Monero to pay for their servers at a provider is known to them. Combating terrorism and child pornography makes it possible. They don't have to come to the Tor Project office with a legal request ;-)
The Tor Project has had my address and bank details for a long time. The people from CCC Cologne know where I live anyway. Ah, and niftybunny too.
There seems to be a consensus toward building a web of trust. Thinking about it again, I don't like much the direction it is going.
I actually see Tor as a web of untrust. I never much appreciated the power already granted to the directory authorities. I want to be able to use any relay (I choose) as guard or exit easily (at the operator's discretion), but currently, unless I'm mistaken, I need to wait for those authorities to flag them as appropriate.
Some of this power makes sense at the network level, to balance traffic between relays and decrease the probability of bad actors obtaining meaningful data, but other uses, like the recent ban initiated by nusenu, sound like abuse to me. His proposal moves further in that direction, in my opinion.
To be clear, I rely on him and others monitoring the network for bad actors and I believe they made the right move when kicking them off.
However, I think it would be preferable to keep the design at the network level as open as possible. Anything trying to build a web of trust should be completely separate, for instance as published whitelists and blacklists. Authorities flagging relays with verified email or physical addresses could publish their lists, and these could be used by clients in the default configuration. But no single relay - however bad someone thinks it is - should be kicked off the network by the network itself. Especially not on the basis of individual human decisions.
There are a lot of other ways to mitigate sybil attacks, and contrary to the blog post's statement that Tor can only handle some malicious relays, I believe the design allows for a network entirely powered by malicious relays, provided they belong to different actors and are sufficiently distributed among them. I personally would not trust any relay I'm not operating directly.
Isn't it a good time to move more decisions to the clients, like choosing between speed and randomness, agreeing on the blacklists/whitelists of some authority, etc.? I'm sorry if I missed some obvious goals of the project or am bringing up previously discussed options.
c
On 7/12/20 2:40 PM, Charly Ghislain wrote:
There seems to be a consensus toward building a web of trust. Thinking about it again, I don't like much the direction it is going.
+1
A web of trust does not mean that everyone has to trust a central instance.
Similar to PGP, where nobody relies on the key servers but on their own keyring.
On 12.07.2020 14:40, Charly Ghislain wrote:
However I think it would be preferable to keep as much as possible the open design at the network level. Anything trying to build a web of trust should be completely separate, for instance published white and blacklists. Authorities flagging relays with verified email or physical addresses could publish their lists, and this could be used by the clients with the default configuration. But no single relay - however bad someone thinks it is - should be kicked off the network by the network itself. Especially not on the basis of individual human decisions.
+1
Currently each user tinkers with their own whitelist & blacklist. ;-) Long lists of EntryNodes, ExitNodes, ExcludeNodes & ExcludeExitNodes are posted in Bitcoin forums.
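As a sketch of what such a hand-maintained torrc list typically looks like (the fingerprints and country codes below are placeholders, not recommendations):

    # pin entries and exits to hand-picked relays or countries
    EntryNodes $FP_TRUSTED_GUARD1,$FP_TRUSTED_GUARD2
    ExitNodes {de},{nl}
    # and keep a personal blacklist
    ExcludeNodes $FP_SUSPECT1
    ExcludeExitNodes $FP_SUSPECT2
    # StrictNodes 1   # uncomment to make the Exclude* lists hard requirements

Note that pinning nodes like this trades away some of the anonymity that comes from Tor's default path selection, which is part of why it is left to individual clients rather than the network.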
By the way: your relay can be an exit from day 1, but it only gets the Guard flag after two weeks or later, if the authorities assign it at all. ;-)
Hi list,
First of all, why is email verification not used as an additional method for setting family relationships? (Additional meaning that operators could choose to either verify their mail address or write all their relays' fingerprints in the config.) Wouldn't it be more convenient to verify the address once and then automatically add every relay with the same contact to that family? And maybe notify the operators so that they can declare new relays "not theirs" in case somebody else pretends to be that operator.
Secondly, how about adding something to a relay's configuration that reveals which earlier configurations it was copied from, in order to make bad relay detection easier?
Imagine the guide for running secure relays changed regularly in a way that makes the 'version' of the torrc distinguishable by looking at the settings. This could easily be achieved by adding a varying extra field to the settings. The value of the field is not important, but the name of the field should change in an unpredictable way. Somebody would keep a chronology of the old suggested configurations. This way it can be used like non-coding DNA in maternity/paternity testing.
The desired effect: if a good operator sets up a new relay for the very first time, they read the guide and copy this extra line, which adds very little effort. If a good operator sets up many relays, they can automate that and always use the same settings, because a good operator doesn't need to hide that they are running multiple relays. (This does not replace family settings.) If a bad operator sets up many relays over a long period of time, they cannot automate this process, as the relays would stand out by using outdated recommendations that a new operator would not have found online, and that can be linked to the same operator's other relays.
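As a purely hypothetical illustration (the marker name and its placement are my assumption, not part of the proposal: Tor generally rejects unknown torrc options, so the marker would have to live in an already-published field such as ContactInfo, with its value rather than an option name changing over time):

    # torrc fragment copied from the hypothetical relay guide of calendar week 28, 2020
    ContactInfo tor-admin@example.org guide-marker-2020w28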
The extra information should not de-anonymize relay operators, because the time when they set up their relay is usually available through the 'first seen' field anyway, and family settings are publicly available too.
The obvious difficulty is that, for this to work, the Tor relay guide containing the changing information would have to be so well-written that the majority of new relay operators always choose to follow it rather than other guides that don't change. And it may require different error handling.
Anyway this example should only illustrate the basic concept. I don't know if that option has been discussed so far.