(Going back a bit in this thread...)

On 7 Dec 2017, at 19:56, Scott Bennett <bennett@sdf.org> wrote:

teor <teor2345@gmail.com> wrote:

On 6 Dec 2017, at 19:13, Scott Bennett <bennett@sdf.org> wrote:

(Snip)
   A script similar to the one used to reveal and make available the
otherwise unidentified source IP addresses of exits could be run by the
project to gather the hidden addresses of currently running relays, and
a list of such addresses could be made available on a compromise basis,
e.g., by having a relay at the project that would serve those lists only
over tunneled directory connections *from relays*, were it not for obstinacy.
Such a list could then be included into our packet filters' "free pass"
lists without putting the list up on the project's web site like the exit
list is.

Outbound addresses aren't secret, because they are used for connections.

    Roger has claimed here that some of them are indeed secret in the sense
that their owners do *not* want them to be published

Then maybe you should respect their wishes?

Or provide a compelling argument in a proposal that the Tor network and
Tor users will be improved by your proposal.

one possible reason for
which being that they do not want their relays blocked successfully by
governments, e.g., China, Iran.  (How hiding the source address of a published
relay would evade the Great Firewall escapes me somehow, except perhaps for
hidden services based inside China that might be reached via those hidden
source addresses.

That isn't how hidden services work: they connect out to guards.
Only exit and relay to relay connections use relay outbound addresses.

Given that most source addresses of relays *are* published,
the chances of getting a circuit into China seem rather slim anyway.)

That's not how relays work, either: they need mutual connectivity.

We typically ask users to configure their clients with bridges or meek in
these countries.

So this could only ever have been about exit connections to websites
in China. And that doesn't make too much sense to me.

(Snip)

Anyone is free to volunteer to create and maintain a list of outbound
relay addresses. It is technically feasible: it requires a few thousand Tor
connections per day, one via each relay, to a relay that reports the

    Ideally, per hour, but that is why it should only be done by one site.

Scanning every hour increases the cost to the network significantly.
Do you really think relays change their outbound addresses that often?

I would also encourage you to work out how much relay capacity this costs
the network, and develop a plan to provide at least that much.
A cost-benefit analysis would be an important part of the proposal.

If you don't want to do this, use 24 hours, which is the exitmap interval.
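
As a starting point for that cost-benefit analysis, here is a rough
back-of-the-envelope sketch comparing hourly and daily scans. Every number
in it is a placeholder assumption, not a measurement: the relay count, the
bytes per scan circuit, and the 3-hop circuit length all need to be replaced
with real values before drawing any conclusion.

```python
def scan_cost_bytes_per_day(relay_count, scans_per_day, bytes_per_scan, hops=3):
    """Rough bytes relayed per day across the network by the scanner.

    Every hop on a circuit carries that circuit's traffic, so the cost
    to the network is multiplied by the circuit length.
    """
    return relay_count * scans_per_day * bytes_per_scan * hops


# Hypothetical inputs: replace these with measured values.
RELAYS = 6000            # assumed number of relays to scan
BYTES_PER_SCAN = 20_000  # assumed bytes per scan circuit, including overhead

hourly = scan_cost_bytes_per_day(RELAYS, 24, BYTES_PER_SCAN)
daily = scan_cost_bytes_per_day(RELAYS, 1, BYTES_PER_SCAN)

print(f"hourly scans: {hourly / 1e6:.0f} MB/day")
print(f"daily scans:  {daily / 1e6:.0f} MB/day")
print(f"hourly costs {hourly // daily}x as much as daily")
```

Whatever the real inputs turn out to be, the hourly schedule costs 24 times
the daily one, so the proposal should say what that extra load buys.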

Note that Exit relays might be skippable because their outbound addresses
are already identified by one site, namely, the tor project, and published.
IOW, only entry/middle and non-Exit exit nodes need be tested, which would
shrink the list by several hundred to a thousand or so.

There are separate OutboundBindAddressOR and OutboundBindAddressExit
torrc options. This means that connections to relays and connections to
websites may come from different addresses.
(My relays are set up like this - it allows the provider to null-route exit DoS
attacks, with less disruption to users.)

Here's an alternative scheme:

Run a relay that attempts to maintain an inbound connection from each
other relay.
Each hour, report the last address seen on the last inbound connection
from each other relay, whether that connection is still active or not.
Each hour, try to re-establish inbound connections from each other relay.

Here's why this works:

As long as the original TCP connection stays up, the relay has the same
outbound address.
Whenever the connection breaks, it is possible that the outbound address
has changed.
Rate-limiting reconnection attempts decreases network load.
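
The bookkeeping for this scheme can be sketched as follows. This is only an
illustration, not Tor code: the class and method names are hypothetical, and
it assumes something else hands us the authenticated fingerprint and source
address of each inbound relay connection.

```python
class OutboundAddressTracker:
    """Remember only the most recent address seen per relay fingerprint."""

    def __init__(self):
        # fingerprint -> last address seen; deliberately no timestamps
        self._last_seen = {}

    def on_inbound_connection(self, fingerprint, address):
        # A new connection may come from a new outbound address,
        # so overwrite whatever was recorded before.
        self._last_seen[fingerprint] = address

    def relays_to_reprobe(self, all_fingerprints, open_fingerprints):
        # Only relays without a live inbound connection need a
        # reconnection attempt in the next hourly pass.
        return set(all_fingerprints) - set(open_fingerprints)

    def hourly_report(self):
        # Report addresses only, as an unordered set, whether or not
        # the connection they were seen on is still open.
        return set(self._last_seen.values())


tracker = OutboundAddressTracker()
tracker.on_inbound_connection("A" * 40, "192.0.2.10")
tracker.on_inbound_connection("B" * 40, "198.51.100.7")
tracker.on_inbound_connection("A" * 40, "192.0.2.11")  # reconnected
print(tracker.hourly_report())
print(tracker.relays_to_reprobe(["A" * 40, "B" * 40], ["B" * 40]))
```

Keeping a single entry per relay means the report never reveals how often
or in what sequence relays connected.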

(Snip)

It just needs to be done safely, in a way that doesn't collect client
addresses, and avoids attaching a timestamp or order to relay connections.

    Why would client addresses ever be involved?  What would be gathered
are the addresses from which *relays* connect to other relays (N.B. *not* to
destinations).

You could easily collect and publish client and bridge addresses if you run a
relay and dump all inbound connections. You need to understand the
difference between inbound client, bridge, and relay connections.

Relays authenticate; clients and bridges don't.

It's not sufficient to rely on (not having) the Guard flag, because:
* bridge clients use the bridge as their Guard
* most current Tor versions have a bug that assigns a non-zero probability
  to excluded relay flags
* some clients don't use Guards
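
In other words, the filter has to key on authentication, not on flags. A
minimal sketch, assuming a hypothetical InboundConnection record that says
whether the peer authenticated as a relay (Tor does not expose connections
in this form). The extra check against the consensus is my own defensive
assumption, not something Tor requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InboundConnection:
    address: str
    # Fingerprint the peer authenticated with, or None if it did not
    # authenticate (which is the case for clients and bridges).
    authenticated_fingerprint: Optional[str]


def relay_connections(connections, consensus_fingerprints):
    """Keep only connections whose peer authenticated as a consensus relay."""
    return [
        c for c in connections
        if c.authenticated_fingerprint is not None
        and c.authenticated_fingerprint in consensus_fingerprints
    ]


seen = [
    InboundConnection("192.0.2.10", "A" * 40),   # authenticated relay
    InboundConnection("203.0.113.5", None),      # client or bridge: drop
    InboundConnection("198.51.100.7", "C" * 40), # not in the consensus: drop
]
kept = relay_connections(seen, consensus_fingerprints={"A" * 40, "B" * 40})
print([c.address for c in kept])
```

Anything that fails the filter should be discarded immediately, not
logged and filtered later.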

The only timestamps that I see might be relevant would be
the starting and ending times for each script run, so that an administrator's
own script(s) for incorporating those addresses into his "free pass" list
might easily discern out-of-date script output files from current script
output files.

You will need to sort outputs to destroy order.
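
For example, de-duplicating and sorting by address value before writing the
file removes any trace of the order in which relays connected. A minimal
sketch (the function name is hypothetical):

```python
import ipaddress


def publishable_list(addresses):
    """Return unique addresses in an order that depends only on their values."""
    unique = {ipaddress.ip_address(a) for a in addresses}
    # IPv4 and IPv6 objects can't be compared directly, so sort by
    # (IP version, integer value).
    ordered = sorted(unique, key=lambda a: (a.version, int(a)))
    return [str(a) for a in ordered]


# Two different arrival orders give the same output.
print(publishable_list(["198.51.100.7", "192.0.2.10", "192.0.2.10"]))
print(publishable_list(["192.0.2.10", "198.51.100.7"]))
```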

(Snip)

Here's how someone could work on this feature:

Create a scanner, publish a list, and show that it has value.

    Because such a list would include addresses whose owners might not be
pleased about those addresses being published (see above), such a list
should *not* be published, but perhaps could be sent to someone in the tor
project.

I personally would not handle a list created against the wishes of a group of
relay operators.

Better still, the generating script could be sent to someone in
the tor project to enable the project to run the script, rather than
encouraging many relay operators all to duplicate the network load of
running it.

I don't know if anyone would expend resources on this.

Or start with a proposal, ask for advice, then create the scanner.

(Snip)

And try to have list downloads rely on existing Tor features, like onion
services. They'll be faster to deploy that way.

    AFAIK, tor has no such feature.  If a relay is to download nothing more
than a file of IP addresses, which feature are you suggesting will do that
upon demand by a relay (and only an identified relay)?

Tor relays submit signed descriptors.
But there is no feature in Tor that only answers signed requests.
That would require another proposal.

But relays' public keys are published, so you could create a server that
only accepts signed relay requests, and a client that signs requests using
relay private keys. Signing requests with relay private keys introduces
potential security holes, so I don't know how many operators would run a
client like this.
(A better design would be to include another signing key in relay
descriptors, cross-certify it, and use that for requests.)

Then you could run the server as an onion service, which provides
authentication of the server.

This looks like a lot of work, and cryptography is notoriously hard to get
right.

Yes, a relay can ask
for a directory download (and so can a client).  Yes, a relay can ask for a
directory update download (and so can a client).  Yes, a relay can ask for
ExtraInfo document downloads.  How does a relay ask to download a kind of
file that doesn't yet exist?  Is there already some undocumented, generic
feature by which an identified relay (but nothing else) can ask a directory
mirror or authority to "give me your latest version of file x"?
    If you mean that the downloading process could be spun off to a worker
thread, then yes, of course, it should be, but the actual implementation in
tor would be up to the tor developers, not to me.

We accept patches; that's how people become tor developers.

But I'm not sure if we would merge a patch over the objections of a group
of existing relay operators.

Here's a description of the proposal process:

https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt

I think we've reached the point in this discussion where we need to move to
something more structured than email.

If you want to make this happen, and are willing to put some work in, write a
proposal.

(Snip)

You will also see your
Fast and HSDir flags come and go at random, depending upon how many
authorities' testing circuits to your node(s) pass through a node that
connects from a hidden outbound address, which your filter then rejects.

If you set the connection limit at or above 512 connections per /24, it will
be impossible for well-behaved consensus relays to go above the limit:

2 relays per IPv4 * 256 IPv4 addresses per /24 = 512 connections
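
To check a filter against that figure, you could count current connections
per /24 and see whether any block exceeds it. A minimal sketch, IPv4 only;
the function names are hypothetical, and how you obtain the list of remote
addresses depends on your packet filter.

```python
import ipaddress
from collections import Counter

RELAYS_PER_IPV4 = 2          # from the calculation above
ADDRESSES_PER_SLASH24 = 256
LIMIT = RELAYS_PER_IPV4 * ADDRESSES_PER_SLASH24  # 512


def connections_per_slash24(addresses):
    """Count connections grouped by the /24 containing each IPv4 address."""
    counts = Counter()
    for a in addresses:
        # strict=False masks off the host bits: 192.0.2.10 -> 192.0.2.0/24
        net = ipaddress.ip_network(f"{a}/24", strict=False)
        counts[str(net)] += 1
    return counts


def over_limit(addresses, limit=LIMIT):
    """Return the /24 blocks whose connection count exceeds the limit."""
    counts = connections_per_slash24(addresses)
    return {net: n for net, n in counts.items() if n > limit}


remote = ["192.0.2.10", "192.0.2.200", "198.51.100.7"]
print(connections_per_slash24(remote))
print(over_limit(remote))
```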

    Apparently, the aforementioned effort to limit each relay pair to a
single connection does not apply to hidden service connections, as can be
readily seen on a Fast HSDir relay when bursts of connections occur.

There are multiple resource limits in Tor.
Are you sure it's the connection limit that's being hit?
We often see bandwidth and circuit limits being hit in these cases.

(Snip)
  My relay is a relatively
low-capacity relay, yet when it has the Fast flag, and especially with an
additional HSDir flag, it often has several thousand connections at any
given time.

There are several thousand relays, so several thousand connections is normal.
And an additional few thousand client or many thousand exit connections are
also normal.

(Snip)

T