(You can see this proposal rendered at https://spec.torproject.org/proposals/351-socks-auth-extensions.html )
```
Filename: 351-socks-auth-extensions.md
Title: Making SOCKS5 authentication extensions extensible
Author: Nick Mathewson
Created: 9 September 2024
Status: Open
```
## Introduction
Currently, Tor implementations use the SOCKS5 username and password fields to pass parameters for stream isolation. (See the `IsolateSocksAuth` flag in the C tor manual, and the "Stream isolation" section ([forthcoming](https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/279)) in our [socks extensions](../socks-extensions.md) spec.)
Tor implementations also support SOCKS4 and SOCKS4a, but they are not affected by this proposal.
The C Tor implementation also supports other proxy types besides SOCKS. They are not affected by this proposal because they either have other means to extend their protocols (as with HTTP headers in HTTP CONNECT) or no means to pass extension information (as for DNS proxies, iptables transparent proxies, etc).
Until now, the rules for interpreting these fields have been simple: all values are permitted, and streams with unequal values may not share a circuit.
But in order to integrate SOCKS connections into Arti's RPC protocol, we additionally want the ability to send RPC "Object IDs"[^ObjectId] in these fields. To do this, we will need some way to tell when we have received an object ID, when we have received an isolation parameter, and to avoid confusing them with one another.
Note that some confusion will necessarily remain possible: Since current Tor clients are allowed to send any value as SOCKS username and password, any value we specify here will be one which a client in principle _might_ have sent under the old protocol.
Additionally, since we are adding complexity to the interpretation of these fields, it's possible we'll want to change this complexity in the future. To do this, we'll want a versioning scheme to permit changes.
## Proposal
(If accepted, the following can be incorporated into our [socks extensions](../socks-extensions.md) spec.)
We support a series of extensions in SOCKS5 Username/Passwords. Currently, these extensions can encode a stream isolation parameter (used to indicate that streams may share a circuit) and an RPC object ID (used to associate the stream with an entity in an RPC session).
These extensions are in use whenever the SOCKS5 Username begins with the 8-byte "magic" sequence `[3c 74 6f 72 53 30 58 3e]`. (This is the ASCII encoding of `<torS0X>`).
If the SOCKS5 Username/Password fields are present but the Username does not begin with this byte sequence, it indicates _legacy isolation_. New client implementations SHOULD NOT use legacy isolation. A SocksPort may be configured to reject legacy isolation.
When these extensions are in use, the next byte of the username after the "magic" sequence indicates a version number. Any implementation receiving an unrecognized or missing version MUST reject the SOCKS request.
When the version number is `[30]` (the ASCII encoding of `0`), we interpret the rest of the Username field and the Password field as follows:
The remainder of the Username field encodes an RPC Object ID. (If the remainder of the Username field is empty, there is no RPC object.)
The Password field is the stream isolation parameter. If it is empty, the stream isolation parameter is an empty string.
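The interpretation rules above can be sketched as a small classifier. This is only an illustration under stated assumptions, not code from any Tor implementation: the names `interpret_socks5_auth`, `LegacyIsolation`, `ExtendedAuthV0`, and `SocksAuthRejected` are hypothetical, and the sketch assumes the SOCKS layer has already extracted the raw Username and Password bytes.

```python
from dataclasses import dataclass
from typing import Optional, Union

# The 8-byte "magic" sequence 3c 74 6f 72 53 30 58 3e.
MAGIC = b"<torS0X>"


class SocksAuthRejected(Exception):
    """Hypothetical error type: the SOCKS request must be rejected."""


@dataclass(frozen=True)
class LegacyIsolation:
    username: bytes
    password: bytes


@dataclass(frozen=True)
class ExtendedAuthV0:
    rpc_object_id: Optional[bytes]  # None when the rest of the Username is empty
    isolation: bytes                # the Password field; may be empty


def interpret_socks5_auth(
    username: bytes, password: bytes
) -> Union[LegacyIsolation, ExtendedAuthV0]:
    """Classify SOCKS5 Username/Password fields as described in the text."""
    if not username.startswith(MAGIC):
        # No magic prefix: legacy isolation.
        return LegacyIsolation(username, password)
    # The byte after the magic sequence is the version number.
    version = username[len(MAGIC):len(MAGIC) + 1]
    if version != b"0":
        # Covers both an unrecognized version and a missing one.
        raise SocksAuthRejected("unrecognized or missing version")
    rest = username[len(MAGIC) + 1:]
    return ExtendedAuthV0(rest or None, password)
```

For example, `interpret_socks5_auth(b"<torS0X>0", b"abc")` yields an `ExtendedAuthV0` with no RPC object and isolation parameter `b"abc"`. A server configured to reject legacy isolation could refuse the `LegacyIsolation` result instead of accepting it.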
### Stream isolation
This replaces the corresponding part of the "Stream isolation" section ([forthcoming](https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/279)) in our [socks extensions](../socks-extensions.md) spec.
Two streams are considered to have the same SOCKS authentication values if and only if one of the following is true:
- They are both SOCKS4 or SOCKS4a, with the same user "ID" string.
- They are both SOCKS5, with no authentication.
- They are both SOCKS5 with USERNAME/PASSWORD authentication, using legacy isolation parameters, and they have identical usernames and identical passwords.
- They are both SOCKS5 using the extensions above, with the same stream isolation parameter.
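One way to picture these rules is as a comparison key: two streams have the same SOCKS authentication values exactly when their keys are equal. The sketch below is a hypothetical illustration. The function name `socks_auth_key` and the protocol labels are invented here, it assumes the version byte of an extended username has already been validated, and it reads the first rule as placing SOCKS4 and SOCKS4a in one class (an interpretation, not something the text states explicitly).

```python
from typing import Optional, Tuple

MAGIC = b"<torS0X>"


def socks_auth_key(
    protocol: str,
    user_id: bytes = b"",
    auth: Optional[Tuple[bytes, bytes]] = None,
) -> tuple:
    """Return a hashable key; equal keys mean equal SOCKS auth values.

    protocol: "socks4", "socks4a", or "socks5" (labels are assumptions).
    user_id:  the SOCKS4/SOCKS4a user "ID" string.
    auth:     (username, password) for SOCKS5, or None for no authentication.
    """
    if protocol in ("socks4", "socks4a"):
        return ("socks4", user_id)
    if protocol != "socks5":
        raise ValueError("unknown protocol label: " + protocol)
    if auth is None:
        return ("socks5-noauth",)
    username, password = auth
    if username.startswith(MAGIC):
        # Extended form: only the isolation parameter (the Password) counts;
        # the RPC object ID in the Username does not enter the comparison.
        return ("socks5-extended", password)
    # Legacy form: both fields must match.
    return ("socks5-legacy", username, password)
```

Note that a legacy stream and an extended stream never compare equal under this sketch, even if their Password fields match, because the list above has no rule that covers a mixed pair.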
### A further extension for integration with Arti SOCKS
We should add the following to a specification, though it is not clear whether it goes in the Arti RPC spec or in the socks extensions spec.
In some cases, the RPC Object ID may denote an object that already includes information about its intended stream isolation. In such cases, the stream isolation MUST be blank. Implementations MUST reject non-blank stream isolation in such cases.
In some cases, the RPC object ID may denote an object that already includes information about its intended destination address and port. In such cases, the destination address MUST be `0.0.0.0` or `::` (encoded either as an IPv4 address, an IPv6 address, or a hostname) and the destination port MUST be 0. Implementations MUST reject other addresses in such cases.
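The two rejection rules in this section might be checked roughly as follows. This is a sketch under assumptions: `check_object_constraints`, `RequestRejected`, and the two boolean flags are hypothetical names, and the destination is assumed to have been decoded to its textual form already, whichever SOCKS address type carried it.

```python
class RequestRejected(Exception):
    """Hypothetical error type: the SOCKS request must be rejected."""


# The only destination addresses allowed when the object supplies its own.
UNSPECIFIED_HOSTS = ("0.0.0.0", "::")


def check_object_constraints(
    isolation: bytes,
    host: str,
    port: int,
    *,
    object_sets_isolation: bool,
    object_sets_destination: bool,
) -> None:
    """Raise RequestRejected if the request conflicts with the RPC object."""
    if object_sets_isolation and isolation != b"":
        raise RequestRejected("stream isolation must be blank for this object")
    if object_sets_destination:
        if host not in UNSPECIFIED_HOSTS or port != 0:
            raise RequestRejected(
                "destination must be 0.0.0.0 or :: with port 0 for this object"
            )
```

The flags stand in for whatever the RPC layer knows about the target object; how that information is obtained is outside the scope of this sketch.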
-----
(Here the specifications end. The rest of this proposal is discussion.)
## Design considerations
Our use of SOCKS5 Username/Passwords here (as opposed to some other, new authentication type) is based on the observation that many existing SOCKS5 implementations support Username/Password, but comparatively few support arbitrary plug-in authentication.
The magic "`<torS0X>`" prefix is chosen to be 8 characters long so that existing client implementations that generate random strings will not often generate it by mistake.
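As a rough illustration of "not often", the chance that a random string starts with the magic prefix can be computed directly. The sketch assumes a generator that picks each character independently and uniformly from an alphabet containing every character of the prefix, and that produces at least 8 characters. The function name is hypothetical.

```python
from fractions import Fraction


def chance_of_magic_prefix(alphabet_size: int, prefix_len: int = 8) -> Fraction:
    """Probability that a uniformly random string begins with one fixed prefix.

    Assumes independent, uniform characters drawn from an alphabet of
    `alphabet_size` symbols that includes every character of the prefix.
    """
    return Fraction(1, alphabet_size ** prefix_len)


# Uniformly random bytes: one chance in 256**8, which equals 2**64.
random_bytes = chance_of_magic_prefix(256)

# Uniformly random printable ASCII (95 characters): one chance in 95**8.
printable_ascii = chance_of_magic_prefix(95)
```

A generator restricted to letters and digits can never produce the prefix at all, since `<` and `>` are outside its alphabet.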
The version number is chosen to be an ASCII `0` rather than a raw 0 byte, for compatibility with existing SOCKS5 client implementations that do not support non-ASCII username/password values.
## C Tor migration
When this proposal is accepted, we *should* configure C tor to implement it as follows:
- To reject any SOCKS5 Username starting with `<torS0X>` unless it is exactly `<torS0X>0`.
This behavior is sufficient to give correct isolation behavior, to reject any connection including an RPC object ID, and to reject any as-yet-unspecified isolation mechanisms.
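Expressed as a predicate, the migration rule is very small. The sketch below is written in Python for illustration only (C tor is of course written in C), and the function name is hypothetical.

```python
MAGIC = b"<torS0X>"


def c_tor_accepts_username(username: bytes) -> bool:
    """Return True if the migration rule above would accept this Username.

    Usernames without the magic prefix are accepted (legacy isolation).
    Usernames with the prefix are accepted only when they are exactly the
    prefix followed by the version byte "0", i.e. with no RPC object ID.
    """
    if username.startswith(MAGIC):
        return username == MAGIC + b"0"
    return True
```

Under this rule `<torS0X>0` is accepted, while `<torS0X>0SOME-OBJECT`, `<torS0X>1`, and a bare `<torS0X>` are all rejected.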
[^ObjectId]: An ObjectId is used in the Arti RPC protocol to associate a SOCKS request with some existing Client object, or with a preexisting DataStream.
Is there a reason why this proposal extends the existing username/password auth, instead of defining a new SOCKS5 authentication type? c.f. https://e.as207960.net/w4bdyj/5dQ6fT3QLm2aTfUx
On Tue, Sep 10, 2024 at 9:25 AM Q Misell via tor-dev tor-dev@lists.torproject.org wrote:
Is there a reason why this proposal extends the existing username/password auth, instead of defining a new SOCKS5 authentication type? c.f. https://datatracker.ietf.org/doc/html/rfc1928#section-3
Indeed there is! The one I was thinking of the most is this:
"Our use of SOCKS5 Username/Passwords here (as opposed to some other, new authentication type) is based on the observation that many existing SOCKS5 implementations support Username/Password, but comparatively few support arbitrary plug-in authentication."
In other words, almost any application that has a working SOCKS5 library can use this system, whereas if we were to define a new authentication type, nearly every application would need to patch their SOCKS5 library, since most SOCKS5 libraries don't let you define new authentication types.
This wouldn't be so bad for applications that implement SOCKS5 themselves, of course.
-- Nick
It'd be helpful to have more context about the object IDs and what we're trying to accomplish with them here; why we need/want them in arti but didn't in c-tor. I'm inferring (maybe incorrectly) that the idea is that this is effectively letting us multiplex differently-configured SOCKS->Tor services on a single port. And/or maybe to multiplex multiple data connections over a single SOCKS socket? Is it worth doing these vs the alternatives (a listening port per service/object and a socket per data stream)? e.g. is this fixing some current resource exhaustion issue, or one we expect to be more problematic in arti...?
Maybe worth mentioning the length limit for user and password (255 I believe) and that it'll be sufficient (?)
Otherwise LGTM
On Tue, Sep 10, 2024 at 6:26 PM Jim Newsome jnewsome@torproject.org wrote:
It'd be helpful to have more context about the object IDs and what we're trying to accomplish with them here; why we need/want them in arti but didn't in c-tor. I'm inferring (maybe incorrectly) that the idea is that this is effectively letting us multiplex differently-configured SOCKS->Tor services on a single port. And/or maybe to multiplex multiple data connections over a single SOCKS socket? Is it worth doing these vs the alternatives (a listening port per service/object and a socket per data stream)? e.g. is this fixing some current resource exhaustion issue, or one we expect to be more problematic in arti...?
Maybe worth mentioning the length limit for user and password (255 I believe) and that it'll be sufficient (?)
Otherwise LGTM
This is a good question! Right now there isn't a complete spec for arti RPC, but for background you could have a look at the file `rpc-meta-draft.md` ( https://gitlab.torproject.org/tpo/core/arti/-/blob/main/doc/dev/notes/rpc-me... ) in arti, as amended by the WIP branch at https://gitlab.torproject.org/tpo/core/arti/-/merge_requests/2386 .
This doesn't (yet) describe the DataStream protocol, since that's what we're trying to hammer out here. There's a comment in arti::socks about that which I hope to migrate to rpc-meta-draft once it is accurate. I'll copy out the relevant parts below, since many of them are about to be overwritten by this proposal.
Sorry about so many incomplete documents! I hope that this will help answer the questions. If not, please just poke me again.
/// ## Key concepts
///
/// A data stream is "RPC-visible" if, when it is created via SOCKS,
/// the RPC system is told about it.
///
/// Every RPC-visible stream is associated with a given RPC object when it is created.
/// (Since the RPC object is being specified in the SOCKS protocol,
/// it must be one with an externally visible Object ID.
/// Such Object IDs are cryptographically unguessable and unforgeable,
/// and are qualified with a unique identifier for their associated RPC session.)
/// Call this RPC Object the "target" object for now.
/// This target RPC object must implement
/// the [`ConnectWithPrefs`](arti_client::rpc::ConnectWithPrefs) special method.
///
/// Right now, there are two general kinds of objects that implement this method:
/// client-like objects, and stream-like objects.
///
/// A client-like object is either a `TorClient` or an RPC `Session`.
/// It knows about, and is capable of opening, multiple data streams.
/// Using it as the target object for a SOCKS connection tells Arti
/// that the resulting data stream (if any)
/// should be built by it, and associated with its RPC session.
///
/// An application gets a TorClient by asking the session for one,
/// or by asking a TorClient to give it a new variant clone of itself.
///
/// A stream-like object is an `arti_rpcserver::stream::RpcDataStream`.
/// It is created from a client-like object, but represents a single data stream.
/// When created, it is not yet connected or trying to connect to anywhere:
/// the act of using it as the target Object for a SOCKS connection causes
/// it to begin connecting.
/// (You can also think of this as a single-use client,
/// which once used, becomes interchangeable with the DataStream it created.)
/// (TODO: We may wish to change this vocabulary.
/// We may wish to call this a "stream handle", for instance?)
///
/// An application gets an RpcDataStream by calling `arti:new_stream_handle`
/// on any client-like object. Currently, this always creates an RpcDataStream
/// that makes optimistic connections; See #1583.
...
/// ## Intended use cases (examples)
///
/// (These examples assume that the application
/// already knows the SOCKS port it should use.
/// I'm leaving out the isolation strings as orthogonal.)
///
/// These are **NOT** the only possible use cases;
/// they're just the two that help understand this system best (I hope).
///
/// ### Case 1: Using a client-like object directly.
///
/// Here the application has authenticated to RPC
/// and gotten the session ID `SESSION-1`.
/// (In reality, this would be a longer ID, and full of crypto).
///
/// The application wants to open a new stream to www.example.com.
/// They don't particularly care about isolation,
/// but they do want their stream to use their RPC session.
/// They don't want an Object ID for the stream.
///
/// To do this, they make a SOCKS connection to arti,
/// with target address www.example.com.
/// They set the username to `<arti-rpc-session>`,
/// and the password to `SESSION-1`.
///
/// Arti looks up the Session object via the `SESSION-1` object ID
/// and tells it (via the ConnectWithPrefs special method)
/// to connect to www.example.com.
/// The session creates a new DataStream using its internal TorClient,
/// but does not register the stream with an RPC Object ID.
/// Arti proxies the application's SOCKS connection through this DataStream.
///
///
/// ### Case 2: Creating an identifiable stream.
///
/// Here the application wants to be able to refer to its DataStream
/// after the stream is created.
/// As before, we assume that it's on an RPC session
/// where the Session ID is `SESSION-1`.
///
/// The application sends an RPC request of the form:
/// `{"id": 123, "obj": "SESSION-1", "method": "arti:new_stream_handle", "params": {}}`
///
/// It receives a reply like:
/// `{"id": 123, "result": {"id": "STREAM-1"} }`
///
/// (In reality, `STREAM-1` would also be longer and full of crypto.)
///
/// Now the application has an object called `STREAM-1` that is not yet a connected
/// stream, but which may become one.
///
/// The application opens a socks connection as before.
/// For the username it sends `<arti-rpc-session>`,
/// and for the password it sends `STREAM-1`.
///
/// Now Arti looks up the `RpcDataStream` object via `STREAM-1`,
/// and tells it (via the ConnectWithPrefs special method)
/// to connect to www.example.com.
/// This causes the `RpcDataStream` internally to create a new `DataStream`,
/// and to store that `DataStream` in itself.
/// The `RpcDataStream` with Object ID `STREAM-1`
/// is now an alias for the newly created `DataStream`.
/// Arti proxies the application's SOCKS connection through that `DataStream`.
///
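The comment above describes the scheme in place before this proposal, with `<arti-rpc-session>` as the username and the object ID as the password. Under the proposal, the object ID would instead follow the magic prefix and version byte in the Username, and the Password would carry the isolation parameter. The sketch below shows what the resulting RFC 1929 username/password message might look like for Case 2. It is an illustration only: `build_auth_request` is a hypothetical helper, and `STREAM-1` is the placeholder ID from the examples. It also touches on the length question raised earlier in the thread: each field has a one-byte length, so after the 9 bytes of prefix and version, at most 246 bytes remain for the object ID.

```python
MAGIC = b"<torS0X>"
MAX_FIELD = 255  # RFC 1929 gives each field a one-byte length prefix


def build_auth_request(object_id: bytes = b"", isolation: bytes = b"") -> bytes:
    """Build an RFC 1929 username/password message for extension version 0.

    Layout: VER (0x01), ULEN, UNAME, PLEN, PASSWD.
    RFC 1929 shows both fields as 1 to 255 bytes; this sketch permits an
    empty password because the proposal defines an empty isolation parameter.
    """
    username = MAGIC + b"0" + object_id
    if len(username) > MAX_FIELD:
        raise ValueError("object ID too long: at most 246 bytes fit in the Username")
    if len(isolation) > MAX_FIELD:
        raise ValueError("isolation parameter too long: at most 255 bytes")
    return (
        bytes([0x01, len(username)])
        + username
        + bytes([len(isolation)])
        + isolation
    )


# Case 2 from the examples, re-expressed under the proposed scheme.
message = build_auth_request(object_id=b"STREAM-1")
```

Whether 246 bytes is enough depends on how long real Arti object IDs turn out to be, which the thread does not settle.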
Hi Nick,
It would be useful to have a way of controlling access to the SOCKS port so that untrusted applications running on the same device as a Tor client can't use the Tor client's SOCKS proxy. This is something that people auditing Briar have raised as a security concern.
Unix sockets aren't a great solution here because HTTP libraries don't necessarily know how to connect to them. A TCP socket with username/password auth is what HTTP libraries are expecting to see, but because Tor uses the SOCKS username and password for other purposes, we can't currently use them for access control.
Before seeing this proposal I'd thought about asking if Tor could support some way of configuring username/password pairs, which would function as real SOCKS credentials as well as providing stream isolation. But it seems like this proposal would make that more difficult, and if it's going to be possible to support SOCKS credentials in future, it might make sense to plan for it now.
I'm not asking for username/password auth to be added to this proposal, just for the proposal to leave room for it to be added in the future.
Can you see how that might be done?
Cheers, Michael
On 09/09/2024 18:04, Nick Mathewson wrote:
(You can see this proposal rendered at https://spec.torproject.org/proposals/351-socks-auth-extensions.html )
Filename: 351-socks-auth-extensions.md Title: Making SOCKS5 authentication extensions extensible Author: Nick Mathewson Created: 9 September 2024 Status: Open
## Introduction
Currently, Tor implementations use the SOCKS5 username and password fields to pass parameters for stream isolation. (See the `IsolateSocksAuth` flag in the C tor manual, and the "Stream isolation" section ([forthcoming](https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/279)) in our [socks extensions](../socks-extensions.md) spec.)
Tor implementations also support SOCKS4 and SOCKS4a, but they are not affected by this proposal.
The C Tor implementation also supports other proxy types besides SOCKS. They are not affected by this proposal because they either have other means to extend their protocols (as with HTTP headers in HTTP CONNECT) or no means to pass extension information (as for DNS proxies, iptables transparent proxies, etc).
Until now, the rules for interpreting these fields have been simple: all values are permitted, and streams with unequal values may not share a circuit.
But in order to integrate SOCKS connections into Arti's RPC protocol, we additionally want the ability to send RPC "Object IDs"[^ObjectId] in these fields. To do this, we will need some way to tell when we have received an object ID, when we have received an isolation parameter, and to avoid confusing them with one another.
Note that some confusion will necessarily remain possible: Since current Tor clients are allowed to send any value as SOCKS username and password, any value we specify here will be one which a client in principle _might_ have sent under the old protocol.
Additionally, since we are adding complexity to the interpretation of these fields, it's possible we'll want to change this complexity in the future. To do this, we'll want a versioning scheme to premit changes.
## Proposal
If accepted, the following can be incorporated into our [socks extensions](../socks-extensions.md) spec.)
We support a series of extensions in SOCKS5 Username/Passwords. Currently, these extensions can encode a stream isolation parameter (used to indicate that streams may share a circuit) and an RPC object ID (used to associate the stream with an entity in an RPC session).
These extensions are in use whenever the SOCKS5 Username begins with the 8-byte "magic" sequence `[3c 74 6f 72 53 30 58 3e]`. (This is the ASCII encoding of `<torS0X>`).
If the SOCKS5 Username/Password fields are present but the Username does not begin with this byte sequence, it indicates _legacy isolation_. New client implementations SHOULD NOT use legacy isolation. A SocksPort may be configured to reject legacy isolation.
When these extensions are in use, the next byte of the username after the "magic" sequence indicates a version number. Any implementation receiving an unrecognized or missing version MUST reject the SOCKS request.
When the version number is `[30]` (the ASCII encoding of `0`), we interpret the rest of the Username field and the Password field as follows:
The remainder of the Username field encodes an RPC Object ID. (If the remainder of the Username field is empty, there is no RPC object.)
The Password field is the stream isolation parameter. If it is empty, the stream isolation parameter is an empty string.
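The parsing rules above can be sketched as follows. This is a minimal illustration, not the C Tor or Arti implementation: the names `parse_socks5_auth`, `SocksAuth`, and `SocksAuthRejected` are hypothetical, and only version `0` as defined in this section is handled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The 8-byte "magic" sequence: 3c 74 6f 72 53 30 58 3e
MAGIC = b"<torS0X>"


class SocksAuthRejected(ValueError):
    """Hypothetical error type: the SOCKS request must be rejected."""


@dataclass(frozen=True)
class SocksAuth:
    """Hypothetical result type for the parsed Username/Password fields."""
    legacy: bool                    # True when the magic prefix is absent
    rpc_object_id: Optional[bytes]  # None when empty or when legacy
    isolation: Tuple[bytes, ...]    # values compared for stream isolation


def parse_socks5_auth(username: bytes, password: bytes) -> SocksAuth:
    if not username.startswith(MAGIC):
        # Legacy isolation: both fields act as isolation values.
        return SocksAuth(True, None, (username, password))

    rest = username[len(MAGIC):]
    version, object_id = rest[:1], rest[1:]
    if version != b"0":
        # Covers both a missing version byte and an unrecognized one.
        raise SocksAuthRejected(f"unrecognized or missing version {version!r}")

    # Version 0: rest of Username is the RPC Object ID (possibly absent),
    # and the Password is the stream isolation parameter (possibly empty).
    return SocksAuth(False, object_id or None, (password,))
```

For example, `parse_socks5_auth(b"<torS0X>0", b"abc")` yields no RPC object and the isolation parameter `abc`, while a bare `<torS0X>` username is rejected because the version byte is missing.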
### Stream isolation
This replaces the corresponding part of the "Stream isolation" section ([forthcoming](https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/279)) in our [socks extensions](../socks-extensions.md) spec.
Two streams are considered to have the same SOCKS authentication values if and only if one of the following is true:
- They are both SOCKS4 or SOCKS4a, with the same user "ID" string.
- They are both SOCKS5, with no authentication.
- They are both SOCKS5 with USERNAME/PASSWORD authentication, using legacy isolation parameters, and they have identical usernames and identical passwords.
- They are both SOCKS5 using the extensions above, with the same stream isolation parameter.
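One way to read the four cases above is as a comparison key: two streams have the same SOCKS authentication values exactly when their keys are equal. The sketch below assumes this reading. `isolation_key` is a hypothetical helper, it treats SOCKS4 and SOCKS4a as a single class (the text does not say whether they should be distinguished), and it assumes the version byte has already been validated as `0`.

```python
from typing import Optional, Tuple

MAGIC = b"<torS0X>"


def isolation_key(protocol: str,
                  username: Optional[bytes] = None,
                  password: Optional[bytes] = None) -> Tuple:
    """Return a key that is equal for two streams iff they match one of the
    four cases. `protocol` is "socks4" (also used for SOCKS4a) or "socks5";
    for SOCKS4, `username` carries the user "ID" string."""
    if protocol == "socks4":
        return ("socks4", username)
    if username is None:
        # SOCKS5 with no authentication.
        return ("socks5-noauth",)
    if username.startswith(MAGIC):
        # Extended format: only the isolation parameter (the Password)
        # is compared; the RPC Object ID does not take part.
        return ("socks5-ext", password)
    # Legacy isolation: both username and password must be identical.
    return ("socks5-legacy", username, password)
```

Under this reading, two extended requests with different Object IDs but the same Password compare equal, whereas a legacy request and an extended request never do, even when their passwords match.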
### A further extension for integration with Arti SOCKS
We should add the following to a specification, though it is not clear whether it goes in the Arti RPC spec or in the socks extensions spec.
In some cases, the RPC Object ID may denote an object that already includes information about its intended stream isolation. In such cases, the stream isolation MUST be blank. Implementations MUST reject non-blank stream isolation in such cases.
In some cases, the RPC object ID may denote an object that already includes information about its intended destination address and port. In such cases, the destination address MUST be `0.0.0.0` or `::` (encoded either as an IPv4 address, an IPv6 address, or a hostname) and the destination port MUST be 0. Implementations MUST reject other addresses in such cases.
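The two rejection rules above could be checked along these lines. This is only a sketch: `check_rpc_constraints` and its boolean flags are hypothetical, and it assumes the SOCKS layer has already rendered the requested address (whether sent as IPv4, IPv6, or hostname) to a text string.

```python
# Textual forms of the only destinations allowed when the RPC object
# already carries its own destination.
UNSPECIFIED_ADDRS = {"0.0.0.0", "::"}


def check_rpc_constraints(object_has_isolation: bool,
                          object_has_destination: bool,
                          isolation: bytes,
                          dest_addr: str,
                          dest_port: int) -> None:
    """Raise ValueError if the request must be rejected; return None otherwise."""
    if object_has_isolation and isolation != b"":
        raise ValueError("stream isolation must be blank for this object")
    if object_has_destination:
        if dest_addr not in UNSPECIFIED_ADDRS:
            raise ValueError("destination address must be 0.0.0.0 or ::")
        if dest_port != 0:
            raise ValueError("destination port must be 0")
```

Whether a non-canonical spelling of the unspecified address should also be accepted is not stated above, so this sketch accepts only the two literal forms.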
(Here the specifications end. The rest of this proposal is discussion.)
## Design considerations
Our use of SOCKS5 Username/Passwords here (as opposed to some other, new authentication type) is based on the observation that many existing SOCKS5 implementations support Username/Password, but comparatively few support arbitrary plug-in authentication.
The magic "`<torS0X>`" prefix is chosen to be 8 characters long so that existing client implementations that generate random strings will not often generate it by mistake.
The version number is chosen to be an ASCII `0` rather than a raw 0 byte, for compatibility with existing SOCKS5 client implementations that do not support non-ASCII username/password values.
## C Tor migration
When this proposal is accepted, we *should* configure C tor to implement it as follows:
- To reject any SOCKS5 Username starting with `<torS0X>` unless it is exactly `<torS0X>0`.
This behavior is sufficient to give correct isolation behavior, to reject any connection including an RPC object ID, and to reject any as-yet-unspecified isolation mechanisms.
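The migration rule amounts to a single predicate on the Username field. The sketch below is written in Python for illustration only (C tor itself is written in C), and `c_tor_accepts_username` is a hypothetical name.

```python
MAGIC = b"<torS0X>"


def c_tor_accepts_username(username: bytes) -> bool:
    """Accept legacy usernames and exactly `<torS0X>0`; reject every other
    username that starts with the magic prefix."""
    if not username.startswith(MAGIC):
        return True  # legacy isolation, handled as before
    return username == MAGIC + b"0"
```

With this check, `<torS0X>0` followed by an Object ID is rejected, as is any other version byte, which matches the behavior described above.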
[^ObjectId]: An ObjectId is used in the Arti RPC protocol to associate a SOCKS request with some existing Client object, or with a preexisting DataStream.
On Wed, Sep 11, 2024 at 7:02 AM Michael Rogers michael@briarproject.org wrote:
Hi Nick,
It would be useful to have a way of controlling access to the SOCKS port so that untrusted applications running on the same device as a Tor client can't use the Tor client's SOCKS proxy. This is something that people auditing Briar have raised as a security concern.
Unix sockets aren't a great solution here because HTTP libraries don't necessarily know how to connect to them. A TCP socket with username/password auth is what HTTP libraries are expecting to see, but because Tor uses the SOCKS username and password for other purposes, we can't currently use them for access control.
Before seeing this proposal I'd thought about asking if Tor could support some way of configuring username/password pairs, which would function as real SOCKS credentials as well as providing stream isolation. But it seems like this proposal would make that more difficult, and if it's going to be possible to support SOCKS credentials in future, it might make sense to plan for it now.
I'm not asking for username/password auth to be added to this proposal, just for the proposal to leave room for it to be added in the future.
Can you see how that might be done?
Good question, Michael!
So, I see several answers: two comparatively simple, one a bit trickier, one that's a bit philosophical, and two you probably can't use. (There are probably more I haven't thought of.)
In the order of descending usefulness:
## Simple answer 1: Extending this proposal to allow using username/password as a username/password
So, with this proposal (351), we have the ability to add new semantics to the SOCKS5 username/password field: If the username begins with `<torS0X>`, then the next byte describes what format the rest of the username/password are in.
Now, the version of this proposal sent to the ML only defines semantics for `0`; there's a version on gitlab (see [torspec!280]) that defines both `0` and `1`.
[torspec!280]: https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/280
But we could also add a new format (call it `P`) that allows encoding an actual username and password. That would be backward compatible with this proposal, though we'd actually need to design and implement it.
To use such a format, you might configure the SOCKS library to say that the username is something like `<torS0X>Pmy_real_username`, and that the password is `cnffjbeq`.
(FWIW, I believe this proposal makes it _easier_ to put a username/password back into the SOCKS5 username/password field, since it defines a forward-compatible way to define new implementations.)
## Simple answer 2: Define a new SocksPort where Username/Password means Username/Password
Both C Tor and Arti have the ability to define flags on a SocksPort. We could in principle define a flag that means, "On this particular SocksPort, Username/Password authentication is required, prop351 semantics are not available, and the username/password must be on some list of approved users."
(Again we'd need to actually design and implement this, but it's not impossible.)
## Trickier answer: How to do it if your application is using Arti RPC
With the Arti RPC subsystem (that's Arti's equivalent of C Tor's control port), when your app authenticates an Arti RPC connection, it gets an Object ID for an object called a Session. According to this proposal as amended at [torspec!280], you open a SOCKS connection within a session by setting your username to `<torS0X>1session_id_goes_here`. (Note that Session IDs are fairly long, and deliberately hard to guess.)
Taken together, this would make it possible to add a SocksPort flag that means "Don't allow any SOCKS connections unless they are on a session."
With this flag, if your application is already authenticating with Arti RPC and linking its sockets via the mechanism defined here, it would be allowed to connect, but Arti would shut out any application that wasn't configured to do so.
## Philosophical answer: what are we restricting here and why?
It's not immediately clear to me _why_ it's considered a risk for another application to be able to open connections through the same Tor proxy as yours. Naturally, you wouldn't want another app to use the same circuits as yours, but you can mostly[^1] solve that by setting your username/password to any hard-to-guess random values, and having stream isolation code take care of that for you.
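As a rough illustration of the "hard-to-guess random values" idea, an application could generate its SOCKS credentials like this. The sketch uses the version-`0` format from proposal 351 (empty Object ID, random isolation parameter in the password); `fresh_isolation_credentials` is a hypothetical helper, and plain random legacy values would isolate streams as well.

```python
import secrets
from typing import Tuple

MAGIC = "<torS0X>"


def fresh_isolation_credentials(nbytes: int = 16) -> Tuple[str, str]:
    """Return a (username, password) pair for SOCKS5 Username/Password auth.

    The username carries the magic prefix and version `0` with no RPC
    Object ID; the password is a random, URL-safe isolation parameter.
    """
    username = MAGIC + "0"
    password = secrets.token_urlsafe(nbytes)
    return username, password
```

An application would generate one pair per isolation domain and reuse it for every stream that is allowed to share circuits within that domain.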
I guess that we might worry about side-channel attacks, where a hostile application is sending traffic through the Tor proxy in order to introduce a timing signal into your traffic? But any application with network access could do that, whether it has Tor access or not.
[^1]: Actually, wait. There's a possible problem here when you're making lots of onion service connections, since IIRC in C Tor onion service circuits aren't affected by isolation. But in Arti, they are. So at least that problem will go away as Arti moves to the fore.
## An answer you probably can't use: embedding Arti
Right now, you can embed Arti in any Rust app. Some folks have already started to write wrappers for Java and other languages. With our RPC protocol, we intend to support embedding Arti in any application written in any language that can call C and link a library.
So with this solution, there is no SOCKS port at all, and nobody can use your Tor client but you.
## An answer you probably can't use: OS-specific restrictions
Because somebody will mention it if I don't: you could probably cobble something together using OS-specific restrictions, like containers or SELinux. Of course, this isn't really something you can ship in a convenient cross-platform way AFAIK, so it probably isn't going to be any portable application's first resort.
----- Sorry for all the text! But I do hope it's at least somewhat interesting.
best wishes,
Thanks so much for the thorough answer Nick. Looks like there are several potential solutions here. Responses inline below...
On 11/09/2024 14:12, Nick Mathewson wrote:
## Simple answer 1: Extending this proposal to allow using username/password as a username/password
So, with this proposal (351), we have the ability to add new semantics to the SOCKS5 username/password field: If the username begins with `<torS0X>`, then the next byte describes what format the rest of the username/password are in.
Now, the version of this proposal sent to the ML only defines semantics for `0`; there's a version on gitlab (see [torspec!280]) that defines both `0` and `1`.
But we could also add a new format (call it `P`) that allows encoding an actual username and password. That would be backward compatible with this proposal, though we'd actually need to design and implement it.
To use such a format, you might configure the SOCKS library to say that the username is something like `<torS0X>Pmy_real_username`, and that the password is `cnffjbeq`.
(FWIW, I believe this proposal makes it _easier_ to put a username/password back into the SOCKS5 username/password field, since it defines a forward-compatible way to define new implementations.)
This sounds promising to me. I wasn't sure if the format byte was meant to act as an incrementing version number, with the expectation that new formats would supersede old formats and support for old formats would eventually be dropped. If there's an expectation of keeping multiple formats in use indefinitely to serve different purposes then that's great news and it seems like we could define a username/password format as you've suggested, without needing to hold up this proposal with the details.
## Simple answer 2: Define a new SocksPort where Username/Password means Username/Password
Both C Tor and Arti have the ability to define flags on a SocksPort. We could in principle define a flag that means, "On this particular SocksPort, Username/Password authentication is required, prop351 semantics are not available, and the username/password must be on some list of approved users."
(Again we'd need to actually design and implement this, but it's not impossible.)
This would also work for our purposes and was roughly what I'd had in mind to suggest before seeing this proposal. However it would depend as you mention below on whether we also needed/wanted to use RPC session IDs.
## Trickier answer: How to do it if your application is using Arti RPC
With the Arti RPC subsystem (that's Arti's equivalent of C Tor's control port), when your app authenticates an Arti RPC connection, it gets an Object ID for an object called a Session. According to this proposal as amended at [torspec!280], you open a SOCKS connection within a session by setting your username to `<torS0X>1session_id_goes_here`. (Note that Session IDs are fairly long, and deliberately hard to guess.)
Taken together, this would make it possible to add a SocksPort flag that means "Don't allow any SOCKS connections unless they are on a session."
With this flag, if your application is already authenticating with Arti RPC and linking its sockets via the mechanism defined here, it would be allowed to connect, but Arti would shut out any application that wasn't configured to do so.
Seems to me that this would also solve our problem, although if session IDs are being used as capabilities then maybe it would be good to make it an explicit part of Tor's threat model that session IDs must not only be hard to guess, but should be treated as confidential (not logged, etc)? Which is maybe just a viewpoint/documentation change.
## Philosophical answer: what are we restricting here and why?
It's not immediately clear to me _why_ it's considered a risk for another application to be able to open connections through the same Tor proxy as yours. Naturally, you wouldn't want another app to use the same circuits as yours, but you can mostly[^1] solve that by setting your username/password to any hard-to-guess random values, and having stream isolation code take care of that for you.
I guess that we might worry about side-channel attacks, where a hostile application is sending traffic through the Tor proxy in order to introduce a timing signal into your traffic? But any application with network access could do that, whether it has Tor access or not.
[^1]: Actually, wait. There's a possible problem here when you're making lots of onion service connections, since IIRC in C Tor onion service circuits aren't affected by isolation. But in Arti, they are. So at least that problem will go away as Arti moves to the fore.
Good to know that C Tor's onion service circuits aren't affected by isolation - although we aren't using isolation at the moment so it doesn't have an immediate impact.
I don't honestly know what risk the auditors had in mind when they flagged the issue of controlling access to the SOCKS port. I'll follow up with them about that. But I think it's probably fair to say that being able to send traffic through the Tor client's guard connection might plausibly make traffic manipulation attacks easier than just being able to send traffic over the same network interface as the guard connection. Although I don't know if a handwavy "hmm this seems like an attack surface" is enough to justify a feature request. :)
## An answer you probably can't use: embedding Arti
Right now, you can embed Arti in any Rust app. Some folks have already started to write wrappers for Java and other languages. With our RPC protocol, we intend to support embedding Arti in any application written in any language that can call C and link a library.
So with this solution, there is no SOCKS port at all, and nobody can use your Tor client but you.
I think we'll want to move to embedding Arti via Java bindings in future, but we may still want to expose a SOCKS port so that we can use HTTP libraries that expect to talk to a SOCKS port.
Is that expected to be a supported way of using Arti? For example, if we're talking to Arti via bindings rather than RPC on the control port, will it still be possible to open a SOCKS port and will we still have a session ID that we can use as a capability on the SOCKS port?
## An answer you probably can't use: OS-specific restrictions
Because somebody will mention it if I don't: you could probably cobble something together using OS-specific restrictions, like containers or SELinux. Of course, this isn't really something you can ship in a convenient cross-platform way AFAIK, so it probably isn't going to be any portable application's first resort.
Sorry for all the text! But I do hope it's at least somewhat interesting.
best wishes,
It was very interesting! Thanks for all the ideas. Looks like we have some good options.
Cheers, Michael
On Thu, Sep 12, 2024 at 5:21 AM Michael Rogers michael@briarproject.org wrote:
I think we'll want to move to embedding Arti via Java bindings in future, but we may still want to expose a SOCKS port so that we can use HTTP libraries that expect to talk to a SOCKS port.
Is that expected to be a supported way of using Arti? For example, if we're talking to Arti via bindings rather than RPC on the control port, will it still be possible to open a SOCKS port and will we still have a session ID that we can use as a capability on the SOCKS port?
I do think this is something that we want to support, but I don't know if we'll get it built in the earliest versions of our FFI embedding logic. There is a _lot_ of code to write, and a _lot_ of functionality to support -- so please poke us again (maybe on the bugtracker?) if this isn't easily do-able in the first supported FFI embedding scheme we make.
On Wed, Sep 11, 2024 at 09:12:26AM -0400, Nick Mathewson wrote:
It would be useful to have a way of controlling access to the SOCKS port so that untrusted applications running on the same device as a Tor client can't use the Tor client's SOCKS proxy. This is something that people auditing Briar have raised as a security concern.
For those wondering about how to control access to the SocksPort for applications on *different* addresses, there is the SocksPolicy torrc option -- basically it lets you tell Tor to hang up on connections from some hosts but not others.
So maybe you could use it here if you set up different internal addresses for the different applications (and also take care that routing actually communicates the right addresses even over localhost, e.g. I find on my Linux that connections come "from" 127.0.0.1 even when that's not the address of the piece of my computer that initiated the connection). Not a great solution but one to add to the list of possibilities.
I guess that we might worry about side-channel attacks, where a hostile application is sending traffic through the Tor proxy in order to introduce a timing signal into your traffic? But any application with network access could do that, whether it has Tor access or not.
Right, I thought about this too and went through the same logic. I agree. I also don't have any good real examples of attacks that somebody could do if they could access your socksport, so long as you are using circuit isolation properly -- and it would be great to hear some if they do exist.
But that said, Michael, you said you *aren't* using circuit isolation yet? For that case here's my attack: as a neighboring application, I make a connection through that socksport, also not using circuit isolation, and I take note of what exit relay my stream pops out of, because it's probably the same exit relay that the other application is using.
I'm not sure how to adapt that attack to the onion service context though.
[^1]: Actually, wait. There's a possible problem here when you're making lots of onion service connections, since IIRC in C Tor onion service circuits aren't affected by isolation. But in Arti, they are. So at least that problem will go away as Arti moves to the fore.
Clarification here: I believe C-Tor does circuit isolation correctly for the onion service circuits themselves (including the introduction circuit), but it does not do isolation for other pieces of the rendezvous process, such as onion descriptor downloads or storage.
So if you visited a cloudflare onion service using two different domains in Tor Browser, you would end up with separate circuits to the service, one per domain, but you would reuse the cached onion descriptor rather than fetching a new one. And there are other edge cases that can leak info, like the intro point failure cache: you would skip trying an intro point when connecting to the second one if you had recently failed to reach that intro point while connecting to the first one.
In part this tradeoff was about the complexity of making the changes, but in part we also justified it because the whole rendezvous process is heavyweight enough as it is, and we needed to draw the line somewhere (e.g. we don't fetch new directory documents for each new isolated circuit).
I do think doing more pieces of the isolation in Arti makes sense. And for completeness, in irc discussion Nick reminds us that there is some other state we still share between isolated circuits (in both C-Tor and Arti), for example guards including vanguards.
Hope this helps, --Roger
On 12/09/2024 22:30, Roger Dingledine wrote:
For those wondering about how to control access to the SocksPort for applications on *different* addresses, there is the SocksPolicy torrc option -- basically it lets you tell Tor to hang up on connections from some hosts but not others.
So maybe you could use it here if you set up different internal addresses for the different applications (and also take care that routing actually communicates the right addresses even over localhost, e.g. I find on my Linux that connections come "from" 127.0.0.1 even when that's not the address of the piece of my computer that initiated the connection). Not a great solution but one to add to the list of possibilities.
Thanks, I'll look into this, although I suspect our options on Android will be limited. We may be able to bind to some other address in 127.x.x.x and restrict SOCKS connections to coming from that address, but I don't know if we'll be able to prevent other apps from binding to that address too.
I guess that we might worry about side-channel attacks, where a hostile application is sending traffic through the Tor proxy in order to introduce a timing signal into your traffic? But any application with network access could do that, whether it has Tor access or not.
Right, I thought about this too and went through the same logic. I agree. I also don't have any good real examples of attacks that somebody could do if they could access your socksport, so long as you are using circuit isolation properly -- and it would be great to hear some if they do exist.
But that said, Michael, you said you *aren't* using circuit isolation yet? For that case here's my attack: as a neighboring application, I make a connection through that socksport, also not using circuit isolation, and I take note of what exit relay my stream pops out of, because it's probably the same exit relay that the other application is using.
I'm not sure how to adapt that attack to the onion service context though.
[^1]: Actually, wait. There's a possible problem here when you're making lots of onion service connections, since IIRC in C Tor onion service circuits aren't affected by isolation. But in Arti, they are. So at least that problem will go away as Arti moves to the fore.
Clarification here: I believe C-Tor does circuit isolation correctly for the onion service circuits themselves (including the introduction circuit), but it does not do isolation for other pieces of the rendezvous process, such as onion descriptor downloads or storage.
So if you visited a cloudflare onion service using two different domains in Tor Browser, you would end up with separate circuits to the service, one per domain, but you would reuse the cached onion descriptor rather than fetching a new one. And there are other edge cases that can leak info, like the intro point failure cache: you would skip trying an intro point when connecting to the second one if you had recently failed to reach that intro point while connecting to the first one.
In part this tradeoff was about the complexity of making the changes, but in part we also justified it because the whole rendezvous process is heavyweight enough as it is, and we needed to draw the line somewhere (e.g. we don't fetch new directory documents for each new isolated circuit).
This is really useful to know, thanks.
I think this tells me that (a) we should start using circuit isolation with C-Tor to protect our non-onion-service connections (RSS feed fetches), and (b) even with circuit isolation, information can leak to an attacker who has access to the SOCKS port (specifically, information about which onion service descriptors have been cached, which in Briar's case is information about the user's list of contacts). So we do need to restrict access to the SOCKS port with C-Tor, and if the HS cache/intro point state/etc isn't isolated per session in Arti then we'll need to do the same there.
I do think doing more pieces of the isolation in Arti makes sense. And for completeness, in irc discussion Nick reminds us that there is some other state we still share between isolated circuits (in both C-Tor and Arti), for example guards including vanguards.
Hope this helps, --Roger
It does, thanks.
Cheers, Michael