[tor-bugs] #21304 [Obfuscation/Snowflake]: Sanitize snowflake.log

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Mar 25 19:24:35 UTC 2019


#21304: Sanitize snowflake.log
-----------------------------------+------------------------------
 Reporter:  arlolra                |          Owner:  cohosh
     Type:  defect                 |         Status:  needs_review
 Priority:  Medium                 |      Milestone:
Component:  Obfuscation/Snowflake  |        Version:
 Severity:  Normal                 |     Resolution:
 Keywords:  starter                |  Actual Points:
Parent ID:                         |         Points:  1
 Reviewer:                         |        Sponsor:
-----------------------------------+------------------------------
Changes (by cohosh):

 * status:  needs_revision => needs_review


Comment:

 Here's a new candidate:
 https://github.com/cohosh/snowflake/compare/safe_log

 Replying to [comment:12 dcf]:
 Thanks for the helpful feedback! I've addressed each of the comments
 below:

 > Let me suggest a further simplification: ''only'' worry about the server
 log for now, and then we can perhaps factor the safe-logging code into a
 separate package for use by the other programs.

 This is what I've done so far, the log scrubber is only being used on the
 server log at the moment.

 >
 > * I think it's best to default to scrubbed in all cases.
 Agreed. Used the suggested logic to avoid missing default loggers in the
 future:
 [https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server.go#L329
 server.go#L329]

 > * the default log output in the absence of a `--log` option should be
 >   `os.Stderr`, not `os.Stdout`.
 >   [https://golang.org/pkg/log/#pkg-overview Ref]: "That logger writes to
 standard error..."
 >   There's now a [https://golang.org/pkg/log/#Logger.Writer
 Logger.Writer] method that would allow
 >   us to ask for the default output, but as it was only
 [https://github.com/golang/go/pull/28399 added in 1.12],
 >   it's probably still too new to use.
 Oops, fixed. I agree that the Logger.Writer method is the best thing to
 use and it's probably a good idea to wait until we want to update the go
 version for other reasons as well.

 > * The `logScrubber.Write` method only looks at one buffer's worth of
 data at a time,
 >   See below re `RedactErrorWriter` for an idea on how to deal with this.
 Cool, I really like how it works. I implemented that for the log scrubber
 here:
 [https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server.go#L91
 server.go#L91] and added your tests:
 [https://github.com/cohosh/snowflake/blob/3eb9064438ca6242f935173aed88ec29a0c16c7d/server/server_test.go#L54
 server_test.go#L54]

 > * You can use `!bytes.Equal(a, b)` instead of `bytes.Compare(a, b) !=
 0`.
 Done.

 > * Should we scrub port numbers, too?
 Yeah, that's a good idea. I've updated the regular expressions to catch
 the port. See my comments below for what I settled on.

 > * Consider structuring repetitive tests like this:
 Done.
 > * Instead of `X.X.X.X`,
 [https://gitweb.torproject.org/tor.git/tree/src/app/config/config.c?h=tor-0.3.5.8#n1102
 tor uses] `[scrubbed]`, as do
 >   [https://gitweb.torproject.org/pluggable-transports/meek.git/tree
 /meek-server/meek-server.go?h=0.31#n181 meek-server]
 >   and
 [https://gitlab.com/yawning/obfs4/blob/obfs4proxy-0.0.9/common/log/log.go#L42
 obfs4proxy].
 Nice, I like this better. It also leaks less information this way. Done.
 >
 > ----

 > Is it worth doing both, structure-based object manipulation and string-
 based patterns? Perhaps not, since the correct functioning of one can mask
 errors in the other. You could think of it as defense in depth, but it
 will be hard to consistently use a `redactError` function everywhere, when
 forgetting it doesn't have any observable consequences. Surely having both
 is safer, but the marginal safety increase of `redactError` may not be
 worth the (albeit small) code complexity increase.

 I think for now it's okay to just have the log scrubber. Before I started
 it, I went through and looked at all of the returned errors and I don't
 recall finding any net.OpErrors anyway.

 > One other difference is this massive, yet conservative,
 [https://gitweb.torproject.org/user/dcf/snowflake.git/tree/server/error.go?h
 =redact-error#n58 IP address–detecting regexp]. I admit its appearance is
 comical, but it will detect IPv6 addresses not surrounded by brackets, and
 avoid false-positive matches like what you mentioned in comment:11. Also,
 it scrubs port numbers. I would say, test
 [https://github.com/cohosh/snowflake/blob/4b0acda984a5ae4335d206fc534f51efb9303d5d/server/server_test.go#L53
 your] and
 [https://gitweb.torproject.org/user/dcf/snowflake.git/tree/server/error_test.go?h
 =redact-error#n148 my] test cases against the pattern you already have;
 and if there are any misdetections, a more precise regexp may be in order.

 I stumbled across that as well when I wrote the original regexp >.< I did
 end up taking some elements from it and expanding the regexp that I have
 to:
 {{{
 const ipv4Address = `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}`
 const ipv6Address = `(([0-9a-fA-F]{0,4}:){2,7}([0-9a-fA-F]{0,4})?(` +
 ipv4Address + `))` +
         `|(([0-9a-fA-F]{0,4}:){2,7}([0-9a-fA-F]{0,4})?)`
 const optionalPort = `(:\d{1,5})?`
 const addressPattern = `((` + ipv4Address + `)|(\[(` + ipv6Address +
 `)\])|(` + ipv6Address + `))` + optionalPort
 }}}

 My reason for not going with the full one was that I found it unnecessary
 after expanding my tests with the ones you used. I also found that this
 massive regexp was still scrubbing the fingerprint, meaning additional
 changes still had to be made. In particular, this test case failed:
 {{{
 //Make sure it doesn't scrub fingerprint
                         "a=fingerprint:sha-256
 33:B6:FA:F6:94:CA:74:61:45:4A:D2:1F:2C:2F:75:8A:D9:EB:23:34:B2:30:E9:1B:2A:A6:A9:E0:44:72:CC:74",
                         "a=fingerprint:sha-256
 33:B6:FA:F6:94:CA:74:61:45:4A:D2:1F:2C:2F:75:8A:D9:EB:23:34:B2:30:E9:1B:2A:A6:A9:E0:44:72:CC:74\n",
 }}}

 Go does not yet support look-ahead or look-behind, which is what we'd want
 to pass this test case. Instead I implemented a work-around that will find
 regular expressions that check for characters surrounding the address, but
 only replaces the address portion of it:
 {{{
 const fullAddrPattern = `(^|\s|[^\w:])` + addressPattern +
 `(\s|(:\s)|[^\w:]|$)`

 var scrubberPatterns = []*regexp.Regexp{
         regexp.MustCompile(fullAddrPattern),
 }
 func scrub(b []byte) []byte {
         scrubbedBytes := b
         for _, pattern := range scrubberPatterns {
                 // this is a workaround since go does not yet support look
 ahead or look
                 // behind for regular expressions.
                 scrubbedBytes = pattern.ReplaceAllFunc(scrubbedBytes,
 func(b []byte) []byte {
                         return addressRegexp.ReplaceAll(b,
 []byte("[scrubbed]"))
                 })
         }
         return scrubbedBytes
 }
 }}}
 > In the case of these full SDP stanzas, I think we should just not be
 logging them at all, not by default anyway.
 > ...
 > That's what the "SEND" button at
 ​https://snowflake.torproject.org/snowflake.html, and the
 {{{NewCopyPasteDialer}}} in client, are for. I'm okay with that mode of
 operation going away.
 I removed this from the client logs, but if there's now unused
 functionality in the client or proxy code perhaps we should remove that
 too?
 ----

 I made some other additional changes:
 - The regular expressions are now compiled only once, outside of the write
 function. This will make it easier to add more patterns the scrubber later
 and probably makes it more efficient
 - Modified the broker to not print out the raw bytes of ICE answers

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21304#comment:15>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list