[tor-talk] Tor and Google error / CAPTCHAs.

blobby at openmailbox.org blobby at openmailbox.org
Sun Sep 25 16:54:31 UTC 2016


Hi Alec,

Thanks for your detailed and informative response. I had never heard of 
"scraping". BTW: are you the Alec Muffett name-checked in Kevin 
Mitnick's autobiography? I assume so.

It may be of note that when I got the Google error, Amazon also required 
a CAPTCHA in order for me to log in to my account. Whoever was using the 
exit node maliciously was obviously affecting non-Google organizations 
too.

Since you used to work at Facebook (and I know you've posted on this 
list before about the FB onion address), I've a couple of questions 
based on my experiences with FB and Tor.

I'm wondering if FB (and, for that matter, other companies like Google) 
have some kind of hierarchy of "badness" of IP addresses. For example, 
for FB is an exit node "worse" than a SOCKS proxy which is "worse" than 
a VPN? I ask because I usually log in to my FB via a London-based IP 
provided by my ISP. However, when I try to log in to my FB account via 
an exit node with a London IP or via a SOCKS proxy with a London IP, I 
am asked to verify myself by selecting photos of my friends. I could 
well understand this if I were logging in from an IP - any type of IP - 
in, say, France, but I don't really understand why a London-based IP 
should be suspicious, since it matches the usual geographical login 
location. Unless, of course, all exit nodes and known SOCKS proxies are 
suspicious to FB irrespective of whether or not they correlate with the 
"normal" IP location of the user (in my case London).

What I am trying to ask is: how does FB (or similar organisations) 
decide that an IP is "bad" when it is in the same place as the IP that 
normally logs in to an account?

I wonder if you have any thoughts on the matter. Thanks!



On 2016-09-24 14:21, Alec Muffett wrote:
> On 24 September 2016 at 13:07, <blobby at openmailbox.org> wrote:
> 
>> 
>> Question: what are these people actually doing with the exit node IP
>> that upsets Google?
> 
> 
> That's a good question; I don't know about Google specifically, but
> when I was at Facebook the most common Tor-exit-node-related problem
> was called "scraping".
> 
> Scraping was/is when people with bad intentions hid behind Tor in order
> to disguise attempts to access and copy people's public pages, looking
> for personal information (names, addresses, pet names, emails,
> anything...) which could be correlated somehow and monetised, eg: via
> phone fraud or phishing.
> 
> Tor is useful to these people because if they were making such access
> attempts from a single IP address, or a single subnet, it would be easy
> to track and stop them.
> 
> So "scraping", along with other/similar reasons, is why tor exit nodes
> have such shitty "IP Reputation" in the tech industry.  The Tor exit
> nodes hide a bunch of people who are doing scraping.
> 
> Of all the big companies in tech, Facebook probably has some of the
> theoretically easiest challenges of addressing scraping - because quite
> a lot of content is only available when one is "logged in" to Facebook,
> so instead of blocking IP addresses Facebook instead can block
> _accounts_ that scrape; however that is not a panacea and fighting
> scraping at Facebook is still a _massive_ task.
> 
> By comparison Google may have an even harder challenge to combat
> scraping because much of Google content is meant to be available
> without logging-in, therefore Google rely more heavily upon IP-address
> as an identifier.
> 
> Continuing the spectrum - Cloudflare have an enormously harder
> challenge than Google, because they are mostly supplying only
> "network-level" services to their customers, so lack knowledge of
> username, userids, and (most?) cookies that actual platform-providers
> might be able to use when fighting scraping.
> 
> If you correlate this spectrum with "corporate friendliness towards
> Tor", I think you will see a causative pattern emerge; Tor does great
> work in enabling access to these services and platforms for people in
> need, but it also serves to hide/enable scrapers and other malfeasance.
> To not recognise this and instead (for example) to violently beat-up
> Cloudflare for "blocking tor" serves only to entrench anti-Tor
> sentiment.
> 
> This is why a few months ago I wrote a blogpost[1] explaining how best
> I believe to get more companies to be friendly towards Tor.
> 
> Because any amount of denial, public raging and placard-waving is not
> going to help.  It needs outreach.  It needs mutual understanding and
> communication of benefits.
> 
>     - alec
> 
> 
> [1]
> https://www.facebook.com/notes/alec-muffett/how-to-get-a-company-or-organisation-to-implement-an-onion-site-ie-a-tor-hidden-/10153762090530962
> 
> --
> http://dropsafe.crypticide.com/aboutalecm


