[tor-talk] tor2web - How can I get my hidden service indexed?

Fabio Pietrosanti (naif) lists at infosecurity.ch
Thu Feb 14 21:45:57 UTC 2013


On 2/14/13 10:28 PM, tor at lists.grepular.com wrote:
> On 14/02/13 10:55, Fabio Pietrosanti (naif) wrote:
>
> > Tor2web software by default:
> > - sets up a robots.txt to prevent search engine scraping
> > - blocks a wide set of crawler User-Agents to further block search
> >   activity
> > - prevents hotlinking (an internet resource embedding
> >   <img src="https://blahblah.tor2web.org/image.jpg">)
>
> This seems like a strange default to me.
The hotlinking prevention is required to keep people from linking
"highly controversial material" on public internet forums, using the few
Tor2web proxies as a sort of "Content Delivery Network".
> I can see why people would
> want to create hidden services that can be discovered using ordinary
> channels on the Internet such as search engines.
>
> If a hidden service operator actually wanted to block search engines,
> they'd know to create their own robots.txt file, or to add appropriate
> meta tags to their HTML, or to simply block based on the User-Agent
> header...
Tor2web 3.0 beta1 is not a final solution; it would require a lot of
additional code and features to make it really flexible (such as letting
a TorHS operator configure this robots.txt behavior).

However, in the meantime there is a simple reason to avoid general
indexing of Tor Hidden Services exposed through Tor2web: survival of the
service.

From experience, it is much more difficult to keep a Tor2web server
running than to keep a Tor Exit Node running with a completely open
exit policy.

If you enable "Google indexing", the number of complaints that you
receive will increase exponentially, quickly creating serious problems
in keeping the Tor2web proxy running. :\
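
For illustration, the three defaults discussed above (robots.txt, crawler
User-Agent blocking, hotlink prevention) could be sketched roughly as
follows. This is not the actual Tor2web code: the function names, the
crawler list and the base domain are all assumptions made for the example,
and the Referer check is applied to every request for simplicity.

```python
import re
from urllib.parse import urlparse

# Hypothetical robots.txt body: ask all well-behaved crawlers to stay away.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# Illustrative subset of crawler User-Agent substrings (not Tor2web's list).
CRAWLER_UA = re.compile(r"googlebot|bingbot|yandex|baiduspider|slurp",
                        re.IGNORECASE)

# Assumed base domain of the proxy, taken from the example URL in the thread.
PROXY_DOMAIN = "tor2web.org"


def is_crawler(user_agent):
    """True when the User-Agent header matches a known crawler substring."""
    return bool(user_agent) and CRAWLER_UA.search(user_agent) is not None


def is_hotlink(referer, proxy_domain=PROXY_DOMAIN):
    """True when the Referer names a host outside the proxy domain."""
    if not referer:
        return False  # no Referer header: treat as a direct visit
    host = urlparse(referer).hostname or ""
    return not (host == proxy_domain or host.endswith("." + proxy_domain))


def decide(path, user_agent, referer):
    """Return (status, body); body None means 'forward to the hidden service'."""
    if path == "/robots.txt":
        return 200, ROBOTS_TXT
    if is_crawler(user_agent):
        return 403, "crawlers are not allowed"
    if is_hotlink(referer):
        return 403, "hotlinking is not allowed"
    return 200, None
```

With this sketch, a request for /image.jpg carrying a Referer from an
outside forum is refused, while the same request coming from a page under
the proxy domain is forwarded.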

Fabio


