[tor-talk] Funded search engine for onionspace?

l.m ter.one.leeboi at hush.com
Fri Feb 13 23:30:54 UTC 2015


>Alas no.  I'm aware this is suboptimal.  I see GOOG search engine as a
>temporary-ladder just to get the ball rolling.  I am open to using any
>other index.  For what it's worth I'm very pleased with GOOG's
>performance---right now it's searching an index of 650k onion pages and
>the number grows every day.

If you instead used a Google Search Appliance, couldn't you use
Google's engine for indexing without having to send anything to Google
itself? Wouldn't that also avoid the problem of Google queries being
associated with the client making the request?

>Although we technically could read provided passwords, we don't keep logs
>of passed traffic.  However, I understand that many users don't understand
>the tor2web threat model.  But this is the same as all Tor2web nodes, yes?
>This is not at all unique to OnionCity.  As far as I know all Tor2web nodes
>allow form submissions.

What is unique to onion.city is that access to someonion.onion.city
happens over plain http and doesn't redirect to the .onion if Tor is in
use. That the tor2web mirror might snoop is implicit--that the exit (if
using Tor) might also snoop is more of a concern.

>You mentioned it'd be better to have it randomly pick among the available
>Tor2web nodes instead of everything going through OnionCity.  This breaks
>the GOOG search engine which only wants to return "canonical" URLs.  We
>could talk about making OnionCity a DNS round-robin akin to how Tor2web.org
>currently works, but then I'm just replicating Tor2web.

The ability of tor2web to provide mirrors should be optional. If you
only know one mirror and that mirror cannot service the request, then
how are you going to find any of the other mirrors? The Google engine
can return related addresses in an order based on how reliably each
mirror loads. If onion.city always works it will tend to precede
tor2web.org. If onion.city goes down (with the search front-end kept
separate from the tor2web mirror) the search engine can reorder the
results to improve the success of the first click.
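
A rough sketch of the reordering I have in mind, in Python. Everything
here is hypothetical: the MirrorStats counters, the rank_mirror_urls
helper and the https scheme are my own assumptions, not anything
onion.city or Google actually implement. It only shows the idea of
listing the same onion page once per mirror, most reliable mirror first.

```python
from dataclasses import dataclass


@dataclass
class MirrorStats:
    """Load-attempt counters for one tor2web mirror (hypothetical bookkeeping)."""
    domain: str
    successes: int = 0
    attempts: int = 0

    def success_rate(self) -> float:
        # A mirror that has never been tried gets a neutral score of 0.5.
        return self.successes / self.attempts if self.attempts else 0.5


def rank_mirror_urls(onion_host: str, path: str,
                     mirrors: list[MirrorStats]) -> list[str]:
    """Return one URL per mirror for the same onion page, most reliable first."""
    label = onion_host.removesuffix(".onion")
    ordered = sorted(mirrors, key=lambda m: m.success_rate(), reverse=True)
    # Assumption: every mirror serves <label>.<mirror-domain> over https.
    return [f"https://{label}.{m.domain}{path}" for m in ordered]
```

With counters where onion.city never fails it sorts ahead of
tor2web.org; once its recorded successes drop, tor2web.org moves to the
front without the searcher having to know a second mirror exists.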

>Right now I aggregate existing lists of onion sites and put them into the
>site map.
>* https://ahmia.fi/onions/
>* http://skunksworkedp2cg.onion.city/sites.txt
>* http://xlmvhk3rpdux26dz.onion.city/
>* http://kkkkkku5juzqh33a.onion.city/

If Google is itself handling the indexing, won't that cause a problem
for sites in those lists, which are normally okay with being indexed,
just not by Googlebot? I for one couldn't care less about being
indexed by ahmia.fi, but it'll be a cold day in hell before I let
Googlebot in, precisely because of how easy it is to link the search to
the requester.
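
For what it's worth, this is the kind of per-crawler policy I mean,
checked with Python's standard urllib.robotparser. The robots.txt
contents, the "AhmiaCrawler" user-agent string and the example.onion
address are made up for illustration, and it only helps if the crawler
honours robots.txt and the tor2web mirror relays the file unmodified,
which I am assuming here rather than claiming.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy for an onion site: refuse Googlebot, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under that policy may_crawl("Googlebot", ...) is refused for any path on
the site while any other crawler name is allowed.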
--leeroy

