[tor-dev] GSoC: Ahmia.fi - Search Engine for Hidden Services
juha.nurmi at ahmia.fi
Thu Apr 24 06:00:21 UTC 2014
-----BEGIN PGP SIGNED MESSAGE-----
On 22.04.2014 17:35, George Kadianakis wrote:
> Enjoy GSoC :)
I will :)
> BTW, looking again at your proposal, I see that you are going to
> do both popularity tracking and backlinks.
Yes. A separate crawler gathers backlinks from the public WWW, and I
will start recording which result URLs users click.
> How are these two technologies going to interact with each other?
> That is, how will the indexer consider the output of those two
The Django front-end re-sorts the results returned by the YaCy back-end.
I have this idea in mind: https://ahmia.fi/static/gsoc/sorter.py
The results are sorted by a score that combines three scaled values:
the YaCy result index, the number of backlinks and the number of clicks.
Note the scaling: p_info.backlinks = 1 / (float(index) + 1) etc.
sum_function = 3.0*self.yacy + 2.0*self.backlinks + 1.0*self.clicks
where 3, 2 and 1 are test coefficients. I will optimize these and make
a better model if necessary. However, clicks are easily spoofed, so
their coefficient has to be small.
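
To make the weighting concrete, here is a minimal sketch of the idea.
It is not the actual sorter.py. It assumes that "index" in the scaling
is a page's position in a list ordered by the signal in question, and
the names PageInfo, rank_scale and sort_results are made up for this
example. Only the 1 / (float(index) + 1) scaling and the 3/2/1 test
coefficients come from the description above.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Scaled signals for one search result (hypothetical structure)."""
    url: str
    yacy: float = 0.0
    backlinks: float = 0.0
    clicks: float = 0.0

    def score(self, w_yacy=3.0, w_backlinks=2.0, w_clicks=1.0):
        # Weighted sum with the test coefficients 3, 2 and 1.
        return (w_yacy * self.yacy
                + w_backlinks * self.backlinks
                + w_clicks * self.clicks)


def rank_scale(index):
    """Map a 0-based rank position to (0, 1]: 1, 1/2, 1/3, ..."""
    return 1 / (float(index) + 1)


def sort_results(yacy_order, backlink_counts, click_counts):
    """Re-sort URLs (unique, in YaCy order) by the combined score.

    Assumption: each signal is turned into a rank position first and
    then scaled, so raw counts never enter the sum directly.
    """
    pages = {url: PageInfo(url) for url in yacy_order}

    # YaCy signal: position in the list returned by the back-end.
    for index, url in enumerate(yacy_order):
        pages[url].yacy = rank_scale(index)

    # Backlink and click signals: position after sorting by raw count.
    # sorted() is stable, so ties keep their YaCy order.
    for attr, counts in (("backlinks", backlink_counts),
                         ("clicks", click_counts)):
        ordered = sorted(yacy_order,
                         key=lambda u: counts.get(u, 0),
                         reverse=True)
        for index, url in enumerate(ordered):
            setattr(pages[url], attr, rank_scale(index))

    return sorted(pages.values(), key=lambda p: p.score(), reverse=True)


if __name__ == "__main__":
    ranked = sort_results(
        ["a.onion", "b.onion", "c.onion"],
        backlink_counts={"c.onion": 10, "b.onion": 5},
        click_counts={"b.onion": 7},
    )
    for page in ranked:
        print(page.url, round(page.score(), 3))
```

Because every signal is reduced to a rank before scaling, a page
cannot gain more than the click coefficient by inflating its click
count, which is one reason to keep that coefficient the smallest.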
> Also, with your newly acquired knowledge about backlinks, how long
> is it going to take your incorporate them in ahmia? Are you
> actually going to do it during the "Use an another crawler to
> search .onion pages from the public Internet" phase?
We can test it once popularity tracking and the backlinks crawler are
both working.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
-----END PGP SIGNATURE-----