[tor-dev] GSoC: Ahmia.fi - Search Engine for Hidden Services
desnacked at riseup.net
Fri Apr 25 14:27:14 UTC 2014
Juha Nurmi <juha.nurmi at ahmia.fi> writes:
> On 22.04.2014 17:35, George Kadianakis wrote:
>> Enjoy GSoC :)
> I will :)
>> BTW, looking again at your proposal, I see that you are going to
>> do both popularity tracking and backlinks.
> Yes, another crawler gathers backlinks from the public WWW and I will
> start gathering the URL clicks from the users.
>> How are these two technologies going to interact with each other?
>> That is, how will the indexer consider the output of those two
> The Django front-end re-sorts the answers from the YaCy back-end.
> See https://ahmia.fi/static/gsoc/re_sort.jpg
> I have this idea in mind: https://ahmia.fi/static/gsoc/sorter.py
> The results are sorted according to the YaCy result index, the number
> of backlinks and the number of clicks, each of which is scaled.
> Note the scaling: p_info.backlinks = 1 / (float(index) + 1) etc.
> sum_function = 3.0*self.yacy + 2.0*self.backlinks + 1.0*self.clicks
> where 3, 2 and 1 are test coefficients. I will optimize these and make
> a better model if necessary. However, clicks are easily spoofed, so
> there has to be a small coefficient for them.
That makes sense.
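
To make the scheme quoted above concrete, here is a minimal Python sketch
of the re-sorting as I read it. This is not the actual sorter.py: the names
Result, rank_scale and re_sort and the sample inputs are assumptions made
for illustration. Each signal is first turned into a rank, the rank is
scaled with 1 / (rank + 1) so that all three values fall in (0, 1], and
the scaled values are combined with the test coefficients 3, 2 and 1.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One search hit with its three scaled signals (hypothetical type)."""
    url: str
    yacy: float = 0.0
    backlinks: float = 0.0
    clicks: float = 0.0

    def score(self, w_yacy=3.0, w_backlinks=2.0, w_clicks=1.0):
        # Weighted sum from the thread; 3, 2 and 1 are test coefficients.
        return (w_yacy * self.yacy
                + w_backlinks * self.backlinks
                + w_clicks * self.clicks)


def rank_scale(index):
    """Map a 0-based rank to (0, 1]: rank 0 -> 1.0, rank 1 -> 0.5, ..."""
    return 1 / (float(index) + 1)


def re_sort(yacy_order, backlink_counts, click_counts):
    """Re-sort URLs (assumed unique) given YaCy's order and raw counts."""
    results = {url: Result(url, yacy=rank_scale(i))
               for i, url in enumerate(yacy_order)}

    # Rank by raw backlink count, then scale the rank rather than the count.
    by_backlinks = sorted(yacy_order,
                          key=lambda u: backlink_counts.get(u, 0),
                          reverse=True)
    for i, url in enumerate(by_backlinks):
        results[url].backlinks = rank_scale(i)

    # Same treatment for clicks; ties keep YaCy's order (sort is stable).
    by_clicks = sorted(yacy_order,
                       key=lambda u: click_counts.get(u, 0),
                       reverse=True)
    for i, url in enumerate(by_clicks):
        results[url].clicks = rank_scale(i)

    return sorted(results.values(), key=lambda r: r.score(), reverse=True)


ranked = re_sort(
    ["a.onion", "b.onion", "c.onion"],    # order returned by YaCy
    {"c.onion": 10, "b.onion": 5},         # made-up backlink counts
    {"c.onion": 100},                      # made-up click counts
)
print([r.url for r in ranked])
```

Because only ranks are scaled, a page with a huge number of spoofed clicks
can gain at most the click coefficient over its neighbours, which is one
reason to keep that coefficient small.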
BTW, what is the 'yacy' score? Is it just the order that YaCy's
indexer chose for each result? Or does YaCy actually expose a score
for each result? How is the score derived? Or do you treat it as a
black box and assume it's the most accurate of backlinks and