Hello all,
This is the biweekly status update for Ahmia development. It arrives a bit late; I should have sent it on Friday.
Over the last two weeks I have been working on:
=== Ahmia-Site ===
* Added a "Did you mean" feature. It uses Elasticsearch's fuzziness https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzziness.html support, and more specifically phrase suggesters https://www.elastic.co/guide/en/elasticsearch/reference/6.3/search-suggesters-phrase.html, to suggest likely intended terms when a query appears to be misspelled. For example: https://ahmia.fi/search/?q=snodwen or https://ahmia.fi/search/?q=tor+netork [1]
* Changed the criteria of the Elasticsearch searches by 1) using weighted fields https://www.elastic.co/blog/multi-field-search-just-got-better and 2) also searching the 'anchor' and 'content' fields. This has improved the top results in some cases and also increases the total number of results returned. [2]
* Performed an overall HTML refactoring to improve the code structure, fix some unmatched tags, etc. [3]
* A minor improvement to the add onion page: the response message is clearer, and the page no longer redirects to a new page. [4]
* [Ongoing] I have been integrating the PageRank algorithm in order to improve the sorting of results based on website popularity. For each page we take into account the backlinks from the rest of the onion addresses to calculate its PageRank coefficient. An appropriate formula still needs to be worked out to combine this metric with the existing Elasticsearch relevance score. [5] To be committed soon.
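To illustrate the backlink-based ranking idea from the last item, here is a minimal sketch of a textbook power-iteration PageRank. It is not the code being committed to ahmia-site: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the example onion names are all assumptions made for illustration, and it deliberately leaves out the still-open question of how to combine the result with the Elasticsearch relevance score.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank (illustrative sketch only).

    links maps each source address to the list of addresses it links to.
    Returns a dict mapping every address to its rank; ranks sum to 1.
    """
    # Collect every address that appears as a source or as a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    if n == 0:
        return {}

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the "random jump" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (no outgoing links): spread its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical backlink graph between three made-up onion addresses.
example_links = {
    "a.onion": ["c.onion"],
    "b.onion": ["c.onion"],
    "c.onion": ["a.onion"],
}
example_ranks = pagerank(example_links)
```

In this toy graph `c.onion` has the most backlinks and ends up ranked highest, while `b.onion`, which nobody links to, ends up lowest.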
=== Ahmia-Index ===
* Changed the bulk update request on ES aliases to an iterative one, so that an error in one of the earlier requests no longer disrupts the rest of the requests. [6]
* Separated the *add* and *remove* alias functionality, to make the first crawls of each month available from the beginning of the month. [7]
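The move from one bulk alias request to an iterative one can be sketched roughly as follows. This is only an illustration of the idea, not the ahmia-index code: `apply_alias_actions` and the `send` callable are hypothetical names, `send` stands in for whatever actually issues the Elasticsearch alias update request, and the index and alias names are made up.

```python
def apply_alias_actions(actions, send):
    """Send each alias action in its own request (illustrative sketch).

    actions: list of alias actions, e.g. {"add": {"index": ..., "alias": ...}}.
    send: hypothetical callable that issues one alias update request and
          raises an exception on failure.
    Returns the list of (action, exception) pairs that failed.
    """
    failures = []
    for action in actions:
        try:
            # One request per action, so a failure stays isolated.
            send({"actions": [action]})
        except Exception as exc:
            failures.append((action, exc))
    return failures


# Usage with a fake sender that rejects one made-up index name.
sent_bodies = []


def fake_send(body):
    action = body["actions"][0]
    details = next(iter(action.values()))
    if details["index"] == "missing-index":
        raise ValueError("index not found")
    sent_bodies.append(body)


example_actions = [
    {"remove": {"index": "missing-index", "alias": "latest-crawl"}},
    {"add": {"index": "crawl-new", "alias": "latest-crawl"}},
]
example_failures = apply_alias_actions(example_actions, fake_send)
```

Here the failing `remove` is recorded in `example_failures`, but the `add` that follows it is still sent, which is the behaviour the iterative approach is after.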
[1] https://github.com/ahmia/ahmia-site/issues/25
[2] https://github.com/ahmia/ahmia-site/issues/29
[3] https://github.com/ahmia/ahmia-site/commit/ddcf1a32321a8506f99eaf8c402c0aec6...
[4] https://github.com/ahmia/ahmia-site/issues/27
[5] https://github.com/ahmia/ahmia-site/issues/30
[6] https://github.com/ahmia/ahmia-index/commit/08952b6609c7d0d8ee82afec4e64a1c6...
[7] https://github.com/ahmia/ahmia-index/issues/6
Stelios Barberakis chefarov@gmail.com writes:
> Hello all,
> This is the biweekly status update for Ahmia development. It arrives a bit late; I should have sent it on Friday.
> Over the last two weeks I have been working on:
> === Ahmia-Site ===
> - Added a "Did you mean" feature. It uses Elasticsearch's fuzziness https://www.elastic.co/guide/en/elasticsearch/guide/current/fuzziness.html support, and more specifically phrase suggesters https://www.elastic.co/guide/en/elasticsearch/reference/6.3/search-suggesters-phrase.html, to suggest likely intended terms when a query appears to be misspelled. For example: https://ahmia.fi/search/?q=snodwen or https://ahmia.fi/search/?q=tor+netork [1]
> - Changed the criteria of the Elasticsearch searches by 1) using weighted fields https://www.elastic.co/blog/multi-field-search-just-got-better and 2) also searching the 'anchor' and 'content' fields. This has improved the top results in some cases and also increases the total number of results returned. [2]
> - A minor improvement to the add onion page: the response message is clearer, and the page no longer redirects to a new page. [4]
> - [Ongoing] I have been integrating the PageRank algorithm in order to improve the sorting of results based on website popularity. For each page we take into account the backlinks from the rest of the onion addresses to calculate its PageRank coefficient. An appropriate formula still needs to be worked out to combine this metric with the existing Elasticsearch relevance score. [5] To be committed soon.
These are really cool changes! Great to see them! :)
Would be curious to learn how many people use the "search suggester" functionality, and also how the user experience improves with the new popularity code! :)
tor-project@lists.torproject.org