Hi Linus,
I know you're just getting started as sysadmin, but I want to check in with you about something we're hoping to get for support.torproject.org: a search function. The ticket was temporarily closed since we couldn't figure out a way to do search: https://trac.torproject.org/projects/tor/ticket/25322. But maybe you can help?
Happy to provide more info as needed.
Thanks, Alison
Hi all,
On 11/13/18 2:50 PM, Alison Macrina wrote:
[snip]
So, getting back to planning a reasonable setup for searching our portals. The last time we talked about this, UX wanted the following:
- Text search per portal
- Results should be on an HTML page and not be fetched via JS
- The subdomain where results are served should be the same as that of the portal where the user started the search
My original idea and test included a setup with one solr instance. Results from solr were retrieved through a web service written in Python that would only query solr and serve the HTML pages to the user.
Ideally the web service could live on a different subdomain, with the pages served via ProxyPass?
This setup, though, complicates our static rotation architecture a lot. There are many open questions that I can discuss with Linus and Weasel separately.
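As a rough illustration of the kind of web service described above, here is a minimal Python sketch that builds a per-portal solr query and renders the matches as a plain HTML page with no JS. The solr host, the core name "portals", and the field names "portal", "url" and "title" are all assumptions made for the example; the thread does not specify a schema or a deployment.

```python
import html
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: the thread names neither a solr host nor a core.
SOLR_SELECT = "http://localhost:8983/solr/portals/select"


def build_solr_url(query, portal, rows=10):
    """Build a solr select URL restricted to a single portal."""
    params = {
        "q": query,
        "fq": f'portal:"{portal}"',  # filter query on an assumed "portal" field
        "fl": "url,title",           # assumed field names
        "rows": rows,
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


def render_results(query, solr_response):
    """Render a parsed solr JSON response as a static HTML page."""
    docs = solr_response.get("response", {}).get("docs", [])
    items = "\n".join(
        f'<li><a href="{html.escape(d["url"], quote=True)}">'
        f'{html.escape(d["title"])}</a></li>'
        for d in docs
    )
    return (
        "<!DOCTYPE html>\n"
        f"<title>Search results for {html.escape(query)}</title>\n"
        f"<ul>\n{items}\n</ul>\n"
    )


def search(query, portal):
    """Query solr over HTTP and return the rendered results page."""
    with urlopen(build_solr_url(query, portal), timeout=5) as resp:
        return render_results(query, json.load(resp))
```

Keeping the URL building and the rendering separate from the network call means both can be checked without a running solr instance. The portal value is interpolated into the filter as-is, so a real service would need to validate it against a fixed list first.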
Talk soon,
-hiro
Hi again,
On 11/15/18 9:31 AM, silvia [hiro] wrote:
[snip]
Adding Antonela and Isabela to this thread.
We have been discussing a few possibilities for running our search service with Linus and Weasel. From what I remember, the rationale on the UX side was that having a single search.torproject.org service, where people get redirected when they search, is not ideal.
So we thought that one thing we could do is run search on search.support.torproject.org.
The UX flow would be like this:
- User enters a search query on the search bar on support.tp.o
- User is redirected to search.support.tp.o where they find the search results.
- User clicks on one of the results and is taken back to support.tp.o page with the answer to their question.
Would this work?
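To make the flow above concrete, here is a small Python sketch of the host mapping it implies: the portal's search form sends the user to search.<portal>, and result links point back at the portal. The allow-list of portals, the "q" query parameter and the root path are assumptions for illustration only; nothing in the thread fixes them.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical allow-list of portals that would get a search subdomain.
PORTALS = {"support.torproject.org", "tb-manual.torproject.org"}

SEARCH_PREFIX = "search."


def search_url(portal_host, query):
    """Step 2 of the flow: where the portal's search form sends the user."""
    if portal_host not in PORTALS:
        raise ValueError(f"unknown portal: {portal_host}")
    return f"https://{SEARCH_PREFIX}{portal_host}/?{urlencode({'q': query})}"


def is_portal_link(search_host, result_url):
    """Step 3 of the flow: does a result lead back to the originating portal?"""
    if search_host.startswith(SEARCH_PREFIX):
        portal_host = search_host[len(SEARCH_PREFIX):]
    else:
        portal_host = search_host
    return urlsplit(result_url).hostname == portal_host
```

With this mapping the portal itself stays fully static: the only thing it needs is a form whose action points at the search subdomain.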
Talk soon,
-hiro
tor-community-team mailing list tor-community-team@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-community-team
silvia [hiro]:
[snip]
What I like about the 'central search' idea is that you can get a User Manual result when searching Tor Support. Because we have so many different pieces of content, I liked the idea of moving the user from one site to another through the searches.
Is this still going to happen with your proposal?
On 11/15/18 10:06 AM, emma peel wrote:
[snip]
I like that too, but I think UX wanted search results per portal?
Talk soon,
-hiro
silvia [hiro]:
[snip]
I don't know about doing it project-wide, but I feel that for example support.torproject.org and tb-manual.torproject.org could share search results.
emma peel emma.peel@riseup.net wrote Fri, 16 Nov 2018 07:41:00 +0000:
[snip]
I think this is a good time for figuring out what Tor Project wants from a search function. I've put down a couple of statements sprinkled with questions below. Please jump in and argue against false statements and answer questions where possible. And please add more questions.
- The web site support.tpo needs a search field and a button next to it, resulting in the user seeing a list of matching URLs (and their titles) in their browser.
- What corpus would such a search look at? support.tpo only? support.tpo and tb-manual.tpo? More than that?
- Are there other tpo sites that need/want a search function? Should search results include matches from other tpo sites as well, or only the one the user is currently visiting?
- Sending the user to a separate site, say search.tpo, is considered not UX friendly enough.
- Is search.<site>.tpo good enough?
- Are we limited to using solr, as mentioned in #25322, or can we explore other options?
- User-facing tpo web sites are "on the static rotation" because that's how we can keep them up and running given the resources at hand. Adding dynamic content, i.e. anything that is not "oh, that URL corresponds to this file, let's send it to the user", would not be possible on our current set of VMs given the load we see on user-facing tpo websites. This means that one of the proposed solutions, with web servers proxying requests to a separate service, search.tpo, is not an option. Another argument against proxying is that it breaks the expectation of end-to-end security given by HTTPS.
Linus Nordberg:
[snip]
Thanks Linus! Thoughts below:
> - The web site support.tpo needs a search field and a button next to it, resulting in the user seeing a list of matching URLs (and their titles) in their browser.
Agree.
> - What corpus would such a search look at? support.tpo only? support.tpo and tb-manual.tpo? More than that?
> - Are there other tpo sites that need/want a search function? Should search results include matches from other tpo sites as well, or only the one the user is currently visiting?
I think search should have the capacity to look across the website, with buttons allowing the user to limit results to just the portal they're currently looking at should they choose.
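One way to picture the "search everything, optionally limit to a portal" behaviour is as a filter derived from whichever boxes the user ticks. The Python sketch below assumes a solr-style index with a "portal" field and two hypothetical portal identifiers; neither is specified anywhere in this thread.

```python
# Hypothetical portal identifiers; the thread does not define an index schema.
ALL_PORTALS = ("support", "tb-manual")


def portal_filter(selected, known=ALL_PORTALS):
    """Return a solr-style filter query for the ticked portals.

    Returns None when nothing or everything is ticked, meaning the search
    should cover the whole corpus. Unknown identifiers are ignored.
    """
    wanted = set(selected)
    chosen = [p for p in known if p in wanted]
    if not chosen or len(chosen) == len(known):
        return None
    return "portal:(" + " OR ".join(f'"{p}"' for p in chosen) + ")"
```

Only identifiers from the known list ever reach the filter string, so user input is never interpolated into the query directly.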
> - Sending the user to a separate site, say search.tpo, is considered not UX friendly enough.
> - Is search.<site>.tpo good enough?
I don't have an opinion on this, will defer to UX people.
> - Are we limited to using solr, as mentioned in #25322, or can we explore other options?
> - User-facing tpo web sites are "on the static rotation" because that's how we can keep them up and running given the resources at hand. Adding dynamic content, i.e. anything that is not "oh, that URL corresponds to this file, let's send it to the user", would not be possible on our current set of VMs given the load we see on user-facing tpo websites. This means that one of the proposed solutions, with web servers proxying requests to a separate service, search.tpo, is not an option. Another argument against proxying is that it breaks the expectation of end-to-end security given by HTTPS.
I don't have an opinion on these last two either. :)
Alison
On Fri, Nov 16, 2018 at 10:22:24AM +0100, Linus Nordberg wrote:
> - Are we limited to using solr, as mentioned in #25322, or can we explore other options?
I have vague memories that Isa and Hiro explored other options, like outsourcing it to duckduckgo, but apparently the user flow was horrible. So, I don't know what constraints we want now, but there is some history of exploring other options.
> - User-facing tpo web sites are "on the static rotation" because that's how we can keep them up and running given the resources at hand. Adding dynamic content, i.e. anything that is not "oh, that URL corresponds to this file, let's send it to the user", would not be possible on our current set of VMs given the load we see on user-facing tpo websites. This means that one of the proposed solutions, with web servers proxying requests to a separate service, search.tpo, is not an option.
If there's some way to limit the number of searches (proxypasses) going at once, so a crawler doesn't take down (fill all the slots of) all of our static webservers, this idea might still be worth exploring. I feel a bit bad putting in place something that is so obviously going to be a source of ongoing pain, but I don't know of amazing better options that match all the other goals.
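The idea of capping the number of searches in flight can be sketched at the application level as follows. This is only an illustration in Python of the general technique (a bounded semaphore with a non-blocking acquire); in practice the cap might be better enforced in the webserver itself, and the class name, the limit and the 503 response are assumptions made for the example.

```python
import threading


class SearchGate:
    """Allow at most `limit` searches in flight; reject the rest immediately."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def run(self, handler, *args):
        # Non-blocking acquire: when the gate is full we answer right away
        # instead of queueing, so a crawler cannot hold webserver slots open.
        if not self._slots.acquire(blocking=False):
            return 503, "search is busy, please retry shortly"
        try:
            return 200, handler(*args)
        finally:
            self._slots.release()
```

Rejecting immediately rather than waiting is the point of the design: excess search traffic costs one cheap response each, and ordinary static pages keep their slots.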
> Another argument against proxying is that it breaks the expectation of end-to-end security given by HTTPS.
If we're proxying to another service running *on that same machine*, then I think we're ok on this point. It's just if we have some central separate search service that it would be a problem. So for example if solr is our choice, we could run a replicated solr on each webserver.
--Roger
Hi all, I just realized this conversation stalled, and I'd like to bump it and figure out next steps.
Alison
Alison Macrina Community Team Lead The Tor Project
Hi all,
On 2/11/19 4:07 PM, Alison Macrina wrote:
[snip]
Thanks Alison for bumping this up again.
A while ago, Linus shared this summary (copied below).
As I understand it, UX would like each website to have its own search URL.
So if a user is on support.tp.o, they would access search at support.tp.o/search.
The issue on the sysadmin side is that our websites are static. So in order to run some search service that displays results on website.tp.o/search we would have to upgrade our infrastructure and spend considerably more.
Would something like search.website.tp.o work instead?
So we would have search.support.tp.o instead of support.tp.o/search. How does that sound?
Cheers,
-hiro
On 11/16/18 9:22 AM, Linus Nordberg wrote:
I think this is a good time for figuring out what Tor Project wants from a search function. I've put down a couple of statements sprinkled with questions below. Please jump in and argue against false statements and answer questions where possible. And please add more questions.
- The web site support.tpo needs a search field and a button next to it, resulting in the user seeing a list of matching URLs (and their titles) in their browser.
- What corpus would such a search look at? support.tpo only? support.tpo and tb-manual.tpo? More than that?
- Are there other tpo sites that need/want a search function? Should search results include matches from other tpo sites as well, or only the one the user is currently visiting?
- Sending the user to a separate site, say search.tpo, is considered not UX friendly enough.
- Is search.<site>.tpo good enough?
- Are we limited to using solr, as mentioned in #25322, or can we explore other options?
- User-facing tpo web sites are "on the static rotation" because that's how we can keep them up and running given the resources at hand. Adding dynamic content, i.e. anything that is not "oh, that URL corresponds to this file, let's send it to the user", would not be possible on our current set of VMs given the load we see on user-facing tpo websites. This means that one of the proposed solutions, with web servers proxying requests to a separate service, search.tpo, is not an option. Another argument against proxying is that it breaks the expectation of end-to-end security given by HTTPS.
silvia [hiro]:
Hi all,
Re-adding Linus, since he got bumped from my reply.
[snip]
I don't see any issue with doing it that way. Could the user check boxes to select just the support portal (or another portal)?
Alison
Hi all!
On 2/11/19 5:46 PM, Alison Macrina wrote:
[snip] I don't see any issue with doing it that way. Could the user check boxes to select just the support portal (or another portal)?
Alison
Sure we can. The UX can be designed however we want; the only constraint is the URL.
It would help if we could use URLs of the form search.support.torproject.org, search.tb-manual.torproject.org, and so on.
Talk soon,
-hiro
Hi everyone,
Resurrecting this thread and adding Emma, since she’s been asking about it, and anarcat, since he might need to be in the loop as well.
Let’s just go ahead and use subdomain URLs for search, as that seems to be the easiest option, and not making a decision either way means nothing moves forward :)
We can always iterate on it if other options become available.
Thanks!
Pili
--
Project Manager: Tor Browser, UX and Community teams
pili at torproject dot org
gpg 3E7F A89E 2459 B6CC A62F 56B8 C6CB 772E F096 9C45
On 12 Feb 2019, at 14:34, silvia [hiro] hiro@torproject.org wrote:
[snip]