[tor-project] Requiring JavaScript on Tor Metrics

tl tl at rat.io
Wed Nov 16 01:07:08 UTC 2016


Karsten asked me back in January to investigate possible solutions to this problem, and I reported back to the metrics-team mailing list [0]. That mail contains a more detailed evaluation of technical approaches and available solutions.

The gist of it is that there’s no need to choose between static pages and interactive JavaScript; one can have the best of both worlds: a website that starts out static, with plain old HTML and graphs as static images to look at, but becomes interactive _if_and_as_soon_as_ JavaScript is enabled (a hint to those without JS that they are missing interactive exploration of the data might be appropriate).
The technique behind this is called "isomorphic" rendering. It has been a hot topic during the last 2 years and is now implemented in many major frameworks - see below [1] for some background.

The real question here is not what is possible but how much effort it takes and whether it’s worth it. Everything non-trivial on the web needs some JavaScript and benefits from clever use of the available frameworks. It is also easier to build a JavaScript-only version (or a static-only version) than one that provides both (an "isomorphic" one), but the latter is absolutely possible and is done in a lot of high-profile places (Twitter and Airbnb, for example). [2]

As to whether it’s worth it, I’d say: yes. Nobody should need to switch back to medium security just to view basic graphs if that’s all they need. The degree to which websites rely on JavaScript today is ridiculous (and dangerous), and there’s no need to support and reinforce that trend. OTOH interactive graphs are definitely not just eye candy. They can provide controls and levers that let the user manipulate them, tweak time frames, add or remove dimensions, and thereby interactively get a much better understanding of what’s going on in the Tor network and how it works. That chance should not be wasted.

A mildly interactive version without JavaScript but with form buttons and server round trips could provide a middle ground of functionality. But this version would probably be tedious to maintain and will never provide the same level of interactivity and fluidity, since the clumsiness of forms and the need for a server round trip on every change make it tedious and time-consuming to use.

Regarding R/Shiny: I don’t know R/Shiny, but some googling for "isomorphic R/Shiny" didn’t unearth anything meaningful. So I guess it doesn’t support creating static representations in the absence of JavaScript. That would not be surprising, since R/Shiny is really a niche product that hasn’t much incentive to go there.
R/Shiny is probably an easy route to take in the short term, but besides the lack of a static version it also IMHO has too many lock-ins. R/Shiny only works with R, and while R is very common it’s not the only data manipulation language around. Any meaningful data manipulation tool can serialize to CSV or JSON, and that’s all you need to feed a visualization engine like D3.js (D3.js is the predominant visualization framework on the web, but there are others that are less powerful and less intimidating).
Loose coupling of the analytics engine (whichever one, as long as it exports at least CSV or JSON), the web framework (in any case isomorphic, maybe AngularJS because of Apache Zeppelin [3]) and a visualization engine (like D3.js) would provide a much more flexible and resilient solution.
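
To make the loose coupling a bit more concrete, here is a minimal sketch in Python of the hand-off between an analytics engine and a visualization engine: a CSV export is turned into the JSON array of records that a library like D3.js can consume. The column names (`date`, `relays`) and the sample values are made up for illustration; they are not the actual Tor Metrics file format.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (with a header row) into a JSON array of objects.

    All values stay strings, exactly as they appear in the CSV; converting
    them to numbers or dates is left to the consumer.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)


# Hypothetical export from any analytics tool (R, Python, a database, ...).
SAMPLE_CSV = "date,relays\n2016-11-14,7000\n2016-11-15,7100\n"

if __name__ == "__main__":
    print(csv_to_json(SAMPLE_CSV))
```

The point of the sketch is only that nothing on the visualization side needs to know which tool produced the CSV.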

Regarding Roger’s question: saving SVG graphs from the browser to a static, local PNG file is possible but depends on deploying yet another JavaScript library (phantom.js). The upside is that you can save not only the initially shown graph but also any result of your interaction with it. But it’s also possible to serve PNGs first and then, if JS is enabled, go interactive. [4]

Regarding Teor’s question whether client-side rendering provides anonymity advantages: the size of the data to be transferred to the client is not to be underestimated, and it grows quickly. If you need hourly data, then the volume is 24x that of daily data. If you want to look at it by country, then you might have to transfer 2 orders of magnitude more data (of course it all depends on how elaborately the application is structured and on how outlandish your information desire is). As a rule of thumb you can expect to quickly reach hundreds of KB or a few MB. It doesn’t seem efficient to try to hide your real interest in a noisy web app. You should probably rather download the data in big chunks and analyze it in a local application.
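
A back-of-the-envelope version of that rule of thumb, as a small Python sketch. All three constants are my assumptions for illustration (bytes per data point, one year of data, roughly 250 country codes), not measured values from the metrics site.

```python
BYTES_PER_POINT = 20  # assumption: e.g. "2016-11-15,7100\n" plus some overhead
DAYS = 365            # assumption: one year of data
COUNTRIES = 250       # assumption: roughly the number of country codes


def payload_bytes(points_per_day: int, series: int, days: int = DAYS) -> int:
    """Estimated transfer size for `series` time series over `days` days."""
    return points_per_day * series * days * BYTES_PER_POINT


daily = payload_bytes(points_per_day=1, series=1)
hourly = payload_bytes(points_per_day=24, series=1)
daily_by_country = payload_bytes(points_per_day=1, series=COUNTRIES)

if __name__ == "__main__":
    for label, size in [("daily", daily), ("hourly", hourly),
                        ("daily by country", daily_by_country)]:
        print(f"{label}: {size / 1000:.1f} KB")
```

Under these assumptions a single daily series is about 7 KB, the hourly one about 175 KB, and daily data by country about 1.8 MB.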

Regarding another question of Teor’s: any solution will produce a visualization embedded in a web page that you can link to just as you do now. But right now you have no guarantee that the graph pictured on the page you're linking to won’t change (although I don’t know about the PNGs on the metrics site themselves - maybe they are preserved). Providing a link to a visualization of precisely some chunk of data with specific parameters is in principle possible with an interactive solution. In practice, though, it depends on the framework - not all of them provide a URL for every state in interactive web applications, and the addition of D3.js makes things even more complicated. This would have to be evaluated.
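
What "a URL for every state" could mean in practice, as a minimal Python sketch: the parameters that define a graph are encoded in the query string, so the link itself reproduces the view. The base URL and the parameter names (`graph`, `start`, `end`) are hypothetical and not the existing Tor Metrics URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://metrics.example.org/graph"  # placeholder, not a real endpoint


def state_to_url(state: dict) -> str:
    """Encode the graph state as a query string.

    Keys are sorted so that the same state always yields the same URL.
    """
    return BASE_URL + "?" + urlencode(sorted(state.items()))


def url_to_state(url: str) -> dict:
    """Recover the graph state from a URL produced by state_to_url."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


if __name__ == "__main__":
    url = state_to_url({"graph": "relays", "start": "2016-08-01", "end": "2016-11-15"})
    print(url)
    print(url_to_state(url))
```

Whether a given framework keeps such a URL in sync with every interaction is exactly the part that would have to be evaluated.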

(full disclosure: I went under 3 different handles in the past - "tomlurge", "thms" and "oma" - which is entirely my fault and not cool. but it’s all just me)

[0] https://lists.torproject.org/pipermail/metrics-team/2016-January/000036.html
   There’s also some CMS discussion mixed in - maybe just try to ignore that. 
   The question on security was resolved later but might surface again with 
   every new JS library needed.

[1] With "web 2.0" a lot of big websites started to use JavaScript-backed web
   frameworks that did work in the browser which before was done on the
   server. It was a logical evolution of AJAX/web 2.0 and made websites ever
   richer in functionality and shininess. But as more and more functionality
   moved from the server into JavaScript libraries and towards the browser,
   initial page loads became slower, and slow initial page loads turn
   customers away. The solution to this problem was the "isomorphic" web
   framework: the framework can render the page in the browser or, with the
   same logic, on the server (like in the old days, but with a
   JavaScript-based engine, almost always running in a JavaScript
   environment called Node.js). So it can either dump all the data and code
   on the browser and let it do the work, or serve a finished HTML page.
   The slow-load problem is solved when only the start page and some initial
   state are rendered server-side: the initial page load in the browser is
   very fast, while in the background the JS libraries and additional data
   are transferred and, transparently to the user, the page in the browser
   becomes interactive. Because this technique provides the best of both
   worlds and solves a real problem - losing users because of long page
   loads - it has been implemented in many major web frameworks during the
   last 2 years.

[2] It is possible to use this technique with JavaScript graphing libraries. 
   D3.js is the most prevalent of those libraries and there are examples around
   on how to generate SVG graphs with D3 either on the server or make them 
   render on the client. The whole solution is not without complexities but 
   it’s not rocket science either.

[3] An interesting route to examine might be Apache Zeppelin: a tool to
   visualize data and generate interactive, shareable "notebooks" about
   findings in the data. Those findings (data and visualizations) can also be
   sent around and embedded in web pages.
   Technically interesting is that it is not primarily a web tool but is built
   with web technologies, namely D3.js and the AngularJS web framework.
   So far I haven’t had time to check for myself, but maybe there’s a way to
   drive both a website and that tool with the same data, providing a range
   - from static images of graphs
   - to interactive versions on the web
   - to full-blown locally run analytics.
   That to me seems like a path worth investigating with respect to the website
   and generally better accessibility of metrics data.

[4] On a website that is static at first, these graphs can be rendered as SVG 
   (which works without JavaScript) but could be PNG as well (although that’s a 
   little harder). Users that have JavaScript enabled would seamlessly get 
   additional interactive controls that allow them to tweak parameters of the 
   visualization and drill into the data. If they don’t have JavaScript 
   enabled, those controls don’t show up and the graph remains static. 
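
A minimal sketch of the static half of [4]: producing an SVG line chart as a plain string on the server, with no JavaScript involved. I'm using Python here only to keep the example short and self-contained; the approach discussed above would do the same with D3.js under Node.js. The function name, the chart dimensions and the sample values are made up for illustration.

```python
from html import escape


def render_svg(values, title, width=400, height=200, pad=10):
    """Return a static SVG line chart for a list of numbers."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a flat series
    step = (width - 2 * pad) / (len(values) - 1)
    # SVG's y axis points down, so larger values get smaller y coordinates.
    points = " ".join(
        f"{pad + i * step:.1f},{height - pad - (v - lo) / span * (height - 2 * pad):.1f}"
        for i, v in enumerate(values)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f"<title>{escape(title)}</title>"
        f'<polyline fill="none" stroke="black" points="{points}"/>'
        "</svg>"
    )


if __name__ == "__main__":
    print(render_svg([7000, 7100, 7050], "Number of relays (made-up values)"))
```

The resulting markup can be embedded directly in the HTML page; a script that loads later could then attach controls to the same element.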

> On 14 Nov 2016, at 12:07, Karsten Loesing <karsten at torproject.org> wrote:
> Hi teor,
> On 14/11/16 10:34, teor wrote:
>>> On 14 Nov. 2016, at 02:54, Karsten Loesing
>>> <karsten at torproject.org> wrote:
>>> Hi everyone,
>>> I'm moving this (old but recently continued) thread from an
>>> internal mailing list to this list as suggested by Roger.  Any
>>> feedback would be most useful by Wednesday, November 16.
>>> Thanks!
>>> All the best, Karsten
>>> On 07/12/15 14:09, Karsten Loesing wrote:
>>>> Hi everyone,
>>>> we're discussing changing the graphing engine on Tor Metrics
>>>> from generating static images on the server using R/ggplot2
>>>> towards generating interactive visualizations using JavaScript
>>>> on the client.
>>>> As a result, Tor Metrics will require Tor Browser users to
>>>> switch to Medium-High Security or lower.
>>>> This switch has some potential with respect to visualizations,
>>>> and the visualization people in the metrics team want to do it.
>>>> It allows producing more graphs like Lunar's bubble graph:
>>>> https://metrics.torproject.org/bubbles.html
>>>> In fact, it would allow all kinds of visualizations like the
>>>> ones seen in the D3 gallery:
>>>> https://github.com/mbostock/d3/wiki/Gallery
>>>> So, my question is: can you all live with this change?
>>>> The next step will be to ask users on tor-talk@, but I figured
>>>> I should ask here first.  If I don't hear any strong objections
>>>> by Thursday, I'll go ask on tor-talk@.
>>>> Thanks!
>>>> All the best, Karsten
>>> On 07/12/15 15:22, somebody else wrote:
>>>> I assume it would be infeasible to maintain non-Javascript 
>>>> fallbacks, correct?
>>>> Likewise, I assume it would be too difficult (for now (?)) to
>>>> try something crazy-fancy like render the javascript versions
>>>> on the server and serve them as images in that case? [0]
>>>> [0] I'm hand-waving about whether or not this is possible, but
>>>> it seems like it in theory, and at least one person has tried:
>>> http://blog.davidpadbury.com/2010/10/03/using-nodejs-to-render-js-charts-on-server/
> On 08/12/15 00:34, somebody wrote:
>>>> I don't understand why you would want Tor users to lower their
>>>> security in order to get prettier graphs.  Is there some
>>>> obvious thing I am missing?
>>>> Personally I run with Javascript disabled on the vast majority
>>>> of web sites I access (via NoScript), and with third party
>>>> "included" Javascript disabled virtually everywhere (via
>>>> PrivacyBadger and sometimes RequestPolicy).  Having web sites
>>>> run their own choice of programs in my computer seems like a
>>>> really stupid idea.  I can only assume that its prevalence as a
>>>> technique reflects how little respect the average web site has
>>>> for the security of its users.
>>> On 09/12/15 08:45, another person wrote:
>>>> I'd like to know how much work it would be.
>>> On 09/12/15 13:59, Karsten Loesing wrote:
>>>> I'll try to find out.  I think that having non-JavaScript 
>>>> fallbacks written in a different language would be infeasible.
>>>> But I could imagine that we render D3.js graphs on the server
>>>> using Node.js and return images to the client.
>>>> I'm mostly worried about the lack of interactivity.  I'd
>>>> really want to get away from requiring a full round-trip from
>>>> client to server and back just to change something in the
>>>> displayed graph. Maybe we can keep them somewhat interactive,
>>>> or at least add tooltips, by using SVG as image format.
>>>> But please understand that while I'm doing okay at processing 
>>>> large amounts of data, I don't know much about web development.
>>>> If people here have ideas, please let me know!
>>>> Note that investigating these options may take a while, mostly 
>>>> because people in the metrics team are busy.  But we won't
>>>> switch to client-side JavaScript before we know about possible 
>>>> alternatives.
>>>> Thanks for the feedback.  This is very helpful!
>>>> All the best, Karsten
>>> On 12/11/16 16:37, Karsten Loesing wrote:
>>>> Hello everyone,
>>>> apologies for digging out such an old thread, but after almost
>>>> starting a new thread on the exact same topic I remembered
>>>> that we had this discussion almost a year ago.  I figured it's
>>>> better to continue this thread to avoid discussing the exact
>>>> same things again and instead add new thoughts below.
>>>> So, on Thursday, Linda, iwakeh, and I met in Berlin to talk
>>>> about making the Tor Metrics website more usable.  We came up
>>>> with some pretty good ideas to better address the various user
>>>> types---from journalists to data scientists---coming to the Tor
>>>> Metrics website to learn interesting things about the deployed
>>>> Tor network.  We'll share results with you in a few weeks from
>>>> now when there's something to see and click on.
>>>> But in this context I want to bring up the topic of JavaScript 
>>>> once more.
>>>> To be clear, our immediate plans---including reorganizing 
>>>> information and displaying sparklines as quick entry points
>>>> into the data---don't require JavaScript on Tor Metrics.  And
>>>> even our planned improvements might work without
>>>> JavaScript---including more customizable graphs and even a
>>>> dashboard with user-selected graphs---can be made to work
>>>> without JavaScript.  Though the latter will be a stretch.
>>>> The question is: do we have to keep stretching by avoiding a
>>>> web technology that would make our lives and the lives of our
>>>> users so much easier?
>>>> This isn't really something that we can work around by
>>>> generating graphs with a JavaScript library on the server.  I'd
>>>> want us to switch to another graphing framework such as R Shiny
>>>> (which I used for the webstats prototype) or D3.js (which we
>>>> currently use on Tor Metrics just for the bubbles graph, though
>>>> not in the most efficient way).
>>>> From a development perspective this switch would make a lot
>>>> of sense, because we'd have to write a lot less code for new
>>>> graphs and because there'd be potential contributors out there
>>>> who'd appreciate working with a known framework.  Our current
>>>> graphing engine doesn't scale much longer, and this does slow
>>>> us down.
>>>> Note that I'm not arguing to use JavaScript in all places, just
>>>> because we can.  I'm in favor of keeping all the parts of Tor 
>>>> Metrics where we provide textual or static information entirely
>>>> JavaScript-free, so that our data will still be available to
>>>> users that don't have or don't want to use JavaScript.  And
>>>> services like ExoneraTor could easily stay JavaScript-free.
>>>> But the parts of Tor Metrics where we're providing
>>>> visualizations and letting users explore our data would require
>>>> JavaScript.  This would include all graphs and tables, because
>>>> we shouldn't be maintaining two graphing engines.
>> Are there any privacy or security advantages to having client-side 
>> JavaScript?
>> For example, if we download data from the server, and then render
>> it on the client using JavaScript, then the server knows less about
>> what the client is requesting and visualising.
>> Are there ways of coding the metrics website in JavaScript that 
>> improve client privacy in this way? (Or does it cost too much
>> effort or too much bandwidth to pull down larger datasets just to
>> hide what the client is looking at?)
> Fine question.  I assume most Tor Metrics users don't care that much
> about leaking to the server what part of the data they're looking at.
> And I assume that those who do might download the CSV files and look
> at them locally.  But my assumptions might be wrong.
> So, the switch to JavaScript may or may not address this.  If we pick
> something like Shiny, graphs will still be generated on the server,
> and there wouldn't be any difference with regard to client privacy.
> If we use D3.js, the browser downloads the data it needs and produces
> a graph locally.
> I think, overall, I wouldn't add Tor Metrics client privacy towards
> the server as a hard requirement, because we already have too many of
> those.  If the solution we decide on offers more client privacy than
> the current solution, great.  But if it doesn't, okay.
>>>> So, how do we decide this?  I believe that this should be a 
>>>> Tor-wide decision.  My main worry is that we're sinking weeks
>>>> and weeks of development effort into this switch without many
>>>> Tor people noticing, and then once we publish and they get
>>>> aware, we need to roll back, wasting all the effort.
>>>> But to be honest, we're wasting effort right now by keeping the
>>>> workarounds and implementing hacks with dynamically generated
>>>> HTML forms and potentially dozens of parameters just to avoid
>>>> the devil called JavaScript.  This feels like a bad use of our
>>>> time.
>>>> Here's my suggestion: unless somebody raises a valid concern
>>>> how requiring JavaScript on Tor Metrics is *bad for Tor*, say,
>>>> by next Wednesday, November 16, we're putting this on the hold
>>>> again. (That's the day before the next metrics team meeting and
>>>> Vegas meeting, and it should be enough time to raise your
>>>> voice.)
>>>> Otherwise we'll switch.
>>>> Oh, and if you're in favor of switching, please consider
>>>> saying that, too.  Thanks.
>>>> All the best, Karsten
>>> On 13/11/16 05:04, Roger Dingledine wrote:
>>>> Does this mean that we'd be breaking the "download a static
>>>> version of the graph" feature too? I use that feature a lot for
>>>> grabbing snapshots to put into presentations, and we also want
>>>> to use it for pulling in metrics graphs in blog posts, e.g. 
>>>> https://blog.torproject.org/blog/tracking-impact-whatsapp-blockage-tor
> as well as for external journalists.
>>>> Oh, I should also say this would be a great topic for the 
>>>> tor-project list, since it doesn't need to be sekrit (and
>>>> since then other people could know that it's a topic we're
>>>> considering, and maybe even help us make the right decision).
>>> On 13/11/16 10:08, Karsten Loesing wrote:
>>>> It's quite easy to implement that feature using Shiny, for 
>>>> example. Either Shiny would produce a .png file that you can 
>>>> download using your browser's "Save Image As..." feature, or
>>>> there would be a "Download Graph" button.  It would be just a
>>>> few lines of code.
>>>> And we'd be able to provide features like a "Download graph
>>>> data" button with just a few more lines of code, which would
>>>> require a lot more effort right now.
>> Just to clarify, would users need JavaScript turned on to use the 
>> "Download Graph" or "Download graph data" buttons?
>> Would it be possible to give others a URL for a specific graph, 
>> without having to save the graph on some other site? Would users
>> need JavaScript enabled to view the graph?
>> If we can't do this, it would be bad for how I and many others 
>> typically use Tor Metrics on mailing lists.
> Right, very good point.  I don't know Shiny enough yet to give you an
> answer with absolute certainty, but I think that we should be able to
> provide a page with a static image that can be used on mailing lists.
> That page would just have no controls for further customizing the
> graph, unless the browser supports JavaScript.
>> (And if we do allow this feature, I'm not sure how that's any 
>> different from server-side rendering of graphs for clients that 
>> don't use JavaScript.)
> One major difference is that Shiny, for example, allows us to generate
> a user interface with just a few lines of code, whereas we'd otherwise
> have to write and maintain that user interface ourselves.  And I'm not
> very optimistic to find a framework similar to Shiny that works
> without client-side JavaScript, mostly because that's not a
> requirement for most websites.
>> I'm also not sure what you mean by "tables" or "Download graph 
>> data" - will there still be CSV data downloads available?
>> Is it the aggregated data that would be in the (JavaScript-only)
>> tables, and the raw data in the static CSVs?
> The following is not at all final yet, but ideally, all parts that are
> updated twice per day in the background can be obtained without
> JavaScript whereas any data that gets filtered or aggregated by user
> request would require JavaScript.  So, the current CSV data would
> still be available without JavaScript.
>> In general, I would prefer to be able to use the Tor Metrics 
>> website without enabling JavaScript. I don't like the idea that we
>> provide Tor Browser, where we recommend(?) people turn JavaScript
>> off, and then also provide websites where they need to turn it on.
>> But if the advantages outweigh the security and consistency
>> considerations, I'd be ok with it.
>> (Just like I use Atlas, and the Metrics Bubble graphs.)
> Right, this possible inconsistency is what has stopped me in the past.
> And I think if we really want to be consistent here, we'll have to
> rewrite Atlas and the Metrics bubble graphs to not require client-side
> JavaScript anymore, which is certainly doable.  And we should probably
> look at other torproject.org websites that require JavaScript in the
> browser and rewrite those parts.
> But do we really want this?  If we recommend that people turn
> JavaScript off, why do we even support modes in Tor Browser where they
> turn it on again, isn't that inconsistent, too?  I think no, because
> we acknowledge that not everybody has the same requirements.  And in
> this case I believe that the target audience of Tor Metrics is not
> exactly the same as the set of high-security Tor Browser users.
> On the other hand I see that weakening the no-JavaScript rule for the
> Tor Metrics case might influence future decisions for other
> torproject.org websites.  Not an easy decision.
>> But my preference would be for users without JavaScript to still 
>> have some level of basic functionality, even if it's only static 
>> images and tables (which I think should be available for offline 
>> use as well).
> Right.  Two examples for parts of Tor Metrics that would require
> JavaScript are (1) the graphs where users can change the x axis,
> change lines, add new lines, and so on, and (2) a dashboard where
> users can configure the graphs they want to see next to each other.
> But users without JavaScript should still be able to navigate the
> website, learn about where the data comes from, download the raw or
> pre-processed data, and contemplate the graphs posted on mailing lists.
> Thanks for your input here!
> All the best,
> Karsten

