[or-cvs] r18784: {projects} Add discussion on considering bandwidth shortage as an econo (projects/performance)

sjm217 at seul.org
Thu Mar 5 18:12:16 UTC 2009


Author: sjm217
Date: 2009-03-05 13:12:15 -0500 (Thu, 05 Mar 2009)
New Revision: 18784

Added:
   projects/performance/equilibrium.R
Modified:
   projects/performance/Makefile
   projects/performance/performance.bib
   projects/performance/performance.tex
Log:
Add discussion on considering bandwidth shortage as an economic problem, with potential solutions

Modified: projects/performance/Makefile
===================================================================
--- projects/performance/Makefile	2009-03-05 14:35:36 UTC (rev 18783)
+++ projects/performance/Makefile	2009-03-05 18:12:15 UTC (rev 18784)
@@ -6,6 +6,7 @@
 	node-selection/optimum-selection-probabilities.pdf \
 	node-selection/relative-selection-probabilities.pdf \
 	node-selection/vary-network-load.pdf \
+	equilibrium.pdf \
 	performance.bbl
 
 %.pdf %.aux: %.tex
@@ -24,6 +25,9 @@
 node-selection/%.pdf: node-selection/%.R
 	cd node-selection; R CMD BATCH --vanilla ../$<
 
+%.pdf: %.R
+	R CMD BATCH --vanilla $<
+
 clean:
 	rm -f *~ \
 	      *.Rout \

Added: projects/performance/equilibrium.R
===================================================================
--- projects/performance/equilibrium.R	                        (rev 0)
+++ projects/performance/equilibrium.R	2009-03-05 18:12:15 UTC (rev 18784)
@@ -0,0 +1,156 @@
+###
+### Draw graph showing relationship between network bandwidth and
+### performance, by treating the system as a market for throughput.
+###
+### See related blog post for the details:
+###  http://www.lightbluetouchpaper.org/2007/07/18/economics-of-tor-performance/
+###
+### Steven J. Murdoch <http://www.cl.cam.ac.uk/~sjm217/>, 2007-07-18
+###
+
+###
+### Function declarations
+###
+
+### Plot and annotate a point at the intersection of two curves
+plotIntersect <- function(x, y1=NULL, y2=NULL, i=NULL, ann=NULL, adj=NULL) {
+  if (is.null(i))
+    iIntersect <- which.min(abs(y1-y2))
+  else
+    iIntersect <- i
+  
+  points(x[iIntersect], y1[iIntersect], pch=20, col="darkred")
+
+  if (!is.null(ann))
+    text(x[iIntersect], y1[iIntersect], ann, adj=adj)
+    
+  return(iIntersect)
+}
+
+###
+### Customization
+###
+
+## Number of Tor users (vague estimate)
+uUsersTotal <- 1e5
+
+## Bandwidth (bits/sec) of the Tor network (from
+## http://www.noreply.org/tor-running-routers/) as of July 2007
+rBandwidthTotal <- ( 120     # 120 MByte/sec
+                     * 1e6   # convert to byte/sec
+                     * 8 )   # convert to bit/sec
+
+## Range for plotting number of users
+rgUsers <- c(uUsersTotal* 0.4, uUsersTotal * 1.6)
+
+## Number of points to plot on X axis
+cUserPoints <- 300
+
+###
+### Generate curves
+###
+
+## Number of users
+uUsers <- seq(rgUsers[1], rgUsers[2], length.out=cUserPoints)
+
+## Supply curve: (bandwidth per user) * (number of users) = rBandwidthTotal
+rSupply <- rBandwidthTotal / uUsers
+
+## Supply curve, for a 50% increase in total bandwidth
+rBandwidthTotalNew <- rBandwidthTotal * 1.5
+rSupplyNew <- rBandwidthTotalNew / uUsers
+
+## Demand curve (arbitrary values, just something that looks OK)
+rDemandB <- 5e-12*(uUsers^3) + 6000
+
+## Second demand curve
+rDemandC <- 25*(uUsers-20000)^0.5
+## Find intersect between rDemandB and rSupply
+iIntersectOrig <- which.min(abs(rSupply-rDemandB))
+## Find how much rDemandC needs to be shifted to intersect at same point
+rDiff <- rSupply[iIntersectOrig] - rDemandC[iIntersectOrig]
+## Shift curve
+rDemandC <- rDemandC + rDiff
+## Find user count at intersection
+uIntersectOrig <- uUsers[iIntersectOrig]
+
+###
+### Plotting
+###
+
+## Color declarations (selected from colour-blind-safe palette)
+colDemand <- "#336600"
+colSupply <- "#9900CC"
+
+## Open output file (increase size, so it can be scaled down later and
+## anti-aliased)
+RequestedSize <- Sys.getenv("IMAGESIZE")
+if (RequestedSize=="") {
+  ImageSize <- c(440, 290)
+} else {
+  ImageSize <- as.numeric(strsplit(RequestedSize, split="x")[[1]])
+}
+
+ScaleFactor <- 2
+ImageSize <- ImageSize * ScaleFactor
+pdf("equilibrium.pdf")
+
+## Style declarations
+par(mar=c(2.2,2.2,0.1,0.1))
+par(lwd=3)
+par(col="black")
+
+## Define plotting area and plot supply curve
+plot(uUsers, rSupply, ann=FALSE, frame.plot=FALSE, type="l", col=colSupply, axes=FALSE)
+title(xlab = "Number of users", line=1)
+title(ylab = "Throughput per user", line=1)
+
+## Draw other curves
+lines(uUsers, rSupplyNew, col=colSupply)
+lines(uUsers, rDemandB, col=colDemand)
+lines(uUsers, rDemandC, col=colDemand)
+lines(rep(uIntersectOrig, 2), c(6376.615, 24055.385), col=colDemand)
+
+## Plot and annotate intersection
+iIntersectA <- plotIntersect(uUsers, rSupplyNew, i=iIntersectOrig, ann="A", adj=c(-1.2,0.1))
+iIntersectB <- plotIntersect(uUsers, rSupplyNew, y2=rDemandB, ann="B", adj=c(-2,0.3))
+iIntersectC <- plotIntersect(uUsers, rSupplyNew, y2=rDemandC, ann="C", adj=c(0.5,1.8))
+
+## Arrow showing movement of supply curve
+arrows(60023.34, 17409.23, 73314.21, 18406.15, lwd=1)
+
+## Hide values (since they are basically guesswork and not important anyway)
+axis(1, par("usr")[1:2], label=FALSE)
+axis(2, par("usr")[3:4], label=FALSE)
+
+## Labels
+text(41634.54, 23942.4, "Supply", adj=c(-0.1,0.5))
+text(137844.4, 18810.24, "Demand", adj=c(-0.2,0.5))
+
+## Save output file
+dev.off()
+
+###
+### Collect some stats
+###
+
+## Original values
+uOrig <- uUsers[iIntersectOrig]
+rOrig <- rSupply[iIntersectOrig]
+print(sprintf("Original users: %.0f, bandwidth %.0f", uOrig, rOrig))
+
+## Percentage change
+print(sprintf("Intersect A users: %.0f%%, bandwidth %.0f%%",
+              uUsers[iIntersectA]/uOrig*100, rSupplyNew[iIntersectA]/rOrig*100))
+print(sprintf("Intersect B users: %.0f%%, bandwidth %.0f%%",
+              uUsers[iIntersectB]/uOrig*100, rSupplyNew[iIntersectB]/rOrig*100))
+print(sprintf("Intersect C users: %.0f%%, bandwidth %.0f%%",
+              uUsers[iIntersectC]/uOrig*100, rSupplyNew[iIntersectC]/rOrig*100))
+
+###
+### Old intersection was 94180 users, 10193 bps each
+### Intersection A (constant demand) is 100% of old users, and 150% of bandwidth
+### Intersection B is 118% of old users and 127% of bandwidth
+### Intersection C is 133% of old users and 113% of bandwidth
+###
+

Modified: projects/performance/performance.bib
===================================================================
--- projects/performance/performance.bib	2009-03-05 14:35:36 UTC (rev 18783)
+++ projects/performance/performance.bib	2009-03-05 18:12:15 UTC (rev 18784)
@@ -138,4 +138,29 @@
   type =         {RFC},
   number =       {4301},
   month =        {December}
-}
\ No newline at end of file
+}
+
+@inproceedings{wendolsky-pet2007,
+  title = {Performance Comparison of low-latency Anonymisation Services from a User Perspective}, 
+  author = {Rolf Wendolsky and Dominik Herrmann and Hannes Federrath}, 
+  booktitle = {Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007)}, 
+  year = {2007}, 
+  month = {June}, 
+  address = {Ottawa, Canada}, 
+  editor = {Nikita Borisov and Philippe Golle}, 
+  publisher = {Springer}, 
+  www_section = {Anonymous communication}, 
+  bookurl = {http://petworkshop.org/2007/}, 
+  www_pdf_url = {http://petworkshop.org/2007/papers/PET2007_preproc_Performance_comparison.pdf},
+}
+
+
+@Misc{economics-tor,
+  author = 	 {Steven J. Murdoch},
+  title = 	 {Economics of {Tor} performance},
+  howpublished = {Light Blue Touchpaper},
+  month = 	 {18 July},
+  year = 	 {2007},
+  note = 	 {\url{http://www.lightbluetouchpaper.org/2007/07/18/economics-of-tor-performance/}},
+}
+

Modified: projects/performance/performance.tex
===================================================================
--- projects/performance/performance.tex	2009-03-05 14:35:36 UTC (rev 18783)
+++ projects/performance/performance.tex	2009-03-05 18:12:15 UTC (rev 18784)
@@ -178,17 +178,6 @@
 
 \subsection{Priority for circuit control cells, e.g. circuit creation}
 
-
-\section{Some users add way too much load}
-
-\subsection{Squeeze loud circuits}
-\subsection{Snipe bittorrent}
-\subsection{Throttle at the client side}
-\subsection{Default exit policy of 80,443}
-\subsection{Need more options here, since these all suck}
-
-
-
 \section{Simply not enough capacity}
 
 \subsection{Tor server advocacy}
@@ -498,6 +487,73 @@
 other than as a periodic keep-alive.
 
 
+\section{Some users add way too much load}
+
+If, for example, the measures above doubled the effective capacity of the Tor network, the na\"{\i}ve hypothesis is that users would experience twice the throughput.
+Unfortunately this is not true, because it assumes that the number of users does not vary with the bandwidth available.
+In fact, as the supply of the Tor network's bandwidth increases, there will be a corresponding increase in the demand for bandwidth from Tor users.
+Simple economics shows that the performance of Tor, and of other anonymization networks, is controlled by how the number of users scales with available bandwidth, which can be represented by a demand curve.\footnote{This section is based on a blog post published in Light Blue Touchpaper~\cite{economics-tor}; the property discussed was also observed by Andreas Pfitzmann in response to a presentation at the PET Symposium~\cite{wendolsky-pet2007}.}
+
+\begin{figure}
+\includegraphics{equilibrium}
+\caption{Hypothetical supply and demand curves for Tor network resources}
+\label{fig:equilibrium}
+\end{figure}
+
+\prettyref{fig:equilibrium} is the typical supply and demand graph from economics textbooks, except with long-term throughput per user substituted for price, and number of users substituted for quantity of goods sold.
+Also, it is inverted, because users prefer higher throughput, whereas consumers prefer lower prices.
+Similarly, as the number of users increases, the bandwidth supplied by the network falls, whereas suppliers will produce more goods if the price is higher.
+
+In drawing the supply curve, I have assumed the network's bandwidth is constant and shared equally over as many users as needed.
+The shape of the demand curve is much harder to even approximate, but for the sake of discussion, I have drawn three alternatives.
+We will return to these assumptions later.
+The number of Tor users and the throughput they each get is the intersection between the supply and demand curves -- the equilibrium.
+If the number of users is below this point, more users will join and the throughput per user will fall to the lowest tolerable level.
+Similarly, if the number of users is too high, some will be getting lower throughput than their minimum, so will give up, improving the network for the rest of the users.
+
+Now assume Tor's bandwidth grows by 50\% -- the supply curve shifts, as shown in the figure.
+By comparing how the equilibrium moves, we can see how the shape of the demand curve affects the performance improvement that Tor users see.
+If the number of users is independent of performance, shown in curve A, then everyone gets a 50\% improvement, which matches the na\"{\i}ve hypothesis.
+More realistically, the number of users increases as throughput improves, so the performance gain is smaller; the shallower the demand curve, the smaller the performance increase will be.
+For demand curve B, there is an 18\% increase in the number of Tor users and a 27\% increase in throughput; whereas with curve C there are 33\% more users and so only a 13\% increase in throughput for each user.
+
+In an extreme case where the demand curve points down (not shown), as the network bandwidth increases, performance for users will fall.
+Products exhibiting this type of demand curve, such as designer clothes, are known as Veblen goods.
+As the price increases, their value as status symbols grows, so more people want to buy them.
+I don't think it is likely to be the case with Tor, but there could be a few users who might think that the slower the network is, the better it is for anonymity.
+
+To keep the explanation simple, I have made quite a few assumptions, some more reasonable than others.
+For the supply curve, I assume that all Tor's bandwidth goes into servicing user requests, it is shared fairly between users, there is no overhead when the number of Tor clients grows, and the performance bottleneck is the network, not clients.
+I don't think any of these are true, but the difference between the ideal case and reality might not be significant enough to nullify the analysis.
+The demand curves are basically guesswork -- it's unlikely that the true one is as nicely behaved as the ideal ones shown.
+It is more likely to be a combination of the different classes, as different user communities come into relevance.
+
+I glossed over the aspect of reaching equilibrium -- in fact it could take some time between the network bandwidth changing and the user population reaching stability.
+If this period is sufficiently long and network bandwidth is sufficiently volatile it might never reach equilibrium.
+I've also ignored effects which shift the demand curve.
+In normal economics, marketing makes people buy a product even though they considered it too expensive.
+Similarly, a Slashdot article or news of a privacy scandal could make Tor users more tolerant of the poor performance.
+Finally, the user perception of performance is an interesting and complex topic, which I've not covered here.
+I've assumed that performance is equivalent to throughput, but actually latency, packet loss, predictability, and their interaction with TCP/IP congestion control are important components too.
+
+\subsection{Differential pricing for Tor users}
+
+The above discussion has argued that the speed of an anonymity network will converge on the slowest level that the most tolerant users will consider usable.
+This is problematic because there is significant variation in levels of tolerance between different users and different protocols.
+Most notably, file sharing users are subject to high-profile legal threats, and do not require interactive traffic, so will continue to use a network even if the performance is considerably lower than the usable level for web browsing.
+
+In conventional markets, this type of problem is solved by differential pricing, for example different classes of seat on airline flights.
+In this model, several equilibrium points are allowed to form, and the one chosen will depend on the cost/benefit tradeoffs of the customers.
+A similar strategy could be used for Tor, allowing interactive web browsing users to get higher performance, while forcing bulk data transfer users to have lower performance (but still tolerable for them).
+Alternatively, the network could be configured to share resources in a manner such that the utility to each user is more equal.
+In this case, it will be acceptable to all users that a single equilibrium point is formed, because its level will no longer be in terms of simple bandwidth.
+
+\subsection{Squeeze loud circuits}
+\subsection{Snipe bittorrent}
+\subsection{Throttle at the client side}
+\subsection{Default exit policy of 80,443}
+\subsection{Need more options here, since these all suck}
+
 \section{Last thoughts}
 
 \subsection{Metrics}
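
The percentages quoted for curves A, B and C in the performance.tex hunk above can be checked against the supply-curve assumption stated there (constant total bandwidth, shared equally between users). The worked equations below are a sketch of that check, not part of the commit. The symbols are assumptions introduced here for illustration: R is total network bandwidth, u the number of users, t the throughput per user, and primes mark values after the 50% bandwidth increase. The user-growth inputs are the rounded figures from the text, so the results agree only to rounding.

```latex
% Supply curve assumed in the text: total bandwidth R shared equally by u users.
\[ t = \frac{R}{u} \]

% After the bandwidth increase R' = 1.5 R, the equilibrium moves to u' users.
\[ \frac{t'}{t} = \frac{R'/u'}{R/u} = \frac{1.5}{u'/u} \]

% Curve A (user count unchanged, u'/u = 1.00): the naive 50% gain.
\[ \frac{t'}{t} = \frac{1.5}{1.00} = 1.50 \]

% Curve B (18% more users, u'/u = 1.18): roughly a 27% gain.
\[ \frac{t'}{t} = \frac{1.5}{1.18} \approx 1.27 \]

% Curve C (33% more users, u'/u = 1.33): roughly a 13% gain.
\[ \frac{t'}{t} = \frac{1.5}{1.33} \approx 1.13 \]
```

The ratio form makes the argument of the section explicit: under this supply model the throughput gain is the bandwidth growth factor divided by the user growth factor, so any growth in users eats directly into the improvement each user sees.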


