[tor-talk] Cryptographic social networking project

contact at sharebook.com
Thu Jan 15 22:04:02 UTC 2015


 

>Hey wait, now you are saying something that doesn't fit what you said
>before.

Oh, I meant we won't use multicasting on top of hidden services; that's
why it becomes very expensive. If we do, then yes, it becomes as
efficient as using the PseudonymousServer.

>Also, I am not so sure running the PseudonymousServer in TOR2WEB mode
>is such a great idea.

What is TOR2WEB doing in your reply? That's something completely
different which has nothing to do with the PseudonymousServer.

>I think the way the Internet is evolving for the worst we should make
>systems that keep their data to themselves rather than store them in
>clouds.

Awesome slogan. 

>If in a distant future the encryption fails us, attackers would
>be able to decrypt what they see right there plus how much they have
>been keeping as a "full take" or "Tor snapshot." That I hope is different
>from being able to access the entire history of all social network
>interactions, because they're all in that cloud.

Criticizing cloud storage based on a hypothetical cryptanalysis
breakthrough is unrealistic. Attackers wiretap communications and pick
up ciphertexts in transit, not just from servers. They can easily
distinguish ciphertexts (e.g. PGP-encrypted emails) from plaintexts and
store them forever, so not having stored the ciphertexts on a server
won't help anyone if attackers break the ciphers themselves.
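To illustrate the point: an eavesdropper doesn't need the server at all to collect ciphertexts, because they are trivially recognizable on the wire. A minimal sketch (the marker string is the standard OpenPGP ASCII-armor header; the sample messages are made up):

```python
# Sketch: distinguishing PGP ciphertext from plaintext in captured traffic.
# The ASCII-armor header is defined by the OpenPGP standard (RFC 4880).
def looks_like_pgp(message: str) -> bool:
    return "-----BEGIN PGP MESSAGE-----" in message

captured = [
    "hi, lunch at noon?",                      # plaintext: not worth archiving
    "-----BEGIN PGP MESSAGE-----\nhQEMA...",   # ciphertext: store forever
]
to_archive = [m for m in captured if looks_like_pgp(m)]
print(f"{len(to_archive)} of {len(captured)} messages archived")
```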

>Also, who pays for Utah-like storage requirements? What is your business model
>for financing the sharebook cloud servers?

It's not Utah-like. We can get donations if people really use the
application at mass scale (think of Wikipedia).

>That's not true, you are still assuming distribution optimization is
>impossible.

I didn't say distribution optimization is impossible; I said we already
have distribution optimization on the PseudonymousServer itself, so what
do you want to optimize then?

>And here you are repeating the theory that we would be making new
>circuits for each delivery which I thought was my misunderstanding
>three mails ago, why are you making it yours now? There are no 6/7 hops
>there, in neither the sharebook or the RP-multicast scenario.

I answered this above: I thought we didn't have multicasting over hidden
services.

>You are trading in scalability for what you think is the necessary
>cryptography but researchers seem to be of a different opinion as the
>following papers show.

>2009, "De-anonymizing Social Networks" by Arvind Narayanan and Vitaly
>Shmatikov is about correlating Twitter and Flickr users.
>Is this really what you mean? Sounds pretty off-topic to me.

Why do you think our one-to-many pseudonym graph would be different
from Flickr's? Pubsub attaches metadata to the pseudonymous vertices
that can be used for analyzing them.

>2000, "Xor-trees for efficient anonymous multicast and reception"
>2002, "Hordes -- A Multicast Based Protocol for Anonymity"
>2004, "AP3: Cooperative, decentralized anonymous communication"
>2006, "M2: Multicasting Mixes for Efficient and Anonymous Communication"
>2006, "Packet coding for strong anonymity in ad hoc networks"
>2007, "Secure asynchronous change notifications for a distributed file system"
>2011, "Scalability & Paranoia in a Decentralized Social Network."
>2013, "Design of a Social Messaging System Using Stateful Multicast."
>The last two are our own. I'm afraid I can't find a paper that supports
>your bold assertion there. You will have to help me.

Anonymity is a different topic; I'm talking about compromising social
graphs. For instance, in the Netflix attack the vertices are already
anonymous, but the attackers matched data from IMDB, which exposed real
identities for certain patterns, against similar patterns in the
anonymized Netflix dataset to deanonymize them. And they really did!

>Since I'm not a paid researcher I have not read all of these papers,
>but it does so far look like there is a majority in favor of our
>architecture rather than yours.

I guess you searched for "social network anonymity" on Google Scholar
and just sent me the results. But those papers are not protections
against the link-prediction algorithms that attackers use for
deanonymizing social graphs. They talk about cucumbers, not apples.

>The disadvantages of requiring a storage cloud are more heavy-weight.

If "disadvantage" means a deanonymization attack that breaks our threat
model (the attacker can't break Tor; the majority of exit nodes aren't
colluding with the attacker), then explain it. But if "disadvantage"
means depending on a feudal vendor rather than having fun with a liberal
distributed network, then we will try to overthrow any feudal part as
soon as possible; it's just very hard to do that while we can't find a
distributed alternative.

>I challenge that, at least in the current Tor network. If the attacker
>applies traffic shaping to the outgoing notification. Only if the
>notification has a fixed size the third hop can avoid replicating the
>shaped traffic and thus allow an observer to see which rendez-vous
>points are being addressed - possibly de-anonymizing many involved
>hidden services behind them. Probably there is even a chance of
>de-anonymization if notifications had a fixed size, since the third hop
>will suddenly be busy sending out all similarly shaped packets to 167 RPs.

First: as I said, in our threat model we assume that the majority of ORs
aren't colluding with the attacker at the same time, and we assume
anonymity works (the attacker can't deanonymize Tor).

Second: an observer can see which RPs Alice's third hop sends packets
to. But the attacker can't determine whether the 167 packets that the
third-hop OR sends to 167 RPs come from Alice to her 167 friends, or
from 167 different people at that OR, each sending one packet to one of
their friends at one RP; in that scenario the connections between
pseudonyms stay linear. Pubsub, on the other hand, as far as I
understand it (maybe I got it wrong), creates a subscription between the
root sender and the leaf receivers, which reveals the connection to an
in-between observer whenever the root multicasts a packet to its leaf
subscribers.

>I challenge that as well. Given a high latency packet-oriented multicast
>system being fed from the third hop, distributing the content to a network
>of reception points, the maximum de-anonymization that can be achieved
>is by p0wning some nodes, seeing some fragments of somebody's trees,
>still not being able to tell where the stuff came from and where it
>will end up.

The first question is: how much bandwidth, in numbers, does a
high-latency packet-oriented multicast save compared to asking Alice to
unicast the packets immediately?
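For concreteness, here is the kind of back-of-envelope comparison I'm asking for; every number in it (notification size, hop counts, friend count) is an assumption for illustration, not a measurement:

```python
# Back-of-envelope bandwidth comparison (all numbers illustrative).
FRIENDS = 167          # recipients (N)
NOTE_KB = 1            # size of one Notification in KB (assumed)
SENDER_HOPS = 3        # hops from Alice to her third-hop OR
RP_HOPS = 4            # third hop -> RP -> receiver side (assumed)

# Unicast: Alice pushes one copy per friend through the whole path.
unicast_kb = FRIENDS * NOTE_KB * (SENDER_HOPS + RP_HOPS)

# Multicast at the third hop: one copy travels to the third hop,
# and the fan-out to the RPs happens there.
multicast_kb = NOTE_KB * SENDER_HOPS + FRIENDS * NOTE_KB * RP_HOPS

print(f"unicast  : {unicast_kb} KB relayed")
print(f"multicast: {multicast_kb} KB relayed")
print(f"savings  : {100 * (1 - multicast_kb / unicast_kb):.0f}%")
```

Under these assumptions the saving is only the sender-side legs of the circuits, which is why I'd like to see real numbers before trading anything for it.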

The second question is: if the multicaster sends the same packets to the
recipients, how could an observer fail to draw edges between the
anonymous vertices? Even a one-hour delay doesn't change the packets
semantically, whereas with unicasting of random-looking packets the
observer can't draw edges between the root vertex and the recipient
vertices.

I agree that in real-world scenarios multicasting is not that bad, but
it's better to choose the stronger theory when we have the opportunity,
despite the fact that nobody attacks those weak parts we are scared of.
I propose we use the pubsub multicast strategy as a backup plan for when
we run out of available bandwidth in the Tor network, switching from
unicasting Notifications to multicasting one Notification that is
encrypted with an epoch forward-secure key for all friends.
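The epoch forward-secure key could be as simple as a one-way hash ratchet that Alice and all her friends step forward each epoch; a minimal sketch (the root key, the label, and the use of SHA-256 are my assumptions, and the actual payload encryption under the epoch key, e.g. with AES, is omitted):

```python
import hashlib

def next_epoch_key(key: bytes) -> bytes:
    """Ratchet the shared key forward one epoch (one-way)."""
    return hashlib.sha256(b"epoch-ratchet" + key).digest()

# Alice and all friends start from the same shared root key
# (assumed to be distributed once, during friendship setup).
k0 = hashlib.sha256(b"shared-root-key-demo").digest()

# Each epoch, everyone derives the next key and deletes the old one.
k1 = next_epoch_key(k0)
k2 = next_epoch_key(k1)

# One Notification encrypted under the current epoch key can be
# multicast to all friends; an attacker who later steals k2 cannot
# invert SHA-256 to recover k1 or k0 and read earlier epochs.
assert k1 != k2
```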

>Of course accessing blocks from a third party server is a trade-off
>in excessive bandwidth, please.

How much excess bandwidth, in MB/GB/TB, would it be in your estimation
when we download a block from the server?

>Exactly, so your model with the centralized block cloud is doomed,
>as I see it.

It is not doomed if someone pays Amazon's bills, but it will be flooded
with blocks, and there is no alternative solution for keeping those
blocks anywhere else, as p2p networks are only good at distributing
data, not at keeping the data itself long-term with 100% reliability.

I would love to get rid of the PseudonymousServer, because I'm the one
responsible for paying the Amazon bills. But even if we do multicasting
on top of hidden services, we still need it for "future retrievals",
when requesters can't find a seeder for the desired block on the
BitTorrent network, and for "asynchronous retrievals", when Bob's hidden
service is offline and, by the time he comes online and gets the
Notification from the public pool to retrieve the block, Alice is
offline. We also need the PseudonymousServer for backing up the PDB and
many other things.

>Occasional failures can be recovered - we have a recovery scheme
>for that.

People are unpredictable and unreliable, which forces us to save several
copies of the same data in different places, to make sure that if one
copy disappears we still have it somewhere else. A distributed reliable
storage needs a lot more space than a cloud with daily backups does. If
keeping blocks on an efficient, reliable cloud is doomed, keeping them
with unknown volunteers is foredoomed.
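To put a rough number on that, a quick sketch assuming each volunteer replica is independently intact and online with some probability (both figures are made up for illustration):

```python
# How many volunteer replicas does one backed-up server correspond to?
p = 0.5            # assumed per-volunteer availability
target = 0.9999    # desired probability the block is retrievable

# A block is lost only if every one of the k copies is unavailable,
# which happens with probability (1 - p) ** k.
k = 1
while 1 - (1 - p) ** k < target:
    k += 1

print(f"{k} replicas needed at p={p} for {target:.2%} availability")
```

Even under these generous independence assumptions, the network stores the same block many times over to match what one server with daily backups provides.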

>I don't believe people will systematically not provide
>disk space to have a great social networking experience. One
>that beats Facebook's in many ways, not just privacy.

They systematically won't! You are a good man, but most of the others
out there aren't like you. Just think of those zombies who sued Apple
[http://www.bloomberg.com/news/2014-12-31/apple-customers-sue-over-shortage-of-storage-space-in-ios-8-1-.htm]
for being asked to free some space to install an important update. What
people really want, when they replace their smartphone, is to recover
everything simply by entering a username and password. This is not
subterfuge, it's substantial. Usability is a security parameter, because
if the majority of users don't use our secure software, then attackers
easily compromise them.

We are a mobile application (for various important reasons), and mobile
phones have only a few GB of free space, which is very valuable. We can
only cache textual content for a long time; cached media content will be
removed after a short period. Without backing up blocks on a server,
users will surely lose them over time.

