[tor-talk] Cryptographic social networking project

evervigilant at riseup.net
Fri Jan 16 03:36:31 UTC 2015


On 2015-01-08 17:13, contact at sharebook.com wrote:
>> No, I am just suggesting not to use Tor for something it wasn't built
>> for. We have been working on a technology that combines anonymization
>> with multicast distribution and is therefore a lot better suited for
>> social use cases. I hoped you would see this point and maybe consider
>> joining forces with us rather than developing something that may run
>> into scalability limits.
> 
>> You want to use a shopping bag to deliver a cupboard.. and now you
>> say that using more shopping bags will solve the problem. You can't
>> solve an exponentially growing problem with a linearly growing
>> solution. All technologies that address scalability have a multicast
>> distribution strategy somewhere. In cloud technology it's the way
>> the database replication is organized in distribution trees. In
>> Bittorrent it's the way BT grows a tree with every further downloader.
>> Tor doesn't have that and so far I have not heard of anyone being
>> interested in changing this. In fact, it would be such a drastic
>> intrusion into its current operation mode that it would risk affecting
>> the current way Tor operates. That is why it is good for everyone that
>> other platforms like GNUnet, Tribler and I2P experiment with this
>> challenge and Tor developers who think Tor has reached a sufficient
>> degree of maturity could come and help the other platforms. I can
>> imagine an integration happening at some point, since all of these
>> platforms need a relay router network to perform well.
> 
>> Still whenever Alice uses those 167 circuits (example scenario) she is
>> sending the exact same information to all of those people. If our
>> anonymization network had native distribution trees rather than 
>> unicast
>> circuits, then this task would be roughly the same as when Twitter
>> delivers a tweet to all data centers in order to make it appear on
>> potentially millions of recipient dashboards.
> 
>> Which again means that the same data is being delivered in hundreds of
>> copies over the Tor network, rather than having a multicast strategy
>> that ensures data traverses each network node at most once, or
>> at least reduces redundancy to a scalable amount.
> 
>> You insist on only focusing on the cost of establishing circuits, but
>> I don't believe Tor will be able or willing to deal with an explosion
>> of redundant data deliveries. There is a reason why Bittorrent is
>> discouraged over Tor - because it is the same social use case. Tor
>> scales for a steadily growing number of humanoids that make unicast
>> exchanges with websites and other server-like applications. It's a
>> linear challenge that a slow increase in efficiency and number of
>> relay nodes can tackle.
>> The moment all of these users start interacting with each other like
>> crazy, Tor has a problem. I don't understand why I have to tell and
>> re-tell these basics of scalability as if it was my opinion. This
>> is how scalability works, or rather doesn't work. If we want an
>> anonymization platform that can scale socially, we have to make one.
> 
>> Now you have an assessment that your plan will likely not work out for
>> a relevant number of participants and you are free to find out the hard
>> way, or teach me something about scalability after working with it for...
>> hmm... when did I start working on IRC's multicast? That's 25 years ago now.
>> So good luck proving to me that I got it all wrong.
> 
> I think you got it all wrong; maybe it's because English is not my
> native language. Let me demonstrate with some technical details.
> 
> --ESTIMATING TOTAL COST--
> 
> We assume our network gets 10 million active users with 167 friends
> per user.
> 
> Hidden Service (hybrid scheme): besides classical hidden services, we
> have shared secrets between Alice and Bob which, after minor changes
> (a few lines of code) on the relay side, let relay operators create
> hidden service circuits for social networking at mass scale without
> overheating the network. There will be a directory server (managed by
> stable parties) that every 24 hours generates a snapshot of all
> available onion routers (ORs), sorted by row number, to make sure
> everyone around the world sees the same list of ORs in the same order.
> Alice knows a SharedSecret and a CommonSecret for Bob (Bob's shared
> secret is unique to Alice, but his common secret is the same for all
> his friends).
> 
> In an undirected graph the number of edges = (vertices)*(degree)/2, so
> our network has fewer than 835 million friendship connections, but
> there are not 835 million onion circuits between users. Each user
> needs only two regular circuits to handle hidden services, which for
> 10 million users adds up to 20 million circuits. Thus Alice has only
> two 3-hop circuits and Bob has only two 3-hop circuits; there is no
> extra load here compared to what Tor users already do when browsing
> websites securely with the Tor Browser Bundle, so I am not going to
> calculate the cost of maintaining these regular circuits. The sender
> circuit (SC) is used for sending Notifications to friends' hidden
> services; the receiver circuit (RC) is used as the hidden service
> itself, to receive Notifications from all friends. In the RC, the
> third hop is called the rendezvous point (RP). In order to send a
> Notification to Bob, Alice needs to find out which OR is his RP, plus
> some additional information for sending him packets through that RP.
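> As a quick sanity check, those counts work out as follows (a
> throwaway Python sketch, using the user and friend numbers assumed
> above):
> 
>   users    = 10_000_000            # assumed active users
>   degree   = 167                   # assumed friends per user
>   edges    = users * degree // 2   # undirected graph: 835,000,000 friendships
>   circuits = users * 2             # one SC and one RC per user: 20,000,000
>   print(edges, circuits)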
> 
> In hybrid hidden services there is no need for asymmetric key
> agreements to establish a secure channel between the SC and the RC. I
> also skip the cost of symmetric cryptography on packets, since it is
> trivial with regular block ciphers, so I will not estimate the CPU
> work required by ORs to handle hidden services (check djb's benchmarks
> for aes_128 at cr.yp.to). All information needed to exchange
> Notifications securely at the RPs is derived from the CommonSecret and
> SharedSecret.
> 
> Bob selects his RPs from the directory's snapshot at time intervals of
> between 10 minutes and 12 hours, starting at the beginning of each day
> at 00:00 UTC. The time interval is derived from
> V_1 = H(CommonSecret||mm/dd/year||EpochCounter), where EpochCounter is
> a natural number running from 1 to n that resets to 1 at 00:00 UTC the
> next day, and the row number of the RP in the directory's snapshot is
> derived from V_2 = H(H(CommonSecret||mm/dd/year||EpochCounter)). To
> generate the time interval, Bob uses V_1 to spin a wheel with 42600
> slots and encodes where it stops as a waiting time between 10 minutes
> and 12 hours. To generate the row number of each epoch's RP, he uses
> V_2 to spin a wheel with n slots (n = number of available ORs in the
> directory's snapshot) and takes where it stops as the RP's row number.
> If the row number of the RP is, for instance, 3907, Bob connects to
> ORs #3907, #3908 and #3909 and keeps all of these RPs open, so that if
> Alice fails to send her Notification to #3907 she can try the other
> RPs.
> 
> Bob starts opening RPs at 00:00 UTC, waits for the generated time
> interval, then uses the next epoch counter to determine which OR is
> the next RP and how long he should stay there, again given by the
> generated time interval. Hence the total number of epochs per day is
> different for each person and each day.
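> To make that concrete, here is a minimal Python sketch of one epoch's
> derivation; the hash function, the secret and date encodings, and the
> wheel-to-seconds mapping are illustrative assumptions, not part of the
> design:
> 
>   import hashlib
> 
>   def H(data: bytes) -> bytes:
>       # Illustrative choice of hash; the scheme only says "H".
>       return hashlib.sha256(data).digest()
> 
>   def epoch_values(common_secret: bytes, date: str, epoch: int):
>       """V_1 picks the waiting time, V_2 = H(V_1) picks the RP row number."""
>       v1 = H(common_secret + date.encode() + str(epoch).encode())
>       v2 = H(v1)
>       return v1, v2
> 
>   def waiting_seconds(v1: bytes) -> int:
>       # "Spin a wheel" with 42600 slots, encoded as 10 minutes .. ~12 hours.
>       return 600 + int.from_bytes(v1, "big") % 42600
> 
>   def rp_row(v2: bytes, n: int) -> int:
>       # Wheel with n slots: a row number in the sorted directory snapshot.
>       return int.from_bytes(v2, "big") % n
> 
>   # Example: Bob's first epoch of the day, with a 7000-router snapshot.
>   v1, v2 = epoch_values(b"example-common-secret", "01/16/2015", 1)
>   print(waiting_seconds(v1), rp_row(v2, 7000))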
> 
> Once Alice knows Bob's current RP, she sends nothing to it until she
> has a new Notification for him. She sends packets as
> {CircuitID|Payload} over HTTP from her SC, without establishing a TLS
> channel with the RP. CircuitID is the first 4 bytes of
> H(CommonSecret||mm/dd/year||EpochCounter||GenerateCommonID); the
> payload is the ciphertext of {cookie|Notification} encrypted under
> RP_KEY, which is
> H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateKey);
> and cookie is (cookie1)⊕(cookie2). When Bob opens an RP, he gives the
> RP the cookies for all of his 167 friends (for each friend there is a
> different cookie1 and cookie2 value in each epoch), where cookie1 is
> the first 4 bytes of
> H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateCookie1)
> and cookie2 is the first 4 bytes of
> H(CommonSecret||SharedSecret||mm/dd/year||EpochCounter||GenerateCookie2).
> When Alice hands {cookie|Notification} to the RP, the RP forwards the
> packet to Bob if (cookie) = (cookie1)⊕(cookie2), and then the RP
> replaces (cookie2) with H(cookie2) in its RAM. When Alice wants to
> send another Notification to Bob through the same RP, she has to send
> (cookie1)⊕(H(cookie2)) as the cookie. The next Notification needs
> (cookie1)⊕(H(H(cookie2))) as its cookie, and so on.
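> A similar sketch for the per-epoch identifiers, keys and the cookie
> ratchet; again the hash, the label and secret encodings, and the
> 4-byte truncation of H(cookie2) are illustrative assumptions:
> 
>   import hashlib
> 
>   def H(data: bytes) -> bytes:
>       return hashlib.sha256(data).digest()
> 
>   def xor4(a: bytes, b: bytes) -> bytes:
>       return bytes(x ^ y for x, y in zip(a, b))
> 
>   def derive4(cs, ss, date, epoch, label):
>       # first 4 bytes of H(CommonSecret||SharedSecret||date||EpochCounter||label)
>       return H(cs + ss + date + str(epoch).encode() + label)[:4]
> 
>   cs, ss, date, epoch = b"CommonSecret", b"SharedSecret", b"01/16/2015", 3
>   circuit_id = H(cs + date + str(epoch).encode() + b"GenerateCommonID")[:4]
>   rp_key     = H(cs + ss + date + str(epoch).encode() + b"GenerateKey")
>   cookie1    = derive4(cs, ss, date, epoch, b"GenerateCookie1")
>   cookie2    = derive4(cs, ss, date, epoch, b"GenerateCookie2")
> 
>   # Bob registers (cookie1, cookie2) at the RP. Alice's first packet must
>   # carry cookie1 XOR cookie2; after each accepted packet the RP replaces
>   # cookie2 with H(cookie2) (truncated to 4 bytes here), so the next packet
>   # needs cookie1 XOR H(cookie2), then cookie1 XOR H(H(cookie2)), and so on.
>   for i in range(3):
>       print("cookie for notification", i + 1, ":", xor4(cookie1, cookie2).hex())
>       cookie2 = H(cookie2)[:4]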
> 
> Let's say each packet is approximately 60 bytes and Alice sends 50
> Notifications to each of her friends per day. Alice therefore sends
> 50*60*167 bytes to all her friends, and sending them over her 3-hop SC
> into each friend's 3-hop RC multiplies the total amount by 6. So Alice
> sends about 3 MB through ORs every day in order to deliver
> Notifications for different purposes to all her friends. If 10 million
> users send the same amount of data to their friends, that costs the
> onion network about 30 TB of data exchange per day.
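> Worked out explicitly (a small Python sketch with the figures assumed
> above):
> 
>   packet   = 60                    # bytes per Notification packet (assumed)
>   per_day  = 50                    # Notifications per friend per day
>   friends  = 167
>   hops     = 6                     # 3-hop SC plus 3-hop RC
>   per_user = packet * per_day * friends * hops   # ~3.0 MB per user per day
>   network  = per_user * 10_000_000               # ~30 TB per day in total
>   print(per_user / 1e6, "MB per user;", network / 1e12, "TB network-wide")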
> 
> PseudonymousServer: a public container for hosting blocks has 100%
> efficiency. If each user sends/receives 10 MB of data per day
> (reading/posting) to/from the PseudonymousServer, the total traffic
> for 10 million users is 100 TB each day, which by our threat model has
> to be routed through the onion network. But this is linear traffic,
> not an exponential effect: for instance, if each Twitter.com user
> downloads/uploads approximately 10 MB of data from/to Twitter.com
> servers every day, then 10 million such users would require exactly
> the same amount of traffic (100*3 TB, counting the 3-hop circuits) to
> be routed through the onion network if they used the Tor Browser
> Bundle to access Twitter.com.
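> The same kind of estimate for the PseudonymousServer side (again just
> a sketch of the arithmetic, with the 10 MB/day figure assumed above):
> 
>   per_user = 10e6                  # 10 MB read/posted per user per day
>   users    = 10_000_000
>   raw      = per_user * users      # 100 TB/day at the application layer
>   onion    = raw * 3               # over 3-hop circuits: ~300 TB/day of relay traffic
>   print(raw / 1e12, "TB raw;", onion / 1e12, "TB through the relays")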
> 
> --SUMMARY--
> 
> Our paradigm with 10 million users, in the presumed social networking
> scenarios, requires 30 TB of data exchange per day inside the onion
> network as extra load compared to classical hidden services used for
> linear applications. That constitutes ~55% of the daily capacity of
> 5000 volunteer onion routers with 1 Mbps of available bandwidth each,
> which is slight compared to how much bandwidth 10 million users would
> need for surfing their favorite websites with the Tor Browser Bundle,
> because simply refreshing a graphical magazine like buzzfeed.com costs
> more than 3 MB ...
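> The ~55% figure can be checked directly (a sketch assuming 1 Mbps of
> usable bandwidth per relay, as stated):
> 
>   relays     = 5000
>   bytes_day  = 1 * 125_000 * 86_400   # 1 Mbps per relay, in bytes/day (~10.8 GB)
>   capacity   = relays * bytes_day     # ~54 TB of relay capacity per day
>   extra_load = 30e12                  # the 30 TB of Notification traffic
>   print(round(100 * extra_load / capacity, 1), "%")   # ~55.6%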
> 
> In conclusion, the Tor network needs more relays if millions of
> additional users who transfer megabytes of data per day try to use it.
> 
>> Our plan is completely different from what you write here. Pubsub
>> distribution channels operate over the backbone, not the individual
>> friend systems. It is the backbone ensuring that everyone gets a copy
>> of the message she is supposed to get and the subscribers may not know
>> of each other - who they are, how many they are. I don't know why you
>> assume you can judge what we have been working on in the last decade,
>> then talk about things that have nothing to do with us.
> 
> Now I did a search on your website and I'm not exactly sure what it
> is. What I found seems to be an experimental mesh network. You
> criticized Tor because when a global adversary monitors both the entry
> and exit nodes of a circuit, metadata is compromised. In a mesh
> network (if friends are using each other as mesh routers), even a
> local adversary monitoring any part of the network can compromise
> metadata for that part. Breaking onion routing needs two points of
> failure, but breaking a mesh network needs only one. If you employ
> high delays, padding etc. for more security, then why not apply the
> same defenses to a parallel onion network managed by a comprehensive
> organization like Tor Inc.?
> 
> In a mesh network, when a node routes someone else's traffic towards
> its destination, traffic analysis becomes harder for an observer, who
> cannot tell whether the traffic originates from the node itself or
> from someone else. But exactly the same property applies to onion
> routing networks: if a user runs an onion router, it becomes harder
> for an observer to tell whether intercepted traffic belongs to that
> user or to someone else behind it. Onion routing is already
> implemented, widely adopted and heavily supported, and it foils
> various further types of traffic analysis attacks that mesh networks
> cannot.
> 
> By the way, it would be cool to replace ISPs with mesh networks to
> reduce the radius of connection between identities and make dragnet
> SIGINT more difficult. For instance, when I send a TCP packet from my
> home IP address in Iran to a Tor entry guard located in Iceland, GCHQ
> really does collect metadata about my connection by intercepting
> Iran's optic fibers at the Oman Sea, and can probably deanonymize my
> Tor circuit if they also control my selected Tor exit node in Japan.
> But Internet backbones are beyond application developers' scope; it's
> up to societies.
> 
>> You just described another one of the good reasons why Tor isn't the
>> appropriate tool for the job we want to get done. Low latency is a
>> client/server-paradigm requirement that unnecessarily reduces the
>> anonymity for the use case of a distributed social network.
> 
> Our assumption is that anonymity works, and that when users retrieve
> something from the PseudonymousServer via Tor, the server cannot tell
> whom the requests coming out of the exit node belong to. For instance,
> if Alice retrieves block1 and then retrieves block2 from the same exit
> node, we assume the server cannot tell that these retrievals came from
> the same person, since many others are using the same exit node to
> retrieve blocks and the majority of exit nodes are not colluding with
> the attacker at the same time. This threat model is neither perfect
> nor broken. If we decide not to rely on it, there is no alternative
> solution. High-latency networks might make deanonymization harder, but
> if they are practical enough I am sure the Tor network can easily add
> delays with a few lines of code for those who want them, and if that
> happens in the future we can easily adopt it. The only other solution
> that makes deanonymizing the connection between Alice and Bob really
> hard is a PIR protocol based on homomorphic encryption: Alice puts
> something into a database, and Bob later queries the database to pick
> up her packet without telling the server what his query is or what the
> server should return to him! But the problem with such a PIR protocol
> is that for 10 million users the service provider would have to pay
> cloud hosts billions of dollars every month for computing astronomical
> numbers of cryptographic functions. Another PIR protocol, one that
> does not need to cryptographically massage every record in the
> database to produce an output, is to have Alice put something into the
> database and have Bob later download all records from the database,
> choose locally which record is meant for him, and discard the rest of
> the unwanted output. But the problem with that PIR protocol is that
> the database grows larger every day, so users have to download more
> and more data from it on each following day, which would eventually
> paralyze the Internet.
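> For illustration, a toy sketch of that download-everything PIR (the
> record format and names here are made up), which shows exactly why its
> bandwidth cost keeps growing:
> 
>   # Trivial PIR: Bob downloads the whole database and selects his records
>   # locally, so the server learns nothing about which record he wanted;
>   # but the download grows with the database, every single day.
>   def fetch_all(database):
>       return list(database)                  # the entire table over the wire
> 
>   def pick_mine(records, my_tag):
>       return [r for r in records if r.startswith(my_tag)]
> 
>   database = [b"bob:hello", b"carol:hi", b"bob:another packet"]
>   print(pick_mine(fetch_all(database), b"bob:"))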
> 
> So we are on the right track...

-------------------------------------------------------------------------------------------------------------------

Have you thought of using Diaspora and combining it with the likes of 
the Tor network to make it a fully decentralized hidden social 
networking service?

https://github.com/diaspora/diaspora/

Diaspora?

