[tor-dev] Mnemonic 80-bit phrases (proposal)

Ken Takusagawa II ken.takusagawa.2 at gmail.com
Wed Mar 21 00:11:03 UTC 2012


On Feb 29, 2012 1:58 PM, "Sai" <tor at saizai.com> wrote:

>  For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word)
>  dictionaries for each word category.

1. You need 2^8=256 templates, not just 8, to reach 6*12+8=80 bits.
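
A minimal sketch of the arithmetic in point 1, assuming each word is drawn uniformly from its dictionary and the template is chosen uniformly as well. The function names `phrase_bits` and `templates_needed` are illustrative only and do not come from the proposal.

```python
import math


def phrase_bits(words: int, dict_size: int, templates: int) -> float:
    """Bits encoded by one template choice plus `words` dictionary choices."""
    return words * math.log2(dict_size) + math.log2(templates)


def templates_needed(target_bits: int, words: int, dict_size: int) -> int:
    """Smallest template count that lets the phrase reach `target_bits`."""
    word_bits = words * math.log2(dict_size)
    return math.ceil(2 ** (target_bits - word_bits))


if __name__ == "__main__":
    # 6 words from 4096-word dictionaries with only 8 templates
    print(phrase_bits(6, 4096, 8))
    # The same sentence shape with 256 templates
    print(phrase_bits(6, 4096, 256))
    # Templates required to reach 80 bits
    print(templates_needed(80, 6, 4096))
```

With 8 templates the phrase carries 6*12+3=75 bits, which is 5 bits short of the 80-bit target.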

2. Having toyed with this idea in the past, let me warn that forming a
4096-word dictionary of memorable, non-colliding words for each word category
is going to be very difficult.  Too many words are semantically similar,
phonetically similar, or just unfamiliar.  You might find Google Ngrams a
good resource for common words; I provide a complete sorted list here:

http://kenta.blogspot.com/2012/02/lefoezyy-some-notes-on-google-books.html

Another way to go about it might be to first catalogue semantic categories
(colors, animals, etc.) then list the most common (yet dissimilar) members
of each category.  An attempt at 64 words is here:

http://kenta.blogspot.com/2011/10/xpmqawkv-common-words.html

I'd propose that the "right" way to do this is not just sentences, but
entire semantically consistent stories, written in rhyming verse, with
entropy of perhaps only a few bits per sentence.  (Prehistoric oral
tradition does prove we can memorize such poems.)  However, synthesizing
these seems extremely difficult, an AI problem.
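
To make the dictionary problem in point 2 concrete, here is a rough sketch of a greedy filter over a frequency-sorted word list. It is an assumption-laden illustration, not a method from this thread: spelling edit distance stands in as a crude proxy for phonetic similarity, semantic similarity is not handled at all, and the names `edit_distance`, `pick_distinct`, and `min_distance` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def pick_distinct(words_by_frequency, size, min_distance=2):
    """Greedily keep common words that differ from every kept word by at
    least `min_distance` edits. Spelling is only a stand-in for sound here."""
    chosen = []
    for word in words_by_frequency:
        if all(edit_distance(word, kept) >= min_distance for kept in chosen):
            chosen.append(word)
            if len(chosen) == size:
                break
    return chosen


if __name__ == "__main__":
    # Toy input; a real list would be sorted by corpus frequency.
    print(pick_distinct(["cat", "bat", "dog", "cot", "horse"], 3))
```

Each candidate is compared against every kept word, so the filter is quadratic in the dictionary size. That is tolerable for a 4096-word target, but a filter this simple will still let through near-homophones with different spellings, which is part of why the task is hard.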

3. I presume people are familiar with Bubblebabble?  It doesn't solve all
the problems, but does make bit strings seem less "dense".

Ken