[tor-talk] Programming language for anonymity network
nb.linux at xandea.de
Fri Apr 18 13:45:27 UTC 2014
Stevens Le Blond:
> We are a team of researchers working on the design and implementation of
> a traffic-analysis resistant anonymity network and we would like to
> request your opinion regarding the choice of a programming language /
> environment. Here are the criteria:
(disclaimer: I haven't used the language much so far, but I think it's
awesome :)
What about Ada?
Ada, a language originally developed for the DoD in 1983, may seem dead
to those who don't have it on the radar, but it isn't (the latest
standard is Ada 2012). It's just used in very specific domains, and I
think it should be used more, because many more projects could benefit
from it.
(IMHO almost any project can benefit from it: What's good for space,
aviation, defence, energy, and railway should be good enough for
day-to-day projects as well)
> 1) Familiarity: The language should be familiar or easy to learn for
> most potential contributors, as we hope to build a diverse community
> that builds on and contributes to the code.
Ada is a very clear language and produces good self-documenting code.
The problem is (IMHO) that people would have to learn it, instead of
relying on existing C knowledge. Another difficulty in learning Ada
could be that it has a lot more keywords than e.g. C.
> 2) Maturity: The language implementation, tool chain and libraries
> should be mature enough to support a production system.
For example, there's GNAT - the GCC Ada compiler - which is of course
free (as in free beer and as in free speech). Furthermore, the testbench
for certification is included in the language standard(!). And Ada has
actually been used in production for decades.
> 3) Language security: The language should minimize the risk of
> security relevant bugs like buffer overflows.
The domains Ada is built for have the same criteria, so Ada rules out
most such mistakes by design. For example (in loose order):
- No pointer arithmetic: there are of course concepts similar to C
pointers (called access types), but you cannot do confusing pointer
arithmetic as in C.
- Runtime checks of array/buffer boundaries: you cannot write to
- Range checks for datatypes: you can define ranges for types, e.g.
integers, fixed point numbers, floating point numbers, ..., and get an
exception if the ranges are violated. Personally, I think that
especially I/O interfaces/protocol implementations could benefit from
that: you could have a field A of range a..b in your datastructure and
you cannot assign it a value outside that range, even if A could hold
- Portability: as the safety critical systems often are heterogeneous, a
lot of hardware platforms are supported (there's even an attempt to have
an Ada runtime system for 8-bit AVRs: avrada). Furthermore, Ada can
include/interact with assembly, C, Java, and lots of other languages,
e.g. for a highly optimized module/function.
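- Memory management: the standard permits garbage collection but does
not require it, and GNAT does not provide a GC on native targets.
Instead, memory can be managed through scoped access types, storage
pools, and controlled types, and explicit freeing has to go through
Ada.Unchecked_Deallocation, which stands out in the code. This reduces
the risk of use-after-free.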
- Provable code: the 2012 standard added support for annotations
(inspired by Eiffel) that let you define e.g. pre- and post-conditions
(contract-based programming). A model checker can then be used to
explore states of the program and flow paths. Furthermore, there's
SPARK, an Ada subset that is fully provable (unlike MISRA C, which is
also a subset, but not fully provable).
- Deterministic builds: I haven't tried it, but I know that you can have
deterministic builds (e.g. you can define the order of modules, and
functions; you can even define the alignment of functions, even
individually for each function)
- Scopes: there are various scoping levels and ways to hide
information, e.g. packages. You can also define types that are "private"
(you don't know their structure, but can assign, compare, and copy
them); then there's also "limited private" (you don't know their
structure and cannot assign, copy, or compare them at all; you can only
use the operations the package provides; an example of a limited type is
the "task" type (i.e. a thread): it would be nonsense to copy or compare
a "task", so it is simply prohibited)
- Ada doesn't allow most stupid things like
- Ada is case-insensitive, so it simply throws an error when someone
defines a variable "ada" in one place and "aDa" in another within the
same scope, which prevents the usage of the wrong variable. If variables
have the same name, they mean the same variable. If variables are
different or for different usage, then god, name them differently to
express their usage.
- Ada has "procedures" and "functions": a "procedure" has no return
type, but can have "in", "out", and "in out" parameters. A "function"
_must_ have a return type and, before Ada 2012, was only allowed to have
"in" parameters. Either way, the mode of every parameter is spelled out
in the declaration, so it's harder to end up with things like "read" in
C that reads from some parameters, writes to some parameters, and
sometimes also has a return value that has to be checked.
- Ada supports portable multithreading and message passing (rendezvous
scheme) built directly in the language (since 1983!).
- Ada has strict typing: if there's a "type A is new Integer range
0 .. 10" and another "type B is new Integer range 0 .. 10" and you have
variables "var_a : A" and "var_b : B", the compiler _rejects_ the
assignment "var_a := var_b". This is because their types don't match.
One is A, the other is B. It doesn't matter that they have the same
definition: because they have different names, one can assume that the
programmer meant different things (in this example they could be
converted explicitly and then assigned).
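
To make the points about range checks, strict typing, and parameter
modes a bit more concrete, here is a minimal sketch (Ada 2012; it should
build with GNAT, e.g. "gnatmake typing_demo.adb"). All type and
subprogram names are made up for illustration and are not taken from any
real project.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Typing_Demo is
   --  Two distinct types with identical ranges (hypothetical names).
   type Port_Count is new Integer range 0 .. 10;
   type Hop_Count  is new Integer range 0 .. 10;

   Ports : Port_Count := 3;
   Hops  : Hop_Count  := 7;
   Ok    : Boolean;

   --  Procedure: no return value; every parameter mode is explicit.
   procedure Add_Hop (Count : in out Hop_Count; Done : out Boolean) is
   begin
      if Count < Hop_Count'Last then
         Count := Count + 1;
         Done  := True;
      else
         Done := False;
      end if;
   end Add_Hop;

   --  Function: returns a value and only reads its parameter.
   function Is_Full (Count : in Hop_Count) return Boolean is
     (Count = Hop_Count'Last);
begin
   --  Ports := Hops;           --  rejected by the compiler: type mismatch
   Ports := Port_Count (Hops);  --  an explicit conversion is accepted
   Put_Line ("Ports:" & Port_Count'Image (Ports));

   Add_Hop (Hops, Ok);
   Put_Line ("Hops:" & Hop_Count'Image (Hops)
             & ", added: " & Boolean'Image (Ok)
             & ", full: " & Boolean'Image (Is_Full (Hops)));

   --  Incrementing past the upper bound fails the run-time range check.
   for I in 1 .. 5 loop
      Hops := Hops + 1;
   end loop;
   Put_Line ("not reached");
exception
   when Constraint_Error =>
      Put_Line ("range check failed, Hops stays at"
                & Hop_Count'Image (Hops));
end Typing_Demo;
```

Uncommenting the "Ports := Hops;" line should make the compiler refuse
the program, which is the whole point: the mix-up is caught before the
code ever runs, while the out-of-range increment is caught at run time
instead of silently corrupting the value.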
> 4) Security of runtime / tool chain: It should be hard to
> inconspicuously backdoor the tool chain and, if applicable, runtime
The Ada runtime and toolchain are not small, so it may be difficult to
audit everything. On the other hand, a three letter agency or others
would need to implement a backdoor in a compiler that they know is used
for their own systems and systems that are highly critical (think of a
nuclear plant). Also I assume that from the domains using Ada, a lot of
people are looking at the compilers.
To repeat the disclaimer: I have only been using Ada for a short time.
But I don't want to use C (or C++ or Java) again, because a paradigm of
"everything is an integer" simply doesn't seem appropriate to me in this
millennium.
PS: there's even a GNAT pragma to check the compatibility of licenses of