Hello tor developers,

I've had limited contact with nickm and dmr and I was encouraged  to solicit interest and comments via the tor-dev mailing list...

I'm interested in contributing a new feature to little-t tor in the form of a new log severity level called 'trace'. Currently there are 5 severity levels [1]; debug, info, notice, warn, and err. trace would be more verbose than debug, and trace function calls at runtime. It would also be configurable -- perhaps by environment variables and/or command line options -- regarding which functions get traced and when at runtime. Because introducing function level tracing likely also introduces a performance penalty, I'm suggesting that this feature would be optionally compiled in at build time.

The use of the new log severity level 'trace' would be for new developers to better comprehend the tor source code by giving them a further resource -- in addition to the source code, tests,  documentation, and IRC chat -- to learn from. The debug log will be more useful in the context of the function call tree, and function usage is easier to grasp than battling through the ~ 165k LOC. In addition, any developer who likes to debug via the log files will appreciate having the extra function call tree context.

I have a basic, generic working proof of concept which produces output when running tor. Here is a snippet of the output:
...
T125580                   + directory_initiate_command_routerstatus_rend() { // #63
T125580                     + node_get_by_id() { // #278
T125580                       }
T125580                     + directory_initiate_command_rend() { // #63
T125580                       + directory_command_should_use_begindir() { // #63
T125580                         + fascist_firewall_allows_address_or() { // #5814
T125580                           + compare_tor_addr_to_addr_policy() { // #6009
T125580                             }
T125580                           }
T125580                         + directory_fetches_from_authorities() { // #126
T125580                           + server_mode() { // #6859
T125580                             }
T125580                           + router_my_exit_policy_is_reject_star() { // #128
T125580                             + router_get_my_routerinfo() { // #130
T125580                               }
T125580                             }
T125580                           }
T125580                         }
...
Explanation of the snippet: '<thread ID> <indent> + <function>() { #<calls>' means <function> has been called <calls> times. <indent> represents the function call tree indent level. '}' means the function returns.

One can introduce extra info not implemented yet such as timestamps, calling parameters, return value, and extra info specified by the developer.

Regarding regular tor log output: This could be mixed in with the trace output, e.g.:
...
T125580                             + router_get_my_routerinfo() { // #130
T125580                               - info 1 of 2: abc <-- non trace
T125580                               - info 1 of 2: def <-- non trace
T125580                               }
...
This email is mainly to determine if there is a interest for such a tor feature to be implemented further. AFAIK it does not seem to be implemented already and does not seem to exist as a documented helpful tool [2]. And I'm not aware of a third party tool which would 'just work' with tor.

Strategies for implementing the function trace:

I'll now discuss the strategies which I know about for implementing the function trace, including the one I want to use:

1. Manual instrumentation: Here each function is manually instrumented in the tor source code with entry and exit macros. This is time consuming and error prone. And it can make the source code less readable.

2. Using the gcc -finstrument-functions [3] command line switch: There are many examples [4] [5] of using this to generate a runtime function call tree. But this mechanism causes each function to make two extra calls whether you want that or not, so performance might be poor. Plus getting hold of the actual function names and displaying them can be tricky; only the function address is given to the entry and exit functions. Also, there is no way to access the function parameters or return value.

3. Using the llvm xray [6] [7] feature: This was originally contributed to llvm by Google but is more designed for inspecting very high performance call trees in a production environment. A sort of dtrace for user level functions. However, it also doesn't log function parameters or the return value AFAIK. And logging happens via an intermediate binary file which must get converted to something ascii / human friendly. This also means it's tricky to mix in the existing tor log output, for one nice seemless log file.

4. Using gcc compile to assembler command line switch: AFAIK this technique was first developed by myself. The idea is to replace gcc with an own script which pretends to be gcc and invokes the real gcc after making subtle changes. If a makefile tries to compile a .c file to a .o file then the pretend gcc will change the behaviour so that an intermediate assembler file will be generated. However, the assembler file gets changed so that every function call becomes an indirect function call via a pointer. This means we can change the pointer to execute either the original function (with the overhead of an indirect function call), or a wrapper function with the same declaration. The wrapper function has access to all the function parameters and return value, can log the entry and exit, while calling the original function which got wrapped.

I have chosen implementation strategy #4 because it provides the most flexibility without changing any existing source code or makefile. A limitation is that it only works currently with gcc on Intel. However, msvc & llvm might have a similar compile to assembly feature. And in theory the same technique could be implemented for non-Intel chips if they have an indirect call assembly instruction.

An interesting side effect of strategy #4 is that it could possibly be used to enhance testing in two ways: 1. If the test is for a particular function or set of functions then trace could be preconfigured to provide more detailed output for just those functions, and any functions they call. 2. The indirect function pointer can be changed at runtime if desired to create a kind of general purpose function mocking facility for testing. However, I believe tor tests already do their own mocking.

Here is an example of the proof of concept auto generated C code for the function firewall_is_fascist_or():

extern unsigned long long cwrap_count_call_firewall_is_fascist_or;
extern int (* cwrap_indirect_orig_firewall_is_fascist_or) (void);
extern int (* cwrap_indirect_wrap_firewall_is_fascist_or) (void);
extern int (* cwrap_indirect_mock_firewall_is_fascist_or) (void);
static int cwrap_wrap_firewall_is_fascist_or (void)
{
   cwrap_count_call_firewall_is_fascist_or ++;
   // todo: add per binary/file/function verbosity checking
   // todo: show function parameters for simple types, e.g. foo=123, bar=7
   if (cwrap_count_call_firewall_is_fascist_or <= 10000) cwrap_log(1, "+ firewall_is_fascist_or() { // #%llu\n", cwrap_count_call_firewall_is_fascist_or);
   int cwrap_result = (* cwrap_indirect_orig_firewall_is_fascist_or)();
   if (cwrap_count_call_firewall_is_fascist_or <= 10000) cwrap_log(-1, "}\n");
   // todo: show return value for simple types
   return cwrap_result;
}

So the pretend gcc doubles the number of functions, creating a 'wrap' version of each regular function. While the changed assembler files change the assembler call function into an indirect call function which uses the newly created indirect function pointer. In theory the makefile just works and is unmodified. However, extra files get created during make.

Looking forward to comments, suggestions, and/or votes for and against, even if it's only to point out that something with similar functionality exists already.

--
Simon

[1] https://www.torproject.org/docs/tor-manual.html.en
[2] https://gitweb.torproject.org/tor.git/tree/doc/HACKING/HelpfulTools.md
[3] https://stackoverflow.com/questions/6176284/why-doesnt-finstrument-functions-work-for-me
[4] https://balau82.wordpress.com/2010/10/06/trace-and-profile-function-calls-with-gcc/
[5] https://mcuoneclipse.com/2015/04/04/poor-mans-trace-free-of-charge-function-entryexit-trace-with-gnu-tools/
[6] https://www.youtube.com/watch?reload=9&v=jyL-__zOGcU
[7] https://llvm.org/docs/XRay.html