I was going to mention this on my pull request https://github.com/torproject/chutney/pull/67 but I figured it would be much more fitting to have a more-general thread here. In the PR I mentioned how I believe that, if it quacks like a dict, it should be implemented as a dict. That way we (hopefully) run into fewer gotchas down the line, trying to reinvent the dict ourselves, and we have clearer self-docuemtnation ("oh, this class inherits dict, now I know kind of what to expect out of it").
I'm still trying to reason through chutney's code for how exactly inheritance works and how this inheritance can be expressed more clearly, for example it looks like the `Node` class is for a type of relay (authority, client, bridge, et cetera) and its `getN` method is to generate a number of those nodes. That is poorly documented and I only figured this out by seeing how `Node.getN` was being called in the tests themselves. To me, it makes more sense not to "overload" what a `Node` is; if we are using an object as a factory to create more `Node`s, then we should make that evident. The factory can then make identical nodes with similar properties, rather than one parent `Node` spawning children that refer back to their parent.
For me this issue is mostly "it feels like a lot of mental overhead, and I'm sure it can be done better than it already is." But of course it's difficult to fix something without first understanding how it should behave. (Perhaps a test suite for the test suite? ;) But since we're dealing with a lot of inheritance in chutney, and since this inheritance is (in my opinion) expressed poorly, it can be quick to find oneself caught in a web of code with no clear way out.
Here is what I have been considering lately, with the code overall. I hope that laying it out in points will help both myself and others try to divy up what could be improved about the code, work on it piecewise, and make sure nothing is reimplemented incorrectly. I know I'm new to the code and thus have not looked into it thoroughly enough to understand everything about it, but again, something in the back of my head is telling me that if it's this difficult to understand now, then it can stand to be changed. So please take my criticism lightly and only where it applies. I could be wrong about some things as well. But, as a testing suite, this is the place we absolutely need to be sure we get right, because with a hard-to-reason testing suite, we open ourselves up to bugs within the tests themselves.
- Documentation especially for methods that the tests are supposed to call. In `Node` class, there is a comment "Users are expected to call these" with `__init__` and three other functions declared. `__init__` is fairly self-explanatory but the other three could stand to have the same docstrings that we are giving other functions. I think this should be my first task, before trying to mess with anything that moves. That way, we have a clearer overview of the testsuite's API, and then we can work the implementation around that. - Once it's clear to all of us (meaning, not only the core Chutney devs, but also outsiders such as myself) what the current interface does, we can see the strengths of that interface, maybe some design flaws if we deem any to exist, and some improvements we can make with the interface. What I mean is: here is the place to make any necessary adjustments to the API before we dive into implementation. - Now we visit the implementation. By this point I (and other non-core developers) should have a better understanding on what the tests *should* be doing, that we can then proceed to verify the correctness of the tests' implementations themselves. Here's where I would like to revisit cleaning up the code so that we make use of functions to abstract and "atomize" certain aspects of the code (like the currently-unused function isBoostrapped), where we should introduce more self-documenting names (as suggested, replace getAttribute names with isAttribute names for when we want to return bool values), et cetera.
I'm outlining what *I* think the approach should be, and in hopes that everything else will line up and make more sense for us all. I do want to make sure I know what the code should be doing, because while I could inspect the code line by line and go through the logic branches myself, I'm going to be perfectly honest and say it gives me a headache :) I think it's much easier to reason about with compartmentalised, self-documenting (and docstringed) components, than to try to step through the code as it runs and take it in as one giant piece.
Because, we're testing a complex piece of software (Tor). Tor itself can do a lot of things that can throw us off especially if we're less familiar with its implementation, yet. And that throws off our understanding of how Chutney is supposed to work, as well. A lot of the tests currently seem to be very conditional -- they pass or fail with little rhyme or reason, which is of course the best type of issues to debug :P -- so I don't see the test results helping me as well as refactoring the testsuite first.
Again, let me know if I have anything wrong, made any unjust assumptions, showed my ignorance, et cetera. I'm going into this boldly and I hope that's the right call.
Caitlin