[tor-dev] Stem Descriptor Parsers

Damian Johnson atagar at torproject.org
Mon Jul 9 17:22:53 UTC 2012


On Mon, Jul 9, 2012 at 8:40 AM, Erik I Islo <eislo at wesleyan.edu> wrote:
> Hello,
>
> Megan and I have been working on the CSV export functionality that was being
> discussed a little over a week ago, and given the recent discussion, we
> would like to clarify the expected/desired implementation of this feature.
>
> We have created an export.py module within /stem/descriptor, which contains
> a single method as of now that takes a descriptor object and two possible
> lists of fields.  These lists are to be specified as either the explicitly
> included attributes of the descriptor or the attributes to be excluded.  As
> we continue to work on this code, Megan and I were wondering if it wouldn't
> be better to accept a file object as well, in addition to accepting any
> number of descriptor objects (i.e. def csv_exp(..., *descriptors)).  Or are
> there other suggestions or requests concerning what sort of input such a method
> should take?
>
> -Erik & Megan
>
> On Fri, Jul 6, 2012 at 1:49 PM, Damian Johnson <atagar at torproject.org>
> wrote:
>>
>> > So is export intended to be an instance method of descriptor, one that
>> > just dumps a single csv line of the instance attributes (maybe subject to
>> > some selection of those attributes)?  Or a static method that takes a
>> > collection?
>>
>> Either would work fine. I was envisioning the former, though on
>> reflection stem/descriptor/export.py module would probably be better
>> since that localizes this functionality and allows for better
>> expansion in the future (other formats such as json, or the inclusion
>> of import functionality).
>>
>> > It seems like it might be awkward to have to hack stem itself to add a
>> > new export format (for example).  Is this a concern?
>>
>> That depends on how useful users would find it to be. If researchers
>> commonly want csv export functionality then we might as well support
>> it. However, if it's a rarely desired feature then there's little
>> reason to clutter our API. My understanding is that this feature is
>> mostly for researchers and sysadmins, so as part of the target
>> audience I'm happy to defer to you on how we handle this.
>>
>> > Do all the known use-cases need both an interface to Tor Control
>> > and a descriptor utility library?
>>
>> No, you're completely right. Stem's controller functionality utilizes
>> its descriptor functionality but not vice versa. Another design that
>> we could go with is to make several smaller libraries (descriptors,
>> controller, response parsing, shared utilities, etc) if stem grows
>> unwieldy. However, we're nowhere near that yet and keeping stem as a
>> single library makes development, testing, installation and usage far
>> easier.
>>
>> Stem is a library to make working with Tor easier for developers and
>> researchers, with the current scope of the Tor control and dir specs.
>> My plan is to complete that, release it to the community, then see
>> based on feedback where we should go from there.
>
>

Naif: This was your feature request. Thoughts?

> Megan and I were wondering if it wouldn't
> be better to accept a file object as well, in addition to accepting any
> number of descriptor objects (i.e. def csv_exp(..., *descriptors)).

If we can make it work then that would be nice, though a *descriptors
style varargs parameter generally doesn't work well alongside optional
keyword arguments. I.e., if you had the signature...

def csv_exp(include_fields = None, exclude_fields = None, destination = None, *descriptors)

Then the caller needs to fill in all of those keyword arguments
positionally before the first descriptor, which kinda defeats the
purpose of them being optional. For instance, to call it with the
defaults and a single descriptor it would be...

csv_exp(None, None, None, my_descriptor)
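
To make the binding problem concrete, here is a small sketch. The body
of csv_exp is a stand-in that only reports how its arguments were bound,
and my_descriptor is a placeholder object rather than a real stem
descriptor.

```python
def csv_exp(include_fields=None, exclude_fields=None, destination=None, *descriptors):
    # Stand-in body (assumption): report the argument binding, export nothing.
    return {
        "include_fields": include_fields,
        "exclude_fields": exclude_fields,
        "destination": destination,
        "descriptors": descriptors,
    }


my_descriptor = object()  # placeholder for a real descriptor instance

# Passing only the descriptor binds it to the first optional slot,
# so it is treated as include_fields and *descriptors stays empty.
naive = csv_exp(my_descriptor)

# To reach *descriptors, every slot before it has to be filled in.
explicit = csv_exp(None, None, None, my_descriptor)
```

In the first call the descriptor silently ends up in include_fields,
which is the kind of mistake the signature invites.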

My suggestion is to just accept a single argument that can either be a
single descriptor or a list of descriptors.
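
A rough sketch of what that could look like follows. It is only an
illustration, not the actual export.py: FakeDescriptor is a made-up
stand-in class, and treating "public instance attributes, sorted by
name" as the exportable fields is an assumption of mine.

```python
import csv
import io


class FakeDescriptor:
    # Hypothetical stand-in for a stem descriptor, for illustration only.
    def __init__(self, nickname, address, or_port):
        self.nickname = nickname
        self.address = address
        self.or_port = or_port


def csv_exp(descriptors, include_fields=None, exclude_fields=None, destination=None):
    # Accept either a single descriptor or a list/tuple of them.
    if not isinstance(descriptors, (list, tuple)):
        descriptors = [descriptors]

    # Assumption: the exportable fields are the public instance attributes
    # of the first descriptor, in alphabetical order.
    if descriptors:
        fields = sorted(n for n in vars(descriptors[0]) if not n.startswith("_"))
    else:
        fields = []

    if include_fields is not None:
        fields = [n for n in fields if n in include_fields]
    if exclude_fields is not None:
        fields = [n for n in fields if n not in exclude_fields]

    # Write one csv row per descriptor into an in-memory buffer.
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for descriptor in descriptors:
        writer.writerow([getattr(descriptor, name) for name in fields])

    # Without a destination, hand back the csv text; otherwise write to it.
    if destination is None:
        return buffer.getvalue()
    destination.write(buffer.getvalue())
    return None
```

With this shape csv_exp(my_descriptor) and csv_exp([desc_a, desc_b])
both work, and the optional arguments can be given by name only when
they're needed, for instance csv_exp(my_descriptor, destination =
output_file).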

Cheers! -Damian

