Status Report

Hi all, after a month of work I’m pleased to announce that the Stem-based
replacement for Doctor is now live!

For those who aren’t familiar with it, Doctor is a service that pulls hourly
consensus information and checks it for a host of issues (directory authority
outages, expiring certificates, etc). In the case of a problem it notifies
tor-consensus-health@, and we in turn give the authority operator a heads up.

The new version of Doctor replaces Karsten’s Java-based counterpart and is
part of our ongoing scheme to eventually deprecate Metrics-lib in
favor of Stem.

Other news this month includes…

Hi all. This month was mostly spent on non-tor work including a server migration, a bad service outage at work, and a full week of cleaning my apartment. Still, there’s plenty of spiffy news in stem land…

Remote Descriptor Fetching

Major feature for this month was the addition of a module to remotely fetch tor descriptors…

This works much like tor itself does, downloading descriptor content from directory authorities and mirrors. With it we can now easily script against the present state of the tor network without piggybacking on a live tor instance.

Curious what you can use it for? See our present monitors for some ideas.

This also included a little work with Nick on the spec and tor side…

  • Dropped the unimplemented microdescriptor query from the spec. (ticket)
  • Noting the max queryable fingerprints/hashes in spec. (ticket)
  • We’re getting a high failure rate on the downloads we make. A little more investigation is needed on my part to help narrow this down. (ticket)

Other news includes…

  • Revised the appearance of stem’s frontpage. The blue buttons were pretty jarring, so switched to something that matches the rest of the page. (before, after)
  • Added Slackware to our download page. Many thanks to Markus for adding us to SlackBuilds!
  • Worked with Sreenatha to port Tor Weather to stem. Unfortunately Weather does not presently have an active maintainer so I’m not sure how we will proceed on this front. (thread, ticket)
  • The Munich dev meeting has attracted quite a few potential volunteers. After discussing prospective projects with seros I tidied up our volunteer page. Changes included…
  • Our automated Jenkins test runs ran into another regression in tor that caused it to segfault. (ticket)
  • STREAM events mishandled IPv6 addresses. (caught and patched by soult, ticket)
  • Thread with Pierre about TorPylle which turned into a discussion with Nick regarding future language direction for the tor codebase. I’m looking forward to seeing where it goes!
  • We just finished with midterms for Google Summer of Code. Chang unfortunately did not pass, but the other projects are going well.
  • Sorted out travel arrangements for the GSoC mentor summit. Nick and I will be going, and Moritz is presently on the waitlist.
  • Code reviewed ra’s rttprober. He provided some great feedback for which I still owe him a reply.
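
To illustrate the IPv6 problem with STREAM events mentioned above, here is a minimal sketch of splitting a target into its address and port. This is not soult's patch: split_target() is a hypothetical helper, and it assumes that the port always follows the last colon and that IPv6 literals may arrive wrapped in square brackets.

```python
def split_target(target):
    """Split an 'address:port' target into an (address, port) tuple.

    Hypothetical helper for illustration, not part of stem's API.
    """

    if ":" not in target:
        raise ValueError("target lacks a port: %r" % target)

    # Split on the *last* colon. Splitting on the first one is what breaks
    # for IPv6, since those addresses contain colons of their own.
    address, port = target.rsplit(":", 1)

    if not port.isdigit():
        raise ValueError("port isn't numeric: %r" % target)

    # Drop the brackets around an IPv6 literal, if present.
    if address.startswith("[") and address.endswith("]"):
        address = address[1:-1]

    return address, int(port)
```

Hostnames and IPv4 addresses pass through the same code path unchanged, so the caller doesn't need to know which kind of address it received.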

Hi all. Without much ado here’s my status report for June…

Stem: Migrated to the Mock module

Our homemade mocking framework has served us well, but over time it taught me one very important lesson: writing a mocking framework is hard. On the surface it seems pretty simple: apply and revert a set of monkey patches. But how do you monkey patch class methods? What about alias imports like the os module? And god forbid you want to mock python’s open() function.

I’m finally taking a lesson from one of my coworkers and using a library for this. Python has several options but the most common is PyPI’s mock module, which became part of the standard library in Python 3.3. (change)
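
For a taste of what the library handles for us, here is a small sketch using the standard library flavor (unittest.mock, Python 3.4 or later). It covers the three cases that were painful to do by hand: a class method, a function in the os module, and the builtin open(). The Relay class and read_first_line() function are made up for this example and aren't part of stem.

```python
import os
from unittest import mock


class Relay:
    """Made-up class, only here so we have a class method to patch."""

    @classmethod
    def default_port(cls):
        return 9051


def read_first_line(path):
    """Return the first line of a file without its trailing newline."""

    with open(path) as handle:
        return handle.readline().rstrip("\n")


def demo():
    # Class method: patch.object() swaps the attribute and restores the
    # original classmethod when the 'with' block exits.
    with mock.patch.object(Relay, "default_port", return_value=1234):
        port = Relay.default_port()

    # Function in the os module, patched by its dotted name.
    with mock.patch("os.getcwd", return_value="/mocked"):
        cwd = os.getcwd()

    # The builtin open(): mock_open() provides a fake file handle that
    # serves read_data, so no file needs to exist at the path.
    fake_open = mock.mock_open(read_data="first\nsecond\n")

    with mock.patch("builtins.open", fake_open):
        line = read_first_line("/no/such/file")

    return port, cwd, line
```

Each patch is reverted as soon as its block ends, which is exactly the apply-and-revert bookkeeping that our homemade framework had to do itself.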

Arm: Pruning the torTools utility

Arm’s torTools module was a wrapper around TorCtl that provided caching, thread safety, and a better API. Now that we’re using stem it’s obsolete, and my aim is to completely eliminate it from the codebase. This is easier said than done however. This month I pruned vast swaths of the module, reducing it from 2768 lines to 1020. (before, after)

Doing this required some expansions on stem’s part. Added functionality includes…

  • get_pid() and get_user() methods in the Controller
  • system functions for getting a process’ start time and FreeBSD jail path
  • the system module’s call() wasn’t respecting exit codes
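
To show what respecting exit codes means in the last item, here is a minimal sketch of a command wrapper that treats a non-zero exit status as a failure. This is an illustration only: the signature below (an argument list and an optional default) is an assumption made for the example and not necessarily what stem's call() looks like.

```python
import subprocess
import sys

UNDEFINED = object()  # sentinel, so callers can pass None as a default


def call(command, default=UNDEFINED):
    """Run a command (given as an argument list) and return its stdout lines.

    If the command can't be run or exits with a non-zero status we return
    'default' when one was provided, and raise an OSError otherwise.
    """

    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )

        # This is the check that's easy to forget: output alone doesn't
        # tell us whether the command actually succeeded.
        if result.returncode != 0:
            raise OSError("%s exited with status %i" % (command[0], result.returncode))

        return result.stdout.splitlines()
    except OSError:
        if default is not UNDEFINED:
            return default

        raise
```

For instance call([sys.executable, "-c", "import sys; sys.exit(3)"], default=[]) gives back the empty list rather than whatever the failed command happened to print.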

Stem: Remote descriptor fetching

Throughout the month Karsten and I have been bouncing ideas back and forth concerning a stem API for remote descriptor fetching. I have some s3krit ideas for improving this that will address both our use cases even better (spoiler: future style instances). We’re coming up on a four day weekend so hopefully I’ll be able to start implementing this soon!

Stem: Expanded FAQ with answers to common Stack Overflow questions

I took a pass over the tor related questions on Stack Overflow, answering fifteen that concerned controller scripting. The vast majority of those unfortunately were some variation of ‘how do I programmatically change my IP?’ which I answered with a Stem FAQ entry.

The only question I thought was especially interesting went along the lines of ‘How do I check the IPs of the exits I’m presently using?’ (link). I answered this with a FAQ entry too.

… and a handful of smaller tasks

  • Investigated the descriptor version provided via tor’s ‘GETINFO ns/*’ options. Contrary to the spec it turns out these have been v3 all along, and stem now parses them as such. (#7953)
  • Our automated Jenkins test runs caught their first instance of tor regression. This concerned LOADCONF’s behavior after merging a branch for ticket #6752. (#9122)
  • Handful of GSoC tasks (welcome/sorry emails after acceptance was announced, and org admin prep for the program’s start)
  • Added msg_type argument to ControlMessage.from_str() (request by meejah, #8605)
  • Investigated cache consistency issue (thanks to Ravi, #7713)
  • Fixes for ONLINE target (patch by Jeremy, #8692)
  • Minor revisions to how consensus bandwidth-weights are validated (#6872)
  • Addressed an arm issue with certain TERMs (#9064)
  • Answered some questions from Sreenatha that came up while migrating Tor Weather to Stem.

Hi all. This is a little early but considering the time sink GSoC has been I doubt I’ll get much more done, so here’s my status report for May. The few stem tasks I’ve snuck in included…

  • Added Ubuntu and Fedora to our download page; the Fedora package is thanks to Juan Orti
  • Several testing fixes so our Jenkins test runs now pass
  • The Controller mangled non-unicode descriptor content when using python 3.x (caught thanks to aj00200, #8755)
  • Expanded our client usage tutorial to use SocksiPy and include an example for polling Twitter (thanks to Ashish)
  • Integ tests now assert ownership on the tor process (patch by Jeremy Kushner, #8634)
  • The DescriptorReader mishandled relative paths (patch from Kostas, #8815)
  • Fix so the Controller cache is thread safe (patch by Akshit, #8607)
  • Running our integ tests littered the /tmp directory (patch by Ashish, #8622)
  • Improvements so we use the staticmethod decorator and new style exception catching (patches from Sean, #8824 and #8823)
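
For anyone unfamiliar with the two idioms in the last item, here is a short sketch of both. The PortParser class is invented for this example and isn't from stem's codebase.

```python
class PortParser:
    """Invented example class, not part of stem."""

    # Decorator form. The older spelling was to define parse() normally and
    # then rebind it afterward with 'parse = staticmethod(parse)'.
    @staticmethod
    def parse(value):
        """Convert a string to a port number, raising ValueError if invalid."""

        try:
            port = int(value)
        except ValueError as exc:
            # New style 'except ... as exc'. The old 'except ValueError, exc'
            # form is a syntax error under python 3.
            raise ValueError("%r isn't a port (%s)" % (value, exc))

        if not 0 < port < 65536:
            raise ValueError("%i is outside the valid port range" % port)

        return port
```

Both forms are accepted by python 2.6 and later, so they work across the 2.x and 3.x series alike.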

Hi all. Though April had a bit of stem work the month was mostly gobbled up by the chaos that is GSoC. I never cease to be amazed at how much time orchestrating it takes, but enough of that. Other tasks from last month included…

Improved Stem’s Site

General Stem Improvements

  • Overhaul for a large part of our testing framework; run_tests.py in particular was in need of a rewrite
  • Fix for broken process renaming (patch from ragwater, ticket)
  • We now have a custom :trac: and :spec: role for Sphinx (patch from ragwater, ticket)
  • ATTACHSTREAM provided an unexpected 555 response (caught thanks to a cypherpunks, ticket)
  • Added support for the ADDRMAP event’s new CACHED attribute
  • Looked into anomalous bridge-ip-transports lines in the consensus, turned out to be from an unmerged tor patch running on George’s relay

Hi all. Even without counting the Boston dev meeting March was a highly productive month. Noteworthy things include…

Stem Tutorials

Stem’s tutorials got an overhaul, including:

  • A much friendlier layout. No more intimidating wall of text – the tutorials have been rewritten and broken into subsections.
  • "To Russia With Love" tutorial, exemplifying client usage and programmatically managing a tor process.
  • "Tortoise and the Hare" tutorial, demoing tor event handling through a curses bandwidth graph.
  • "Double Double Toil and Trouble", which ends the tutorials with a page of scripts and applications that use stem.

Feedback welcome! The shiny new tutorials are available at…

https://stem.torproject.org/tutorials.html

Stem Packaging

Thanks to a half dozen package maintainers stem is now available on several platforms, with more in the works. The most recent was the Python Package Index (PyPI), which can make stem installation as simple as ‘pip install stem’.

Google Summer of Code

The timing of this year’s winter developer meeting couldn’t have been better. During it I begged, bribed, and poked people with pointy sticks until they ‘volunteered’ to mentor something in this year’s GSoC. Thanks to them our project ideas page is no longer overly sparse. Google will be announcing the selected orgs on April 8th.

Stem Release 1.0

Last, but certainly not least I wrote numerous finishing touches for stem and made its long overdue initial release!

https://blog.torproject.org/blog/stem-release-10

Hi all. Between being a short month and oncall for work I didn’t get as much done as I’d like. What time I did have for tor mostly went into stem’s descriptor functionality. In particular…

Microdescriptor Support

Stem can now read and parse microdescriptors, with controller methods coming later this weekend.

I understand the desire for lightweight descriptors but they’re a step backward for controllers. Their lack of fingerprints makes them clunky to use, and tor lacks the usual methods for retrieving them (ticket) and v3 network status information (ticket). Controllers will often need to read descriptor content from the data directory until those are fixed.

General Descriptor Improvements

  • Invalid descriptor content within archives caused the reader to stop processing content from the archive. This bug is simple in retrospect but cost me around a week of hair pulling frustration to sort out. Thanks to Karsten for catching it! (ticket)

  • Descriptor readers can now optionally provide network status documents rather than the entries they contain. Feature request by Karsten. (ticket)

  • Calling str() on Descriptors choked if they contained unicode content. Caught by Sathyanarayanan. (ticket)

  • Thanks to Karsten, Descriptors now provide hex digests. (ticket)

  • Discussed the new ‘flag-thresholds’ attribute and added support for it to stem. (ticket)

  • Descriptor parsers that used readline() could choke if derived from a descriptor archive. Caught by Karsten.

Other Tasks…

  • Discussed proposal 218 with Karsten on tor-dev@. (thread)

  • Changed our consensus-tracker script to alarm for non-exits (we had an undetected sybil attack early in the month). Also fixed an issue where the script gobbled up way too much memory. (ticket)

  • Made a few backward incompatible changes to improve stem’s usability in anticipation of our March API freeze…

    • Version comparison is now done through normal comparison operators rather than a meets_requirements() method.
    • Renamed the keyword arguments for Controller.from_port() and others to be less verbose (for instance ‘control_port’ to just ‘port’).
    • Dropped the ‘path’ arg from parse_file(), it was never intended for external callers.
  • Variety of bug fixes…

    • We didn’t recognize a ‘NEVER’ date in ADDRMAP events. Caught by Desoxy. (ticket)
    • Patch from Abhishek so our tests avoid static /tmp usage. (ticket)
    • Added copyright notices throughout most of our codebase. Suggested by Juan. (ticket)
    • Addressed issues with get_process_name() on OSX. Caught by Sathyanarayanan. (ticket)
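
To make the version comparison change above concrete, here is a minimal sketch of a version class that supports the normal comparison operators. It is a simplified stand-in written for this report rather than stem's actual Version class: it only looks at the numeric parts, treats a missing fourth number as zero, and ignores status tags such as ‘-alpha’.

```python
import functools
import re


@functools.total_ordering
class Version:
    """Simplified stand-in for a tor version, for illustration only."""

    def __init__(self, version_str):
        match = re.match(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?", version_str)

        if not match:
            raise ValueError("%r isn't a recognized version" % version_str)

        # Tuple of (major, minor, micro, patch), missing patch counts as zero.
        self.parts = tuple(int(part) if part else 0 for part in match.groups())

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented

        return self.parts == other.parts

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented

        return self.parts < other.parts

    def __hash__(self):
        return hash(self.parts)
```

With __eq__ and __lt__ defined, total_ordering fills in the remaining operators. A requirement check then reads as a plain comparison, such as Version("0.2.4.9-alpha") >= Version("0.2.3.25"), rather than a call to a dedicated method.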

Hi all. This January I’ve been integrating feedback from first time stem users and implementing their feature requests. Projects included…

Python 3.x Support

Stem now works under both the python 2.x and 3.x series. To use stem with python 3 simply use it when you install…

python3 setup.py install

This turned out to be a larger project than I had anticipated, taking almost half the month. But it was well worth the effort.

PEP8 Compliance

Ditched most of my odd coding preferences in favor of the standard python style guide. I’ve integrated both pep8 and pyflakes with our tests to prevent regression. Hopefully this’ll make it easier for others to contribute to stem.

Arm Codebase Refactoring

Prior to being shanghaied into the projects above I sunk quite a bit of time into overhauling arm. Thus far I’ve dropped around a third of the codebase in favor of similar (but tested!) capabilities in stem.

Descriptor Improvements

Got lots of input concerning stem’s descriptor module (thanks Karsten!), leading to quite a few improvements…

  • used feedback from Aaron Johnson to make the descriptor API less confusing
  • we now support bridge network status documents (ticket)
  • added support for ‘-legacy’ authorities (ticket)
  • parsing error if network status documents lacked a ‘directory-footer’ line (ticket)
  • empty ‘bridge-ip-versions’ lines caused problems (ticket)
  • we didn’t recognize the @type annotation for key certificates (ticket)
  • we weren’t parsing the new ntor-onion-key lines (caught by sonu, ticket)
  • error when run with pypy (caught by peer)

Sean helped quite a bit at the start of the month, but has since been busy with other things.

  • added a Controller.get_streams() method (ticket)
  • tests for the close_stream() method (ticket)
  • expanded the Controller’s unit tests (ticket)
  • fixed type checks (ticket)
  • fixed test_reattaching_listeners (ticket)

Hi all. Between the holidays and being oncall for work I didn’t have very high hopes for December. However, by the time the dust settled it turned out to be a surprisingly productive month. Projects included…

  • Finishing stem support for event handling. This was the last major feature we were missing before having feature parity with TorCtl.

  • Ported arm and the consensus-tracker to stem. The arm migration went surprisingly smoothly, but there’s still a lot of cleanup work left to do here. Ideally arm will be a far simpler codebase now that it doesn’t need a wrapper module around the controller.

  • Moved stem’s site to "https://stem.torproject.org/". (ticket)

  • Smaller things include…

    • finally fixed the periodic freezes in arm (ticket)
    • uniform support for a default response in Controller getters (ticket)
    • vastly improved performance and memory usage for the ExitPolicy class
    • expansions for descriptor handling (ticket 1, 2, 3)
    • extend_circuit(), attach_stream(), and get_circuits() support (patches by Ravi, ticket 1, 2)
    • TAKEOWNERSHIP support (thanks to Lunar^ for the initial patch, ticket)
    • fixed a bug where circuit/stream ids were sometimes ints (caught by Lunar^, change)
    • added a post-authentication hook so event listeners can be reattached to Tor
    • several OPW discussions with Marina (we didn’t get any substantial applications)
    • added flash proxy and txtorcon to the volunteer page, and made lots of general revisions
    • discussed TorCtl deprecation with Mike and made the announcement
  • Besides this, Sean Robinson has been submitting an absurd number of fixes, improvements, and code reviews of his own. Many thanks!

    • version pre-requirement checks for events and tests (ticket)
    • testing expansion for malformed events (ticket)
    • close_stream() method (ticket)
    • STREAM_BW event handling (ticket)
    • testing util expansion to make it easier to test client use cases (ticket)
    • get_socks_listeners() method and related mocking changes (ticket)
    • … and many, many more (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)

November was a month that included eating far too much pie. There was some stem work too on the side, though delicious pumpkin pies are always how that month should be remembered.

My primary focus for November was tor event handling, which is the last major feature we need before having parity with TorCtl. We presently support nine of the nineteen event types, including the most commonly used ones (logging, BW, CIRC, STREAM, etc). I’ll be spending a good chunk of December finishing this up.

Besides this I’ve been really thrilled at how contributors are coming out of the woodwork to help…

Ravi

  • Ravi has volunteered to take a lead on moving stem onto tor’s site. Unfortunately this is presently blocked on getting a subdomain.

  • Provided a patch to move stem’s controller exceptions to the top level namespace. (ticket)

  • Fix for the repurpose_circuit() integ test (ticket) and discovered an issue with the stem.process test. (ticket)

Eoin

  • Submitted a great patch overhauling and expanding verification of server descriptor content. (ticket)

  • Caught a possible tor bug related to ‘GETINFO orconn-status’ queries when disconnected. (ticket)

  • Numerous spelling fixes (change) and caught an issue with respect to how the descriptor reader handles archives. (change)

Sean

  • Reviewed my event parsing branch, offering feedback (ticket) and catching a bug where STREAM events could have a zero port. (ticket)

  • Submitted patches to add close_circuit() to the Controller (ticket) and a setup.py (ticket). The former led to a discussion about stem’s licensing and copyright for patches.

  • Helped resolve an issue with EXTENDCIRCUIT where we weren’t taking into account when the path was optional or not. (ticket)

Other things I did this month include…

  • Preparation for the 2013 Outreach Program for Women, the application deadline for which is now only two days away. This mostly involved helping others add their project ideas to the volunteer page and adding one of my own.

  • Made a landing page for stem’s bug tracking and linked to it from stem’s site.

  • Revamped stem’s enum documentation to be both more readable and support interlinking. (change)

  • Provided a code review for Karsten’s pygeodate.py. (ticket)

  • Answered a handful of controller inquiries on our lists. Stem’s now at a point where I don’t mind suggesting it to developers. If you’re scripting or writing an application around tor then please give stem a try! I’d love to get more feedback on where its rough edges are before we make an initial release. (1, 2, 3, 4)