Status Report

Ahhh, the beginning of winter. That very special time of the year when Seattle skies are perpetually overcast and gloomy. When crowds flock to the coffee shops and the best thing to do with your time is hook up to a caffeine IV and code.

… on a side note it has been a really productive month.

  • Stem Website

    Stem now has a far more developer friendly website!

    https://stem.readthedocs.org/en/latest/index.html

    Besides the obvious new sections I revised vast swaths of stem’s API documentation.

    Feedback welcome! The tutorial section is just starting, but I have some ideas for expanding it.

  • Network Status Document Parsing

    After a couple months of work by Ravi and me the network status document handling is finally done. It turned out to be a… very big feature branch.

    With this done we’re very nearly at feature parity with metrics-lib (and I suspect a bit past it in terms of testing). In finishing this up I also spotted an undocumented oddity with microdescriptor ‘directory-signature’ lines.

  • GSoC Mentor Summit

    Met with the developers of numerous other open source projects. I was really impressed at how many worthwhile conversations were crammed into such a short conference. Highlights include…

    • Arc from Python had distutils suggestions, thanks to which stem and arm will soon have both Python 2.x and 3.x support.
    • OSU’s Open Source Labs are entertaining the thought of running Tor relays.
    • Discussed TorBirdy and other Tor projects with Sukhbir. He mentioned an issue with trac permissions that is now fixed.
    • Talked with Terri from Python about Mailman 3, which would be a great answer for the requests we get to have a Tor forum.
    • Attended a talk led by Marina from Gnome about their outreach program for women. We’ll likely be taking part in it this year.

Other changes include…

  • Cleaning up orphaned pyc files as a part of running our tests (feature suggested by Ravi, ticket)
  • Ported arm’s str_tools module to stem and added unit tests (ticket, docs)
  • Code reviewed Ravi’s attach_stream addition, waiting on the revisions (ticket)
  • Arm troubleshooting on irc with zack and ultramage. Also fixed a broken link on arm’s site spotted by wh6iQ.
  • A new contributor, Eoin, spotted some great issues and contributed patches. Tickets include…

    • python 2.5 compatibility bugs (ticket)
    • count of the number of skipped tests (ticket)
    • error with the whitespace checker (ticket)
    • spelling corrections (ticket)
  • Sathyanarayanan spotted some nice bugs including…

    • on python 2.5-2.6 a missing microdescriptor consensus wasn’t causing the related test to be skipped (ticket)
    • integ tests for the process module would fail if tor was already running (ticket)
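The orphaned pyc cleanup mentioned in the list above can be sketched as follows. This is only an illustration, not stem's actual test runner code: the function names are made up, and it assumes the Python 2 layout where a compiled 'foo.pyc' sits beside its 'foo.py' source.

```python
import os


def find_orphaned_pyc(base_dir):
    """Lists .pyc files under base_dir that have no .py file beside them."""

    orphans = []

    for root, _, filenames in os.walk(base_dir):
        for filename in filenames:
            if filename.endswith('.pyc'):
                # 'foo.pyc' -> 'foo.py'
                source_path = os.path.join(root, filename[:-1])

                if not os.path.exists(source_path):
                    orphans.append(os.path.join(root, filename))

    return sorted(orphans)


def remove_orphaned_pyc(base_dir):
    """Deletes the orphaned .pyc files and returns the paths that were removed."""

    orphans = find_orphaned_pyc(base_dir)

    for path in orphans:
        os.remove(path)

    return orphans
```

Cleaning these up matters because a leftover pyc can keep a deleted or renamed module importable, which hides test failures.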

Next up: Ravi and I are working on tor event handling, the last major feature stem is missing as a controller library. After that we plan to port arm over, then tidy up loose ends in preparation for stem’s initial release.

Hi all. As is often the case work and such meant not too much time for tor. For September all I have to report is…

  • Network Status Document Parsing

    This has been my main focus for September and it’s still not finished… but it’s close! Version 3 document parsing just has a couple days of work left, then abstracting it to cover v2 documents and microdescriptors should be relatively easy-ish. I’m really looking forward to merging this feature branch. It has grown quite monstrous…

  • Stem Documentation Hosting

    For a while now I’ve had a TODO item for making a nightly cron that built and hosted Stem’s new sphinx documentation. I was about to do this when I recalled that meejah once recommended ReadTheDocs, a service that does… well, exactly that. After a few minor bumps (1, 2, 3) it’s now live…

    https://stem.readthedocs.org/

    We definitely need to put effort into making them more reader friendly. At present it’s just a dump of all the pydocs which, while informative, is actually a bit overwhelming for new users. Module summary pages would greatly help.

  • MAPADDRESS Support

    Ravi submitted a patch for adding MAPADDRESS support to stem’s controller. It’s a nice addition, especially the integ test.

  • Arm Issues

    Looked into a couple arm issues…

    • Tor’s start time didn’t show up if the system has proc contents but we fail to parse it (ticket).
    • Can’t connect when using a control socket with password auth (ticket).

  • Updated Dev Wiki

    In response to a potential volunteer I wrote a summary of several development tasks. Updated stem’s wiki with what was in that email.

Hi all. I spent most of August, like the prior month, traveling. This time I attended Toorcamp and went on vacation. Both were pleasant and relaxing, but not terribly conducive to coding so I don’t have much to report…

  • Descriptor CSV Export Functionality

    Naif proposed export functionality for Tor server descriptors a while back, which Eric and Megan took the first stab at. I ended up revising this quite a bit, but it turned out nicely.

  • Caching Expansion and Test Prompt

    Added caching for GETCONF and static GETINFO queries. I also added a handy little script for a debugging interpreter. It kicks off Tor, then provides an interactive python prompt with a controller instance to it (optionally shutting Tor down afterward). Ravi code reviewed these changes and volunteered to do code reviews of my future work as well.

  • Consensus Parsing

    Over the last few weeks Ravi’s been working on a patch to parse network status documents. It’s functional, but missing unit tests and deviates from the parsing style and strict validation done for the other descriptor types so I’m taking a turn with the code. Thus far I’m done revising and adding tests for router status entries, and now working on the document. Changes are available in the ‘document-parsing’ branch of my repo.

    Parsing these documents is a far larger task than I thought (especially if you include v2 documents and microdescriptors), so working on this branch will probably keep me occupied for much of September.

  • Controller Expansion

    Ravi has gone on a hacking binge, adding support for USEFEATURE, SIGNAL, EXTENDCIRCUIT, and SETCIRCUITPURPOSE.
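As a rough illustration of the GETCONF and static GETINFO caching mentioned above, here is a minimal sketch. The class name, the 'query_func' callable, and the choice of which GETINFO keys count as static are all assumptions made for this example rather than stem's real design.

```python
class CachingController(object):
    """Wraps a query function, caching answers that cannot change at runtime."""

    # Hypothetical set of GETINFO keys treated as static for this sketch.
    STATIC_GETINFO = frozenset(['version', 'config-file'])

    def __init__(self, query_func):
        self._query = query_func  # callable taking a command string
        self._cache = {}

    def get_info(self, key):
        if key in self.STATIC_GETINFO:
            cache_key = ('GETINFO', key)

            if cache_key not in self._cache:
                self._cache[cache_key] = self._query('GETINFO %s' % key)

            return self._cache[cache_key]

        # Dynamic values (traffic counters and the like) are always re-queried.
        return self._query('GETINFO %s' % key)

    def get_conf(self, key):
        cache_key = ('GETCONF', key.lower())

        if cache_key not in self._cache:
            self._cache[cache_key] = self._query('GETCONF %s' % key)

        return self._cache[cache_key]

    def set_conf(self, key, value):
        self._query('SETCONF %s=%s' % (key, value))
        self._cache.pop(('GETCONF', key.lower()), None)  # drop the stale entry
```

The important design point is invalidation: configuration values are only safe to cache if changing them through the same controller drops the cached copy.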

Hi all. My July was mostly spent traveling, both for Defcon and a couple visits with my family on Vashon (for a funeral and Strawberry Festival). August will be much the same. I’ll be gone…

  • 8/8 – 8/12 for Toorcamp
  • 8/17 – 9/3 for a family trip

Besides that and cat wrangling for the GSoC midterms, here are the things I did this month…

Hi all. This June I exchanged my developer hat to be a mentor instead, spending more time reviewing code than writing it myself. Fingers crossed that at least one or two of them stick around after the summer ends!

As such, this status report is more about other people than me. Apologies if I miss anything.

  • Ravi (GSoC Student)

    This month Ravi discovered a bug with Tor’s GETCONF method and wrote numerous features including SAFECOOKIE support and a GETCONF method for the controller. The GETCONF handling is complicated by the accursed HiddenServiceOptions so it has needed several iterations, but I plan to merge it this week. After that Ravi already has two more feature branches lined up for me to review (*sob*)…

  • Beck (Volunteer)

    Somehow I’ve never been able to bring myself to do development on Windows (if you haven’t seen Neal Stephenson’s Hole Hawg article then I recommend it). Fortunately Beck does, and has done a fantastic job of fixing stem and its tests to work there (1, 2). He also added get_version(), authenticate(), and protocolinfo() methods to the controller.

  • Erik and Megan (Wesleyan Students)

    Erik and Megan have been focusing on stem’s tests, first submitting a couple fixes for the mocking module (1, 2) then writing unit and integration tests for the proc utilities. Next they plan to implement the CSV export functionality suggested by naif.

  • Sathyanarayanan (Volunteer)

    Though work has kept him pretty occupied, Sathyanarayanan has been working on an ExitPolicy class and is presently at the dev meeting making plans to implement a python based Onionoo.

  • Karsten

    Though he isn’t hacking on stem itself, Karsten has been helping by reviewing its descriptor handling and suggesting improvements (1, 2).
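To show why options like HiddenServiceOptions complicate GETCONF handling, here is a small sketch. It assumes the reply lines have already had their status codes stripped and that a single query can come back with several differently named keys, some of them repeated. The function name is hypothetical.

```python
from collections import OrderedDict


def parse_getconf_lines(lines):
    """Maps each option in 'Key=Value' reply lines to a list of its values.

    A bare 'Key' line (option left at its default) maps to an empty list.
    """

    options = OrderedDict()

    for line in lines:
        if '=' in line:
            key, value = line.split('=', 1)
            options.setdefault(key, []).append(value)
        else:
            options.setdefault(line, [])

    return options
```

A plain key to value dictionary would silently keep only the last entry for a repeated key. Even this list based version loses which port lines belonged to which directory line, which hints at why the real handling needed several iterations.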

Besides those, I implemented a few stem improvements too this month…

  • Sphinx Documentation

    At the start of June I rewrote stem’s documentation into reStructuredText so it could be compiled by Sphinx. The results are very pretty.

  • Python 2.5 Compatibility

    Stem aims to support python versions 2.5 and above (in the 2.x series). However, most of our development has been on 2.7, letting backward incompatible changes slip in inadvertently. This ended up being a two week bug fixing odyssey, but now that it’s done stem and its tests should work if users want an interpreter that harks back to 2006…

  • Test Freezing Issue on Mac OSX

    Both Sathyanarayanan and Karsten reported that stem’s integ tests freeze on Mac OSX. After a dozen hours of hair pulling I’ve narrowed it down to an issue where control sockets are left in a CLOSE_WAIT state when closed, and eventually we lose the ability to make new control sockets…
    https://trac.torproject.org/6235

    From what I can tell this is either an issue with Tor, Python, or Mac OSX (my money’s on the last). Help welcome if anyone has ideas.

Other random things from this month include going to the Fremont Fair, attending a SEAPIG meeting (local python developer group) and making travel arrangements to attend Defcon later in July.

Hi all. Spring is in the air, and with it many lovely things including the UW Street Fair, Folklife, and an iSec Open Forum. It also included a week of oncall but that doesn’t count in the ‘lovely things’ column. On the upside though, this time it was pleasantly light (first time I’ve gone a whole week without being woken up by a pager at 3am!).

As a GSoC admin I don’t have much to report besides a blog posting at the start of the month. However, as a mentor some neat things are in the works…

  • Ravi is adding SafeCookie support in stem. I’ve code reviewed the first couple iterations and it’s looking great. Given some tests and revising to fit with recent refactoring it should soon be ready to merge.

    After that he’ll start working on the general controller. This last weekend I refactored how response classes are organized and implemented GETINFO as an example, so this will hopefully be an easy project for him to start hacking on.

  • Beck added descriptor validation to check that the signing key’s hash matches its fingerprint. This is the first part of descriptor integrity validation that Karsten suggested, however this project has gotten stuck so he’s moving on to other stem tasks.
  • Investigated a couple issues brought up by Sathyanarayanan. (1, 2)
  • Two Wesleyan students will start working on stem soon!
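The signing key check described above can be sketched like this. It assumes the fingerprint is the SHA-1 digest of the DER bytes of the key, hex encoded, and that the key arrives as a PEM block. The function names are hypothetical and this is not Beck's actual patch.

```python
import base64
import hashlib


def key_fingerprint(pem_key):
    """Hex SHA-1 digest of the DER bytes inside a PEM encoded key block."""

    # Drop the '-----BEGIN ...-----' and '-----END ...-----' armor lines.
    body = [line for line in pem_key.strip().splitlines()
            if not line.startswith('-----')]

    der_bytes = base64.b64decode(''.join(body))
    return hashlib.sha1(der_bytes).hexdigest().upper()


def matches_fingerprint(pem_key, fingerprint):
    """Checks a key against a fingerprint, ignoring spaces and letter case."""

    return key_fingerprint(pem_key) == fingerprint.replace(' ', '').upper()
```

Ignoring spaces matters because descriptors commonly present the fingerprint in space separated groups of four hex digits.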

Needless to say helping these projects has occupied much of my time, and will take even more once the Wesleyan students get started. But after years of trying in vain to attract developers to my projects I wouldn’t have it any other way.

Other stem tasks I finished this month include…

  • ExtraInfo descriptor parsing. This took me a couple weeks since it contains so many attributes, but I’m glad it’s finished. Now all that remains before we can port Onionoo are the consensus’ network status entries. That’s now at the top of my todo list for descriptor work, but I’m setting it aside for now in favor of Sphinx and helping Ravi and Beck with controller work.
  • Improved the launch_tor() function, making the test instance easily configurable via a temporary torrc (similar to what Vidalia does) and adding integ tests. I also added a workaround so it’ll work on Windows. (1, 2)
  • Updated the stem development wiki so it’ll be easier for new volunteers to find tasks that interest them.
  • Discussed descriptor type annotations with Karsten and implemented stem’s side of it. We also discussed some changes to bridge descriptor sanitization which led to some minor stem changes too.
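The temporary torrc approach used for launch_tor() can be illustrated with a short sketch. The helper below is hypothetical and only covers writing the file. It assumes the configuration is given as a dictionary of option names to values.

```python
import os
import tempfile


def write_temp_torrc(config):
    """Writes a {option: value} dict to a temporary torrc and returns its path.

    List values produce one line per entry, for options that can repeat.
    """

    lines = []

    for key, value in config.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        lines.extend('%s %s' % (key, entry) for entry in values)

    handle, path = tempfile.mkstemp(prefix='torrc-', text=True)

    with os.fdopen(handle, 'w') as torrc_file:
        torrc_file.write('\n'.join(lines) + '\n')

    return path
```

A caller would then start tor with 'tor -f <path>' and remove the file once the process has read it.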

Ducks are awesome. Especially cute Indian Runners who waddle about upright like penguins. Besides this startling discovery, in April I wrote some code, released some code, and did a bunch of GSoC mentor/administration work.

Stem had a reasonably good month. Work included…

  • Finished and merged all of the outstanding todo items from my discussions with Karsten (merge diff). This included additional testing, support for bridge descriptors, re-discovering a tor bug that caused negative uptimes, and a variety of other things. I’m keeping an eye on metrics-lib tickets as they roll in so stem can improve from the issues that Karsten discovers (example), and discussion is ongoing about the addition of a descriptor header field.
  • Discussed stem’s copyright with Wendy and others. We now have a plan for contributors that’ll allow us to reuse stem code under other licenses when needed.
  • Ongoing discussions with Beck about potential stem projects. He sounds eager to work with Ravi and me on the general controller.
  • The whitespace conventions of my projects drive you guys nuts just as badly as yours annoy me. This was gonna be a continued pain point for accepting contributions from others so I’ve added a whitespace checker to stem’s tests that’ll yell at people if they start doing something funky.
  • Stole a trick from git and dropped the ‘--no-color’ argument from the test runner in favor of autodetecting if stdout is a tty terminal or not. This means that test output to your console will have pretty colors, but the ANSI escape sequences will be omitted if you’re piping the output to something else (less, a file, etc).
  • Merged a chroot testing target so we can ensure that stem plays nicely with those environments. This is a use case that traditionally causes problems for our controllers since they don’t account for a path prefix in cookie authentication, determining the data directory, etc.
  • More intermittent concurrency woes. Hopefully I fixed it for realz this time!
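The tty autodetection trick from the list above takes only a few lines in python. This is a minimal sketch with made up names, not the test runner's real code.

```python
import sys

RED, GREEN = '31', '32'  # ANSI foreground color codes


def colorize(text, color_code, stream=None):
    """Wraps text in ANSI color codes only when the stream is a terminal."""

    stream = sys.stdout if stream is None else stream
    is_tty = hasattr(stream, 'isatty') and stream.isatty()

    if is_tty:
        return '\x1b[%sm%s\x1b[0m' % (color_code, text)

    return text  # piped to less, a file, etc: leave the text plain
```

For example, 'print(colorize("SUCCESS", GREEN))' shows green text on a console but writes plain text when the output is redirected.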

Arm also got some love, including a couple important fixes which were released in version 1.4.5…

  • Fix for unrecognized authentication methods. I also filed a ticket with the fix for TorCtl which got a less-than-heartwarming thanks.
  • Added a notice when ptrace is disabled, which by extension causes some proc contents to only be readable by root (breaking arm’s connection panel). Only Jake has expressed an opinion that this is a good feature to have, but others don’t seem interested in discussing it so guess it’s something that I’ll just need to work around. The message tells users how to disable the feature and cites the ticket if they want to know more.
  • Helped arm users including Eric, LoneWolf, and MoPac.

Finally, I spent a good chunk of this month cat herding for GSoC. We survived the student selection process (yay!) and for the moment at least things are proceeding smoothly. Thanks to Sebastian for leading the student selection meeting and covering for the #gsoc deduplication discussions.

Cheers! -Damian

PS. The aforementioned ducks are in reference to an email thread dispensing free flightless avian waterfowl to the masses. Alas though, I couldn’t get any since my small apartment isn’t especially duck-friendly. *sob*

I never cease to be amazed at how fast a month sweeps by. This March I fell in love, fell out of love, got ill with a horrible stomach bug, and wrote a bunch of python code. My favorite was the last one and this is just about that.

My time developing stem this month was almost entirely dedicated to writing a python counterpart for metrics-lib. Most of the effort here went into reader concurrency, server descriptor validation, and lots of testing. For my part this project has the following goals…

  • provide the server descriptor, network status, and microdescriptor parsing needed by the controller
  • validate that new tor versions comply with the spec and don’t break our parsing
  • replace the java metrics-lib so we have a single codebase with multiple maintainers (in other words, persuade Karsten to hack on stem)
  • allow applications that just need descriptor data (such as the consensus tracker script) to use cached descriptor data so they don’t require an open control port

At present stem’s implementation just handles server descriptors. A lot more work will be needed to cover the rest of what metrics-lib does.

In other stem news Ravi Padmala, a contributor to several of our projects and a GSoC applicant, made multiple fixes to stem’s version parsing. I never cease to be amazed at how error prone something that sounds as simple as ‘parse the tor version’ can be. Guess that’s why we’re writing a library…

Sathyanarayanan also took the first stab at porting arm’s ExitPolicy class to stem, though more work is still needed there.

Besides stem, roughly an equal amount of my time has gone into this year’s GSoC (for anyone living under a rock we were accepted, yay!). With my org admin hat on I revised our application, made lots o’ revisions to our volunteer page, and helped to respond to general GSoC inquiries.

With my mentor hat on I reviewed Ravi’s proposal and decided afterward that I dislike the fire-and-forget approach that we usually take with GSoC projects. We give students isolated projects where they can work independently because it is less work for us. On occasion we end the summer with a new core contributor or something we can use, but in general it fails on both counts. I want to try something a little different this year and actually work on the tasks with my applicant. Maybe it’ll work, maybe it’ll drive them mad. We’ll see…

Other random things that I did this month included…

  • Attending local presentations by Bruce Schneier and Dan Kaminsky.

  • Looking over meejah’s txtorcon, a python controller lib using twisted. It’s impressive that he got this up so quickly and it’s neat to see what a twisted implementation looks like. However, it is missing large pieces of very basic controller functionality (such as parsing controller replies), and what parsing it does do is hacky (for instance it checks ‘if "COOKIE" in protocolinfo_reply’ to decide on cookie auth, which will obviously misfire with replies like ‘SAFECOOKIE’). With work though this could be a nice alternative implementation. Meejah is obviously very capable and it’ll be interesting to see where he goes with it. (Correction: mistake on my part, txtorcon actually does have parsing for GETINFO and GETCONF responses.)

  • Discussed with Norman and Karsten the possibility of Wesleyan students working with us again this year on either Stem or Onionoo. It will be a smaller scope than last year’s project if it materializes (just a couple students) which in my opinion is a good thing. Large groups are hard to manage.
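The substring problem with PROTOCOLINFO replies noted above is easy to demonstrate. The sketch below parses the method list from the AUTH line instead of searching the whole reply. The function name and the sample replies used to exercise it are made up for illustration.

```python
import re


def auth_methods(protocolinfo_reply):
    """Returns the set of methods listed on the reply's AUTH line."""

    match = re.search(r'^250-AUTH METHODS=(\S+)', protocolinfo_reply, re.MULTILINE)
    return set(match.group(1).split(',')) if match else set()
```

With a reply whose AUTH line reads 'METHODS=SAFECOOKIE', a plain '"COOKIE" in reply' test is true even though cookie auth is not on offer, whereas '"COOKIE" in auth_methods(reply)' is false.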

Oh, how todo lists never seem to get any shorter.

My first half of February was focused on stem development, most importantly the implementation and testing of the BaseController class. This is the foundation on which useful controller activity can be based, providing a parallel to TorCtl’s asynchronous controller communication (event handling) and sendAndRecv function. Good news is that the BaseController is also designed to be thread safe. Bad news is that getting the deadlocks worked out was a pain in the ass and consumed well over a week. Sometimes concurrency is hard. >:(
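For a feel of what such a controller has to juggle, here is a heavily simplified sketch. It is not the BaseController: the class name and the 'connection' object with send() and recv() methods are assumptions, and it only handles single line replies. A single reader thread sorts asynchronous events (status code 650) from command replies, and a lock keeps concurrent callers from interleaving their commands.

```python
import threading

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


class MiniController(object):
    """Routes asynchronous events and command replies from one reader thread."""

    def __init__(self, connection):
        # connection needs send(text) and recv() -> a line, or None when closed
        self._connection = connection
        self._send_lock = threading.Lock()  # one command in flight at a time
        self._replies = queue.Queue()
        self.events = queue.Queue()

        self._reader = threading.Thread(target=self._read_loop)
        self._reader.daemon = True
        self._reader.start()

    def _read_loop(self):
        while True:
            line = self._connection.recv()

            if line is None:  # connection closed
                break
            elif line.startswith('650'):  # asynchronous event
                self.events.put(line)
            else:
                self._replies.put(line)

    def msg(self, command, timeout=5):
        """Sends a command and blocks until its single line reply arrives."""

        with self._send_lock:
            self._connection.send(command)
            return self._replies.get(timeout=timeout)
```

Even in this toy version the deadlock risk is visible: if msg() waited on the reply queue without a timeout and the connection closed mid-command, the caller would block forever while still holding the lock.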

Other stem development included…

  • Simulated chroot setups for integration testing (ticket). This hasn’t yet been merged because I haven’t added a method for users to provide their chroot prefixes (and hence these integ tests for things like cookie authentication rightfully fail). Not hard, just haven’t gotten to it yet.
  • Gave some input on Robert’s Safe Cookie proposal and filed a ticket for supporting it in stem. Sathyanarayanan has offered to take the first pass at implementing it.
  • Discussions with people helping to make stem better. Sathyanarayanan put the finishing touches on configuration saving and Neena fixed an integration testing bug. Many thanks!

Later in the month I began making the arduous trek (half hour walk) to the Tor developer meeting. For everyone who could attend it was great to see you again! Highlights for me included…

  • Demoed stem to several people and schemed about its future plans.
  • Discussed a python metrics-lib with Karsten. Making a skeleton for that now holds the top slot on my dance card when I finally get some time to do development again.
  • Talked with potential mentors about ideas for GSoC. There was quite a bit of interest but not any concrete plans at the time.
  • Discussed the burden of proof needed for badexiting and resolved a ticket we had for an automated exit setup we’ve been seeing.
  • Brainstormed alternative names for the third incarnation of TorStatus with Karsten and Arturo. In the end we went with “Atlas”. I later filed tickets (1, 2, 3) to move it and Onionoo to tor’s infrastructure (tpo vm, git repos, trac, etc).
  • Talked with Runa and Karsten about the monitoring infrastructure project. It won’t be a GSoC project, but rather something that Runa plans to hack on later.
  • Organized for us to go on the Underground Tour. Note to future self: leaving with twice as much transit time as you need doesn’t work. Quadruple it.

Sadly as the month went on I’ve shifted more and more from development to helping others. Of late the little time I have has gone toward GSoC preparation…

  • Revised our GSoC landing page, rewriting a few of the sections.
  • Nagged lots of people for project ideas and added them to the ideas page.
  • Added a project idea for stem.
  • Dug up our application from last year. With only a few minor tweaks it should still be fine.
  • Discussions about if we’ll be filing a joint application with the EFF again or not. Conclusion was that it probably isn’t as vital to our acceptance as we once believed, but we’re still gonna do it because we like the EFF and they’ve pinky promised to communicate better this year.

… and then of course there were other things…

  • Code reviewed Karsten’s script for gathering obfsproxy statistics (ticket).
  • Several volunteer page changes, like adding Obfsproxy, Ooniprobe, and Shadow.
  • Discussions about trac with proper. On one hand I’m glad that he’s trying to help, but on the other I feel like there’s a growing need for us to include a banner on our wiki warning that it’s community maintained. With our logo and the official domain there’s a sizable risk that visitors won’t realize that we don’t review several of the pages at all.
  • Thanks to Sebastian for fixing an arm bug where reading tor logs from February 29th on leap years would crash arm. This problem was reported by dozens of people, which is actually really heart warming.
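The report above does not say what caused the February 29th crash. One way this class of bug can arise, assumed here purely for illustration, is parsing a log timestamp that carries no year: the parser then falls back to a default year which may not be a leap year, so 'Feb 29' is rejected. A minimal sketch of sidestepping that by supplying the year explicitly (the function name and timestamp format are hypothetical):

```python
from datetime import datetime


def parse_log_time(timestamp, year):
    """Parses a 'Feb 29 10:15:00' style timestamp using an explicit year."""

    return datetime.strptime('%i %s' % (year, timestamp), '%Y %b %d %H:%M:%S')
```

The caller still has to pick the right year, for instance the current one, and a 'Feb 29' entry paired with a non-leap year will raise a ValueError.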

While I’d like to get back to my own coding, I doubt that these distractions will subside much any time soon. C’est la vie, I shouldn’t complain – it’s all good stuff.

Hi all. Performance reviews and oncall kept me occupied for much of January, and Megan had dibs on most of what remained. I’m still hacking on stem but progress isn’t as fast as I’d like. C’est la vie.

Stem’s development in January mostly focused on…

  • Writing a proper mocking module and refactoring the tests to use it. This will greatly improve the maintainability and ease of writing new tests going forward. Originally this began with the humble goal of ‘remove a built-in mocking hack from the system module’, then went down the rabbit hole of larger scale testing improvements. Still, I’m happy with the results.
  • Sathyanarayanan took on development tasks including integration tests for chroot setups, saving configurations, and troubleshooting test failures on OSX. Design discussions and code reviews take a fair bit of time but I’m thrilled to finally have someone to hack on the codebase with me. A couple other potential volunteers (piffey and blackpaw) showed interest but have since disappeared.
  • A large part of my discussions with Sathyanarayanan centered around making stem more developer friendly, both in terms of its utility APIs and easier collaboration. As it turns out keeping stem’s todo list in a text file on my netbook is not the most optimal location for other people. I’ve since moved it to a development wiki.
  • Expansion of the configuration utility. The most notable changes include multi-line configuration options and moving to a listener architecture. The former lets us move user facing strings out of the source (good if we ever translate) and the latter greatly simplifies usage of this utility. It’ll also allow for runtime configuration editability later.
  • Additional options for running stem’s tests…
    • ‘--tor’ – Runs integration tests against a given tor binary (obviously needed to test during tor development).
    • ‘--no-color’ – Removes ANSI escape sequence formatting which is preferable when piping test output.
    • ‘--log’ – Makes stem provide its logging output with the test results. Hopefully by making log messages more visible during development we’ll get better, more user friendly logging for stem’s users. Actually, I’ve already rewritten most of stem’s log messages because of this option…
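The listener architecture mentioned for the configuration utility can be sketched as below. The class and method names are assumptions for this example and stem's real utility differs. The idea is that interested code registers a callback once and is told about every value, both the ones already loaded and any set later.

```python
class Config(object):
    """Key/value store that notifies listeners whenever a value is set."""

    def __init__(self):
        self._contents = {}
        self._listeners = []

    def add_listener(self, listener, backfill=True):
        """Registers listener(config, key), optionally replaying existing keys."""

        self._listeners.append(listener)

        if backfill:
            for key in list(self._contents):
                listener(self, key)

    def get(self, key, default=None):
        return self._contents.get(key, default)

    def set(self, key, value):
        self._contents[key] = value

        for listener in self._listeners:
            listener(self, key)
```

Since listeners fire on every set() call, editing the configuration at runtime needs no extra plumbing: the same callbacks that handled the initial load pick up the change.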

Non-development things I did include…

  • Sent tor posters to international people. The pile of customs slips was pesky, but worse was twiddling my fingers at the post office as they typed each form in one by one. Hunt and peck is not the fastest method for data entry…
  • The consensus tracker script had a couple interesting finds this month. The first was an oddly configured exit from the University of Waterloo and the second was 41 exits with what looks to be an auto-generated configuration.
  • Realized that my git-fu wasn’t up to par for some of the things we’re doing at work, so I read ‘Git from the Bottom Up‘. If you’ve ever been curious about git’s internal data model then this is the article for you. It’s short and gives a very well written overview starting with git’s most basic components (blobs) and building up from that. I’ve heard that Pro Git is also good so I might skim some of that next.

Looking forward to seeing most of you at the development meeting!