All posts by atagar

Hi all. Even without counting the Boston dev meeting, March was a highly productive month. Noteworthy things include…

Stem Tutorials

Stem’s tutorials got an overhaul, including:

  • A much friendlier layout. No more intimidating wall of text – the tutorials have been rewritten and broken into subsections.
  • "To Russia With Love" tutorial, exemplifying client usage and programmatically managing a tor process.
  • "Tortoise and the Hare" tutorial, demoing tor event handling through a curses bandwidth graph.
  • "Double Double Toil and Trouble", which ends the tutorials with a page of scripts and applications that use stem.

Feedback welcome! The shiny new tutorials are available at…

https://stem.torproject.org/tutorials.html

Stem Packaging

Thanks to a half dozen package maintainers stem is now available on several platforms, with more in the works. The most recent was the Python Package Index (PyPI), which can make stem installation as simple as ‘pip install stem’.

Google Summer of Code

The timing of this year’s winter developer meeting couldn’t have been better. During it I begged, bribed, and poked people with pointy sticks until they ‘volunteered’ to mentor something in this year’s GSoC. Thanks to them our project ideas page is no longer overly sparse. Google will be announcing the selected orgs on April 8th.

Stem Release 1.0

Last, but certainly not least, I put numerous finishing touches on stem and made its long overdue initial release!

https://blog.torproject.org/blog/stem-release-10

Hi all. After eighteen months of work and a number of delays rivaling those of the Big Dig I’m pleased to announce the initial release of stem!

For those who aren’t familiar with it, stem is a python controller library for tor. With it you can write scripts and applications that interact with your tor client or relay. For some examples of what you can do see the tutorials on…

https://stem.torproject.org/
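As a rough idea of what such a script looks like, here is a minimal sketch that asks tor how much traffic it has handled. It assumes stem is installed and that tor is running locally with 'ControlPort 9051' set; the summarize_traffic() and main() helpers are names made up for this illustration rather than part of stem.

```python
def summarize_traffic(bytes_read, bytes_written):
    """Format tor's traffic counters (GETINFO hands them back as strings)."""
    return "tor has read %s bytes and written %s bytes" % (bytes_read, bytes_written)


def main(control_port=9051):
    # Imported lazily so the helper above is usable without stem installed.
    from stem.control import Controller

    # Assumes tor is listening for controllers on this port.
    with Controller.from_port(port=control_port) as controller:
        controller.authenticate()  # pass password='...' if your tor needs one

        print(summarize_traffic(
            controller.get_info("traffic/read"),
            controller.get_info("traffic/written"),
        ))
```

Calling main() connects, authenticates, and prints the two counters. The tutorials linked above cover this in more depth.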

Stem is compatible with python 2.6 and higher (including the 3.x series), and is a near complete implementation of tor’s control and directory specifications. It has relatively high test coverage (~80% for most modules) and integration tests to check its continued interoperability with new releases of tor.

As always, if you encounter issues or have feature requests then please let me know! Also, if you write something that uses stem then please tell me, both so we can continue to improve our API and expand the tutorial’s list of examples.

Many thanks to everyone that helped make this initial release of stem possible, both with its development and packaging!

Hi all. Between being a short month and oncall for work I didn’t get as much done as I’d like. What time I did have for tor mostly went into stem’s descriptor functionality. In particular…

Microdescriptor Support

Stem can now read and parse microdescriptors, with controller methods coming later this weekend.

I understand the desire for lightweight descriptors but they’re a step backward for controllers. Their lack of fingerprints makes them clunky to use, and tor lacks the usual methods for retrieving them (ticket) and v3 network status information (ticket). Controllers will often need to read descriptor content from the data directory until those are fixed.

General Descriptor Improvements

  • Invalid descriptor content within archives caused the reader to stop processing content from the archive. This bug is simple in retrospect but cost me around a week of hair pulling frustration to sort out. Thanks to Karsten for catching it! (ticket)

  • Descriptor readers can now optionally provide network status documents rather than the entries they contain. Feature request by Karsten. (ticket)

  • Calling str() on Descriptors choked if they contained unicode content. Caught by Sathyanarayanan. (ticket)

  • Thanks to Karsten, Descriptors now provide hex digests. (ticket)

  • Discussed the new ‘flag-thresholds’ attribute and added support for it to stem. (ticket)

  • Descriptor parsers that used readline() could choke if derived from a descriptor archive. Caught by Karsten.

Other Tasks…

  • Discussed proposal 218 with Karsten on tor-dev@. (thread)

  • Changed our consensus-tracker script to alarm for non-exits (we had an undetected sybil attack early in the month). Also fixed an issue where the script gobbled up way too much memory. (ticket)

  • Made a few backward incompatible changes to improve stem’s usability in anticipation of our March API freeze…

    • Version comparison is now done through normal comparison operators rather than a meets_requirements() method.
    • Renamed the keyword arguments for Controller.from_port() and others to be less verbose (for instance ‘control_port’ to just ‘port’).
    • Dropped the ‘path’ arg from parse_file(), it was never intended for external callers.
  • Variety of bug fixes…

    • We didn’t recognize a ‘NEVER’ date in ADDRMAP events. Caught by Desoxy. (ticket)
    • Patch from Abhishek so our tests avoid static /tmp usage. (ticket)
    • Added copyright notices throughout most of our codebase. Suggested by Juan. (ticket)
    • Addressed issues with get_process_name() on OSX. Caught by Sathyanarayanan. (ticket)
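To illustrate the version comparison change listed above, here is a small sketch of the general technique: a class that supports the normal comparison operators instead of a dedicated method. This is not stem's actual implementation. The Version class below is a hypothetical stand-in that only understands purely numeric versions, so it would reject suffixes such as '-alpha'.

```python
import functools


@functools.total_ordering
class Version(object):
    """Hypothetical stand-in for a dotted numeric version like '0.2.3.25'."""

    def __init__(self, version_str):
        self.version_str = version_str
        # Compare numerically so that '0.2.10' sorts after '0.2.9'.
        self._parts = tuple(int(part) for part in version_str.split("."))

    def __eq__(self, other):
        return isinstance(other, Version) and self._parts == other._parts

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._parts < other._parts

    def __hash__(self):
        return hash(self._parts)
```

With this callers can write checks like 'Version("0.2.3.25") >= Version("0.2.2.39")' directly, and total_ordering fills in the remaining operators from __eq__ and __lt__.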

Hi all. This January I’ve been integrating feedback from first time stem users and implementing their feature requests. Projects included…

Python 3.x Support

Stem now works under both the python 2.x and 3.x series. To use stem with python 3 simply run the installer with it…

python3 setup.py install

This turned out to be a larger project than I had anticipated, taking almost half the month. But it was well worth the effort.

PEP8 Compliance

Ditched most of my odd coding preferences in favor of the standard python style guide. I’ve integrated both pep8 and pyflakes with our tests to prevent regression. Hopefully this’ll make it easier for others to contribute to stem.

Arm Codebase Refactoring

Prior to being shanghaied into the projects above I sank quite a bit of time into overhauling arm. Thus far I’ve dropped around a third of the codebase in favor of similar (but tested!) capabilities in stem.

Descriptor Improvements

Got lots of input concerning stem’s descriptor module (thanks Karsten!), leading to quite a few improvements…

  • used feedback from Aaron Johnson to make the descriptor API less confusing
  • we now support bridge network status documents (ticket)
  • added support for ‘-legacy’ authorities (ticket)
  • parsing error if network status documents lacked a ‘directory-footer’ line (ticket)
  • empty ‘bridge-ip-versions’ lines caused problems (ticket)
  • we didn’t recognize the @type annotation for key certificates (ticket)
  • we weren’t parsing the new ntor-onion-key lines (caught by sonu, ticket)
  • error when run with pypy (caught by peer)

Sean helped quite a bit at the start of the month, but has since been busy with other things.

  • added a Controller.get_streams() method (ticket)
  • tests for the close_stream() method (ticket)
  • expanded the Controller’s unit tests (ticket)
  • fixed type checks (ticket)
  • fixed test_reattaching_listeners (ticket)

Hi all. Between the holidays and being oncall for work I didn’t have very high hopes for December. However, by the time the dust settled it turned out to be a surprisingly productive month. Projects included…

  • Finishing stem support for event handling. This was the last major feature we were missing before having feature parity with TorCtl.

  • Ported arm and the consensus-tracker to stem. The arm migration went surprisingly smoothly, but there’s still a lot of cleanup work left to do here. Ideally arm will be a far simpler codebase now that it doesn’t need a wrapper module around the controller.

  • Moved stem’s site to "https://stem.torproject.org/". (ticket)

  • Smaller things include…

    • finally fixed the periodic freezes in arm (ticket)
    • uniform support for a default response in Controller getters (ticket)
    • vastly improved performance and memory usage for the ExitPolicy class
    • expansions for descriptor handling (ticket 1, 2, 3)
    • extend_circuit(), attach_stream(), and get_circuits() support (patches by Ravi, ticket 1, 2)
    • TAKEOWNERSHIP support (thanks to Lunar^ for the initial patch, ticket)
    • fixed a bug where circuit/stream ids were sometimes ints (caught by Lunar^, change)
    • added a post-authentication hook so event listeners can be reattached to Tor
    • several OPW discussions with Marina (we didn’t get any substantial applications)
    • added flash proxy and txtorcon to the volunteer page, and made lots of general revisions
    • discussed TorCtl deprecation with Mike and made the announcement
  • Besides this, Sean Robinson has been submitting an absurd number of fixes, improvements, and code reviews of his own. Many thanks!

    • version pre-requirement checks for events and tests (ticket)
    • testing expansion for malformed events (ticket)
    • close_stream() method (ticket)
    • STREAM_BW event handling (ticket)
    • testing util expansion to make it easier to test client use cases (ticket)
    • get_socks_listeners() method and related mocking changes (ticket)
    • … and many, many more (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
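For the 'default response in Controller getters' item above, the general pattern looks something like the following sketch. The names here (UNDEFINED, QueryFailed, get_value, and fake_lookup) are all invented for illustration and aren't stem's API. The idea is simply that a getter raises on failure unless the caller supplied a default, and that a sentinel lets None itself be a valid default.

```python
UNDEFINED = object()  # sentinel, so callers can pass None as a real default


class QueryFailed(Exception):
    """Hypothetical error raised when a lookup can't be satisfied."""


def get_value(lookup, key, default=UNDEFINED):
    """Return lookup(key), falling back to the default if one was given."""
    try:
        return lookup(key)
    except QueryFailed:
        if default is UNDEFINED:
            raise  # no default provided, so the caller sees the error
        return default


def fake_lookup(key):
    # Stand-in for a real controller query.
    values = {"version": "0.2.3.25"}

    if key not in values:
        raise QueryFailed("unrecognized key: %s" % key)

    return values[key]
```

Here 'get_value(fake_lookup, "bogus", None)' returns None, while 'get_value(fake_lookup, "bogus")' raises QueryFailed.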

November was a month that included eating far too much pie. There was some stem work too on the side, though delicious pumpkin pies are always how that month should be remembered.

My primary focus for November was tor event handling, which is the last major feature we need before having parity with TorCtl. We presently support nine of the nineteen event types, including the most commonly used ones (logging, BW, CIRC, STREAM, etc). I’ll be spending a good chunk of December finishing this up.

Besides this I’ve been really thrilled at how contributors are coming out of the woodwork to help…

Ravi

  • Ravi has volunteered to take a lead on moving stem onto tor’s site. Unfortunately this is presently blocked on getting a subdomain.

  • Provided a patch to move stem’s controller exceptions to the top level namespace. (ticket)

  • Fix for the repurpose_circuit() integ test (ticket) and discovered an issue with the stem.process test. (ticket)

Eoin

  • Submitted a great patch overhauling and expanding verification of server descriptor content. (ticket)

  • Caught a possible tor bug related to ‘GETINFO orconn-status’ queries when disconnected. (ticket)

  • Numerous spelling fixes (change) and caught an issue with respect to how the descriptor reader handles archives. (change)

Sean

  • Reviewed my event parsing branch, offering feedback (ticket) and catching a bug where STREAM events could have a zero port. (ticket)

  • Submitted patches to add close_circuit() to the Controller (ticket) and a setup.py (ticket). The former led to a discussion about stem’s licensing and copyright for patches.

  • Helped resolve an issue with EXTENDCIRCUIT where we weren’t taking into account when the path was optional or not. (ticket)

Other things I did this month include…

  • Preparation for the 2013 Outreach Program for Women, the application deadline for which is now only two days away. This mostly involved helping others add their project ideas to the volunteer page and adding one of my own.

  • Made a landing page for stem’s bug tracking and linked to it from stem’s site.

  • Revamped stem’s enum documentation to be both more readable and support interlinking. (change)

  • Provided a code review for Karsten’s pygeodate.py. (ticket)

  • Answered a handful of controller inquiries on our lists. Stem’s now at a point where I don’t mind suggesting it to developers. If you’re scripting or writing an application around tor then please give stem a try! I’d love to get more feedback on where its rough edges are before we make an initial release. (1, 2, 3, 4)

Ahhh, the beginning of winter. That very special time of the year when Seattle skies are perpetually overcast and gloomy. When crowds flock to the coffee shops and the best thing to do with your time is hook up to a caffeine IV and code.

… on a side note it has been a really productive month.

  • Stem Website

    Stem now has a far more developer friendly website!

    https://stem.readthedocs.org/en/latest/index.html

    Barring the obvious new sections I revised vast swaths of stem’s API documentation.

    Feedback welcome! The tutorial section is just starting, but I have some ideas for expanding it.

  • Network Status Document Parsing

    After a couple months of work by Ravi and me the network status document handling is finally done. It turned out to be a… very big feature branch.

    With this done we’re very nearly at feature parity with metrics-lib (and I suspect a bit past it in terms of testing). In finishing this up I also spotted an undocumented oddity with microdescriptor ‘directory-signature’ lines.

  • GSoC Mentor Summit

    Met with the developers of numerous other open source projects. I was really impressed at how many worthwhile conversations were crammed into such a short conference. Highlights include…

    • Arc from Python had distutils suggestions, thanks to which stem and arm will soon have both Python 2.x and 3.x support.
    • OSU’s Open Source Labs are entertaining the thought of running Tor relays.
    • Discussed TorBirdy and other Tor projects with Sukhbir. He mentioned an issue with trac permissions that is now fixed.
    • Talked with Terri from Python about Mailman 3, which would be a great answer for the requests we get to have a Tor forum.
    • Attended a talk led by Marina from Gnome about their outreach program for women. We’ll likely be taking part in it this year.

Other changes include…

  • Cleaning up orphaned pyc files as a part of running our tests (feature suggested by Ravi, ticket)
  • Ported arm’s str_tools module to stem and added unit tests (ticket, docs)
  • Code reviewed Ravi’s attach_stream addition, waiting on the revisions (ticket)
  • Arm troubleshooting on irc with zack and ultramage. Also fixed a broken link on arm’s site spotted by wh6iQ.
  • A new contributor, Eoin, spotted some great issues and contributed patches. Tickets include…

    • python 2.5 compatibility bugs (ticket)
    • count of the number of skipped tests (ticket)
    • error with the whitespace checker (ticket)
    • spelling corrections (ticket)
  • Sathyanarayanan spotted some nice bugs including…

    • on python 2.5-2.6 a missing microdescriptor consensus wasn’t causing the related test to be skipped (ticket)
    • integ tests for the process module would fail if tor was already running (ticket)

Next up: Ravi and I are working on tor event handling, the last major feature stem is missing as a controller library. After that we plan to port arm over, then tidy up loose ends in preparation for stem’s initial release.

As per conferencing tradition Friday was spent on travel and meeting the other attendees. Some of the highlights for me were…

  • David from KDE

    Besides demoing some KDE eye candy we discussed their project infrastructure. KDE is a federation of smaller projects and had over sixty students this year (ten times the number mentored by Tor).

    Their project’s scale has led to some unusual infrastructure decisions. For instance, they have a partly decentralized git infrastructure where pushes go to a single master host and pulls are from any of several mirrors. The config they use to do this leads to some… odd behavior. For instance a ‘git pull’ updates your tracking branch but not the origin branch reference. The result is that to do a pull for realz you need to call *both* pull and fetch. No doubt they also get fun behavior from mirroring delays…

    We also talked a bit about post-review and defaults they could set to better support their setup. KDE has the largest public ReviewBoard instance, but the above git setup makes it a bit confusing to use.

  • Sukhbir from Debian

    In 2011 Sukhbir applied to us for GSoC to work on TorBirdy. We loved his proposal, but due to prior commitments he ended up working with Debian instead. Since then he has become a GSoC mentor for Debian and gotten involved with Tor by implementing his earlier proposal for TorBirdy.

    Sukhbir’s interested in getting even more involved with Tor so we discussed other projects that might interest him, and ways that we could better publicize TorBirdy on our site.

  • Arc from Python

    When I found out that Python had a mentor at the summit I made a mental note to hunt him down and ask about packaging best practices. After an unexpected discussion about rugby I found out that it’s actually easy to support both python’s 2.x and 3.x series by including a 2to3 conversion at build time. This can be done via either distutils or distribute.

    I also asked him to look into his crystal ball for when python 3 would take over the world and he said ‘Next year. Ubuntu and Fedora are ready and willing to make the switch. The last main holdout is Gnome. They tried to migrate but work there isn’t finished yet.’

Saturday was the first day of the unconference. After an amusingly confused attempt to have each of the couple hundred attendees shake each other’s hand there were sessions. Some were a little interesting, but I spent more time on the hallway track since that’s the real benefit of the summit. The only useful tidbits I got from the talks were…

  • Do outreach early. The successful GSoC students who stick around tend to be the ones that get involved before the application phase. We should try harder to recruit college students to hack on tor, with the carrot that this’ll give them a leg up when applying for the program. OpenHatch might be something to look into for this. This would be a nice task for a community manager if we get one…
  • Google Code In is a program somewhat similar to GSoC where high school students become involved with open source. Last year they had 18 organizations and this year they’re narrowing it down to 10. I was already highly tentative about having us apply and now that I’ve heard more I’m sure we don’t have enough bandwidth for the hand-holding this would require.

As for the hallway track…

  • Adriano and Luis from Umit

    Last year Adriano showed me Open Monitor, a censorship detector written in python. Sounds familiar? I thought so too, and tried a few times to get them to talk with Ooni Probe and vice versa without success. My impression is that they’re UI developers (a skillset we sorely lack in the tor project) with a rather unscalable backend, while Ooni Probe’s backend is far more mature but lacks any sort of UI for rendering real time censorship information.

    I made another stab at getting the two projects to talk, after which the meeting took a weird turn with Adriano arguing that ‘some censorship is good’. Evidently they decided that Open Monitor won’t look for censorship concerning ‘porn or terrorism’. I argued that this was a slippery slope and that censorship monitoring shouldn’t try to pass a moral judgment on the content being censored, but after a time it was clear that we were talking past each other.

    I still think that we should leverage their UI expertise, but that’s up to the Ooni Probe devs.

  • Open Source Lab

    Met with a couple administrators from OSU’s Open Source Labs. They provide hosting for several of the largest open source projects including Apache and the Linux Foundation. Mostly we talked about amusing legal threats they get for hosting the phpBB project. Evidently lawyers are quite skilled at clicking the ‘this is a phpBB forum’ link followed by ‘hosted by the OSL’ before sending their angry emails. We also talked a bit about setting up non-exit relays. They might be pretty receptive to this if we want to follow up.

  • Sumana and Rob from Wikimedia

    Unsurprisingly Wikipedia occasionally has issues with spammers using Tor. We talked about some possible options, such as requiring accounts for Tor users to edit with a sort of proof of work in account creation to make ban evasion more of a pita.

  • Terri from Python

    Mailman 3 is coming, and with it an interface that *doesn’t* look like it came from the 1980s! Most importantly for us, the new version of Mailman provides a forum interface, letting email and forum users communicate by whichever method they prefer. This would be a good answer to our forums ticket. She estimates that it’ll be ready in six months or so.

Flying out on Sunday cut my day short, but there was one session that I thought was interesting. Gnome and Wikimedia are launching a program similar to GSoC to encourage more women to get involved with open source. It runs later this winter. One gotcha is that Google’s not involved so mentoring orgs need to cover the $5k stipend.

I like the idea. Is this something we want to take part in? If so then I’d be happy to administer the non-financial parts of it.

Hi all. As is often the case work and such meant not too much time for tor. For September all I have to report is…

  • Network Status Document Parsing

    This has been my main focus for September and it’s still not finished… but it’s close! Version 3 document parsing just has a couple days of work left, then abstracting it to cover v2 documents and microdescriptors should be relatively easy-ish. I’m really looking forward to merging this feature branch. It has grown quite monstrous…

  • Stem Documentation Hosting

    For a while now I’ve had a TODO item for making a nightly cron that built and hosted Stem’s new sphinx documentation. I was about to do this when I recalled that meejah once recommended ReadTheDocs, a service that does… well, exactly that. After a few minor bumps (1, 2, 3) it’s now live…

    https://stem.readthedocs.org/

    We definitely need to put effort into making them more reader friendly. At present it’s just a dump of all the pydocs which, while informative, is actually a bit overwhelming for new users. Module summary pages would greatly help.

  • MAPADDRESS Support

    Ravi submitted a patch for adding MAPADDRESS support to stem’s controller. It’s a nice addition, especially the integ test.

  • Arm Issues

    Looked into a couple arm issues…

    • Tor’s start time didn’t show up if the system has proc contents but we fail to parse it (ticket).
    • Can’t connect when using a control socket with password auth (ticket).

  • Updated Dev Wiki

    In response to a potential volunteer I wrote a summary of several development tasks. Updated stem’s wiki with what was in that email.

Hi all. I spent most of August, like the prior month, traveling. This time I attended Toorcamp and went on vacation. Both were pleasant and relaxing, but not terribly conducive to coding so I don’t have much to report…

  • Descriptor CSV Export Functionality

    Naif proposed export functionality for Tor server descriptors a while back, which Eric and Megan took the first stab at. I ended up revising this quite a bit, but it turned out nicely.

  • Caching Expansion and Test Prompt

    Added caching for GETCONF and static GETINFO queries. I also added a handy little script for a debugging interpreter. It kicks off Tor, then provides an interactive python prompt with a controller instance to it (optionally shutting Tor down afterward). Ravi code reviewed these changes and volunteered to do code reviews of my future work as well.

  • Consensus Parsing

    Over the last few weeks Ravi’s been working on a patch to parse network status documents. It’s functional, but missing unit tests and deviates from the parsing style and strict validation done for the other descriptor types so I’m taking a turn with the code. Thus far I’m done revising and adding tests for router status entries, and now working on the document. Changes are available in the ‘document-parsing’ branch of my repo.

    Parsing these documents is a far larger task than I thought (especially if you include v2 documents and microdescriptors), so working on this branch will probably keep me occupied for much of September.

  • Controller Expansion

    Ravi has gone on a hacking binge, adding support for USEFEATURE, SIGNAL, EXTENDCIRCUIT, and SETCIRCUITPURPOSE.
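As for the GETCONF and static GETINFO caching mentioned earlier in this post, the rough idea is sketched below. This is not stem's code: CachingQuerier and its arguments are hypothetical, and which keys are safe to cache is an assumption left to the caller. Values that never change while tor runs are remembered after the first query, and everything else goes to tor every time.

```python
class CachingQuerier(object):
    """Hypothetical wrapper that remembers answers for keys marked static."""

    def __init__(self, query_func, cacheable_keys):
        self._query_func = query_func  # callable that actually asks tor
        self._cacheable_keys = frozenset(cacheable_keys)
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            return self._cache[key]

        value = self._query_func(key)

        # Only remember values we were told won't change.
        if key in self._cacheable_keys:
            self._cache[key] = value

        return value

    def clear_cache(self):
        # Configuration values would need this after something like a SETCONF.
        self._cache.clear()
```

For example, a querier built with cacheable_keys of ["version"] only calls query_func once for "version" no matter how often it's requested, but calls it every time for "traffic/read".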