Hi all! Getting in lots of family time as I sort out ongoing health issues, but nonetheless got some neat things to report this month!


Interpreter Panel

Part of Sambuddha’s GSoC project, half my month has gone toward reintroducing nyx’s interpreter and I’m pleased to say it’s turned out great!

Built upon Stem’s tor-prompt this expands upon the capabilities of the last arm release, providing an interactive python interpreter along with updated tor capabilities.


Curses Test Coverage

Concluding a four-month overhaul, nyx now has a high degree of test coverage for its curses display capabilities which is good cuz… well, that’s kinda what nyx does. To answer the obvious question: no, there’s still no release date, but this is a major milestone. Remaining work I have planned includes…

  • Refactor nyx’s menu and controller modules.
  • Put together a new site for nyx.
  • Run an open beta to solicit testing and ideas from our community.

Stem was in the works for two years before its first release. Nyx too will be ready when it’s ready.

  “Always do things right. This will gratify some people and astonish the rest.” -Mark Twain

Summer is here! Family, festivals, and other totally-not-tor things are occupying much of my time so this is gonna be short…

  • Expanded Nyx tests to include the graph and log panels. With this we have coverage of 66% of the curses components.
  • Stem support for new tor additions, highlights of this month being shared randomness and ADD_ONION basic auth.
  • GSoC midterm evaluations – everyone passed!

That’s all – now off to enjoy the sun!

Eeek I’m late! Wanted to get the PyCon trip report out first but… ok, maybe I went a tad overboard there. Oh well, better late than never. Between summer festivals and health issues tor took the back seat again this month but still some neat stuff…


PyCon

Probably wrote way too much on this already so I’ll spare ya. If you haven’t skimmed the report yet then check it out – PyCon was a neat event!


Nyx Event Selection Dialog

Remember arm’s bizarre and clumsy event selection dialog? Remember how confusing it was?

Yeah, it sucked. Sambuddha and I have been tossing pull requests back and forth all month as part of his GSoC project. This is requiring a lot of my direct involvement but oh well, the new dialog has turned out nicely!


Few other noteworthy things…

I’ve been to quite a few conferences. LinuxFest Northwest, SeaGL, PETS, Toorcamp, Defcon (prior trip report), but PyCon was particularly impressive. At over three thousand attendees with five parallel tracks of talks the word ‘busy’ hardly seems to do the conference justice.

Top TL;DR highlights for me were new capabilities in the Python 3.x series and HTTP 2.0. In particular…

  • Python 3.6 releases on Christmas, finally adding string interpolation!

    >>> name, job = 'Damian', 'software engineer'
    >>> print(f'{name} is a {job}')
    Damian is a software engineer
    

  • Python 2.x support will be completely discontinued in 2020.

  • New async/await keywords in Python 3.5 provide built-in support for Twisted-style async IO.

  • Gradual type syntax in Python 3.5 makes code even more self-documenting and supportive of static analysis.

  • First major protocol update since 1999, HTTP 2.0 is now supported by all modern browsers and 60% of users in the wild. Connection multiplexing allows all site assets to be retrieved over a single connection, improving latency on the order of 50%. The new protocol also negates any need for the clever performance hacks we’ve developed over the years like asset minimization and sprite maps!
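To make the async/await and type hint bullets above a bit more concrete, here is a minimal sketch. The greet() and main() names are made up for illustration, and asyncio.run() needs Python 3.7 or later (on 3.5 you would drive the event loop by hand).

```python
import asyncio

async def greet(name: str, delay: float) -> str:
  # 'await' hands control back to the event loop while we wait
  await asyncio.sleep(delay)
  return f'{name} is done'

async def main() -> list:
  # both coroutines run concurrently, results keep the argument order
  return await asyncio.gather(greet('first', 0.02), greet('second', 0.01))

results = asyncio.run(main())
print(results)
```

The annotations don't change runtime behavior. They are there for readers and for static analysis tools.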

PyCon 2017 will be in Portland one more time before moving on to another venue, so if the following sounds interesting then check it out!

 


 

Serendipity is delightful. My first time taking the train, I strongly suggest Amtrak (particularly the Coast Starlight) if heading down to Portland. Comfortable, scenic, and by happy coincidence sat with Sarah Leivers: PyCon speaker with roots in the UK deaf community.

Sarah made the interesting point that even for deaf communities in English speaking countries English is often a second language. Signing is their native tongue, putting them at a disadvantage when it comes to involvement in our communities. Part of the larger ESL puzzle, our discussion was a nice reminder of why it’s important to keep documentation as linguistically simple and accessible as we can.

In the observation car the Parks Department described sights we passed, my favorite being the Centralia train station. Completed right around the time these newfangled ‘airplane’ things were taking off, to celebrate they decided to christen the building with champagne. Three bottles were loaded onto a plane and dropped. The first couple bottles missed but the third hit dead on, puncturing right through the roof.

Spoiler alert: this was the last building they christened in such a way.

Go to a conference without exploring the area and you’re doing it wrong. My train left me a few hours to explore the city, starting with the Portland Saturday Market. Easily comparable to Pike Place, the market is four city blocks jam-packed with all the essentials of life: hand-carved bark houses, tie-dye, and of course fancy hats!

Next hit the Lan Su Chinese Garden, beautiful gem nestled into the heart of downtown…

Of course visited Ground Kontrol just a block away. Classic arcade that successfully reminded me just how much I suck at Marble Madness. In my defense haven’t played since my good old Amiga 2000…

Finally, hidden below my hotel lurked a black light pirate themed putt-putt course. So… seems that’s a thing!

With that out of the way, on to the conference!


File Descriptors, Unix Sockets and other POSIX wizardry

First talk of the first day, Christian Heimes gave a crash course on *nix file descriptors. In python descriptors are fetched with f.fileno() and Christian demoed interacting with them directly to open his cd tray.

Christian’s talk focused on file descriptor basics (which honestly I’m rustier on than I should be)…

  • Descriptors 0-2 are reserved for stdin/stdout/stderr with -1 for errors.
  • Fork clones the current process, with the child’s descriptors pointing to the same global entries as the parent’s.
  • Exec replaces the current program, inheriting the prior descriptors (which is why pipes continue to work).
  • Descriptors can be delegated. This is useful in sandboxing situations like seccomp, allowing a broker to open files/sockets on a sandboxed process’ behalf.
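As a rough illustration of two descriptors sharing one global entry, here is a small sketch. It uses os.dup() as a stand-in for fork (both leave you with a second descriptor for the same open file), and the demo_shared_offset() helper and its temp file are assumptions for the demo, not something from the talk.

```python
import os
import tempfile

def demo_shared_offset():
  # write a small file that we can read back through raw descriptors
  with tempfile.NamedTemporaryFile(delete = False) as tmp:
    tmp.write(b'hello world')
    path = tmp.name

  fd = os.open(path, os.O_RDONLY)
  dup_fd = os.dup(fd)  # second descriptor for the same open file entry

  try:
    first = os.read(fd, 5)       # advances the shared file offset
    rest = os.read(dup_fd, 100)  # picks up where the first read stopped
  finally:
    os.close(fd)
    os.close(dup_fd)
    os.remove(path)

  return first, rest

print(demo_shared_offset())
```

Because the offset lives in the shared entry rather than in the descriptor, the second read continues after the first instead of starting over.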

Lastly Christian walked through a little strace example that illustrates how descriptors are used in a basic scenario…

% cat reader.py
with open('/home/atagar/Desktop/reader.py') as my_file:
  print(my_file.read())
% strace python reader.py
...
open("/home/atagar/Desktop/reader.py", O_RDONLY|O_LARGEFILE) = 3
read(3, "with open('/home/atagar/Desktop/"..., 4096) = 80
read(3, "", 4096)                       = 0
close(3)                                = 0
write(1, "with open('/home/atagar/Desktop/"..., 81) = 81

Refactoring Python: Why and how to restructure your code

Nice presentation by Brett Slatkin, the author of Effective Python, on how and when to make code more maintainable. As developers we optimize for making things work in our first pass, and for many of us that’s where the story ends. To make code that’s truly easy to follow requires time and patience to take follow-up passes that optimize for maintainability. That’s something most developers don’t do.

To illustrate this Brett asked: how much of your coding time goes toward implementation? 90%? 75%? The few developers he knows that write easy to follow code only do so because they spend fully half their time refactoring anything they write. Maintainability isn’t cheap, and when faced with deadlines it’s often the first thing to go.

Brett’s other main takeaway was that without tests you’re DOA. Refactoring requires a willingness to make mistakes, and without high coverage any major overhaul of production systems is in practice impossible.

This dovetailed nicely with the following talk, Code Unto Others, which gave a few tips…

  • When it comes to maintainability remember that you don’t scale. Any rough code you write is something you’ll need to explain over and over to every engineer that touches it. That’s not really how you want to spend your time, is it?
  • Commonly people can track 5-9 things at a time which is why phone numbers are seven digits. Subdivide modules to take advantage of this. As a counter-example they used Mercurial’s Repository class, a 17,000 line headache for newcomers.
  • Be wary when a description of your module needs the word ‘and’ (“it does this and that”). If you need that word you’re probably doing it wrong. After reading the first half of a class you should be able to take an educated guess at what you’ll see in the second.

Finding closure with closures

Peek under the hood at how Python implements closures…

>>> import os
>>> def print_greeting(first_name):
...   def msg(last_name):
...     platform = os.uname()[0]
...     return "Hi %s %s, you're running %s" % (first_name, last_name, platform)
...   print(msg('Johnson'))
...   print("co_varnames: %s" % ', '.join(msg.__code__.co_varnames))
...   print("co_names: %s" % ', '.join(msg.__code__.co_names))
...   print("co_freevars: %s" % ', '.join(msg.__code__.co_freevars))
... 
>>> print_greeting('Damian')
Hi Damian Johnson, you're running Linux
co_varnames: last_name, platform
co_names: os, uname
co_freevars: first_name

varnames are local variables while freevars are variables we’re closing over from the outer scope. A gotcha that’s probably bitten every python dev is that assigning to a closed over variable doesn’t rebind it. Instead Python treats the name as a brand new local…

>>> import random
>>> def get_score():
...   total = 0
...   def add_points():
...     total += random.randint(0, 5)
...   for i in range(3):
...     add_points()
...   return total
... 
>>> get_score()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in get_score
  File "<stdin>", line 4, in add_points
UnboundLocalError: local variable 'total' referenced before assignment

Python 3.x adds a new ‘nonlocal’ keyword for re-binding closures but for those of us stuck in the past our best option is to use the mutable hack. Gross, but it works.

>>> def get_score():
...   total = [0]
...   def add_points():
...     total[0] = total[0] + random.randint(0, 5)
...   for i in range(3):
...     add_points()
...   return total[0]
... 
>>> get_score()
8
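For comparison, here is a sketch of the same function using the ‘nonlocal’ keyword, assuming Python 3.x. It is the earlier get_score() example with only the declaration added.

```python
import random

def get_score():
  total = 0

  def add_points():
    nonlocal total  # rebind the enclosing variable rather than making a local
    total += random.randint(0, 5)

  for i in range(3):
    add_points()

  return total

print(get_score())
```

No mutable wrapper needed, and the intent is spelled out right where the rebinding happens.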

What is and what can be: an exploration from ‘type’ to Metaclasses

Owww, my head. This and another talk the previous day by Mike Graham introduced audiences to the wonderful world of python metaclasses…

    “The subject of metaclasses in Python has caused hairs to raise and even brains to explode.” -Guido

A method for redefining the fundamental behavior of objects (and in doing so tearing the fabric of reality), metaclasses are what you invoke each time you extend object. Dustin demonstrated this by defining his own metaclass that transparently causes method invocations to be accompanied by a bark…

from functools import wraps
from inspect import isfunction

def bark(f):
  @wraps(f)
  def wrapper(*args, **kwargs):
    print("bark!")
    return f(*args, **kwargs)

  return wrapper

class MetaDog(type):
  def __new__(meta, name, bases, attrs):
    # separate loop variable so we don't clobber the class' name
    for attr_name, attr in list(attrs.items()):
      if isfunction(attr):
        attrs[attr_name] = bark(attr)

    return type.__new__(meta, name, bases, attrs)

class Dog(metaclass = MetaDog):
  def sit(self):
    print("*sitting*")

  def stay(self):
    print("*staying*")

d = Dog()
d.sit()

So why will you use this? Well… hopefully you won’t. Besides the obvious unforgivability of this sin upon your coworkers, this is the kind of black magic Ruby folks do all the time but Python devs know better. Like redefining builtins, just don’t.

That aside, it was interesting to learn a little more about the abstract base class module and how python works under the hood.


Building protocol libraries the right way

Cory Benfield, a core developer of Requests, urllib3, and other core I/O libraries, discussed a common pitfall that afflicts protocol libraries: mixing I/O with parsing.

Python has as many HTTP parsers as there are I/O libraries. Urllib variants, aiohttp, Twisted, Tornado, and friends all reinvent this wheel. Code re-use is particularly great when you have a well defined problem with a single correct solution. Arithmetic, compression, and parsing are all examples of this, so why don’t they all share a unified parser?

The problem is that we tangle network I/O with parsing of the messages we read. As such all these projects trip over the same obscure edge cases and re-implement the same optimizations.

Cory’s message was simple: keep parsing separate. Besides code reuse this greatly improves testability because you don’t need to invoke your I/O stack for coverage.
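To sketch what ‘keep parsing separate’ can look like, here is a tiny parser that only ever sees bytes. The LineParser class is hypothetical (it is not from Cory’s talk or any library). Whatever does the socket reads just calls feed() with what it received.

```python
class LineParser(object):
  """
  Hypothetical I/O-free parser. It is fed raw bytes and hands back complete
  CRLF-terminated lines, never touching a socket itself.
  """

  def __init__(self):
    self._buffer = b''

  def feed(self, data):
    self._buffer += data
    lines = self._buffer.split(b'\r\n')
    self._buffer = lines.pop()  # keep any trailing partial line for later
    return lines

parser = LineParser()
print(parser.feed(b'250-version=0.2'))
print(parser.feed(b'.3\r\n250 OK\r\n'))
```

Since nothing here depends on the network, tests can exercise awkward cases like a message split mid-line by simply calling feed() twice.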

Personally I found this talk interesting because this is exactly something I ran into with Stem. To work, our I/O handler needs enough understanding of the control-spec to delimit message boundaries, but beyond that parsing is a completely separate module. This has been a great boon for testing…

TEST_MESSAGE = """\
250-version=0.2.3.11-alpha-dev
250 OK"""

def test_single_getinfo_response(self):
  """
  Parses a GETINFO reply response for a single parameter.
  """

  control_message = stem.response.ControlMessage.from_str(TEST_MESSAGE, msg_type = 'GETINFO')
  self.assertEqual({'version': b'0.2.3.11-alpha-dev'}, control_message.entries)

HTTP can do that?!

Whimsical look at lesser known bits of the HTTP specification…

  • Need just metadata of a GET request? Use HEAD instead for a far lighter response.
  • Calling OPTIONS will tell you the HTTP operations a resource supports.
  • Besides normal CRUD operations (GET, POST, PUT, DELETE) the HTTP spec has PATCH to update just part of a resource.
  • The specification also has TRACE, LINK, and UNLINK methods. Nobody uses them but hey, they’re there.
  • Few interesting headers include ETag for versioning resources, If-Modified-Since to only solicit a response if the resource has changed, and Cache-Control to define cacheability. Actually, the specification even has a From header in case you want to tell everybody in the world your email address…
  • Few standard but infrequently used response codes are…
    • 410 – That resource used to be here but now it’s gone.
    • 304 – You asked to get this resource if it’s been modified but it hasn’t.
    • 451 – Unavailable for legal reasons. Mostly comes up with censorship firewalls.
  • Unsurprisingly you can make up your own status codes and reason strings. Sumana had several amusing ones she’s found in the wild.
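As a small illustration of the HEAD bullet above, the following sketch spins up a throwaway server on localhost and compares the two methods. The Handler class, fetch() helper, and BODY constant are all made up for this demo, and it assumes binding a loopback port is allowed.

```python
import http.client
import http.server
import threading

BODY = b'hello world'

class Handler(http.server.BaseHTTPRequestHandler):
  def _send_headers(self):
    self.send_response(200)
    self.send_header('Content-Type', 'text/plain')
    self.send_header('Content-Length', str(len(BODY)))
    self.end_headers()

  def do_HEAD(self):
    self._send_headers()  # metadata only, no body

  def do_GET(self):
    self._send_headers()
    self.wfile.write(BODY)

  def log_message(self, *args):
    pass  # keep the demo quiet

def fetch(method):
  # port 0 asks the OS for any free port
  server = http.server.HTTPServer(('127.0.0.1', 0), Handler)
  thread = threading.Thread(target = server.serve_forever)
  thread.daemon = True
  thread.start()

  try:
    conn = http.client.HTTPConnection('127.0.0.1', server.server_address[1])
    conn.request(method, '/')
    response = conn.getresponse()
    result = (response.status, response.getheader('Content-Length'), response.read())
    conn.close()
    return result
  finally:
    server.shutdown()
    server.server_close()

print(fetch('HEAD'))
print(fetch('GET'))
```

Both responses carry the same headers, but only the GET comes back with a body.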

Playing with Python bytecode

Amusing demonstration of executing raw bytecodes in python, including runtime manipulation to switch a functor’s addition operation to multiplication. Interesting in a ‘oh god, you can do that?’ sense but even the presenters said ‘kids, don’t do this at home’. Few (if any?) practical applications, and opcodes change even between minor Python interpreter version bumps making any such hacks a maintenance nightmare.
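For a safe peek at what they were poking at, the standard dis module will list a function’s opcodes without modifying anything. This is only an inspection sketch (the add() and opnames() helpers are made up), not the presenters’ bytecode rewriting demo.

```python
import dis

def add(a, b):
  return a + b

def opnames(func):
  # names of the opcodes the interpreter will execute for this function
  return [instruction.opname for instruction in dis.get_instructions(func)]

print(opnames(add))
```

The exact names printed depend on the interpreter version, which is a nice reminder of why hard-coding opcodes is fragile.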


SQLite: Gotchas and Gimmes

Tips by Dave Sawyer for SQLite, mostly focusing on the advantages over pickles (performance, safety, etc), common pitfalls, and locking strategies…

  • Deferred – Multiple readers/writers.
  • Immediate – Multiple readers/single writer
  • Exclusive – Single reader/writer.

WAL (Write-Ahead Logging) is an alternative where readers aren’t blocked while the writer appends deltas. Upon checkpoints SQLite halts all reads/writes to apply the deltas as a batch.
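Here is a minimal sketch of asking for these modes from Python’s sqlite3 module. The demo_wal() helper, the scores table, and the temp file path are assumptions for illustration. Whether WAL actually gets enabled depends on the filesystem, so the returned mode is reported rather than assumed.

```python
import os
import sqlite3
import tempfile

def demo_wal():
  path = os.path.join(tempfile.mkdtemp(), 'demo.db')

  # isolation_level of None means autocommit, so we issue BEGIN ourselves
  conn = sqlite3.connect(path, isolation_level = None)

  try:
    # the pragma replies with the journal mode that is now in effect
    mode = conn.execute('PRAGMA journal_mode=WAL').fetchone()[0]

    conn.execute('CREATE TABLE scores (name TEXT, points INTEGER)')
    conn.execute('BEGIN IMMEDIATE')  # take the write lock up front
    conn.execute('INSERT INTO scores VALUES (?, ?)', ('atagar', 8))
    conn.execute('COMMIT')

    rows = conn.execute('SELECT name, points FROM scores').fetchall()
  finally:
    conn.close()

  return mode, rows

print(demo_wal())
```

Swapping 'BEGIN IMMEDIATE' for 'BEGIN DEFERRED' or 'BEGIN EXCLUSIVE' selects the other two strategies.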


See Python, See Python Go, Go Python Go

Last talk I attended and the one I wanted to see most. Imagine a world where performance critical code could be written in Go rather than C. No more memory leaks. No compilers. Sounds great, right? Well, keep dreaming.

Both Python and Go can drop to C and Andrey gave a demo of doing so as a bridge between them, and in the process explained why this is a terrible idea. The CPython Extension interface requires a bit of boilerplate but can work with no dependencies while CFFI requires some magic but provides a more portable solution. But in either case crossing both the Go-to-C and C-to-Python boundaries drops you to the least common denominator. This means no Go interfaces or goroutines, and no Python classes or generators.

GC, GIL, and JIT all add their own headaches but worse, you need to implement your own memory management. Sharing between Go and Python risks release of memory the other side still references. Andrey got around this by passing his own dereferenceable pointers but… ick.

In the end Andrey’s demo worked and in fact was just as performant as a direct Go implementation, but made it clear there be dragons. Frustratingly, it’s still better to just call os.system().

 


 

This being my first PyCon I focused on talks rather than the hallway track but none the less had some nice finds…

  • Seattle is home to quite a few technical meetups. Hardware hacking, TA3M, Ruby Brigade, you name it and there’s probably a group for it. SeaPig has been a fun local python group but sadly it’s gone dormant in recent years. Among the booths however I ran into members of PuPPy, another local python group that seems to be quite alive and well!
  • Didn’t realize in advance but AWS networking ran a booth during the job fair. Fun chats with Shawn – he has a great approach for exciting folks to apply.
  • Crossed paths with meejah several times. Together we whipped up a recipe combining our libraries so users can read stem-parsed event objects from txtorcon. Neat stuff!

Simply a great conference, I look forward to hitting PyCon again next year!

Hi all, health issues nailed me for much of April but ick aside here’s what I was up to last month.


Google Summer of Code

GSoC planning is done and we’re delighted to announce seven great projects for the summer!

In particular I’ll be mentoring Sambuddha who’s making several Nyx additions that’ll make our upcoming release even better!


Nyx Curses Testing

User interfaces are tough to test. This goes for websites, GUIs, as well as terminal UIs. Often there’s libraries for doing so but honestly they’re never exactly what I’d call ‘elegant’.

As the rarest of the three, terminal UIs are particularly under-served when it comes to testing, which has left Nyx a black hole when it comes to code coverage. But that all changed this month when I had a bit of a eureka moment.

So… we can now test Nyx’s UI! This month I added test coverage for our popups and header panel, with ongoing work for the rest of the interface. This is slow going and will no doubt gobble my May but I’m really delighted with the direction it’s going.


Few other noteworthy things…

  • Replaced tor-assistants@ with a new list to cut down on spam we all need to deal with.
  • Been working on a new website for Nyx during my commutes. Still a ways off but coming together bit by bit.

Hi all, March has been a busy month! My fingers are in quite a few pies but they’re all still baking so this is gonna be a short one…


Google Summer of Code

Sweet, delicious chaos! Forty-five applications rolled in this year which needless to say is keeping us busy. Best of luck to everyone that applied!

Selection announcement is April 22nd.


Nyx Curses Usage

Still a work in progress but I’m abstracting direct Curses usage out of Nyx. Short term this provides better thread safety, and long term could allow direct Windows support through PDCurses. Work on this will continue all through April.

Overhaul aside, brainstormed project ideas with GSoC applicants including…

Also cleaned up our ticket queue and made Nyx a bugtracker page.


Few other noteworthy things…

  • Stem now provides shorthand aliases for fetching descriptors. This makes scripting as simple as…

    import stem.descriptor.remote
    
    print('Current exits are...\n')
    
    for desc in stem.descriptor.remote.get_consensus():
      if desc.exit_policy.is_exiting_allowed():
        print(" * %s: %s" % (desc.nickname, desc.exit_policy))
    

  • Better descriptor validation of non-ascii content.
  • Worked with Sebastian, who’s writing a guide for using Stem to better test Tor. He found some interesting testing issues on OSX.

Titan II, V2, SR-71 Blackbird, not to mention the Spruce Goose. If you’re ever in Oregon I’d highly suggest the Evergreen Aerospace Museum. It’s truly impressive. Bonneville Dam was fun too, but I’m kinda a sucker for big feats of engineering.

Fun trip aside February was a productive month!


Google Summer of Code

We’re in!

Org administration is taking a big chunk of my time and will continue to do so through March, but things are coming together nicely. Student applications are due March 25th. Many thanks to Isabela, Roger, and Sebastian for helping with the proposal!


Nyx Torrc Panel

Last panel’s done! In doing so we’re dropping torrc validation which has been the #1 point of confusion for arm users on irc. As always the rewrite came with deletion of what we in the biz call ‘a crap ton of code’. Still a lot more work before we release, but getting closer!


Few other noteworthy things…

This was a great start to the year! Highlight was lunch with David White, author of Battle for Wesnoth who also gave me a tour of Valve. But January had lots of neat Tor stuff too…


Nyx Connections Despite DisableDebuggerAttachment

For years Tor’s DisableDebuggerAttachment has been the bane of Nyx. The feature wasn’t intended to affect us, but screws with proc permissions breaking every connection resolver we have.

Resolvers read /proc/<pid>/fd to get connection inodes, then use that to determine what from /proc/net/tcp belongs to our process. Tor’s DisableDebuggerAttachment breaks that by making /proc/<pid>/fd only readable by root. However, even without knowing the inodes we can identify Tor related connections by whether they go to a relay or our Tor ports. This is exactly what Nyx already does to identify a connection’s type.
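Roughly, the inode-free approach could look like the sketch below. This is not Nyx’s or Stem’s actual resolver. The sample line is made-up data in the /proc/net/tcp layout, the function names are hypothetical, and the address decoding assumes a little-endian host such as x86.

```python
import socket
import struct

# hypothetical sample line in the /proc/net/tcp layout (not real data)
SAMPLE_LINE = '   0: 0100007F:2329 0100007F:C350 01 00000000:00000000 00:00000000 00000000  1000        0 12345 1'

def decode_address(hex_addr):
  # '0100007F:2329' => ('127.0.0.1', 9001), assuming a little-endian host
  ip_hex, port_hex = hex_addr.split(':')
  ip = socket.inet_ntoa(struct.pack('<I', int(ip_hex, 16)))
  return ip, int(port_hex, 16)

def parse_line(line):
  # fields: sl, local, remote, state, queues, timer, retrnsmt, uid, timeout, inode...
  fields = line.split()
  return decode_address(fields[1]), decode_address(fields[2]), int(fields[9])

def is_tor_connection(local, remote, tor_ports, relay_endpoints):
  # without inodes, fall back to matching our own ports or known relays
  return local[1] in tor_ports or remote in relay_endpoints

local, remote, inode = parse_line(SAMPLE_LINE)
print(local, remote, inode, is_tor_connection(local, remote, {9001}, set()))
```

In practice the set of relay endpoints would come from consensus information, which is why resolution can’t start until that is available.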

TL;DR. Connection resolution now works all the time. Only drawbacks are…

  • This will blend connections when running multiple Tor processes.
  • Connection resolution can’t work until we have consensus information. This can take a few seconds when starting up.
  • We can’t show client or exit connections. Nyx already scrubbed these so no big loss, but means we now can’t even show that they exist.

Small drawbacks to have the connection panel work by default once again. Users can still set ‘DisableDebuggerAttachment 0’ in their torrc for more reliable connection resolution.


IPv6 Connection Resolvers

Thanks to toralf Stem can now retrieve IPv6 connection information. More important for users, this means Nyx’s connection panel now works for IPv6 relays!


Few other noteworthy things…

Happy holidays everyone! Between family and tenacious colds December is always a slow month, but I’m pleased to announce another milestone toward Nyx’s next release!

Simplified from a 611-line monstrosity to a trivial 346-line panel, Nyx’s configuration editor is the latest part to get some love. Simpler codebase aside, this overhaul greatly improves startup time. In its last release arm required multiple seconds the first time it was run to bootstrap this panel. This is no longer the case.

And really… that’s it. Faster, simpler, and otherwise a drop-in replacement. Next up is the last (and simplest) panel which presents the torrc. Honestly we’re getting kinda close to being ready to release which is exciting. First time arm users have gotten a new toy since 2012!

Hi all! For years Nyx (aka arm) has done a neat trick where we describe what torrc options do and how they’re used. To do this Nyx had its own cobbled together parser for tor’s man page. Clearly a hack, but it worked.

That was all well and good, but we could clearly do better and now we have!

Stem manual information module

Besides filling Nyx’s needs the shiny new stem.manual module provides…

  • Tor test coverage. This adds several integration tests to confirm tor can properly build a valid man page.

  • Provides all Stem users with three methods for getting tor manual information…

    1. from_cache() – Retrieves information bundled with Stem. This is only as up to date as Stem itself, but the fastest and most reliable method.

    2. from_man() – Parses information from the local system by running ‘man tor’. Still fast, but obviously requires tor’s man page to be present.

    3. from_remote() – Retrieves the latest manual information from tor’s git repository. This is slow and shouldn’t be used without a fallback, but provides the most up-to-date manual information.

  • Along with tor’s manual information we provide brief, more user-friendly descriptions of all tor’s configuration options.

  • Parser is much improved over Nyx’s. In particular the stem.manual module has vastly improved performance, test coverage, and updated summary information.