High-level Language Features and Testing

When I first started doing test-driven development as a PHP coder, our development shop used Marcus Baker's excellent SimpleTest framework. I liked it a lot. Since then I've used unit test frameworks in C, Perl, Java, and Python, and SimpleTest is still my overall favorite in any language.

As I became more ~~obsessed with~~ interested in automated testing, however — reading books and blog articles, experimenting with new testing patterns, getting xUnit tattoos — I sometimes felt frustrated. Often I would want to write some kind of test in the framework and language, but one or both was just not powerful enough to express the idea cleanly.

It wasn't until I started coding a lot in Python that the cause hit me. Most xUnit frameworks, particularly if they provide good mock objects, are more than adequate in themselves to support any testing pattern I could come up with. SimpleTest certainly is. The problems I ran into came from the language itself.

Now, I don't mean to complain about PHP. Well, maybe a little. Okay, fine. I can't stand PHP. But really, it's a solid, practical language for web apps. It's also a high level and dynamically typed one, which makes many tests a lot easier to write and, importantly, parameterize – certainly easier than in a language like C, or even Java.

And I must concede that Python isn't perfect either, though that's hard to say when I'm as infatuated with the language as I am right now. Oh, Python! You're so dreamy.

Ahem. Python has two high-level features that PHP does not support well: functions as first-class objects, and closures. Once I started coding heavily in Python, I began to discover ways to use these features to create nicely powerful test cases.

You probably already have an idea of what is meant by "functions as first-class objects". Basically, if a language lets you treat a function as an object that you can manipulate as easily as any boring class instance, it supports this concept well. You can assign the function to variables, pass it as a parameter to a method, or even create a function dynamically and return it from another function. Some languages support all this better than others. Functional languages like Scheme and Haskell do first-class functions about as well as any language today. JavaScript too, believe it or not. C does very slightly, in that you can sling around function pointers; so does Java, if you are willing to write unholy code.
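To make the three operations above concrete, here is a minimal Python sketch. The names (shout, apply_twice, make_suffixer) are invented for illustration and have nothing to do with the SnapLogic code.

```python
def shout(text):
    return text.upper() + "!"

def apply_twice(func, value):
    # Takes a function as a parameter and calls it two times.
    return func(func(value))

def make_suffixer(suffix):
    # Builds a brand-new function at run time and returns it.
    def add_suffix(text):
        return text + suffix
    return add_suffix

yell = shout                     # assign a function to a variable
add_py = make_suffixer(".py")    # create a function dynamically

print(yell("hello"))             # HELLO!
print(apply_twice(shout, "hi"))  # HI!!
print(add_py("tests"))           # tests.py
```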

You may not know about closures. In essence, a closure is a function that is evaluated in a certain context, with a certain set of local variables; and then remembers that context even if it is invoked in a different environment. It's a function that sort of carries an external memory with it, in a way that seems a little spooky at first. One canonical example is a derivative function:

      # f is some numerical function.
      # dx needs to be close to zero, but not zero.
      def derivative(f, dx):
          def df_dx(x):
              return (f(x+dx) - f(x)) / dx
          return df_dx

df_dx is a closure. It's a numerical function of one argument, x, which yields (in a very bad approximation) the derivative value of f(x). And it will continue to work correctly even if used in a wholly different scope of the code where f and dx are not lexically visible.
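Here is a small usage sketch of that derivative function, assuming a throwaway square function and a step of 1e-6 (both picked just for the demonstration). The point is that the returned function still has f and dx on hand after derivative() has finished running.

```python
def derivative(f, dx):
    def df_dx(x):
        return (f(x + dx) - f(x)) / dx
    return df_dx

def square(x):
    return x * x

d_square = derivative(square, 1e-6)

# Neither f nor dx is a name in this scope, yet d_square still uses both.
approx = d_square(3.0)           # the true derivative of x*x at 3 is 6
print(abs(approx - 6.0) < 1e-3)  # True

# The remembered variables can be inspected on the function's code object.
print(sorted(d_square.__code__.co_freevars))  # ['dx', 'f']
```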

(Here's a good article explaining first-class functions and closures more. Favorite quote: "Ruby is a good language for demonstrating features that ought to be in Java.")

Right now I'm the QA engineer at SnapLogic. It's a great place to be if you want to work with a herd of engineers who are all about ten times smarter than you are. Keeps my ego in check. Anyway, a good example of what I'm talking about is in our SnapLogic Python API, which can be conveniently compared to the SnapLogic PHP API. (pop and browse source: Python, PHP) These two client libraries provide a simple interface for accessing SnapLogic resources. They are very similar, both in their method signatures and underlying algorithms. In fact I implemented them in parallel, adding a feature to one, then translating it into the other.

(Switching rapidly between thinking in PHP and thinking in Python is a trip, by the way. Felt dizzy for those few days. All I can say is, good thing we do code reviews.)

My process was, naturally, to translate the tests first whenever I could. And here is where I got a good demonstration of how closures and first-class functions come in handy.

Consider a method that does something like this:

      def get(self, url):
          ...
          response = urllib2.urlopen(url)

(This is actually a real example.) urllib2.urlopen is a function that performs an HTTP GET request on a URL. (Python syntax: The function's name is "urlopen", and it is in the "urllib2" module.) Since this is supposed to be a unit test, not an integration test, I don't want it to actually ping anything across the network. The answer is to make get() accept a mock function:

      def get(self, url, urlopen=urllib2.urlopen):
          ...
          response = urlopen(url)

(There are other ways to do this. It would probably have been better to make urlopen a class property in this case, rather than polluting get()'s method signature. I blame my code reviewer.)
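One way that alternative might look is sketched below. It is an assumption about the shape, not the real client code: Agent is a hypothetical stand-in class, and the sketch uses urllib.request.urlopen, the Python 3 home of what urllib2.urlopen did. The opener is stored on the instance in the constructor, because a plain function stored directly on the class would get bound as a method and receive self as its first argument.

```python
import urllib.request

class Agent:
    # Hypothetical stand-in for the client class.
    def __init__(self, urlopen=urllib.request.urlopen):
        # Kept on the instance so it is called as a plain function.
        self._urlopen = urlopen

    def get(self, url):
        # get() keeps a clean signature; the opener comes from the instance.
        response = self._urlopen(url)
        return response

def mock_urlopen(url):
    # Fake opener: no network traffic, just a recognizable return value.
    return "fake response for " + url

agent = Agent(urlopen=mock_urlopen)
print(agent.get("http://example.com/"))
```

Production code would build Agent() with no arguments and get the real opener; only the tests pass a fake one.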

That urlopen parameter accepts a function object. By default, it uses the "real" one, urllib2.urlopen. So when called in real code, it looks like this:

      foo.get(url)

But I can also pass in a different, fake function, that only pretends to do what the urllib2 version does. Thus, in the test case, I invoke get() like this:

      foo.get(url, mock_urlopen)

where I earlier define that last parameter like this:

      def mock_urlopen(url):
          ...
          return some_mock_response_object

Pretty neat, huh? Problem is, you can't do this in PHP! Nor in many other languages. Well, okay, technically you can, if your life and the lives of everyone you love depend on it. But it's not pretty. In practice, you would take another approach, because the friction coefficient is just too high.

I'm not picking on PHP. It's the same in most languages you've heard of.

Actually, though, this is hardly even a mock function in the purest sense, because there's no business logic, no built-in testing of actual behavior. It does not even validate the url parameter in any way.

At SnapLogic here, we try to use mocks a lot in our unit tests, mainly because I won't shut up about it. Normally we use the superb PyMock library. I actually didn't use any mock library for the SnapLogic Python client lib, for two reasons. One is that I wanted people to be able to run the tests without having to install another library. (As you know, because you've downloaded it already!) Another reason is that the mocking needs were not overly complicated, and so it was easy to implement something elegant on my own.

So what do I want here? I want to be able to pass a special function into the get() call that will be used as the URL opener. I want it to be able to expect to receive a particular value or values for the uri it is invoked on, and then return a particular value. And I'm allergic to boilerplate, so I want to generate these functions programmatically for different test cases.

Let's say the return value is an instance of a class called MockResponse. (urllib2 actually uses another mechanism, instantiating a class named OpenerDirector. But trust me, it's way more complicated than you'd be interested in hearing about. So I just made a mock response class.)
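The post doesn't show MockResponse itself, so the following is only a guess at a minimal version. It assumes the client needs nothing more than read() and getcode(), two of the methods a real urlopen response offers; the constructor arguments are invented for the sketch.

```python
class MockResponse:
    # Hypothetical: mimics the small slice of a urlopen() response
    # that a client is assumed to use.
    def __init__(self, body="", code=200):
        self._body = body
        self._code = code

    def read(self):
        # Returns the canned body instead of reading from a socket.
        return self._body

    def getcode(self):
        # Returns the canned HTTP status code.
        return self._code

resp = MockResponse("<records/>", 200)
print(resp.read())
print(resp.getcode())
```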

A mock function that does not check its input may look like this:

      def mock_urlopen(uri):
          return MockResponse()

Doesn't do much. But one of my requirements is that it raise a commotion if it doesn't get the uri parameter it should. So we can add that:

      def mock_urlopen(uri):
          if "http://google.com/search?q=why+python+programmers+are+sexy" != uri:
              raise Exception("Your code doesn't work, bonehead")
          return MockResponse()

We also want to configure the MockResponse object somehow. Now, since I'm not an idiot, I don't want to code N different mock_urlopen variants for N test cases. So I write a function to generate functions:

      def mk_mockurlopen(expected, req):
          def mockurlopen(uri):
              if expected != uri:
                  raise Exception("Maybe programming isn't your thing.")
              return req
          return mockurlopen

So far, so good. I can now do something like this:

      for item in testcase_data:
          mockurlopen = mk_mockurlopen(item['expected_uri'], item['mock_response'])
          result = agent.get(item['input_uri'], urlopen=mockurlopen)
          self.assertEqual(result, item['expected_result'])
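To see that loop run end to end outside any test framework, here is a self-contained sketch. The Agent class, the MockResponse class, and the contents of testcase_data are all made up for the example; only mk_mockurlopen follows the version shown earlier, with a tamer error message.

```python
class MockResponse:
    # Hypothetical canned response.
    def __init__(self, body):
        self.body = body

    def read(self):
        return self.body

class Agent:
    # Hypothetical client: fetches a URI and returns the response body.
    def get(self, uri, urlopen):
        return urlopen(uri).read()

def mk_mockurlopen(expected, req):
    def mockurlopen(uri):
        if expected != uri:
            raise Exception("unexpected uri: %r" % (uri,))
        return req
    return mockurlopen

testcase_data = [
    {"input_uri": "http://example.com/a", "expected_uri": "http://example.com/a",
     "mock_response": MockResponse("alpha"), "expected_result": "alpha"},
    {"input_uri": "http://example.com/b", "expected_uri": "http://example.com/b",
     "mock_response": MockResponse("beta"), "expected_result": "beta"},
]

agent = Agent()
results = []
for item in testcase_data:
    # Each pass builds a fresh function that remembers its own expected URI.
    mockurlopen = mk_mockurlopen(item["expected_uri"], item["mock_response"])
    results.append(agent.get(item["input_uri"], urlopen=mockurlopen))

print(results == [item["expected_result"] for item in testcase_data])
```

Adding a test case is now a matter of adding a row of data, not writing another mock.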

There is one thing that could be better. These tests run within unittest, the xUnit library included in Python's standard distribution, and the self.assertEqual call is made in a test method of a unittest.TestCase instance. Right now, though, the check on uri's value inside the mock urlopen is done in a very crude way: it just throws an exception. By inspecting the stack trace we can figure out what happened, but it's much more convenient to integrate the check into the unit test framework, so that the harness does the work of tracking down which assertion failed where.

That's easy enough:

      import unittest

      # A test case
      class TestOfGet(unittest.TestCase):
          # Utility to create a mock urlopen function, attached
          # to this case's context
          def mk_mockurlopen(self, expected, req):
              def mockurlopen(uri):
                  # self is the running test case, captured by the closure
                  self.assertEqual(expected, uri)
                  return req
              return mockurlopen

          # Now, use it in one or more actual tests
          def test_of_some_important_thing(self):
              ...
              mockurlopen = self.mk_mockurlopen(expected_uri, mock_request_object)
              ...
This is a closure. The function is created inside the test case, in the execution context of the unit test framework, where self refers to the running test. It's invoked in a completely different context, deep inside the production code. The test case's context is still accessible to it, though. In fact, that's essentially where the assertion is made.
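For anyone who wants to watch this work, here is a complete, runnable sketch of the same pattern. Everything outside mk_mockurlopen is invented for the demonstration: the Agent class with its count() method, the MockResponse class, and the test name. The sn.count parameter is borrowed from the kind of URI the client builds, but this is not the real client code.

```python
import unittest

class MockResponse:
    # Hypothetical canned response.
    def __init__(self, body):
        self.body = body

    def read(self):
        return self.body

class Agent:
    # Hypothetical client: appends a query parameter before fetching.
    def count(self, uri, _urlopen):
        full_uri = uri + "?sn.count=records"
        return _urlopen(full_uri).read()

class TestOfCount(unittest.TestCase):
    def mk_mockurlopen(self, expected, req):
        def mockurlopen(uri):
            # Runs inside Agent.count, but reports through this test case.
            self.assertEqual(expected, uri)
            return req
        return mockurlopen

    def test_count_adds_parameter(self):
        mock = self.mk_mockurlopen(
            "http://foobar.com/alpha/beta?sn.count=records", MockResponse("42"))
        actual = Agent().count("http://foobar.com/alpha/beta", _urlopen=mock)
        self.assertEqual("42", actual)

# Run the case programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOfCount)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

A mismatch inside the closure raises AssertionError, which unittest records as a test failure. A plain Exception would be recorded as an error instead.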

See why this is better than just throwing an exception? The unittest module, like all full xUnit libraries, includes an integrated reporting facility. This abstraction layer adds a lot of value, allowing different front ends, IDE integration, and so on. Creating this closure allows us to apply a precisely targeted, specific test at an important place deep in the code, while allowing it to be integrated in the reporting hooks with no effort on our part.

Ain't that sweet?

The stack trace when it fails looks like this:

Traceback (most recent call last):
  File "test/tests.py", line 512, in testmain
    actual = sr.count(td['rel_uri'], _urlopen=my_mock_urlopen)
  File "/home/amax/src/snaplogic/trunk/Packages/SnapLogic-py/lib/snappy/SnapLogicAgent.py", line 141, in count
    req = _urlopen(full_uri)
  File "test/tests.py", line 24, in mockurlopen
    self.assertEqual(expected, uri)
AssertionError: 'http://foobar.com/alpha/beta?sn.count=records' != 'http://foobar.com/alpha/beta'

Note that this specifies which test case failed (the one on line 512 of test/tests.py), where in the production code things went wrong (line 141 of SnapLogicAgent.py), and what the precise failure was (the missing sn.count GET parameter).

There's something beautiful about it. It's days like this that I love being an engineer.