The other day I was talking to a guy about a possible freelancing gig and he said how wonderful it was that I brought up unit/automated testing without being asked. He said that most (many?) developers don’t have the level of rigor to use automated testing.

My reaction was one of disbelief: “Rigor!? But automated testing is one of the laziest things a developer can do! It speeds stuff up so much!”

As luck would have it, last night I was hit over the head with my own words and nearly died debugging a single function.

I was working on Stripe webhooks and for security reasons decided not to use the event data sent in the request body. Makes sense, right? Take the event id from the request body, then fetch the actual event from Stripe.

It’s the only way to be certain you aren’t responding to bogus events sent by an evil person trying to make you look bad (nothing actually bad can happen, at worst a customer would get extra paid invoice emails).

Due to poor decoupling – I didn’t really want to decouple a 6 line function into two functions – everything was now difficult to test. I can’t create events on Stripe’s servers with unit tests and without actual events existing I can’t test the function works as it’s supposed to.
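
One way around the two-function split, sketched under assumptions: pass the Stripe lookup and the mailer in as arguments, so the handler stays one short function but a test can hand it fakes. The names `handle_webhook`, `fetch_event`, and `send_invoice_email` are made up for illustration, and the event shape (`type`, `data.object`) is assumed from Stripe's documented event format, not taken from the real code.

```python
def handle_webhook(body, fetch_event, send_invoice_email):
    # Trust only the id from the request body; everything else is refetched.
    event = fetch_event(body["id"])
    if event["type"] != "invoice.payment_succeeded":
        return False
    send_invoice_email(event["data"]["object"])
    return True


# Hypothetical stand-ins for the real Stripe lookup and the real mailer.
FAKE_EVENTS = {
    "evt_1": {
        "id": "evt_1",
        "type": "invoice.payment_succeeded",
        "data": {"object": {"customer": "cus_1", "total": 1000}},
    },
}
sent = []

# The forged "type" in the body is ignored because only the id is used.
handled = handle_webhook(
    {"id": "evt_1", "type": "forged.by.attacker"},
    fetch_event=FAKE_EVENTS.__getitem__,
    send_invoice_email=sent.append,
)
print(handled, sent)
```

In production the `fetch_event` argument would be whatever call retrieves the event from Stripe by id. The function itself needs no "am I in a test?" conditional, because the caller decides which lookup to pass in.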

How many bugs can you put in a 6 line function anyway?

A lot of bugs!

When the client tested on staging … it didn’t work. Invoice email wasn’t sent and Stripe complained of a 500 error.

It took me almost two hours to fix all the bugs because my testing cycle looked like this:

  1. change code
  2. commit to develop branch
  3. switch to staging branch
  4. merge develop branch into staging
  5. push to github
  6. change to other terminal window
  7. pull from staging branch
  8. restart python processes
  9. go to Stripe dashboard
  10. pick customer
  11. create invoice item
  12. create actual invoice
  13. choose invoice
  14. pay invoice
  15. go to Stripe logs
  16. find invoice.payment_succeeded webhook
  17. scroll down to response
  18. look through raw html of django’s error page
  19. find symptom
  20. GOTO 1.

That’s right, a whopping 20 step debug cycle all because I’m an idiot and couldn’t find a way to automate this. Or maybe I was too tired to do the unobvious thing … although I still don’t want to split a 6 liner into two functions.

With proper unit testing the debug cycle would look like this:

  1. change code
  2. run tests
  3. symptom thrown in face
  4. GOTO 1.

Much lazier right?
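
A minimal sketch of what step 2 could look like, assuming the handler takes its Stripe lookup and its mailer as arguments. Everything here is hypothetical (`handle_webhook`, the fixture, the ids); the only real APIs used are Python's `unittest` and `unittest.mock`. A missing import or a misnamed variable would blow up on the first run. A wrong idea about the event structure would only be caught if the fixture is copied from a real Stripe event and not typed from memory.

```python
import unittest
from unittest import mock


def handle_webhook(body, fetch_event, send_invoice_email):
    # Hypothetical handler: refetch the event by id, then act on it.
    event = fetch_event(body["id"])
    if event["type"] != "invoice.payment_succeeded":
        return False
    send_invoice_email(event["data"]["object"])
    return True


class HandleWebhookTest(unittest.TestCase):
    def setUp(self):
        # Fixture: ideally pasted from a real event, not from memory.
        self.invoice = {"customer": "cus_1", "total": 1000}
        self.fetch_event = mock.Mock(return_value={
            "id": "evt_1",
            "type": "invoice.payment_succeeded",
            "data": {"object": self.invoice},
        })
        self.send_invoice_email = mock.Mock()

    def test_sends_email_for_paid_invoice(self):
        handled = handle_webhook(
            {"id": "evt_1"}, self.fetch_event, self.send_invoice_email)
        self.assertTrue(handled)
        self.fetch_event.assert_called_once_with("evt_1")
        self.send_invoice_email.assert_called_once_with(self.invoice)

    def test_ignores_other_event_types(self):
        self.fetch_event.return_value = {
            "id": "evt_2", "type": "customer.created", "data": {"object": {}}}
        handled = handle_webhook(
            {"id": "evt_2"}, self.fetch_event, self.send_invoice_email)
        self.assertFalse(handled)
        self.send_invoice_email.assert_not_called()


def run_tests():
    # Run the suite in-process and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HandleWebhookTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

No network, no staging server, no Stripe dashboard: the whole loop is "edit, run, read the traceback".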

For the record, those six lines of code contained 4 bugs ->

  • forgot to import a module
  • different event data structure than I understood from the docs
  • twice^
  • misnamed variables from one line to another

Yes, all of those could easily have been caught if my test coverage was actually any good! And then not only would I not look like an idiot in front of the client, I’d probably spend no more than ten minutes fixing this.

Let this be a lesson both to you and to Future Swizec!

  • Avi

    Unit testing is for lazy people and they’re about to get even lazier. Typemock’s V7 runs automatically as you code, pinpointing bugs - 
    http://www.typemock.com/isolator-v7. Because 2 minutes was just too long. Because better to test than spend hours hunting for bugs.

  • Musaab

    I’m sure this would be more easier with .NET and MVVM pattern 

  • http://twitter.com/estebanfeldman Eka

    So, were you able to automate that? And if so, how?

  • Daniel Lyons

    All four of your bugs would have been caught by static typechecking.

  • Guest

    The last one, _maybe_, but the first three: obviously not.

    always cute to see people who don’t really know what they’re talking about make blind global suggestions to people without understanding their circumstances at all. It shows exactly what kind of problem solver you are (or what kind of problem solver you aren’t…)

  • Antti Tuppurainen

    What are you smoking? Let’s go through them all.

    1) forgot to import a module

    This would clearly result in a compiler error in a statically typed language because the compiler would not be able to even locate the required types and functions.

    2) different event data structure than I understood from the docs

    Different data structure means that the compiler would have caught any inappropriate attempt to access the data either by missing members (data or functions) or by type mismatch in assignment to a local variable.

    3) twice^

    Ditto.

    4) misnamed variables from one line to another

    Most likely caught, unless you’re in the practice of using a lot of variables of the same name and type in nested scopes, causing them to shadow each other.

  • Ben

    Agreed about #1 and #4, but for #2 (and #3), we’re probably talking about something parsed from JSON… do you define huge amounts of rigid, static structure to store every possible JSON response you receive from web services? If so, how do you handle it when your upstream (3rd party) data source adds a field? Isn’t one of the points of JSON to be a flexible interchange format?

    That said, I find that programmers who are *used* to unit tests are less careful about these sorts of things up front. Unit testing seems to encourage buggier first attempts, which is perhaps okay except that not all bugs are caught by unit tests. Worth the tradeoff? Usually, if only for long term maintenance reasons, but still frustrating…

  • Anonymous

    You might be interested in a little utility we use:
    https://github.com/tomakehurst/wiremock

    It’s written in Java, but don’t hold that against it. It allows you to define a stubbed out response body to an arbitrary HTTP request. It’s most useful if you’re either doing manual integration testing, or you’re using JUnit (or similar) and can use the programmable API. Depending on what language/unit testing framework you’re using it might be worthwhile writing a similar utility if only to decouple yourself from a service that exists outside the scope of the unit test.

  • http://profiles.google.com/arrigoni.andrew Andrew Arrigoni

    You could use a simulated user with Selenium or its ilk to automate the first 14 of those steps…

  • http://twitter.com/entendu Domenic Santangelo

    Yikes, do you not have a local development environment set up?

  • Haha

    The problem you are having is your a poor developer

  • Random

    And you could have won the lottery if only you had picked the numbers they just announced!

    I work with several developers who release bugs to prod frequently and always use this as an excuse and justification of why they could write less buggy code.

    The truth is, unit testing is only one of many ways to reduce bugs, you are never going to eliminate them, hell you can easily write bugs into your tests.

  • http://swizec.com Swizec

    No, I do not have the whole of Stripe locally mocked up.

  • http://swizec.com Swizec

    #2 and #3 are exactly this. It’s an API returning JSON data … the only way I could have mocked this up (without splitting the function into two functions) was if I convinced the code to send a request to a mocked up server instead of the real server when testing.
    But it looks like begging to introduce bugs when you start putting those kinds of conditionals into your code.

  • http://swizec.com Swizec

    Looks pretty cool, but how would I convince the code to poll the mock service in testing and the real service in production?

  • Asasd

    No he’s not. But you’re a poor commenter.

  • Anonymous

    Looking at what you’re actually trying to do I think I may have misunderstood. I was thinking you were calling a web service (the Stripe API), but instead Stripe is calling you. I still don’t fully understand why the full commit to github and then an update is necessary, unless you left out that your “second terminal” is actually an ssh session on another server.

    You could still use the principles of wiremock to do some testing, you’d just have to turn it on its head: instead of a server listening for queries, it is a service that makes HTTP requests when prompted from unit test hooks. That is, you mock out the calls that you’re expecting Stripe to make, and in your unit tests you call your services using those mocks and verify the result, rather than having the live Stripe service do the calls. You’d still have to test against the actual Stripe service at some point to make sure your mock matches the real service, but hopefully you’ve worked out most of the bugs by that point, and if it doesn’t match for some reason you can update your mock and test locally until it works again.

  • http://swizec.com Swizec

    The other terminal is in fact an ssh session :)

    The full situation is really something like this ->

    1. Stripe makes a request to my service
    2. My service does a request to Stripe
    3. My service uses that data to do some stuff (send emails)

    Should’ve made that clearer in the post itself, but testing those sorts of things is always messy in my experience.

  • Bulkan-Savun Evcimen

    You’re not your. You’re bad at grammar. 

  • http://www.facebook.com/Ctide Chris Burkhart

    Welcome to the answer to all of your problems: https://github.com/chrisk/fakeweb

    EDIT: oh, django.

  • Cesar Del Solar

    Just use the Stripe webhook tester on a development server…

  • http://swizec.com Swizec

    Yup, and who is going to add all the needed metadata and such to that request pray tell?

  • dainiusfigoras

    Unit tests do not use outside data, that’s why they are unit tests. So for the data structure you would need to create a mock and use it, and you would not catch that bug with a unit test. For that you need an integration test.

  • Daniel Lyons

    That’s fair. And the common solutions are not awesome. I have one piece of code at work with a Jersey client that handles JSON. If the upstream were to add a field, I honestly am not sure what would happen. Obviously I wouldn’t get a compile-time error. I naively expect JAXB to explode over it at runtime, because that’s what it does when XML comes across invalid. Not an improvement over Python, true. 

    There probably is a lower-level API with better-defined semantics, and probably returns nulls I’m expected to check and handle. This is also not an improvement. 

    If I were using Haskell, there is probably a library that handles the serialization to and from JSON in a way that would validate the claim. But I’m not using it at work or at home.

    I think your perspective is sage. As Dijkstra said, testing proves the presence of bugs, never their absence. In practice, I am interested in improving reliability and correctness, and this is not a single-front war.

  • Anonymous

    So yeah, in that instance I’d use wiremock (or equivalent) to stub out the response from Stripe, and then some other tool to simulate the initial request from Stripe that kicks the whole thing off. That way you’re completely decoupled from Stripe. Of course, that is a lot of tooling and work, and if this is the only external touchpoint for your code it may not be worth the effort. Anyway, it’s your unit tests, and your decision how you want to handle it. All I can say is that in our project being able to decouple our tests (and some of our test environments) from outside services has been a huge productivity boost.

  • Cesar Del Solar

    The tester can generate any of the possible web hooks and you can see the request with postbin.org.
