End-to-end Testing with Docker

Posted by Aaron Schuenemann on


At Grubhub, we use many UI frameworks: Angular 1.x/2, React, and ClojureScript, to name a few. The company’s two largest front-end apps, including grubhub.com, are Angular 1.x. For the purposes of this discussion I’ll talk about testing large Angular apps and why unit tests alone fall short.

The problem

Karma and Jasmine, the go-to unit testing tools for AngularJS apps, are absolutely fantastic at testing controller and service logic. In fact, it’s relatively trivial to set code coverage targets in the 90+% range and achieve those targets without a coup from your developers. But that’s only half the battle! Consider the following template:

<div class="gh-order__totals text-right"
    ng-if="::(orderCtrl.isOrderNew || orderCtrl.isOrderFuture)">
 <div class="u-weight--semibold">{{ ::orderCtrl.details.itmeCount }}</div>
 <span class="gh-order__item-total">{{ ::orderCtrl.details.orderTotal }}</span>
</div>

Unit tests for orderCtrl check out fine, but can you spot the bug? In my rush to display the itemCount, I accidentally misspelled it as itmeCount. Unfortunately, templates largely go untested, and mistakes like this are extremely common.

The answer

Selenium to the rescue! Well, in this case by Selenium I mean Protractor, which is Angular’s WebDriver wrapper. Protractor comes with a lot of handy things, like selecting elements by binding, or waiting for Angular to finish $timeouts, $http calls, or other promises before your “expect(x).toBe(y)” assertions fire. Essentially, Protractor does the things a user will do: clicking around and checking things on the page. It doesn’t really know, or care, about the underlying implementation. Its reach is limited to what you can see and do on the screen. Which is handy, because in a web app that’s just about everything. From here on out I’ll refer to these tests as E2E, since they test the app “End to End”.
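To make that concrete, here is a minimal sketch of a Protractor spec that would have caught the itmeCount typo. The file name, the /orders/12345 route, and the assumption that the item count renders as plain digits are all hypothetical; the CSS classes come from the template above. An undefined binding renders as an empty string, so requiring at least one digit is enough to fail the test.

```javascript
// order-totals.spec.js (hypothetical file name)
// An undefined binding such as {{ ::orderCtrl.details.itmeCount }} renders as
// an empty string, so requiring at least one digit is enough to catch it.
// Assumption: the item count renders as plain digits.
const ITEM_COUNT_PATTERN = /^\d+$/;

// Protractor provides describe/it/expect (Jasmine) plus browser/element/by as
// globals. The guard only keeps this file harmless if loaded outside Protractor.
if (typeof describe === 'function') {
  describe('order totals', function () {
    beforeEach(function () {
      // Hypothetical route, resolved against baseUrl in protractor.conf.js.
      browser.get('/orders/12345');
    });

    it('renders a numeric item count', function () {
      // Class names are taken from the order totals template.
      const itemCount = element(by.css('.gh-order__totals .u-weight--semibold'));
      // Protractor's expect unwraps the promise returned by getText().
      expect(itemCount.getText()).toMatch(ITEM_COUNT_PATTERN);
    });
  });
}
```

The spec never touches orderCtrl directly. It only checks what ends up on the page, which is exactly the part the unit tests miss.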

Don’t go overboard

When testing, it’s important to keep in mind that E2E tests occupy only the very top of the testing pyramid.

Therefore your E2E tests won’t be as all-encompassing as your unit tests, and they shouldn’t be: E2E tests generally run slower and increase build times (you are running them automatically on every commit, right?!?). There’s a point where tests take so long to run that they become less useful. Many organizations strive for continuous delivery, and if that’s your jam you’ll need to find a balance here.

You mentioned something about Docker?

I’m getting to it! The problem with our E2E tests was that they generally worked great when running on my MacBook Pro, but our developers would often forget to run them locally, even when they had added new tests. CI tools (we use Jenkins, so I’ll speak with that domain knowledge) don’t have a UI when they run, so lots of folks prefer to offload their E2E tests to third-party providers. Here’s why I am not a fan of outsourcing:

  • Security: You’ve gotta open up a proxy from their servers back to yours to test in pre-production.
  • Speed and reliability: The tests can be extremely slow, and they seem to fail at random.
  • It comes at a price…an actual price, in dollars. You’ve gotta buck up and pay for these services.

So how do you run your E2E tests from a headless CI? The answer, thankfully, is virtual framebuffers. Simply put, you can run Xvfb first and then run Chrome, and it’ll think it’s in an actual window. Xvfb shares a lot of code with X11, minus the screen, and Chrome/Firefox don’t know the difference. Moreover, you can control the screen resolution directly on the buffer, so you don’t need to put all the different options into your protractor.conf.js file.

You still haven’t mentioned Docker….

Right…well, as you can see, there’s a lot of setup required to actually run your Protractor tests anywhere with repeatability. You’ve gotta have node/npm installed, the right version of Protractor and a matching webdriver-manager, the right versions of Chrome and Firefox, and all the plugins you are using with Protractor, like screen capture tools, etc.

All that has to be set up correctly on each Jenkins/CI slave you intend to run on. However, at Grubhub, we’ve got a bunch of slave pools, and I don’t want to mimic configurations on each one. This is a textbook case for using Docker. Containerization is awesome, and in my opinion, no one does it better than Docker. You can encapsulate an entire operating system and its accompanying configuration with just a few files. One of those files will always be a file called Dockerfile. In this case mine looks something like:

FROM node:6
RUN apt-get update --fix-missing && \
   apt-get install -y xvfb wget openjdk-7-jre libgconf-2-4 libexif12 chromium && \
   apt-get clean

RUN mkdir /protractor && \
   npm install -g protractor

RUN webdriver-manager update

ADD protractor.sh /protractor.sh

WORKDIR /protractor
ENTRYPOINT ["/protractor.sh"]

protractor.sh looks like:

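#!/bin/bash
# Start the virtual framebuffer and the Selenium server in the background.
Xvfb :1 -screen 0 $SCREEN &
webdriver-manager start &
# Pass the container's arguments (e.g. protractor.conf.js) through to Protractor.
protractor "$@"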

This sets up the box to be able to run Protractor inside of it, sharing a /protractor directory with the host machine. So in your Jenkins job setup you simply pull down your code and run:

docker run --rm --env-file ./docker-env --env SUITE=$SUITE -v $(pwd):/protractor/ \
    <>:5000/your/path/to/image protractor.conf.js

In the example above, I’ve got a docker-env file which specifies a few things, including the environment to run the tests against, a display property (in this example :1, matching the Xvfb display), and the target screen size of the virtual framebuffer. This gives CI a lot of flexibility to run any suite of tests against any screen size in any environment you choose. Handy!
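On the Protractor side, the config file can pick those values up from the environment. The following is only a sketch of what such a protractor.conf.js might look like, not our actual file: BASE_URL, the suite names, the spec paths, and the fallback URL are all hypothetical, and only SUITE appears in the docker run command above. Note that no window size is set here, since the framebuffer takes care of the resolution.

```javascript
// protractor.conf.js (a sketch; names and values below are assumptions)
function buildConfig(env) {
  return {
    // Default address of the server launched by `webdriver-manager start`.
    seleniumAddress: 'http://localhost:4444/wd/hub',
    framework: 'jasmine',
    // BASE_URL is a hypothetical variable that would live in the docker-env file.
    baseUrl: env.BASE_URL || 'http://localhost:8080',
    // Named suites keep each E2E run small; these paths are hypothetical.
    suites: {
      smoke: 'e2e/smoke/*.spec.js',
      checkout: 'e2e/checkout/*.spec.js'
    },
    // SUITE arrives via `--env SUITE=$SUITE`; fall back to a small default.
    suite: env.SUITE || 'smoke',
    capabilities: {
      browserName: 'chrome',
      chromeOptions: {
        // Assumption: the container runs as root, where Chrome generally
        // will not start without this flag.
        args: ['--no-sandbox']
      }
    }
  };
}

// Protractor reads the exported `config` object.
exports.config = buildConfig(process.env);
```

Building the config in a function keeps the environment handling in one place, so the same image can run a different suite against a different environment just by changing the variables passed to docker run.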

So get testing!

It’s worth noting that I love open source, and I’m not even close to the first person to pull this off. In constructing my final docker image I borrowed very heavily from the following two projects:



Final Words

Some of you may be thinking: why do we need end-to-end tests at all? After all, Google published a scathing Just Say No To More End To End Tests article in 2015. I largely agree with this article, and believe we should keep our E2E test suites as small as possible. The 70/20/10 split it proposes (70% unit tests, 20% integration tests, 10% end-to-end tests) is a great starting point. However, the article’s objections don’t fully apply to Grubhub front-end devs, because we don’t utilize a QA department in the traditional sense. We don’t toss our code over an arbitrary “QA wall” for manual testing. Our developers are our testers, and the more (smart) automation the merrier.