Living in a state of accord.

Testing@LMAX – Making Test Output Useful

Just like production code, you should assume things are going to go wrong in your tests and when it does you want good logging to help track down what happened and why. So just like production code, you should use a logging framework within your DSL, use meaningful log levels and think about what info you’d need in the logs if something went wrong (and what you don’t). There’s also a few things we’ve found very useful specifically in our test logging.

Log Alias to Real Name Mappings

Since the DSL uses aliases, if we want to poke around in the exchange manually to understand why a test failed, we need to know the real names to use. So whenever we create a real name for an alias we log some information about it. For example when creating an instrument:

21:28:53,916 WARN  …rdersWithSuppliedInstructionIds [AdminAPI] Created instrument 'instrument' (actual name: 'instrument-54369ih64k63', externalId: 180423, internalId: 180422) on tradeReportingGroup 1003

All the key information we need is in that log statement – the alias, real name plus unique identifiers (externalId and internalId).

Name Your Threads

We use a custom JUnit test runner (via the @RunWith annotation) so we can run tests within a test suite in parallel. With tests running in parallel all their output gets mixed up and becomes hard to read.  Recently we started setting the test thread names to the name of the test case. 

private void executeTest(
  final FrameworkMethod method,
  final Description description) {

We actually trim the method name to 30 characters (cutting off the start rather than the end which tends to work better with the way we name tests) so we get output like:

21:28:47,356 INFO  …tityAndPriceAndNoStopLossOffset [TestContext] Created PartyCode XMCS (alias: marketMaker)

21:28:47,356 INFO  …erWithZeroSuppliedInstructionId [TestContext] Created PartyCode U6HK (alias: marketMaker)

21:28:47,356 INFO  …ctionIdIfFirstOrderHasCompleted [TestContext] Created PartyCode 3DS6 (alias: marketMaker)

21:28:47,356 INFO  …StopLossOffsetAndAStopLossPrice [TestContext] Created PartyCode XFY8 (alias: marketMaker)

There are a few cases where we spawn additional threads within a test (typically to pull data from long poll or other push event channels). In those cases we generally pass the thread name down with an additional prefix (e.g. LongPoll-…StopLossOffsetAndAStopLossPrice) so we can still associate that output with the right test.

Time Traveller Names

The way we allow time travelling tests to run in parallel is reasonably complex – only one thread ever actually executes the time travel and there’s a bunch of cross-thread coordination – so our thread names aren’t as useful in that little area of code. As such we give each test we’re currently running a time traveller name so we get log output from the Tardis like:

The Doctor asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Clara asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Captain Jack asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Rory asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Missy asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Amy asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
River asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Rose asking to travel to mondayOpen (Mon Mar 14 07:00:00 UTC 2016)
Time travelling to: mondayOpen (Mon Mar 14 07:00:00 UTC 2016)

We may have gotten a little carried away with the Doctor Who theme but having a name for each time traveller makes it far easier to understand what each test is waiting for.

Alert Dialogs Do Not Appear When Using WebDriverBackedSeleniu

With Selenium 1, JavaScript alert and confirmation dialogs were intercepted by the Selenium JavaScript library so they never appeared on-screen and were accessed using selenium.isAlertPresent(), selenium.isConfirmationPresent(), selenium.chooseOkOnNextConfirmation() and similar methods.

With Selenium 2 aka WebDriver, the dialogs do appear on screen and you access them with webDriver.switchTo().alert() which returns an Alert instance for further interactions.

However, when you use WebDriverBackedSelenium to mix Selenium 1 and WebDriver APIs – for example, during migrations from one to the other – the alerts don’t appear on screen and webDriver.switchTo().alert() always throws a NoAlertPresentException. This makes migration problematic because selenium.chooseOkOnNextConfirmation() doesn’t have any effect if the next confirmation is triggered by an action performed directly through Web Driver.

The solution is to avoid WebDriverBackedSelenium entirely and instead create a WebDriverCommandProcessor, configure it to not emulate dialogs and pass it to DefaultSelenium.

public Selenium createSelenium(
  final WebDriver webDriver,
  final String baseUrl)
  WebDriverCommandProcessor commandProcessor =
    new WebDriverCommandProcessor(baseUrl, () -> webDriver);
  return new DefaultSelenium(commandProcessor);

The key method call here is setEnableAlertOverride(false) which disables the alert emulation.  Note that the Selenium 1 methods for interacting with dialogs will no longer work – you have to switch all alert handling over to WebDriver.

Also note the subtle but important use of a closure to provide a Supplier<WebDriver> instead of just passing the webDriver. The overloaded constructor instance that takes a WebDriver immediately calls start() on the processor and setEnableAlertOverrides must be called before start().

It’s unfortunate, but understandable, that Selenium 1 and WebDriver style alert interactions can’t be mixed as that would make migration a lot easier. At least with this approach you can convert your alert interactions over in one batch then convert all the other interactions incrementally.

Testing@LMAX – Introducing ElementSpecification

Today LMAX Exchange has released ElementSpecification, a very small library we built to make working with selectors in selenium/WebDriver tests easier. It has three main aims:

  • Make it easier to understand selectors by using a very English-like syntax
  • Avoid common pitfalls when writing selectors that lead to either brittle or intermittent tests
  • Strongly discourage writing overly complicated selectors.

Essentially, we use ElementSpecification anywhere that we would have written CSS or XPath selectors by hand. ElementSpecification will automatically select the most appropriate format to use – choosing between a simple ID selector, CSS or XPath.

Making selectors easier to understand doesn’t mean making locators shorter – CSS is already a very terse language. We actually want to use more characters to express our intent so that future developers can read the specification without having to decode CSS. For example, the CSS:

#data-table tr[data-id='78'] .name


.thatContainsA("tr").withAttributeValue("data-id", "78")

Much longer, but if you were to read the CSS selector to yourself, it would come out a lot like the ElementSpecification syntax. That allows you to stay focussed on what the test is doing instead of pausing to decode the CSS. It’s also reduces the likelihood of misreading a vital character and misunderstanding the selector.

With ElementSpecification essentially acting as an adapter layer between the test author and the actual CSS, it’s also able to avoid some common intermittency pitfalls. In fact, the reason ElementSpecification was first built was because really smart people kept attempting to locate an element with a classname using:

//*[contains(@class, 'valid')]

which looks ok, but incorrectly also matches an element with the class ‘invalid’. Requiring the class attribute to exactly match ‘valid’ is too brittle because it will fail if an additional class is added to the element. Instead, ElementSpecification would generate:

contains(concat(' ', @class, ' '), ' valid ')

which is decidedly unpleasant to have to write by hand.

The biggest benefit we’ve seen from ElementSpecification though is that fact that it has a deliberately limited set of abilities. You can only descend down the DOM tree, never back up and never across to siblings. That makes selectors far easier to understand and avoids a lot of unintended coupling between the tests and incidental properties of the DOM. Sometimes it means augmenting the DOM to make it more semantic – things like adding a “data-id” attribute to rows as in the example above. It’s surprisingly rare how often we need to do that and surprising how useful those extra semantics wind up being for a whole variety of reasons anyway.

Use More Magic Literals

In programming courses one of the first thing you’re taught is to avoid “magic literals” – numbers or strings that are hardcoded in the middle of an algorithm. The recommended solution is to extract them into a constant. Sometimes this is great advice, for example:

if (amount > 1000) {

would be much more readable if we extracted a ADDITIONAL_AUTHORIZATION_THRESHOLD variable – primarily so the magic 1000 gets a name.

That’s not a hard and fast rule though.  For example:

value.replace(PATTERN_TO_REPLACE, token)

is dramatically less readable and maintainable than:

value.replace("%VALUE%", token)

Extracting a constant in this case just reduced the locality of the code, requiring someone reading the code to unnecessarily jump around the code to understand it.


My rule of thumb is that you should extract a constant only when:

  • it’s reasonably easy to think of a good name from the constant – one that adds meaning to the code OR
  • the value is required in multiple places and is likely to change

Arbitrary tokens like %VALUE% above are generally unlikely to change – it’s an implementation choice – so I’d lean towards preserving the locality and not extracting a constant even when they’re used in multiple places.  The 1000 threshold for additional authorisation on the other hand is clearly a business rule and therefore likely to change so I’d go to great lengths to avoid duplicating it (and would consider making it a configuration option).

Obviously these are just rules of thumb so there will be plenty of cases where, because of the specific context, they should be broken.

Travis CI

Probably the best thing I’ve discovered with my recent playing is Travis CI. I’ve known about it for quite some time, even played with it for simple projects but never with anything with any real complexity. Given this project uses rails which wants a database for pretty much every test and that database has to be postgres because I’m using it’s jsonb support, plus capybara and phantomjs for good measure, this certainly isn’t a simple project to test.

Despite that, getting things running on Travis CI was quick and easy. The container based builds give you a well controlled environment out of the box and they already have a heap of software you’re likely to need ready to go. Other things can be easily provided using a simple .travis.yml file.

While it’s common to have a build server these days, it’s quite uncommon for them to be so well controlled and able to be rebuilt quickly and easily – most wind up being special little snowflakes in some way or other. So building on Travis CI is immediately a massive leap forward in testing practices.

The two things we do at LMAX that I can’t see how to do with Travis CI (yet) is creating a database of historic test results that can be easily queried (at the individual test level, not the build result level), and running a build in parallel with the tests automatically load balanced across a number of hosts. I wouldn’t expect either to be provided as part of the free service anyway but they seem like the next major areas of value it could offer (with significant effort required I imagine).

Still, it’s been a real luxury to have a full CI setup for hacking on a little side project. So thanks for the generosity Travis CI.