Anthony Bailey's blog - Regression Therapy - Contentful Testing
Regression Therapy - Contentful Testing

This article is a fleshing out of a ten-slides-in-ten-minutes talk that I originally gave in July 2007 at work.

I have since talked on the same subject at the Scotland on Rails 2008 conference. Slides and so forth are available.

1. Introduction

Regression testing is usually seen as the poorer cousin of "proper" domain-abstracted assertion-based testing. Often rightly so!

However, with the right support in place, I have found that this form of testing can work very well in certain contexts.

This article addresses one such context: testing the view content generated by a web app. I discuss the background, then present a concrete example in the form of a plug-in for Rails.


2. Three reasons to test

For the purpose of this discussion, I identify three overlapping but distinguishable uses of tests in software development.

To preserve existing good behaviour

Tests can be used to maintain known good behaviour, preventing accidental bad changes (also known as "regressions"). The tests provide a safety net during refactorings, and enforce a Hippocratic "first, do no harm" oath during other work.

....................................!..!...........................!...........

The test stays green whilst all is well. A red bar tells us that we've messed up and broken something.

To drive intended changes

Tests can be used to change (often, to add) behaviour, providing context to an intended good change. The archetypal example is classic TDD, creating tests first in order to define the problem, and to guide the design and implementation of a solution.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

New tests flash red and are put back to green by introducing production code.

To understand emergent change

A less often discussed use of testing is to explore and validate emergent behaviour.

In a complex system it can be hard to prejudge or predict every knock-on effect of a deliberate change to one part of that system. Tests can provide a context for understanding and controlling such changes and consequences.

.....!!??????...................!!??????............!!??????..............

In this way of working, changes cause flashes of red that are observed and then often approved, updating expectations to recolor things back to green again.


3. Three ways to test

Next I'll categorize three methods of testing, paying particular attention to their use in the context of testing view content.

Live testing

In a scandalously unrepeatable manner that sacrifices reliable coverage, we can test by hand and verify by eye.

Drive the web app through the browser and see if it looks right. We all do this. It shouldn't be sniffed at; it is genuinely useful because it is so cheap. However, once the system gets sufficiently complex, it doesn't hold water. I won't consider manual testing further in this article.

To automate testing, we have to be able to provoke the generation of view content that we can then test. Assuming a good web app framework that allows us to do this (no small thing, but not the topic of interest today), there are then two means of testing the content.

Assertion testing

We can explicitly check individual desired properties of the resulting content.

Regression testing

Or, we can compare the content in its entirety against a previously validated known good copy.

I'll now discuss in greater detail the relative merits of testing view content through assertion and regression.
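
To make the contrast concrete before going further, here is a minimal plain-Ruby sketch that runs outside any framework. The HTML fragment and variable names are invented for illustration, and a regular expression stands in for a real selector engine; it is a toy, not how either kind of test would be written in a Rails project.

```ruby
# Illustrative only: a hand-written fragment standing in for generated view content.
generated = '<div id="cart"><table><tr class="total"><td>Total</td><td>$39.95</td></tr></table></div>'
approved  = '<div id="cart"><table><tr class="total"><td>Total</td><td>$39.95</td></tr></table></div>'

# Assertion testing: check one desired property of the content.
# (A regular expression is an assumed stand-in for a CSS selector engine.)
total_pattern = %r{<tr class="total">.*<td>\$39\.95</td></tr>}
total_cell_ok = !!(generated =~ total_pattern)

# Regression testing: compare the content in its entirety against the approved copy.
whole_content_ok = (generated == approved)

# A change that the assertion never thought to check: the row gets relabelled.
relabelled = generated.sub('Total', 'Grand Total')
assertion_still_passes  = !!(relabelled =~ total_pattern)
regression_still_passes = (relabelled == approved)

puts "assertion: #{assertion_still_passes}, regression: #{regression_still_passes}"
# prints "assertion: true, regression: false"
```

The assertion keeps passing because it only looks at the amount; the whole-content comparison notices the new label, for better or worse.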


4. assert_equals(expected, test_method)

Assertion tests are prevalent in contemporary developer testing, with good reason. They tend to be very effective, given sufficient investment: we need to express the properties that we care about succinctly, and at an appropriate level of abstraction within the domain.

The Rails web framework has evolved a best-of-breed solution for assertion testing view content in the form of assert_select. It uses CSS selectors to locate content, and neatly expresses the conditions it should satisfy. Here's a simple example:

class CartExampleTest
  def test_total
    get "/store/add_to_cart", :id => rails_book.id
    assert_select "div#cart tr.total td:last-of-type", "$39.95"
  end
end

This is pretty nice. But it doesn't escape some downsides of assertion testing which happen to be particularly troublesome for view content.

Coverage is selective

We can only capture the behaviors that we think to test for. The importance of many properties of view content is not obvious until we've accidentally changed them.

Exploration is expensive

Assertion tests are generally great for catching regressions and for driving changes. But they are poor for my third type of test use, that of exploring change. The developer has to run around rebalancing the books, fixing up test expectations to agree with the unexpected consequences of a change.

Explicitness is verbose

Many aspects of the view are worth caring about. It takes too much code to express them all explicitly; translation, even into a well-designed and succinct language, still takes time. The most concise summary of all the properties of a piece of content is the content itself…


5. Regression (past life)

…which leads us to our next alternative.

Regression tests work by checking current content against expected content. We test by comparing the entirety of what we generate against some previously approved copy. The expected content might be quoted verbatim in the test; more often it is stored in a file.

class CartExampleTest
  def test_total
    get "/store/add_to_cart", :id => rails_book.id
    assert_equal File.read('expected.html'), @response.body
  end
end

This is clearly very cheap to express, and capturing existent good behaviour needs no translation: just a copy and paste of the current results.

But like assertion testing, regression testing view content comes with some downsides.

Tests don't come first

Again this form of testing is good for two of the three uses of testing that I identified. It is ideal for exploring changes, and unsurprisingly it catches regressions. But, it is very awkward to drive new behavior.

However, this problem is not serious in the context of view content. Although it would not be plausible to test-drive the generation of content through regression testing, it is not clear that one would ever want to. View content tends to come into existence in a concrete form, designed or composed in toto rather than being built up one abstract essence at a time. Even for those who usually love to test first, test-last of view content does not seem so detestable.

The main problem with regression testing is that it is very expensive if we act naively in response to diffs. There are two common pain points.

Tests are brittle and fragile

Comparing against a complete text copy means that we see every change, even ones we don't actually care about. Every difference needs explicit attention.

Tests are usually noisy

Because there is no abstraction, one change may break many tests. Similar differences need attention in multiple places.

These workflow costs multiply as the system under regression test grows. Investing in the abstraction of assertion tests is usually preferable. Even in the domain of view content, regression tests will prove too costly if we do not work to make them less so.


6. Contentful testing

So: let's attempt to address the problems, and turn regression testing into contentful testing.

Lessen the noise

Information can be defined as any difference that makes a difference. The diffs that don't are the noise in regression tests that we want to reduce.

Ignore insignificant change

Some textual changes in HTML source (for example, certain whitespace in content; and formatting, quoting, capitalization and attribute order within the mark-up itself) won't affect how the content is displayed.

These can be ignored by allowing a little abstraction back in. We can still use standard text-based diff tools, but we apply them to canonical text representations of DOM-normalized content, so that two documents with the same DOM always yield the same text.
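
As a rough illustration of the idea, the sketch below canonicalizes markup with regular expressions: tag and attribute names are lowercased, attribute quoting is unified, attributes are sorted, and whitespace is collapsed. The method names normalize_html and normalize_tag are invented here. A real tool would parse the HTML into a DOM and serialize that instead; this toy assumes simple well-formed markup and drops bare attributes that have no value.

```ruby
# Toy normalizer (an assumption-laden sketch, not a real HTML parser).
def normalize_tag(tag)
  # Closing tags only need their case and spacing fixed.
  return tag.downcase.gsub(/\s+/, '') if tag.start_with?('</')

  name = tag[/\A<\s*([^\s>\/]+)/, 1].downcase
  # Collect name="value" or name='value' pairs and re-quote them uniformly.
  attributes = tag.scan(/([\w:-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/).map do |key, double_quoted, single_quoted|
    %(#{key.downcase}="#{double_quoted || single_quoted}")
  end
  # Sorting makes attribute order irrelevant to the comparison.
  '<' + ([name] + attributes.sort).join(' ') + '>'
end

def normalize_html(html)
  canonical_tags = html.gsub(/<[^>]+>/) { |tag| normalize_tag(tag) }
  without_gaps   = canonical_tags.gsub(/>\s+</, '><') # whitespace between tags
  without_gaps.gsub(/\s+/, ' ').strip                 # runs of whitespace in text
end

before = %(<TD class='grand'  ID="total">Grand   Total</TD>)
after  = %(<td id="total" class="grand">Grand Total</td>)

puts normalize_html(before)
puts normalize_html(before) == normalize_html(after)
```

The two inputs differ in case, quoting, attribute order and spacing, yet normalize to the same text, so a plain diff of the normalized forms stays silent about them.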

DRY up which content is tested

Rather than see the same diff to the same repeated content over and over, we can focus regression tests on those parts of content that vary across our testcases.

Smooth the workflow

In good automation fashion, we will make the common tasks in the workflow as cheap and as easy as we can.

It has to be simple

  • to create a new regression test,
  • to inspect diffs from the expected content when changes occur, and
  • to accept the changes once they have been checked over and seen to be OK.

And, this needs to work well in batch. We should be able to detect, review and accept changes in many tests at a time when need be.

The Contentful Rails plug-in

I put together a plug-in that tries to do these things within Rails.

% script/plugin install svn://rubyforge.org/var/svn/contentful

The plug-in was evolved in, then extracted from, a small spare-time web application. I've since re-used it successfully in other Rails projects. If you write Rails applications, you can read more and download the plug-in at the Contentful website.

For this discussion, Contentful is simply a single concrete example of a general approach I believe should translate across language boundaries, or from my Rake automation commands into your favorite IDE.


7. Capturing content to create tests

With Contentful plugged in, we can check the content generated in a test by calling assert_contentful.

class CartExampleTest
  def test_total
    get "/store/add_to_cart", :id => rails_book.id
    assert_contentful
  end
end

Then when we run the test for the first time, the test passes - and it generates the expected content from the current content, as a side effect. In accord with Rails' culture of convention over configuration, we locate the expected content in a standard place, derived from the name of the test.

% rake test
Started
...................................!...............
Finished in 3.487022 seconds.

  1) Contentful Notification:
test_total(CartExampleTest):
Generated /depot/test/contentful/cart_example/total/expected.html
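
The generate-when-absent behaviour and the naming convention can be sketched in a few lines of plain Ruby. This is not the plug-in's code: expectation_dir and check_content are hypothetical helpers, written only to show the shape of the idea against a temporary directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: map a test class and method to a directory,
# mimicking the convention visible in the output above.
def expectation_dir(root, test_class_name, test_method_name)
  group = test_class_name.sub(/Test\z/, '')
                         .gsub(/([a-z\d])([A-Z])/, '\1_\2')
                         .downcase
  File.join(root, group, test_method_name.sub(/\Atest_/, ''))
end

# Hypothetical helper: returns :generated when no expectation existed,
# :passed when the content matches, and :changed otherwise
# (leaving changed.html next to expected.html for inspection).
def check_content(dir, content)
  expected = File.join(dir, 'expected.html')
  changed  = File.join(dir, 'changed.html')
  unless File.exist?(expected)
    FileUtils.mkdir_p(dir)
    File.write(expected, content)
    return :generated
  end
  if File.read(expected) == content
    FileUtils.rm_f(changed) # clean away any stale failure output
    :passed
  else
    File.write(changed, content)
    :changed
  end
end

Dir.mktmpdir do |root|
  dir = expectation_dir(root, 'CartExampleTest', 'test_total')
  puts dir.sub(root, 'test/contentful')           # test/contentful/cart_example/total
  puts check_content(dir, '<td>Total</td>')       # generated
  puts check_content(dir, '<td>Total</td>')       # passed
  puts check_content(dir, '<td>Grand Total</td>') # changed
end
```

Deleting expected.html and running again would take the first branch once more, which is what makes the file system itself a usable interface for resetting expectations.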

To avoid duplication, we can focus on a particular subset of the content using a CSS selector. This allows us to ignore components that are common across all our views, such as navigation sidebars and the like.

class CartExampleTest
  def test_total
    get "/store/add_to_cart", :id => rails_book.id
    select_contentful "div#cart"
  end
end
% rake test
Started
...................................!...............
Finished in 3.494442 seconds.

  1) Contentful Notification:
test_total(CartExampleTest):
Generated /depot/test/contentful/cart_example/total/div#cart.expected.html

8. Detecting changes and inspecting diffs

Now that our tests are in place, we should set them running. (Running everything is the norm, but Contentful will build and execute a suite containing only its own tests if asked. I'll show that here.)

Here's what happens when we've broken something:

% rake test:content
Testing expectations in test/contentful
Started
..F.....
Finished in 0.808864 seconds.

  1) Failure:
test_total(CartExampleTest)
diff test/contentful/cart_example/total/*.to_diff
to see the content change

The failing test will have written a temporary changed.html file next to expected.html. Additionally it generates a second pair of files (expected.to_diff and changed.to_diff) which are the same HTML DOMs, but with some added line-breaks and other normalization to make them usefully comparable using a standard text diff.

(Whenever a test passes, the stale three files get cleaned away.)
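
To see why the added line-breaks matter, here is a small sketch. The to_diffable method is an invented name and only guesses at the kind of normalization involved: it puts every tag and every run of text on its own line. The array subtraction that follows is a crude stand-in for a real diff tool.

```ruby
# Put every tag and every run of text on its own line, so that a line-based
# diff points at the element that changed rather than at one enormous line.
def to_diffable(html)
  html.gsub(/\s*(<[^>]+>)\s*/, "\n\\1\n").squeeze("\n").strip
end

expected = to_diffable('<tr><td id="total">Total</td></tr>')
changed  = to_diffable('<tr><td id="total" class="grand">Grand Total</td></tr>')

expected_lines = expected.lines.map(&:chomp)
changed_lines  = changed.lines.map(&:chomp)

# Crude stand-in for diff(1): lines present on one side only.
removed = expected_lines - changed_lines
added   = changed_lines - expected_lines

removed.each { |line| puts "< #{line}" }
puts '---'
added.each { |line| puts "> #{line}" }
```

Without the line-breaking step, both versions would be a single line each, and a text diff could say no more than "everything changed".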

As well as using the suggested command-line to examine an individual failure, we can inspect all current diffs by running a single command.

% rake test:diff
Finding changes in test/contentful
! Diff in cart_example/total/changed.html
91c91
< <td id="total">
< Total
---
> <td id="total" class="grand">
> Grand Total
! Diff in another_example/something_else/changed.html
[etc...]

(Additionally, if we want to focus on a subset of all the current changes, we can do so by running within a subdirectory of test/contentful.)


9. Accepting change

If a set of changes is good, then often we can tell this by casting an eye over the diffs and observing a plausible pattern.

(Sometimes one can grep the diffs for extra certainty. For bonus points, script anything you find yourself doing by hand three times!)

If all looks good, we want to accept a bunch of changes.

% rake test:accept
Finding changes in test/contentful/cart_example
! Accepting cart_example/total/changed.html
! Accepting another_example/something_else/changed.html

That was a bit better than opening up an editor and copying and pasting the content, right?

(Again we can refine our acceptance to a subset of changes by running within a subdirectory.)
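
In essence, accepting a change is a file move. The sketch below shows one way such a step could look; accept_changes is a hypothetical helper, not the plug-in's Rake task. Passing a subdirectory as the root narrows the acceptance in the same way as running within that subdirectory.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical "accept" step: promote every changed.html under root to
# expected.html, remove the stale *.to_diff helper files, and return the
# directories that were updated.
def accept_changes(root)
  Dir.glob(File.join(root, '**', 'changed.html')).sort.map do |changed|
    dir = File.dirname(changed)
    FileUtils.mv(changed, File.join(dir, 'expected.html'))
    FileUtils.rm_f(Dir.glob(File.join(dir, '*.to_diff')))
    dir
  end
end

Dir.mktmpdir do |root|
  # Set up one failing expectation, as a failed test run might leave it.
  dir = File.join(root, 'cart_example', 'total')
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, 'expected.html'), '<td>Total</td>')
  File.write(File.join(dir, 'changed.html'), '<td>Grand Total</td>')
  File.write(File.join(dir, 'changed.to_diff'), "<td>\nGrand Total\n</td>")

  accept_changes(root).each do |accepted|
    puts "! Accepting #{accepted.sub(root + '/', '')}"
  end
  puts File.read(File.join(dir, 'expected.html'))
end
```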

Also, since expected content is generated when absent, we can alternatively use the file system to our advantage, and update expectations by removing particular subdirectories and then re-running tests.

% rm test/contentful/further/specific/*
[etc...]
Generated /depot/test/contentful/further/specific/expected.html

10. Making a vice out of a virtue

Personally, I really like my content tests to be a permanent part of the test suite. Once the workflow is smooth, I find them well worth their maintenance cost. I like the total coverage; they allow serendipitous discovery of changes I didn't consider.

However, even if you won't buy the entire bridge that I'm trying to sell you, perhaps I can interest you in a strut: use the virtues of contentful testing as a vice. (Sorry for the UK spelling. Translate it into the US "vise" if you like, but you'll spoil my pun.)

Suppose we want to perform a pervasive refactoring. For example,

  • to change the templating framework we're using, or
  • to factor out a set of view helpers and partials.

Big changes are risky. It would be nice to temporarily pin everything about current content tightly down in place whilst we work. With Contentful, we can add a bonus content assertion to every existing functional and integration test by setting a single Rails config flag in environment.rb:

CONTENTFUL_AUTO = true

Running tests once generates expectations and locks down all the current content. Then we can perform our refactoring under their full protection. When done, we unset the config flag and delete the temporary expectations before checking in the safely refactored code.

% wget http://contentful.rubyforge.org/

Thanks for reading! Comments are most welcome.


Current Mood: content

Comments
From: (Anonymous) Date: August 13th, 2007 05:38 pm (UTC)

terms

I've always heard of a regression test just being any test run as part of a continuous integration to look for introduced bugs. What you describe as regression testing, is what I would consider a "Golden Master" test.
From: anthonybailey Date: August 13th, 2007 06:22 pm (UTC)

Re: terms

Thanks.

I hadn't heard the "golden master" terminology before: a quick Google suggests it is far from universal. It's a fair enough term, although the analogy does have the unfortunate connotation that you hardly ever change the expected output.

Hmmm. I hadn't thought about this explicitly. I guess the original meaning of "regression test", coined in darker ages, seems unhelpful in a world where we have automated testing and builds: apart from the slowest, almost every test necessarily matches the original meaning, for what possibly failing test wouldn't you want to run every time? So use of the term has unconsciously evolved among my immediate peers and regained usefulness by now meaning a regression test and only a regression test: the specific case of a test where we've captured existing (usually voluminous) believed good output and compare against that, rather than one (usually assertion-based) that we used to drive up the software and behavior.

But from looking around that shift in meaning may well be local, and you're correct that I'm abusing a too generic term and risk causing confusion; I'll think about finding a better term or adding a disclaimer.
From: (Anonymous) Date: September 1st, 2007 02:24 pm (UTC)
Thanks, Anthony, for writing this post, which rehabilitates what I call UI testing.
Generally the XP/TDD guys give it a bad reputation.

If you are interested in implementing this kind of testing with .NET languages (C#, VB.NET, IronPython), see www.incisif.net.

FTorres
From: anthonybailey Date: September 2nd, 2007 10:29 pm (UTC)
UI as in user interface or interaction? I'm not advocating capturing and replaying user input to create one's test cases - just capturing and comparing output, in certain domains. I've not yet worked in depth in a domain where it wasn't worth abstracting out enough input to drive tests programmatically.

(I may be misconstruing InCisif, or implying too much guilt by association with other replay robots - apologies if so.)
From: (Anonymous) Date: June 23rd, 2008 06:10 pm (UTC)

Nice work

Anthony, this looks great! I'm just re-entering the world of application development. Used to be (decades ago) I could write a kick butt C program, but my, times have changed. I'm going down the Ruby/Rails, web 2.0, best practices road, and so much to learn. My first project is adding data management functionality to an existing web site. Which means first I'm going to rewrite the mostly static content in a rails fashion (modularly with templates and partials). As I reviewed Rails view testing but before I looked carefully at Contentful, I decided I wanted the kind of full output comparison you describe here. And Voila! Here it is. I'm surprised it's not receiving more attention in the rails community. Cheers, Dave
From: anthonybailey Date: June 25th, 2008 06:16 pm (UTC)

Re: Nice work

I'm guessing you won't see this response, Dave, but just in case...

Thanks for the kind words.

I hadn't considered using Contentful to migrate from another framework, but yes, it may be of help - I guess you would want to copy pages from the existing site in as your original expected.html golden content, and evolve from there.
License: Public Domain Dedication