Property Test-Driven Development

It’s been some time since my last post. A lot happened in my life, but I’ll try my best to get back to writing more posts again.

Let’s start off by looking at a variation of test-driven development: property test-driven development. Basically, it’s TDD with property tests. If you don’t know about property tests (QuickCheck should ring a bell), then you’re in for some nice treats. I won’t say anything about either of these being better or worse than the other. The goal of this post is simply to raise awareness about some of the neat possibilities available with property-based testing.

Usually, property test libraries start with an example like string concatenation and looking at the length of the result string. Let’s do something a bit more interesting and write a class for censoring strings. We want to replace bad words in a sentence with a censored representation like ****.

Instead of traditional TDD where we would start writing a simple test for an empty string, PTDD needs a property. We want to start off with something simple though. What about this?

    @Property
    public void lengthRemainsUnchanged(
            String s, Set<String> badWords) {
        final Censor censor = new Censor(badWords);
        assertEquals(s.length(), censor.censored(s).length());
    }

Looks almost like a normal unit test, except that the method takes two arguments. Property-based testing, as with junit-quickcheck in this example, uses generators to randomly generate arguments and execute the test with them.
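To make the idea of generators a bit more tangible, here is a hand-rolled sketch of what a property runner does conceptually. It is not how junit-quickcheck is implemented; the class PropertySketch, its two generators, and the size limits (strings of up to 20 printable ASCII characters, sets of up to 5 strings) are assumptions made purely for illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.function.BiPredicate;

// Hand-rolled illustration of a property runner; NOT junit-quickcheck's implementation.
final class PropertySketch {

    // Hypothetical generator: a random string of 0..20 printable ASCII characters.
    static String randomString(Random random) {
        int length = random.nextInt(21);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) (' ' + random.nextInt(95)));
        }
        return sb.toString();
    }

    // Hypothetical generator: a set built from up to 5 random strings.
    static Set<String> randomStringSet(Random random) {
        int size = random.nextInt(6);
        Set<String> result = new HashSet<>();
        for (int i = 0; i < size; i++) {
            result.add(randomString(random));
        }
        return result;
    }

    // Checks the property against `trials` random inputs.
    // Returns true only if the property held for every generated pair.
    static boolean holdsForAll(int trials, long seed,
            BiPredicate<String, Set<String>> property) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            String s = randomString(random);
            Set<String> badWords = randomStringSet(random);
            if (!property.test(s, badWords)) {
                return false;
            }
        }
        return true;
    }
}
```

With the Censor class from this post, the first property could then be checked with something like PropertySketch.holdsForAll(100, 42L, (s, badWords) -> new Censor(badWords).censored(s).length() == s.length()). A real library adds a lot on top of this, such as shrinking failing inputs and reporting the seed.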

When we run this test against a trivial implementation that always returns the empty string, it will fail. Returning any constant string won’t work either, since the property-based test gives us a lot of differently sized strings. For now, let’s implement it by returning the argument string:

    public String censored(String argument) {
        return argument;
    }

Next, we might want to add a property for an actual bad word. For the sake of simplicity and to keep the post length under control, we assume that a method nStars(int) is available to generate a string of n star (*) characters. Using it, the next property could look like this:

    @Property
    public void onlyBadWordIsReplaced(String s) {
        final Censor censor = new Censor(new HashSet<>(Arrays.asList(s)));
        assertEquals(Censor.nStars(s.length()), censor.censored(s));
    }

Of course, the production code fails now. Let’s fix it and keep it as blatantly simple (and wrong) as possible:

    public String censored(String argument) {
        return Censor.nStars(argument.length());
    }

Now, we just censor everything. Interestingly enough, this also satisfies our initial property. So we need a property that deals with only a part of the string being censored:

    @Property
    public void badWordAtStart(String badWord) {
        final Censor censor = new Censor(new HashSet<>(Arrays.asList(badWord)));
        String s = badWord + " you";
        assertEquals(Censor.nStars(badWord.length()) + " you", censor.censored(s));
    }

We will again try to keep the implementation as simple and pointless as possible, but it’s getting trickier. Firstly, we can no longer get away with ignoring the actual bad words. The number of stars we need to prefix the result with to satisfy this latest property depends on the differently sized bad words. Hence, we need to store the Set of bad words in an attribute. Here’s our new production code:

    public String censored(String argument) {
        if (badWords.size() == 1) {
            String badWord = badWords.iterator().next();
            // Never censor (or skip) more characters than the argument has.
            int censoredLength = Math.min(badWord.length(), argument.length());
            return Censor.nStars(censoredLength)
                    + argument.substring(censoredLength);
        }
        return Censor.nStars(argument.length());
    }

There’s some obvious trouble here, but it satisfies the properties. The size check is bogus of course, but the Math.min is already required due to our initial property: property-based testing will generate cases where the input string is shorter than the bad word. We could refactor this, but it’s still so blatantly wrong that there’s not yet much point to it. So let’s continue with a property that utilizes all of the bad words, like a really bad sentence:

    @Property
    public void badSentence(Set<String> badWords) {
        final Censor censor = new Censor(badWords);
        String badPart = badWords.stream().collect(Collectors.joining());
        String badSentence = "Prefix"
                + badPart
                + "Suffix";
        assertEquals("Prefix"
                + Censor.nStars(badPart.length())
                + "Suffix",
            censor.censored(badSentence));
    }

This mean sentence joins all the bad words together. Note that we do have to add some non-bad part, as otherwise the previous implementation, which censors everything, would already pass. To get ahead a bit faster, we added good parts at the front and at the end of the string. Here is the next attempt to mess things up:

    public String censored(String argument) {
        String allBadWords = badWords.stream().collect(Collectors.joining());
        String nStars = Censor.nStars(allBadWords.length());
        return argument.replace(allBadWords, nStars);
    }

Totally different approach now, but it actually satisfies all the properties so far. Let’s rewrite our last property and make it a bit more challenging by simply separating the bad words with spaces:

    @Property
    public void badSentence(Set<String> badWords) {
        final Censor censor = new Censor(badWords);
        String badPart = badWords.stream().collect(Collectors.joining(" "));
        String badSentence = "Prefix"
                + badPart
                + "Suffix";
        assertEquals("Prefix"
                + badWords.stream()
                    .map(String::length)
                    .map(Censor::nStars)
                    .collect(Collectors.joining(" "))
                + "Suffix",
            censor.censored(badSentence));
    }

Again, we must get a bit closer to an actual implementation, but are still finding ways to make our code wrong:

    public String censored(String argument) {
        String result = argument;
        for (String badWord : badWords) {
            result = result.replace(badWord, Censor.nStars(badWord.length()));
        }
        return result;
    }

Uh-oh. Now we can see that the PTDD approach really runs a lot of tests. Since the approach with String#replace is of course horribly slow, we can see our tests slow down. After a while, though, the test fails anyway. There is a subtle bug in the above code, and junit-quickcheck uncovers it mercilessly in almost every run. Not in every single run, though, due to the randomization of the input strings.

The problem occurs when one bad word is contained in another. For example, with the bad words “ab” and “abc”, replacing “ab” first changes “ab abc” to “** **c”. The subsequent replacement of “abc” then no longer finds a match. That’s the power of property tests: although they do not explicitly cover edge cases, their randomized mass of tests produces all sorts of edge cases like this.
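The failure is easy to reproduce by hand. The following sketch applies the same replace loop to a list instead of a set, so that the order of the replacements is under our control. The class name OverlapDemo and the method censorInOrder are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;

final class OverlapDemo {

    // Same helper as Censor.nStars in this post.
    static String nStars(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, '*');
        return new String(chars);
    }

    // Applies the replacements in exactly the given order,
    // mirroring the loop in the production code above.
    static String censorInOrder(List<String> badWords, String sentence) {
        String result = sentence;
        for (String badWord : badWords) {
            result = result.replace(badWord, nStars(badWord.length()));
        }
        return result;
    }
}
```

Calling OverlapDemo.censorInOrder(Arrays.asList("ab", "abc"), "ab abc") yields “** **c”, whereas the order "abc", "ab" yields the expected “** ***”. With a Set, the order is simply whatever the iterator happens to return.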

Let’s fix the production code then. When we order the bad words by decreasing length, the containment problem described above vanishes:

    public String censored(String argument) {
        List<String> orderedBadWords = badWords.stream()
                .sorted(Comparator.comparing(String::length).reversed())
                .collect(Collectors.toList());
        String result = argument;
        for (String badWord : orderedBadWords) {
            result = result.replace(badWord, Censor.nStars(badWord.length()));
        }
        return result;
    }

Now the tests run fine again. However, we have a flaw in the actual property test, which is a good way to learn about the weaknesses of property-based tests. Although loads of strings get generated, the generator is very unlikely to ever produce certain interesting strings like “Suffix”. If “Suffix” were one of the bad words, the property as we wrote it would be wrong, since the censor should then also replace that word at the end of the sentence.
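To see the clash concretely, we can feed the property a hand-picked bad word instead of waiting for the generator to stumble upon it. This is only a sketch: FixtureClashDemo and its methods are hypothetical names, and the censoring logic is a compact copy of the longest-first loop from above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class FixtureClashDemo {

    static String nStars(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, '*');
        return new String(chars);
    }

    // Longest-first replacement, as in the production code of this post.
    static String censored(Set<String> badWords, String sentence) {
        List<String> ordered = badWords.stream()
                .sorted(Comparator.comparing(String::length).reversed())
                .collect(Collectors.toList());
        String result = sentence;
        for (String badWord : ordered) {
            result = result.replace(badWord, nStars(badWord.length()));
        }
        return result;
    }

    // The value the badSentence property expects for these bad words.
    static String expectedByProperty(Set<String> badWords) {
        return "Prefix"
                + badWords.stream()
                    .map(String::length)
                    .map(FixtureClashDemo::nStars)
                    .collect(Collectors.joining(" "))
                + "Suffix";
    }

    // What the censor actually returns for the sentence built by the property.
    static String actual(Set<String> badWords) {
        String badPart = badWords.stream().collect(Collectors.joining(" "));
        return censored(badWords, "Prefix" + badPart + "Suffix");
    }
}
```

For the single bad word “Suffix”, expectedByProperty returns “Prefix******Suffix”, while actual returns “Prefix” followed by twelve stars. The censor is right here and the property is wrong.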

Let’s fix that by adding assumptions on the bad words:

    @Property
    public void badSentence(Set<String> badWords) {
        assumeFalse(badWords.stream().anyMatch("Prefix"::contains));
        assumeFalse(badWords.stream().anyMatch("Suffix"::contains));
        final Censor censor = new Censor(badWords);
        String badPart = badWords.stream().collect(Collectors.joining(" "));
        String badSentence = "Prefix"
                + badPart
                + "Suffix";
        assertEquals("Prefix"
                + badWords.stream()
                    .map(String::length)
                    .map(Censor::nStars)
                    .collect(Collectors.joining(" "))
                + "Suffix",
            censor.censored(badSentence));
    }

We now realize that the badWordAtStart property suffers from the same problem, so we add another assumption there to make sure the generated badWord does not conflict with our hardcoded string.
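As a rough sketch of what that extra assumption has to rule out, the following plain Java snippet checks the badWordAtStart property for a single hand-picked bad word. StartFixtureGuard, clashesWithTail and propertyHolds are hypothetical names; in the junit-quickcheck property itself, the guard would presumably be a line such as assumeFalse(" you".contains(badWord)).

```java
import java.util.Arrays;

final class StartFixtureGuard {

    // The hardcoded part of the sentence used by badWordAtStart.
    static final String TAIL = " you";

    static String nStars(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, '*');
        return new String(chars);
    }

    // True if the bad word also occurs inside the hardcoded tail.
    // Note: this is conservative, since it is also true for the empty string.
    static boolean clashesWithTail(String badWord) {
        return TAIL.contains(badWord);
    }

    // Evaluates the badWordAtStart property for one bad word,
    // using a censor that knows only this single word.
    static boolean propertyHolds(String badWord) {
        String sentence = badWord + TAIL;
        String censored = sentence.replace(badWord, nStars(badWord.length()));
        return (nStars(badWord.length()) + TAIL).equals(censored);
    }
}
```

A bad word like “o” clashes with the tail: “o you” is censored to “* y*u”, not to the “* you” the property expects. Discarding such inputs via the assumption keeps the property honest, at the price of also discarding the harmless empty string.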

Finally, we can think about refactoring. Obviously, ordering the bad words can happen once in the constructor, so let’s move it there instead. We may want to refactor the remaining method further and separate the replacement logic into its own method, which also allows us to use the Stream API’s reduce method in a nice way. Hence, we end up with this resulting class:

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;
    import java.util.Set;
    import java.util.stream.Collectors;

    public class Censor {

        // Bad words, longest first, so that contained words are replaced last.
        private final List<String> orderedBadWords;

        public Censor(Set<String> badWords) {
            orderedBadWords = badWords.stream()
                .sorted(Comparator.comparing(String::length).reversed())
                .collect(Collectors.toList());
        }

        public String censored(String argument) {
            return orderedBadWords.stream()
                    .reduce(argument, this::replaceBadWord);
        }

        private String replaceBadWord(String sentence, String badWord) {
            return sentence.replace(badWord,
                    Censor.nStars(badWord.length()));
        }

        public static String nStars(int n) {
            char[] chars = new char[n];
            Arrays.fill(chars, '*');
            return new String(chars);
        }
    }

This example demonstrated how property tests can quickly guide you towards a good solution. One could still invent code that satisfies these properties but fails to solve the actual problem. PTDD is no silver bullet, but it often helps to create a better test suite. An added benefit of property tests is the reflection they require: you are forced to think about properties of your code rather than about individual example executions.

Testing in general suffers from incompleteness when compared to proofs, since each test only validates a single output for a single input. While property-based testing also merely samples a number of test cases, the property itself should be provable. Realizing such properties about one’s program is in itself a worthy goal.
