Don’t let Test Automation be the final nail in the coffin!

by Ger Cloudt, author of “What is Software Quality?”

Image by Michael Schwarzenberger from Pixabay

“We need an extra Test Engineer.”,  the Agile Master informed me on a sunny Monday morning. Already I noticed that the team needed more effort to keep the dashboard of the nightly build and testing green. More often in the morning “reds” were reported indicating the build or test had failed. Good…, our short feedback loop seems to work, defects are detected early and thus can be addressed immediately.
However before granting the request of the Agile Master I wanted to understand better the root-cause of the increase of reds.

Automate everything.

Clearly there is a lot of pressure on development teams to become faster, more efficient and predictable. Over time the software industry addressed this demand a.o. by changing processes and becoming Agile. One of the key principles to mitigate waste is to create short feedback loops. If something is done wrong and detected very fast the waste is limited, simply because it is relatively easy to fix. That’s why we try to test as early as possible to catch defects as early as possible because we all know the relation between the time gap of defect insertion and defect resolution and costs. The earlier a defect is detected and solved the cheaper it is.
Testing as early as possible implies a lot of testing, over and over again. And that’s why we started to automate testing because activities to be repeated over and over again should be automated with clear reporting such that fails are noticed immediately without too much additional effort.

How about all these reds.

Before deciding on adding an extra Test Engineer we needed to have a closer look at the root-cause of these increasing reds in the nightly test run. First question to be asked was whether the reds were caused by indeed regressions in the Software Under Test? Analysis of the reds showed that this was not the case. Failures in test cases, instability in test framework and test infrastructure, not updated test cases clearly caused the increase of reds. A next interesting question to be asked is what percentage of reds is caused by actual regressions in the product code? As it seemed the majority of reds was not caused by regressions in product software but by other reasons.

Over the years automated test bench grew, configurations were added, tests were added, infrastructure changed and apparently more and more effort is needed to maintain all these automated tests and everything related. To be honest this should not be a surprise because test automation is software development. And software is subject to Technical Debt (see Help…, my software rots!). So, your test cases and test environment will be as well.

Gherkin scenarios

Let’s have a closer look at Gherkin scenarios supported by tooling like e.g. SpecFlow and Cucumber.
Gherkin is a business readable language used to describe behavior which can be used for defining executable test cases. It consists out of steps and uses keywords like “Given”, “When” and “Then” to describe a precondition, an action and a result. Each step is associated to a keyword. Scenarios are written in Gherkin and the steps require “glue-code” to address the Software Under Test. Tooling like SpecFlow generates for each step a signature which consists out of a method interface in a specific programming language like e.g. C# or Java. The actual code (glue-code), implementing the step, to address the Software Under Test in the correct way needs to be programmed by the Test Automation Engineer.

Despite Gherkin is a simple language, you still can build a mess… and if not paying attention to good programming practices you will build a mess even in Gherkin.

Therefore even in Gherkin scenarios you might think about defining generic steps to be re-used in multiple test cases and specific steps. Even in Gherkin scenarios you might think about Clean Code principles like using meaningful names, naming conventions and keeping steps small. Once I saw a step in a Gherkin scenario which resulted in a method with more than 15 arguments! Ouch……, what does Clean Code of Robert C Martin say about number of arguments? What if you need to adapt this Gherkin scenario due to a new requirement? Imagine you might have many of these of kind scenarios…..
As you can build a mess in both your Gherkin scenarios as well as your glue-code, test cases contain Technical Debt which might slow you down significantly and result in increasing numbers of reds in your automated test execution.

Even more code.

For testing we use Gherkin scenarios, glue-code, unit tests. All of these are code. But there is even more code, we have e.g. build scripts, configuration code and test framework code. There is a lot of code outside the actual software which is delivered as product or service. And also this code needs to be maintained, also this code needs to be changed as your product is evolving. New components are developed, meaning build scripts to be adapted, new test cases to be created. Existing test cases need to be changed, so there is always work to be done on code. For this reason, for accomplishing a sustainable pace of development, this non-product code needs to be handled in the correct way, in the same way as our product code. Mitigating Technical Debt as much as possible.
Is your test framework actually designed? Or did it grow without any design or structure? Is your test code under version control? Do you apply Clean Code practices on your test code and Gherkin scenario’s? Do you apply static code analysis on your test code? Is your test code reviewed? Is your test infrastructure maintained? Do you track defects in test code? Are your Test Automation Engineers actual Software Engineers?

To summarize, non-product is code as well. Technical Debt is not only applicable to your product code but to all other code as well. To keep your automated testing up and running without too much effort one needs to apply proper software craftsmanship on all code and not only on the product code. If not, more and more effort need to be spent on analyzing “false-positive” reds resulting in slowing down your regular development until it becomes the final nail in the coffin.

Leave a Reply

Your email address will not be published. Required fields are marked *