We need to discuss technical debt…

Last week an engineer whom I did not know approached me for a short talk. He explained that he is working on a piece of embedded software containing significant levels of technical debt. He would like a fixed percentage of the team's effort to be spent on handling technical debt; even 10% would be appreciated.

However, due to time pressure to deliver functionality, the product owner does not want to grant that percentage. The engineer’s question was whether I had some tips and tricks which could help him. He had read parts of my book “What is Software Quality?”.

I pointed out that we, as a software community, need to stand up for our profession. It is our responsibility to mitigate technical debt while finding the right balance between the short term and the long term. We need to engage in dialogue on the subject with our stakeholders, like product owners, project managers and management in general. We need to point out the consequences of technical debt and why it needs to be addressed.

First of all we need to have a look at ourselves: what can we do to mitigate technical debt? For example, developing software as it should be developed: applying good development practices with craftsmanship and producing clean code, while demonstrating the discipline needed to do so despite time pressure. And even then, technical debt is inevitable and will creep into our software. So, we need to do something additional.

Considering technical debt itself, we can distinguish between small TD and big TD. An example of small TD is a compiler warning which has not yet been fixed, or a function or method with too high a cyclomatic complexity. Small TD can be handled by the boy-scout rule: leaving code cleaner than you found it. Whenever you alter a piece of code, get rid of the small TD in that piece of code. We should always apply the boy-scout rule and, in my opinion, we do not need to ask for ‘permission’ to do so. It is part of our craftsmanship.

An example of big TD is a required redesign of a module which will take a significant amount of effort. In this case the technical debt is so big that it cannot be solved instantly. We need to register big TD on, e.g., a Technical Debt Backlog (TDB). This TDB then needs to be considered on a regular basis in the context of the features to be planned for the next release. Which TDB items need to be addressed in order to implement the prioritized User Stories? Preferably, TDB items are ‘connected’ to one or more User Stories, so that whenever a User Story is prioritized the ‘connected’ TDB items are prioritized as well.
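The idea of ‘connecting’ TDB items to User Stories can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class `TdbItem`, the function `items_for_release`, the story IDs and the effort numbers are all hypothetical and not taken from any tool or project.

```python
from dataclasses import dataclass, field


@dataclass
class TdbItem:
    """One entry on the Technical Debt Backlog (hypothetical structure)."""
    title: str
    effort_days: float
    # IDs of the User Stories this debt item is 'connected' to.
    connected_stories: set[str] = field(default_factory=set)


def items_for_release(backlog: list[TdbItem],
                      prioritized_stories: set[str]) -> list[TdbItem]:
    """Return the TDB items connected to at least one prioritized User Story."""
    return [item for item in backlog
            if item.connected_stories & prioritized_stories]


# Made-up example backlog.
backlog = [
    TdbItem("Redesign logging module", 15, {"US-101", "US-104"}),
    TdbItem("Replace hand-written parser", 8, {"US-230"}),
    TdbItem("Split monolithic driver", 20),  # not connected to any story yet
]

selected = items_for_release(backlog, prioritized_stories={"US-104", "US-300"})
for item in selected:
    print(f"{item.title}: {item.effort_days} days")
```

With the made-up data above, only the logging redesign is pulled into the release discussion, because it is the only item connected to a prioritized story. Items without any connection, like the driver split, stay visible on the backlog but need a separate conversation with the stakeholders.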

To be able to discuss and prioritize big TD with the stakeholders, it is important that the stakeholders understand what technical debt is and what consequences it has. Therefore, they need to be educated by us, the software professionals. In my book I try to explain technical debt in such a way that it can also be understood by people who do not have software knowledge.

Engaging with the stakeholders and explaining and discussing the consequences of technical debt is necessary: using metaphors, explaining the complexity of execution paths, visualizing the size of software, showing the vast diversity of technical debt, and pointing out the long-term consequences of technical debt for development speed and efficiency. Personally, I like to talk about ‘a sustainable pace of development’ instead of ‘development speed’, just as in Formula 1 they talk about ‘race pace’ instead of ‘race speed’. There is a reason for this: a focus on speed in Formula 1 increases tire wear, just as a focus on speed in software development increases software wear. In both cases velocity declines. Let’s take our responsibility and start discussing technical debt with our stakeholders, using metaphors like tire wear in Formula 1.

Should Software Engineers be Certified?

Software is everywhere and its relevance is growing at a phenomenal pace. It would not be an exaggeration to say that software runs the world. And still, as expressed in a previous blog of mine, “Trial and Error Programming”, programming skills need to improve significantly. Besides programming skills, software engineering skills also need to improve. As an example, how is it that so few software engineers use UML for modeling requirements and design?

Low threshold to become a software engineer

Let’s have a closer look at the issues related to becoming a software engineer. If there is a shortage of professionals in a certain profession, people will jump on it. There will be jobs all over the place. Combine this with the very low threshold for producing software, and it becomes very easy to seize the opportunities created by the shortage of software engineers.

How easy is it to download a compiler and write and execute your first program? How easy is it to search the internet for any piece of code and copy it into your program to get a result? How easy is it to learn Python and start programming? Unlike in other engineering disciplines, everybody can start programming easily. Tools are available and downloadable from the internet, and production is done simply by compiling. With only a partial understanding of a programming language, first results can easily be achieved. How different this is from building your first electronic circuit! Building electronic circuitry without understanding voltage, current, resistors, coils, capacitors, transistors, or digital circuitry is not even possible.

The low threshold to becoming a software engineer, coupled with the high demand for software engineers in the market, is the reason why we have so many software engineers who are not adequately educated.

Software engineering is more than programming alone.

If we take software development seriously, we should make sure that our software engineers are well trained in the necessary software engineering practices, in addition to “only” knowing a programming language. Engineering practices such as requirements engineering, modeling, design methodologies and design patterns, Clean Design, algorithms, Clean Code, and testing are even more important than knowing the programming language itself. Additionally, our software engineers should be educated in computer architectures, computer networks and security, databases, operating systems, communication protocols, and so on.

In order to build high-quality software, it is not sufficient to “only” know a programming language. Therefore, should we certify our software engineers to ensure they are well educated in the relevant topics associated with software engineering? This is especially important for software that is critical to our society, like software in aerospace, in automotive or in our financial systems. Certification of our software engineers would help us ensure that only well-educated software engineers are working on our most critical software!

Process vs Skill

by Ger Cloudt, author of “What is Software Quality?”

One of the struggles of software development is the assurance of quality. Sure, if you talk about quality assurance, people will refer to testing. Or, one might want to assure quality by applying proper processes and best practices. Process improvement and reference models like ASPICE and CMMI are used in the industry to increase quality or to determine the capability of the organization to develop high-quality software. However, when using reference models like ASPICE or CMMI there is a pitfall: using them might disturb the balance between process and skill.

Process and Skill

According to Wikipedia.org a process is defined as “a series or set of activities that interact to produce a result”.
Additionally, to execute a process one needs to have a skill. According to Wikipedia.org a skill is “the learned ability to perform an action with determined results”.

Process and skill are complementary, so you will need both. To execute actions or activities you need a certain level of skill; to achieve a result, multiple actions need to be structured by a process. However, the importance of process relative to the importance of skill depends on the kind of activities to be performed.

The figure below tries to visualize the relevance of process versus the relevance of skill for performing a certain task. On the left side of the figure are tasks for which process is highly relevant and the needed skill level is relatively low. An example is assembly-line work. Many people are able to perform these tasks because the needed skill level is low; however, the order of the activities to be performed is very strict, and therefore the process relevance is high. The same applies to assembling furniture bought at IKEA. A manual describing how to assemble the cupboard or chair (the process) is important, whilst the needed skill level is quite modest.

At the other end of the scale we have activities for which process relevance is low but a very high skill level is needed, which only a few people master. An example would be painting the Mona Lisa as Leonardo da Vinci did.

Software Engineer

The question is where to position the software engineer in the figure. If you draw a line reflecting the position at which process relevance and needed skill level are equal, would the software engineer be placed to the left or to the right of this line? In my humble opinion the software engineer would be positioned to the right of this line, meaning that I think the needed skill level is more important than the process for performing the role of software engineer. To put it differently: a process is a tool that helps you apply your skills. If the skill isn’t available, a process doesn’t help you. However, if the skill is available, a process can help to apply this skill.

ASPICE or CMMI level

Process improvement models like ASPICE and CMMI define different maturity levels, which are used in some industries as a minimum requirement for software or system suppliers. This requirement, or target, to be process compliant within a certain process reference model disturbs the balance between process and skill.

By setting a target to be process compliant and to achieve a certain ASPICE or CMMI maturity level, the focus is directed to the less important part of the balance between process relevance and needed skill level. Achieving the target level of process compliance does not tell the full story, and therefore there is a danger that process compliance is assumed to be equal to achieving high-quality software, which it is not. Process compliance assures that the defined and necessary activities are performed; however, it does not say anything about how well these activities are performed. Therefore, in addition to checking process compliance, one should check the content, the result of the activities, as well, simply because performing the activities well requires skill. Good skills will lead to good results; bad skills will lead to bad results.

Software Estimations – why is it so hard?

by Ger Cloudt, author of “What is Software Quality?”

Organizations like predictability in their development projects. High predictability enables, e.g., Sales to sell and actually deliver. It enables the organization to negotiate contracts which can be fulfilled, and if obligations are met no sales will be lost.
However, it seems that the predictability of software development is low. Agile principles, like developing in small increments with a Potentially Shippable Product at the end of each sprint, are a way to address the predictability problem in software development. Having small increments enables frequent deliveries, on time, but maybe not with the wanted scope. This is less of a problem because the next release, which will be available soon, will contain the missing features.
But still, this incremental approach is not applicable to every software development project, and thus we are back to the predictability of software development.

In the development of multi-disciplinary embedded products, predictability is still important. Since the software is not released and updated as frequently as in a true Agile environment, the functionality available in a release becomes more important: if a feature is not in, the customer needs to wait for the next release, which might take a considerable amount of time, or even worse, the product might not be remotely upgradable at all.

Cone of uncertainty

In 1995 Boehm et al. presented the estimate-convergence graph, also known as the cone of uncertainty, stating that for any given feature set, estimation precision can improve only as the software itself becomes more refined.

Investing more time and effort in developing requirements, understanding requirements, analyzing requirements and even building the software itself will result in more accurate estimates, as the cone of uncertainty shows. But even if everything seems to be clear, complete and understood, we will still have uncertainty, which will cause deviations from the plan. Uncertainty cannot be eliminated, and therefore estimates will never be 100% accurate.

Puzzle analogy

Then, how to explain that software estimations are difficult and imprecise? For this I would like to use the analogy of solving puzzles like crosswords, cryptograms or Sudokus.

If I want to explain to somebody the difficulties we experience in software estimation, I would ask that person to estimate how long it would take to solve a booklet of puzzles. Most likely you will get questions in return, like: What is the difficulty level of the puzzles? What kind of puzzles are you referring to? How many puzzles need to be solved? How big are these puzzles? Am I allowed to use a dictionary? Well…, I do not know, but please provide me with an estimate anyway. You can imagine the accuracy of such an estimate.
This question of estimating the effort needed to solve an unknown bunch of puzzles can be compared to asking for estimates for road-mapping purposes. The development team is provided with some one-liners describing the requirements and is asked for an estimate, so that a roadmap can be plotted.

Then we can take the next step and provide the actual booklet of puzzles. Having a look into the booklet will provide better insights, and most likely the initial estimate will be adjusted to reflect them.
The more time you spend looking at the puzzles and understanding their difficulty, the better the estimate to complete will become. In real software development this is comparable to collecting, understanding and analyzing requirements, and maybe here and there performing some pre-development or prototyping for high-risk areas.

To achieve an even better estimate you could not only have a look at the booklet of puzzles, but actually solve some puzzles. Measure how long it takes to solve them and count the remaining, not yet solved, puzzles. This is what we call measuring velocity and applying it to the remaining work to predict when we will deliver.
You would expect a pretty accurate estimate, right?
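The arithmetic behind such a velocity-based forecast is simple, and a small Python sketch makes it concrete. The function name `sessions_remaining` and all numbers are made up for illustration; the sketch assumes velocity is just the average number of puzzles solved per session.

```python
import math


def sessions_remaining(solved_per_session: list[int], remaining: int) -> int:
    """Forecast how many more sessions are needed at the measured velocity."""
    if not solved_per_session or sum(solved_per_session) == 0:
        raise ValueError("no velocity measured yet")
    # Velocity: average number of puzzles solved per session so far.
    velocity = sum(solved_per_session) / len(solved_per_session)
    # Round up: a partly used session still has to be planned.
    return math.ceil(remaining / velocity)


solved = [4, 6, 5]                                # measured: 5 puzzles per session
print(sessions_remaining(solved, remaining=40))   # 40 / 5 -> 8

# The forecast is only as stable as the scope: tear out 5 puzzles
# and add 15 new ones, and the same velocity gives a different date.
print(sessions_remaining(solved, remaining=40 - 5 + 15))  # 50 / 5 -> 10
```

The calculation itself is trivial. What makes the forecast unreliable is that both inputs, the velocity and the remaining work, keep moving during the project.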

However, when several puzzles have been solved and velocity seems to be stable, I will come in, tear out a number of puzzles and add a number of other puzzles to the work to be done. This can be compared to requirements being changed and added during the project, which will happen throughout the development.

Another problem you will encounter during your puzzle solving is that you will suddenly run into puzzles with an unexpectedly high complexity. If the puzzles you solved so far had a complexity between 3 and 5 stars, you will encounter some puzzles with a much higher complexity, and your velocity will drop tremendously. Encountering these high-complexity puzzles can be compared to encountering difficult problems during development: hard or nearly impossible to reproduce and even harder to solve. You will also encounter puzzles which are related to previously solved puzzles, and to be able to solve these new puzzles you have to redo the previously solved ones.

And we have not yet talked about external influences on your puzzle solving. What if you are out of pencils, just because I come into the room and want to replace all pencils with a cheaper type? Or what if your dictionary suddenly gets lost? Compare it to Corporate IT performing a security update on the network. If you are lucky the update is done over the weekend and does not affect your project, but there is a risk you will have problems on Monday.

If estimating puzzle solving is already hard… how about estimating software?

As you can see, estimating puzzle solving, an activity everybody can perform, is already hard and inaccurate. Imagine the development of a big software system, which can only be done by highly educated engineers. How can we expect accurate estimates?

Don’t let Test Automation be the final nail in the coffin!

by Ger Cloudt, author of “What is Software Quality?”

“We need an extra Test Engineer,” the Agile Master informed me on a sunny Monday morning. I had already noticed that the team needed more effort to keep the dashboard of the nightly build and test green. More and more often “reds” were reported in the morning, indicating that the build or a test had failed. Good…, our short feedback loop seems to work: defects are detected early and thus can be addressed immediately.
However, before granting the request of the Agile Master, I wanted to better understand the root cause of the increase in reds.

Automate everything.

Clearly there is a lot of pressure on development teams to become faster, more efficient and more predictable. Over time the software industry has addressed this demand by, among other things, changing processes and becoming Agile. One of the key principles to mitigate waste is to create short feedback loops. If something is done wrong and detected very quickly, the waste is limited, simply because it is relatively easy to fix. That’s why we try to test as early as possible, to catch defects as early as possible: we all know that the longer the gap between defect insertion and defect resolution, the higher the cost. The earlier a defect is detected and solved, the cheaper it is.
Testing as early as possible implies a lot of testing, over and over again. That’s why we started to automate testing: activities that are repeated over and over again should be automated, with clear reporting, so that failures are noticed immediately without too much additional effort.

How about all these reds?

Before deciding on adding an extra Test Engineer, we needed to have a closer look at the root cause of the increasing reds in the nightly test run. The first question to be asked was whether the reds were indeed caused by regressions in the Software Under Test. Analysis of the reds showed that this was not the case. Failures in test cases, instability in the test framework and test infrastructure, and test cases that had not been updated clearly caused the increase in reds. The next interesting question is what percentage of reds is caused by actual regressions in the product code. As it turned out, the majority of reds were not caused by regressions in the product software but by other reasons.
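Answering the percentage question requires nothing more than labeling each analyzed red with its root cause and counting. A minimal Python sketch follows; the function `red_breakdown`, the cause labels and the sample data are purely illustrative and are not the figures from the project described here.

```python
from collections import Counter


def red_breakdown(causes: list[str]) -> dict[str, float]:
    """Return the percentage of reds per root-cause label."""
    if not causes:
        return {}
    counts = Counter(causes)
    return {cause: 100 * n / len(causes) for cause, n in counts.items()}


# Made-up labels for ten analyzed reds, one label per failed nightly job.
analyzed_reds = (
    ["product regression"] * 2
    + ["defect in test case"] * 3
    + ["unstable framework or infrastructure"] * 3
    + ["test case not updated"] * 2
)

for cause, pct in red_breakdown(analyzed_reds).items():
    print(f"{pct:5.1f}%  {cause}")
```

In this made-up sample only 20% of the reds point at the product itself. Tracking such a breakdown over time shows whether the effort goes into finding product defects or into keeping the test automation alive.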

Over the years the automated test bench grew: configurations were added, tests were added, infrastructure changed, and apparently more and more effort is needed to maintain all these automated tests and everything related to them. To be honest, this should not be a surprise, because test automation is software development. And software is subject to Technical Debt (see Help…, my software rots!). So your test cases and test environment will be as well.

Gherkin scenarios

Let’s have a closer look at Gherkin scenarios, supported by tooling like SpecFlow and Cucumber.
Gherkin is a business-readable language used to describe behavior, which can be used for defining executable test cases. It consists of steps and uses keywords like “Given”, “When” and “Then” to describe a precondition, an action and a result. Each step is associated with a keyword. Scenarios are written in Gherkin, and the steps require “glue-code” to address the Software Under Test. Tooling like SpecFlow generates a signature for each step, consisting of a method interface in a specific programming language like C# or Java. The actual code (glue-code) implementing the step, which addresses the Software Under Test in the correct way, needs to be programmed by the Test Automation Engineer.
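To make the relation between scenario, step and glue-code tangible, here is a deliberately tiny, self-contained Python sketch. It is not the SpecFlow or Cucumber API: the `step` decorator, the `run` function and the `Calculator` class (standing in for the Software Under Test) are all hypothetical, and real tooling does considerably more.

```python
import re

STEPS = []  # registered glue-code: (compiled pattern, function)


def step(pattern):
    """Register a glue-code function for step texts matching the pattern."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


class Calculator:
    """Hypothetical stand-in for the Software Under Test."""
    def __init__(self, total=0):
        self.total = total

    def add(self, n):
        self.total += n


@step(r"a calculator showing (\d+)")
def given_calculator(ctx, start):
    ctx["calc"] = Calculator(int(start))


@step(r"I add (\d+)")
def when_add(ctx, amount):
    ctx["calc"].add(int(amount))


@step(r"the display shows (\d+)")
def then_display(ctx, expected):
    assert ctx["calc"].total == int(expected)


SCENARIO = """
Given a calculator showing 2
When I add 3
Then the display shows 5
"""


def run(scenario):
    """Execute each step by looking up the glue-code that matches its text."""
    ctx = {}
    for line in scenario.strip().splitlines():
        # The keyword (Given/When/Then) is for the reader; matching uses the text.
        _keyword, text = line.strip().split(" ", 1)
        for pattern, func in STEPS:
            match = pattern.fullmatch(text)
            if match:
                func(ctx, *match.groups())
                break
        else:
            raise LookupError(f"No glue code for step: {line!r}")
    return ctx


print(run(SCENARIO)["calc"].total)
```

Even in this toy version the point is visible: the scenario text is only the surface, and every step is backed by ordinary code that has to be written, named, reviewed and maintained like any other code.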

Although Gherkin is a simple language, you can still build a mess… and if you do not pay attention to good programming practices, you will build a mess even in Gherkin.

Therefore, even in Gherkin scenarios you should think about separating generic steps, to be re-used in multiple test cases, from specific steps. Even in Gherkin scenarios you should think about Clean Code principles like using meaningful names, following naming conventions and keeping steps small. Once I saw a step in a Gherkin scenario which resulted in a method with more than 15 arguments! Ouch…, what does Clean Code by Robert C. Martin say about the number of arguments? What if you need to adapt this Gherkin scenario due to a new requirement? Imagine you have many scenarios of this kind…
As you can build a mess in both your Gherkin scenarios and your glue-code, test cases contain Technical Debt, which might slow you down significantly and result in increasing numbers of reds in your automated test execution.
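One common way to keep the argument count of such glue-code down is to group related values into one object with sensible defaults, so that a scenario only states what differs. The Python sketch below illustrates that general idea; `OrderSetup` and `place_order` are hypothetical names for illustration and are not taken from the scenario I saw.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OrderSetup:
    """Groups what would otherwise be a long list of step arguments."""
    customer: str = "default customer"
    product: str = "widget"
    quantity: int = 1
    express: bool = False


def place_order(setup: OrderSetup) -> str:
    """Glue-code with a single argument instead of many positional ones."""
    shipping = "express" if setup.express else "standard"
    return f"{setup.quantity} x {setup.product} for {setup.customer} ({shipping})"


default = OrderSetup()
# A test case overrides only the values it actually cares about.
rush = replace(default, quantity=3, express=True)
print(place_order(rush))  # 3 x widget for default customer (express)
```

When a new requirement adds another field, only the data class gets a new default; existing scenarios and glue-code signatures stay untouched.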

Even more code.

For testing we use Gherkin scenarios, glue-code and unit tests. All of these are code. But there is even more code: we have, e.g., build scripts, configuration code and test framework code. There is a lot of code outside the actual software which is delivered as a product or service. This code needs to be maintained too, and it needs to be changed as your product evolves. New components are developed, meaning build scripts have to be adapted and new test cases have to be created. Existing test cases need to be changed, so there is always work to be done on this code. For this reason, to accomplish a sustainable pace of development, this non-product code needs to be handled in the correct way, in the same way as our product code, mitigating Technical Debt as much as possible.
Is your test framework actually designed? Or did it grow without any design or structure? Is your test code under version control? Do you apply Clean Code practices to your test code and Gherkin scenarios? Do you apply static code analysis to your test code? Is your test code reviewed? Is your test infrastructure maintained? Do you track defects in test code? Are your Test Automation Engineers actual Software Engineers?

To summarize, non-product code is code as well. Technical Debt is not only applicable to your product code but to all other code as well. To keep your automated testing up and running without too much effort, one needs to apply proper software craftsmanship to all code and not only to the product code. If not, more and more effort needs to be spent on analyzing “false-positive” reds, slowing down your regular development until it becomes the final nail in the coffin.