If continuous integration (CI) is so awesome, why are software development teams still spending an average of 13 hours on each failure in their backlog?
CI is often touted as the key to delivering software changes at velocity. It is the fundamental component that allows Agile development and DevOps to work. It offers radical gains in terms of speed, productivity, and quality.
According to Gartner, an effective CI practice produces a functional software system on a scheduled and repeatable basis, and teams that are able to adhere to this prime directive stand to realize significant advantages from their build systems (see A Guidance Framework for Continuous Integration: The Continuous Delivery Heartbeat).
And yet, a recent study carried out as a Cambridge Judge Business School MBA project reveals that, while CI adoption is on the rise (growing from 70% in 2015 to 88% in 2019), software failures in test remain a major impediment to delivery speed.
This is The Great Stink in software development.
A CI/CD environment, test automation, and more advanced testing capabilities such as fuzz testing all help deliver quality software at speed, but they bring with them a growing backlog of failing tests. Test suites are plagued by hundreds of failing tests that take months of engineering effort to bring under control.
And defects are expensive. Failures in automated tests cause bottlenecks in the development pipeline and increase overall development cycle time. The ensuing waste and inefficiency in the process result in uncertainty, unpredictability, and delays in bringing new features to market.
Over 600 million hours a year are spent on debugging code in North America alone. This equates to $61Bn just in salary costs.
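To put the two figures above on a per-hour footing, here is a minimal back-of-the-envelope sketch. It divides the quoted salary cost by the quoted hours, then multiplies by the 13-hour average from the opening line. The function name and the per-failure figure are illustrative assumptions, not numbers taken from the cited studies, and the underlying figures may come from different sources.

```python
# Figures quoted in the article.
DEBUG_HOURS_PER_YEAR = 600_000_000     # hours spent debugging per year, North America
SALARY_COST_USD = 61_000_000_000       # annual salary cost of that time
HOURS_PER_FAILURE = 13                 # average hours per backlog failure (opening line)


def implied_hourly_cost(total_cost: float, total_hours: float) -> float:
    """Return the average cost per hour implied by a total cost and total hours."""
    return total_cost / total_hours


# Implied average salary cost of one hour of debugging.
rate = implied_hourly_cost(SALARY_COST_USD, DEBUG_HOURS_PER_YEAR)

# Illustrative only: combines figures that may originate from different studies.
cost_per_failure = HOURS_PER_FAILURE * rate

print(f"Implied salary cost per debugging hour: ${rate:.2f}")
print(f"Illustrative salary cost per backlog failure: ${cost_per_failure:.2f}")
```

On these assumptions, each debugging hour works out to roughly $100 in salary alone, so a single 13-hour failure costs on the order of $1,300 before any delay to the release is counted.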
CI requires zero tolerance for software failures in test. Tests must pass reliably, so that every failure represents a new regression.
Intermittent failures spewing out of increasingly complex applications, combined with thousands of tests running every hour or every day, make defect resolution a growing challenge.
We sweep those test failures under the carpet and hope they go away. But the bulging blockage in the pipeline stinks like a putrid sewage drain.
It slows development teams down, preventing them from releasing new functionality faster and from responding to customer demand before the competition gets there first.
Many of these failures are benign; but buried in there are also ticking time bombs waiting to blow up in production.
So what’s the real problem here? Why do we have a growing backlog of failing tests preventing us from realizing the true potential of CI?
91% of software developers admit to having defects which remain unresolved because they cannot reproduce the issue (see an analyst firm's study on software reliability). That’s the problem: REPRODUCIBILITY.
Reproducibility is the fundamental problem slowing engineering teams down. It blocks their development pipeline and prevents them from releasing software changes at the pace they need to.
So how can engineering teams confidently deliver quality software on a scheduled, repeatable, and automated basis?
Software Failure Replay offers a solution by entirely removing the guesswork in failure diagnosis.
LiveRecorder is the leading Software Failure Replay platform. It can record a failed process down to instruction level, capturing bugs in the act. There is no need to reproduce the issue: all the data engineers need is already there in the recording. Engineers can simply debug the recording artifact by replaying it forward and backward. This gives them full visibility and data-driven insight into what their software really did before it failed or misbehaved.
This workflow reduces the number of loops in agile development cycles and increases development velocity.
The diagram below illustrates how LiveRecorder can reduce the time-to-resolution of software defects and unblock CI/CD pipelines.
LiveRecorder customers are able to:
- Accelerate time-to-resolution of defects by up to 10x
- Improve cycle time and delivery velocity through engineering efficiency
- Resolve customer issues faster, protecting their key client relationships