If continuous integration (CI) is so awesome … why are software engineering teams still spending an average of 13 hours per failure in their backlog?
CI is often touted as the key to delivering software changes at velocity. It is the fundamental component that allows Agile development and DevOps to work. It offers radical gains in terms of speed, productivity and quality.
According to Gartner, an effective CI practice produces a functional software system on a scheduled and repeatable basis, and teams that are able to adhere to this prime directive stand to realize significant advantages from their build systems (ref. A Guidance Framework for Continuous Integration: The Continuous Delivery Heartbeat).
Yet, a recent study carried out by the Cambridge Judge Business School MBA project reveals that while CI adoption is on the rise (growing from 70% in 2015 to 88% in 2019), software failures in test remain a major impediment to delivery speed.
This is The Great Stink in software development.
A CI/CD environment, test automation, and more advanced testing capabilities like fuzz-testing all help with delivering quality software at pace - but they bring with them a growing backlog of failing tests. Test suites are plagued by hundreds of failing tests that take months of engineering effort to bring under control.
And defects are expensive. Failures in automated tests cause bottlenecks in the development pipeline and increase overall development cycle time. The ensuing waste and inefficiency in the process result in uncertainty, unpredictability, and delays in getting new features to market.
Over 600 million hours a year are spent debugging code in North America alone. This equates to US$61bn in salary costs.
CI requires zero tolerance for software failures in test. Tests must pass reliably, so that any failure signals a new regression.
Intermittent failures spewing out of increasingly complex applications, combined with running thousands of tests every hour or every day, make defect resolution a growing challenge.
We sweep them under the carpet and hope they go away. Except that the bulging blockage in the pipeline stinks like a putrid sewage drain.
It slows development teams down, preventing them from releasing new functionality faster and responding to customer demand before the competition gets there.
Many of these failures are benign; but buried in there are also ticking time-bombs waiting to blow up in production.
So what’s the real problem here? Why do we have a growing backlog of failing tests preventing us from realizing the true potential of CI?
91% of software developers admit to having unresolved defects because they cannot reproduce them (Ref. analyst study on software reliability). That’s the problem: REPRODUCIBILITY.
Reproducibility is the fundamental problem slowing engineering teams down. It blocks their development pipeline and prevents them from releasing software changes at the pace they need to.
So how can engineering teams confidently deliver quality software on a scheduled, repeatable and automated basis?
Software Failure Replay offers a solution by removing the guesswork in failure diagnosis entirely.
LiveRecorder is the leading Software Failure Replay platform. LiveRecorder can record a failed process down to instruction level - capturing bugs in the act. No need to reproduce the issue. All the data engineers need is already in the recording. Engineers can simply debug the recording artifact by replaying it forwards and backwards - getting full visibility and data-driven insight into what their software really did before it failed or misbehaved.
Because engineers get an exact CPU-level copy of what the failed process did at any point in time, they no longer need to waste time recreating the failure. Diagnosis time is significantly reduced, and engineers can get straight to debugging the recording artifact. This reduces the number of loops in agile development cycles and increases development velocity.
The diagram below illustrates how LiveRecorder can reduce the MTTR of software defects and unblock CI/CD pipelines.
LiveRecorder customers are able to:
- Accelerate defect resolution, improving Mean-Time-to-Resolution (MTTR) by up to 10x
- Improve cycle time and delivery velocity through engineering efficiency
- Resolve customer issues faster - protecting their key client relationships