Chaos Engineering Adoption

Chaos engineering tests the resilience of complex systems, but how many tech leaders have adopted it in their organizations? Read on to find out.

One minute insights:

  • Increasing system complexity is a common reason to adopt chaos engineering
  • Respondents were satisfied with the blast radius of their chaos engineering experimentation but dissatisfied with the vulnerabilities uncovered
  • Most use real-world or live-environment testing and introduce failures at the system access level
  • Respondents called out improving MTTR as a top benefit of chaos engineering, while fear of causing disruptions is a common challenge
  • Respondents prefer that software engineers have experience in chaos engineering

Many engineering leaders are deploying chaos engineering as a way to manage increasing system complexity

How informed are you about chaos engineering?*

82% of respondents understand chaos engineering.

*Respondents who had never heard of chaos engineering were eliminated from the survey.

n = 300

Is your organization currently deploying chaos engineering?

More than half (59%) say their organization is currently deploying chaos engineering.

n = 300

Of those whose organizations haven’t deployed chaos engineering, one-third (33%) are in the process of doing so. 56% feel their organization should deploy chaos engineering, even though it isn’t planning to.

Is your organization planning to deploy chaos engineering?

Increasing system complexity (68%) was the most common reason for adopting chaos engineering. Half (50%) of respondents cited lack of preparedness seen during a system failure, and 49% attributed chaos engineering adoption to unclear technical debt.

What are the reasons your team decided to adopt or is planning to adopt chaos engineering?

Lack of clarity on system limits (e.g., loads, elasticity, etc.) 25%, Moving towards a DevOps or CI/CD deployment model 20%, Increased automation 19%, Increased use of AI/ML 11%, None of these 0%, Other 0%

n = 210

Chaos engineering is crucial for infrastructure reliability and resilience. It simulates failure to measure system robustness, helping build systems that can manage chaotic events. Proper monitoring and alerting, understanding system boundaries and defining a recovery process are all vital elements.

C-suite, professional services industry, <1,000 employees

Would love to be able to start to incorporate it with our current Agile methodology.

VP, retail industry, <1,000 employees

Tech leaders are satisfied with the blast radius of their chaos engineering experimentation but dissatisfied with the vulnerabilities uncovered

Among those whose organizations have deployed chaos engineering, 63% were satisfied with their experiment blast radius. 16% were dissatisfied with the vulnerabilities they uncovered during their deployment.

How are you currently finding team management according to the following aspects:

There is a lot of apprehension in unintended disruption—we’ve been using a ‘warm’ instance to work out the kinks.

C-suite, educational services industry, 10,000+ employees

Having a separate environment to start testing chaos engineering makes the process significantly easier. Once you move to production, kickstart the chaos testing at off-peak times!

VP, educational services industry, <1,000 employees

Most use real-world or live-environment testing and introduce failures at the system access level

Almost three-quarters (72%) of respondents use real-world or live-environment testing during chaos engineering. 63% intentionally introduce realistic bugs.

What approaches do you use for chaos engineering?

None of these 0%, Other 0%

n = 177

Respondents most commonly introduce system failures at the level of system access (57%), application (54%), API (53%) and virtual machines (52%).

At what levels have you introduced system failures?

Chaos engineering makes sense for your highly available and core customer-facing systems. Since it introduces additional cost, it only makes sense once you have certain scale and system maturity.

C-suite, educational services industry, 10,000+ employees

It is important to address the most significant weaknesses proactively, before they affect our customers in production. We need a way to manage the chaos inherent in these systems, take advantage of increasing flexibility and velocity, and have confidence in our production deployments despite the complexity that they represent. Hence we have enforced this now and are seeing how consistently we can remediate failure.

Manager, professional services industry, 10,000+ employees

Improving MTTR is a top benefit of chaos engineering, while fear of causing disruptions is a core challenge

Half (50%) of respondents said that improving MTTR is one of the main benefits of chaos engineering. Other top benefits included uncovering system weaknesses (46%), improving team culture (45%) and improving failure detection (44%).
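As a rough illustration of the metric respondents refer to, the sketch below computes MTTR from a handful of incidents. It assumes MTTR here means mean time to recovery (the survey does not expand the acronym), and the incident timestamps and the function name mean_time_to_recovery are hypothetical, not survey data.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (recovered - detected) over a list of incidents.

    `incidents` is a list of (detected_at, recovered_at) datetime pairs.
    Assumption: MTTR is taken to mean "mean time to recovery".
    """
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents, for illustration only.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),  # 30 min
    (datetime(2024, 2, 9, 14, 0), datetime(2024, 2, 9, 15, 30)),  # 90 min
    (datetime(2024, 3, 2, 8, 0), datetime(2024, 3, 2, 9, 0)),     # 60 min
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Tracking this figure before and after a series of chaos experiments is one way a team could check whether recovery is actually getting faster.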

What are the main benefits of chaos engineering?

Study/optimize how systems operate under failure 27%, Improve understanding of system steady state 26%, Build better processes for handling failures 20%, Provide more context to build systems that can handle failures 18%, Enable testing of rare/unlikely scenarios 14%, None of these <1%, Other 0%

62% of respondents cited the fear of causing disruptions as one of the main barriers to chaos engineering. Lacking an understanding of system steady state (49%) and skill gaps (49%) are also key challenges.

What are the main barriers to chaos engineering adoption?

Insufficient developer/testing environments 17%, Costs (e.g., to create environment, upskill team, invest in tools, etc.) 11%, Lack of executive interest 9%, Lack of engineer engagement (i.e., engineers are not keen to explore chaos engineering) 9%, Lack of tools 7%, Insider threat concerns 6%, None of these 0%, Other 0%

There is a steep learning curve and initial fear, but the potential improvements and results are worth the investment.

VP, software industry, <1,000 employees

Chaos engineering is a complex principle which is very difficult for organizations to embrace completely. It feels very risky and challenging and potentially embarrassing. Organizations instead need to embrace this as a way to solve their problems, not as a way to embarrass people or create new problems.

VP, telecommunications industry, 1,000 - 5,000 employees

Software engineers may need experience in chaos engineering, and most respondents think it will impact software development

Is chaos engineering experience required for software engineer positions?

69% of respondents said that experience with chaos engineering is preferable for software engineers; 12% called it a must.

n = 177

60% of respondents think that chaos engineering will have a role to play in software engineering, while 20% see it becoming fundamental.

What level of impact do you think chaos engineering will have on development teams?

Chaos engineering is one of those once-in-a-generation frameworks that will revolutionize how software engineering should work.

C-suite, software industry, <1,000 employees

My final thought on chaos engineering is that it is an essential practice for modern organizations, as it allows them to anticipate and plan for potential system failures. It is really important to ensure the chaos engineering process is structured correctly and that engineers are given the support and resources needed to execute the tests effectively.

C-suite, professional services industry, <1,000 employees

Respondent Breakdown