Sauce Labs Makes AI Agent for Creating and Running Tests Available

Sauce Labs today made generally available an artificial intelligence (AI) agent that translates natural language intent into executable test suites that can run anywhere.

Company CEO Dr. Prince Kohli said the Sauce AI for Test Authoring agent closes a gap that has emerged between the rate at which code is being written in the age of AI and the ability of application developers and software engineering teams to validate it. Testing has now become a major bottleneck that is preventing DevOps teams from realizing many of the promises of AI coding, he added.

Without the ability to effectively test higher volumes of code, more applications than ever now have limited test coverage, noted Kohli. Sauce Labs research suggests that even prior to the rise of AI coding, automated test coverage for complex journeys typically plateaued at under 35%.

The Sauce AI for Test Authoring agent was trained using 8.7 billion real-world test runs to enable 41% faster root-cause analysis than a general-purpose large language model (LLM). It addresses that issue because it understands intended application behavior by scanning the application workflow, interpreting specifications written by product managers or analyzing designs found in tools such as Figma. Alternatively, engineers, application developers and product managers can describe application behavior in natural language, noted Kohli.

The platform then autonomously generates complete, executable test suites for web, Android or iOS applications in a way that is now 90% faster, with the option to execute them in the Sauce Labs Test Cloud or run them independently of that platform. That capability doesn’t eliminate the need for humans to be in the middle of the testing loop, but it should eliminate most of the toil associated with creating and maintaining tests, which can now be more easily customized for each application, said Kohli.

The overall goal is to improve the quality of applications while reducing the number of bugs that need to be fixed after an application has been deployed, said Kohli. Application developers should also be able to devote more time to writing code rather than building test suites, he added. In fact, Sauce Labs estimates application developers today spend more than 30% of their time writing and maintaining test suites. Software engineering teams also spend 40% of their working hours fixing flaky tests or maintaining legacy scripts, work that an AI agent will eliminate the need for DevOps teams to manage, noted Kohli.

Many of those legacy tests don’t need to be run at all, or are no longer relevant, noted Kohli. Too many DevOps teams are simply executing a suite of tests by rote regardless of how the application has been constructed or its intended use case, he added.

Hopefully, the rise of AI will lead to better tests being created and run much faster than ever. In the meantime, however, some organizations may be accruing technical debt much faster than they realize simply because the pace at which code is being created now far exceeds their ability to validate it before it is deployed in a production environment.