Mar 24, 2026

The QA Tester Plays to Win: Sabtain Ahmad on AI Agents That Break Games

Game studios spend millions on human testers who click through the same levels thousands of times. ManaMind's co-founder built a vision-language model that plays the game itself — and finds bugs no human would think to look for.

Sabtain Ahmad — CTO & Co-Founder of ManaMind and Speaker at Human × AI Conference 2026

The economics of game testing are brutal in their simplicity. A major studio title ships with millions of possible state combinations: character positions, inventory configurations, lighting conditions, physics interactions, dialogue branches. Before release, human testers must verify that none of these combinations produces a crash, a visual glitch, or a progression-blocking bug. The process is manual and exhaustive, and it accounts for a significant share of a game's total development budget. Studios routinely staff hundreds of QA testers for months. Most of them are performing work that is, in computational terms, a search problem.

Sabtain Ahmad recognised this as a machine learning problem with a specific structural advantage: games are closed environments with defined rules, observable states, and deterministic failure conditions. Unlike autonomous driving or medical diagnosis, the cost of a false negative is a bad review, not a fatality. The risk-reward profile is ideal for autonomous agents.

The Model That Plays

ManaMind's core technical insight was that off-the-shelf vision-language models — the kind that can describe a photograph or answer questions about an image — cannot reliably interpret the visual language of video games. Game UIs are dense with non-standard iconography, layered transparency effects, dynamic camera angles, and art styles that vary wildly between titles. Ahmad's team at ManaMind built a proprietary vision-language model specifically trained to parse game environments: to read health bars, interpret minimap data, recognise quest markers, and distinguish between intended visual effects and rendering artefacts that indicate a bug.

The agent does not simply observe. It plays. It navigates menus, selects dialogue options, moves through three-dimensional space, triggers combat sequences, and systematically explores edge cases that a human tester would need explicit instructions to attempt. When it encounters an anomaly — a texture that fails to load, a character that clips through geometry, a quest that cannot be completed — it logs the precise reproduction steps, the game state at the moment of failure, and a confidence score for the severity of the issue.
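The logging step described above can be pictured as a small data structure. The following is a minimal sketch only, not ManaMind's implementation: the class names `AnomalyReport` and `SessionLog`, the field names, and the 0-to-1 confidence range are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AnomalyReport:
    """Hypothetical record of one anomaly; all field names are illustrative."""
    kind: str                      # e.g. "missing_texture", "geometry_clipping"
    reproduction_steps: list[str]  # ordered actions that led to the anomaly
    game_state: dict               # snapshot of observable state at failure
    severity_confidence: float     # assumed to lie in [0.0, 1.0]

    def __post_init__(self) -> None:
        if not 0.0 <= self.severity_confidence <= 1.0:
            raise ValueError("severity_confidence must be in [0, 1]")

    def to_json(self) -> str:
        # Serialise the report so it could be attached to a bug ticket.
        return json.dumps(asdict(self), sort_keys=True)


class SessionLog:
    """Hypothetical per-session log: records actions, emits reports."""

    def __init__(self) -> None:
        self._steps: list[str] = []
        self.reports: list[AnomalyReport] = []

    def record_action(self, action: str) -> None:
        self._steps.append(action)

    def flag_anomaly(self, kind: str, game_state: dict,
                     severity_confidence: float) -> AnomalyReport:
        # Copy the steps and state so later actions cannot alter the report.
        report = AnomalyReport(kind, list(self._steps), dict(game_state),
                               severity_confidence)
        self.reports.append(report)
        return report


# Usage with made-up actions and state values.
log = SessionLog()
log.record_action("open_inventory")
log.record_action("equip:torch")
log.record_action("walk_into:north_wall")
report = log.flag_anomaly(
    kind="geometry_clipping",
    game_state={"level": "cave_01", "position": [12.5, 0.0, -3.2]},
    severity_confidence=0.8,
)
print(report.to_json())
```

The design point is that the reproduction steps are captured as a side effect of playing, so every flagged anomaly arrives with the exact action sequence that preceded it.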

From TU Wien to the Startup

The technical foundation for ManaMind was laid during Ahmad's doctoral research at TU Wien, where he worked on scalable and privacy-preserving distributed machine learning. His dissertation addressed a problem that maps directly onto the game testing domain: how to train models across distributed environments where data cannot be centralised, where compute must be efficient, and where the system must operate reliably at the edge rather than in a cloud data centre.

Before ManaMind, Ahmad spent three years building AI systems for industrial automation — work that taught him the engineering discipline required to move from research prototype to production system. A research collaboration at Umeå University in Sweden added a cross-institutional perspective on how European AI research translates into commercial applications. The Austrian Academy of Sciences recognised his doctoral work with the Critical Infrastructure Award, a signal that the research community saw the industrial relevance of his approach before the startup ecosystem did.

Why Vienna

ManaMind operates from both London and Vienna — a dual-headquarters structure that places its commercial operations in Europe's largest gaming market and its research base in one of the continent's strongest AI research ecosystems. TU Wien's machine learning group, the Austrian Institute of Technology, and a growing cluster of AI startups create the kind of environment where a PhD researcher can move from distributed systems theory to a venture-backed product company without leaving the city.

The Human × AI Conference audience includes precisely the category of investor and enterprise operator who needs to understand what autonomous agents can do outside the chatbot paradigm. Ahmad's work demonstrates a pattern that the European AI ecosystem is beginning to produce at scale: deep technical research, applied to a specific industrial domain, commercialised by founders who understand both the science and the economics. The game testing market alone is worth billions. The underlying technology — agents that can see, interpret, and act in complex visual environments — has applications that extend far beyond entertainment.

Implications

For game studios: Autonomous QA agents represent the first credible path to reducing testing costs severalfold without sacrificing coverage. ManaMind's claim of up to 80% cost reduction, if validated at scale, would fundamentally restructure the economics of game production.
For AI founders: The game testing use case illustrates a broader principle. The highest-value applications for vision-language models may lie not in general-purpose consumer products but in narrow industrial domains where the visual environment is complex but bounded. European founders with deep research backgrounds are structurally well-positioned for this category.
For conference attendees: Expect a technical demonstration of what autonomous agents look like when they move beyond text — a system that sees, decides, and acts in real time, applied to a problem that every entertainment company on the planet is paying to solve manually.

Sabtain Ahmad joins Human × AI on May 19, 2026, in Vienna.
