Reading Time: 50 minutes

Quality assurance is one of the most crucial aspects of game development to ensure the game functions as intended and to provide a positive player experience. Around $300M in funding has been directed towards AI “efficiency tools for creators” and Christoffer Holmgard, Founder of Modl.AI, and Nathan Martz, Founder of Agentic, are pushing the frontier of how AI can enhance and streamline QA processes.

From “players as a service” to functional load testing, your host Alexandra Takei, Director at Ruckus Games, and guests discuss the history of QA, the various methods developers use to access it, the different types of testing in the ideal test pyramid, and where AI can realistically plug in. Agentic and Modl.AI are production-ready and share stories of successful partnerships with AA and indie studios to improve their testing processes. Tune in for a fascinating episode on the evolution of QA testing and a window into the future where collaboration between humans and AI agents is a developer's everyday life.

Zebedee

Also, big thanks to ZBD for making this episode possible! ZBD provides a plug-and-play API and SDK for seamless integration of instant, borderless, and low-fee payments using the Bitcoin Lightning Network. Want to better engage and monetize your global user base? Start for free at http://zbd.gg/


This transcript is machine-generated, and we apologize for any errors.

Alexandra: Hello, everyone, and welcome to the Naavik Gaming Podcast. I'm your host, Alex Takei, and you know it. This is Interview and Insights. As always, I've got an exciting topic for all of you listeners today. Trigger buzzword, AI. We know that everyone and their mother are talking about the use cases of AI and the billions of dollars that have flooded the AI ecosystem.

In gaming specifically, startups looking to bring AI to gaming across a myriad of categories have closed over 700 million in disclosed funding since 2022. In terms of funding categories, they are broadly broken down into five things. One, efficiency tools for creators. Two, enabling new patterns of play.

Three, AI-enabled game developers. Four, moderation. And five, other. The biggest of these categories is efficiency tools for creators at 272 million, and moderation the most pecuniary at 42 million. Within the slice of efficiency tools for creators, there are a few subcomponents such as art generation and animation, videos, game engines, audio, content creator tools, and finally playtesting and QA, which from the title of our episode you can probably discern is today's topic.

And so today we're going to dive into the use case of AI in the quality assurance stage of production for video games. And to do so, I'm welcoming two founders of two companies leading innovation in the space. My first is Christoffer Holmgard, CEO and co-founder of Modl.AI.

I'm of course going to have him share how he came to found the company, but for our audience, Modl.AI is a company seeking to redefine game development through two products: Modl.test, which is automated game testing, and Modl.play, virtual on-demand players to satiate matchmaking needs. Modl has backers like Griffin Gaming Partners, Rendered VC, and several others.

So welcome to the pod, Christoffer.

Christoffer: Yeah, thank you so much. Thanks for having me. Excited to be here. Woo!

Alexandra: My second guest is Nathan Martz, CEO and co-founder of Agentic. Agentic, and of course Nathan will explain more in depth, is building AI players as a service, aka a service that allows developers to quickly train AI that can play and test their games.

Agentic is backed by 8VC and others. Welcome to the pod, Nathan.

Nathan: Hey, it's great to be here. Thank you.

Alexandra: Awesome. Yeah, I'm very excited to have you on. I was really impressed by the demo that you guys showed at the ETCGDC event. I was like, I gotta get these two guys together.

Alright, so before we get started on all things QA and AI, I gave you guys some brief intros, but I'd love for you to tell the audience a little bit more about you, why you founded your respective companies, and specifically what inspired the names behind your companies.

Christoffer, how about you go first?

Christoffer: Yeah, sure. Absolutely. Happy to. So we started Modl.AI in 2018. Ahead of that time, I'd been in games professionally since around 2008, so I've been doing that for quite a while. And then I've been doing work in AI and machine learning for games since 2011.

And what we do in Modl.AI is really just bringing those two things together, as it were, and trying to help as many game developers with that as we can. As for why we decided to found the company, there's a multitude of reasons for that. But the things that we're trying to address and help folks with in Modl.AI are things that we, as the founders of the company, have faced ourselves when we were building games. I think one piece of original trauma for me was when I had a game crash on live national television, and it was one of the first ones that I did.

So that definitely taught me a lot about the value of good testing and QA. But more generally, I think when we started Modl.AI in 2018, we were seeing that a lot of strides were being made in what could be done with game-playing AI. And some of it was in the service of just advancing AI in general.

And I think we're seeing some of the benefits of it, writ large, now. But because we came out of that space, we thought somebody should really try and work on getting that out to as many people as possible and help them reap the benefits of this technology. That's why we started the company.

And as for the name, at the end of the day, what we do at Modl is we model what players might do in your game. We build models and we do it using AI. It's as simple as that.

Alexandra: Awesome. But you model them without Excel, I'm sure. So you removed the E from model, from the finance word.

Christoffer: That's where I'd had to go.

Alexandra: Death to Excel. That's a newbie modeler's space. I'm sure yours are much more sophisticated.

Christoffer: I love Excel and Google sheets.

Alexandra: Okay. Very cool. I love that journey. And thank you for sharing a little bit about the name. I think that's always super interesting, especially since it usually has a lot of meaning to the founders that found the company.

Nathan, what about you?

Nathan: Yeah. So I, like Christoffer, started out as a game developer, in my case in the early 2000s. I was a game dev for about 14 years at LucasArts and Double Fine. And then eventually I went to Google and was there for seven years. I led developer products for AR and VR and some machine learning work.

And actually, I started my career as a tech artist building tools and pipelines. And I think in many respects the through line of my career has been creating technology that empowers and helps out creative professionals, right? That's what I love most: to build technology that lets humans do what they're best at.

And often you do that by getting rid of the busy work, the stuff that's tedious and time consuming, so that humans can focus on really being creative and expressive. And during the time at Google I started to see, oh, there's this intersection, this confluence of AI and game development.

There might be ways to apply that that really help game developers in a new kind of way. So I met my now co-founders, Stuart and Leo, while we were at Google. And in 2021, we left to found Agentic, the focus of which is to create a service that allows game developers to train AI that can act like players of their game.

At a high level, it's very similar to what Christoffer outlined. Sometimes you want players for QA. Sometimes you want them to validate your infrastructure. Sometimes you want them actually in your live service to be non-toxic teammates or to keep liquidity in your matchmaking pool.

All of that was our thesis. And the thing that we wanted to do most of all was to develop technology in a way that it was AI in service of game developers. It was thoughtful about the way that game developers work, their very pragmatic constraints, their goals, and to make sure that everything that we built, from the algorithms to the infrastructure to the developer experience, really made sense when your day job is making a game, not working towards AGI.

As far as the company name goes, our main goal was that we wanted something that was fun, that was a bit playful, that was suggestive of the work that we were doing with AI. I think it's very easy to have a name that's dystopian, right? A lot of sci-fi AI references are, like, not awesome.

And then also, one area where I'm personally superstitious is that I think you have to be very careful about your company name. Don't tempt fate. If you have something that's like "Jinx," it's like, why do we have all this bad luck? So we wanted something that was, yeah, very cool, positive, good vibes. And when we looked around, we came across this word agentic, which is really about acting with agency, acting in a way that is goal seeking, where everything you do is designed to achieve a positive outcome. And it's like, oh, that's cool. That's kind of what the company is about.

That's what the tool does. And then it's actually been a pleasant surprise that in the last year or two, the word agentic has become part of the larger generative AI conversation.

Alexandra: So yeah, yeah, no, it totally has. And that's actually why I was wondering whether or not that informed the name, but it sounded like the root of it came from something else.

Nathan: And it's just coincidental that it has agent in the name, right? So that's nice, but agentic also comes from, like, cog sci. It predates LLMs and our company, and we were just like, oh yeah, this is nice, it's an easy-to-say word, it's memorable.

And it sort of is evocative of the space that we're in.

Alexandra: That's awesome. Thanks, guys. That gives some good grounding for what your companies do and who you guys are.

And before we go down the AI buzzword wormhole, though, and talk about a lot of the things that you guys mentioned, the different types of testing formats and how we could improve QA, I want to talk a little bit about the QA function in games, its purpose and its history, and how QA needs evolve over the life cycle of a game, from pre-production to production, to alpha, to beta, to ship. And one of the things I want to make clear before we even open this discussion is that QA is often a disenfranchised discipline in games.

I'm sure we've all seen Mythic Quest on Apple TV, but to some extent, QA departments in games have been where people begin their careers and rise through the ranks to become some of the most famous game directors and senior game leaders of all time. But it is seen as that mailroom equivalent in Hollywood.

And there's this idea that QA people sit around playing games all day, which is not true. It's not monotonous, and in reality, QA can actually be, at times, a creative and technical function. So I want to set that tone today: although we're going to talk about how AI can enhance and lighten QA demands for a game, we're obviously in no way diminishing its functional importance.

So Nathan, you're up first. I would love for you to tell me a little bit about what QA is in games, why QA is important to games, and what are the various methods developers use to access QA resources currently?

Nathan: Yeah. To start, I think the most important idea is that when you're building a game, at least the way I've always thought of it, you're building this little virtual world, and contained within that world is basically a set of possible experiences, right?

That's what I love about designing a game: no one experiences it the same way, and as a designer you're trying to shape not just one experience, but all possible experiences. And when you're doing that, there are some experiences that should not exist, right? The experience where your game crashes on national TV, you want to prune those experiences out of the game, right?

Or if the game is unfair or frustrating, you get rid of those. On the other hand, there are positive experiences that you want to have more of. You want to shape the game towards those. And I feel like, on a very high level, the role of QA is both of those, right? Part of the QA function is finding those kind of negative experiences and getting rid of them.

And then another part of it is actually helping refine and improve those positive experiences. As for how they do that, there's a variety of forms of QA, right? You have traditional automation, unit testing, where you have code do it for you. Usually that's covering very atomic parts of the game.

At the other end, you have human QA, where people are playing the game in a variety of different ways, usually collaborating with other members of the development team to understand the behavior of the game. Within human QA especially, you can look at some of the work that's incredibly creative and collaborative.

And I am so grateful for every QA person that I've worked with because, when you have a bug that's hard to repro, or you need to make a tough decision about whether we should change the game this way or that way, those people who are really experts in the game and the experiences of the game are invaluable.

But there are also parts of QA that can be very repetitive, where it's like: okay, there's a giant open world. The player can harvest plants. Make sure that every plant is harvestable. Brute-force search, just run up to every plant, exactly. And so at Agentic, really our high-level goal is to empower members of the team, especially folks on the QA team, to use AI to delegate the busy work, to scale themselves, to have machines do what is often, I think, the least interesting, the least creative work, and then free up the humans to do what humans are really great at, which is refining the experience, collaborating, things like that.

Alexandra: In that, you mentioned so many different types of testing frameworks, and Christoffer, maybe I can lean on you for this. In there, I heard gameplay testing, I heard unit tests, I heard functional testing. Can you sort out what all these things are, and maybe explain the ideal test pyramid, from unit tests all the way up to functional tests, and what those actually mean? Nathan, you talked about how there are tests that are only in code and then there are human players. Maybe sort all of those things by labor and also time intensiveness, because I think for our audience, and even for myself, it's confusing: what are all the different types of tests that you need to do in QA to make sure that your crash on national television does not happen?

Christoffer: Yeah, that's right. That's right. I'm happy to take a stab at it, at least. And I very much agree with what you're saying, Nathan. There are many questions, but there are two fundamental things you're trying to answer with the QA and testing process. One is: does it work?

And the other question is: is it good, or is it doing what we want it to do? And you can think of the testing pyramid as being made up of those. At the bottom of the pyramid, we're trying to answer the question: does it work at all? And we can do that at different levels of scale and with different kinds of methods.

And then we try to move our way up, asking: are we building the thing that we're intending to build? And if we're not, what should we do about it? And how do we react to it? And that's the way of it. I think fundamentally you want to think about QA holistically, as something that permeates the whole development process and is part of every stage of the game's development.

And back to your point, Alex, and your point as well, Nathan, that's also why QA is so intrinsic to everything that we do in the business and maybe, sadly, has been a little underserved tool-wise. We can get into why that's the case in a second, perhaps. But at the bottom of the pyramid, you were talking about unit testing. When you're producing the game, when you're writing bits of code, when you're adding assets, you're trying at the smallest functional level to understand: does this bit of my game work? And that's incredibly valuable in the day-to-day, because it helps you, with every step you take forward, to answer the question: does this part of the program, does this part of the game, do what it needs to do? But then on top of that, once you start putting all these bits together, you start having these emergent phenomena happening between all the different bits that people made, right?

That can be between multiple systems in the code, or just two pieces of code. It could also be the fact that you have somebody who is adding an asset as part of a production flow into the game. And an asset is, in many ways, also a function. It's also a functional piece of the game, right? And now you have multiple disciplines that are interacting, and they're all contributing to the same artifact.

So as soon as we set that into dynamic play and it starts operating, we need to add on these more complex elements of testing, and we start stepping up the pyramid. And so that's when we start going up to things like smoke testing, for instance: will the game run at all? Does it crash when I try to operate it?

Under which conditions will it run? Will it not run? And for how long will it run? So we also need to start doing things like soak testing. We have to figure out, if we keep this running for a long time, do we have memory leaks or other emergent phenomena that only show up after the application has been open or running for a long time?

Then on top of that, we have to think about how this performs under load and on different hardware targets. So we need to get into performance testing as well. That means we have to think about how this runs on the different platforms that I'd like to deploy it on. But also, if we're doing something that has a multiplayer component, for instance, will that also work if 100 clients connect at the same time? If 1,000 clients connect?

Now, at this point, we've only really been addressing the question: does it work? Does it even run? Does it blow up? But that actually takes up the vast majority of the labor that we're talking about here. And for all of those different processes, I think it's also important to say, you can solve some of that with automation, but you still need to have a layer here where, when something does go wrong, somebody still has to have a look at it and say, okay, why did it go wrong?

What are we going to do about it? Who should take an action on this? How important is it? How likely is it that it's going to happen again? Do we even know how to reproduce this? So there are all these steps of actually analyzing the information that's coming out of it, right? Then stepping up from there, that's where we can start thinking about things like functional testing.

And now we start, not quite moving into the part of it where we say "is it good," but we're at least talking about: is it doing what we want? Are the dynamics of the application playing out the way that we would like them to? And so that's, again, where you start relying more and more on human judgment and human intelligence to say: is what is happening in the game making sense?

Is this really the thing that should be happening? But also being smart about the combinatorics and saying: have we tried all the things that we really need to try in order to push this application to the edges? And then finally, moving up from there, considering the game more holistically, we get into the space of play testing, where your team is playing the game.

Is it playing out the way we want? Is it giving that experience that we'd like to have at that point? You're probably doing that both with your internal QA and your external QA, and ideally, of course, probably also with your players or somebody who's representative of your player base. And I think where the role of AI and automation is coming in at the moment is, just like you were saying, Nathan, that if we can take some of the time that's being spent at the bottom of the pyramid and take that away, either through great processes in our organization or through automation or those two things together, then you can really set the people who used to be spending their time there free to operate in the upper part of that pyramid, so to speak.

But for me, the takeaway there is just that there are so many components and so many dimensions to quality assurance. One theory could be that that's another reason it may have been a little underserved from a tooling perspective, because it's a really hard problem to solve without having human intelligence in there.
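To make the lower layers of the pyramid described above a little more concrete, here is a minimal Python sketch of a unit test, a smoke test, and a soak test. Everything in it is an assumption for illustration: `TinyGame`, `harvest`, `tick`, and the memory threshold are hypothetical stand-ins, not anything from Modl.AI, Agentic, or a real engine.

```python
import tracemalloc


class TinyGame:
    """Hypothetical stand-in for a game build; not from any real engine."""

    def __init__(self):
        self.tick_count = 0
        self.inventory = []

    def harvest(self, plant):
        # Smallest functional unit: picking a plant adds it to the inventory.
        self.inventory.append(plant)
        return len(self.inventory)

    def tick(self):
        # One simulation step; a real game would update physics, AI, and so on.
        self.tick_count += 1


def unit_test_harvest():
    # Bottom of the pyramid: does this one bit of the game work?
    game = TinyGame()
    return game.harvest("fern") == 1


def smoke_test(ticks=10):
    # Next layer: does the game start and run at all without crashing?
    try:
        game = TinyGame()
        for _ in range(ticks):
            game.tick()
        return True
    except Exception:
        return False


def soak_test(ticks=50_000, max_growth_bytes=1_000_000):
    # Long-running check: does traced memory keep growing as the game runs?
    game = TinyGame()
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(ticks):
        game.tick()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) <= max_growth_bytes


results = {
    "unit": unit_test_harvest(),
    "smoke": smoke_test(),
    "soak": soak_test(),
}
print(results)
```

A real soak test would run for hours against an actual build and watch process-level memory. The part that carries over is the structure: run for a long time, then compare the state before and after.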

Alexandra: Yeah. First, that was incredibly helpful. I feel much more educated, because of the way that you broke this down: does it work at all? And then, is it functioning as intended? And then, is it good? It's like three different tranches of QA testing, and you actually need all of that, the whole way through, the whole life cycle.

And I can't remember which one of you guys said this, but we were talking about how testing needs change over the course of a game's production life cycle. You said different players have different intelligence.

And it sounds like it's not even just that different players have different intelligence, it's also entirely different types of testing: things that require players and things that actually don't require players at all. And before we jump into the actual components of it, I think this is a great point to switch over to Modl.AI and Agentic's role in QA and what you guys are doing in that actual pyramid. Which of the things are you actually servicing? Because it might be all of them, or a lot of them.

But first, just to give some perspective on how big QA organizations are at gaming companies, because we talked about people spending a lot of time in the unit tests, the "does it even work at all?" at the bottom of the pyramid: I'm not sure what it was like at EA, and Christoffer, maybe you also have an interesting perspective, given your background in some indie studios, but the QA org at Blizzard was like 400 people, if not more. Now obviously this was for many titles, but that is an army, right? And the Activision Blizzard QA union, for example, stands at 600 people strong. So do you guys have any perspectives on the way that you saw QA orgs staffed, to give us some perspective on just how enormous this task is in terms of headcount?

Nathan: I think one thing that's important to qualify, too, is that there's no such thing as "the QA org." Usually you have, for example, dev QA, who are employees, usually of the same company, usually working very closely with the team, committed to one game at a time, experts in that game, just as focused on it as an animator or an engineer.

Then often you have a shared-service QA model, sometimes called pub QA, where it's a broader pool of people that may move from one game to another based on where they are in development. So often they'll have expertise like certification or other common processes that every game goes through at certain points in the life cycle.

And then more recently, there's often been outsourced QA beyond that as well, where you look to a vendor who has another body of people that you can bring on at certain times. And at different studios, different game companies, whether you're indie or AAA, it all looks different.

But I think when we even think about tooling, the workflow, who does what, it depends a little bit on which group you're talking about.

Christoffer: Yeah. Yeah. I agree with that. And I think if we're thinking about it from the indie perspective, or at least from some of the productions that I've been part of, in most productions one of the challenges is that you just don't have enough QA. You don't have access to enough people, or you have to outsource it, but you can only do it for a certain period of time.

And you'd like to have more than you can probably get, or you're reliant on the publisher QA and having those people help you out, often maybe a little later in the process than you might like. So it's also just a case of: do you have those resources at all?

And if you don't have a dedicated resource, what happens then? At that point, you probably start leaning on the rest of the team, and everybody becomes part of that function. So yeah, at different ends of the industry, there might be different makeups: in the bigger productions, you have these large groups, and at the smaller end of the spectrum, you're just missing resources altogether, as it were.

Nathan: Yeah, got it. Okay. I think another take on this is to look at what limits you in terms of the ability to solve things early. What do you need from QA? What QA abilities do you want? So I think one thing you want is actually access to the game, right?

Within these different groups, different people often have more or less access to the actual game itself. Of course, an engineer on the game team has the most access, and usually dev QA has more access than, say, pub QA or outsourced QA.

And so one of the things you often want to do is actually give QA people more access to the game, more ability to shape the game, exercise the game, test the game, do automation. And so I think that's one of the interesting opportunities that both of our companies are looking at, which is that you essentially use AI as a way of expanding the pool of people who have access to the game and its automation.

I think another thing you can look at is just the number of people that you have actually testing the game, or the number of agents that you have. Games have gotten, in my professional lifetime, just bigger and bigger. And I think you can see, if you're an indie team, you're probably using procedural content creation techniques.

If you're a AAA team, you have enormous numbers of engineers and artists to build that game. And if you're doing a game as a service, your players are content. If you have UGC, the state space of the game is even bigger. And so I think there's been this macro trend towards games becoming rapidly more complex.

And that has, I think, generally speaking, really outstripped the ability of traditional teams to just keep up with the scope of the game. And so I think there's this other dimension, which is just how many people you can bring to bear at any one time, which you can think of as scale.

But then the last piece I'll say is flexibility, where game development is not a linear, constant process. There are kind of ebbs and flows to it. And so often, let's say you want to load test your servers. You're doing a multiplayer game.

You want to know: today we're in development, today we have zero concurrent players, and we want to test what would happen if our infrastructure had a million concurrent players. Now, do you need to have a million concurrent players around the clock? No, you're not there yet.

So today folks say maybe you just wait for early access or something like that. But ideally, what you'd want to do is spin up: if you could have a million people jump on your game for a weekend or an evening, collect data, turn it all off, and then go back to making the game, that would be ideal. But that flexibility, the ability to scale up and scale down, you can't do that with live players, because they want to actually play the game.

You can't do that with QA. So if you look at these three things, how much access does QA have, how much scale does QA have, how much flexibility does QA have, I think the kind of technologies that we're building can really benefit all three of those areas. And again, not at all as a replacement for QA, but actually as a complement, as giving the people who do the work the ability to do things they couldn't before, to work at a scale they couldn't before, and to be dynamic in a way they couldn't before.
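The spin-up, collect-data, turn-it-off pattern described above can be sketched in a few lines of Python with `asyncio`. This is only an illustration under stated assumptions: `FakeMatchServer`, `play_session`, and `run_load_test` are hypothetical names, the "server" lives in the same process, and the client counts are kept far below a million so the sketch runs quickly. It is not how Agentic or Modl.AI implement load testing.

```python
import asyncio


class FakeMatchServer:
    """Hypothetical in-process stand-in for a game backend."""

    def __init__(self):
        self.connected = 0
        self.peak = 0

    async def play_session(self, client_id):
        # Track concurrency so we can see how many clients overlapped.
        self.connected += 1
        self.peak = max(self.peak, self.connected)
        await asyncio.sleep(0.01)  # pretend the client plays for a moment
        self.connected -= 1
        return client_id


async def run_load_test(server, num_clients):
    # Spin up every simulated client at once, then wait for all to finish.
    sessions = [server.play_session(i) for i in range(num_clients)]
    finished = await asyncio.gather(*sessions)
    return len(finished)


async def main():
    server = FakeMatchServer()
    report = {}
    for num_clients in (100, 1000):
        server.peak = 0
        completed = await run_load_test(server, num_clients)
        report[num_clients] = {"completed": completed, "peak": server.peak}
    # Once the test ends, nothing stays connected: scaled back to zero.
    report["still_connected"] = server.connected
    return report


report = asyncio.run(main())
print(report)
```

The design point is the one made in the conversation: simulated clients can be created for the length of a test and discarded afterwards, which live players and human QA teams cannot.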

Christoffer: Yeah, I think if I can latch on to those three concepts, access, scale, and flexibility: especially when you're thinking about scale and flexibility, it doesn't even have to be that much before you start really enabling teams altogether, if you can give them access to a number of virtual players. And I'll give you a couple of examples. We've talked to a number of multiplayer studios where the core team is smaller than the number of people that the multiplayer game is intended for.

So that means that if you want to play test your game, for instance, or just see if it works, even if you take every developer in the studio and ask them to spend their Friday afternoon doing that, it's still not going to be enough. And so what that means in terms of flexibility is that it's very hard to get a proper test of what it is that you're building, and that in turn means that your iteration cycles become long.

And I think most people who are in game dev would say that shorter iteration cycles are usually preferable, because they give you a faster path to understanding, again, to answering that question: is it good or not? And that's also a place where I think these AI tools can help out a lot, because we can start tightening some of those iteration loops that are going on in development.

I think everyone can save time, save effort, but also get faster to the creative goal that everybody on the team is trying to reach.

Alexandra: I was going to ask you both to pitch me on why studios should look towards AI for QA, but I'm sold. What you've already said seems to be incredibly valuable.

Shorter iteration cycles, and even the capacity to perform the tests at scale. Access and flexibility, Nathan, as you said. So I guess this is a great time to actually start shifting towards talking about Modl.AI and Agentic specifically. We have this framework: we understand what QA encompasses, we understand how big orgs have been in the past, we understand how indies have got it done in the past, and we understand what the QA needs might be over the course of development.

I would love to hear from you guys about where your companies are starting. Where in the development cycle are you guys, as your specific companies, based on where you are in your own product roadmap today, equipped to help? What genres of games are you equipped to help with? And does that actually change based on whether you're on a mobile platform or maybe making a third-person shooter?

Nathan, maybe you can kick us off here. Who's the first QA customer of Agentic, and what do they look like?

Nathan: Yeah. We basically just announced at GDC this past year, which was pretty fun. We're working with our first design partners right now.

Those mostly look like double-A and triple-A studios. As a company, our big focus, our mission, is that we want to serve game developers, right? Help them be more effective. And so we're looking at the areas where we think our technology has the highest marginal utility, where there's a significant improvement in using a technology like Agentic relative to the next best thing.

And so we think that real-time games are certainly where you can benefit a lot more, because you have this very complicated simulation, and we're currently focused on avatar-based games, where you control a particular character in the world.

So it'd be first person, third person, MOBA. Fundamentally, within that space of genres, our approach is that we want to give game developers on the team, and I include the QA folks in that, a tool to create AI that does useful work for them.

Because we think that the broad pattern is, you want more players of your game. Christoffer made a great example of that: it's a battle royale, there are 30 of us at the company, and we have a hundred players per match. That's a case where it's QA, though not usually in a formal sense, and you want to be able to inject more players into your game.

If you have your unit testing and your human testing, where you're going to test very little, and you want to automate more of the busy work of smoke tests, often you want to do that by injecting more players too. Now, these players may play in a different way, but overall, you want agents to interact with and play your game.

And the thing is that every game is different, so you can't take a one-size-fits-all approach. You need a tool that can adapt itself, or be adapted, to the specifics of each individual game. And also the workflows and the scenarios that developers care about vary from one company to the next.

And so our approach at Agentic overall is, rather than saying, okay, we will give you this turnkey thing and you must use it this way for this purpose, it's much more like a behavioral Photoshop. We will give you a tool that allows you to define, train, and deploy agents that are useful for you.

And I can talk about the tech behind that a little bit, but at a high level, it's about empowering these teams with the tool, especially one that allows non-programmers, non-AI engineers, a wide variety of folks on the team, to create these agents that do useful work as you define it.

And then once defined, you can deploy the agents, whenever you want, wherever you want, however you want.

Alexandra: Yeah, I am super excited to talk about the how there and what it actually looks like from the developer perspective. So we're definitely going to go there. But Christoffer, any reactions to that?

Are things similar or different at modl.ai? For you guys, where in the development cycle are you focusing? And what gaming platforms, what game genres, and what types of studios are you looking to work with first?

Christoffer: Yeah, for sure. I think in terms of the core problem, very much agree with everything you were saying, Nathan.

At modl.ai we're also very oriented towards bringing, quote unquote, AI that works. So it does useful work, but it also just pragmatically works in productions. We'd rather give you something sooner that'll move the needle for your team right away than sell you on a big bang project or anything like that, right?

In terms of how we've gone about it, I think we also start with the type of game, because that determines what kind of AI is good for solving practical issues in it. So yes, we've also focused a lot on avatar- or character-based games, games where there's one singular player character that you can ask the AI to control. That also makes it perhaps easier to learn from what players are doing. That's part of what we do: some of our approaches observe what players are doing and then replicate that inside of the build.

And then the second thing that we've been looking at is actually just the GUI layer, and all the things that are core gameplay, but also the things supporting that. So that could be, obviously, if you have a game that's completely GUI, like a narrative game or something, right? A 2D narrative game might be put together that way.

But also if you have a store and you're releasing quite often. That's just something that we've found to be valuable to many teams because, again, like we were talking about before, you're putting in a lot of assets, you're making a lot of changes all the time. You might be releasing weekly, daily, multiple times a day, and there are just a lot of things that can slip through the cracks there.

So that's another place where we can help folks out. In terms of the studio types, I think we've done a lot of our work recently together with double-As and some triple-As. I think what characterizes those organizations is also that typically you have that functional specialization.

So you have somebody to sit on the other side and be your champion or your ally that receives it. It's always good to have somebody who understands the mission you're on and can relate to it. We actually do have a product out that indies can install and use on their own, that takes some of the functionality.

But generally what we're also trying to do is to provide our tool to folks in a way where we help you get started. We'll give you support to get going with it. And then over time, you make it yours by extending it into the functions that you need it to do, because as you were saying, Nathan, every game is different.

So the only team that can really define what AI needs to do for your game is your team, right? We can help you. We can give you tools. But at the end of the day, you'll know where it moves the needle for you. And we tend to approach it with three different things in mind.

One is that what we're putting together needs to be informed by the craft of game making, and it needs to understand the workflow that devs are going through. Then you need to have AI behaviors that are practically useful, but I don't think they always need to be human-like. Sometimes you just want a bot which pushes every button, or tries to visit every part of the map, or anything like that.

So we tend to think about it as almost like an intelligence ladder for AI agents or AI bots, where some are useful for some things and others are useful for other things. And then the third thing that we bring is an infrastructure solution that you can either have with us or deploy yourself.

So often, if it's a barrier to figure out how to spin up a lot of instances to test your game at scale, that's something that we can help out with as well. And target-wise, we've built out our tooling mostly for Unreal and Unity, but we can technically support most targets.

For the big engines, obviously, we try to have a nicer interface for folks to work with.

Alexandra: Yeah. Nathan, I think it was you that was lamenting the relationship between AI and games at present: games are being used to develop AI techniques, versus AI companies helping game developers make good games.

It sounds like, from the context you're both providing, you already have tools that are production-ready and are working alongside QA devs at this moment in time, for some double-A and indie studios, in a way that's actually pragmatic and applicable. So I would love to hear a success story, Nathan and Christoffer, about something that you guys have done alongside a studio, how that has actually gone down, and how you ensured predictability.

You guys talked about this phrase called the deterministic hammer, which I would love for you to elaborate upon for our audience about what that is. So yeah, we'd love to hear about how that's actually working day to day for some studios right now.

Nathan: Yeah. Christoffer, you want to start?

Christoffer: Sure, sure. Maybe it makes sense to take just a specific case that we've worked with before. It's a game that's shipped now, but we worked with them for a year and a half, a couple of years. This particular title was built by, I think, a 50-person team, around that size.

So it's where you have some dedicated QA resources, but you wish you had more. This team was building a multiplayer looter shooter with, not procedurally generated maps, but dynamic map loading and big areas and things like that.

And they were facing the fact that it would take a human tester about 40 minutes to play through a level, with story content and everything, right? And they were shipping the game with a hundred levels. And then with variations, it was going to be 300. It was just way out of scope for what they could do with that team to have continuous testing in there.

So what do you do about that? For this particular title, the thing that ended up being, I think, the most valuable to them was consistent and continuous performance testing of the game. The flow that they got into was that every week their builds would go to our test servers.

We would scale it up and test all the levels overnight. So that means that when you leave the office Friday and show up Monday, you basically have smoke tests and performance tests and functional tests across all those different maps. And then that informs you: which of these do I send on to human QA?

Where do we just know that something is totally broken? And maybe also crucially, you can look at the maps week to week and say, okay, is the performance getting better? Or did we actually make a code change that now completely broke this one map? What do we do about that?
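Editor's note: the week-to-week comparison Christoffer describes can be pictured with a small sketch. This is not modl.ai's tooling; the `MapResult` record, the `triage` function, the frame-time metric, and the 10% tolerance are all assumptions made up for illustration. It only shows the idea of flagging maps that crashed or got noticeably slower since the previous run.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MapResult:
    """One bot run on one map (hypothetical record, not a real modl.ai type)."""
    map_name: str
    avg_frame_ms: float  # average frame time observed during the run
    crashed: bool


def triage(last_week: List[MapResult],
           this_week: List[MapResult],
           tolerance: float = 0.10) -> Dict[str, str]:
    """Flag maps that are broken or measurably slower than last week."""
    previous = {r.map_name: r for r in last_week}
    flagged: Dict[str, str] = {}
    for result in this_week:
        if result.crashed:
            flagged[result.map_name] = "broken"
            continue
        before = previous.get(result.map_name)
        if before is None or before.crashed:
            continue  # no usable baseline to compare against
        # Slower means frame time grew by more than the allowed tolerance.
        if result.avg_frame_ms > before.avg_frame_ms * (1 + tolerance):
            flagged[result.map_name] = "slower"
    return flagged
```

Maps that come back unflagged could go straight to human QA for deeper checks, while "broken" and "slower" maps go back to the developers first.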

So it becomes a way of taking the temperature of your game over time. It's interesting: we could see in the usage from the studio that every time they were getting closer to a milestone, or shipping a new feature, or going to do a release, that's when all of that testing consumption really went up, so you could see the team using it.

And I think that's one of the most validating things when you're working with a team: that you can actually see where they are in their process, and that what we're doing seems to be making a difference to them, because they're using it more at key points in the development process.

Nathan: Yeah, a couple from our side. One: there's a studio we worked with last year. It was a similar-size studio and a PC/console game. It's a kind of action brawler with procedural world generation, designed for between one and eight players. We started working with them at alpha, when the team was getting a handle on the game itself, but they were realizing that their number one risk was multiplayer, especially when you had eight concurrent players, which just stressed everything more than when it was single player.

Except they had two full-time dev QA. And so they had this basic problem. They're like, we want every test to be an eight-player co-op test, but we don't have that many people; that's actually bigger than the test team itself. And so the way we helped them is we created agents that were kind of like companions, where you would have a human, and the human would be the implicit leader.

And then the agents would join, and you could choose whether they follow along and be good buddies or just be randos and go off in their own direction. But the thing that we do is that our integrations, whether it's Unity or Unreal or proprietary, integrate at the player controls layer.

This is a bit in the weeds, but a lot of bot frameworks bypass the player controls. They actually connect to the game at lower-level systems, so they look like a player, but they don't actually interact with the game the way that a human does. And we think it's really important that if you want to truly exercise the game like a player, you have to integrate in the right way.

And so in our case, we integrated like a real player. Basically, whenever a tester would test the game, they would fire up the main game and then seven little headless game instances that had agents in them, which all joined the same game instance.

And all of a sudden, lots of: oh my goodness, there's a crash when this happens, latency spikes here. In a lot of cases, the developers had the ability to catch those defects on their own. They just didn't know they were there without having a consistent suite of players.

Another thing that I think is also interesting is that even though QA is the obvious first application of this technology, we really see it as a tool for the entire dev team. As another example, let's say you're an animator, and you're iterating on the move set of a character.

Your tool of choice is Maya. You're animating, you're trying to work with real data or mocap data and your own inspiration, pull it all together, and export it to the game. But in practice today, you often have to just sit there and play the game to do a move over and over, and then go back, work in Maya, and play the game some more.

And that iteration cycle, like Christoffer was saying: the longer that iteration cycle is, the less productive you are. If you just train an agent to do this move over and over again, or do this combo, now you can just let the agent do it.

This is AI in a very simple sense, but if you can define that and train it and deploy it without writing any code, without knowing anything about behavior trees or state machines, you're like, hey AI, do this thing over and over again. And now you can iterate on your animation. Just look over: oh, that's looking better, keep it.

And now you've got a move set that you like. Is that an AGI kind of application? No, but it's incredibly useful and productive, right?

Alexandra: Yeah, and maybe that's a lot about the, the distinction, like you said before, about the relationship between AI and games, this being useful to game developers versus some sort of AGI statement, like you said before.

And first of all, that was super helpful, because from the studio perspective you're asking, okay, how does this actually work? And it sounds like, at least from the Agentic side, it's almost like being in a Final Fantasy party-based combat system, where you're like, okay, I'm playing Kingdom Hearts.

And Donald Duck, I'm going to set him to utility player, or I'm going to set him to aggro, or he's only allowed to heal me at all times and never allowed to attack anybody, right? You're setting the customized behaviors of the characters or the other players in the game to be alongside you and to help with that QA testing.

Is that generally the way that studios can think about it?

Nathan: Yeah, I feel like Christoffer and I could probably get into the technical weeds for a long time, and I'll try to hold myself back from getting too deep. But with Agentic, basically we provide SDKs for Unity, Unreal, and proprietary game engines.

So you pick the SDK that matches your game engine and you integrate it. And then we provide you with, depending on your game engine, different ways of defining an agent. Agents are fundamentally defined by the actions that they can take. And that's jump, move, look, fireball, use item, whatever.

And then you also define what are called observations, which for us are the parts of the game state that the agent cares about. So in a third-person shooter, you probably care about the enemies, you might care about your next objective and where that is, and you probably care about the static world around you, obstacles and things like that.
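Editor's note: to make the "actions plus observations" idea concrete, here is a minimal sketch. It is not the Agentic SDK; the `AgentDefinition` class, its fields, and the dictionary-shaped game state are hypothetical names invented for this illustration. The point is only that an agent is declared by what it can do and by which slices of the game state it gets to see.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class AgentDefinition:
    """Hypothetical agent declaration: allowed actions plus observed state."""
    name: str
    actions: List[str]
    # Each observation maps a label to a function that reads the game state.
    observations: Dict[str, Callable[[dict], Any]] = field(default_factory=dict)

    def observe(self, game_state: dict) -> dict:
        """Return only the parts of the game state this agent cares about."""
        return {label: read(game_state) for label, read in self.observations.items()}


# An assumed third-person shooter agent, mirroring the examples in the interview.
shooter_agent = AgentDefinition(
    name="third_person_shooter_bot",
    actions=["move", "look", "jump", "fire", "use_item"],
    observations={
        "enemies": lambda state: state["enemies"],
        "next_objective": lambda state: state["objective"],
        "obstacles": lambda state: state["obstacles"],
    },
)
```

Anything in the game state that is not listed as an observation (weather, UI state, and so on) simply never reaches the agent, which keeps its view of the game small and game-specific.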

So we give developers tools in the SDK, in the game editor, to define their agents. But then once you've defined them, you actually need to tell them what to do. You need to train them, quote unquote. And with Agentic, we provide a conversational, human-in-the-loop approach for defining, training, and deploying agents.

So once you've integrated the SDK, the way that you train a new agent is that you launch your game, we give you a chat interface in our SDK, and you talk to the agent and tell it what you want it to do. If you say, I want you to be a healer and heal under these conditions, the agent will actually write code against a kind of scripting language that's part of our SDK and customized per agent, per game.

And so essentially the AI will try to write a script to do what you asked it to do. We'll do some validation. We'll deploy that into your game live. You can see the agent actually do its version of what you asked it to do. Sometimes it's perfect and you can basically approve it, or you can give it additional feedback like, ah, too much healing, or, ah, actually, I forgot to tell you this important detail.

So you essentially have a conversation with the agent. And what you're doing is giving that agent a set of skills, and often skills build on top of other skills. A lot of what we do is allow you to do that without writing any code.

There's a lot that we do to actually allow an LLM to write code that's functional in the context of a game. We're an adapter in that sense. And then as you train skills, the agent can learn them and memorize them, and you can deploy the agents later on with those skills. You can do that conversationally, or through an API if you want to do it with automation.

But there are lots of really interesting reasons why we chose that approach. There are other approaches that are also compelling in different ways. But I think some really high-level things that we believe in are that it's human-in-the-loop and that it's a tool, right?

It's tempting to think of AI as a magic lamp: test the game, genie, I have three wishes, and then somehow magic happens and it all works. But that's not how this works at all. Really, we need to leverage what AI is really good at, like scale, flexibility, and writing code on your behalf, and also leverage what humans are good at, which is creativity, context, and purposefulness.

And so the way that we see that working is, rather than saying here's a magic lamp, here's a black box, we say here's a tool that gives you some new superpowers. Fundamentally, because it's a tool, it has an interface, and you can use it in different ways. You actually get a final say over whether the behavior is what you want. That kind of developer experience framing, with the infrastructure and all of that in service of the human, is what we're excited about and what we've really built our product around.

Christoffer: Yeah, I think philosophically we're coming from very much the same place and also from the same starting point: if you have an agent in the game, it needs to know two things, right? It needs to know what it can do and what's going on. That's the starting point for almost any agent-based approach, right?

We also have an SDK, and we take it as the starting point for the developers to get going. We do a setup that sounds a lot like how it works with Agentic: ask the developer, because nobody knows the game better than the developers themselves. It would be a total waste to try and guess what the game is about instead of just having the developers tell you by setting it up with the SDK.

So that's our core approach as well. And what that gives you is a summary of what the game is about, and that's the perfect bit of expert information to get your AI started. Once you have that in there, I think that's where, in the modl approach, we diverge a little bit.

I think that's a great approach that you're taking, and I definitely see strengths with it. We approach it a little bit differently, with some pre-packed behaviors that can get you going without any input other than setting up the SDK. So that's where we try to move the needle for devs out of the box: you don't have to tell it anything, you don't have to do anything.

Just set it up and let it go. It's not going to play like a human, but it'll do something for you. And then the way that we think about informing the agent about what it's supposed to be doing inside of the game is very much based on showing, at the moment. A lot of our approaches are rooted in the fact that QA and dev teams in general will have a lot of test cases defined, and they will be running through them daily, weekly, monthly, whatever, over and over again.

And that's a perfect starting point for asking the developer to show me what this game is about.

Just demonstrate for me how you would do this, because I think we're very rooted in the notion that if you have done something five times over in the same day, it should be automatable. We just have to try and find the right technological way of doing it. And so we build a lot of our approaches on leveraging the fact that you already have a team that is executing these functions.

We can learn from seeing what they're doing in the game, and then, by virtue of doing it in the game, they also demonstrate what should happen inside of the game, and that becomes the basis of building up the behaviors. So it's a similar approach, but based more on seeing what the people who are building the game are doing with it and then replicating that in there.

Then once you get enough of that behavior, we can start merging that data together and building models that are typically more adept at playing the game in a general fashion.

Alexandra: Yeah. So it sounds like there are two slightly different approaches to the same kind of situation, where you integrate an SDK and you're basically either empowering an Agentic agent to play a certain way, or, Christoffer, you're saying that modl.ai has these preset agents and models. The agents and models come to my game and they help me out.

And so I think this is a great time to transition to what this actually looks like in terms of a business relationship between yourselves and the studios.

Ostensibly, it sounds like the studio would partner up with you guys. Walk me through what that onboarding looks like, and also tell me a little bit about what this costs to run for the studio. Obviously we're potentially saving time, but one could presume that some of these large-scale tests are expensive.

Who's bearing the costs for those tests? Is it the studio? Is it you guys? Nathan, how about you kick us off.

Nathan: Yeah, sure. So we're a SaaS business, software as a service. Fundamentally, you pay per unit of agent time. If you have a thousand agents play your game for an hour, or one agent play for a thousand hours, it's the same cost.
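Editor's note: the "per unit of agent time" model is simple arithmetic, sketched below. Agentic's actual pricing is private, so the rate used here is a made-up placeholder and the function names are hypothetical; the sketch only shows why a thousand agents for one hour and one agent for a thousand hours bill the same.

```python
def agent_hours(agents: int, hours_each: float) -> float:
    """Total billable agent time: number of agents times hours each one plays."""
    return agents * hours_each


def usage_cost(agents: int, hours_each: float, rate_per_agent_hour: float) -> float:
    """Cost under a flat per-agent-hour rate (the rate is an assumed placeholder)."""
    return agent_hours(agents, hours_each) * rate_per_agent_hour


# Both deployments consume 1,000 agent-hours, so they cost the same.
wide_run = usage_cost(agents=1000, hours_each=1, rate_per_agent_hour=0.5)
long_run = usage_cost(agents=1, hours_each=1000, rate_per_agent_hour=0.5)
```

Tiered pricing would change the rate at different usage levels, but the unit being metered stays the same.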

With our customers, we're invite-only right now, so the pricing is all private, but it's essentially tiered, usage-based pricing. Like Christoffer was saying earlier, there are seasons where you're doing a lot of automation and there are other seasons where you're doing less.

And so people need the flexibility to go up and go down. We've also heard from a lot of customers that the thing they're most afraid of is that some script goes rogue and they wake up to a million-dollar cloud bill or infrastructure bill they weren't expecting.

So they want, and that sounds good to me, capped costs: the ability to have predictable expenditures. But really, conceptually, we think of our agents as little virtual contractors. You can deploy them in your game. They'll be confidential, right?

So they won't tell anyone else about the specifics of your game. They'll learn as they play your game, and they'll learn how to do it better. They may also carry some of the meta-learning from one game to the next, even though they won't carry any of the specifics of your game from one to the next.

Christoffer: Interesting. Yeah. And over on our side, we're also a SaaS business, so we're taking the same approach here. We want to provide value to our customers by having them want to use our products, and we want the testing to be helpful for them.

That's also why we take a consumption-based approach. It's very straightforward: we want you to want to use this product, and the more you use it, probably the better your game is going to be and the better off we're going to be. That's how we succeed together in the industry, and we want to be part of that movement.

So that's how we've set it up. With modl.ai, we typically set you up with a running subscription package that gives you hours of testing. That testing will involve one of our bots in one way or another. Basically, we'll spin up machines that will run for the allotted amount of time that you'd like to test your game for.

And that can happen on demand. Some teams like to manage it themselves and be very manual about it: we want to test now. Other teams want it to be part of their continuous integration pipeline, where given certain conditions, they just fire off a build, it goes up there, and we spin up 10 or 20 or a hundred copies, run the testing, and they consume the results on the other end.

So that's the approach that we're taking. In terms of the onboarding experience, that's something we've spent a lot of time talking to customers about, experimenting with, and trying to figure out: how to get these tools to fit into people's workflows. And with the kind of customers that we have at the moment, the conversation often goes along the lines of: we need help, but we're really busy, and probably too busy to get it on our own.

So what we've decided to do at modl.ai is that if you decide to onboard with us, one of the things that we set you up with as you're coming in is what we call our developer success engineers. It's basically a person who will sit next to you and make sure that you get the tool implemented in a good way, so that you get the most value out of it.

They'll help define your first test cases with the AI automation, all those different things. Because at the end of the day, as we were talking about before, it's all about the team and the people who are building the game, and you need to help them adapt to it.

And we've just found that if you have somebody to get you going, that usually can help drive that transformation in the production. And when the team's ready, you can hand over the keys, and they can start building out with the tools on their own.

We never go away, we're always around for support, but the idea is that we get you going, and then you take it away when you feel that you're ready.

Nathan: Yeah, we have a similar model, and that's part of the reason why we're invite-only right now: we're working with select studios to make sure that we can provide that kind of high-touch onboarding experience.

I also agree with Christoffer that there's this interesting balance that you want to strike. Like I said, this is a tool, but it's a new kind of tool. And so teams need, at minimum, some help: how do I use the tool? What's the most effective way? In some cases there's more than that.

There's the integration: we actually want some help not just understanding it, but connecting it into the details of our game. Sometimes it's not just integrating it, it's setting things up. I think it's important for companies like ours to be available in that way to help get people off the ground.

But I think we also both see this other trap, where if you do too much of that for too long, the game teams don't really internalize it. Ultimately, what I think both of our companies aspire to is a tool that the teams value, not just because it's work that somebody else has done, but because it makes their day-to-day job easier.

And then the more they value it, the more they use it; the more they use it, the more they're able to own it. And that's the path from early, high-touch support to a sustainable, scalable business.

Alexandra: Yeah. In a way, you're actually quite similar to the backend gaming engine business, where it would take way longer for me to build a QA org by myself, and for smaller studios especially, an almost undoable amount of time.

So this is a little bit of a build-or-buy situation. And what you guys are basically suggesting is that your business models should be saving these studios a tremendous amount of money, right? Whether you're on a subscription package or on a consumption-based model.

At the end of the day, my org before needed to be 70 QA people, and now it can be two QA guys partnered with either Agentic or modl.ai.

Nathan: Yeah, I don't know that I would say don't hire the 70 people. I would say still hire them; it's just that the 70 people can do way more than smoke testing the build and harvesting all the plants.

And I think it's really important not to see this as a zero-sum game of humans-versus-robots fights. The more humans, the better. And even if you have the 70 people, I guarantee all of them can be more productive and help you make a bigger and better game if you add a technology like modl or Agentic.

Christoffer: Yeah, historically, automation has usually led to more growth and more productivity, and basically more prosperity being created. Sometimes I call it the spreadsheet effect. It sounds like a boring story, right? But when VisiCalc and the first spreadsheets came out, everybody thought, oh, accounting is going to be automated, right?

There's more accounting than ever happening in the world. And that's the way it tends to go with this kind of automation. So, not to be pessimistic about that side of it.

Alexandra: Yeah. And you were drawing to a point that I was going to try and close on, which is that this is the cyclical pattern of industries, which leads to more leisure time. I started reading a book about generational change, talking about this idea that back in the day we spent a million hours harvesting corn, and now we have machines to help us harvest corn, which has allowed us to make the iPhone.

That sounds like a pretty good life improvement. And perhaps the wave of AI that's happening in games is going to be very similar, where it's going to allow, like you said, Nathan, the humans to do the things that they're best at while taking away some of the grunt work.

Earlier in this episode, we talked about the ATVI QA union. For those of the audience that have been keeping up with the news, Activision Quality Assurance United-CWA formed to protect QA salaries and labor protections, and overall to fight for their status and rights to stay at parity with developers and other software engineers.

Alexandra: This was initially sparked by QA testers at Raven, but it grew over time. And I think, Nathan, to your point, you're not saying don't hire these 70 people, just empower them to do more. But at the same time, the appeal of tools like this is to allow a studio that would otherwise be unable to do full production testing to fit it into its budget.

And this isn't just a QA thing; this is the entire value proposition of the efficiency tools sector for games. And so I think it's important for us to just address that head on: it is actually going to allow us to move away from brute forcing things. But brute force before cost money, which I think is obviously where the business value proposition comes from for your companies.

Alexandra: But guys, this is such an amazing episode, and I learned so much about the different types of QA testing and where Agentic and modl could really impact a studio. And I have a final question for you guys before we close and wrap. We're 10 years into the future. What do you guys think the future of QA testing looks like?

And how do you envision that relationship between AI and man to be? Nathan.

Nathan: Yeah, I think given the rate of progress that we're seeing in machine learning and AI, 10 years is gonna be a long time. I think any prediction made today about 2034 should be taken with a giant grain of salt.

But I think there's a few things. And again, I'll go back to our company name, Agentic, right? Helping people achieve their goals. And my hope is that in 10 years, or well before then, we're a technology that helps everyone on the dev team achieve their goals more effectively.

And if you're a tester, your goal is to cover the whole game, which is very hard to do given procedural content creation and UGC. So we'll give you a tool to scale yourself and delegate the busy work. One of your goals very often is actually to be closer to design, right?

If you go into QA as a career, often it's because you care about games and you want to help them be more fun. We're going to free you up to spend more of your time on not "does it work" but "is it enjoyable." For the rest of the dev team, same story, right? You're an animator.

You want to focus on your craft. The agents will help. Designer, engineer, whatever, you want to shape the experience and give your players a better experience. We'll help you there. But honestly, I think the 10-year vision is way bigger than that, where we're going to see agents permeate software development.

It's the same way that right now, if you do a task on a computer, you use a GUI, right? Your experience is mediated and facilitated by windows and icons. And that helps people who aren't programmers, who don't like console commands, use computers. It helps all of us be more productive.

There are many things that you can't do without one. If you want to paint on a computer, you can't do that in a command line. I think that agents are really going to be the next generation of interface, where more and more of your work will be facilitated by, mediated by, and done in collaboration with an agent. And I think agents are going to be increasingly capable, intelligent, directable, and collaborative. So in that 10-year time frame, what you're going to see is everyone on the team, maybe even without fully realizing it, using agents in different ways in order to be more productive and more creative, and hopefully make games that are richer, more rewarding, and more polished than what we can make right now.

Alexandra: Awesome. That's a beautiful future.

Christoffer: Adding on to that, yeah, 10 years is such a long time in terms of what we've seen just in the last two to three years. But I also think you're going to have these assistants or these agents or these AI features, and they're going to ask you questions like, what would you like to happen?

So again, if we think about the quality of a product, it's going to be less about "press, press, does the sequence work?" It's going to be: is this what we would like to happen? Is your game supposed to work this way? I made this change. Do you like it? I think it's entirely realistic to think about having agents or bots that interact with games, that actually play them, right, but then produce ways of seeing what happened inside of the game and automatically identify issues. Not just functional things or code-level things like crashes or missing assets; you could easily imagine that you'd be able to detect things like texture glitches, or maybe even identify when certain constellations of things in the game scene, as it's being rendered to the player, are not really exactly what you're looking for.

And then 10 years down the line, it's absolutely thinkable that the agent would then also go back to the game project and suggest a change to it, right? So there could be another leap where you ask what happens after we identify the bug: then we go and make the suggestion of how to change it and fix it. And then the role of the tester really becomes the decision maker in that process, answering these questions: is this what we want to do?

You're directing the process of what the game is becoming over time. So I think that'll be a big part of what testing looks like some years down the line. And then I think the other thing is that games as experiences are going to become even larger than they are now.

That would be my guess anyway, just because we now have tools to generate more content faster. And players are participating in that as well with UGC and everything. But that then also causes you to ask the question: which player sees what part of the content? What part of the content should go to whom?

As we're bringing people into these games, I would imagine that the testing role also becomes part of annotating the experience, building a map of the game that is now so large that no one can see all of it in one go. So what's appropriate for which person? That's a natural extension of the design intent that you have when you're building a game: again, answering the question, is it good, but also, who is it good for?

Nathan: Yeah. So I was just saying, I agree with this. Really, there's that commercial from the nineties with the two QA testers, and they're like, hold on, I've got to tighten up the graphics on level three. So my vision is that, in the future, you'll just tell the agent, tighten up the graphics on level three, and it'll happen.

Alexandra: Yeah. The true vision, the true dream state. All right, guys. I loved hearing about your visions of the future: 10 years down the line, where are we with QA? And it was such a pleasure to have you both on. There's so much opportunity in this space, clearly. If anybody from the audience would love to get in touch with you, either because they're looking to work alongside you, they want to have a further conversation,

or they're just a fan, how can they get in touch?

Nathan: Through our website. It's agentic.ai.

Christoffer: And same thing for us. It's M-O-D-L dot AI. So model without the E.

Alexandra: Amazing. A model without the E. All right. So basically we concluded the episode, and then we got into the breakout room and decided that there needed to be an addendum, a couple of minutes to discuss a couple of things around quality of life for QA and also around the things that modl and Agentic are not doing in order to train their players.

And so, Nathan, why don't you kick us off on your appendix thoughts?

Nathan: Yeah, so first, I think I'll talk to the quality of life, because it's a really important subject. It's very easy to focus on game quality and saving money, but I think one of the big issues in game development, certainly in my professional experience, is that it can be a very hard job, right?

It's still an industry with a lot of overtime, a lot of stress. And I think if you ask what the causes of that are, very often it's because you're making the game and it's just barely under control and you're trying to find bugs, but inevitably, if you're not able to scale your testing, your dev processes, then you wind up with, you know, bugs again.

They stay there for months at a time. Other bugs get built on top of them. You finally find and fix that first bug, and then more things break. And for a lot of teams, the crunch at the end of development is just to get the bugs to burn down close to zero, right? And so I think that automation actually has this really great opportunity to improve people's quality of life, day to day and year to year.

And if you could actually be just consistently exercising more of your game, automating the busy work, then your game on a moment-to-moment basis is going to be more fun. You'll spend more of the time doing the things you like, but there's really strong evidence that you could also keep your maximum number of bugs under control, which means that the process of finaling the game or shipping the next season is way, way more tractable and reasonable without just throwing the candle of your life at the problem.

Christoffer: Yeah. And I think maybe that's actually something that is most keenly seen from the QA perspective, right? If there's any part of your organization that understands how smooth the process could be if this was a continuous thing, rather than something that was built up and then addressed at the end, it's probably the people who are directly involved in that. I'm just reminded of an event that finished yesterday, which is this great online event that brings QA professionals together to talk about it.

And a lot of the talk topics that came up there were precisely on scaling and automating and structuring and systematizing, both at the sociotechnical layer, like how do we work, and also in terms of the hard technical layer, what do we work with?

So I think that movement is definitely happening in the industry. Everybody understands that there's potential to be gained by automation, but I think a lot of people are still waiting to see it come to fruition. And hopefully companies like ours can start providing some of the tools that can actually make that happen for folks, so that you don't,

you know, sleep under your desk, and so that you don't have those super long days that end up just costing a lot at a personal level.

Alexandra: Awesome. All right, epilogue part two. Let's talk a little bit about what you guys are not doing in terms of large language models, and the academic approach from the research companies in AI versus you guys being more production-ready tools for game developers.

Christoffer: What we were just talking about is that neither of us does reinforcement learning at scale. And that's because of what's involved when you're doing that kind of agent construction, where the AI basically learns the game from a set of high-level objectives and just from interacting with the game.

Just the cost of scaling that up, in terms of how long you need to run the game, how many agents or bots, how many sessions you need to run, and how much data you need to collect, makes it a very expensive process. It may make sense for certain games at a certain scale, but as a general solution for the games industry, at least where we stand right now, it's not something that really seems to be cost effective or within reach for most productions.

Because if you look at the really impressive work that's been done with bots for playing Dota or StarCraft, or racing games like the Sophy AI that came out of a Sony team, all of it is super impressive.

And it's really great AI work, but the amount of effort that goes into generating that puts it out of reach, both in terms of the effort to build it for a specific game and in terms of the amount of compute you would need to expend to make it happen. It would end up taking up such a large chunk of your game budget that it would be out of scale for 90 percent of the industry, probably. Yeah, sorry, Nathan, over to you.

Nathan: No, not at all. Yeah. Earlier we were talking about the differences between the Agentic approach and the modl approach. And I think that's natural, right? There's actually probably more than one approach that works, and there are different trade-offs.

But what I wanted to go on and say was what Christoffer just mentioned, which is that if you come to the space from the outside, you go: AlphaStar was 2018, they trained an AI that can play the game. So why is this a hard problem?

Why can't you just take that algorithm, wrap an API around it, put Stripe payments in front of it, and problem solved? And the reality is that the work of a game developer, the resources that you have available, the amount of compute and data, is really different from a research lab that's focused on AGI, especially at a DeepMind or OpenAI kind of scale.

So I think what's very common to the two companies, and what I personally find fascinating, is this: how do you deliver AI that is useful, directable, and productive without requiring that kind of expertise, where you have to have 10 ML PhDs or tens of thousands of concurrent game instances?

And so to me, that's actually the more interesting story: that fundamental, invisible part of shaping your tech and finding a path that allows you to deliver that value in a way that's usable, useful, and economical.

Alexandra: All right, guys. As always, friends, if you've got feedback or ideas, hit me up at [email protected].

I'm always open. And with that, we're out. Nathan, Christoffer, thanks for coming on.

Nathan: Thank you so much. It's been a pleasure.

Christoffer: Yeah, thanks so much. It's been great. Thank you.

If you enjoyed today's episode, whether on YouTube or your favorite podcast app, make sure to like, subscribe, comment, or give a five-star review. And if you wanna reach out or provide feedback, shoot us a note at [email protected] or find us on Twitter and LinkedIn. Plus, if you wanna learn more about what Naavik has to offer, make sure to check out our website www.naavik.co. There, you can sign up for the number one games industry newsletter, Naavik Digest, or contact us to learn about our wide-ranging consulting and advisory services.

Again, that is www.naavik.co. Thanks for listening and we'll catch you in the next episode.