Utility AI

Week 19 • May 12, 2024

Following up on last week's exploration of GOAP, I spent some time this week trying out a different approach to AI decision-making: Utility AI. It's simpler to implement but shares some concepts with GOAP, so I ended up making a small implementation of the concepts to try it out myself.

How it Works

We need to cover a few concepts before we can understand Utility AI:

Attribute Scores
Actions
Utility
Attribute Curves

Attribute Scores

If you've ever played "The Sims", you'll know sims have a bunch of attributes that you need to manage to keep them happy:

Hunger
Energy
Bladder
Fun
Room
etc.

These attributes (mostly) go down slowly over time, and you need to balance competing attributes against each other. As explained in GMTK's excellent video on the topic, the AI in The Sims can convert a sim's current state to a single value, which we'll call an "attribute score". For now, we'll take a simple approach and imagine this attribute score is just the sum of all the attribute values, each normalized to a [0, 1] range. For example, if we had the following (simplified) attributes:

Hunger: 0.4
Energy: 0.3
Fun: 0.7

The total attribute score would be 0.4 + 0.3 + 0.7 = 1.4.
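
To make the arithmetic concrete, here's a minimal sketch of that calculation. It's written in TypeScript as an assumption, and the names `Attributes`, `clamp01`, and `attributeScore` are made up for illustration rather than taken from any real implementation.

```typescript
// Hypothetical shape: attribute name -> value, each expected to be in [0, 1].
type Attributes = Record<string, number>;

// Keep a value inside the [0, 1] range.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Simple attribute score: the plain sum of every (clamped) attribute value.
function attributeScore(attributes: Attributes): number {
  return Object.keys(attributes).reduce(
    (sum, key) => sum + clamp01(attributes[key]),
    0
  );
}

const sim: Attributes = { hunger: 0.4, energy: 0.3, fun: 0.7 };
console.log(attributeScore(sim).toFixed(1)); // "1.4"
```

With three attributes the score can range from 0 to 3, so a higher score means a better-off sim overall.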

Actions

In The Sims, a house has a bunch of objects which can be interacted with. The available interactions for each object are called "actions". These actions provide some change to the attributes. For example:

A fridge might have an "Eat Lunch" action available, which adds +0.5 to the "Hunger" attribute
A bed might have a "Nap" action available, which adds +0.4 to the "Energy" attribute
A TV might have a "Watch TV" action available, which adds +0.4 to the "Fun" attribute

Each of these actions adds to a single attribute, but an action might increase (or decrease) several attributes at once. The point is that actions are defined by the change they make to the attributes.

Utility

We can think of our actions in terms of how much we expect them to affect the attribute score, reducing our actions to a single number. We call this expected change in the attribute score "utility".

For our actions above, given the example attributes from earlier (Hunger: 0.4, Energy: 0.3, Fun: 0.7), we'd say that:

"Eat Lunch" has a utility of 0.5
"Nap" has a utility of 0.4
"Watch TV" has a utility of 0.3, since the "Fun" attribute (0.7 + 0.4) would get capped at 1.0

Now that we've reduced each action to a single number, we can easily order them from most utility to least, and pick the action which gives us the most utility (or pick between the top 2, to give us a bit more randomness).

More generally, utility is the difference between the attribute score before and after the action has been performed.
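
The before-and-after definition translates fairly directly into code. The sketch below is one way it might look, again in TypeScript as an assumption. The `Action` shape and the `applyAction`, `utility`, and `rankActions` helpers are hypothetical names, and the attribute score here is the simple sum described earlier.

```typescript
type Attributes = Record<string, number>;

// Hypothetical action shape: a name plus the change applied to each attribute.
interface Action {
  name: string;
  effects: Attributes;
}

function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Simple attribute score: the sum of all attribute values.
function attributeScore(attributes: Attributes): number {
  return Object.keys(attributes).reduce((sum, key) => sum + attributes[key], 0);
}

// Return a new attribute set with the action applied, capping values to [0, 1].
function applyAction(attributes: Attributes, action: Action): Attributes {
  const next: Attributes = { ...attributes };
  Object.keys(action.effects).forEach((key) => {
    next[key] = clamp01((next[key] || 0) + action.effects[key]);
  });
  return next;
}

// Utility: attribute score after the action minus the score before it.
function utility(attributes: Attributes, action: Action): number {
  return attributeScore(applyAction(attributes, action)) - attributeScore(attributes);
}

// Order actions from most utility to least.
function rankActions(attributes: Attributes, actions: Action[]) {
  return actions
    .map((action) => ({ name: action.name, utility: utility(attributes, action) }))
    .sort((a, b) => b.utility - a.utility);
}

const sim: Attributes = { hunger: 0.4, energy: 0.3, fun: 0.7 };
const actions: Action[] = [
  { name: "Watch TV", effects: { fun: 0.4 } },
  { name: "Eat Lunch", effects: { hunger: 0.5 } },
  { name: "Nap", effects: { energy: 0.4 } },
];

const ranked = rankActions(sim, actions);
ranked.forEach((r) => console.log(`${r.name}: ${r.utility.toFixed(1)}`));
// Eat Lunch: 0.5
// Nap: 0.4
// Watch TV: 0.3
```

To get the "pick between the top 2" behaviour, one option would be to choose randomly from `ranked.slice(0, 2)` instead of always taking `ranked[0]`.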

Attribute Curves

In our previous example, every attribute always contributed the same amount to the attribute score, but in reality things tend to be a little more complex. If you're REALLY hungry, you might not care too much that you're also REALLY bored; you just need food. This is where attribute curves come in.

Attribute curves allow you to manipulate the attribute score so important needs can be taken care of before others which are not as urgent.

In The Sims, attribute curves allowed designers to model these dynamics, so that important needs like eating and resting would take precedence over ones like having fun. The nice thing about this approach is that, once these curves are incorporated into the calculation of the attribute score, they can have a big impact on what action is picked.
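
Here's a small sketch of how a curve can change which action wins. The curve shapes are my own assumptions for illustration (a steep "urgent" curve for hunger and a straight line for fun), not the actual curves used in The Sims, and `Curve`, `Need`, and `curvedUtility` are hypothetical names.

```typescript
// A curve maps a raw attribute value in [0, 1] to its contribution to the score.
type Curve = (value: number) => number;

// Assumed shapes: "urgent" rises steeply at low values, so fixing a nearly
// empty need is worth a lot; "linear" leaves the raw value unchanged.
const linear: Curve = (x) => x;
const urgent: Curve = (x) => 1 - Math.pow(1 - x, 3);

interface Need {
  value: number;
  curve: Curve;
}

// Utility of changing one attribute by `delta`, measured through its curve.
// Other attributes are unchanged, so this equals the change in the total score.
function curvedUtility(needs: Record<string, Need>, attribute: string, delta: number): number {
  const need = needs[attribute];
  const nextValue = Math.min(1, Math.max(0, need.value + delta));
  return need.curve(nextValue) - need.curve(need.value);
}

// A sim that is both very hungry and very bored.
const needs: Record<string, Need> = {
  hunger: { value: 0.1, curve: urgent },
  fun: { value: 0.1, curve: linear },
};

const eatUtility = curvedUtility(needs, "hunger", 0.4);
const tvUtility = curvedUtility(needs, "fun", 0.5);

console.log(eatUtility.toFixed(3)); // "0.604"
console.log(tvUtility.toFixed(3)); // "0.500"
```

On raw values alone, watching TV (+0.5) would beat eating (+0.4). Once hunger is passed through the steeper curve, eating comes out ahead, which is the kind of prioritisation the curves are meant to give.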

Implementation

I implemented a simple version of utility AI in a React application. The application has attributes:

Hunger
Energy
Fun
Health

and actions:

Eat
Rest
Play
Damage
Heal

Each action changes the attributes in some way, and each attribute is given a curve, modeled as a cubic Bézier curve (start point, control point 1, control point 2, end point). Each attribute is also given a "decay rate" to indicate how fast it should decrease over time.
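
As a rough illustration of those two pieces, the sketch below evaluates a cubic Bézier curve at a given attribute value and applies a decay rate over time. This is not necessarily how the application does it. It assumes the curve's x coordinate increases steadily from start point to end point (so a bisection search can find the matching curve parameter) and that decay is linear per second. All names here are hypothetical.

```typescript
interface Point {
  x: number;
  y: number;
}

// Start point, control point 1, control point 2, end point.
type CubicBezier = [Point, Point, Point, Point];

// Evaluate one coordinate of a cubic Bézier at parameter t in [0, 1].
function bezierCoord(p0: number, p1: number, p2: number, p3: number, t: number): number {
  const u = 1 - t;
  return u * u * u * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t * p3;
}

// Find the curve's y value for a given x by bisecting on t.
// Assumes x(t) is monotonically increasing along the curve.
function evaluateCurve(curve: CubicBezier, x: number): number {
  const [p0, p1, p2, p3] = curve;
  let lo = 0;
  let hi = 1;
  for (let i = 0; i < 40; i++) {
    const mid = (lo + hi) / 2;
    if (bezierCoord(p0.x, p1.x, p2.x, p3.x, mid) < x) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  const t = (lo + hi) / 2;
  return bezierCoord(p0.y, p1.y, p2.y, p3.y, t);
}

// Assumed linear decay: lose `decayRate` per second, never dropping below 0.
function decay(value: number, decayRate: number, dtSeconds: number): number {
  return Math.max(0, value - decayRate * dtSeconds);
}

// Control points placed on the diagonal give a straight line, so y equals x.
const straightLine: CubicBezier = [
  { x: 0, y: 0 },
  { x: 1 / 3, y: 1 / 3 },
  { x: 2 / 3, y: 2 / 3 },
  { x: 1, y: 1 },
];

console.log(evaluateCurve(straightLine, 0.5).toFixed(2)); // "0.50"
console.log(decay(0.5, 0.1, 2).toFixed(2)); // "0.30"
```

Moving the two control points off the diagonal bends the curve, which is how an attribute like hunger could be made to matter more when it is low.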

The application then calculates the utility of each action, orders the actions by utility, and highlights the one with the highest utility.

For fun, I also added a "store" where you can buy a couple items which also add to attributes, but I'm not calculating the utility of those. There's a special "Add Coin" action, so you can buy items from the store.

Here's the interface as it stands today. The attributes are visible in the bottom left, along with their associated curves. Actions ordered by utility are visible in the bottom right, and the action with the highest utility is highlighted as a primary button, while all other actions are secondary buttons.

Utility AI interface showing attributes and actions

Wrapping Up

It was pretty fun to build out a system like this behind a simple interface. While it's not exactly a fun game, I think it proves out the fundamentals of how one might build a utility AI system for use in a real game. Of course, in a game, actions would take time to complete or might slowly change attributes over time as they are performed (as is the case in The Sims), but the core concepts are there.

With regard to the Anjin project, a similar concept could be used to choose which goal to work on or which actions to take during a given session. I'm still thinking about that in the background, but it's also been fun to just explore some of these concepts in a more isolated environment.

I've started spending some more time with Godot, so next week I'll write about that experience. I've also got a lot of travel coming up, so my updates might be a bit more sporadic over the next few weeks, but I'll do my best to get them up in a timely manner.

See you next week!