Incorporating ChatGPT into a Unity prototype (Part 1 of 2)
Is it a good idea to add ChatGPT to a Unity game? You may have your own thoughts already, but this was the question that my thesis partner and I sought to answer for our Master's thesis.
In this post I would like to go over how we incorporated ChatGPT into Unity. This is a relatively high-level description of our approach. If you want more detail, you can always check out the repo yourself.
In our prototype, we have two main ways of making use of ChatGPT:
- setting property values for instantiated assets
- generating narrative
(If you wish to read the thesis in its entirety, you can get it here)
REST client
Let's first talk about how one can even communicate with the ChatGPT backend in the first place.
Despite there being some projects that aim to allow developers to communicate with OpenAI's APIs (either as a C# library or a Unity package), we had better luck writing our own simple REST client. Luckily, Unity has its UnityWebRequest to make things easier.
Whenever we want to POST to the ChatGPT backend, we invoke a Post method:
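A minimal sketch of such a method is given here as an illustration only; it is not the listing from the repo, and it leaves out the message-history handling discussed next. The class name ChatGPTClientSketch, the apiKey field, and the onDone callback are hypothetical, and the sketch assumes Unity 2020.2 or newer so that UnityWebRequest.Result is available.

```csharp
using System;
using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical sketch of a minimal REST client; not the code from the repo.
public class ChatGPTClientSketch : MonoBehaviour
{
    private const string Url = "https://api.openai.com/v1/chat/completions";

    // Assumption: the key is supplied via the Inspector; never commit a real key.
    [SerializeField] private string apiKey;

    // Sends an already-serialized JSON body and hands the raw response text to onDone.
    public IEnumerator Post(string jsonBody, Action<string> onDone)
    {
        using (var request = new UnityWebRequest(Url, UnityWebRequest.kHttpVerbPOST))
        {
            request.uploadHandler = new UploadHandlerRaw(Encoding.UTF8.GetBytes(jsonBody));
            request.downloadHandler = new DownloadHandlerBuffer();
            request.SetRequestHeader("Content-Type", "application/json");
            request.SetRequestHeader("Authorization", "Bearer " + apiKey);

            // Yielding suspends the coroutine until the request has finished.
            yield return request.SendWebRequest();

            if (request.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError("ChatGPT request failed: " + request.error);
                yield break;
            }

            onDone(request.downloadHandler.text);
        }
    }
}
```

A caller on another MonoBehaviour would start it with something like StartCoroutine(client.Post(json, HandleResponse)), where HandleResponse is whatever method consumes the response text.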
This works well for our purposes, despite some awkwardness regarding the manual copying of previous messages (it ended up that way to facilitate both the beginning of the game, where we have no previous messages, and the calls subsequent to this).
As you can see, we are copying messages from earlier in the conversation (if there are any) to the messages property of a ChatGPTPost class, which is merely a Serializable class capable of holding a model string and a messages array. We serialize this using Unity's own JsonUtility.
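To make the copying step concrete, here is a rough sketch under stated assumptions. The class names ending in Sketch and the Build helper are hypothetical; only the model and messages fields come from the description above, and the role and content fields follow the chat completions request format.

```csharp
using System;

// Hypothetical message shape; role and content follow the chat completions format.
[Serializable]
public class ChatMessageSketch
{
    public string role;
    public string content;
}

// Hypothetical stand-in for the ChatGPTPost class: a model string and a messages array.
[Serializable]
public class ChatGPTPostSketch
{
    public string model;
    public ChatMessageSketch[] messages;
}

public static class PostBuilderSketch
{
    // Copies any earlier messages, then appends the new user prompt at the end.
    // Passing null for previous covers the first call, when there is no history yet.
    public static ChatGPTPostSketch Build(string model, ChatMessageSketch[] previous, string prompt)
    {
        int previousCount = previous == null ? 0 : previous.Length;
        var messages = new ChatMessageSketch[previousCount + 1];
        if (previousCount > 0)
        {
            Array.Copy(previous, messages, previousCount);
        }
        messages[previousCount] = new ChatMessageSketch { role = "user", content = prompt };
        return new ChatGPTPostSketch { model = model, messages = messages };
    }
}
```

Inside Unity, JsonUtility.ToJson(post) would then turn the resulting object into the request body, since JsonUtility serializes the public fields of Serializable classes.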
The response is similarly deserialized:
response.ParseBattleInfo() and response.ParseLogString() are how we make use of the returned data from ChatGPT for our game. Let's talk about response.ParseBattleInfo() first. I'll go over response.ParseLogString() in Part 2.
Instantiating assets
As you saw earlier, JSON deserialisation is at the heart of this. Upon launching the game, we make a request to ChatGPT's backend to get the beginning of our narrative, as well as some property values for instantiating our enemies.
The following is our ChatGPTResponse class, in its entirety:
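The full listing is in the repo. As a hedged sketch of the overall shape only, a class that mirrors the parts of a chat completions response used here might look like the following. All names ending in Sketch and the FirstContent helper are hypothetical; the choices, message, role, and content fields follow the response format.

```csharp
using System;

// Hypothetical message shape inside a choice.
[Serializable]
public class ResponseMessageSketch
{
    public string role;
    public string content;
}

// Each choice is one possible completion returned by the model.
[Serializable]
public class ChoiceSketch
{
    public ResponseMessageSketch message;
}

// Hypothetical stand-in for the ChatGPTResponse class.
[Serializable]
public class ChatGPTResponseSketch
{
    public ChoiceSketch[] choices;

    // Returns the text of the first completion, or null when there is none.
    public string FirstContent()
    {
        if (choices == null || choices.Length == 0 || choices[0].message == null)
        {
            return null;
        }
        return choices[0].message.content;
    }
}
```

Inside Unity, JsonUtility.FromJson<ChatGPTResponseSketch>(responseText) would populate such an object; response fields with no matching member are simply ignored.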
The actual conversational response from ChatGPT comes in an array property called choices (so-called because the model can come up with multiple possible completions given the user's input prompt). ParseBattleInfo() is how we deserialize our very first call to the API, as we ask that ChatGPT gives us both an opening narrative scene and an array of enemies for us to instantiate:
This prompt grew over time. The sentence All of your output should be a part of the JSON object came about from ChatGPT's tendency to say Sure! Here's your JSON object:, followed by the JSON object, thereby breaking everything. Additionally, towards the end, there is It is the beginning of the story only; we sometimes had issues with ChatGPT writing a completely self-contained (albeit small) story, including an ending. We needed to remind it that it is writing the setup only.
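We handled the chatty preamble in the prompt itself. As a complementary safeguard, which is not something our prototype does, the reply could also be trimmed to the outermost JSON object before deserializing. The helper name ExtractObject is hypothetical, and this is only a sketch: it assumes the reply contains a single JSON object.

```csharp
using System;

public static class JsonExtractSketch
{
    // Returns the substring from the first '{' to the last '}', or null if none is found.
    public static string ExtractObject(string reply)
    {
        if (reply == null)
        {
            return null;
        }
        int start = reply.IndexOf('{');
        int end = reply.LastIndexOf('}');
        if (start < 0 || end <= start)
        {
            return null;
        }
        return reply.Substring(start, end - start + 1);
    }
}
```

This drops any text before or after the object, so a reply that opens with a friendly sentence still deserializes; it does not help if the JSON itself is malformed.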
We're also asking for particular sizes and weapons. The actual deserializing is done in ChatGPTResponse via Unity's own JsonUtility.FromJson method, as shown above.
We're using our BattleInfo class to both describe the structure of the object we want from ChatGPT (in the prompt) and deserialize the result of our call:
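The idea of one class serving both purposes can be sketched as follows. This is an illustration under assumptions, not the repo's BattleInfo: the field names openingScene, enemies, size, and weapon are hypothetical, as are the two helper methods.

```csharp
using System;
using UnityEngine;

// Hypothetical enemy description; the prompt asks for a size and a weapon.
[Serializable]
public class EnemySketch
{
    public string size;
    public string weapon;
}

// Hypothetical stand-in for BattleInfo: opening narrative plus enemies to spawn.
[Serializable]
public class BattleInfoSketch
{
    public string openingScene;
    public EnemySketch[] enemies;
}

public static class BattleInfoPromptSketch
{
    // Serializes a placeholder instance so the prompt can show the expected JSON shape.
    public static string DescribeShape()
    {
        var example = new BattleInfoSketch
        {
            openingScene = "<opening scene>",
            enemies = new[] { new EnemySketch { size = "<size>", weapon = "<weapon>" } }
        };
        return JsonUtility.ToJson(example, true);
    }

    // Parses the model's reply back into the same class.
    public static BattleInfoSketch Parse(string content)
    {
        return JsonUtility.FromJson<BattleInfoSketch>(content);
    }
}
```

Because the prompt's example and the parser share one class, renaming a field changes both at once, which keeps the requested format and the deserializer from drifting apart.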
Then, when the call is finally complete, we can instantiate our orcs in our GameManager:
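As a rough sketch of that step, and not the GameManager from the repo, spawning could look like this. The OrcSpawnerSketch class, the orcPrefab field, the size labels, and the scale factors are all hypothetical.

```csharp
using UnityEngine;

// Hypothetical spawner; assumes an orc prefab has been assigned in the Inspector.
public class OrcSpawnerSketch : MonoBehaviour
{
    [SerializeField] private GameObject orcPrefab;

    // Spawns one orc per size label, spaced along the x axis.
    public void Spawn(string[] sizes)
    {
        for (int i = 0; i < sizes.Length; i++)
        {
            var position = new Vector3(i * 2f, 0f, 0f);
            GameObject orc = Instantiate(orcPrefab, position, Quaternion.identity);
            orc.transform.localScale = Vector3.one * ScaleFor(sizes[i]);
        }
    }

    // Maps a size label from the model to a uniform scale; unknown labels fall back to 1.
    public static float ScaleFor(string size)
    {
        switch (size)
        {
            case "small": return 0.75f;
            case "large": return 1.5f;
            default: return 1f;
        }
    }
}
```

Falling back to a default scale matters here because the model may return a label the prompt never asked for.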
We similarly display the result of the logString property of our ChatGPTResponse: a script attached to a GameObject is responsible for adding the opening narrative (and any additional narrative) to a text view. This text view shows combat info as well, as combat progresses:
And that's it! Remember that if this seems light on details you can always check out the repo.
In Part 2 I'll go over how we handle narrative subsequent to the opening scene. We do this by merging enemy characteristics with generated narrative to create a story influenced by those characteristics.