Phew, so that was the meat of the week’s work. Let’s wrap up the loose ends.
This is probably skippable as it’s just a note of some other things that got changed.
Moving to Unity.Mathematics
Unity recently added a new math library and, as well as having practical benefits, it makes for much nicer code to write and read, so we have started using it for all new work.
I took an hour this week to move all the tile data code over to it, except for Bounds, as I want to make sure the new behavior matches what I expect.
I also got to delete my Vector3Int32 class as Unity.Mathematics contains int3 and I renamed my Vector3Int16 to short3 and tweaked the API to match that of Unity.Mathematics so it all feels more consistent.
One future task that has been making me nervous is updating the realtime networked portion of TaleSpire.
I had a little breakthrough simplifying some of our code that got very spidery and complicated during the development of the alpha. I was going to write a little about that here but I’m gonna save it for another week as it would turn this into another long post.
Start loading assets earlier
I probably should have mentioned this in a previous post but oh well.
Unity loads asset bundles asynchronously. As we don’t want to try to load every tile in TaleSpire at once, the first time you place a tile we need to load it. This results in some frames passing before spawning continues.
This could be exacerbated by the fact that changes are queued: the spawner won’t know a tile is needed until it’s time to spawn it.
For that reason, we make sure that as soon as the data representation has a new tile kind added, it notifies the asset loader so it can start loading the asset.
This will rarely matter but it may help when there is heavy load, and that’s the time you need the most help.
Merging this monster
I’ve also started kicking this code into a shape where it’s suitable for merging into master. Jonny and I divided up the work so that, technically, neither of us has tasks that block the other. However, we still need to end up with one game, and my current monster branch goes against everything I like when developing.
Ah well, soon enough it’ll be in.
Actually the end for realz
Well, that’s the week. In that time I was also best man at the wedding of a dear friend so it’s been a doozy.
It’s probably time for a cup of tea.
Sound of trumpets BEHOLD, Part 3 commences.
We have talked about updating the data and the presentation but we still have to animate the tiles.
In TaleSpire we want to animate tiles as they appear and disappear, as it makes the game feel better. We have a lot of tiles to update, so we have to take care over how we do this.
A simple problem
We wouldn’t want to have to update the positions of the GameObjects every frame so we animate the drop-in using the vertex shader. The animation curves are simply stored as ramps inside textures we can sample over time. This means we only have to update the per-instance data for the GameObjects when we want to change which animation they are running.
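To make the ramp idea concrete, here is a tiny language-agnostic sketch in Python. The names and curve are made up; in the real game the baked curve lives in a texture row sampled from the vertex shader, not in a CPU-side array.

```python
# Sketch of the ramp idea: bake an animation curve into a fixed-size
# table (standing in for a texture row) and sample it by normalized time.
# All names here are illustrative, not TaleSpire's actual code.

def bake_ramp(curve, samples=64):
    """Evaluate `curve` at evenly spaced points in [0, 1]."""
    return [curve(i / (samples - 1)) for i in range(samples)]

def sample_ramp(ramp, t):
    """Linearly interpolate the baked ramp at time t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)
    x = t * (len(ramp) - 1)
    i = int(x)
    if i >= len(ramp) - 1:
        return ramp[-1]
    frac = x - i
    return ramp[i] * (1.0 - frac) + ramp[i + 1] * frac

# A simple "drop in from above" height offset: starts at 1, eases to 0.
drop_in = bake_ramp(lambda t: (1.0 - t) ** 2)
```

The point of the indirection is that the per-instance data only needs updating when a tile *changes* animation; the per-frame sampling is free on the GPU.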
A secondary problem
One annoying thing though is that, when changes to this state do come, they tend to come in large numbers. One example of this is during tile selection.
When players select tiles we raise them very slightly to show that they have been selected. This means we need to update the per-instance data for those tiles. Also, if the tiles have props attached, those props need to be raised too.
Unity’s interface to this is via MaterialPropertyBlocks and, like so many things in Unity, they can only be updated from the main thread. This kinda sucks, as what we need to do is perfectly parallelizable:
If the tile bounds intersect the selection bounds then set the value
This is extra annoying when we consider that selection bounds are 3D, so the number of tiles that may be involved goes up terrifyingly fast.
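The per-tile work really is just an axis-aligned bounds test, which is why it parallelizes so well. A hypothetical Python sketch (the real thing would be a Burst-compiled job over native arrays, one tile per iteration):

```python
# The per-tile selection test is an axis-aligned bounds intersection,
# which is trivially parallel: each tile can be tested independently.
# Hypothetical sketch; bounds are (min, max) coordinate triples.

def aabb_intersects(a_min, a_max, b_min, b_max):
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

def select_tiles(tile_bounds, sel_min, sel_max):
    """Return indices of tiles whose bounds touch the selection bounds."""
    return [i for i, (tmin, tmax) in enumerate(tile_bounds)
            if aabb_intersects(tmin, tmax, sel_min, sel_max)]

tiles = [((0, 0, 0), (1, 0.25, 1)), ((5, 0, 5), (6, 0.25, 6))]
picked = select_tiles(tiles, (0, 0, 0), (2, 2, 2))
```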
To resurrect our example from the previous articles, a 30x30 tile slab contains 900 tiles. And since TaleSpire’s smallest tile size is 1x0.25x1 units, a 30x30x30 unit region could (theoretically) contain over 80,000 tiles.
Obviously that is insane, and we will need to add some form of density limit, but the game will need to communicate this limit to the players in a way that feels fair.
Once again user-generated-content games bring their own flavor of crazy to the party :)
The secondary problem with a hat on
But we aren’t done, oh no we are not. Last time we talked about progressively applying changes to the presentation. Because of this, a selection can be made over tiles that are still spawning. According to the data layer you selected perfectly valid tiles, but the presentation hadn’t quite caught up. Maybe the tile spawned the frame after you released the selection, so as far as you perceived, it was there when you let go.
Now there are different ways to resolve this, but the way I want to go for now is that the data is the source of truth: if you selected the right region, you selected the tiles, even if the presentation hadn’t quite got there yet.
I expect this to be an edge case that people won’t even notice, but it is better to have an answer for it.
So let’s talk about one change that helps here. What we are going to do is move the tile’s animation state out of its per-instance data and into a separate buffer on the gpu (let’s call this the tile-state-buffer). Then we will put the index into that buffer into the tile’s per-instance data instead.
This little change lets us solve all the above issues.
First off we will have a copy of the tile-state-buffer in local memory. This will be a native container so that we can operate on it directly from Jobs. This means all those selection updates are done in parallel now.
Next, we have just gained the option to push only part of the local buffer to the GPU each frame. This is super handy in cases where huge numbers of tiles are being changed per frame. There is a concern that this delay to the start of the animation could feel slightly unresponsive; however, I’m expecting that as long as something starts happening on the same frame as the action, it should feel ok, especially as this only kicks in when lots of changes are being made.
This also plays nicely with our selection -v- progressive update issue. We can now make the changes to the tile-state-buffer independent of whether the presentation for the tile has finished loading yet. As soon as the GameObjects are spawned they will put their state index into their MaterialPropertyBlock and they will be up to date immediately.
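A hypothetical sketch of the indirection, with Python standing in for the real native-container and GPU-buffer code. The names and the dirty-set bookkeeping are illustrative:

```python
# Sketch of the indirection: per-instance data stores only an index into
# a shared tile-state buffer, so selection changes touch the buffer (in
# parallel, off the main thread in the real version) and a GameObject
# that spawns late just reads whatever state its index already points at.

class TileStateBuffer:
    def __init__(self):
        self.states = []      # local (CPU) copy, job-friendly in the real thing
        self.dirty = set()    # indices not yet pushed to the GPU copy
        self.gpu = []         # stand-in for the GPU-side buffer

    def allocate(self, state):
        self.states.append(state)
        self.gpu.append(state)
        return len(self.states) - 1   # this index goes in per-instance data

    def set_state(self, index, state):
        self.states[index] = state    # safe to do before the tile even spawns
        self.dirty.add(index)

    def push(self, budget):
        """Upload at most `budget` changed entries to the GPU per frame."""
        for index in sorted(self.dirty)[:budget]:
            self.gpu[index] = self.states[index]
            self.dirty.discard(index)

buf = TileStateBuffer()
i = buf.allocate("idle")
buf.set_state(i, "selected")
buf.push(budget=16)
```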
And another thing
One obvious thing I skipped when presenting the selection problem is that updating all tiles in the selection is totally unnecessary. We only actually need to update the shader state for the tiles that have become selected or unselected since the last frame.
From the following diagram, we can see that in 2D this means you are inside one of two rectangles. For 3D it’s three cuboids.
[Diagram: the frame 0 selection rectangle, the larger frame 1 selection rectangle, and the difference between them split into two rectangles]
Given that we can control how fast the panning in the game is this makes the number of tiles to update per frame much less. I left this until now though as selection makes a great example of the issue of tile counts in volumes and we have other operations for volumes that benefit from these changes too.
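A rough sketch of the frame-to-frame delta idea. Here cells stand in for tiles and we enumerate them for clarity; the real version would work with the difference rectangles/cuboids directly:

```python
# Sketch: instead of re-testing every tile in the selection each frame,
# only the cells that entered or left the selection since last frame need
# a state update. For axis-aligned rectangles this is the symmetric
# difference of the two cell sets. Illustrative only.

def cells(rect):
    (x0, y0), (x1, y1) = rect
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

def selection_delta(prev, curr):
    """Return (newly_selected, newly_unselected) cell sets."""
    a, b = cells(prev), cells(curr)
    return b - a, a - b

# A 3x3 selection pans one cell diagonally: only 5 cells enter, 5 leave,
# rather than re-touching all 9.
added, removed = selection_delta(((0, 0), (3, 3)), ((1, 1), (4, 4)))
```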
Well, this got long. I think I’m gonna save the miscellaneous stuff for another post.
See you there!
Alright, part 2!
So in the last post, we talked about updating the data representation of the board. This time let’s get into the bit the players see and interact with. The GameObjects.
In Unity’s currently supported approach, anything in the world is a GameObject. GameObjects do very little on their own; their properties and behavior come from Components.
We don’t need to understand much more about this for now except that Unity naturally has to manage all these GameObjects and, as with much of the API, you can only interact with them from the main thread.
One of the nice things Unity does do for us is handling instancing so that it’s fast to draw many of the same kind of object. For example, if we have 1000 of the same tile in the scene then Unity can make one draw call to the GPU to draw them all.
We aren’t going to talk about rendering any more in this post but we will revisit it in the next one as there are some issues we need to overcome.
So we have a TaleSpire-like game where we want the players to be able to make tonnes of tiles; they should be able to drag out or paste big slabs, and then delete individual tiles as they please.
If someone drags out a 30x30 slab, we need to spawn 900 tiles. Depending on the complexity of the tile, Unity might not be able to handle instantiating that many new GameObjects in a single frame. This is serious, as when people are building we have building actions arriving from other players fairly frequently, and we wouldn’t be able to apply the changes robustly until all the GameObjects were made.
The first change we could make is to keep the data for tiles separate from their GameObjects. This also gives us opportunities to store that data in blittable types and use Unity’s Job system to operate on it (like we talked about in the last post).
I’m going to shift from talking explicitly about tile data and GameObjects to talking about the tile data and its ‘presentation’, unless the fact that GameObjects are involved is important.
With this, we have the opportunity to make the data changes immediately and apply them over a number of frames. This is the approach we are using and is a part of why we want to apply data changes as fast as possible, because there may be multiple arriving in a single frame.
One issue we have given ourselves is that, now that the data and presentation are separate, we need to keep them in sync. If someone deletes a tile we need to delete the presentation too and, naturally, the same goes for undo/redo/etc. We could keep an index from the data to the presentation, but this has implications:
- We need an extra integer of storage for every tile. 4 bytes is nothing on its own, but 100000 tiles is not a crazy number of tiles to expect to have to deal with so it does add up.
- It may benefit us to have different layouts for the data and the presentation arrays. That could mean having to update the data->presentation index. One can then assume you’ll need to have a way to do this quickly.
Now you may not care about either of those, which is valid too. But whichever way you pick there will likely be tradeoffs.
One simplification we can make is to note that tile data is added in chunks. If we keep the presentations of tiles from the same chunk together, then you only need indices per chunk, which can help whichever approach you pick.
Other than being able to spread out work over multiple frames we get some trivial opportunities to avoid work too.
First off, if we can apply changes to the data first and then generate the changes to the presentation afterward, we often get to skip work. Imagine an Add followed by a Delete arriving in the same frame: by looking at the data afterward, we only need to spawn the tiles that remain, rather than first spawning them all and then applying the delete.
Also in the case where there is a small backlog you can get opportunities to skip entire actions. The best case is an Undo & Redo together (this can happen when people are playing around to see if they like a particular change). In that case, you can remove both actions from the queue as they cancel each other out.
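A sketch of that cancellation pass over a backlog. The action encoding here is made up for illustration; the real queue holds change instructions for the presentation:

```python
# Sketch of skipping work in a backlog: an Undo immediately followed by
# its matching Redo cancels out, so both can be dropped before the
# presentation ever does any work for them.

def prune_queue(queue):
    """Remove adjacent (undo, redo) pairs that cancel each other."""
    out = []
    for action in queue:
        if out and out[-1][0] == "undo" and action[0] == "redo" \
                and out[-1][1] == action[1]:
            out.pop()          # the pair cancels; drop both
        else:
            out.append(action)
    return out

pruned = prune_queue([("add", 1), ("undo", 7), ("redo", 7), ("delete", 2)])
```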
You don’t always need to present anyway
If you have players in two different parts of the board, you may want the data for both zones in memory without having a presentation for the zone you are nowhere near. This is trivial here, as the data is separate.
This makes jumping to that part of the board much less jarring than it would otherwise be, as you are already up to date with the latest version of that part of the board.
A Zone’s presentation now gets given a ‘budget’ of how much it can change this frame. It keeps popping changes from its queue and applies as much as it can until the budget runs out. The budget is represented by a floating-point number and we give separate costs to spawning tiles, reusing ones from the GameObject pool, deleting tiles, etc.
These numbers are made up by us but they are easily tweakable so we have a lot of freedom now to profile and find out what works. We’ll chat more about this in the future when we get more data.
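A minimal sketch of the budget loop. The cost numbers below are invented for illustration, just as ours are made up pending profiling:

```python
# Sketch of the per-frame budget: each kind of change has a cost and a
# zone keeps applying queued changes until this frame's budget runs out.
# Costs are made-up floats, easy to retune after profiling.

COSTS = {"spawn": 1.0, "reuse_pooled": 0.4, "delete": 0.2}

def apply_with_budget(queue, budget):
    """Pop and 'apply' changes until the budget is spent; return leftovers."""
    applied = []
    while queue and budget >= COSTS[queue[0]]:
        change = queue.pop(0)
        budget -= COSTS[change]
        applied.append(change)           # real version mutates GameObjects here
    return applied, queue

done, remaining = apply_with_budget(
    ["spawn", "spawn", "delete", "spawn"], budget=2.4)
```

Anything left in `remaining` simply carries over to the next frame's budget.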
Sticking it together
So we are doing a bunch of these things. We have a separate presentation for each Zone. They each have queues of changes to apply which, due to how tiles are managed in data, mainly boils down to adding and deleting chunks of GameObjects by their Id.
We use the code that applies the change to data to make simpler change instructions for the presentation so that it has much less work to do.
We don’t maintain any indexes between the data and presentation (beyond the unique ids of chunks), so we don’t have to do any upkeep work, but we do end up scanning tiles in more cases than we would otherwise.
We don’t do all of this work avoidance yet, but the hooks are in the code so we can add it easily when we want to. First, though, I need to iron out the bugs that remain.
Ok, so that wraps up this post. In the next one we need to talk a little about rendering, plus a few miscellaneous bits and bobs we’ve also done this past week.
 Ignoring DrawMeshInstanced and friends for this, as it makes for an easier-to-parse, if slightly inaccurate, sentence.
 At least, only the Unity data, you can change the fields on your components as you like.
 I should also mention we are not using the new ECS yet as some features we need are only just arriving and we would still need to do a bunch of work to test everything we need will work in the new system. I’m 95% sure we’ll be moving to it within the next year though.
 Because of reasons, 1023 is the largest number of instances that can be drawn in one call in Unity. This is a limitation of their system and it’s a bit of a shame, but we will deal with it another time.
 Tiles in the presentation are still currently in the same order as in the data representation, so this does still leave us with ways of making deletes less costly to apply. Currently this is not an issue though, so I’m leaving it for now.
There have been no daily updates for the last six days as I’ve been in a sort of self-imposed crunch to hammer out some things that have been on my mind.
The first one was related to the performance of Add/Delete/Undo/Redo operations on tiles. The faster we do this the better and we have a bunch of opportunities to make this quick. The first was to move the core code of these features over to Unity’s Job System.
Why we really have to care about performance
For newer arrivals to these writeups, performance may seem an odd concern for a game where the rate at which things happen seems much lower than that of, for example, an FPS. For us, it’s not the frequency of actions that plagues us, but the volume of work that can be created by them, and the fact that there is no general pattern to when they happen.
For example, dragging out a 3x3 square creates 9 tiles, but 30x30 is 900 tiles.
- Tiles are usually made of smaller assets, so that’s 900*N objects to spawn and render
- If any of those sub-objects have scripts, then that’s 900 more scripts running per interactable sub-object
- The first thing tiles do is drop into place, so you best be ready to animate 900 of those (we do this via shaders)
- When people drag to select we slightly raise the tile to show it has been selected so you will need to be able to query many thousands of tile bounds quickly.
…and 30x30 tiles isn’t many, considering a board is meant to be 30000x30000 units wide and 10000 units high.
All of this isn’t to complain, we want the game to behave like this, but we do need to be clear with ourselves about what needs to be achieved within the 16ms of a frame.
So back to the jobs. One of the first things I looked at was the collections we use to store the tile data. As of today, the data for tiles are split up somewhat like this:
- Board: contains N zones
- Zone: contains up to 16 client-data-chunks, one for each client who has built in a zone
- AssetData: Holds the data for the tiles themselves
- AssetLayouts: Holds layout information for the data in AssetData
As you can imagine we have lots of opportunities for performing operations in parallel here. Zones are a 16x16x16 unit chunk of the board and so if an add or delete crosses multiple zones all of those adds/deletes can potentially be done in parallel.
Also, as each client’s data is separate within the zone, we have opportunities here too. This has so far been less useful: an Add only affects the data of the client that performed it, and during deletes we want to store the deleted tiles for undo/redo, which would require collections that can have arbitrary numbers of tiles written into them concurrently. Making a new collection type to cover this case was too much of an additional distraction, and the common case is one or two people building, so there is less parallelism to be gained here anyway.
A diversion into better collections
Here is a rough idea of how AssetData is laid out.
// (the data is actually flat but grouped here for clarity)
AssetData: [
  [Tiles from Add 0]
  [Tiles from Add 1]
  [Tiles from Add 2]
  [Tiles from Add 3]
]
ActiveHead = 4
Length = 4
The tiles we consider active are all those from Adds below ActiveHead. So in the above case, all of them.
One of the pain points from the alpha was Undo/Redo feeling unresponsive with large numbers of tiles. With this layout, the undo of an Add is just subtracting 1 from ActiveHead (we will talk about spawning the visuals for the tiles later). By not deleting the data yet Redo is just adding 1 to ActiveHead.
If you Undo an Add and then perform a new add or delete action, there is no way to Redo the undone add, so we just overwrite its data with the new tile data. Thus we rarely need to shrink the backing NativeList behind AssetData.
Now, having the Head integer and the native collection separate is fine, but as they need to be updated together from jobs, it helps for them to live in a single native collection that properly uses Unity’s thread safety checks to make sure nothing funky is going on. To that end, I spent a little while making a collection with the terrible name NativeGreedyList.
It has a Capacity, an ActiveLength, and a FullLength. The FullLength is the apparent length of the list. The ActiveLength is the ActiveHead integer from before. Capacity and FullLength are separate as you often want to allocate in larger increments than you immediately need, so you have to resize less often.
Unlike NativeList, we don’t ever reduce the capacity unless explicitly required to (as the data past the ActiveLength is often still valuable to us).
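A toy Python model of the idea. This is not the actual NativeGreedyList (which is a native container with safety handles); it just shows why undo and redo become near-free head moves:

```python
# Minimal sketch of the "greedy list" idea: tiles from each Add are kept
# in order, an active head marks how many Adds are live, and data past
# the head is kept around so a Redo costs nothing.

class GreedyList:
    def __init__(self):
        self.adds = []        # backing store: one entry of tiles per Add
        self.active = 0       # ActiveHead: adds below this are live

    def add(self, tiles):
        del self.adds[self.active:]   # a new action invalidates any redo
        self.adds.append(tiles)
        self.active += 1

    def undo(self):
        self.active -= 1              # data stays; just move the head

    def redo(self):
        self.active += 1

    def live_tiles(self):
        return [t for add in self.adds[:self.active] for t in add]

g = GreedyList()
g.add(["a1", "a2"])
g.add(["b1"])
g.undo()
```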
Back to jobs and the problem with multiple people building
Alright, so armed with new collections, the work of Jobifying everything continued. After a bunch of wrestling and learning, I got this into shape, and so now the data updates are parallel. Yay!
But of course, there is more.
One thing that has been on my mind is ‘roll back’ and ‘reapply’, and that’s what I’m going to ramble about now.
When you have multiple people changing the board you need to make sure that each change is applied the same on every client to make sure they all end up with the same board. When you only add stuff it’s easy but with Delete the result depends on the order the operations are applied.
To set the order each ‘action’ is sent to one client who is deemed the ‘host’ and they give it a history-id and send it on to the rest of the clients. Great, this gives everything a set order but it also means that you don’t see a change until it’s made that round trip to the host. That kind of delay is unacceptable (it feels awful) so we did the following:
- Send the action to the host
- Apply the change locally immediately
- time passes
- An action arrives from the host
- if the action is the change we already made, we don’t need to do anything.
- if the action is from a different client we:
  - ‘roll back’ our change
  - apply the action from the other client
  - ‘reapply’ our change
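A toy model of the ordering scheme for a single client. This is heavily simplified: real rollback has to restore actual board state, which is where the snapshot pain comes from. Here, keeping pending actions separate models the rollback/reapply as rebuilding in host order:

```python
# Sketch of the host-ordering dance. The "board" is just a list of
# actions; rollback/reapply is modeled by ordering committed actions
# before pending ones. Illustrative only.

class Client:
    def __init__(self):
        self.committed = []   # actions in host order
        self.pending = []     # our actions, applied locally but unconfirmed

    def local_action(self, action):
        self.pending.append(action)   # applied immediately for responsiveness

    def host_action(self, action):
        if self.pending and self.pending[0] == action:
            self.pending.pop(0)       # our own action came back: just commit
        # otherwise: roll back pending, apply the foreign action, reapply
        self.committed.append(action)

    def view(self):
        """What the player sees: committed order, then pending on top."""
        return self.committed + self.pending

c = Client()
c.local_action(("add", "me", 1))
c.host_action(("add", "other", 1))    # someone else's action won the race
c.host_action(("add", "me", 1))       # ours arrives, now committed
```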
This works but necessitates some form of ‘snapshot’ we can roll back to. This has been a pain for a while and something I’ve never been happy with. However, during the week I was looking at the problem of spawning the Unity GameObjects for all these tiles (more on that in the next post) and I spotted a cool bunch of details.
For the rest of the article we will use the term ‘uncommitted’ to refer to actions/tiles we have made but that have not yet been confirmed by the host, and so would have to be rolled back.
- Each client has separate collections for their assets in a zone
- Adds only affect the data for the sending client
- Deletes can affect data for all clients
- Deletes have not only a region in space but also a point-in-history (as a selection may be made before the person hits the delete key)
- An uncommitted change, by definition, can only be on the client that made it
So by making sure that deletes only operate on tiles that have been committed, we don’t need to roll back in this case. We do have to reapply our deletes though, as the incoming actions may have added tiles that one of our selections should have captured. When we reapply a delete, we only apply it to the data of the clients whose actions were committed before ours.
Undo & redo are also simple, as they can only affect asset data for the client that performed the action.
With all this in place, we get to totally remove snapshots. It requires a bunch of tweaks to how changes coming from the network are applied.
This is one of those lovely times where data layout fundamentally changes what has to be done to implement a feature. I find that stuff pretty cool.
This post has got long enough but there is still plenty more to cover from this week so let’s get back to it in the next post.
 We do use instancing, but until Unity’s ECS supports per-instance data we are forced to stick with the classic GameObject approach, which means we do need to spawn separate GameObjects.
 or, of course, spread out across many frames
 A client, in this case, is a single running instance of the TaleSpire executable.
 This will be revisited in the future as that collection type would be very handy. I also didn’t use NativeQueue as ParallelWriter is only for use in IJobParallelFor jobs as far as I can tell.
 Previously we have talked about how we can’t afford to be sending all the tile data across the network so instead, we just send the building ‘actions’ the players perform.
 Using Undo/Redo wouldn’t work in this case due to how deleting tiles placed by other clients works
 If you delete tiles from another client and then Undo that delete the tiles are not placed back in their collection. Otherwise, if they undo and then redo they restore those tiles, which feels a bit odd.
The last two days touched a few different things.
First off I was working on the resource list for the Spaghet scripts. This holds references to Unity objects in the asset like Colliders and Animators, and also the configuration of radial menus and GM requests.
In implementing this I re-learned a terrible rule of Unity’s immediate mode UI: don’t try to make the code ‘nice’. You’ll sink ages into chasing things that look like they would help, only to hit some limitation of the system and have to throw it all away. DON’T DO IT. I’m sure I was meant to have learned this lesson last time, but apparently that wasn’t enough.
Once that was good-enough-for-now™ I started looking into reimplementing the doors/chests/etc using the new scripting system. For now, I’m not worried about improving the graph, as I want to switch to Unity’s node graph elements at some point, so I just wrote the ops by hand. In the process, I added a few more operations to the language: yield, goto, and an op to send a message to a resource from the resource list.
It’s not done yet but after the resource list thing, I was a bit tired of scripts and put it to the side for another day.
On Friday I started off with a planning session where I just try and work out any small details I can across the tasks on my todo list.
One that stood out as needing some work was pathfinding so I’ve started work on that. The goal for the beta is as follows.
When you pick up a creature you should be able to see all the places you could walk to within a given range (determined by the creature)
This means we are looking at a kind of flood-fill where we ‘walk’ something for a certain distance, but what do we walk?
Well, last year I walked an octree of the board itself (we did this for fog of war so the octree was available) and this was hard. This time I want a dedicated graph. I’ll write more about this once I’ve got it working.
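The flood-fill itself is a bounded breadth-first search. A sketch under the assumption of a simple orthogonal grid graph; the real dedicated graph will be more involved (and I'm guessing at the details here):

```python
# Sketch of the planned range query: BFS over walkable cells, stopping
# at the creature's movement range. The grid-graph shape is hypothetical.
from collections import deque

def reachable(walkable, start, max_range):
    """Cells reachable from `start` within `max_range` orthogonal steps."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        d = seen[(x, y)]
        if d == max_range:
            continue                      # out of movement budget
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in walkable and nxt not in seen:
                seen[nxt] = d + 1
                queue.append(nxt)
    return set(seen)

# A 5x5 grid with a small wall; range 2 from the corner.
grid = {(x, y) for x in range(5) for y in range(5)} - {(1, 1), (1, 2)}
cells = reachable(grid, (0, 0), 2)
```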
And that’s the lot for now. It’s been a good week and there will be more of these logs next week.
Until then, peace.
Today I carried on working on the scripting implementation.
Each script has a small section of memory that it will be able to store values to, and my first job of the day was finishing off that data store. A few things were a little difficult to debug as VS2017’s debugger doesn’t show what data is on the other end of a pointer.
After finishing, I found out that VS2019 does have this feature, so I spent a couple of hours fighting to get that upgraded and playing nicely with Unity. If you try this yourself, remember:
- Don’t let VS install Unity again; the version it ships with is old. Search for ‘Unity’ in the components section of the installer and find the “Unity tools for Visual Studio” component
- Open Unity’s package manager and make sure “Visual Studio Code Editor” is up to date
- If VS2019 isn’t appearing in Preferences -> External Tools -> External Script Editor, then click browse and find something similar to C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE\devenv.exe
- If, like me, you only use VS for debugging and have a different editor for programming, you will need to update OmniSharp (or your equivalent) and probably your editor integration too.
- Restart everything :p Just for safety
After all this, I hooked up the scripting system so that the board asset arrays now receive the references into Spaghet’s data store. All is working according to plan there.
The next thing I want to do is look at moving Spaghet into its own package so I can share that between TaleWeaver (the asset prep and modding tool) and TaleSpire itself.
That’s gonna take a little coordination with the rest of the team so during that I’m going to have a look at making the Unity component that lets you connect resources to your scripts. Here is the old (very unfinished) one that held the LUA scripts.
The important part, for now, is the bit marked in red. You need to be able to add resource slots so you can drag in the animator or sound effect you are interested in. Then those resources can be referred to from the script.
Another potentially nicer way to handle it would be to have the slots on the state machine nodes themselves… hmm, actually there might be a nice way to do that, as the nodes are made with Unity’s imgui elements. I’ll look into that tomorrow.
Whatever approach I use for beta, I’ll then need to look for this element when the asset is first loaded in TaleSpire and submit the scripts to Spaghet.
Slowly slowly all these threads are weaving together.
Today I’ve continued working on scripting. My main goal has been to work out the most efficient way to handle the allocation and setup of the private data that each instance of a script is allowed to store data into.
There can be large numbers of tiles made in a single action, so making sure I’m not performing unnecessary copying of data has been important. On the flip side, many creation events are for only a single tile, so whatever we have needs to make sense at that scale too.
As I progressed with this I kept feeling that the native collections I had available, whilst great, were not ideal for this task so I decided to look into how to write my own. I really wanted something chunked like NativeChunkedArray but with
- faster indexing
- no deallocation of backing chunks unless explicitly requested
- an API more focused on being a backing store for the data, rather than on working element by element
The official documentation is both limited and also kind of out of date so I’d recommend starting with this fantastic article series by Jackson Dunstan. In fact if you are considering doing any work with Unity’s new DOTS systems, trawl that site, it’s a goldmine.
Pulling apart his examples and getting to grips with Unity’s safety system took the rest of the evening, but it was definitely worth it. I’ve now got the base of a much more focused chunked store that I can ensure provides exactly what the scripting system needs.
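A toy version of the indexing scheme the chunked store uses. This is hypothetical Python standing in for the native container; the key point is that power-of-two chunks turn indexing into a shift and a mask:

```python
# Sketch of the chunked-store idea: power-of-two chunks make indexing a
# shift and a mask, and backing chunks are never freed unless explicitly
# requested. Hypothetical stand-in for the native container.

CHUNK_SHIFT = 4                      # 16 elements per chunk
CHUNK_SIZE = 1 << CHUNK_SHIFT
CHUNK_MASK = CHUNK_SIZE - 1

class ChunkedStore:
    def __init__(self):
        self.chunks = []

    def ensure(self, length):
        """Grow the backing chunks to hold at least `length` elements."""
        while len(self.chunks) * CHUNK_SIZE < length:
            self.chunks.append([None] * CHUNK_SIZE)

    def __setitem__(self, i, value):
        self.chunks[i >> CHUNK_SHIFT][i & CHUNK_MASK] = value

    def __getitem__(self, i):
        return self.chunks[i >> CHUNK_SHIFT][i & CHUNK_MASK]

store = ChunkedStore()
store.ensure(40)                     # allocates 3 chunks of 16
store[37] = "script-state"
```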
That’s all for tonight.
Update time! Progress has been very good.
@Ree has kept on hammering away at the building tools from the last update. He’s also been handling lots of behind-the-scenes organizational stuff which whilst not exciting to blog about is as critical to TaleSpire happening as anything else we do.
Jason has been chatting to loads of the backers who pledged at the ‘Help design a ___’ levels and sculpting is in full swing. Very exciting!
I’ve finally broken through the wall of fiddly details that was plaguing my board sync implementation and it’s looking pretty good now. This has freed me up to look at some other tasks that are on my todo list so I’m gonna ramble about that for a bit in this post.
First off I went back to Fog of War. Basically, the task is this:
- Take a 16x16x16 grid
- In every cube write ‘true’ if there is meant to be fog there and ‘false’ if not
- Now take this info and make a 3D mesh that contains all the cells marked ‘true’
And I wanted to make sure we had step 3 worked out.
Now there are a couple of points to keep in mind:
- The mesh we need to make needs to conform fairly closely to the grid as we don’t want to see glimpses of anything we shouldn’t
- We don’t need to make this look like fog; we just need a mesh we can start working with. We can do all kinds of polygonal and shader effects on top of a simple mesh that will look much more fog-like.
As a great example check out this tweet from @Ed_dV !
Inspired by Sea of Thieves and Sky, I've been playing with opaque polygonal clouds - sphere meshes that have their normals softened and distorted via command buffer. Still lots to work out, but progress is being made! No animation yet, just aiming for a fluffy render first. pic.twitter.com/WVyk7X5p5b— Edward del Villar (@Ed_dV) September 4, 2019
Those clouds are spheres, with lots of magic on top (if you watch the whole thing it shows the spheres without the magic).
Given those two points, I decided a good source of wisdom would be the kinds of algorithms that Minecraft-like games often use. Luckily for me Mikola Lysenko of 0fps.net has written some amazing articles on these techniques along with WebGL demos and source code, what a star! This gave me a huge boost and along with a few implementation tips from other sources I was able to put together a ‘Greedy Mesher’ (the name of one of the techniques) in Unity. Now, this next picture is not meant to be clouds, it was just a test to make sure it was working.
For the coders out there, I also made this implementation using Unity’s job system which allows me to run the mesh generation jobs in parallel across all cores. It’s pretty speedy considering how little effort was required.
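For the curious, here is a minimal 2D greedy mesher in the spirit of those articles. It is a simplified illustration only; the real thing works per slice over the 3D grid (and, in my case, inside jobs):

```python
# Minimal 2D greedy mesher: merge runs of filled cells into maximal
# rectangles, row by row, so neighbouring cells share one quad instead
# of emitting one quad each.

def greedy_mesh(grid):
    """grid: list of rows of bools. Returns rects as (x, y, w, h)."""
    h, w = len(grid), len(grid[0])
    used = [[False] * w for _ in range(h)]
    rects = []
    for y in range(h):
        for x in range(w):
            if not grid[y][x] or used[y][x]:
                continue
            # grow the run rightwards as far as possible
            rw = 1
            while x + rw < w and grid[y][x + rw] and not used[y][x + rw]:
                rw += 1
            # then grow downwards while the whole run stays filled
            rh = 1
            while y + rh < h and all(grid[y + rh][x + i] and not used[y + rh][x + i]
                                     for i in range(rw)):
                rh += 1
            for j in range(rh):
                for i in range(rw):
                    used[y + j][x + i] = True
            rects.append((x, y, rw, rh))
    return rects

quads = greedy_mesh([[True, True], [True, True]])  # one 2x2 rectangle
```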
That is all on the subject of fog for now but we will be back with more in future.
Next on my list of concerns was interactable Tiles and modding.
We have a bunch of interactable tiles in TaleSpire and this is only going to increase as we open up the ability for the community to make them.
A real pain point when you make a user-driven content game like TaleSpire is that you have no way to know when or where someone will suddenly give your game tonnes of extra stuff to do.
For example, take that side-table in the gif above. There is nothing to stop you dragging out a whole load of tables, and the numbers get large FAST. Remember that if you drag out a square with 32 tiles down one side then that is over 1000 tiles that your game suddenly has to handle. They all need to appear on all the other players’ machines quickly and they all need to be interactable immediately.
That means all 1000+ of them are running their own little scripts. How do you ensure that they don’t start crippling your game?
Now, we naturally knew this was coming, and previously we had been making small, fast components that would then be wired up using a Lua script (that’s another programming language, for those not into this stuff :] ).
It worked but I was still concerned about how it would scale.
There was another thing. Like I said, we were using Lua in a weird way, as a way of setting stuff up. I started thinking that this might be the worst of all worlds for users: for those who like Lua, there are lots of things they would be told not to do, and for those who hate Lua, well, they hate it :p
Were there other problems? You’re damn right there were!
As mentioned before, we need to keep all of these tiles in sync on everyone’s machines, so the idea of scripts just being able to fire off messages whenever they liked was a nightmare; it could really make the game unstable if done wrong.
Now, this sounds tautological but there are two kinds of state for a tile: State that matters to the story and state that doesn’t. For example, whether a fire is lit or not matters, the exact positions of every smoke particle does not. As long as it roughly conforms all is well.
So it would be good if we could separate these kinds of scripts from each other, so we have a shot at being able to enforce sensible behavior over the network.
So after pages and pages of notes and a lot of coffee, I decided it would almost be worth making our own compiler that suited our needs.
Now, this bit gets technical so feel free to skip to the row of asterisks!
I wanted something that:
- has a simple execution model
- is focused on only the tasks we need
- runs on multiple cores
- uses no GC
- is reasonably fast
Here’s the approach I went for:
0. I decided to start with only 4 data types: int, float, int3, and float3
1. I use a node graph as the code representation. I’m currently using the excellent XNode but if/when Unity’s new graph UI stabilizes I will probably move to that.
I added code to detect cycles and compute the correct types at each node.
2. Walk the graph and generate a tree of IR nodes. This is mainly for convenience but in the conversion, we also pick correctly typed IR nodes in a few specific places.
3. Walk the IR nodes to compute the layout for the stack
4. Walk it again and generate a bytecode that represents the program
5. Do a little cleanup pass and kick the result out as a flat instruction array
6. Write a Job that takes the instruction array, an array of private state, and a few other bits, and runs our little bytecode in parallel.
7. Marvel at the power of coffee
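To make step 6 a bit more concrete, here is a toy Python flavor of the idea (the opcode names and encoding here are made up for this sketch, not Spaghet’s actual instruction set): a flat bytecode array interpreted by a tiny stack machine, run once per tile against that tile’s private state. In Unity, this interpreter loop would live inside a job and run in parallel across cores.

```python
# A toy stack-machine bytecode interpreter. Opcodes and encoding are
# invented for this sketch; the real instruction set differs.

PUSH_CONST, LOAD_STATE, STORE_STATE, ADD, MUL = range(5)

def run(bytecode, state):
    """Interpret one program against one tile's private float state."""
    stack = []
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH_CONST:
            pc += 1
            stack.append(bytecode[pc])
        elif op == LOAD_STATE:
            pc += 1
            stack.append(state[bytecode[pc]])
        elif op == STORE_STATE:
            pc += 1
            state[bytecode[pc]] = stack.pop()
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        pc += 1

# state[0] = state[0] * 2.0 + 1.0, for each tile independently
program = [LOAD_STATE, 0, PUSH_CONST, 2.0, MUL, PUSH_CONST, 1.0, ADD,
           STORE_STATE, 0]
tiles = [[1.0], [3.0]]
for tile_state in tiles:
    run(program, tile_state)
print(tiles)  # [[3.0], [7.0]]
```

Because the program is just a flat array of numbers and each tile only touches its own state, the per-tile runs are trivially independent, which is exactly what a parallel-for job wants.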
It took 18 hours to get the first version up and running and the results were promising.
Here is a picture of some nonsense ‘code’
The graph above (whilst being gibberish) can be run for 10000 tiles in around 2ms, which is way slower than it will be but it’s acceptable for a first pass. It told me the approach was worth more work so I took Friday to flesh it out a bit.
Oh, and the language is called Spaghet!
With that we have the kind of script that can be used for the ‘not story critical stuff’ but we still need the other scripts too.
To that end, I’m now working on how to script the state machines that will run the important stuff. Here is a little picture of the prototype I’m currently working on:
Naturally, the look of the graph and the nodes available will be improved too.
Soooo yup, it’s been a good, busy week. I’m going to be working on this and all the code that holds it together all this week. I’m hoping to have the major plumbing done by Wednesday though.
This weekend Ree and I also took another look at the tile cutaway effect. This one is super important not just to reveal the part of the building you are looking into, but also to cut down walls that would get in the way of controlling your creatures.
Please note: All the following screenshots are not from TaleSpire or even Unity, they are from some test code I was using to play with cutaway effects. It’ll look much nicer in TaleSpire :p
We had prototyped this effect before. Here is a sphere we have chopped the top of using the technique.
The basic approach was that we discard all the fragments that are above the cutaway and then texture the backfaces as if they were positioned at the flat cutaway plane.
However, we saw an issue when the meshes of the models were intersecting (we also saw this in some of the assets on the Unity store for drawing cutaways).
So what can we do? Well if we look at this with the power of MS Paint this is what we have.
So let’s draw some lines from an imaginary camera point to a few points across the cutaway spheres. Then let’s make a score for each line: if the line crosses a backface you add 1 point; for a frontface you subtract 1.
Neat! Notice how, if the score is above zero, it means you are seeing through the cutaway. Also note that because it’s just addition and subtraction, we can do it in any order.
We can perform this check for every pixel really easily. Let’s render all the backfaces first and additively blend all the fragments. Then let’s render all the front faces into the same buffer subtractively blending.
Then we have an image which is the mask for the cutaway area.
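Here is a 1D Python sketch of the counting trick as I understand it (plain Python standing in for the shader passes): along a single view ray, each cutaway sphere shows up as an enter/exit interval. Which faces get counted depends on how you depth-test the blended passes; in this sketch I count the faces that lie behind the point the camera actually sees, which makes “score above zero” line up with “the visible point is on a cut surface”.

```python
# A 1D sketch of the cutaway mask trick (not the actual shader).
# Crossing a backface (an 'exit') scores +1, a frontface (an 'enter')
# scores -1. Order doesn't matter, which is why additive blending works.

def cutaway_score(intervals, depth):
    """Score for a ray that travels 'depth' units before hitting scene
    geometry. intervals are (enter, exit) distances of cutaway volumes."""
    score = 0
    for enter, exit_ in intervals:
        if exit_ > depth:   # backface behind the visible point: +1
            score += 1
        if enter > depth:   # frontface behind the visible point: -1
            score -= 1
    return score

# Two intersecting spheres along the ray, as distance intervals.
spheres = [(2.0, 6.0), (5.0, 9.0)]

print(cutaway_score(spheres, 1.0))   # 0 -> in front of everything
print(cutaway_score(spheres, 7.0))   # 1 -> inside the cutaway: mask it
print(cutaway_score(spheres, 10.0))  # 0 -> behind the whole cutaway
```

Note that the overlap region scores 2, which is still above zero, and that is exactly why this handles intersecting meshes where the naive backface-capping approach broke.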
At this point, we have effectively solved the original problem. We can now render the tiles (throwing away the stuff above the cutaway) and then render the cutaway using the mask we have made.
Note: you could definitely do some nice stuff with stencil buffers to speed all this up.
And there it is, a simple cutaway handling intersections.
Now this may be the version we use in TaleSpire or it may not, we still need to work on it some more and see how it performs once it’s in. But either way, it’s neat :)
Alrighty, that’s all for now.
Recently I’ve been spending most of my time working on undo/redo support and general data model work inside TaleSpire. Because of that, I thought I’d take a post to ramble a little about undo/redo and how a ‘simple’ feature can grow when you have a few constraints.
This is not the exact way we are doing things in TaleSpire, but it’s an exploration of an idea I went through whilst looking at the feature.
In this post, I’m only going to talk about placing/deleting tiles rather than creatures to keep things simpler.
Chapter 1 - Building basics
So lets first have a look at a little building:
Players can do a few things:
- They can place a single tile
- They can drag to place multiple tiles
- They can delete a single tile
- They can select and then delete multiple tiles
- They can copy and paste slabs of tiles
We are going to refer to placing tiles as an add operation and deleting tiles as a delete operation.
There is no moving of tiles in TaleSpire; anything that looks like a move is actually a delete operation followed by an add operation. Cut and paste work like this too. However, for now, we will ignore copy/paste and just look at the fundamental operations.
What can we say we know about these operations?
- The building tools should ensure that the new tile/s do not intersect any existing geometry
- The building tools should ensure that no two tiles in a drag intersect each other
For now, we will just leave these points here but we will refer back to them later. One thing of note though is that the ‘does not intersect’ property is something we want to try and maintain. Due to undo/redo it’s pretty easy for people to violate this but the rest of the system should try to maintain this as much as possible.
All right, so our building tools let the player add and delete tiles, but we have to maintain this state in memory. Our world (the board in TaleSpire) can be far too big to fit in memory all at once, so we divide the world up into zones which we can load in and out on demand. Because of this, we will want zones to be as independent as possible, and so each zone is going to maintain its own tile data.
A delete of a single tile seems like a simple place to start; it should only affect one zone, right? Well, it depends on the size of the tile. Let’s say our zones are 16x16x16 units in size and that the minimum size of a tile is 1x0.25x1 (where the 0.25 is the vertical component). Well then, if a single tile is 40x1x40 it could be overlapping nine different zones. Even a humble 2x2x2 tile can overlap eight zones if it is in the corner. For the sake of sanity let us say that a tile can be no bigger than half the size of the zone (8x8x8); this way it can at most overlap eight zones, just like the 2x2x2 tile. We will also leave this issue here for now as we will have enough to think about. However, keep in mind how many little things we are ignoring for the sake of the readability of the article; all of these will need to be resolved before your game ships :p
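The zone-overlap arithmetic above is easy to sketch (the helper below is hypothetical, using the 16-unit zone size from the text): floor-divide the corners of the tile’s axis-aligned bounding box by the zone size and enumerate everything in between.

```python
# Which 16x16x16 zones does an axis-aligned box touch? Illustrative
# sketch; zone size taken from the text.
from math import floor

ZONE = 16

def zones_touched(min_corner, size):
    """Zone coordinates overlapped by a box at min_corner with size.
    The tiny epsilon stops a box ending exactly on a zone boundary
    from claiming the next zone over."""
    lo = [floor(c / ZONE) for c in min_corner]
    hi = [floor((c + s - 1e-9) / ZONE) for c, s in zip(min_corner, size)]
    return [(x, y, z)
            for x in range(lo[0], hi[0] + 1)
            for y in range(lo[1], hi[1] + 1)
            for z in range(lo[2], hi[2] + 1)]

# A humble 2x2x2 tile straddling a zone corner touches 8 zones...
print(len(zones_touched((15.0, 15.0, 15.0), (2.0, 2.0, 2.0))))  # 8
# ...and a 40x1x40 monster can touch 3x1x3 = 9 (or more, off-grid).
print(len(zones_touched((0.0, 0.0, 0.0), (40.0, 1.0, 40.0))))   # 9
```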
Now, adding tiles! When tiles are placed we need to store data about them: things like the kind of tile it is, the position, rotation, etc. We will just talk about a tile’s data as if it’s a single thing for simplicity.
Let’s say player0 drags out 3 tiles:
tile-data = [add(* * *)]
Here, tile-data is some linear collection holding the tile data, and add is an object holding three tiles, represented by asterisks (*) in this case.
Ok, so they add 2 more tiles:
tile-data = [add(* * *), add(* *)]
Neat. Now let’s look at undo. Well, that’s pretty simple right now, as we have been storing our tiles in the order they happened; let’s just drop that most recent add.
tile-data = [add(* * *)]
Hmm, but now we need to support redo. It’s a shame we just threw away the data on the undo. What if we didn’t? What if we had a little store of things that have been ‘undone’? That means after the undo the data would look like this:
tile-data = [add(* * *)]
undone-data = [add(* *)]
and now redo is simple too. We pop the data off of the undone-data stack and push it back onto tile-data:
tile-data = [add(* * *), add(* *)]
undone-data = []
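The two-stack version described so far fits in a few lines of Python (a bare-bones illustrative sketch, not TaleSpire’s code):

```python
# The two-stack undo/redo: each entry is one 'add' chunk of tiles.

class History:
    def __init__(self):
        self.tile_data = []    # chunks in the order they happened
        self.undone_data = []  # chunks that have been undone

    def add(self, chunk):
        self.tile_data.append(chunk)
        self.undone_data.clear()  # a fresh action invalidates redo

    def undo(self):
        if self.tile_data:
            self.undone_data.append(self.tile_data.pop())

    def redo(self):
        if self.undone_data:
            self.tile_data.append(self.undone_data.pop())

h = History()
h.add(["*", "*", "*"])
h.add(["*", "*"])
h.undo()
print(h.tile_data)  # [['*', '*', '*']]
h.redo()
print(h.tile_data)  # [['*', '*', '*'], ['*', '*']]
```

(The “a fresh action invalidates redo” line is a standard undo-system policy I’ve assumed here; the post doesn’t spell it out.)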
So far so groovy. But now let’s think about deleting tiles. Well, that doesn’t map to our data layout as neatly. Our tile data is organized in chunks that are ordered by when they happened in history, not by where they are in the world. Our zones do offer a limited amount of subdivision of the world, so currently this means we need to visit each zone the selection intersects and then scan through all the tiles in those zones to see which are in the selection.
So, about those zones… how many are going to be involved? Let’s look at this 2D version.
Notice how, as the size increases, the number of zones we need to consider goes up by width x height, and it’s more painful in 3D as it increases by width x height x depth, so roughly cubing with respect to edge length. This is going to be quite a lot of zones to check. But how many tiles can a zone hold? Well, a zone is 16x16x16, and a tile can be as small as 1x0.25x1, so that is 16x64x16 = 16,384 tiles… ugh, this is getting to be a lot of work.
But a ray of hope appears: notice the hatched areas in the diagram above. Those are zones that are completely enclosed by the selection. Naturally, this means a delete operation will delete all the tiles within, so we have an opportunity for a big optimization. For any zone that is completely enclosed we just remove all tiles, and we only compare the selection with the tiles themselves for zones that are only partially intersected by the selection. In general, this means the outer ‘surface’ of zones. The number of zones on the surface of the selection grows roughly in the order of (x^3 - max(x-2, 0)^3), which is much more manageable.
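We can sanity-check that growth rate with a few numbers:

```python
# With an edge of x zones, the selection contains x**3 zones in total,
# but only the outer shell needs per-tile checks.

def shell_zones(x):
    return x**3 - max(x - 2, 0)**3

for x in (1, 2, 4, 8, 16):
    print(x, x**3, shell_zones(x))
```

At x = 16 that is 1,352 shell zones out of 4,096 total, and the ratio keeps shrinking as the selection grows, since the shell grows quadratically while the interior grows cubically.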
The situation has improved then, but we still need to apply the operation to all those zones, and quickly. Luckily you have been watching the recent developments in your game engine of choice and have learned that they have added a neat job system which can spread work over multiple cores, and a specialized compiler which can optimize the heck out of your code. “Yay” you say, but there is one limitation: all your data needs to be blittable types, or arrays of such, so our previous ‘list of chunks of tiles’ approach isn’t going to work. Furthermore, it would be neat if the tile data was packed together in memory so you could write jobs that traverse the whole set of tiles rather than jumping around memory to separate chunks. This becomes important as you watch your players and notice that, whilst they do drag out large chunks of tiles, they also place a lot of tiles one at a time.
Ok, so let’s look at our old data layout:
tile-data = [add(* * *), add(* *)]
undone-data = []
Now let’s make this a single array, and instead of undo removing from one collection and adding to another, we will have an index which we will call the active-head. It will point to the most recent active chunk of tiles:

                         v
tile-data = [add(* * *), add(* *)]

The active-head is represented by the v above tile-data; in code it’ll just be an integer.
With that change, undo looks like this:

             v
tile-data = [add(* * *), add(* *)]
and redo does this:

                         v
tile-data = [add(* * *), add(* *)]
Cool, but now to remove those extra objects, let’s flatten the array and move the layout information to a separate array:

                     v
tile-data = [* * * * *]

            v
layout = [3 2]
So far so good. Now our layout tells us how many tiles are in each chunk, and tile-data stores all the tiles packed together.
Undo now looks like this…

                 v
tile-data = [* * * * *]

          v
layout = [3 2]
and redo is still the reverse. This means undo and redo are still as simple as changing an integer from the data-model side.
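Here is the flattened layout with an active-head, sketched in Python (again illustrative, not the real data model): all tiles packed in one array, layout recording chunk sizes, and undo/redo only ever moving one integer.

```python
# Flat tile array + chunk layout + active-head undo/redo.

class FlatHistory:
    def __init__(self):
        self.tile_data = []   # all tiles, packed in history order
        self.layout = []      # number of tiles in each chunk
        self.active_head = 0  # number of currently-active chunks

    def add(self, tiles):
        # One possible policy (assumed here): a fresh add discards any
        # undone chunks, since they can no longer be redone.
        active_tiles = sum(self.layout[:self.active_head])
        del self.tile_data[active_tiles:]
        del self.layout[self.active_head:]
        self.tile_data.extend(tiles)
        self.layout.append(len(tiles))
        self.active_head += 1

    def undo(self):
        self.active_head = max(0, self.active_head - 1)

    def redo(self):
        self.active_head = min(len(self.layout), self.active_head + 1)

    def active(self):
        return self.tile_data[:sum(self.layout[:self.active_head])]

h = FlatHistory()
h.add(["*"] * 3)
h.add(["*"] * 2)
h.undo()
print(h.active())       # ['*', '*', '*']
h.redo()
print(len(h.active()))  # 5
```

Note that the tile data itself never moves on undo or redo; only the integer does, which is exactly the property we want for blittable, job-friendly data.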
Delete is different though. Delete is based on a region on the board, so we need to check each tile to see if it intersects with that region and, if it does, remove it from the tile-data array. Undo naturally requires us to put that data back. Now, from a data angle, that is annoying as it could leave ‘holes’ in our array, as in the following diagram (deleted tiles represented by x):

                     v
tile-data = [x * x x *]

            v
layout = [3 2]
This is no good as now we have to handle those holes. We could have each tile carry a flag that says whether it’s alive or dead. We would then need to check that any time we scan over this array. This is ok but, unless we undo the delete, we still have memory being wasted as it’s occupied by data for tiles that have been deleted. So maybe we reuse the gaps? Sounds great, but we can’t fill in the gaps with tiles from newer operations, as otherwise we lose the property that tile-data is in ‘history order’, and thus lose the ability for undo/redo to be represented by the active-head.
So if we don’t want holes but can’t fill them then another option is to compact the array as we delete, so whenever tile data is deleted the tile data to the right is shifted left to fill in the gaps. This requires more copying during the delete but let’s assume we profiled it and it proved to be the best option. This is what we will assume for the rest of the article.
Let’s now look at the above delete operation with compaction:

               v
tile-data = [* *]

            v
layout = [1 1]
Neat! But we should note we have complicated undo for ourselves. If we undo a delete we will have to ‘make room’ in tile-data for the old tiles to be copied back into. There is one blessing though: we only care that chunks are in history order, not the tiles within the chunks. That means we don’t need to make room in the same places as we had the tiles originally…

                     v
tile-data = [_ * _ _ *]

            v
layout = [3 2]
Rather, we can make room at the end of each chunk…

                     v
tile-data = [* _ _ * _]

            v
layout = [3 2]
And still keep all the undo/redo benefits we had previously with the active-head approach.
This also simplifies redo of a delete, can you spot how?
Originally we had to scan through the tiles to find the ones to delete. Redo requires us to delete those same tiles again but we know exactly where they are now, they are at the end of each chunk.
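The delete-with-compaction and the ‘make room at the end of each chunk’ undo can be sketched like this (illustrative Python, not the real data model; tiles are letters so we can see where they end up):

```python
# Delete with compaction, plus an undo that restores tiles at the end
# of their original chunks. The key trick: restored tiles only need to
# land in the right chunk, not the exact slot, so redo knows the
# doomed tiles sit exactly at each chunk's end.

def delete(tile_data, layout, doomed):
    """Remove tiles in 'doomed', compacting. Returns per-chunk
    removals so the delete can be undone (and redone)."""
    removed = []   # (chunk_index, [tiles]) pairs
    new_data, start = [], 0
    for i, count in enumerate(layout):
        chunk = tile_data[start:start + count]
        start += count
        kept = [t for t in chunk if t not in doomed]
        gone = [t for t in chunk if t in doomed]
        if gone:
            removed.append((i, gone))
        layout[i] = len(kept)
        new_data.extend(kept)
    tile_data[:] = new_data
    return removed

def undo_delete(tile_data, layout, removed):
    """Reinsert removed tiles at the end of their original chunks."""
    for chunk_index, gone in removed:
        insert_at = sum(layout[:chunk_index + 1])
        tile_data[insert_at:insert_at] = gone
        layout[chunk_index] += len(gone)

tile_data = ["a", "b", "c", "d", "e"]
layout = [3, 2]
removed = delete(tile_data, layout, {"a", "c", "d"})
print(tile_data, layout)  # ['b', 'e'] [1, 1]
undo_delete(tile_data, layout, removed)
print(tile_data, layout)  # ['b', 'a', 'c', 'e', 'd'] [3, 2]
```

Notice the restored order within each chunk differs from the original (e and d swapped places), which is fine: only the chunk order matters for the active-head.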
Chapter 2 - Multiple players and a quandary
Now things get ‘fun’. We can have up to 16 players on a board at a single time, and all of them could be given the permissions to build. That means we need to keep everything in sync on all machines whilst making sure everything feels instant.
Let’s get one thing out of the way quickly: order matters. By this I mean the obvious quality that an add followed by a delete can give a different result from a delete followed by an add. From this we know we need to apply changes in a fixed order. For us, that means when we build we will send a request to a ‘host’, which will decide what gets applied in what order and tell the other players’ games (called clients from now on). Note that we also need to see a change the moment it happens, so the order will be:
- Player tries to make a change
- The local client applies the change immediately and sends a change request to the host
- The local client listens for change acknowledgments from the host
- If the acknowledgment is for the thing we just built then we are done
- If the acknowledgment is for another client’s change then we undo our change, apply their change, and then reapply our change
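The steps above can be modeled as a toy (a hugely simplified sketch, nothing like the real networking code): the board is just a list of changes applied in order, we apply our own change optimistically, and if the host orders someone else’s change first, we rewind ours, apply theirs, and replay ours.

```python
# Toy model of optimistic local apply + host-ordered reconciliation.

class Client:
    def __init__(self):
        self.applied = []   # changes currently applied, in order
        self.pending = []   # our changes awaiting acknowledgment

    def local_change(self, change):
        self.applied.append(change)  # apply immediately: feels instant
        self.pending.append(change)  # ...but the host has the final say

    def on_ack(self, change):
        if self.pending and self.pending[0] == change:
            self.pending.pop(0)      # host agreed with our ordering
            return
        # someone else's change won: undo ours, apply theirs, redo ours
        for _ in self.pending:
            self.applied.pop()
        self.applied.append(change)
        self.applied.extend(self.pending)

c = Client()
c.local_change("p0-add")
c.on_ack("p1-add")   # p1's change was ordered first by the host
print(c.applied)     # ['p1-add', 'p0-add']
c.on_ack("p0-add")   # now ours is confirmed
print(c.applied)     # ['p1-add', 'p0-add']
```

In the real thing ‘undo’ and ‘reapply’ are operations on the tile data rather than list pops, but the shape of the dance is the same.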
Now that that is very briefly covered, we will assume that things are applied in the same order on each client and ignore that aspect of the problem for the rest of the article.
To keep everything in sync we need to be able to send information about what is being done by each player. We already know that our selections could be very large, so we don’t want to have to send information about every specific tile that was modified; instead we will send just what happened, e.g. ‘player 0 added 30 wall tiles in this area’. This is ok for us as we are already assuming a fixed order of operations, so we can assume that every player has the board in the same state when they apply the change.
Multiple players building also gets interesting because each player has their own undo/redo history. This is very important, as shared histories get very annoying very quickly. However, this messes with our little data layout from before. In the following layout I have changed out the asterisks for a designation that tells us who added the tile; this is only for our benefit in this article, it wouldn’t necessarily be encoded in the actual data like this.
                            v
tile-data = [p0 p0 p1 p1 p0 p0]

              v
layout = [2 2 2]
So three chunks of tiles have been added. First player0 (p0) added 2 tiles, then player1 (p1) added 2, and then p0 added 2 more.
Now let’s say p1 hits undo. That means the middle two tiles need to be undone, but we can’t do this just by shifting the active-head.
Here we face two choices: either we drop the active-head approach, or we keep it and find a way around the above issue. Neither way is wrong, and both require changes to how the data will be stored. For the sake of the article, I am just going to continue with the active-head approach so we can see how it has to be modified to keep its nice properties in the face of multiple players.
Chapter 3 - Multiple Players with the active-head approach
OK, so let’s say we like the way we can just change a single integer to represent undo/redo using the active-head approach. What would be needed to keep this working with multiple players?
Well, each player could have their own tile and layout data:

                    v
p0-data = [p0 p0 p0 p0]

               v
p0-layout = [2 2]

              v
p1-data = [p1 p1]

             v
p1-layout = [2]
This lets us regain the quality that undo for p1 only affects p1’s tiles.
From now on, let us say that if a tile is in p0-data then player0 ‘owns’ that tile, and if the tile is in p1-data then player1 ‘owns’ that tile. With that little bit of terminology, we can look at delete again.
Delete is based on a selection of tiles on the board; it does not care about who ‘owns’ the tile behind the scenes. If the tile is in the selection, it will be deleted. That means that when we scan the data for tiles to delete, we need to scan each player’s tile array.
Remember how undo of a delete needed to restore tiles to where they came from? Well, what do we do with tiles that we deleted from other players? It turns out that adding them back to the other players’ chunks (and thus their history) results in behavior that is fairly confusing to play with, as the result of undo or redo then depends, to an extent, on the current state of another player’s undo or redo. How robust a game feels depends a lot on the degree to which a player’s expectations of game behavior match reality, and so it’s sometimes better to have a simpler behavior even if a more correct but complicated option is available. I’m going to claim this is such a case without trying to prove it here :)
The result of this is that we don’t want to try to push undone deletes back into the data arrays of other players. So where do we put them? How about at the end of our own data section? This is a bad idea as it breaks the active-head approach again. So for us to continue using this approach, we now need an additional store per player for undone deletes.
                    v
p0-data = [p0 p0 p0 p0]

               v
p0-layout = [2 2]

p0-deletes = []

              v
p1-data = [p1 p1]

             v
p1-layout = [2]

p1-deletes = []
The addition of more per-player data probably feels annoying (it did to me), but one upside is that this array can use the same active-head tricks that we have used for the player’s data array, with a slight difference: the tiles exist in-game only if they are to the right of the active-head. In this case:
- a delete of tiles owned by other players pushes them onto the player’s deletes array and advances the active-head
- an undo of the delete shifts the active-head to the left, the things to the right are now active in-game
- a redo of the delete shifts the active-head right again
This might feel like it’s all getting a bit out of hand. That is a fair conclusion to make! This is why I said at the beginning this writeup is just an exercise in following an idea through to where it leads rather than making a call on the best approach. I hope you have some thoughts on this and also a feeling for where this kind of feature can get complicated.
I have left out a bunch of things, like “how do we maintain the mapping between the tile data and my game engine’s game objects when our data can only contain blittable types?” and “how long can undo/redo histories be?”. They are fun in themselves, so I do recommend considering them if you decide to try and come up with your own schemes.
 For example, imagine player0 placing a block of tiles, then player0 deleting those tiles, then player1 placing some tiles in that (now empty) spot. Finally, player0 undoes their delete and suddenly player1’s tiles intersect with player0’s.
 This means we are ignoring both the fields within and also the various possible layouts for those fields.
 We could of course store the ‘bounds’ (a 3d region) of the tiles organized in a way that makes spatial queries easier, and then have those contain indexes back to the tiles they refer to. This will mean sorting/maintaining this structure as tiles are added/deleted, so it’s not a free lunch but you will want to look at this if you are building this yourself.
 you still need to handle the relationship between the data and actual GameObjects or entities (depending on the system you are using)
 remember that building only happens as fast as humans can click, whereas some things might be happening every frame (60 times a second). It can prove better to prioritize the more common case over the less frequent one even if it does result in a slight drop in performance for the less common case.