Hey folks, I was meant to write this in the morning, but code is distracting :P
Yesterday I worked on dice physics. I had been concerned by the behavior of dice and wanted to find a good path forward.
To be able to see what was going on, I made this test scene. I can reset the dice by clicking the reset ‘button.’
This is one place where Unity absolutely excels. Being able to knock up tests and tooling in minutes so often can help you make headway on otherwise stubborn issues.
There are a few points that I find interesting:
DotsPhysics and ClassicPhysics do not line up in behavior. This is really important, as it means I shouldn’t expect my conversion routines to give the same results as ClassicPhysics: mine are directly based on the ones built into DotsPhysics.
The BouncePhysics wrapper around DotsPhysics lines up with the behavior of DotsPhysics. In that, the way they bounce and come to rest is similar. This lends support to the point above and gives some evidence that the conversion is ok.
DotsPhysics falls faster. This is because they don’t handle time properly, so faster framerate means faster simulation.
Mine falls at the same speed as ClassicPhysics. This is very important. It suggests my fixed-timestep code and use of Unity’s simulations values (like gravity) are in the right ballpark.
Once I realized that DotsPhysics doesn’t give the same bounce, drag, etc, for the same input values, I felt comfortable playing with values to find something that felt right. This clip is not final behavior but shows how a couple of minutes of tweaking can help.
Afterward, I set up some ramps and D12s, so we had a decent place to start testing when we sit down to dial this in for real.
I also saw that my interpolation code was incorrect. I found some issues, but I was struggling to match all of Unity’s behavior. I then noticed that we don’t use interpolation in the current build and so this stopped being a priority. However, it still took me a couple of hours to quit playing with it. It’s so tempting to keep going when you know you have the math right, and it’s the surrounding plumbing that is the issue.
Anyhoo that was that. Today I’ve been working on the character controller again, results are promising, but I’ll leave that until tomorrow.
With lights under control, I’ve turned my evil eye back to physics. Since I replaced the physics engine, gm-blocks and hide-volumes have not been working. This was expected, but it was time to fix them.
For sanity, I’m going to call Unity’s old physics engine ClassicPhysics, their new one DotsPhysics, and our wrapper BouncePhysics.
Hide volumes make use of the fact that, in ClassicPhysics, we could resize colliders on the fly. We needed something similar in our new system. First up was a different problem, however. When we convert Unity’s colliders to the form required by DotsPhysics, we remove the components. This means there isn’t an obvious place for the programmer to set the values. This meant BouncePhysics needed its own version of these components.
I knocked together Sphere, Cylinder, Capsule, and Box collider components. I modified the BouncePhysicsBody to walk the hierarchy, converting the ClassicPhysics components to their BouncePhysics equivalents, and registering all these new colliders.
During this, I noticed some issues in how my code handled baking of scale when the collider is subject to multiple transformations. This sucked for a good few hours, but I was able to find code that showed how to do it properly, and I got it fixed.
With that done, I was able to hook up the hide volumes and get them working again \o/
Next, gm-blocks. There I found an interesting bug when deselecting them. First, the gm-block would be destroyed, and as that happened, it would unregister itself from BouncePhysics. The issue arose because, in DotsPhysics, you reset and pass all the RigidBodys every frame. This means that although I had unregistered the object from BouncePhysics, the RigidBody was still in the DotsPhysics collision world. I can’t just remove that one body (the only option is Reset, which clears everything). Instead, I now set a flag inside the RigidBody to state that it is unregistered and then modified the collision queries to check for this flag when collecting results. This all takes place after the updates to motion have been calculated, so there is no risk of something bouncing off this unregistered body before the next frame, and by then, it will have been cleaned up.
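The flag-and-filter idea above can be sketched in a few lines. This is only an illustration in Python, not the real code: `CollisionWorld`, `PhysicsWrapper`, and every method name here are made up for the example, and the "query" just returns every body rather than doing real collision work.

```python
from dataclasses import dataclass


@dataclass
class RigidBody:
    body_id: int
    unregistered: bool = False  # set when the owner has been destroyed


class CollisionWorld:
    """Hypothetical stand-in for a world that can only be rebuilt wholesale."""

    def __init__(self):
        self.bodies = []

    def rebuild(self, registered):
        # The only way to drop a body: reset and pass everything again.
        self.bodies = list(registered)

    def query_all(self):
        return list(self.bodies)  # raw hits, stale bodies included


class PhysicsWrapper:
    def __init__(self):
        self.registered = {}
        self.world = CollisionWorld()
        self.unregistered_this_frame = False

    def register(self, body):
        self.registered[body.body_id] = body

    def unregister(self, body):
        # The body stays in the collision world until the next rebuild,
        # so flag it instead of trying to remove it.
        body.unregistered = True
        del self.registered[body.body_id]
        self.unregistered_this_frame = True

    def begin_frame(self):
        self.world.rebuild(self.registered.values())
        self.unregistered_this_frame = False

    def query(self):
        hits = self.world.query_all()
        if not self.unregistered_this_frame:
            return hits  # cheap path: nothing stale to filter out
        return [b for b in hits if not b.unregistered]
```

The `unregistered_this_frame` switch is the "only pay for the check on frames where something was unregistered" idea, again as an assumption about how that might look.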
Of course, this check (which is essentially a comparison of a ulong) is not free. I will probably have the system only use those new collectors on frames where something has been unregistered. Luckily in TaleSpire, we don’t have much churn in the physics objects.
During this, it made sense to clean up some of the physics code in general. Until now, I had made the class that holds the board data responsible for the physics ‘world’. This wasn’t ideal, and the physics-related code was very out of place. I decided to centralize it in the BouncePhysics classes. This does mean that, for now, we only allow one board to be using physics at a time (which could matter when changing boards), but the trade-off was that I could make the API much more like ClassicPhysics. The API design is important because, as soon as I merge this branch, Ree needs to be able to get up to speed quickly. The more familiar it is, the better.
In amongst this, there was the usual smattering of bugs, confusion, and other things that slow one down. But overall, it’s felt good to have some forward momentum.
Next, I really need to find out why dice in BouncePhysics are not behaving like they did in ClassicPhysics. It could be an issue in the conversion routines or, more likely, mistakes in the code that manages fixed-timestep and interpolation. I’ll set up some test scenes where I can compare everything side-by-side, and we’ll see what we can find out.
Alright, that’s all from me for now. Seeya later
Since our move away from GameObjects, we have had to make our own batching, culling, and animation systems. One that I had left for a while was lighting, so it was time to tackle that.
First off, the deeply annoying news. Unlike with the BatchRendererGroup, which gave us a way to avoid GameObjects and populate the data from multiple threads, there is no similar thing for lights. In fact, when I looked into Unity’s ECS, they were just spawning GameObjects for the lights and updating their transform each frame to be in line with the entities that supposedly contained them.
This is a pain, but it’s the only option, so we needed something similar.
As always, the fact that TaleSpire is a user-generated-content game means we have a bunch of complexity. We don’t know in advance how many lights are about to be spawned. Worse, because we are back using GameObjects, we know that spawning too many per frame will cause serious fps issues. This means we need a progressive spawning approach and pools for reusing lights that have been destroyed.
Also, spaghet scripts are allowed to update light color, intensity, position, and rotation on a per-frame basis. Now position and rotation updates can be jobified if we put the light’s Transform in a TransformAccessArray. However, color and intensity can only be set from the main thread… joy >:[
On each change to a zone, we rebuild all the batches, including laying out the lights again. We have jobs to:
- find out what tiles and pros are using lights
- collate the numbers of each kind of light
- write the details for each instance of each kind of light (in parallel) into various collections.
We push all of the current lights back into the pools for reuse and, over the course of multiple frames, spawn new lights or pull them from the pools.
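As a rough picture of the pool plus progressive spawning, here is a small Python sketch. It is an illustration under assumptions, not the actual implementation: `LightPool`, `max_new_per_frame`, and the dictionary standing in for a light object are all hypothetical, and the post does not say what the real per-frame budget is.

```python
from collections import deque


class LightPool:
    def __init__(self, max_new_per_frame):
        self.max_new_per_frame = max_new_per_frame  # assumed spawn budget
        self.free = []          # lights waiting to be reused
        self.active = []        # lights currently placed in the scene
        self.pending = deque()  # light configs still waiting for a light
        self.created = 0        # how many lights were ever spawned

    def rebuild(self, requests):
        # On a zone change, every current light goes back to the pool
        # and the new layout is queued up.
        self.free.extend(self.active)
        self.active.clear()
        self.pending = deque(requests)

    def tick(self):
        """Run once per frame. Returns how many requests are still waiting."""
        budget = self.max_new_per_frame
        while self.pending:
            if self.free:
                light = self.free.pop()      # reuse is cheap, no budget cost
            elif budget > 0:
                light = {"id": self.created}  # stand-in for an expensive spawn
                self.created += 1
                budget -= 1
            else:
                break                         # out of budget, continue next frame
            light["config"] = self.pending.popleft()
            self.active.append(light)
        return len(self.pending)
```

With a budget of 2, five requests on an empty pool take three frames to fill. A later rebuild that asks for three lights is satisfied in a single frame from the pool, without spawning anything new.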
Dynamic lights are updated by Spaghet scripts. If there is a change to the intensity or color, we push the index of the modified light into a NativeQueue. That queue is consumed on the main thread where we can apply the changes.
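The queue hand-off above looks roughly like this when sketched in Python. Python threads and `SimpleQueue` stand in for Unity jobs and the NativeQueue; `LightUpdates` and its method names are invented for the example, and a plain list of intensities plays the part of the light objects that only the main thread may touch.

```python
import threading
from queue import SimpleQueue, Empty


class LightUpdates:
    def __init__(self, light_count):
        self.intensity = [1.0] * light_count   # main-thread-only state
        self.pending = [None] * light_count    # values written by workers
        self.dirty = SimpleQueue()             # indices of modified lights

    def request_intensity(self, index, value):
        """Called from worker threads: record the value, queue the index."""
        self.pending[index] = value
        self.dirty.put(index)

    def apply_on_main_thread(self):
        """Drain the queue and apply changes. Returns how many were applied."""
        applied = 0
        while True:
            try:
                index = self.dirty.get_nowait()
            except Empty:
                return applied
            self.intensity[index] = self.pending[index]
            applied += 1


def run_scripts(updates, changes):
    """Run one 'script' per (index, value) change on its own thread."""
    workers = [
        threading.Thread(target=updates.request_intensity, args=change)
        for change in changes
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
```

Nothing visible changes until `apply_on_main_thread` runs, which is the point: the workers only describe the change, and the main thread performs it.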
Getting all of this to play nice has taken me the last 3-4 days. There is still plenty of room for improving performance, but I need to profile using real-world scenes before making any more changes. Luckily we have fantastic community sites full of slabs I can use :D
Alright, the next stop is looking at little things that broke on the move to the new physics engine.
Seeya in the next log (which will hopefully take much less than a week this time :P)
This one is interesting to write as it’s fairly downbeat. However, it is part and parcel of making stuff, so let’s do it.
As mentioned in previous posts, we have known for a while that TaleSpire would not scale to bigger boards using Unity’s GameObject abstraction. There was a lot of song and dance from Unity about their upcoming ECS, and while it looked pretty promising, we miscalculated the timeframe it would take Unity to bring it to life. That meant that this year, as we tested again, we saw we needed a custom solution. That’s what I’ve been working on.
Now I have tried hard to always be clear about when the issues we have run into have been Unity’s fault and when they have been ours. In general, the responsibilities have been ours as we are using packages marked experimental and not in general recommended for shipping. However, today’s principal issue is, I believe, not on us.
To move away from using GameObjects we have been building on the BatchRendererGroup. This object lets you specify what needs to be drawn and then provide the matrices to lay these instances out in world-space. The matrix array you fill is a NativeArray and, as such, is fully compatible with the job system. This meant that we have been able to parallelize the population of these arrays, which has resulted in the massive improvements in tile-spawning performance we have seen in previous dev-logs.
When you are using instanced rendering, you need to somehow get the per-instance data to your shaders. Traditionally in Unity, this is done using MaterialPropertyBlocks. You can associate a MaterialPropertyBlock with an object and, when it renders, that data is made available to your shader. When using BatchRendererGroup, this requirement seems to be filled by using GetBatchVectorArray (and co).
Now here is my portion of the fuckup. When I couldn’t get this to work in my early testing, I put it down to me not being too hot with Unity’s shaders, and that I would work it out later. Unlike the new stuff in Unity, the BatchRendererGroup is not marked as experimental, and so I had the same confidence in the system that I do for other released parts of Unity. I put a few questions up on the forums and got back to work. I rigged up a simple system where I looked up the data from ComputeBuffers using the instance-id as the index; this works fine without culling enabled.
The problem is that GetBatchVectorArray does not (as far as I can tell) work with the classic rendering system in Unity. This is not stated in the documentation, but in the docs for an experimental package called the ‘Hybrid Renderer,’ it mentions some things that can be construed to mean it only works with the new scriptable rendering pipelines.
UPDATE: While writing this log, I’ve seen that my bug report against this method has been accepted. I will keep you posted on how this goes.
This sucks. Worse, we know the old version won’t work; we are months into the rewrite with no way to go back. Frankly, it was rather distressing.
Now those of you who know a little about Unity might be wondering if we could just make something custom using DrawMeshInstanced or DrawMeshInstancedIndirect. The problem is that things drawn with those methods are not culled, and this is a problem as realtime lights/shadows require rendering the scene from multiple vantage points. Without culling, you end up rendering everything for each camera regardless of the direction it’s facing. This massively hurts performance. The BatchRendererGroup was excellent in this regard as it allowed us to implement the culling jobs ourselves.
One thing that the BatchRendererGroup does have is a flag that lets you specify that you only want to use it to render shadows. Another little detail is that TaleSpire’s shadows don’t need to respond to the tile drop-in/out animation we implement in the vertex shader. This gives us the possibility to keep using the BatchRendererGroup for shadows and keep the culling advantages.
However, that still leaves actually drawing the scene. The only thing I could think of was that I’d need to implement this myself. I’d use all the batching code we have written to layout data on the GPU, I’d write a GPU-frustum culling implementation that matched the one we use on the CPU side, and use DrawMeshInstancedIndirect to dispatch the draw calls.
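For anyone who hasn't seen frustum culling before, here is the core test as a CPU-side Python sketch. It assumes each instance has a bounding sphere and that the frustum is given as inward-facing planes `(nx, ny, nz, d)`; the post doesn't say which bounding volume the real implementation uses, and the function names are made up. The real thing runs on the GPU, but the per-instance maths would be much the same.

```python
def sphere_in_frustum(center, radius, planes):
    """True unless the sphere is entirely behind one of the planes.

    Each plane is (nx, ny, nz, d) with a unit normal pointing into the
    frustum, so a point p is on the inside when dot(n, p) + d >= 0.
    """
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        signed_distance = nx * cx + ny * cy + nz * cz + d
        if signed_distance < -radius:
            return False  # fully outside this plane: cull
    return True


def cull(instances, planes):
    """Return the compacted list of visible instance indices.

    instances is a list of (center, radius) pairs. The draw call would
    then render len(result) instances and look up per-instance data
    through this list.
    """
    return [
        index
        for index, (center, radius) in enumerate(instances)
        if sphere_in_frustum(center, radius, planes)
    ]
```

The compacted index list is also why instance-ids alone stop being usable as buffer indices once culling is on: the Nth drawn instance is no longer the Nth tile.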
Now I’ve never written this kind of thing before, but it felt like the only feasible option. Over a few days, I got this written and was finally able to cull our scenes again without horrible flickering.
Now the scary thing is that this was not part of the plan. It’s cost us a lot of time and how it scales is currently unknown. What I do know is now we have a whole pile of extra complexity to manage and improve. Not great.
However, there are some up-sides:
Currently, we have to use the same meshes for rendering, occlusion checks, and shadow rendering. I wanted to use different meshes for shadows and occlusion checks as this allows us to use lower-poly meshes (helping performance) and close tiny cracks the fog of war shouldn’t be able to see through. I was implementing that as part of the GPU-occlusion system. When I delayed that feature to after the Early Access release, we delayed the ‘separate shadow mesh’ feature too. With our new system, we can trivially use different meshes, so these improvements are now on the cards for Early Access.
As mentioned recently, most of Unity’s API has a max of 500 instances per batch when using non-uniform scaling. DrawMeshInstancedIndirect does not as all the data is already on the GPU. This means we have the potential to have much larger batch sizes. Currently, we batch on a per-zone basis, so we are a little limited, but it will not be hard to take zones that have not changed in a while and combine their batches. It’s something I’ll look into when I get back to improving performance.
Thirdly, we now have a bunch of tile/prop data on the GPU that we can easily write compute-shaders to operate on. This gives us more options for future developments.
So all in all, this has been pretty horrible. I’m very grateful that the community is so supportive and that the Eldritch Foundry update was so well received. That was a serious boost in a tough couple of weeks.
However, we are still very far from done. By my estimates, I was at least a month behind my personal goals before this escapade, so I’m not sure where we are now. Luckily we’ve been vague with deadlines, but we will need to sit down and look at the plan again.
What a year!
This dev log is long, so I’m gonna leave talking about lighting to the next post.
 This is important as, traditionally, most of Unity’s APIs are only valid to call from the main thread.
 There is such a lot of work to do for Early Access that often I can’t afford to stress on issues that can be worked out another day. The important thing is to be making progress somewhere as often another day I’ll have had time for the solution to come to me.
 Culling means some instances are not being drawn and so the instance-id for a given tile can be different from frame to frame.
A quick tip when dealing with BatchRendererGroup. The instance limit per batch is 1023, but the UNITY_INSTANCED_ARRAY_SIZE limitation still applies, and so on desktops, this will likely cut your max batch size down to 500. This matters if you rely on instance-ids. I saw some very odd stuff when a single batch went over 511 instances; what was happening was that Unity was splitting the batch in two, one for the first 511 instances and one for the last one. This meant the instance-id for the 512th instance was 0 rather than 511, and this naturally resulted in incorrect behavior when using it to index into a buffer.
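To make the failure mode concrete, here is a tiny Python model of the split. The 511 figure is the one observed above; `split_batch` and `buffer_index` are hypothetical names, and carrying a per-draw start offset is just one possible way to recover the right index, not necessarily what we ended up doing.

```python
def split_batch(instance_count, max_per_draw=511):
    """Return (start, count) for each draw a batch gets split into."""
    draws = []
    start = 0
    while start < instance_count:
        count = min(max_per_draw, instance_count - start)
        draws.append((start, count))
        start += count
    return draws


def buffer_index(draw_start, instance_id):
    """Instance-ids restart at 0 in every draw, so add the draw's start."""
    return draw_start + instance_id
```

A batch of 512 becomes two draws, `(0, 511)` and `(511, 1)`. The last instance has instance-id 0 inside its draw, so indexing a buffer with the raw id reads entry 0 instead of entry 511.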
That was an afternoon of pain, explained after a quick trip to the frame profiler. I was so concerned that my batching code was wrong that I didn’t consider the instance-ids not being what I expected. I can’t wait to know more of Unity’s internals as these issues are such a pain and seem avoidable.
Another node that I added last week is the random node. It can produce random floats, ints, and vectors of each. It uses a simple stateless random function I’ve used in shaders before, which means it would be terrible for security, but is perfectly fine for visuals. The node has an option called ‘Update Mode’ that might be worth mentioning.
By default, if you give the node the same input (the seed), you are going to get the same result on the output. You will probably feed time into the input to get different results each frame. However, if you are using global time, then the output will be the same for all instances of this script running on all assets this frame. That might not be what you are going for! This is where the ‘Update Mode’ comes into play. You can set it to global, which gives the behavior described above, or you can pick one of two ‘local’ options, which both modify the seed in a way that will give different results. The ‘Local’ option means that it mixes in the tile’s unique id, so now the result will be different from other tiles running the same script. Note, though, that each asset within a tile can have a script, and maybe you don’t want those assets to get the same result from the same node. To handle this case, you have ‘Asset Local’, which also mixes the asset’s index into the seed along with the tile’s id.
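Here is a small Python sketch of how the three update modes could mix the seed. The hash is a swapped-in stand-in (the well-known 32-bit finalizer mix from MurmurHash3), not the function the node actually uses, and `random_float`, `combine`, and the mode strings are names invented for the example. Seeds are integers here to keep things simple.

```python
MASK = 0xFFFFFFFF


def hash32(x):
    """Stateless 32-bit integer hash (MurmurHash3 finalizer mix)."""
    x &= MASK
    x ^= x >> 16
    x = (x * 0x85EBCA6B) & MASK
    x ^= x >> 13
    x = (x * 0xC2B2AE35) & MASK
    x ^= x >> 16
    return x


def combine(seed, value):
    """Mix an extra integer (tile id, asset index) into a seed."""
    return hash32(seed ^ hash32((value + 0x9E3779B9) & MASK))


def random_float(seed, mode="global", tile_id=0, asset_index=0):
    """Return a float in [0, 1) that depends only on the inputs."""
    s = hash32(seed)
    if mode in ("local", "asset_local"):
        s = combine(s, tile_id)       # differs per tile
    if mode == "asset_local":
        s = combine(s, asset_index)   # differs per asset within a tile
    return s / 2**32
```

In global mode the tile id is ignored, so every instance fed the same time value gets the same number. In the local modes, two tiles (or two assets in one tile) with the same seed get different results.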
Alright, back to work. Seeya!
 Unity uses constant-buffers for its object-to/from-world matrices, and constant-buffers have a max guaranteed size of 65k. See section 1.4 over here for some interesting details.
Hey folks. For the last few days, I’ve been looking at batching again. Unity has an interesting limitation in that the max batch size is 1023. This seems to be an artifact of some decision long ago; however, we need to make sure we don’t try and make batches bigger than this. Making our code respect this is a delicate operation as it’s very multi-threaded code and access to the data is often not protected by Unity’s safety systems (for performance reasons). This means I’ve been taking it very slowly. It’s going well, and I hope to have it finished soon.
As that is quite a short update, here is something from last week.
There is a node in Spaghet for choosing a mesh from a list. We use this for the stop-motion style effect of the fire. It looks like this in the graph.
While fixing bugs in the implementation, I had to fix things both on the TaleWeaver and TaleSpire sides, and the process got very tedious. At some point, I decided to re-prioritize work on the modding tools and hooked up Spaghet such that it works as a live preview in TaleWeaver. Here it is in action. Notice that every time we save, the changes are immediately applied.
This sped up the process a lot! I was able to find some mistakes in my animation packing code and fix those before getting back to batching.
Alright, that’s the lot for this update.
Hi everyone, last week was intense but productive. I’ve been down at Ree & Heckbo’s place, working on the engine. We had hoped to start integrating our branches, but my stuff still was not ready, so instead, we just kept working, making use of the collaboration advantages that being in the same room gives.
As there has been a bunch of stuff happening, I’m gonna spread it over a few posts.
Let’s start with something random: line-of-sight.
Line of sight checks came up in conversation in the community recently. The issue was that people have issues because one creature can block the line of sight to other creatures behind it. This is one of those things that sounds like it makes sense but just doesn’t feel right in-game. Any game using miniatures is always an approximation of the fantasy setting, and so sticking to it like it represented reality often doesn’t make sense.
Let’s take a concrete example. A giant is blocking the view to a goblin behind its leg. This sounds fine, but imagine the scene as it would play out in ‘real-life’. The giant would be towering above you, and as it walks, you catch glimpses of the goblin as it darts about trying to get a shot on your comrades. This fidelity is lost if we treat the table as ground truth.
Instead, it’s better to ignore inter-creature relationships when checking line-of-sight.
This poses a minor implementation issue that is worth exploring. Back in the alpha, we checked line-of-sight using collision rays. This worked, but it was inaccurate as you can only perform so many checks per frame. Back then, we checked less than 10 positions per creature, and both false positives and negatives were common. People rightly asked for a more accurate line of sight.
In response, we changed the approach to rendering the whole scene from the creature in question’s perspective. We rendered tiles and creatures into a cubemap using colors as IDs. We then used compute shaders to gather the results and report back what colors (and thus creatures) were visible. As you can see, this introduces the issue that any creature that is entirely obscured by another creature will be considered invisible.
To fix this limitation, we are going to need something new. The idea is to render the scene without creatures into the cubemap, then we will render all the creatures, but they will not write their depths (in fact, we will probably discard all their fragments). This means that we should get a fragment shader call for every fragment in each creature that isn’t discarded by depth. We can then write the visibility info from the fragment shader into the result buffer, hopefully giving us the data we need.
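The difference between the two approaches is easy to see in a toy CPU model. This Python sketch is only an illustration: a dictionary of pixel to depth stands in for the cubemap's depth buffer, each creature "fragment" is a `(pixel, depth, creature)` tuple, and both function names are made up.

```python
INF = float("inf")


def visible_by_id_color(tile_depth, fragments):
    """Older approach: creatures write depth, so the nearest one wins a pixel."""
    nearest = {}
    for pixel, depth, creature in fragments:
        if depth >= tile_depth.get(pixel, INF):
            continue  # hidden behind a tile
        if pixel not in nearest or depth < nearest[pixel][0]:
            nearest[pixel] = (depth, creature)
    return {creature for _, creature in nearest.values()}


def visible_ignoring_creatures(tile_depth, fragments):
    """Planned approach: creatures are depth-tested against tiles only.

    No creature writes depth, so one creature can never hide another.
    Every fragment that passes the test reports its creature as visible.
    """
    visible = set()
    for pixel, depth, creature in fragments:
        if depth < tile_depth.get(pixel, INF):
            visible.add(creature)
    return visible
```

Put a giant at depth 2 and a goblin at depth 5 on the same pixel, plus an orc behind a wall on another pixel. The first function reports only the giant, while the second reports the giant and the goblin. The orc stays hidden either way.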
This will also help fog-of-war, which uses the same visibility cubemap. (more on fog-of-war another day).
I’m gonna jump back into code now, more dev-logs coming when I take breaks!
Ree has been working full steam on the flying mechanic, and it’s looking pretty damn nice. You should get your hands on it very-soon™ if all goes well.
Work on assets has been going well too. We have some good stuff we hope to show there soon-ish also.
Before the weekend, I tried to get more dice working. However, the D12 was spinning relentlessly, so I just put that down for a bit. I feel like we are past the big hurdles with that part of the physics integration, so it should be a case of just hunting down those issues.
I also started on porting the animated fire to the new system. This requires some relatively simple changes to the batcher but also means we need to recreate some scripts in spaghet. Here is the noodly goodness as it stands now.
This will definitely be cleaner when we support collapsing subgraphs into function-nodes, but it’s not too bad even now. We are sampling a few curves in order to:
- pick the active mesh
- rotate the object which holds the light
- adjust the light’s intensity
For the curious, this compiles down to a short sequence of operations.
I still need to switch from local time to global time and add a random-number node to add some timing variation between each fire instance. I believe I still need to do a little work to handle some cross-frame state, but this hopefully won’t throw any spanners in the works.
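To give a feel for what "sampling a curve to pick the active mesh" means, here is a small Python sketch. It is not the compiled Spaghet output, and it simplifies things: the curve is piecewise linear between `(time, value)` keys, whereas Unity's animation curves are smoother splines, and `sample_curve` and `pick_mesh` are hypothetical names.

```python
import bisect


def sample_curve(keys, t):
    """Sample a piecewise-linear curve given as sorted (time, value) keys.

    Times before the first key or after the last key clamp to the ends.
    """
    times = [time for time, _ in keys]
    if t <= times[0]:
        return keys[0][1]
    if t >= times[-1]:
        return keys[-1][1]
    i = bisect.bisect_right(times, t)
    (t0, v0), (t1, v1) = keys[i - 1], keys[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def pick_mesh(meshes, curve, t):
    """Use the curve's value at time t as an index into the mesh list."""
    index = int(sample_curve(curve, t))
    index = max(0, min(len(meshes) - 1, index))  # keep the index in range
    return meshes[index]
```

The same `sample_curve` call could drive the light's rotation or intensity. Only the mesh choice needs the value truncated to an index.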
Tomorrow I’ll pick up where I left off and try to get this lot running in TaleSpire. Except for lights, which are their own can of worms.
Integrating the new physics engine is going well. I wrote a component that converts old-style colliders and rigid-bodies into the new style.
Sorry about the weird colors here. Unity today decided that I last renewed my license tomorrow (yup), and so right now, it thinks I’m unlicensed and gave me the light editor theme. The blue tint on the righthand-side lets me know when I’m in play mode, which is super helpful. It looks a lot less baby-blue when used with the dark theme :P
I’ll pretty the UI up in time. For now, seeing everything is helpful.
It took me a while to work out where to get some mass related values right, so the dice rolled strangely, but now it’s a little better.
Behind the scenes, a lot is going on, not least of which is that the physics is running with a fixed-timestep. This shouldn’t be a big deal as a fixed-timestep is mandatory for stable physics across framerates. However, this is not fully supported in Unity’s ECS, so I was concerned that it would be difficult.
Luckily for us, it was not. The physics engine is pretty great at giving you control, so I loop running the simulation for the required number of steps. If you have read about fixed-timestep before you’ll know that you need to interpolate the results as the final step. The physics engine doesn’t have support for that out of the box, so we added a job to collect that information and apply it as we write the transforms back to the GameObjects.
With that done, I need to replicate the parts of the old physics API that we use. If I can get dice rolls working again then I should have replaced most of what we need.
Have a good one folks, Peace.
 yup I do know about the latest version making the dark theme free, but upgrading is risky and we are on a tight schedule right now.
 The interpolation is not hard, but the Unity folks have a lot more to support than we do, so we get to only handle the simpler case.
 The best known fixed-timestep explanation can be found over here. It’s a good one.
Phew, a few days away from the dev-log there. I got really caught up in some changes that were a real grind.
I’ve been working on the behavior of uncommitted changes. This work is taxing as you have a matrix of different states tiles might be in. Let’s take redo of delete for an example.
- You might be redoing an undo which has been committed
- You might be redoing an undo which has not yet been committed but the delete action it undoes has been committed
- You might be redoing an undo where neither the undo nor redo have been committed yet
And that’s one action in isolation. What about the delete of assets which were made by undoing a separate delete. The matrix of possibilities applies to all the actions involved in that sentence too. In short, making sure you get a reasonable result takes a lot of concentration (and bizarre diagrams :D ).
When I finally got it running locally well, I connected up a second client and immediately disaster. The result was very wrong. I’ll admit I started panic sweating at that point. Last time this happened, we had to delay the Beta. Luckily this time, the system is much simpler, and I have an extensive test suite to lean on.
I dusted off the tests (I may have neglected them for a while) and started digging into issues. The great news was that, whilst the tests did find a couple of bugs in the new code, the underlying board code was still working fine.
This helped me relax and focused my attention back into the code that handles the batching of uncommitted tiles. After some time in the debugger, I finally spotted a case where tiles were not removed from the uncommitted state when committed. This amounted to a constant being 1 instead of -1. With that typo fixed, it started behaving rather well.
With that terrifying detour done, I needed to get back into physics. I have had to write a lot of the physics code fairly reactively rather than with a solid plan, and so I hit the point where I needed to clean things up. This sucked. It was a painful combination of being very tedious but also critical to get right. Some days are just like this.
These last weeks I’ve really been feeling the pain of not being able to just rely on the game-engine in the ways we usually do. Because of our bad estimate of when Unity’s new ECS would be ready, we are having to make a lot of systems ourselves. This is fine in theory (I love making stuff), but it’s meant losing at least a month of dev time that we did not expect. That’s always difficult, but our timeline was already tight. It’s pretty stressful.
Anyway, panic/rant aside, things are progressing. The thing on my plate now is hooking regular Unity GameObjects into our physics setup. We need something that takes current-gen Unity Physics components and makes equivalents in a format suitable for their new physics engine. We can then write some code to manage copying the values to and from the new physics engine to keep it all in sync.
If all that goes well then hopefully I’ll be rolling dice again soon!
Thanks for stopping by folks,
Seeya in the next dev-log