Another crammed day, ending past 4AM. I had planned some other small jobs around the office, such as emails, but I spent pretty much the whole time continuing the saga of performance improvements. I seem to be addicted to saving frames!
I'll give you the highlights (as I am wiped out). I've switched the terrain physics system from a height map to a trimesh, extracted from the actual visual terrain geometry at LOD1. What this means is that your dynamic objects will no longer sink into the floor or float above it in strange ways. My initial code used LOD0, but that dragged me from 170 fps down to 110 fps, whereas LOD1 caused no speed loss, so I could continue. No extra speed, but much more accurate floor collisions, plus the back-end meshes I needed for the next achievement.
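To make the LOD trade-off concrete, here is a minimal sketch of turning a height grid into a triangle mesh at a coarser level of detail. It is not the engine's code: the real trimesh is pulled from the visual terrain geometry, whereas this sketch simply assumes LOD1 amounts to keeping every second height sample. The function name `grid_trimesh` and its `stride` and `cell_size` parameters are hypothetical.

```python
def grid_trimesh(heights, stride=2, cell_size=1.0):
    """Triangulate every `stride`-th sample of a height grid.

    Assumption for illustration: stride=1 stands in for LOD0 and
    stride=2 for LOD1. The real engine reads the visual mesh instead.
    """
    rows = range(0, len(heights), stride)
    cols = range(0, len(heights[0]), stride)

    # One (x, y, z) vertex per kept sample, with y as the height.
    vertices = [(c * cell_size, heights[r][c], r * cell_size)
                for r in rows for c in cols]

    width = len(cols)
    triangles = []
    for r in range(len(rows) - 1):
        for c in range(width - 1):
            i = r * width + c  # first corner of this grid cell
            # Split the cell into two triangles along one diagonal.
            triangles.append((i, i + width, i + 1))
            triangles.append((i + 1, i + width, i + width + 1))
    return vertices, triangles
```

Halving the sample rate in both directions cuts the triangle count to roughly a quarter, which gives a feel for why a coarser collision mesh is so much cheaper for the physics to chew through.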
Terrain occlusion for objects was possible thanks to the LOD1 meshes I generated as part of the physics creation. Converting them down to just flat vertices and passing them to the occlusion system means that hills and high ridges now occlude any objects that sit behind them. A BIG occlusion win. I have not run extensive tests on the performance gain (too busy on related wins), but it did not cause any slowdown, for which we can thank the GPU stall: it exacts a single one-off cost for the occlusion, so extra submissions to the system were free.
I also improved the occlusion system to use a dynamic vertex buffer instead of a fixed static one. This means I can calculate and render the best 60K polygons' worth of occluders instead of rendering over 2 million vertices (the entire scene) through a static draw call. It is still one draw call, but it only renders the polygons immediately surrounding the player. There is also room for further optimizations here, which is a boon, as this one DID show a performance boost from 101 fps to 166 fps :)
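As a rough illustration of picking occluders under a polygon budget, the sketch below keeps only the triangles whose centroids are nearest the player. How the engine actually ranks the "best" occluders is not described here, so nearest-first is an assumption, and `pick_occluders` with its parameters is a hypothetical name.

```python
import heapq


def pick_occluders(triangles, player, budget=60_000):
    """Return up to `budget` triangles, nearest to `player` first.

    Each triangle is a tuple of three (x, y, z) vertices. Ranking by
    centroid distance is an assumption made for this sketch only.
    """
    def dist_sq(tri):
        # Squared distance from the triangle centroid to the player.
        cx = sum(v[0] for v in tri) / 3.0
        cy = sum(v[1] for v in tri) / 3.0
        cz = sum(v[2] for v in tri) / 3.0
        return ((cx - player[0]) ** 2 +
                (cy - player[1]) ** 2 +
                (cz - player[2]) ** 2)

    # nsmallest avoids fully sorting the whole scene every frame.
    return heapq.nsmallest(budget, triangles, key=dist_sq)
```

The selected triangles would then be written into the dynamic vertex buffer each time the player moves far enough to change the set, keeping the whole thing to a single draw call.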
The final task, which I did not quite finish due to running out of brain juice, was to work out which terrain sectors (small patchwork blobs of terrain) are hidden by the occluder depth render. I hacked into the BlitzTerrain module and had it skip the sector render if the associated object had been hidden by the occlusion system (if you recall, the object I used to make the new terrain physics floor is the one I left to pick up the occlusion info for this technique). The problem is that there are many LOD levels, and a relational scatter of sectors per LOD level, and I am only associating DBP objects with LOD Level One. My early tests show it working, but it needs to work A LOT BETTER before it's ready for the public (as only a small part of the hidden terrain is actually skipping a render). Alas, when I did an aggressive test with some hack guesswork on neighboring sectors, wiping out most of the distant terrain, the frame rate did not get much past 180 fps. I have left in a conservative implementation which acts only on LOD1, which will ensure you don't get 'missing terrain squares' when playing your levels.
Notice the empty terrain sector? I set the terrain to wireframe and put a hill between me and the distant plane. I am not saving huge amounts here, but the theory is sound, and with more work we should be able to throw out a LOT of polygons when such occluded sectors are well hidden. I have emailed the author of BlitzTerrain in the hope of gaining more insight and inspiration into the relationship between LOD levels and Sector Objects.
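The conservative LOD1-only check described above can be sketched roughly as follows. This is not BlitzTerrain's API: the sector dictionaries, the `object_id` field and the `sectors_to_draw` function are all hypothetical stand-ins for whatever the module really tracks.

```python
def sectors_to_draw(sectors, hidden_objects):
    """Return the names of sectors that should still be rendered.

    A sector is skipped only when it is at LOD 1, has an associated
    object, and that object was hidden by the occlusion pass. Anything
    else is drawn, so no visible terrain square can go missing.
    """
    draw = []
    for sector in sectors:
        obj = sector.get("object_id")
        occluded = (sector["lod"] == 1
                    and obj is not None
                    and obj in hidden_objects)
        if not occluded:
            draw.append(sector["name"])
    return draw
```

Sectors at other LOD levels have no associated object in this scheme, so they always fall through to the draw list, which matches why only a small part of the hidden terrain currently skips its render.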
The fact that I don't get much higher than 180 fps even when I obliterate the terrain rendering suggests that the bottleneck I must chase next is the GPU stall, and as described in a previous blog post I have a fiendish plan to solve it. Alas, I probably will not get to that until the pile of little issues that my plate is starting to collect has been vanquished. All in all though, some nice progress, and the occlusion system continues to pay dividends!