I have spent most of Friday running around finding little jobs to make my mental desk a little cleaner before the weekend. I found time to add some nice things to Reloaded: characters now fade out when they die and switch to instance objects when they are far away to save performance, and the reflections which disappeared when the new occlusion system went in are fixed. I also adjusted the metrics readout so that the polygon and draw call counts don't instantly fill up the bar, which makes a full bar more representative of a 'maximum' state, tweaked collision properties for some trees and generally bashed the engine with a variety of performance tests to locate the worst offenders.
I have isolated two performance jobs which move to the top of my plate, and both of them hook into the new occlusion system very nicely.
The first is to replace distant objects with quads. I did this before for the instance stamp system but it proved to be a huge memory hog and hurt performance when a large group of high polygon objects entered the rendering zone. The great news is that I can re-use the quad system I created from the instance stamp mechanism but drop the 'dynamic VB filler part' which was the troublesome bit. I will be using the feedback from the occlusion system to work out whether quads should be rendered (as they are in a single draw call so need vertex shader magic to hide/show them individually). By passing in a quad distance value to the bound sphere submissions of the occluder I can control quad visibility entirely on the GPU :) Naturally, I can get the CPU to skip draw calls on the higher polygon objects by simply hiding them when they enter the assigned quad range (which will be a value in the FPE so you can change it per entity).
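To make the split of work a bit more concrete, here is a minimal sketch of the CPU side of that idea, written in Python purely for illustration (it is not engine code). The names `Entity`, `quad_distance`, `classify` and `draw_call_count` are all made up for this example, and `quad_distance` only stands in for the per-entity FPE value mentioned above. In the real thing the per-quad show/hide decision would happen in the vertex shader; here it is just a list of flags.

```python
import math
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    position: tuple        # (x, y, z) world position
    quad_distance: float   # hypothetical per-entity range, standing in for the FPE value


def classify(entities, camera_pos):
    """Split entities into high polygon draws and per-quad visibility flags."""
    mesh_draws = []     # entities that still cost one draw call each
    quad_visible = []   # one flag per entity, as a shader might consume it
    for entity in entities:
        dist = math.dist(entity.position, camera_pos)
        use_quad = dist >= entity.quad_distance
        quad_visible.append(use_quad)
        if not use_quad:
            mesh_draws.append(entity.name)
    return mesh_draws, quad_visible


def draw_call_count(mesh_draws, quad_visible):
    """All quads share a single draw call, however many are visible."""
    return len(mesh_draws) + (1 if any(quad_visible) else 0)
```

The point of the sketch is the last function: every object that falls back to a quad stops costing its own draw call, and the whole set of quads costs one between them.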
The second boost work will be to move the vegetation generator which currently kicks in hard when you run really fast through lots of grass (and can take good 60 fps levels down to 42 fps). I will move it from the DBP code which creates and destroys meshes en masse into a space between the start and end of the HZB query calls (that's right, exactly where the stall is located). My hope is that the vegetation generation becomes completely free as the CPU would be sitting around waiting for the GPU to come back with my juicy visibility values for the occlusion.
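A rough sketch of that 'fill the stall' pattern is below, again in Python as a simulation rather than real engine code. `FakeQuery` is a made-up stand-in for the GPU query (it simply becomes ready after a fixed number of polls), and `build_patch` and `run_frame` are hypothetical names. The assumption is that vegetation can be generated one small patch at a time, so the CPU can stop as soon as the visibility results arrive.

```python
from collections import deque


class FakeQuery:
    """Made-up stand-in for a GPU query: ready after a fixed number of polls."""

    def __init__(self, polls_until_ready):
        self.polls_left = polls_until_ready

    def is_ready(self):
        if self.polls_left > 0:
            self.polls_left -= 1
            return False
        return True


def build_patch(cell):
    """Placeholder for generating one patch of vegetation."""
    return f"grass@{cell}"


def run_frame(query, pending_cells, built_patches):
    """Build vegetation patches while the query result is outstanding.

    Returns how many patches were built during this frame's wait.
    """
    built_this_frame = 0
    # Useful work: one patch per poll until the GPU answers or the queue is empty.
    while pending_cells and not query.is_ready():
        built_patches.append(build_patch(pending_cells.popleft()))
        built_this_frame += 1
    # Nothing left to build: this is the plain stall that existed before.
    while not query.is_ready():
        pass
    return built_this_frame
```

Any cells still in the queue when the query comes back simply carry over to the next frame, so the generator never makes the wait longer than it already was.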
Put these two optimizations together and you should get a reduction of draw calls amounting to half the scene, and a substantial drop in polygons too. I've run measurements with other speed-up ideas, and these two will help increase frame rates significantly AND ensure the rate does not drop when you run quickly through dense foliage.
If you've been wondering why I've been pulling 12-14 hour days for the last five days, it's because I am off on a short busman's holiday from Sunday. I will be changing my overworked programmer hat for my CEO and Marketing hat as I represent TGC at this year's MWC. Specifically I will be presiding over the huge hackathon there and helping fellow developers as they attempt to code a game in just three days. It's a chance to chill, unwind a little and talk shop with 3000 developers, and trade horror stories to find out if there are any crazy cool optimization tricks that I have missed. If you're heading there yourself, I will be tweeting my location throughout the event in case you want to say hello and swap travelling tales.
Alas I have left myself just one day to prepare all my demos and test my equipment, but I have one more day of Reloaded performance work during Saturday, and hopefully the internal build I am preparing will be received well by our internal team testers. I made a video showing the new occlusion in action, but I have been warned repeatedly about showing 'not great' videos so you will have to imagine what it looked like:
"Picture it... I start far from a set of 15 buildings standing on a super flat terrain, each building has 3 barrels placed outside, and the draw call count is 62. I run into the nearest building and make sure I can see through no open windows, and the draw call count drops to just 12. Apart from the floor, sky and a few other quads the only thing being rendered is the building I am in. Frame rate on my machine stayed well above 60 fps, the target I aim for during my tests. Once I add in quads for singular objects, I expect the initial draw call count of 62 to drop to more like 32. All to play for in Reloaded land!"