Performance Is King
As a mere subject of the king, my work today was rather dull, with plenty of nothing to screenshot for your delectation! I had a big six-hour think last night due to insomnia and came up with a radical plan to transform performance metrics in Reloaded. It takes a bottom-up approach built on one question: 'what don't I need to draw right now?'
Given this simple premise, I started by getting rid of all the objects until I was left with the floor and sky. The terrain floor was happily gobbling up over 30K polygons for a completely flat floor. I immediately remembered the QUAD REDUCE function of the Blitz Terrain system, so decided Wednesday would be about using that feature. I already knew it could not simply be turned on, and that it had to be flagged before passing in a height-map. Generating and testing that height-map took most of the day.
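To show why a flat floor is such an easy win for quad reduction, here is a minimal Python sketch. It is not the Blitz Terrain QUAD REDUCE code, whose internals are not described here; it assumes a simple quadtree rule in which a square block of cells collapses to one quad when every height sample in it is equal. The function name `count_quads` and the height-map layout are hypothetical, and the sketch ignores the crack-fixing a real terrain system needs where large and small quads meet.

```python
def count_quads(height, x, y, size):
    """Count quads needed for a size-by-size block of cells at corner (x, y).

    height is a 2D list of vertex heights, indexed height[row][column].
    Assumption: a block merges into one quad only when it is perfectly level.
    """
    samples = [
        height[j][i]
        for j in range(y, y + size + 1)
        for i in range(x, x + size + 1)
    ]
    if size == 1 or max(samples) == min(samples):
        return 1  # single cell, or a level block that collapses to one quad
    half = size // 2
    return sum(
        count_quads(height, x + dx, y + dy, half)
        for dx in (0, half)
        for dy in (0, half)
    )


# An 8x8-cell terrain has 9x9 vertices and 64 quads before any reduction.
flat = [[0.0] * 9 for _ in range(9)]
bumpy = [row[:] for row in flat]
bumpy[0][0] = 1.0  # raise a single corner vertex

flat_quads = count_quads(flat, 0, 0, 8)
bumpy_quads = count_quads(bumpy, 0, 0, 8)
```

Under these assumptions the level terrain needs a single quad, and one raised vertex only forces subdivision in the corner that contains it.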
The Performance Plan
Essentially I am going to build a spatial R-tree database of all entities, create 3D box volumes for the enclosed groups of entities, then generate pools of quads that share a communal texture. The quads and communal texture are generated just before the game starts and will take very little memory. I then re-use my Instance Stamp system to batch together all the HIGH and MID static entities relative to the local position of the player, which means we only create what we need to see within our close range. I then use GPU occlusion detection with the r-tree volumes to work out what I don't need to render, and then add dynamic entities only where volumes are visible for that cycle. Each quad will have multiple textures depending on which angle you are looking at it from, plus one for when you look at it from above. A shader will control the quad rotation and texture selection to avoid writing into the vertex buffer, and the same batches of static buffers and quads can be used by the shadow rendering process. The upshot is that we will only render the large pools of static common geometry that are visible, and only the dynamic geometry that is associated with a visible r-tree volume, giving us the minimum amount to draw for the maximum performance.
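The volume-then-visibility flow can be sketched in a few lines of Python. This is an illustration only, under loud assumptions: a fixed grid stands in for the R-tree grouping, the set of visible volumes is simply passed in (in the plan above it would come from GPU occlusion queries), and the names `CELL`, `build_volumes` and `dynamic_to_draw` are hypothetical.

```python
from collections import defaultdict

CELL = 100.0  # hypothetical group size in world units


def build_volumes(entities):
    """Group entities by grid cell and compute one bounding box per group.

    Assumption: a uniform grid replaces the R-tree purely to keep this short.
    """
    groups = defaultdict(list)
    for entity in entities:
        x, _, z = entity["pos"]
        groups[(int(x // CELL), int(z // CELL))].append(entity)

    volumes = {}
    for key, members in groups.items():
        xs, ys, zs = zip(*(m["pos"] for m in members))
        volumes[key] = {
            "min": (min(xs), min(ys), min(zs)),
            "max": (max(xs), max(ys), max(zs)),
            "members": members,
        }
    return volumes


def dynamic_to_draw(volumes, visible_keys):
    """Return names of dynamic entities whose volume is visible this cycle."""
    return [
        member["name"]
        for key in sorted(visible_keys)
        if key in volumes
        for member in volumes[key]["members"]
        if member["dynamic"]
    ]


entities = [
    {"name": "crate", "pos": (10.0, 0.0, 10.0), "dynamic": False},
    {"name": "guard", "pos": (20.0, 5.0, 30.0), "dynamic": True},
    {"name": "bird", "pos": (250.0, 0.0, 10.0), "dynamic": True},
]
volumes = build_volumes(entities)
# Pretend the occlusion test reported only the first volume as visible.
drawn = dynamic_to_draw(volumes, {(0, 0)})
```

Here the bird sits in a volume that was not reported visible, so it never reaches the draw list, which is the saving the plan is after.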
The quad system is pretty similar to how Dark Imposter does things, but it won't be required to regenerate the object view as the player moves. By pre-rendering eight fixed directions around the entity for our quad texturing, we will sacrifice that millimeter-perfect quad visual for more speed, which is the objective at the moment. I must say, though, I really like the way Dark Imposter can produce a quad texture in real time and regenerate it as you move around the object, and still manage to keep hundreds of FPS in the bag! If only I knew how they did it :)
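Choosing one of eight pre-rendered views comes down to a little angle arithmetic. The Python sketch below shows that arithmetic on the CPU for clarity; the plan is to do the selection in a shader, and none of this is taken from Reloaded or Dark Imposter. The function `pick_view`, the `TOP_VIEW` index and the 60 degree threshold for switching to the from-above texture are all assumptions made for illustration.

```python
import math

TOP_VIEW = 8  # hypothetical index of the from-above texture; 0..7 are side views


def pick_view(entity_pos, camera_pos, top_threshold_deg=60.0):
    """Pick which pre-rendered texture to show for a quad.

    Positions are (x, y, z) with y up. The eight side views are assumed
    to be spaced 45 degrees apart, with view 0 facing the +z direction.
    """
    dx = camera_pos[0] - entity_pos[0]
    dy = camera_pos[1] - entity_pos[1]
    dz = camera_pos[2] - entity_pos[2]

    # Angle of the camera above the horizon, as seen from the entity.
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    if elevation >= top_threshold_deg:
        return TOP_VIEW

    # Heading around the entity, snapped to the nearest 45 degree slot.
    yaw = math.degrees(math.atan2(dx, dz)) % 360.0
    return int((yaw + 22.5) // 45.0) % 8


origin = (0.0, 0.0, 0.0)
front = pick_view(origin, (0.0, 0.0, 10.0))
side = pick_view(origin, (10.0, 0.0, 0.0))
above = pick_view(origin, (0.0, 100.0, 1.0))
```

Snapping to the nearest slot means the picture is off by at most 22.5 degrees from the true viewing angle, which is the visual accuracy being traded for speed.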
Signing Off
All sounds impressive, right? :) Well, now I need to code it, and I am giving myself a few weeks to do this, but the end result should be entities as far as the eye can see, and every draw call that can be saved, saved. I can also use the r-tree spatial database to localize entity searches for things like physics interaction, sound playing, gun and missile ray-casting and to some degree entity logic. Traditionally in FPSC Classic, we would have to step through EVERY entity in the level to perform our tasks, even if we only needed access to just one or a few close items. Having a spatial database means our searches can be MUCH more targeted.
I am just finishing off some terrain quad reduction work, and then Thursday I will delve into the spatial functions of the BOOST library, which promises some nice algorithms to make my database implementation go smoothly. Once I have my spatial hierarchy in place, creating batches of common buffers from it and hiding them via GPU occlusion should be a little easier!
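The difference between stepping through every entity and asking a spatial index can be shown with a small Python sketch. To keep it short it uses a uniform grid rather than an R-tree, so treat it as a stand-in for the idea and not the planned implementation; the class `SpatialGrid` and its methods are hypothetical names.

```python
import math
from collections import defaultdict


class SpatialGrid:
    """Uniform 2D grid index (a simple stand-in for an R-tree)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _key(self, x, z):
        return (math.floor(x / self.cell_size), math.floor(z / self.cell_size))

    def insert(self, name, x, z):
        self.cells[self._key(x, z)].append((name, x, z))

    def query_radius(self, x, z, radius):
        """Return (sorted names within radius, number of entities examined)."""
        kx0, kz0 = self._key(x - radius, z - radius)
        kx1, kz1 = self._key(x + radius, z + radius)
        found, examined = [], 0
        for kx in range(kx0, kx1 + 1):
            for kz in range(kz0, kz1 + 1):
                # .get avoids creating empty cells while querying
                for name, ex, ez in self.cells.get((kx, kz), ()):
                    examined += 1
                    if math.hypot(ex - x, ez - z) <= radius:
                        found.append(name)
        return sorted(found), examined


# 100 entities laid out on a 10x10 pattern, 100 units apart.
grid = SpatialGrid(cell_size=50.0)
for i in range(10):
    for j in range(10):
        grid.insert(f"e{i}_{j}", i * 100.0, j * 100.0)

nearby, examined = grid.query_radius(0.0, 0.0, 120.0)
```

In this toy level the query looks at only the handful of entities stored in nearby cells, where the step-through-everything approach would have examined all 100.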
Most of what you said there was lost on me, but it's nice to hear you have found a better direction to go to speed things up. I am loving watching Reloaded grow in real time.
"I am going to build a spacial R-tree database of all entities, create 3D box volumes for the enclosed groups of entities, then generate pools of quads that share a communal texture. The quads and communal texture is generated just before the game starts and will take very little memory. I then re-use my Instance Stamp system batch together all the HIGH and MID static entities relative to the local position of the player, which means we only create what we need to see within out close range. I then use GPU occlusion detection with the r-tree volumes to work out what I don't need to render, and then add dynamic entities only where volumes are visible for that cycle. The quad will have multiple textures depending on which angle you are looking at it, and one for when looking at it from above. A shader will control the quad rotation and texture selection to avoid writing into the vertex buffer, and the same batches of static buffers and quads can be used by the shadow rendering process. The upshot is that we will only render the large pools of static common geometry that is visible, and only dynamic geometry that is associated with a visible r-tree volume, giving us the minimum amount to draw for the maximum performance."
I once thought for six hours and decided to get a new toaster rather than continue using the grill.
lol :P
Wow. Sounds like a really awesome solution :D It'll be (extremely) complicated to code, but WELL worth it IMO :)
Awesome blog today Lee!
Wouldn't it be possible, for large groups of entities like a forest, to draw only the "outer box" and not the whole inside? I mean, if we are only supposed to see a forest far away, and not go there, we won't need to see the trunks of the trees, just the tops.
I think it would be a great idea if you were able to resize the map. You don't always need that big a map, especially before we have drivable vehicles, ships and even airplanes :)
A smaller map would generate some better performance, yes?
Perhaps not a lot, but some.
How come, Lee, AAA game development engines are really fast, efficient and beautiful without the trouble you go through? I'm sure they have bugs to deal with, but in kits like CryEngine 3 the water is amazing and perfect, the models blend nicely with the ground and you can create any terrain with no performance problems. I know the developers have a huge team and money hanging out of their shoes, but what gets me is that Lee is on a rampage for performance, with vegetation being an issue, and these engines let everything Lee has to cut down on just appear on the screen with minimal hindering of performance.
Honestly can't wait for the release of Reloaded.
Lee quote: "I had a big six hour thought last night due to insomnia and came up with a radical plan to transform performance metrics in Reloaded. It involved a bottom-up approach to the solution which is 'what don't I need to draw right now'. "
Some of my best coding solutions (maybe even most) were done either during my non-sleep (insomnia -- tossing and turning) or even when I was asleep and suddenly woke up with an idea. I'm glad you went with it even though at first you really didn't want to do that.
To others who are not programmers: I'm not angry or anything but, well, unless you've been there and done that I don't think you should assume that others have done it better or found it easier. They certainly haven't found it easier. I don't know how many teams there are or how many people are on a team, but having teams allows the code to be broken up into simpler sections/segments, and all each person has to do is concentrate on one area of code rather than having to continually think about the whole. In this way it is so often easy to miss the tree for the forest. Or perhaps you don't even hear the fall of the tree because you haven't seen it. :BIG SMILE: