Having a great time profiling the FPSC Reloaded Engine with VTune at the moment. I started by switching off every component and running the profiler on a completely empty scene, no terrain, objects, sky, physics, anything. I then had a look at what might be hogging things. Turns out quite a few things.
It seems the engine was monitoring ALL the objects, even the invisible ones, for things like animation potential, mesh vertex update potential and other large loops. Completely redundant of course, and it hogged my CPU cycles. By changing these loops to use shortlists, each one now only visits the objects of interest, and the bottleneck completely disappeared.
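To illustrate the shortlist idea, here is a minimal C++ sketch. It is not the engine's actual code: the names `Object`, `ObjectManager` and `animShortlist_` are hypothetical, and it assumes an object only needs animation work when it is both visible and animating.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object record; the real engine's structure is not shown here.
struct Object {
    bool visible = false;
    bool animating = false;
    int  framesAdvanced = 0;   // stands in for real animation work
};

class ObjectManager {
public:
    // Adds an object and returns its index.
    std::size_t add(const Object& o) {
        objects_.push_back(o);
        const std::size_t id = objects_.size() - 1;
        refresh(id);
        return id;
    }

    // State changes go through here so the shortlist stays in sync.
    void setVisible(std::size_t id, bool v) {
        assert(id < objects_.size());
        objects_[id].visible = v;
        refresh(id);
    }

    // Per-frame update: visits only shortlisted objects, not every object.
    // Returns how many objects were touched.
    std::size_t updateAnimations() {
        for (std::size_t id : animShortlist_) {
            ++objects_[id].framesAdvanced;
        }
        return animShortlist_.size();
    }

    const Object& get(std::size_t id) const {
        assert(id < objects_.size());
        return objects_[id];
    }

private:
    // Adds or removes one object from the shortlist based on its state.
    void refresh(std::size_t id) {
        const Object& o = objects_[id];
        const bool wanted = o.visible && o.animating;
        auto it = std::find(animShortlist_.begin(), animShortlist_.end(), id);
        const bool listed = (it != animShortlist_.end());
        if (wanted && !listed) {
            animShortlist_.push_back(id);
        } else if (!wanted && listed) {
            animShortlist_.erase(it);
        }
    }

    std::vector<Object> objects_;
    std::vector<std::size_t> animShortlist_;  // indices of objects of interest
};
```

The cost moves from every frame to the rare moments when an object's state changes. The linear `std::find` in `refresh` is fine for a sketch, but a real engine would probably store each object's shortlist position to avoid it.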
At first I also thought the huge amount of time spent in NVD3DUM.DLL was something I could optimize, but I think a good engine spends most of its time in here, as that is where the CPU is constantly giving things to the GPU to process, which means more frames and faster games. My current guideline is to ensure that the engine always spends more time in this module than it does in the remaining modules, thus ensuring a fast throughput of polygons to the card and zero CPU stalling.
I still have an outstanding issue which is causing a DirectX Error crash due to skipping the texture sort each frame (a MASSIVE hog) but once I track down the specific object(s) responsible I can fix it properly and massage the texture sort system so I am not breaking something elsewhere.
As you can see, by ensuring the texture sort only happens when the overall number of objects in the engine changes (i.e. something got added or removed), I went from 143 to 208 by adding two extra lines of code and a new variable!
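The "two lines and a new variable" pattern might look something like the C++ sketch below. This is an illustration under assumptions, not the engine's real code: `DrawItem`, `RenderList` and `lastSortedCount_` are made-up names, and the sort key is simply a texture ID.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical draw record: only the texture ID matters for this sketch.
struct DrawItem {
    int textureId;
};

class RenderList {
public:
    void add(int textureId) { items_.push_back(DrawItem{textureId}); }

    void removeLast() {
        if (!items_.empty()) items_.pop_back();
    }

    // Called once per frame. Re-sorts by texture only when the object
    // count differs from the count at the last sort. Returns true if a
    // sort actually ran.
    bool prepareFrame() {
        if (items_.size() == lastSortedCount_) {
            return false;                      // nothing added or removed
        }
        std::stable_sort(items_.begin(), items_.end(),
                         [](const DrawItem& a, const DrawItem& b) {
                             return a.textureId < b.textureId;
                         });
        lastSortedCount_ = items_.size();      // the "new variable"
        return true;
    }

    const DrawItem& at(std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }

private:
    std::vector<DrawItem> items_;
    std::size_t lastSortedCount_ = 0;
};
```

One limitation of a count-only check like this: it will not notice an object swapping its texture, or one object being added while another is removed in the same frame, because the count stays the same. A dirty flag set by every change that affects the sort order would be a more conservative trigger.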
After I've solved the texture sort crash bug, I need to spend some time playing the game and using the editor and features of the engine to ensure I have not broken anything major. Better to fix those now, while I still know what code I changed, than a week from now when I won't have a clue.
I won't tease you with my current frame rate gains as they are very subjective, but I am happy to report that for every bottleneck I find and eliminate, the bottom line FPS jumps up. There is still the unavoidable issue that the engine drops back down to the 40 range when I try to draw a thousand shaded objects, but that is something I plan to tackle separately as it relates back to visuals and how the rendering order and quantity are handled.
Pretty happy with the performance work so far, and my hope is to bring you some solid news of the gains before the week is out. Until then, watch this space and keep your fingers crossed.