As we mentioned previously, the upcoming FluidMark 1.2, the next version of the popular GPU PhysX testing and benchmarking application, will include support for multi-core CPU PhysX calculations, along with general multi-threading optimizations.
Jerome Guinot, the FluidMark developer, was kind enough to provide us with the latest beta of the new FluidMark 1.2, so we will try to finally answer the question of which is faster: GPU PhysX or properly optimized CPU PhysX.
But first, let’s take a closer look at the new FluidMark.
The “Multi-core PhysX” checkbox enables all multi-threading optimizations, the vital and most interesting part of the new FluidMark.
The “# of CPU cores” option specifies the number of CPU cores dedicated to the simulation (up to 32 in the current version). However, this option is not as transparent as it looks: increasing the number of cores adds extra fluid emitters to the scene (in general, one emitter per one or two cores), and with an equal number of particles, a different number of emitters can affect performance.
The application window has also changed. The benchmark is still based on the SPH fluid simulation built into the PhysX SDK (latest version 220.127.116.11 is used), but the scene includes additional static objects, the particles’ appearance is different and, as mentioned earlier, several emitters can be used simultaneously. A nice addition is the GPU temperature overlay, useful for GPU stress testing.
The final Global score in benchmarking mode is now calculated in a different way and can’t be compared with scores from the previous version of FluidMark. It consists of two components: the GraphX score (graphics frames per second) and the PhysX score (physics simulations per second).
Thus, Global score = (GraphX_score * 0.3 + PhysX_score * 1.7) / 2
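To make the weighting concrete, here is a minimal sketch of the formula above in Python. The function name and the sample inputs are ours, chosen purely for illustration; only the weights 0.3 and 1.7 and the division by 2 come from the formula itself.

```python
def global_score(graphx_score, physx_score):
    """Combine the two sub-scores using the weights from the formula above."""
    return (graphx_score * 0.3 + physx_score * 1.7) / 2


# Hypothetical sub-scores, not measured results.
print(global_score(100, 100))
print(global_score(200, 100))  # doubling GraphX only
print(global_score(100, 200))  # doubling PhysX only
```

Since the weights add up to 2, the Global score is effectively a weighted average in which the PhysX score counts for 85% and the GraphX score for 15%, so a change in physics speed moves the result far more than the same change in rendering speed.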
Now, let’s do some testing.
The Wonder of Multi-Threading!
Take a look at the following graph:
[Three emitters were used (# CPU cores = 4) with a fixed number of particles: 15 000. Time range: 60 sec. 800x600 rendering window. System: C2Q 9400 @ 2.66 GHz CPU, Nvidia GTX 275 + GTX 260 (192 sp) GPUs, 4GB RAM, Win XP, PhysX System Software 9.09.1112]
When the “Multi-core PhysX” option is off, the PhysX simulation and scene rendering are done in the same thread and, more importantly, the PhysX SDK multi-threading flags are not set.
But when “Multi-core PhysX” is enabled, all PhysX simulations are done in separate threads. Since rendering still has its own thread, scene rendering gets a boost because PhysX no longer runs in the scene thread. The same goes for PhysX: one or several threads are completely dedicated to physics simulation.
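The difference between the two modes can be pictured with a small structural sketch. This is not FluidMark or PhysX SDK code: the function names and the counters standing in for real work are hypothetical, and Python threads do not give true CPU parallelism the way native threads do, so treat it only as an illustration of where the work lives.

```python
import threading


def run_single_thread(physics_steps, render_frames):
    """'Multi-core PhysX' off: physics and rendering share one thread."""
    counts = {"physics": 0, "render": 0}
    for _ in range(physics_steps):
        counts["physics"] += 1  # stand-in for one simulation step
    for _ in range(render_frames):
        counts["render"] += 1   # stand-in for drawing one frame
    return counts


def run_decoupled(physics_steps, render_frames):
    """'Multi-core PhysX' on: physics gets its own dedicated thread."""
    counts = {"physics": 0, "render": 0}

    def physics_loop():
        for _ in range(physics_steps):
            counts["physics"] += 1  # only this thread touches "physics"

    worker = threading.Thread(target=physics_loop)
    worker.start()
    for _ in range(render_frames):
        counts["render"] += 1       # render loop stays in the main thread
    worker.join()                   # wait for the physics thread to finish
    return counts


print(run_single_thread(1000, 600))
print(run_decoupled(1000, 600))
```

Both variants do the same amount of work. What changes is that in the second one neither loop has to wait for the other, which is why both the GraphX and the PhysX rates can go up at once.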
While the SPH fluid simulation is running on the CPU with “Multi-core PhysX” set to off, the load is distributed across several cores (probably due to internal Windows thread management), but in total it adds up to 26%, roughly one full core of the four.
But with multi-threaded optimizations enabled, the application utilizes all four cores at 100%, which results in a great speed boost.
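For readers wondering why 26% counts as one core: total CPU load on a quad-core system is reported across all four cores, so one fully busy core shows up as about 25%. A quick sketch of that conversion (the helper name is ours):

```python
def busy_cores(total_load_percent, core_count):
    """Convert a system-wide CPU load percentage into an equivalent number of fully busy cores."""
    return total_load_percent / 100 * core_count


# Figures from the test system above: a quad-core CPU.
print(round(busy_cores(26, 4), 2))   # "Multi-core PhysX" off
print(round(busy_cores(100, 4), 2))  # "Multi-core PhysX" on
```

So no matter how Windows shuffles the single-threaded workload between cores, it never adds up to much more than one core’s worth of work.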
In addition, one interesting detail was discovered: the fluid simulation runs faster on the GPU when one emitter is used, while the CPU, on the contrary, prefers multiple emitters (with an equal number of particles). That is probably a peculiarity of the PhysX SDK itself.
For example, with one emitter and multi-core PhysX switched off, the CPU simulation scores 36 global points (64 with 3 emitters, as on the graph above), while the GTX 275 GPU scores 247 points (128 with 3 emitters). But since one emitter can’t utilize more than two cores, the number of emitters was increased to put the CPU and GPU on equal footing.
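To show how much the emitter count skews the comparison, the snippet below simply divides the GPU score by the CPU score for both setups. The four numbers are the global scores quoted above (CPU with multi-core PhysX off); the dictionary layout and function name are ours.

```python
# Global scores quoted above, keyed by number of emitters.
scores = {
    1: {"cpu": 36, "gpu": 247},
    3: {"cpu": 64, "gpu": 128},
}


def gpu_advantage(entry):
    """How many times higher the GPU score is than the CPU score."""
    return entry["gpu"] / entry["cpu"]


for emitters, entry in scores.items():
    print(f"{emitters} emitter(s): GPU/CPU = {gpu_advantage(entry):.2f}")
```

With a single emitter the GPU leads by nearly a factor of seven, while with three emitters the gap shrinks to a factor of two, even though the particle count is the same in both cases.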
Therefore, benchmarking seems to be a little tricky in the new FluidMark. We are curious whether someone will come up with a solid method after the app is released.
P.S. Thanks to Jerome for the beta FluidMark and detailed explanations.