At the Game Developers Conference (GDC) 2016, NVIDIA announced the GameWorks 3.1 development kit, which introduces two new physics simulation solutions: PhysX GRB and NVIDIA Flow. Let’s take a closer look at them.
PhysX GRB is the new GPU-accelerated rigid body simulation pipeline. It is based on a heavily modified branch of PhysX SDK 3.4, but has all the features of the standard SDK and an almost identical API. PhysX GRB currently uses CUDA and requires an NVIDIA card for GPU acceleration.
Unlike previous implementations, PhysX GRB features a hybrid CPU/GPU rigid body solver, and the simulation can be executed on either the CPU or the GPU with almost no difference in behavior, supported object types or features (GPU articulations are not implemented yet, however).
GRB provides GPU-accelerated broad phase, contact generation, constraint solver and body/shape management. In addition, it introduces new implementations of island management and pair management, optimized to handle scenes an order of magnitude more complex than those that can be simulated on the CPU. It also provides new mechanisms to parallelize event notification callbacks and a new feature that lazily updates scene queries asynchronously.
GPU acceleration can be enabled on a per-scene basis by setting specific flags on the scene. In theory, any application that uses PhysX SDK 3.4 or later can choose to run some or all of its rigid body simulation on an NVIDIA GPU with no additional programming effort.
GPU rigid body simulation takes advantage of the massive parallelism afforded by modern GPUs and can provide speed-ups of 4x-6x or more over CPU simulation in scenes with a large number of objects.
However, for smaller scenes (typically fewer than 500-1000 bodies, depending on the hardware involved), simulating on the GPU tends to be slightly slower than simulating on the CPU, because simulating a scene on the GPU carries a fixed cost. This cost comes from the overhead of DMAing input data to the GPU, dispatching the kernels required for GPU simulation, DMAing back the results and synchronizing with the GPU.
The following graph shows a performance comparison between an i7-5930k CPU (6 threads are used) and a GTX 980 GPU when simulating a grid of stacks of 4 convex hulls. This scene does not put a massive amount of strain on either the broad phase or the narrow phase so most of the load in this demo is borne by the constraint solver.
As can be seen, the graph is dominated by the poor CPU performance at 16384 stacks. GPU simulation shows up to a 10-15x performance improvement.
Provided below is a graph demonstrating results for 1-4096 stacks to enable further performance analysis.
As can be seen, the cross-over point (where GPU simulation becomes faster than CPU simulation) lies somewhere in the 2ms range, between 256 convex stacks and 1024 convex stacks.
The second test scene simulates dropping a pile of random convex shapes. It puts a lot of pressure not only on the solver but also on the broad phase and contact generation, because it involves a far larger number of contact pairs that must be processed.
From these results, we can see that GPU simulation outperforms the CPU in the larger scenes.
Omitting the results for 27648 convexes in a pile, we see that the CPU and GPU take roughly the same amount of time when simulating 1728 convexes, and that the GPU is significantly faster than the CPU when simulating either 6912 or 13824 convexes.
The cross-over point for performance lies somewhere in the 2-3ms range, beyond which the GPU’s performance advantage is substantial.
The PhysX GRB SDK and demo are expected to be released to the public in the coming weeks.
NVIDIA Flow is a new computational fluid dynamics algorithm that simulates combustible fluids such as fire and smoke.
Flow features dynamic grid-based simulation and volume rendering. It also includes a hardware-agnostic DX11/DX12 implementation.