
GDC 2016: PhysX GPU Rigid Body and NVIDIA Flow

with 13 comments

At the Game Developers Conference 2016 (GDC), NVIDIA announced the GameWorks 3.1 development kit, which introduces several new physics simulation solutions – PhysX GRB and NVIDIA Flow. Let’s take a closer look at them:

PhysX GRB

PhysX GRB is a new GPU-accelerated rigid body simulation pipeline. It is based on a heavily modified branch of the PhysX SDK 3.4, but has all the features of the standard SDK and an almost identical API. PhysX GRB currently utilizes CUDA and requires an NVIDIA card for GPU acceleration.

Unlike previous implementations, PhysX GRB features a hybrid CPU/GPU rigid body solver, and the simulation can be executed either on the CPU or the GPU with almost no difference in behavior, supported object types or features (GPU articulations are not implemented yet, however).

GRB provides GPU-accelerated broad phase, contact generation, constraint solving and body/shape management. In addition, it introduces new implementations of island management and pair management that have been optimized to handle the order-of-magnitude more complex scenes that can be simulated on the GPU compared to the CPU. New mechanisms to parallelize event notification callbacks, and a new feature to lazily update scene queries asynchronously, are also provided.

GPU acceleration can be enabled on a per-scene basis by setting specific flags on the scene. In theory, any application that uses the PhysX 3.4 SDK or later can choose to run some or all of its rigid body simulation on an NVIDIA GPU with no additional programming effort.
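As a rough sketch of what those scene flags could look like in application code (based on the names that appear in the eventual public PhysX 3.4 API – they were not yet published at the time of this article, and the GRB branch may differ; `physics` and `cudaContextManager` are assumed to be already-created `PxPhysics` and `PxCudaContextManager` objects):

```cpp
PxSceneDesc sceneDesc(physics->getTolerancesScale());
sceneDesc.gravity       = PxVec3(0.0f, -9.81f, 0.0f);
sceneDesc.cpuDispatcher = PxDefaultCpuDispatcherCreate(4);
sceneDesc.filterShader  = PxDefaultSimulationFilterShader;

// Route rigid body dynamics and the broad phase to the GPU.
sceneDesc.gpuDispatcher  = cudaContextManager->getGpuDispatcher();
sceneDesc.broadPhaseType = PxBroadPhaseType::eGPU;
sceneDesc.flags         |= PxSceneFlag::eENABLE_GPU_DYNAMICS;

PxScene* scene = physics->createScene(sceneDesc);
```

Nothing else in the application has to change: the same actors, shapes and joints are created through the usual API, which is what "no additional programming effort" means in practice.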

GRB fully supports jointed objects. The scene presented contains 400 walker robots, each constructed from over 40 motorized joints and bodies.
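For reference, a motorized joint of the kind these walkers use might be set up like this in the standard PhysX 3.x API (a sketch, assuming `physics`, `body0` and `body1` already exist; GRB advertises the same API, so the code should be unchanged under GPU simulation):

```cpp
// Create a revolute (hinge) joint between two rigid bodies, anchored
// at each body's local origin, and drive it like a motor.
PxRevoluteJoint* joint = PxRevoluteJointCreate(
    *physics,
    body0, PxTransform(PxVec3(0.0f)),
    body1, PxTransform(PxVec3(0.0f)));

joint->setDriveVelocity(10.0f);  // target angular velocity, rad/s
joint->setRevoluteJointFlag(PxRevoluteJointFlag::eDRIVE_ENABLED, true);
```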

GPU rigid body simulation takes advantage of the massive parallelism afforded by modern GPUs and can provide speed-ups of 4x-6x and above compared to CPU simulation in scenes with large numbers of objects.

However, when simulating smaller scenes (commonly less than 500-1000 bodies depending on the hardware involved), simulating on the GPU tends to be slightly slower than simulating on the CPU, because there is a fixed cost associated with simulating a scene on the GPU. This cost is related to the overhead of DMAing input data to the GPU, dispatching the kernels required for GPU simulation, DMAing back the results and synchronizing with the GPU.
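This fixed-overhead argument can be captured in a toy cost model. The coefficients below are purely illustrative (not NVIDIA's measurements): CPU time grows roughly linearly with body count, while GPU time pays a fixed DMA/dispatch overhead plus a much smaller per-body cost.

```cpp
#include <cassert>

// Toy cost model for the CPU/GPU cross-over described above.
// Coefficients are illustrative placeholders, not measured values.
double cpuTimeMs(int bodies) { return 0.004 * bodies; }             // linear in body count
double gpuTimeMs(int bodies) { return 1.5 + 0.0005 * bodies; }      // fixed overhead + small slope

// Smallest body count at which the GPU pipeline wins under this model.
int crossoverBodyCount()
{
    int n = 1;
    while (gpuTimeMs(n) >= cpuTimeMs(n))
        ++n;
    return n;
}
```

With these illustrative coefficients the cross-over lands in the hundreds of bodies, consistent with the observation that scenes below roughly 500-1000 bodies tend to run faster on the CPU.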

The following graph shows a performance comparison between an i7-5930K CPU (6 threads are used) and a GTX 980 GPU when simulating a grid of stacks of 4 convex hulls. This scene does not put a massive amount of strain on either the broad phase or the narrow phase, so most of the load in this demo is borne by the constraint solver.

Provided by NVIDIA

As can be seen, the chart is dominated by the 16384-stack case, where the CPU falls far behind and GPU simulation shows a 10-15x performance improvement.

Provided below is a graph demonstrating results for 1-4096 stacks to enable further performance analysis.

Provided by NVIDIA

As can be seen, the cross-over point (where GPU simulation becomes faster than CPU simulation) lies somewhere in the 2ms range, between 256 convex stacks and 1024 convex stacks.

The second test scene simulates dropping a pile of random convex shapes and puts a lot of pressure not only on the solver, but also on the broad phase and contact generation, because it involves a far larger number of contact pairs that must be processed.

Provided by NVIDIA

From these results, we can see that GPU simulation outperforms the CPU in the larger scenes.

Omitting the results for 27648 convexes in a pile, we see that the CPU and GPU take roughly the same amount of time when simulating 1728 convexes, and that the GPU is significantly faster than the CPU when simulating either 6912 or 13824 convexes.

Provided by NVIDIA

The cross-over point for performance again lies somewhere in the 2-3ms range, after which the GPU’s performance advantage becomes substantial.

The PhysX GRB SDK and demo should be released to the public in the coming weeks.

NVIDIA Flow

NVIDIA Flow is a new computational fluid dynamics solver that simulates combustible fluids such as fire and smoke.

Flow features dynamic grid-based simulation and volume rendering. It also includes a hardware-agnostic DX11/DX12 implementation.

Written by Zogrim

March 17th, 2016 at 3:07 am

13 Responses to 'GDC 2016: PhysX GPU Rigid Body and NVIDIA Flow'


  1. Thank you for the info!

      

    Vojtech

    17 Mar 16 at 1:27 pm

  2. Interesting fact, btw – it seems that CPU execution is also pretty fast with this new solver



    Kapla Tower scene: ~ 15 000 rigid bodies, 20-25 fps on i7 3770K, simulation is set to use 7 threads (you can see proper CPU utilization)

    Ragdolls are the heaviest – around 15 fps on CPU, 40-45 on GPU

      

    Zogrim

    17 Mar 16 at 9:59 pm

  3. Really impressive, thanks for the article. Should see more impressive scenes with a higher rigid body count with that performance increase on the CPU.

    “GPU acceleration can be enabled on a per-scene basis by setting specific flags on the scene and theoretically, any application that uses the PhysX 3.4 SDK or later can choose to run some or all of its rigid body simulation on an NVIDIA GPU with no additional programming effort.”

    Hopefully we’ll see an exponential amount of games with a GPU acceleration option too!

    Have they mentioned if they plan on releasing those particular demos on the dev page with the SDK? :)

      

    Spets

    18 Mar 16 at 10:09 am

  4. Spets: Hopefully we’ll see an exponential amount of games with a GPU acceleration option too!

    There is a catch here – as you may have noticed, there is a cross-over point between CPU and GPU performance, and it lies at 1000+ bodies – a lot more than is usually used in current games.
    So you can’t simply enable GPU acceleration anywhere and get a boost – especially considering that the CPU GRB path is also faster than standard PhysX 3.3/3.4, since it has received optimizations as well.
    But GRB will be very useful for VFX, certain types of games (physics playgrounds like Besiege), cloud computing, etc.

    Spets: Have they mentioned if they plan on releasing those particular demos on the dev page with the SDK?

    In the coming weeks, as I heard.
    But it is still to be decided if the PhysX GRB SDK will replace the standard PhysX SDK or not. It is similar but also different (so it may “break stuff” in middleware integrations), and is not very useful on platforms like consoles and mobiles, so most likely GRB and the standard SDK will coexist for quite some time.

      

    Zogrim

    18 Mar 16 at 10:58 am

  5. Hi Zogrim

    The beauty of GRB being a hybrid simulation is that you can transition between CPU-only simulation and hybrid CPU/GPU simulation at run-time. We’re still ironing out how to best take advantage of this but it effectively means that, when there isn’t much going on, you could potentially run CPU-only simulation and, as the number of active bodies increases, the simulation can transition to the GPU, effectively removing the concern about smaller scenes being slower on the GPU. It should also be noted that the numbers involved are still very small (< 2ms) and that 2ms is the total time between calling simulate() and fetchResults(), rather than the time that the GPU is actually doing any work. The majority of this time in smaller scenes is host-side (CPU) overhead marshalling data and dispatching kernels rather than actual time that the GPU is busy doing anything. It is something that potential users should be aware of but, when simulating simpler scenes, the intention is that GRB should eventually never be slower (although the current state should still not cause significant problems because the performance in tiny scenes is still extremely quick).

      

    Kier Storey

    19 Mar 16 at 3:08 am

  6. Kier Storey:
    Hi Zogrim

    The beauty of GRB being a hybrid simulation is that you can transition between CPU-only simulation and hybrid CPU/GPU simulation at run-time. […]

    Will there be a demonstration that actively shows the transitioning between CPU/GPU anytime soon (or currently)?

      

    Spets

    19 Mar 16 at 4:16 am

  7. Kier Storey:
    Hi Zogrim

    The beauty of GRB being a hybrid simulation is that you can transition between CPU-only simulation and hybrid CPU/GPU simulation at run-time.

    Thanks for clarification, Kier :)

    Is it possible to mix CPU and GPU execution? (for example, run the broad phase on GPU and solve contacts on CPU). I think it may be beneficial in certain cases.

    Spets: Will there be a demonstration that actively shows the transitioning between CPU/GPU anytime soon (or currently)?

    You can switch between CPU and GPU calculation in the demo by pressing F5, and it works on the fly, without needing to restart the scene or reload the application. Noticeable lag is still present though.

      

    Zogrim

    19 Mar 16 at 11:40 am

  8. Hi Zogrim. Where can I download the demo? Or is it available only for registered developers?

      

    mareknr

    19 Mar 16 at 9:28 pm

  9. mareknr:
    Hi Zogrim. Where can I download the demo?

    I doubt you can right now, I got mine by request

      

    Zogrim

    19 Mar 16 at 11:52 pm

  10. Hi,

    the article mentions that joints and motors are supported, but can anyone tell me whether articulations work on the GPU?

      

    dg211

    22 Mar 16 at 2:28 pm

  11. dg211:
    Hi,

    the article mentions that joints and motors are supported, but can anyone tell me whether articulations work on the GPU?

    No, afaik articulations are not supported currently on GPU

      

    Zogrim

    22 Mar 16 at 2:42 pm

  12. We plan to add support for articulation later but, at the moment, we are limited to just rigid bodies and joints.

    The CPU-GPU hot-swap feature is still under development and the version present in the demo is a very early, inefficient prototype that was introduced only to support CPU/GPU switching in the demo. The final feature should be fast-enough to not cause noticeable hitches when a switch occurs.

      

    Kier Storey

    29 Mar 16 at 12:31 pm

  13. Hi, I’m a Japanese student and I’m really interested in GRB.
    So, I’d like to use PhysX 3.4 right away.
    Does anybody know the release date of PhysX 3.4?

      

    takku

    9 Apr 16 at 10:29 am


Copyright © 2009-2014. PhysXInfo.com | About PhysXInfo.com project | Privacy Policy
PhysX is trademark of NVIDIA Corporation