Mafia II is not using GPU for PhysX Clothing simulation ?
Anyone who has played Mafia II will eventually notice that enabling the special APEX PhysX content brings not only flowing clothing and particle debris, but also a significant framerate drop – even if you have a proper NVIDIA GPU and the latest PhysX System Software.
Update: APEX Clothing in numbers
Update #2: Tests with dedicated PhysX GPU added – PART III.
While preparing our PhysX tweaking guide, we discovered that physically simulated clothing on characters affects performance the most. Why is that? Cloth is one of the basic effects, used intensively in many GPU PhysX games – just remember Mirror’s Edge and its countless tearable banners and flags. Two answers come to mind:
- Clothing simulation in Mafia II is so detailed and so high-resolution that only top dedicated GPUs can run it at a proper framerate.
Unlikely. Even in the most intensive scenes the total cloth vertex count does not exceed 8000, while the framerate crawls around 20 fps on a GTX 470.
- Clothing simulation is running on the CPU, while it is supposed to be hardware accelerated on the GPU.
Plausible. However, as our further tests revealed, this proved to be true for single-GPU systems only.
Combine this with the facts that a) Cloth in the PhysX SDK does not use all available CPU resources by default and b) Clothing is a heavy computational task in any case, and you’ll see the probable reason for the poor performance. Let’s check this theory.
PART I – MAFIA II BENCHMARK.
For the first part of the tests we’ll use the benchmark built into Mafia II, run with two sets of APEX effects – Clothing only (Particles disabled using methods from the tweaking guide) and Particles only (Clothing omitted) – and with PhysX acceleration enabled or disabled in the NVIDIA Control Panel.
Settings: 1680x1024, AO/AA On, AFx16, APEX High. System: C2Q 9400, GTX 470, 4GB RAM, Win 7 x64, 198.32 GPU drivers, PSS 9.10.0513

Interesting results. While Particles clearly benefit from GPU acceleration, the PhysX switch does not affect Clothing simulation at all.
It seems our assumption was correct – APEX Clothing is calculated on the CPU in any case – but let’s confirm it with some deeper research.
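One way to make "not affecting at all" concrete is to compare the average framerate of the two runs and treat small differences as measurement noise. The sketch below is only an illustration: the function names, the 5% tolerance and the sample numbers are our own assumptions, not the measurements from the benchmark chart.

```python
def relative_change(fps_off: float, fps_on: float) -> float:
    """Fractional framerate change when GPU PhysX is switched from off to on."""
    if fps_off <= 0:
        raise ValueError("fps_off must be positive")
    return (fps_on - fps_off) / fps_off


def is_affected(fps_off: float, fps_on: float, tolerance: float = 0.05) -> bool:
    """True if the switch changes the framerate by more than `tolerance`.

    The 5% default is an assumed noise threshold, not a measured one.
    """
    return abs(relative_change(fps_off, fps_on)) > tolerance


# Placeholder numbers for illustration only, NOT the article's results.
print(is_affected(20.0, 20.4))  # False: a 2% difference is within the tolerance
print(is_affected(20.0, 35.0))  # True: a 75% gain is well outside it
```

With real data, the Clothing-only runs would fall into the first case and the Particles-only runs into the second.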
PART II – AGPERFMON.
The best way to find the truth is to look into the internals of the PhysX SDK/APEX implementation in Mafia II.
Coarse tools like PhysX Visual Indicator won’t help, and connecting PVD to the game is not an easy task, so we’ll use a profiler called AgPerfMon, which allows us to collect and review specific PhysX SDK simulation events.
First, to find out which events refer to GPU Cloth simulation, let’s profile a simple application – SampleCloth.exe from PhysX SDK 2.8.3.
SampleCloth.exe

We are using scene 2: a single Cloth object with wind acceleration applied.
Comparing two sets of data, extracted from SampleCloth running on the GPU (left picture) and on the CPU (right picture), we can identify which events are responsible for GPU simulation: their names start with “NgPrDeformable”, “NgPrCloth” and “_cuda_kernel_Deformable”.
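The same check can be expressed as a simple prefix filter over a list of event names. This is a minimal sketch under assumptions: it presumes the event names have already been copied out of the profiler into a plain Python list (we do not rely on any particular AgPerfMon export format), and the sample event names in the usage lines are made up apart from the three prefixes named above.

```python
from collections import Counter

# Prefixes identified from the SampleCloth comparison.
GPU_CLOTH_PREFIXES = ("NgPrDeformable", "NgPrCloth", "_cuda_kernel_Deformable")


def count_gpu_cloth_events(event_names):
    """Count events per GPU cloth prefix; names matching no prefix are ignored."""
    counts = Counter()
    for name in event_names:
        for prefix in GPU_CLOTH_PREFIXES:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


def cloth_runs_on_gpu(event_names):
    """True if at least one GPU cloth event was captured."""
    return sum(count_gpu_cloth_events(event_names).values()) > 0


# Hypothetical capture: the suffixes are invented for illustration.
sample = ["NgPrCloth_example", "NgPrDeformable_example", "SomeCpuEvent_example"]
print(count_gpu_cloth_events(sample))
print(cloth_runs_on_gpu(["SomeCpuEvent_example"]))  # False
```

An empty result for a capture taken while cloth is visibly simulated suggests the cloth work is being done on the CPU.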
Now, let’s check the actual game.
Mafia II – Chapter 15

For testing we used one scene from Chapter 15: APEX is set to High, there are no additional characters on screen, and APEX Clothing is represented by Vito’s casual suit with hat (616 dynamic cloth vertices). We also spawned some APEX Particles events by shooting at walls.
————
First run – with GPU PhysX enabled in drivers.
As expected, we weren’t able to find any of the GPU Cloth specific events familiar to us from SampleCloth (left picture). Instead, CPU Cloth simulation events were collected, so Clothing simulation is running on the CPU.
What about Particles (right picture)? They are running on the GPU, as you can see from the many “NgPrFluid” events gathered.
————
Second run – with GPU PhysX disabled in drivers.
The result is predictable: both the APEX Clothing (left picture) and APEX Particles (right picture) simulations are running on the CPU, as they should.
PART III – DEDICATED GPU SAVES THE DAY !
Now that a new GTS 450 has arrived to supplement our current GTX 470, it is time to find out whether a dedicated PhysX GPU can help with Clothing simulation – as NVIDIA has stated:
Clothing is running on CPU unless you have a GPU, fully dedicated to PhysX.
First set – the Mafia II built-in benchmark. Testing methods and in-game settings are similar to PART I, but a different software setup (260.63 GPU drivers and PSS 9.10.0514) was used this time.

A dedicated PhysX GPU gives a nice boost to Particles simulation and, more importantly, improves the framerate for pure Clothing simulation. So is it actually working as promised? Let’s check a few things first:
We measured GPU load on the dedicated GTS 450 with MSI Afterburner 2.0.0 while performing a benchmark run with only Clothing enabled.

As you can see, Clothing simulation puts some load on the dedicated PhysX GPU. Final test: profiling with AgPerfMon.
Lots of “NgPrDeformable” and “NgPrCloth” events indicate that, with a dedicated PhysX card, APEX Clothing simulation is fully calculated on the GPU, and thus a performance gain is achieved.
We can also say that in the actual game the difference is even bigger. Some situations (with APEX High and without any tweaks) that previously brought our single GTX 470 to its knees run smoothly with a dedicated GTS 450.
PART IV – CONCLUSION.
1) If you have a single NVIDIA GPU.
According to our results, in the current version of Mafia II the additional APEX Clothing content runs on the CPU regardless of PhysX settings (while it is expected to be hardware accelerated by NVIDIA GPUs) and thus produces a huge performance drop.
To improve performance you can either disable clothing simulation (fully or partially), as described in our guide, or overclock your CPU.
APEX Particles effects work normally and can be calculated on the GPU without any major performance loss.
2) If you have an NVIDIA GPU fully dedicated to PhysX calculations.
In this case both APEX Particles and APEX Clothing effects run on the dedicated GPU, leaving the game fully playable with the APEX High setting and un-tweaked content.
The emphasis is on fully dedicated: according to our data, the PhysX switch in the drivers must not be used in “Auto” mode, and we expect that SLI configurations may have problems with Clothing calculations too.
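The findings can be condensed into a small lookup. This is only a sketch of the behaviour we observed in this version of the game, not anything taken from the game or the drivers; the function and parameter names are our own, and the “Auto” driver mode and SLI setups are deliberately not modelled because we have no measurements for them.

```python
def expected_devices(gpu_physx_enabled: bool, dedicated_physx_gpu: bool) -> dict:
    """Where each APEX effect ran in our tests, by configuration.

    gpu_physx_enabled:   GPU PhysX switched on in the NVIDIA Control Panel.
    dedicated_physx_gpu: a second NVIDIA GPU is fully dedicated to PhysX.
    """
    if not gpu_physx_enabled:
        # Second AgPerfMon run: everything falls back to the CPU.
        return {"clothing": "CPU", "particles": "CPU"}
    if dedicated_physx_gpu:
        # PART III: both effects ran on the dedicated card.
        return {"clothing": "GPU", "particles": "GPU"}
    # Single GPU: particles are accelerated, clothing stays on the CPU.
    return {"clothing": "CPU", "particles": "GPU"}


print(expected_devices(gpu_physx_enabled=True, dedicated_physx_gpu=False))
# {'clothing': 'CPU', 'particles': 'GPU'}
```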

Hmm, 8000 cloth vertices is quite a lot in such a situation. (my 9800GT seems to be able to do 1000-2000 _active_ vertices in a realistic scenario in hardware, less if using software).
Speculation:
But there may be a whole range of valid reasons to use software cloth anyway. For example, it may be more efficient to use the CPU (e.g. the vertices must be skinned on the CPU, since PhysX does not provide a mechanism to feed in a GPU buffer). Also just having the additional effects increases GPU load due to rendering.
In practice the hardware PhysX may be enabling this by reducing the CPU cost of the particle systems.
Of course, bugs are also rather likely:-)
David Black
30 Aug 10 at 11:08 pm
Hmm, 8000 cloth vertices is quite a lot in such a situation
Cloth sim in Mirror’s Edge was using 1000-1500 vertices per tearable banner, and in some scene there were around 6-8 of them on screens simultaneously.
And it was running like a breeze on single GTX 260.
But there may be a whole range of valid reasons to use software cloth anyway
But what was the reason to make it so intensive anyway ?
In “heavy scene” I’ve mentioned there were 14 characters wearing 7935 cloth vertices combined – and it happens often on the streets.
Since there are no separate APEX Clothing settings and the LOD budget is set to high (2000 vertices would be more than enough), the whole APEX PhysX content (without tweaks) is just ruined for users without top (Core i7, i9, Phenom II X6, etc.) CPUs – like me. Regardless of the fact that I have a last-gen NV GPU.
It just does not make any sense.
Zogrim
30 Aug 10 at 11:47 pm
>> In “heavy scene” I’ve mentioned there were 14 characters wearing 7935 cloth vertices combined – and it happens often on the streets. <<
Future proof perhaps? But a lot of those vertices are probably constrained and faster to process (I would hope).
But I agree it would probably be better to run on the GPU now... One would prefer that installing a faster GPU would do more good than a faster CPU (esp. the NVIDIA people :-)
David Black