While something like the Samsung S4 (from 2013) isn't fast by any means, it sometimes surprised me how slow it is compared to an old dual-core laptop from 2011 and a relatively ancient Intel Q6600 (one of the first quad cores Intel released, in 2007). Still, having access to low-end platforms is excellent when looking at the nitty-gritty of performance optimization.
In general there are basically two flavours of optimization: Things that affect the CPU and things that affect the GPU. There is usually cross talk between both, but it's good to know what will affect one or the other.
CPU
A quick list of some of the CPU vampires:
- Physics (at least for now with PhysX 3.3, probably)
- Scripts (old-skool Mono based)
- Particles
- Terrain
- Object counts
Physics
- Make sure you're using primitive colliders where possible, that they're static, and you're telling Unity to bake down your physics at build time.
- Make sure you're using Unity's layer system to isolate your systems as much as possible, especially the expensive ones. There's no need for everything to be tested against everything else.
- If you have AI with location-based damage (arms, legs, head, torso, etc.), are you applying the previous rule? Or are all those individual colliders being tested every frame against everything?
- Better yet, shouldn't all those dynamic colliders be turned off most of the time anyways?
- Lots of AI moving around on a nav mesh, avoiding one another, can be a big hit.
- And of course other stuff that has been documented online by Unity and forum members.
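As a rough sketch of the layer and hit-location points above, the component below keeps an AI's limb colliders switched off until something actually needs them, and tells the physics engine to skip one layer pair entirely. The class name `HitboxToggle`, the layer names "Hitbox" and "Scenery", and the idea of assigning the limb colliders by hand in the Inspector are all assumptions made for illustration; the same layer filtering is normally set up once in the Layer Collision Matrix in the Physics settings.

```csharp
using UnityEngine;

// Hypothetical component for an AI character with per-limb hit colliders.
public class HitboxToggle : MonoBehaviour
{
    // Assumed layer names; they must exist in the project's Tags and Layers settings.
    [SerializeField] private string hitboxLayerName = "Hitbox";
    [SerializeField] private string sceneryLayerName = "Scenery";

    // Limb colliders (arms, legs, head, torso), assigned by hand in the Inspector
    // so the character's main movement collider is never touched.
    [SerializeField] private Collider[] hitboxes = new Collider[0];

    private void Awake()
    {
        int hitboxLayer = LayerMask.NameToLayer(hitboxLayerName);
        int sceneryLayer = LayerMask.NameToLayer(sceneryLayerName);

        // NameToLayer returns -1 for unknown layers, so only apply the rule
        // when both layers really exist. The setting is global, not per object.
        if (hitboxLayer >= 0 && sceneryLayer >= 0)
        {
            Physics.IgnoreLayerCollision(hitboxLayer, sceneryLayer, true);
        }

        // Start with the expensive colliders switched off.
        SetHitboxesActive(false);
    }

    // Call with true when the character can actually be hit (for example when
    // the player is aiming at it), and with false again afterwards.
    public void SetHitboxesActive(bool active)
    {
        foreach (Collider hitbox in hitboxes)
        {
            if (hitbox != null)
            {
                hitbox.enabled = active;
            }
        }
    }
}
```

When and how `SetHitboxesActive` gets called depends on the game; a cheap single trigger volume around the character is one possible way to decide.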
Scripts
Scripting can also have obvious pitfalls. While the topic is as broad and deep as the seven seas, I just want to focus on a few specific points that I personally encountered.
- Every script instance counts as an object in its own right, and depending on the platform and the scope of the game, this can cause issues. In an open world game, for example, it's important to keep an eye on how many active objects there are at any given time, and how many scripts these objects are using.
Is there a map system that looks for in-game objects, and then uses their positions to draw an icon? How many objects are there? Dozens? Hundreds? Do they all need to be active all the time?
- If the scripts call the Update function, can each script register itself with a manager, and have the manager call the update manually?
If there is a lot of logic branching in the Update function, have the most likely outcome happen first. It may seem like a minor optimization, but again, when dealing with thousands of scripts, it can add up!
- Cache those references at the beginning! If you know you're gonna be accessing a particular component or script all the time, stuff it into a variable during Awake() or Start().
- Also, depending on the platform or scenario, built-in math functions can be heavy.
Vector3.Distance(a, b) is nice and convenient
but...
Vector3 heading = new Vector3(a.x - b.x, a.y - b.y, a.z - b.z);
float sqrDist = heading.x * heading.x + heading.y * heading.y + heading.z * heading.z;
... is probably faster, especially if you're calculating in 2D using only "x" and "z". Note that this gives the squared distance, so compare it against a squared threshold (range * range) instead of taking the square root.
Use manual squares, like the example above, instead of Pow(), and skip Sqrt() wherever comparing squared values will do.
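To tie the scripting points together, here is a minimal sketch of the manager idea for a map-icon system like the one described above. One `Update()` on the manager drives every registered marker, each marker caches its Transform in `Awake()`, and the range test uses a squared 2D distance on "x" and "z". The names `MarkerManager`, `MapMarker`, `ManagedUpdate`, and `showRange` are made up for this example, and in a real project each class would live in its own file.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical manager: a single Update() drives every registered marker.
public class MarkerManager : MonoBehaviour
{
    // Static list so markers can register regardless of script execution order.
    private static readonly List<MapMarker> Markers = new List<MapMarker>();

    [SerializeField] private Transform player;      // assumed to be assigned in the Inspector
    [SerializeField] private float showRange = 50f; // assumed tuning value

    public static void Register(MapMarker marker) { Markers.Add(marker); }
    public static void Unregister(MapMarker marker) { Markers.Remove(marker); }

    private void Update()
    {
        // Most likely early-out first: nothing to do without a player.
        if (player == null) { return; }

        Vector3 playerPos = player.position;
        float sqrRange = showRange * showRange; // squared once, not per marker

        for (int i = 0; i < Markers.Count; i++)
        {
            Markers[i].ManagedUpdate(playerPos, sqrRange);
        }
    }
}

// Hypothetical marker: deliberately has no Update() of its own.
public class MapMarker : MonoBehaviour
{
    private Transform cachedTransform;

    public bool IsVisibleOnMap { get; private set; }

    private void Awake() { cachedTransform = transform; } // cache the reference once
    private void OnEnable() { MarkerManager.Register(this); }
    private void OnDisable() { MarkerManager.Unregister(this); }

    // Called by the manager instead of Unity calling Update() on every marker.
    public void ManagedUpdate(Vector3 playerPos, float sqrRange)
    {
        float dx = cachedTransform.position.x - playerPos.x;
        float dz = cachedTransform.position.z - playerPos.z;

        // Squared 2D distance: no Sqrt() and no Pow().
        IsVisibleOnMap = (dx * dx + dz * dz) <= sqrRange;
    }
}
```

Because disabled markers unregister themselves, switching a game object off also removes its per-frame cost from the manager loop.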
Particles
- If you need lots of particles, consider splitting your emitters across however many cores you have. Need 10000 particles and you're using a 4-core CPU? Use 4 instances of the same emitter doing 2500 particles each.
- Noise can make particles look great, but it's HEAVY, even on PCs. Don't bother with noise on mobile platforms unless you've got plenty of CPU to spare (meaning a super simple, almost bare-bones scene, or a low particle count). Low (1D) and Medium (2D) noise is OK for consoles, but save 3D noise for higher-end desktops. Even with the multiple emitter trick, you'll be paying a high price.
- Same with collision. If you need collision on particles on a low end platform, low quality collision and multiple emitters are about your only hope of keeping performance in check.
- With trails, you'll be paying a rendering cost (vertices), not so much an outright computational cost like with collision and noise. But this only applies if you're rendering super smooth trails with verts really close together.
- The more gradients are being used, the more processing time is being spent. Unity's documentation provides good breakdowns about this.
- And finally, if a particle system is not emitting, turn the game object OFF. It's surprising how much CPU time can be eaten up just by running through idle particle systems.
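The last point can be automated. The sketch below checks every so often whether a particle system (and its child systems) still has anything alive, and switches the game object off when it doesn't. The class name `IdleParticleDisabler` and the `checkInterval` value are assumptions for illustration, and whatever triggers the effect is expected to re-enable the game object before playing it again.

```csharp
using UnityEngine;

// Hypothetical helper that deactivates a particle effect once it has gone idle.
[RequireComponent(typeof(ParticleSystem))]
public class IdleParticleDisabler : MonoBehaviour
{
    // Assumed tuning value: how often (in seconds) to check for idleness.
    [SerializeField] private float checkInterval = 0.5f;

    private ParticleSystem system;
    private float nextCheckTime;

    private void Awake()
    {
        // Cache the component once instead of fetching it every check.
        system = GetComponent<ParticleSystem>();
    }

    private void OnEnable()
    {
        // Give a freshly enabled effect time to start before the first check.
        nextCheckTime = Time.time + checkInterval;
    }

    private void Update()
    {
        // Most common case first: it is not time to check yet.
        if (Time.time < nextCheckTime) { return; }
        nextCheckTime = Time.time + checkInterval;

        // IsAlive(true) also takes child particle systems into account.
        if (!system.IsAlive(true))
        {
            gameObject.SetActive(false);
        }
    }
}
```

With hundreds of effects, the check itself could be moved into a single manager so that the helper doesn't become its own pile of Update calls.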
Terrain
- Make the terrain as smooth as possible in out-of-bounds areas, and where it's hidden under objects. This will prevent it from sub-dividing and causing more draw calls than necessary. This can shave milliseconds off the CPU if you have large or multiple terrains.
- Crank the "Detail resolution per patch" value as high as it'll go before you start running out of verts. Even if you have no details on your terrain, the extra overhead needed to compute all the detail patches will eat into the CPU, especially on low end systems.
- Use the "Pixel Error" value to lower the terrain's "LOD" and mesh detail. The fewer chunks of the terrain you draw, the better. This may not be as big of a deal in the future, but for now, it helps a lot! The higher this value, the chunkier the terrain, but the reduction in draw calls could be worth it.
- The texture system uses 4 textures per pass. 1 texture is as expensive to draw as 4 textures. If you need 5 textures you may as well go all the way up to 8, because either way you'll be rendering the terrain in 2 passes.
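The pass arithmetic and the Pixel Error tweak can both be expressed in a few lines. This is only a sketch: the class name `TerrainQualityTuner`, the `lowEndPixelError` value of 20, and the choice to apply it only on mobile platforms are assumptions, and the 4-textures-per-pass figure is taken from the point above.

```csharp
using UnityEngine;

// Hypothetical helper that reports texture passes and coarsens terrain on mobile.
public class TerrainQualityTuner : MonoBehaviour
{
    // Figure taken from the article: 4 terrain textures are drawn per pass.
    private const int TexturesPerPass = 4;

    // Assumed value; tune it by eye for each project.
    [SerializeField] private float lowEndPixelError = 20f;

    // Rounds up to whole passes, so 5 textures and 8 textures both give 2.
    public static int PassesForTextureCount(int textureCount)
    {
        return (textureCount + TexturesPerPass - 1) / TexturesPerPass;
    }

    private void Start()
    {
        foreach (Terrain terrain in Terrain.activeTerrains)
        {
            if (terrain.terrainData == null) { continue; }

            int textures = terrain.terrainData.alphamapLayers;
            Debug.Log(terrain.name + ": " + textures + " textures, roughly "
                      + PassesForTextureCount(textures) + " pass(es)");

            // Only ever raise the pixel error, and only on low-end targets.
            if (Application.isMobilePlatform)
            {
                terrain.heightmapPixelError =
                    Mathf.Max(terrain.heightmapPixelError, lowEndPixelError);
            }
        }
    }
}
```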
Object counts
This is something that might be overlooked in a lot of projects, especially ones with huge open world style levels. And when I say "objects" I'm not referring only to game objects.
If there is a large rock asset with 4 LODs, the game will count it as at least 9 objects in memory: the LOD Group component, plus a Mesh Filter and a Renderer on each of the 4 LODs (1 + 4 x 2 = 9). Now multiply this asset hundreds of times, and it quickly adds up. UI canvases and individual components, audio, all those little helper scripts I mentioned earlier. It's death by a thousand cuts.
Even if the game has well optimized art and scripts, high total object counts could still cause a performance hit, depending on the platform. And unfortunately there isn't a quick and easy fix for this, other than explicitly turning off (or even unloading) anything that isn't being used or seen.
One approach is cutting scenes up into smaller chunks: turn an area of high-fidelity art consisting of multiple objects into a low-fidelity "vista" version made up of a single mesh. This can be a simple art swap or, if you want to get really fancy, two separate scenes: a high-detail scene the player interacts with, and a low-detail vista scene.
Using scenes this way is a fairly advanced technique, but it can cut object counts and draw calls by a LOT. Not to mention improving load times.
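A minimal sketch of the scene-based version, assuming the high-detail area lives in its own scene that has been added to the Build Settings. The class name `VistaSwapper`, the scene name "Village_Detail", and the `swapRange` value are all hypothetical. The vista mesh stays visible until the detail scene has finished loading, and comes back before the detail scene is unloaded.

```csharp
using UnityEngine;
using UnityEngine.SceneManagement;

// Hypothetical swapper between a single-mesh vista and an additive detail scene.
public class VistaSwapper : MonoBehaviour
{
    [SerializeField] private Transform player;      // assumed to be assigned in the Inspector
    [SerializeField] private GameObject vistaMesh;  // the low-detail single-mesh stand-in
    [SerializeField] private string detailSceneName = "Village_Detail"; // hypothetical name
    [SerializeField] private float swapRange = 150f;                    // assumed value

    private AsyncOperation pending;
    private bool detailLoaded;

    private void Update()
    {
        if (player == null || vistaMesh == null) { return; }

        // Wait for any load or unload in flight before deciding anything new.
        if (pending != null)
        {
            if (!pending.isDone) { return; }
            pending = null;

            // The detail scene has arrived, so the stand-in can be hidden.
            if (detailLoaded) { vistaMesh.SetActive(false); }
        }

        // Squared distance check, no square root needed.
        Vector3 offset = player.position - transform.position;
        bool playerIsNear = offset.sqrMagnitude <= swapRange * swapRange;

        if (playerIsNear && !detailLoaded)
        {
            detailLoaded = true;
            pending = SceneManager.LoadSceneAsync(detailSceneName, LoadSceneMode.Additive);
        }
        else if (!playerIsNear && detailLoaded)
        {
            detailLoaded = false;
            vistaMesh.SetActive(true); // show the stand-in before the detail goes away
            pending = SceneManager.UnloadSceneAsync(detailSceneName);
        }
    }
}
```

In practice a little hysteresis (a larger range for unloading than for loading) would help avoid repeated swaps when the player hovers right at the boundary.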