Vancouver Technical Artist

Wednesday, December 5, 2018

Fast distance calculations

I may as well put this into its own little post...

One of the things that I've been doing with scripts in Unity is finding ways to disable objects based on distance. Naturally LODGroups are meant for that, but I'm referring to things such as lights, particles, and other things that don't have a mesh renderer component.

As I mentioned in the previous post, while something like "Vector3.Distance(a, b)" is convenient, it also takes a square root on every call. So when dealing with potentially hundreds of objects every frame, it can add quite a lot to the CPU overhead.

The solution is to use a fast distance calculation:
// Returns the SQUARED horizontal (XZ) distance between a and b.
public static float FastDist2D(Vector3 a, Vector3 b)
{
    Vector2 heading;

    heading.x = a.x - b.x;
    heading.y = a.z - b.z;

    return heading.x * heading.x + heading.y * heading.y;
}

Since the action in most games takes place horizontally, it makes sense to simply use the X and Z coordinates without involving any square roots.
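
The reason skipping the square root is safe: for non-negative values, squaring preserves ordering. A short derivation in my own notation (not Unity's), assuming the threshold d is never negative:

```latex
\begin{aligned}
\Delta x &= a_x - b_x, \qquad \Delta z = a_z - b_z \\
\operatorname{dist}(a, b) &= \sqrt{\Delta x^2 + \Delta z^2} \\
\operatorname{dist}(a, b) < d \;&\Longleftrightarrow\; \Delta x^2 + \Delta z^2 < d^2 \qquad (d \ge 0)
\end{aligned}
```

So the value coming out of FastDist2D should always be compared against d * d, never against d itself.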

And an all-encompassing boolean distance check would look like this:
// True when a and b are closer than d on the XZ plane (d is expected to be non-negative).
public static bool FastDistCheck2D(Vector3 a, Vector3 b, float d)
{
    Vector2 heading;

    heading.x = a.x - b.x;
    heading.y = a.z - b.z;

    float distSqr = heading.x * heading.x + heading.y * heading.y;

    // Compare squared values so no square root is needed.
    return distSqr < d * d;
}

Clean and simple!
Calculating in 3D is just as easy, of course: simply make "heading" a Vector3 and add the "y" component, with the Z difference moving to "heading.z".

Wednesday, October 17, 2018

Lessons in art optimization

Over the last several years, the platforms I've used with Unity have ranged from slow, low end mobile phones and laptops (like the venerable Samsung S4 and dual core Intel PCs) to high end consoles and desktops. And it's been very interesting to see how each generation of Unity performed across these different platforms.

While something like the Samsung S4 (from 2013) isn't fast by any means, it sometimes surprised me with how slow it is compared to an old dual core laptop from 2011, and a relatively ancient Intel Q6600 (One of the first quad cores they released in 2007). But still, having access to low end platforms is excellent when looking at the nitty gritty of performance optimization.

In general there are basically two flavours of optimization: Things that affect the CPU and things that affect the GPU. There is usually cross talk between both, but it's good to know what will affect one or the other.

CPU

A quick list of some of the CPU vampires
  • Physics (At least for now with PhysX 3.3, probably)
  • Scripts (old-skool Mono based)
  • Particles
  • Terrain
  • Object counts
Physics is naturally one of those things that can eat a CPU alive, and I won't go into the do's and don'ts of the basics, since I'm assuming everyone knows what is "Too Much" based on their own design. But of course there may be areas or things that are overlooked which can cause performance issues.
  • Make sure you're using primitive colliders where possible, that they're static, and you're telling Unity to bake down your physics at build time.
  • Make sure you're using Unity's layer system to isolate your systems as much as possible, especially the expensive ones. No need for everything to be tested against everything else.
  • If you have AI with location based damage (arms, legs, head, torso, etc), are you applying the previous rule? Or are all those individual colliders being tested every frame against everything?
  • Better yet, shouldn't all those dynamic colliders be turned off most of the time anyways?
  • Lots of AI moving around on a nav mesh, avoiding one another, can be a big hit.
  • And of course other stuff that has been documented online by Unity and forum members.
It was difficult to optimize physics in Unity 4.x because even though an object was labeled static, the collider wasn't. It was only in Unity 5.x that they fixed things and gave you the option to bake things down, vastly improving performance when a lot of colliders were around (like trees on terrain).

Scripting can also have obvious pitfalls. While the topic is as broad and deep as the seven seas, I just want to focus on a few specific points that I personally encountered.
  1. Every script instance counts as an object, and depending on the platform and the scope of the game, this can cause issues. In an open world game, for example, it's important to keep an eye on how many active objects there are at any given time, or how many scripts these objects are using.

    Is there a map system that looks for in game objects, and then uses their positions to draw an icon? How many objects are there? Dozens? Hundreds? Do they all need to be active all the time?
  2. If the scripts each have their own Update() function, can each script register itself with a manager, and have the manager call the update manually?

    If there is a lot of logic branching in the Update function, have the most likely outcome happen first. It may seem like a minor optimization, but again, when dealing with thousands of scripts, it can add up!
  3. Cache those references at the beginning! If you know you're gonna be accessing a particular component or script all the time, stuff it into a variable during Awake() or Start().
  4. Also depending on the platform or scenario, built-in math functions can be heavy.

    Vector3.Distance(a, b) is nice and convenient
    but...
    Vector3 heading = new Vector3(a.x - b.x, a.y - b.y, a.z - b.z);
    distSqr = heading.x * heading.x + heading.y * heading.y + heading.z * heading.z;

    ... is probably faster, especially if you're calculating in 2D using only "x" and "z".

    Use manual multiplication, like the example above, instead of Mathf.Pow(), and compare squared values against a squared threshold so you can skip Mathf.Sqrt() altogether.
Particles are a wonderful thing with newer versions of Unity! And they're also one of the few systems that are brilliantly multi-threaded too. Of course it's important to know the sub-components of particles because some are more costly than others. So what are some of the best ways to optimize them?
  • If you need lots of particles, consider splitting your emitter into as many instances as you have cores. Need 10000 particles and you're using a 4-core CPU? Use 4 instances of the same emitter doing 2500 particles each.
  • Noise can make particles look great, but it's HEAVY, even on PCs. Don't bother with noise on mobile platforms unless you've got plenty of CPU to spare (meaning a super simple, almost bare-bones scene, or a low particle count). Low (1D) and Medium (2D) noise is OK for consoles, but save 3D noise for higher end desktops. Even with the multiple emitter trick, you'll be paying a high price.
  • Same with collision. If you need collision on particles on a low end platform, low quality collision and multiple emitters are about your only hope of keeping performance in check.
  • With trails, you'll be paying a rendering cost (vertices), not so much an outright computational cost like with collision and noise. And even that only matters if you're rendering super smooth trails with verts really close together.
  • The more gradients are being used, the more processing time is being spent. Unity's documentation provides good breakdowns about this.
  • And finally, if a particle is not emitting, turn the game object OFF. It's surprising how much CPU time can be eaten up just by running through idle particle systems.
Terrain. Unsurprisingly, the terrain system is in dire need of an overhaul, since as of this writing the terrain system in Unity 2018.2.11 is a leftover from the Unity 2.x days, with few updates in between. And while there are third-party solutions, there are still things to consider if you need to use the existing out of the box system.
  • Make the terrain as smooth as possible in out-of-bounds areas, and when it's hidden under objects. This will prevent it from sub-dividing and causing more drawcalls than necessary. This can shave milliseconds off the CPU if you have large or multiple terrains.
  • Crank the "Detail resolution per patch" value as high as it'll go before you start running out of verts (a higher value means fewer, larger patches). Even if you have no details on your terrain, the extra overhead needed to compute all the detail patches will eat into the CPU, especially on low end systems.
  • Use the "Pixel Error" value to lower the terrain's "LOD" and mesh detail. The fewer chunks of the terrain you draw, the better. This may not be as big of a deal in the future, but for now, it helps a lot! The higher this value, the chunkier the terrain, but the reduction in draw calls could be worth it.
  • The texture system uses 4 textures per pass. 1 texture is as expensive to draw as 4 textures. If you need 5 textures you may as well go all the way up to 8, because either way you'll be rendering the terrain in 2 passes.
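The pass count in that last point is simple arithmetic. Written out, assuming four textures per pass as described above and n terrain textures (the formula is my own shorthand, not something from Unity's documentation):

```latex
\operatorname{passes}(n) = \left\lceil \frac{n}{4} \right\rceil,
\qquad
\operatorname{passes}(1) = \operatorname{passes}(4) = 1,
\quad
\operatorname{passes}(5) = \operatorname{passes}(8) = 2
```

In other words, the cost jumps at texture 5 and stays flat until texture 9.
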
While Unity is updating the terrain system internally, it's not always clear when the bulk of the changes will actually be implemented. Unity 2018.3 will have GPU based terrain computation, which apparently speeds things up considerably. (At some point I will be examining the betas and doing some apples to apples comparisons.)

Object counts. This is something that might be overlooked in a lot of projects, especially ones with huge open world style levels. And when I say "objects" I'm not referring only to game objects.

If there is a large rock asset with 4 LODs, the game will count it as at least 9 objects in memory: the LOD group component, plus a Mesh Filter and Renderer on LOD0 and on each of the other 3 LODs. Now multiply this asset hundreds of times, and it quickly adds up. Add UI canvases and individual components, audio, and all those little helper scripts I mentioned earlier, and it's death by a thousand cuts.
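
To make that count concrete, here is the arithmetic for the rock example. The instance count of 300 is only an illustrative number, not a measurement from a real project:

```latex
\underbrace{1}_{\text{LOD group}} + \underbrace{4}_{\text{LODs}} \times \underbrace{2}_{\text{filter + renderer}} = 9
\qquad\Longrightarrow\qquad
300 \text{ rocks} \times 9 = 2700 \text{ objects}
```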

Even if the game has well optimized art and scripts, high total object counts could still pose a performance hit, depending on the platform. And unfortunately there isn't a quick and easy fix for this, other than explicitly turning off (or even unloading) anything that isn't being used or seen.

One approach is cutting scenes up into smaller chunks: turn an area of high-fidelity art consisting of multiple objects into a low-fidelity "vista" version made up of a single mesh. This can be a simple art swap, or, if you want to get really fancy, two separate scenes: a high-detail scene the player interacts with, and a low-detail vista scene.

Using scenes this way is a fairly advanced technique, but it can cut object counts and drawcalls by a LOT. Not to mention improving load times.

    Wednesday, October 10, 2018

    Been too long! New updates soon.

    Been too long since I posted, but I've been working, taking care of a growing family, and most importantly learning the ins and outs of Unity. From my humble beginnings in a late release of Unity 3.x, to the bleeding edge of Unity 2018.3 beta.

    I'm hoping to post some new content about some of the things I've learned with regards to optimization and art creation from a practical standpoint. Things beyond "Have you tried static batching?" "How about occlusion culling? It helps!" And why it doesn't always actually help.

    Look for these articles and discussions soon, like within a week or two!

    Monday, June 24, 2013

    Learning code from the Pros

    It's been a while since I've posted anything, again, but as is the way of things, sometimes there is little to post about. Recently I've been working at Goldtooth and learning a lot of Python and other code related etiquette when it comes to creating pipelines and workflows from scratch.

    While I've worked with a lot of programs and applications in the video game industry, it's great to be exposed to new tools in other industries, like Shotgun, Tank, RVIO, Nuke, etc. Although I will admit writing code to connect with a vast database can be daunting :)

    When I'm not connecting with databases and the like, I'm writing custom scripts to push thousands of files around... usually renaming them in the process. Sometimes renumbering entire frame sequences too. It's interesting to see how a good naming convention can really make parsing files a breeze, while files named all willy-nilly do just the opposite, or at least force me to write a custom file handler or brute force renamer to handle most situations.

    One of the main functions I've written for myself is copy_RenameOnly(), and it takes a lot of parameters.
    • srcDir: source directory for the files
    • dstDir: where the renamed files need to go
    • fileName: The name of the new file sequence. Honestly I've given up trying to automate this because of the sheer number of different file names I encounter during the day. At least entering it manually gives me full control.
    • extension: Again, when dealing with EXRs, JPGs, JPEGs, TGAs... sometimes within the same folder, I decided to give explicit instructions as to which file type this function needs to look for when grabbing the list of files.
    • duration: when copying a lot of files I sometimes need to copy only a certain frame range.
    • print only: A useful addition that allows me to double check my output to make sure I've gotten everything correct.
    Mind you, this may not seem like a big deal, but there are days when I may need to rename a couple dozen directories to what the client needs, so it helps to have all my custom tools at my disposal :)
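
For illustration, a function with those parameters could be sketched roughly as below. This is not the actual copy_RenameOnly(): the name copy_rename_sketch, the snake_case arguments, the "name.####.ext" output pattern and the (first, last) form of the frame range are all assumptions made for the example.

```python
import os
import re
import shutil


def copy_rename_sketch(src_dir, dst_dir, file_name, extension,
                       frame_range=None, print_only=True):
    """Copy a frame sequence from src_dir to dst_dir under a new name.

    Hypothetical sketch: assumes each source file ends in
    "<digits>.<extension>" and writes "<file_name>.<frame>.<extension>".
    """
    # Match the trailing frame number just before the extension.
    pattern = re.compile(r"(\d+)\.{0}$".format(re.escape(extension)),
                         re.IGNORECASE)
    planned = []
    for name in sorted(os.listdir(src_dir)):
        match = pattern.search(name)
        if match is None:
            continue  # wrong extension or no frame number
        frame = int(match.group(1))
        if frame_range and not frame_range[0] <= frame <= frame_range[1]:
            continue  # outside the requested frame range
        new_name = "{0}.{1:04d}.{2}".format(file_name, frame, extension)
        planned.append((os.path.join(src_dir, name),
                        os.path.join(dst_dir, new_name)))

    for src, dst in planned:
        if print_only:
            # Dry run: show what would happen without touching any files.
            print("{0} -> {1}".format(src, dst))
        else:
            shutil.copy2(src, dst)
    return planned
```

Defaulting print_only to True is a deliberate choice in this sketch: a dry run costs nothing, while a wrong rename across thousands of frames is painful to undo.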

    One of the things I thought of doing was writing a script which would parse the Nuke scripts, extract all the read-in nodes to get the source frames, and then actually have my script write a Python file and auto-populate all the necessary fields for me, so that I can just enter the destination file name and run it. That would save me a lot of time and ensure no typos in variable names etc.
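
The first half of that idea, pulling the source paths out of a Nuke script, might look something like the sketch below. It assumes Read nodes are stored as plain text blocks of the form "Read {" ... "}" with a "file" knob on its own line; quoted paths, nested braces and anything else a real script might contain are not handled. The function name find_read_files is made up for the example.

```python
import re

# Assumed shape of a Read node in a Nuke script (.nk). Real scripts may
# quote paths or nest braces, which this sketch does not handle.
READ_NODE = re.compile(r"^Read \{\n(.*?)^\}", re.MULTILINE | re.DOTALL)
FILE_KNOB = re.compile(r"^\s*file\s+(\S+)", re.MULTILINE)


def find_read_files(nk_text):
    """Return the 'file' knob value of every Read node found in nk_text."""
    files = []
    for node in READ_NODE.finditer(nk_text):
        knob = FILE_KNOB.search(node.group(1))
        if knob:
            files.append(knob.group(1))
    return files
```

Feeding it the text of a script should give back one path per Read node, in the order the nodes appear, while ignoring Write nodes and knobs such as file_type.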

    I have a habit of doing a lot of the following... if only to stay more or less Pythonic and keep my lines within 80 characters and be able to reuse variables easily:

    dst = "//localDrive/project/delivery_dir"
    src = "//networkDrive/project/show/dir1/dir2/etc..."
    shot_1 = "{0}/source_dir1".format(src)
    shot_2 = "{0}/source_dir2".format(src)
    shot_3 = "{0}/source_dir3".format(src)

    copy_RenameOnly(shot_1, dst, "GoodFileName_1", "exr", ...) 
    copy_RenameOnly(shot_2, dst, "GoodFileName_2", "exr", ...) 
    copy_RenameOnly(shot_3, dst, "GoodFileName_3", "exr", ...) 
    ...

    It looks streamlined, and it is to a degree, but I think it can be streamlined further by writing a code generator script to do the typing for me. Such is the life of a technical artist in the film and FX industry ^_^
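
A generator for the pattern above could be as small as the following sketch. The function name generate_rename_script and its arguments are hypothetical, and it only writes the four arguments visible in the calls above; whatever the real function needs beyond those is left to fill in by hand.

```python
def generate_rename_script(src_root, dst, shots, extension="exr"):
    """Return Python source text with one copy_RenameOnly call per shot.

    shots is a list of (source_subdir, new_file_name) pairs. Only the four
    arguments shown in the post are written out.
    """
    lines = [
        'dst = "{0}"'.format(dst),
        'src = "{0}"'.format(src_root),
    ]
    # One shot_N variable per source directory, mirroring the manual style.
    for index, (subdir, _) in enumerate(shots, start=1):
        lines.append('shot_{0} = "{{0}}/{1}".format(src)'.format(index, subdir))
    lines.append("")
    # One call per shot, using the matching shot_N variable.
    for index, (_, new_name) in enumerate(shots, start=1):
        lines.append('copy_RenameOnly(shot_{0}, dst, "{1}", "{2}")'.format(
            index, new_name, extension))
    return "\n".join(lines) + "\n"
```

Writing the returned text to a .py file gives a script in the same shape as the hand-written one, with the variable names generated rather than typed.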

    In personal news, I'm slowly working away on my new game project. I've put my racing game on hold to work on something simpler that I can tackle myself in a reasonable time frame. After all, my first project should be about creating something from start to finish, even if it doesn't end up selling millions :)

    I'll have more of that later!

    Tuesday, March 19, 2013

    Unreal Script

    Been a while since I've posted anything, but things have been busy for the past few months. Between contracts and family obligations I'm finally finding the time to work on my own project and dive head first into some Unreal Script.

    Right off the bat I'd forgotten how complicated it can be to learn this stuff, since I haven't used it in a while. Thankfully the syntax is mainly C / C++ style, so it feels familiar, but learning all the classes, or at least the classes I need, is daunting. It's pretty much Class extends Class extends UTClass extends UDKClass extends Class extends Actor extends Object, heheh :)

    I'm certainly appreciating how much a well set up IDE can benefit the learning process, so I think I'll track one down for .uc files to speed things up.

    I'll have more updates later on my progress.

    Thursday, December 13, 2012

    Switching gears

    As of a few days ago I have switched gears back to working with UDK, as it was suggested to me that being familiar with Unreal would be a big plus, currently. Thus my racing game is being put on hold as I work on a quality portfolio piece in Unreal :)

    Loading up the latest (Nov 2012) UDK really surprised me as to how many new features and visuals are present! This really got me excited as to what kinds of amazing things I'll be able to create, and the wheels in my head are already turning.

    I'll be reimagining a section (or sections) of a game that is near and dear to my heart, so stay tuned for more details later on in the week! :D

    Thursday, November 15, 2012

    Reworking the code

    With the release of Unity 4 I've decided to rework my code and make it more modularized again, just like what happened with Star Gen 4. This is partly to do with the fact that I've written myself a design doc streamlining and shaping my ideas, which in turn made me realize that the way my project was written made it unsustainable for large scale development beyond a proof of concept. And it's a great way to learn how to do things properly as well, or at least better than before.

    In my new iteration I'm planning on having the ships pretty much assemble themselves within the script instead of pre-assembling them in the editor as complex prefabs, as part of the modularization.

    With regards to art and style, I've been examining XGRA for inspiration since, like Extreme G3, it has some incredible race tracks, probably some of the best I've ever seen. And it makes me wonder what a game like F-Zero would be like in those types of levels. There is only one way to find out! Nothing like getting inspired, and wanting to reach the final product so you can play it!

    Now it's time to use the available time properly and make things happen :D

    Oh yeah, and I haven't given up on Star Gen 4 either, I just need to figure out a decent renderer, one that won't take a month to render a full skybox. It makes iteration and debugging rather painful!