Our tips to improve Unity UI performance when making games for mobile devices

In the Unite talk "Unite Europe 2017 - Squeezing Unity: Tips for raising performance", Ian Dundore covers things you can do in your game to improve performance and explains how Unity works behind the scenes.

Update: I just noticed (when I was about to finish this post) that there is another talk by Ian from Unite 2016, "Unite 2016 - Let's Talk (Content) Optimization", which has other good tips as well.

There are two techniques to improve Unity UI performance that we use at work and that are not mentioned in the video, so we want to share them in this blog post. One is using the CanvasGroup component and the other is using RectMask2D.

CanvasGroup

The CanvasGroup component controls the alpha value of all the elements in its RectTransform hierarchy, and whether that hierarchy handles input or not. The first is mainly used for rendering purposes, the second for user interaction.

What is awesome about CanvasGroup is that, if you want to avoid rendering a hierarchy of elements, you can just set alpha to 0 and the hierarchy is no longer sent to the render queue, which reduces GPU usage (or at least that is what the FrameDebugger and our performance tests say). If you also want that hierarchy to stop consuming events from the EventSystem, you can turn off the Blocks Raycasts property, which skips all the raycast checks for its children and reduces CPU usage. The combination of the two is important. It is also easier and more designer friendly than iterating over all the children that have a CanvasRenderer and disabling them, and the same goes for disabling/enabling all the objects that handle raycasts.

In our case, at work, we use multiple Canvas objects and keep all of them "disabled" (neither rendered nor handling input) using the CanvasGroup alpha and Blocks Raycasts properties. That greatly improves the speed of activating and deactivating our metagame screens, since it avoids regenerating the mesh and recalculating the layout, which is what GameObject.SetActive() triggers.

RectMask2D

The idea when using masks is to hide part of the screen in some way, even using particular shapes. We use masks a lot at work in the metagame screens, mainly to show stuff in a ScrollRect in a nice way.

We started out using just the Mask component, with an Image without a Sprite set, to crop what we wanted. Even though it worked OK, it did not perform well on mobile devices. After investigating a bit with the FrameDebugger, we discovered we had tons of render calls for stuff that was outside the ScrollRect (and the mask).

Since we only use rectangular containers for those ScrollRects, we switched to RectMask2D instead. For ScrollRects with a lot of elements, that change improved performance enormously, since in terms of render calls it was as if the elements outside the mask weren't there anymore.

This was a great change, but it only works if you are using rectangular containers; it doesn't work with other shapes. Note that the Unity UI Mask tutorial only shows image masks and doesn't say anything about performance cost at all (it should).

Note: when working with masks, a common technique is to add something over the mask to hide possibly ugly mask borders. We normally do that on all our ScrollRects that don't cover the whole screen.

Bonus track: The touch hack

There is another one, a hack we call the Touch Hack. It is a way to handle touch all over the screen without a render penalty. We are not sure it is a great tip, but it helped us.

The first thing we thought of for handling touch all over the screen (to do popup logic and/or block all the other canvases) was to use an Image, without a Sprite set, stretched to cover the whole screen with raycast enabled. That worked OK, but it was not only breaking the batch but also rendering a big, empty, transparent quad over the whole screen, which is really bad on mobile devices.

Our solution was to switch to a Text instead, also stretched to cover the whole screen but with neither Font nor text set. It doesn't consume render time (tested on mobile devices) and handles the raycasts as expected. I suppose this is because it doesn't generate a mesh (since it has neither text nor font set) while still having the bounding box for raycasts configured.

Conclusion

It is really important to have good tools to detect where the problems are and a way to know whether you are improving or not. We use the FrameDebugger a lot (to see what is being drawn, how many render calls there are, etc.), along with the overdraw Scene view and the Profiler (to see the Unity UI CPU cost).

We hope these tips help you improve the performance of your games even further when using Unity UI.

More

Optimizing Unity UI - http://www.tantzygames.com/blog/optimizing-unity-ui/

A guide to optimizing Unity UI - https://unity3d.com/es/learn/tutorials/temas/best-practices/guide-optimizing-unity-ui

Implementing Multiple Canvas Groups In Unity 5 - http://www.israel-smith.com/thoughts/implementing-multiple-canvas-groups-in-unity-5/

 


Using Unity Text to show numbers without garbage generation

This post shows different ideas, with some analysis, on how to use Unity UI Text to show numbers without generating garbage. I need this for a framerate counter and other debug values.

Test case

The test shows a number with a fixed number of digits on screen, regenerated each frame with a new random value.

Shows how the test scene used for all test cases works.

Using Strings

Since strings are immutable in C#, common operations on strings generate new strings and hence allocate new heap memory. If you use strings as temporary values, like when showing a changing number in a UI text, that memory becomes garbage. On PC that garbage could go unnoticed, but not on mobile devices, where it can result in a hiccup when the garbage collector decides to collect it.

The idea with these tests is to try to make the label work with strings without generating garbage. To detect generated garbage I am using the Unity profiler, and I am avoiding ToString() on int, float, etc., to measure only the cost of the string manipulation for now.

String concatenation

String concatenation generates 30 Bytes per frame since internally String.Concat() calls String.InternallyAllocateStr().

It is not as bad as expected: it just creates a new string with the combined length of the two strings and then copies their values into it. Obviously it gets worse when multiple concatenations are done in sequence.

Test code:

Text text;
 
static readonly string[] numbers = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };
 
void Start () {
    text = GetComponent<Text> ();
}
 
void Update () {
     
    string a = numbers[UnityEngine.Random.Range(0, numbers.Length)];
    string b = numbers[UnityEngine.Random.Range(0, numbers.Length)];
 
    text.text = a + b;
}

String format

Using string.Format() generates 176 Bytes per frame; internally it uses String.FormatHelper() plus StringBuilder.ToString(). The first creates a new StringBuilder and the second converts the StringBuilder to a string.

Test code:

 Text text;
 
 static readonly string[] numbers = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };
 
 void Start () {
     text = GetComponent<Text> ();
 }
 
 void Update () {
     string a = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     string b = numbers[UnityEngine.Random.Range(0, numbers.Length)];
 
     text.text = string.Format ("{0}{1}", a, b);        
     
 }

String Builder Format

Using a cached StringBuilder improves on the previous case a bit: it generates 86 Bytes per frame. AppendFormat() generates garbage, and so does set_Length() (used to clear the StringBuilder).

Test code:

 Text text;
 StringBuilder stringBuilder = new StringBuilder(20, 20);
 
 static readonly string[] numbers = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };
 
 void Start () {
     text = GetComponent<Text> ();
     stringBuilder.Length = 3;
 }
 
 void Update () {
     string a = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     string b = numbers[UnityEngine.Random.Range(0, numbers.Length)];
 
     stringBuilder.Length = 0;
     stringBuilder.AppendFormat ("{0}{1}", a, b);
 
     text.text = stringBuilder.ToString();
 }

Note: if I change the StringBuilder starting capacity and max capacity, the cost is the same but moves to the ToString() method instead, which internally ends up in the same String.InternallyAllocateStr() method.

String Builder only Append

Instead of using StringBuilder.AppendFormat(), this test uses only StringBuilder.Append(). This reduces the cost to only 30 Bytes per frame (the same as the first test); the only cost here is set_Length(), which internally calls String.InternallyAllocateStr().

Test code:

 Text text;
 StringBuilder stringBuilder = new StringBuilder(20, 20);
 
 static readonly string[] numbers = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };
 
 void Start () {
     text = GetComponent<Text> ();
     stringBuilder.Length = 3;
 }
 
 void Update () {
     string a = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     string b = numbers[UnityEngine.Random.Range(0, numbers.Length)];
 
     stringBuilder.Length = 0;
     stringBuilder.Append (a);
     stringBuilder.Append (b);
 
     text.text = stringBuilder.ToString();
 }

Note: the same behaviour happens if I change the starting and max capacity: the cost is the same but shows up in ToString() instead of set_Length().

String Builder by replacing chars

If, instead of using Append, I replace chars directly using the [] indexer and avoid set_Length(), the cost is the same, 30 Bytes per frame, since the String.InternallyAllocateStr() call moves to set_Chars().

Test code:

 Text text;
 StringBuilder stringBuilder = new StringBuilder(20, 20);
 
 static readonly char[] numbers = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
 
 void Start () {
     text = GetComponent<Text> ();
     stringBuilder.Length = 3;
 }

 void Update () {
     char a = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     char b = numbers[UnityEngine.Random.Range(0, numbers.Length)];
 
     stringBuilder [0] = a;
     stringBuilder [1] = b;
 
     text.text = stringBuilder.ToString();
 }

Note: again, the same behaviour happens if I change the starting and max capacity: instead of set_Chars(), the cost is in the ToString() method.

String Builder, access internal string by reflection

There is a suggestion in the Defective Studios post (listed in the References section) to use reflection to access the _str field of the StringBuilder class and avoid the cost of the ToString() method.

Test code:

 Text text;
 StringBuilder stringBuilder = new StringBuilder(20, 20);

 static System.Reflection.FieldInfo _sb_str_info = 
        typeof(StringBuilder).GetField("_str", 
        System.Reflection.BindingFlags.NonPublic | 
        System.Reflection.BindingFlags.Instance);
 
 static readonly char[] numbers = { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
 
 void Start () {
     stringBuilder.Length = 3;
 
     text = GetComponent<Text> ();
 }
 
 void Update () {
     stringBuilder[0] = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     stringBuilder[1] = numbers[UnityEngine.Random.Range(0, numbers.Length)];
     stringBuilder[2] = (char) 0;
 
     var internalValue = _sb_str_info.GetValue (stringBuilder) as string;
     text.text = internalValue;
 }

In this case, there is no garbage at all. However, I see no change in the UI Text on screen even though the editor shows the text field value changing, as if it is not being redrawn. I suppose that could be because the string reference never changes: m_Text already points to the same string object, so the m_Text != value check in the Text setter below never sees a difference... not sure here.

 public virtual string text
 {
     get
     {
         return m_Text;
     }
     set
     {
         if (String.IsNullOrEmpty(value))
         {
             if (String.IsNullOrEmpty(m_Text))
                 return;
             m_Text = "";
             SetVerticesDirty();
         }
         else if (m_Text != value)
         {
             m_Text = value;
             SetVerticesDirty();
             SetLayoutDirty();
         }
     }
 }

I tried forcing the layout and vertices dirty after updating the internal string, just in case, but had no luck (sad face).

Caching strings

Another option, suggested in the Catlike Coding FPS tutorial (listed in the References section), is to precache the strings for the different numbers, but that is only reasonable for a small number of digits. I like it because it is simple, the strings can be generated at runtime, and it works well for debug numbers like FPS, where the value is normally between 0 and 60.

I tried it and it works really well, generating 0 Bytes per frame.

Test code:

 Text text;
 
 string[] generated;
 
 // Use this for initialization
 void Start () {
     text = GetComponent<Text> ();
 
     generated = new string[100];
 
     // should go from 0 to 99.
     for (int i = 0; i < 100; i++) {
         generated [i] = string.Format ("{0:00}", i);
     }
 }
 
 // Update is called once per frame
 void Update () {
     int random = UnityEngine.Random.Range (0, generated.Length);
     text.text = generated [random];
 }
 

Rendering numbers directly

One possible way to avoid all this garbage (I mean both the code and the unused memory) is to not use strings at all and instead render an image for each digit of the number, where each digit is a different sprite.

When making the TinyWarriors prototype I did a basic number renderer where I could specify the number of digits, and it just created multiple Unity UI Images inside a horizontal layout.

Shows a test using images for each digit instead of a text.

Test code:

 public Image[] numbers;
 
 // in order, like 0, 1, 2, ..., 9
 public Sprite[] numberSprites;
 
 public bool fillZero = true;
 
 void Start()
 {
     SetNumber (0);
 }
 
 public void SetNumber(int number)
 {
     int tens = (number % 100) / 10;
     int ones = (number % 10);
 
     var tensActive = fillZero || tens != 0;
     var onesActive = fillZero || number > 0;
 
     numbers [0].gameObject.SetActive (tensActive);
     numbers [1].gameObject.SetActive (onesActive);
 
     if (tensActive)
         numbers [0].sprite = numberSprites [tens];
 
     if (onesActive)
         numbers [1].sprite = numberSprites [ones];
 }
 
 public void Update()
 {
     int random = UnityEngine.Random.Range (0, 100);
     SetNumber (random);
 }

The code could be adapted to support more digits. When profiling it in the editor there is a lot of garbage generation, around 1KB per frame, in Canvas.SendWillRenderCanvases(), because it forces a material rebuild each time a sprite is changed. However, I tested it on devices and it doesn't happen there, so it must be something related to the Unity editor.

Other strategies

Other strategies include minimizing the garbage generation by reducing the text update frequency, for example, by avoiding updating the text if the number didn't change and/or updating the text from time to time and not every frame.

Conclusion

Since I just wanted a solution for a framerate counter (and other debug numbers), the last solutions are perfect, and I believe they could even be extended to other game needs, like showing the player's score in an arcade game, with a bit of extra thinking.

References

Here is a list of some articles, forum threads and blog posts I looked at during the tests and while writing this post.

Unity memory optimizations article - https://unity3d.com/es/learn/tutorials/temas/performance-optimization/optimizing-garbage-collection-unity-games

Memory management reference - http://www.memorymanagement.org/

FPS implementation caching strings - http://catlikecoding.com/unity/tutorials/frames-per-second/

Using reflection to set StringBuilder string to avoid garbage - http://www.defectivestudios.com/devblog/garbage-free-string-unity/

FPS Asset - http://blog.codestage.ru/unity-plugins/fps/

Another FPS Asset - https://www.assetstore.unity3d.com/en/#!/content/6513

StringBuilder API - https://msdn.microsoft.com/en-us/library/system.text.stringbuilder(v=vs.110).aspx

Performance tips for Unity for mobile - https://divillysausages.com/2016/01/21/performance-tips-for-unity-2d-mobile/

Unity UI Source code - https://bitbucket.org/Unity-Technologies/ui

Unity Community Library - https://github.com/UnityCommunity/UnityLibrary


Vampire Runner version 1.0.3 - some performance improvements

Since the last update of Vampire Runner we had been experiencing some noticeable performance issues on the Android version, which is why we focused our efforts on improving it. The main problems were stuttering from time to time and really bad FPS on some devices.

We updated Vampire Runner in the Android Market with all the improvements we made:

  • Improved performance.
  • Improved graphics.
  • Removed the energy bar.
  • Fixed the instructions texts to be clearer.

Here is the QR code for easy access from your Android device:

We hope this new version works better for you, as it does for us. Enjoy the game.


Reusing Artemis entities by enabling, disabling and storing them

As we mentioned in a previous post, we were having some performance issues in Vampire Runner and we were trying different approaches to improve its performance.

Introduction

One limitation of Android when making games is that you have to avoid generating garbage whenever you can, since garbage collection causes pauses in your game and that leads to a bad user experience. So we should try to reuse already created objects instead of creating new ones.

In Vampire Runner, one problem was that we were creating a lot of entities at a specific moment of the game, when we detected that a new obstacle should be created, and that was causing pauses on the Android version.

As we use Artemis, we should try to reuse entities when we can. For example, if we make a shooting game (like the Jetpac prototype I made), it seems a good idea to reuse bullets since their life cycle is really short. Ziggy made two blog posts about this topic some weeks ago, here and here; however, we followed a slightly different approach and we will explain it in this post.

Storing entities to reuse them

We created a concept named Store (similar to the LibGDX Pool class) which lets us easily store objects, in this case entities of one kind (for example bullets).

	free(T t) // returns an entity to the Store to be reused later

	get() : T // returns an entity from the Store; it reuses an object from the free
			collection if there is one, or creates a new object otherwise.

The idea is, for example, that instead of creating a new bullet when a weapon is fired, we call store.get() and set the component values as they should be, and when the bullet collides with something we call store.free(e) instead of deleting the entity, so we can reuse it later.

This is a generic approach, and we can use different stores to reuse different kinds of entities, but it has a big problem: those entities remain in the Artemis world, which means they keep being processed (collisions, render, etc.). A basic solution to this problem was adding a new state to the entity, which we explain in the following section.
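As a rough illustration of the Store idea, here is a minimal Java sketch. It is not our actual implementation: only free() and get() come from the description above, while the Supplier-based factory and the createdCount counter are assumptions added so the example is self-contained.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical sketch of a Store; only free() and get() come from the post.
class Store<T> {
    private final ArrayDeque<T> freeObjects = new ArrayDeque<>();
    private final Supplier<T> factory; // assumption: creates a new object when none is free
    private int createdCount = 0;      // assumption: only here to observe reuse

    Store(Supplier<T> factory) {
        this.factory = factory;
    }

    // Returns a stored object if one is free, or creates a new one otherwise.
    T get() {
        T reused = freeObjects.pollLast();
        if (reused != null) {
            return reused;
        }
        createdCount++;
        return factory.get();
    }

    // Returns an object to the Store so a later get() can reuse it.
    void free(T t) {
        freeObjects.addLast(t);
    }

    int getCreatedCount() {
        return createdCount;
    }
}
```

With a store of bullets, calling get(), then free() on the result, then get() again hands back the same object, so nothing new is allocated for the second shot.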

Enabling and disabling Artemis entities

Artemis supports reuse of entities by internally caching created entities inside the World class. However, their state (which components they have) is not easily reused, and that was one of the big problems when creating a new entity: we wanted to reuse that state.

Our current solution to the problem was adding a new state to the entities: whether they are enabled or not. Being enabled means the entity is processed by all interested EntitySystems; being disabled means the entity is still in the Artemis world but is not processed by any system.

So, in our customization of Artemis we added three new methods to Entity to be called whenever you want to enable or disable an entity:

	disable() : disables an entity so it is not processed by EntitySystems

	enable() : enables an entity again so it is processed by EntitySystems

	isEnabled() :  returns true if the entity is enabled, false otherwise.

Then, we added new methods to the EntitySystem API to let each EntitySystem know when an entity of interest was enabled or disabled:

	disabled(Entity e) : called whenever an entity of this EntitySystem was disabled

	enabled(Entity e) : called whenever an entity of this EntitySystem was enabled

In our case, we are using them to enable and disable Box2D bodies in our PhysicsSystem, and also to remove them from our render layers in our RenderSystem.
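The interaction between the entity state and the system callbacks can be sketched roughly as follows. This is a simplified Java illustration, not the real Artemis classes: SketchEntity and SketchSystem are hypothetical names, the entity notifies its systems directly, and starting in the disabled state is an assumption that mimics an entity that was just pre-created and stored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an EntitySystem; not the real Artemis class.
class SketchSystem {
    // Entities this system currently processes (for example, bodies or render layers).
    final List<SketchEntity> active = new ArrayList<>();

    // Called whenever an entity of this system was enabled.
    void enabled(SketchEntity e) {
        active.add(e);
    }

    // Called whenever an entity of this system was disabled.
    void disabled(SketchEntity e) {
        active.remove(e);
    }
}

// Hypothetical stand-in for an Entity with the extra enabled state.
class SketchEntity {
    private final List<SketchSystem> interestedSystems = new ArrayList<>();
    private boolean enabled = false; // assumption: starts disabled, as if stored

    SketchEntity(List<SketchSystem> interestedSystems) {
        this.interestedSystems.addAll(interestedSystems);
    }

    boolean isEnabled() {
        return enabled;
    }

    // Enables the entity and notifies each interested system once.
    void enable() {
        if (enabled) {
            return;
        }
        enabled = true;
        for (SketchSystem system : interestedSystems) {
            system.enabled(this);
        }
    }

    // Disables the entity and notifies each interested system once.
    void disable() {
        if (!enabled) {
            return;
        }
        enabled = false;
        for (SketchSystem system : interestedSystems) {
            system.disabled(this);
        }
    }
}
```

The guard at the top of enable() and disable() keeps a system from being notified twice for the same state, which matters when the callback adds or removes something like a physics body.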

As an example, we have a nice video of Vampire Runner we made by changing the zoom of the camera to see behind the scenes:

As you can see, when entities like walls, fire and Christmas stuff are behind the main character, they disappear. That is because they are disabled and moved back to their stores, so they stop being processed by Artemis and, in particular, stop being rendered.

Conclusion

By combining both solutions, we have an easy way to reuse created entities of one kind, like our obstacle tiles in Vampire Runner, while at the same time we can disable them when they are in a store to avoid them being processed.

In the case of Vampire Runner, this solution improved performance, since we now pre-create a lot of the entities we need during the game, disable them, and enable them only when needed. This way we avoid creating a lot of entities in a single update after the game has started.

This is a first approach to the problem and it seems good for our current games, but it may not fit other types of games or bigger games; we don't know that yet.

If you use Artemis and have had this problem too, we hope this blog post is helpful to you.


Basic frustum culling to avoid rendering entities outside screen

As we were having some performance issues with Vampire Runner and we didn't have a clear idea of what was happening, we started trying some improvement techniques. The first one we implemented was a basic frustum culling technique to avoid trying to render objects outside of the screen.

Basic implementation

First, we created an Artemis component named FrustumCullingComponent with a Rectangle representing the bounds of that entity, to easily detect whether the entity is inside the screen or not. For now, as it is a basic implementation, the rectangle is only set when the entity is created. So, for example, if we know an entity is able to rotate during the game, we create a bigger bounding box using the box diagonal.
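To make the diagonal idea concrete: if a box of a given width and height rotates around its center, every corner stays at half the diagonal from that center, so a square whose side equals the diagonal, centered on the entity, contains it at any angle. The small Java helper below is a hypothetical sketch of that calculation (the class and method names are assumptions, and it assumes rotation around the box center).

```java
// Hypothetical helper; the name and signature are assumptions for illustration.
class RotationSafeBounds {
    // Side of the square, centered on the entity, that contains a
    // width x height box at any rotation around its center.
    static double side(double width, double height) {
        return Math.hypot(width, height); // length of the box diagonal
    }
}
```

For a 3 x 4 box the diagonal is 5, so a 5 x 5 square centered on the entity is enough whatever its rotation.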

Then, we added a method to our custom 2d Camera implementation to get the camera frustum (by making the corresponding transformations).

Finally, we modified our Artemis render system to check, before rendering, whether an entity has a FrustumCullingComponent. If it doesn't have one, we perform the render logic as we always did. If it has one, we check whether the bounds of that entity overlap with the camera frustum: if they do, we render as we always did; if they don't, we skip rendering that entity.
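The check in the render system boils down to a rectangle overlap test. Here is a minimal Java sketch of it, under some assumptions: Bounds is a hypothetical axis-aligned rectangle standing in for the Rectangle we use, and a null bounds argument models an entity without a FrustumCullingComponent.

```java
// Hypothetical axis-aligned rectangle standing in for the Rectangle used in the post.
class Bounds {
    final float x, y, width, height; // x, y is the lower-left corner

    Bounds(float x, float y, float width, float height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    // True when the two rectangles share some area.
    boolean overlaps(Bounds other) {
        return x < other.x + other.width && x + width > other.x
            && y < other.y + other.height && y + height > other.y;
    }
}

class FrustumCulling {
    // entityBounds == null models an entity without a FrustumCullingComponent.
    static boolean shouldRender(Bounds entityBounds, Bounds frustum) {
        if (entityBounds == null) {
            return true; // no culling data: render as always
        }
        return entityBounds.overlaps(frustum);
    }
}
```

With a frustum of 800 x 480 starting at the origin, an entity fully inside it or partially crossing its left edge is rendered, while one whose bounds start at x = 900 is skipped.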

Here is an example of the bounds and the frustum of the camera:

In the image, the elements (a) and (b) are rendered because their bounds overlap with the camera frustum. The element (c) is not rendered because its bounds are totally outside the camera frustum.

Conclusion

For Vampire Runner, we didn't notice any difference between having this technique enabled or not, since the game always renders fast (on our devices) and we had no metrics of the render process time. However, as it was really easy to implement this basic version of the technique, we believe it should help to maintain render performance, and we can reuse the logic for all of our games.

As always, hope you like it.
