AS3: Inline methods
Theory
You might know the concept of inlining from other programming languages such as C++. The idea is quite simple. If you are familiar with assembly languages and how a CPU processes them, you know that instructions are executed sequentially. When a jump occurs (e.g. a method call), the hardware must jump to the proper address within the code and then jump back once the call finishes. This mechanism costs extra time, so why not replace the jump with the code itself? That is how inlining works.
You can argue that this approach increases code size, and with it memory consumption, for every inlined call, and you are right. It is only a question of priority: whether you want better performance or lower memory consumption.
Another issue with inlined code can occur while debugging. How well the link between inlined bytecode and the regular code in your debug interface is handled depends heavily on the debugger. The situation is very similar to debugging optimized bytecode, so my recommendation is not to use inlining while debugging. I would also like to warn against using inlined code when compiling for iOS via AIR. It happened to me several times that the generated and optimized LLVM bytecode could not be debugged at all, and sometimes inlining specific methods even made the build fail with an unknown error.
In AS3, inlining is a simple mechanism that lets the compiler avoid method calls by replacing each call with the method body. This can improve performance significantly because method calls are quite expensive, as we said before, and even more so in the AVM.
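As a rough illustration of the idea, the sketch below shows a call site as written in the source and a hand-written approximation of the same call site after inlining. The class and method names (InlineSketch, square, sumOfSquares, sumOfSquaresInlined) are made up for this example, and the real transformation happens at the bytecode level, so the second method only approximates what the compiler produces.

```actionscript
package
{
    public class InlineSketch
    {
        // Small helper that is a candidate for inlining (hypothetical example).
        [Inline]
        public static function square(x:Number):Number
        {
            return x * x;
        }

        // Call site as written in the source: one method call per iteration.
        public static function sumOfSquares(values:Vector.<Number>):Number
        {
            var sum:Number = 0;
            for (var i:int = 0; i < values.length; i++)
            {
                sum += square(values[i]);
            }
            return sum;
        }

        // Hand-written approximation of the inlined result:
        // the call is gone and the body of square() sits directly in the loop.
        public static function sumOfSquaresInlined(values:Vector.<Number>):Number
        {
            var sum:Number = 0;
            for (var i:int = 0; i < values.length; i++)
            {
                var x:Number = values[i];
                sum += x * x;
            }
            return sum;
        }
    }
}
```

Both methods return the same result; the second one trades a slightly larger loop body for the absence of a call in every iteration.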
Inlining in ASC 2.0
ASC (ActionScript Compiler) version 2.0 brought the inlining capability. According to its documentation, a method can be inlined when it:
- is final or static, or its containing scope is file or package
- does not contain any activations
- does not contain any try or with statements
- does not contain any function closures
- has a body containing fewer than 50 expressions
You also need to add the -inline compiler argument, and I recommend marking inlined methods with [Inline] metadata to highlight them for both the compiler and the programmer. When the -inline compiler argument is used, methods that meet the rules above are inlined regardless of whether they carry this metadata.
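The rules above can be sketched with a few example methods. This is only an illustration under the assumption that the class is compiled with the -inline argument; the class and method names (InlineRules, addOne, clamp01, addTwo, isWellFormedXml, makeAdder) are hypothetical, and the comments simply restate which rule each method meets or breaks.

```actionscript
package
{
    public class InlineRules
    {
        private var m_value:Number = 0;

        // Meets the rules: final, short body, no try/with, no closures.
        [Inline]
        public final function addOne():void
        {
            m_value += 1;
        }

        // Meets the rules: static, short body.
        [Inline]
        public static function clamp01(v:Number):Number
        {
            return v < 0 ? 0 : (v > 1 ? 1 : v);
        }

        // Breaks the first rule: neither final nor static.
        public function addTwo():void
        {
            m_value += 2;
        }

        // Breaks the try/with rule: contains a try statement.
        public final function isWellFormedXml(text:String):Boolean
        {
            var ok:Boolean = true;
            try
            {
                new XML(text); // throws on malformed markup
            }
            catch (e:Error)
            {
                ok = false;
            }
            return ok;
        }

        // Breaks the closure rule: returns a function closure,
        // which also requires an activation to hold the captured local.
        public final function makeAdder():Function
        {
            var base:Number = m_value;
            return function(n:Number):Number { return base + n; };
        }
    }
}
```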
Bytecode
Let's now look at a simple implementation of two methods that increment a private class field. The first is not inlined; the second is inlined because it meets the rules mentioned above:
    private var m_x:uint = 0;

    public function _inc_n():void
    {
        m_x++;
    }

    [Inline]
    public final function _inc_i():void
    {
        m_x++;
    }
and their bytecode interpretation:
    function _inc_n():void
    {
        0   getlocal0
        1   pushscope
        2   getlex private::m_x
        4   convert_d
        5   increment
        6   findpropstrict private::m_x
        8   swap
        9   setproperty private::m_x
        11  returnvoid
    }

    function _inc_i():void
    {
        0   getlocal0
        1   pushscope
        2   getlex private::m_x
        4   convert_d
        5   increment
        6   findpropstrict private::m_x
        8   swap
        9   setproperty private::m_x
        11  returnvoid
    }
We can see above that their bytecode is identical. Each method simply fetches the variable, increments it and stores the value back. That's it. You are not surprised there is no difference, are you? :) You already know that the difference is in the call.
NOTE: You might notice that the variable is converted to a double before it is incremented. This is because the increment instruction takes a Number operand, and the convert_d instruction converts the value to Number. You can often see such "instruction pairs" in AVM bytecode.
Let's put these methods into context now. We create a script that calls our methods. The non-inlined first:
public function Main() { _inc_n(); }
In the bytecode representation of the script we can see the callpropvoid instruction, which performs the method call.
    function Main():*
    {
        0   getlocal0
        1   pushscope
        2   getlocal0
        3   constructsuper (0)
        5   findpropstrict _inc_n
        7   callpropvoid _inc_n (0)
        10  returnvoid
    }
Now the inlined method call:
public function Main() { _inc_i(); }
I highlighted the inlined block with comments. We have seen this logic before, in the _inc_i() method. It is obvious that the method call has been replaced with the method body. The code is not identical because variable fetching and scope handling differ at the call site.
    function Main():*
    {
        0   getlocal0
        1   pushscope
        2   getlocal0
        3   constructsuper (0)
        // Inlined method code placed here
        5   getlocal0
        6   getproperty private::m_x
        8   convert_d
        9   increment
        10  getlocal0
        11  swap
        12  setproperty private::m_x
        // end of the inline block
        14  returnvoid
    }
Performance
Let's figure out the performance gain of inlining. I ran a small test that calls the methods in a loop and measures the time needed to execute the calls. It repeatedly increases the number of calls to show how the duration evolves. I also left debug mode on so that the slow debug instructions help produce measurable results at low loop counts.
In the test I double the loop count at each step, using a shift operation for slightly faster execution. For time measurement I use the getTimer() method, and the end time is read directly after the inner for-loop to avoid distortion from the _reportStatus method call.
    package
    {
        import flash.display.Sprite;
        import flash.utils.getTimer;

        public class Main extends Sprite
        {
            /** Shift amount: each step multiplies the loop count by 2^MULTIPLIER */
            private static const MULTIPLIER:uint = 1;
            /** Too low a value gives immeasurable results */
            private static const STARTING_LOOPS:uint = 10000;
            /** Too high a value can cause delay and distortion of results */
            private static const MAX_LOOPS:uint = 2 << 15;

            private var m_loops:uint = STARTING_LOOPS;
            private var m_x:uint = 0;

            public function Main()
            {
                _test();
            }

            private function _test():void
            {
                var startTime:uint;
                var call:uint;

                m_loops = STARTING_LOOPS;
                while (m_loops < MAX_LOOPS)
                {
                    startTime = getTimer();
                    // Run exactly m_loops calls
                    for (call = 0; call < m_loops; call++)
                    {
                        // METHOD CALL HERE
                    }
                    // The getTimer() argument is evaluated before _reportStatus()
                    // runs, so the report itself is not part of the measurement
                    _reportStatus(startTime, getTimer());
                    m_loops <<= MULTIPLIER;
                }
            }

            private function _reportStatus(startTime:uint, endTime:uint):void
            {
                // Prints "<duration in ms> | <loop count>"
                trace((endTime - startTime) + " | " + m_loops);
            }

            //region TEST METHODS
            public function _inc_n():void
            {
                m_x++;
            }

            [Inline]
            public final function _inc_i():void
            {
                m_x++;
            }
            //endregion
        }
    }
It is logical that the duration grows exponentially from step to step, because the number of calls doubles each time. What we are looking for is the ratio between the call durations of the two result series. Based on the test, the statistical results with the debug configuration were:
D = 2,886254154
D = 3,032160526
and for the non-debug (fastest) version:
= 9,413533835
= 9,057931766
In the debug build, inlined calls ran at least about 2.8 times as fast as the non-inlined calls (280 %). With debug mode turned off, the ratio rose to about 9.4 (940 %). The ratio will probably be higher on more powerful machines. We can see that a method call really is an expensive operation.
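A ratio like the ones above can be computed from two series of timings, one per method variant. The sketch below is only one possible way to do it and is not necessarily the statistic used for the figures above. It assumes the durations reported by the test were collected into two vectors of equal length; the class RatioSketch, the helper meanRatio and the parameter names nonInlined and inlined are all hypothetical.

```actionscript
package
{
    public class RatioSketch
    {
        /**
         * Returns the mean of nonInlined[i] / inlined[i] over all steps.
         * Steps where the inlined duration is 0 ms are skipped, because
         * getTimer() reports whole milliseconds and the ratio is undefined.
         * Returns NaN when no step is usable.
         */
        public static function meanRatio(nonInlined:Vector.<uint>,
                                         inlined:Vector.<uint>):Number
        {
            var count:uint = Math.min(nonInlined.length, inlined.length);
            var sum:Number = 0;
            var used:uint = 0;

            for (var i:uint = 0; i < count; i++)
            {
                if (inlined[i] == 0)
                {
                    continue; // too fast to measure at this loop count
                }
                sum += nonInlined[i] / inlined[i];
                used++;
            }
            return used > 0 ? sum / used : NaN;
        }
    }
}
```

A result of 3 would mean that, on average, the non-inlined calls took three times as long as the inlined ones.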


Conclusion
Based on the theory, bytecode analysis and results of the test we can summarize when inline methods are good to use:
- when many method calls occur in the code (especially in loops)
- when we prefer better performance over lower memory consumption
- when not debugging
In my own experience, inlining is useful for building mathematical helpers, tool classes and similar constructs that are expected to be used in loops. If you need to cut down frame time, this could be a good solution.
References
- ActionScript Virtual Machine 2 (AVM2) Overview [online]
- ASC 2.0 (ActionScript Compiler) wiki [online]
- getTimer() reference [online]