Phffff=blog===
Example of Type-Based Optimizations in Factor
30 Jun 2018
Factor is dynamically typed, yet with words such as declare and TYPED: we can guide the compiler to generate more efficient code.
Consider the following two words and their definitions:
: add-untyped ( x y -- z ) + ;
TYPED: add-typed ( x: fixnum y: fixnum -- z: fixnum ) + ;
We will use the
disassemble
word for viewing assembly.
Let’s now compare the difference between add-untyped
and add-typed
.
First, add-untyped
:
00007f134c061820: 8905da27f4fe mov [rip-0x10bd826], eax
00007f134c061826: 8905d427f4fe mov [rip-0x10bd82c], eax
00007f134c06182c: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c061833: e9e8f115ff jmp 0x7f134b1c0a20 (+)
Very simple, this jumps to the generic word +
which will dynamically dispatch to the appropriate +
definition for fixnum
.
And now add-typed
:
00007f134c041c90: 89056a23f6fe mov [rip-0x109dc96], eax
00007f134c041c96: 53 push rbx
00007f134c041c97: 498b06 mov rax, [r14]
00007f134c041c9a: 4983ee08 sub r14, 0x8
00007f134c041c9e: 4983c708 add r15, 0x8
00007f134c041ca2: 498907 mov [r15], rax
00007f134c041ca5: e8d65dfefe call 0x7f134b027a80 ([ \ integer>fixnum-strict ~array~ 0 ~array~ inline-cache-miss ])
00007f134c041caa: 498b07 mov rax, [r15]
00007f134c041cad: 4983c608 add r14, 0x8
00007f134c041cb1: 4983ef08 sub r15, 0x8
00007f134c041cb5: 498906 mov [r14], rax
00007f134c041cb8: e8c35dfefe call 0x7f134b027a80 ([ \ integer>fixnum-strict ~array~ 0 ~array~ inline-cache-miss ])
00007f134c041cbd: 89053d23f6fe mov [rip-0x109dcc3], eax
00007f134c041cc3: 5b pop rbx
00007f134c041cc4: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c041ccb: e950ffffff jmp 0x7f134c041c20 (( typed add-typed ))
This first tries to coerce the two inputs to fixnum
and then jumps to an
address marked as (( typed add-typed ))
. The assembly for (( typed add-typed ))
is
00007f134c041c20: 8905da23f6fe mov [rip-0x109dc26], eax
00007f134c041c26: 53 push rbx
00007f134c041c27: 498b06 mov rax, [r14]
00007f134c041c2a: 498b5ef8 mov rbx, [r14-0x8]
00007f134c041c2e: 4801c3 add rbx, rax
00007f134c041c31: 0f801a000000 jo 0x7f134c041c51 (( typed add-typed ) + 0x31)
00007f134c041c37: 4983ee08 sub r14, 0x8
00007f134c041c3b: 49891e mov [r14], rbx
00007f134c041c3e: 8905bc23f6fe mov [rip-0x109dc44], eax
00007f134c041c44: 5b pop rbx
00007f134c041c45: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c041c4c: e96f5dfefe jmp 0x7f134b0279c0 ([ \ integer>fixnum-strict ~array~ 0 ~array~...)
00007f134c041c51: e80a1a45ff call 0x7f134b493660 (fixnum+overflow)
00007f134c041c56: 8905a423f6fe mov [rip-0x109dc5c], eax
00007f134c041c5c: 5b pop rbx
00007f134c041c5d: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c041c64: e9575dfefe jmp 0x7f134b0279c0 ([ \ integer>fixnum-strict ~array~ 0 ~array~...)
This will do an add
instruction directly and then check for overflow.
No dynamic dispatch involved.
Let’s test these out with some input values.
We’ll look at the assembly for [1 2 add-untyped]
and then [1 2 add-typed]
.
First, using add-untyped
:
00007f134c148370: 89058abce5fe mov [rip-0x11a4376], eax
00007f134c148376: 4983c610 add r14, 0x10
00007f134c14837a: 49c70620000000 mov qword [r14], 0x20
00007f134c148381: 49c746f810000000 mov qword [r14-0x8], 0x10
00007f134c148389: 890571bce5fe mov [rip-0x11a438f], eax
00007f134c14838f: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c148396: e98594f1ff jmp 0x7f134c061820 (add-untyped)
00007f134c14839b: 0000 add [rax], al
00007f134c14839d: 0000 add [rax], al
00007f134c14839f: 00 invalid
As expected, this jumps to add-untyped
.
And now using add-typed
:
00007f134c146850: 8905aad7e5fe mov [rip-0x11a2856], eax
00007f134c146856: 4983c610 add r14, 0x10
00007f134c14685a: 49c70620000000 mov qword [r14], 0x20
00007f134c146861: 49c746f810000000 mov qword [r14-0x8], 0x10
00007f134c146869: 890591d7e5fe mov [rip-0x11a286f], eax
00007f134c14686f: 488d1d05000000 lea rbx, [rip+0x5]
00007f134c146876: e9a5b3efff jmp 0x7f134c041c20 (( typed add-typed ))
00007f134c14687b: 0000 add [rax], al
00007f134c14687d: 0000 add [rax], al
00007f134c14687f: 00 invalid
This directly jumps to (( typed add-typed ))
which will directly do the add
instruction.
The compiler safely skipped the type coercion as it knows the inputs are both fixnum
.
View on GitHub