Fooling Truffle’s intelligence

Today I decided to skip ahead a bit and try to convince Truffle to actually compile my code on Graal (since I wasn’t fully convinced my JVM setup was working as expected). I’ll come back to the basic tutorials as needed, but honestly they have been boring me a bit.

Every time I called CompilerDirectives.inCompiledCode(), I got false. After adding some extra VM flags, I found that Truffle was queuing my nodes for compilation but never finishing it. This wasn’t the result I was looking for. Eventually, I settled on the following code:

package com.wordpress.hextruffle.tests;

import java.util.Random;

import com.oracle.truffle.api.CallTarget;
import com.oracle.truffle.api.CompilerDirectives;
import com.oracle.truffle.api.Truffle;
import com.oracle.truffle.api.TruffleRuntime;
import com.oracle.truffle.api.frame.VirtualFrame;
import com.oracle.truffle.api.nodes.Node;
import com.oracle.truffle.api.nodes.RootNode;

public class LoopInvocationTest {

    public static class HardMathNode extends Node {
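        public long exec() {
            long i;
            for (i = 0; i < 1_000_000; i++) {
                long j = (i * 2) + 1;   // always odd, so never zero
                i = (i - (j / j));      // j / j == 1, so this is i - 1
                i++;                    // cancels the subtraction above

                // A rarely taken branch with a side effect, so the loop cannot be
                // discarded as dead code.
                if (i % 1_000_000 == 0 && new Random().nextInt(100) == 42) {
                    System.out.println("TRAP");
                }
            }
            return i;
        }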
    }

    public static class ImmutableNode extends Node {

        @Child
        HardMathNode child;

        public ImmutableNode() {
            this.child = new HardMathNode();
        }

        public long exec(Integer q) {
            long i = 0;
            for (long m = 0; m < 10; m++) {
                i += Math.min(0, child.exec());
            }
            i += child.exec();
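            // The sign of the result shows which path ran: positive when this
            // method runs as Truffle-compiled code, negative in the interpreter.
            if (CompilerDirectives.inCompiledCode())
                return q + i;
            return -q - i;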
        }

    }

    public static class TestNode extends RootNode {
        private @Child ImmutableNode child;

        public TestNode(ImmutableNode child) {
            super();
            this.child = child;
        }

        @Override
        public Object execute(VirtualFrame frame) {
            System.out.println(CompilerDirectives.inCompiledCode());
            return child.exec((Integer) frame.getArguments()[0]);
        }

    }

    public static void main(String[] args) {
        TruffleRuntime runtime = Truffle.getRuntime();
        TestNode root = new TestNode(new ImmutableNode());
        CallTarget tgt = runtime.createCallTarget(root);
        for (int i = 0; i < 120; i++) {
            long start = System.currentTimeMillis();
            long r = (long) tgt.call(i);
            System.out.println(System.currentTimeMillis()-start+":"+r);
        }
    }
}

Also, I am using VM flags -server -G:+TraceTruffleCompilationDetails -XX:+TraceDeoptimization -G:+TraceTrufflePerformanceWarnings -G:TruffleCompilationThreshold=5 to debug compilation.

This code keeps Truffle busy in the interpreter long enough for the background compilation to actually finish. First, I have a root node that is called 120 times with an argument that changes on every call. Calling the same code repeatedly is what convinces Truffle to compile it, and my flags set the threshold to 5 calls before Truffle starts trying. The HardMathNode contains a bunch of operations that are supposed to fool the default JIT. Before, when I had a simple for loop, the node was optimized away immediately, since it was just dead code with no side effects. Adding the trap helped too: the branch has a visible side effect (the println), so no compiler can throw the loop away as dead code.
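To sanity-check the arithmetic, here is a plain-Java sketch with no Truffle dependency. It repeats the HardMathNode loop without the Random trap and adds an iteration counter, then mirrors the return logic of ImmutableNode.exec(). The class name, the method names, and the boolean `compiled` parameter (a stand-in for CompilerDirectives.inCompiledCode()) are made up for illustration.

```java
final class LoopArithmeticSketch {

    // Same arithmetic as HardMathNode.exec(), minus the Random trap.
    // Returns {final value of i, number of loop iterations}.
    static long[] hardMath() {
        long iterations = 0;
        long i;
        for (i = 0; i < 1_000_000; i++) {
            long j = (i * 2) + 1;   // always odd, so never zero
            i = i - (j / j);        // j / j == 1, so this is i - 1
            i++;                    // cancels the subtraction
            iterations++;
        }
        return new long[] {i, iterations};
    }

    // Mirrors ImmutableNode.exec(q); "compiled" is a hypothetical stand-in
    // for what CompilerDirectives.inCompiledCode() would report.
    static long expectedResult(int q, boolean compiled) {
        long i = 0;
        for (int m = 0; m < 10; m++) {
            i += Math.min(0, hardMath()[0]); // min(0, positive) adds nothing
        }
        i += hardMath()[0];
        return compiled ? q + i : -q - i;
    }

    public static void main(String[] args) {
        System.out.println(expectedResult(7, false)); // prints -1000007
        System.out.println(expectedResult(7, true));  // prints 1000007
    }
}
```

So the odd-looking loop body changes nothing: the loop still runs a full million iterations, and the sign of the number printed after the colon tells you whether that call ran compiled or interpreted.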

When I finally compiled and ran this code, after only 5 iterations the following was shown, meaning that Graal was queueing and starting compilation of my nodes:

[truffle] opt start        TestNode@12f40c25                                           |ASTSize       4/    4 |Calls/Thres       5/    3 |CallsAndLoop/Thres       5/    5 |Inval#              0 
[truffle] opt queued       TestNode@12f40c25                                           |ASTSize       4/    4 |Calls/Thres       5/    3 |CallsAndLoop/Thres       5/    5 |Inval#              0

Compilation finished after what felt like a good ten seconds (!), although the log line itself reports about 7.2 seconds:

[truffle] opt done         TestNode@12f40c25                                      |ASTSize       4/    4 |Time  7205(1332+5872)ms |DirectCallNodes I    0/D    0 |GraalNodes   727/ 2700 |CodeSize        12616 |Source            n/a 
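If you want to pull the compile time out of a line like this programmatically, a small regex is enough. This is only a sketch: the class and method names are hypothetical, the pattern is fitted to the one line shown above rather than to any documented format, and I am assuming the two numbers in parentheses are two phases of the compilation, which the output itself does not spell out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OptDoneTimeSketch {

    // Matches a field shaped like "|Time  7205(1332+5872)ms".
    private static final Pattern TIME_FIELD =
            Pattern.compile("\\|Time\\s+(\\d+)\\((\\d+)\\+(\\d+)\\)ms");

    // Returns {total, first part, second part} in milliseconds,
    // or null when the line has no Time field.
    static long[] parseTimes(String logLine) {
        Matcher m = TIME_FIELD.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        String line = "[truffle] opt done         TestNode@12f40c25 "
                + "|ASTSize       4/    4 |Time  7205(1332+5872)ms |DirectCallNodes I    0/D    0";
        long[] t = parseTimes(line);
        System.out.println("total " + t[0] + " ms, parts " + t[1] + " ms and " + t[2] + " ms");
    }
}
```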

Much to my dismay, the Graal-compiled version is nearly five times slower than the uncompiled one (around 1000 msec per call compared to 200-230 msec). To be fair, the "uncompiled" version was really running on a much more mature JIT, and this is an extremely contrived example. I will have to build a VM that uses Graal as the main JIT as well as the Truffle JIT to test this hypothesis.
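For the record, the slowdown factor is just a ratio of mean per-call times. A minimal sketch, with hypothetical class and method names, using only the round figures quoted above as placeholder inputs (they are not a real measurement log):

```java
final class SlowdownSketch {

    // Arithmetic mean of a non-empty array of per-call timings in milliseconds.
    static double mean(long[] timingsMs) {
        long sum = 0;
        for (long t : timingsMs) {
            sum += t;
        }
        return (double) sum / timingsMs.length;
    }

    // How many times slower the "after" calls are than the "before" calls.
    static double slowdown(long[] beforeMs, long[] afterMs) {
        return mean(afterMs) / mean(beforeMs);
    }

    public static void main(String[] args) {
        // Placeholder values: the two ends of the 200-230 msec range
        // and the rough 1000 msec figure, not measured samples.
        long[] interpreted = {200, 230};
        long[] compiled = {1000};
        System.out.println(slowdown(interpreted, compiled));
    }
}
```

With those inputs the factor comes out a little above 4.6, and between roughly 4.3 and 5 if you take either end of the range on its own. To feed it real numbers, collect the millisecond values printed by the main loop before and after the "opt done" line.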
