August 12, 2019

What is GraalVM?

It happened at the beginning of the year, while I was scanning my Twitter feed. I came across a tweet announcing one of the GraalVM release candidates, followed by a series of other interesting tweets. The buzz around GraalVM had evidently already started, and I wondered why. A few more tweets and articles later, I found that the whole Java ecosystem was going crazy about GraalVM. More and more frameworks proudly announced support for GraalVM, and people began to embrace it for serverless and containers.

What is GraalVM?

According to the GraalVM website:

GraalVM is a universal virtual machine for running applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Groovy, Kotlin, Clojure, and LLVM-based languages such as C and C++.

Before explaining and showing what that means, let's look at its history.

Java <= 8

Up to Java 8, HotSpot was the main Oracle Java virtual machine for running Java programs. The JVM is written in C/C++ and was getting more complex with every release. The purpose of the HotSpot JVM is to run Java bytecode. Running bytecode means:

Interpret bytecode and execute it using machine code
Continuously analyze the program performance for so-called “hot spots” that are often executed
Once “hot spots” are identified, just-in-time (JIT) compile them into native code (machine code)
Such JIT-compiled code is then executed natively, with no bytecode interpreter involved.
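To watch this cycle happen, here is a minimal sketch. The class HotLoopDemo and the method sumOfSquares are names I made up for illustration. The method is small and called thousands of times, so HotSpot is likely to treat it as a hot spot, although the exact thresholds depend on the JVM version and settings.

```java
// Hypothetical demo class: the names are illustrative only.
class HotLoopDemo {

  // A small method that becomes a "hot spot" when it is invoked many times.
  static long sumOfSquares(int n) {
    long sum = 0;
    for (int i = 0; i < n; i++) {
      sum += (long) i * i;
    }
    return sum;
  }

  public static void main(String[] args) {
    long total = 0;
    // Call the method often enough that HotSpot is likely to JIT-compile it.
    for (int round = 0; round < 20_000; round++) {
      total += sumOfSquares(1_000);
    }
    System.out.println(total);
  }
}
```

Running it as java -XX:+PrintCompilation HotLoopDemo makes HotSpot log the methods it compiles; sumOfSquares should show up in that log once it has been interpreted often enough.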

Java >= 9

Thanks to JEP 295, Java 9 got an ahead-of-time (AOT) compilation feature: the ability to compile Java classes to native code before launching the JVM. It used Graal as the code-generating backend. Graal is a compiler written in Java that demonstrated it could generate highly optimized code. So the loop was closed: a Java compiler written in Java ;) The main downside of this approach is that the native code is platform-dependent.

Graal laid the foundation for a new project called GraalVM. Oracle started to develop a new VM to tackle the vast and complex C/C++ codebase of the HotSpot JVM, but also to make the VM polyglot (able to run programs written in many programming languages).

Slide from a Devoxx talk by Thomas Wuerthinger

Java HotSpot VM: still a C/C++ codebase, but thanks to JEP 243 (Java-Level JVM Compiler Interface) it has an interface for Graal
Graal is a dynamic Java compiler written in Java
Truffle is an open-source library (already part of the GraalVM project) for building programming language implementations as interpreters for self-modifying abstract syntax trees.
Thanks to Truffle, GraalVM allows us to compile and execute code in many languages.
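To give a feeling for what a "self-modifying" AST interpreter means, here is a toy sketch in plain Java. It does not use the Truffle API, and every class name in it is invented for illustration. The generic Add node observes its operand types on the first execution and then delegates to a specialized node. Real Truffle nodes also handle the case where a later execution sees different types, which this sketch leaves out.

```java
// Toy illustration only: plain Java, NOT the Truffle API. All names are hypothetical.
class SelfModifyingAstDemo {

  // Every AST node can be executed to produce a value.
  abstract static class Node {
    abstract Object execute();
  }

  // Leaf node holding a constant.
  static final class Literal extends Node {
    private final Object value;
    Literal(Object value) { this.value = value; }
    @Override Object execute() { return value; }
  }

  // Specialized node: assumes both operands are integers.
  static final class IntAdd extends Node {
    private final Node left;
    private final Node right;
    IntAdd(Node left, Node right) { this.left = left; this.right = right; }
    @Override Object execute() { return (Integer) left.execute() + (Integer) right.execute(); }
  }

  // Specialized node: treats "+" as string concatenation.
  static final class StringConcat extends Node {
    private final Node left;
    private final Node right;
    StringConcat(Node left, Node right) { this.left = left; this.right = right; }
    @Override Object execute() { return String.valueOf(left.execute()) + right.execute(); }
  }

  // Generic "+" node that rewrites itself after observing its operand types.
  static final class Add extends Node {
    private final Node left;
    private final Node right;
    private Node specialized; // null until the first execution

    Add(Node left, Node right) { this.left = left; this.right = right; }

    @Override Object execute() {
      if (specialized != null) {
        return specialized.execute(); // fast path after the rewrite
      }
      // First execution: evaluate the operands once and pick a specialization.
      Object l = left.execute();
      Object r = right.execute();
      if (l instanceof Integer && r instanceof Integer) {
        specialized = new IntAdd(left, right);
        return (Integer) l + (Integer) r;
      }
      specialized = new StringConcat(left, right);
      return String.valueOf(l) + r;
    }

    String specializationName() {
      return specialized == null ? "uninitialized" : specialized.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    Add add = new Add(new Literal(2), new Literal(3));
    System.out.println(add.specializationName()); // prints "uninitialized"
    System.out.println(add.execute());            // prints "5"
    System.out.println(add.specializationName()); // prints "IntAdd"
  }
}
```

Once a node has settled on a specialization, a compiler such as Graal gets a much simpler tree to optimize than the fully generic one.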

In addition to the above, the most interesting feature is ahead-of-time compilation:

GraalVM allows you to compile your programs ahead-of-time into a native executable. The resulting program does not run on the Java HotSpot VM, but uses necessary components like memory management, thread scheduling from a different implementation of a virtual machine, called Substrate VM. Substrate VM is written in Java and compiled into the native executable. The resulting program has faster startup time and lower runtime memory overhead compared to a Java VM. - GraalVM about AOT

This isn’t like Launch4J or Oracle’s javapackager tool, which create a simple executable that points to your .jar and is bundled together with a JRE. GraalVM produces a real native executable that contains your AOT-compiled code together with a subset of the JVM (SubstrateVM).

Several projects have already adopted GraalVM. To name a few of the hottest ones: Quarkus, Micronaut, Helidon. For these frameworks, GraalVM native images significantly reduce runtime memory requirements compared to running on HotSpot, which makes them a good fit for developing cloud-native applications.

Frameworks memory footprint comparison by GraalVM

Frameworks startup time comparison by GraalVM

How to start with GraalVM?

If you want to try the code yourself, you need to install GraalVM. There are two editions, Community Edition (CE) and Enterprise Edition (EE), available for Linux, macOS and, recently, Windows. However, native image generation is not yet available for Windows, so I’d recommend using the official Docker images instead and building inside a container. As an alternative for Windows users:

You can install Windows Subsystem for Linux on Windows 10 and run the compilation in WSL as on Linux
Or, the more complicated way: do it directly on Windows

Native image generation

Once you have installed GraalVM, you need to install the add-on with the native image generation feature using the gu install native-image command.

The result of the image generation is a stand-alone executable file that contains a “partial” virtual machine called SubstrateVM, which holds a subset of JVM components such as thread scheduling and a simple garbage collector. However, many are left out, e.g., dynamic class loading, the security manager, and JMX, to name a few. That’s because they no longer make sense in a native image.

So, imagine the classic hello world class.

public class HelloWorld {
  public static void main(String[] args) {
    System.out.println("Hello, World!");
  }
}

Run the following commands to build a native image for your current platform (if you build using a Docker container, you need to run it in the container too).

$ javac HelloWorld.java
$ native-image HelloWorld

and then run it:

$ ./helloworld
Hello, World!

Mixing languages

Let’s take a quick example from the GraalVM documentation to show how languages can work together. The Java code does the following:

Java: reads standard input
JS: parses the string into a JSON object and returns it in a pretty format
Java: prints the result to standard output

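import java.io.*;
import java.util.stream.*;
import org.graalvm.polyglot.*;

public class PrettyPrintJSON {
  public static void main(String[] args) throws IOException {
    // Java: read all of standard input into a single string
    BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
    String input = reader.lines().collect(Collectors.joining(System.lineSeparator()));
    // Create a polyglot context that is allowed to run JavaScript
    try (Context context = Context.create("js")) {
      // JS: look up the built-in JSON functions as executable values
      Value parse = context.eval("js", "JSON.parse");
      Value stringify = context.eval("js", "JSON.stringify");
      // JS: parse the input, then pretty-print it with an indent of 2
      Value result = stringify.execute(parse.execute(input), null, 2);
      // Java: print the result to standard output
      System.out.println(result.asString());
    }
  }
}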

Compile and build a native image:

$ javac PrettyPrintJSON.java
$ native-image --language:js --initialize-at-build-time PrettyPrintJSON

Then run it:

$ ./prettyprintjson <<EOF
{"GraalVM":{"description":"Language Abstraction Platform","supports":["combining languages","embedding languages","creating native images"],"languages": ["Java","JavaScript","Node.js", "Python", "Ruby","R","LLVM"]}}
EOF

What else can I do?

I suggest a few more readings to explore GraalVM’s capabilities and supporting tools. My favorites are the ability to create your own language that runs on the VM and the use of Google Chrome developer tools to debug GraalVM languages.

Top 10 Things To Do With GraalVM - a great hands-on article about the top features & GraalVM tools
My GitHub repo, where I show how to build a simple servlet that returns a page with a graph rendered by R code called from Java. Everything is stitched together by the Quarkus framework, which I’m going to talk about in the next posts.

Summary

So, why is there hype around this? If you haven’t figured it out already, the answer is cloud and containers!

Why?

Run Java code faster (in JVM mode) thanks to the optimized JIT. Alternatively, run your Java code as a native image, which starts up much faster. That is important if you’d like to scale your application out very quickly or if you want to run your AWS Lambdas without the negative effect of cold starts.
If you dockerize your Java application, you usually end up with images of hundreds of MB, as they contain not only your app but also all its dependencies (JARs) and the JVM itself. With a native image, you can reduce container images to just a few megabytes.

What about limitations or drawbacks?

Building a CI/CD pipeline is a challenging task, as ahead-of-time compilation takes a lot of time; if your application is big enough, it might take tens of minutes.

The native image does not support all Java features - see the SubstrateVM limitations for details. This is the most challenging part, as those limitations make adopting it for your application pretty complicated.

The native image build process relies on static analysis to determine which code is reachable. However, if your code uses the Java Native Interface (JNI), Java reflection, dynamic proxy objects or class path resources, those parts cannot be predicted during the analysis. Those elements need to be provided to the native-image tool either via configuration files or as tool parameters. See the native-image tool options or the SubstrateVM sub-project markdown pages.
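To make the reflection case concrete, here is a small sketch. ReflectionDemo, Greeter and greetReflectively are hypothetical names of my own. The class to load can arrive as a command-line argument, so nothing in the code tells a static analysis which class will be needed. On HotSpot this runs as-is. For a native image, I would expect a class loaded this way to need registering through the reflection configuration mentioned above.

```java
import java.lang.reflect.Method;

// Hypothetical demo: the names are illustrative, not from the GraalVM docs.
class ReflectionDemo {

  // A class that is only ever reached through reflection.
  public static class Greeter {
    public Greeter() {}
    public String greet(String name) { return "Hello, " + name + "!"; }
  }

  // Loads a class by name at runtime and calls its greet(String) method.
  static String greetReflectively(String className, String who) {
    try {
      Class<?> type = Class.forName(className);
      Object instance = type.getDeclaredConstructor().newInstance();
      Method greet = type.getMethod("greet", String.class);
      return (String) greet.invoke(instance, who);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot greet via " + className, e);
    }
  }

  public static void main(String[] args) {
    // The class name may come from outside the program, invisible to static analysis.
    String className = args.length > 0 ? args[0] : Greeter.class.getName();
    System.out.println(greetReflectively(className, "GraalVM"));
  }
}
```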

Last but not least

I think it’s worth starting to explore what’s happening around GraalVM, as it opens up a whole spectrum of new capabilities. You might question the usability of native images because you’re losing all of Java’s dynamism. But is it really a problem? If you build cloud applications that mostly run in containers, which are immutable by nature, why would you need any form of dynamism? Your microservices are not going to have plugins deployed at runtime. Any code change to the logic usually results in a new image being generated anyway.

So, stay tuned. I’m going to explore it more in the next posts and show more real examples. I’m going to use Quarkus as my framework of choice; you will see how to create and run AWS Lambdas as native images, and more.
