6.12.2016

Cross-language benchmarking - Part IV

Plugins Plugins Plugins

In my recent blog posts (part 1, part 2 and part 3) I have described in detail how to do micro-benchmarking for Java and C/C++ with JMH and Hayai. I presented a common execution approach based on Gradle. Everything was brought together in a single Gradle project structure that keeps the benchmarking infrastructure completely away from your production code and allows you to build and execute both the Java and the C++ projects and benchmarks with one command. Git submodules might have caused some headaches, but this time we will clarify some more details, I promise!

Back to my list. What we did so far, and what's next:

  1. (Done in Part 1) Benchmark Java Code with JMH as part of a Gradle build
  2. (Done in Part 2) Benchmark C++ code with Hayai
  3. (Done in Part 2) Integrate C++ native binary compilation into the Gradle build
  4. (Done in Part 2) Integrate Hayai benchmark execution with Gradle
  5. (Done in Part 3) Bring Java and C++ projects together in one cross-language Gradle build
  6. Aggregate JMH and Hayai results in a third composite result
  7. (Done in Part 3) Split benchmarking code out of the projects into dedicated project and dedicated SCM
  8. Extract all benchmarking config from the build.gradle files into easy-to-use Gradle plugins, in order to offer them to all of you
  9. Automatically push aggregated benchmarking results to a well-structured git repository to keep track of benchmarking results across code changes.

Today we want to tackle 6. and 8.

Plugins, Plugins, Plugins...

I already promised it in part 3: after isolating all the relevant code in very specific Gradle build files, the whole project was ready to have all the enabling code moved to dedicated plugins. To describe the Gradle plugin approach in one sentence: "Take your tasks and move them to a new project, where, with some minimal code and infrastructure overhead, you get a distributable plugin which looks like all the others." Applied plugins are basically nothing more than tasks you would otherwise define on the fly in your build.gradle!

Ok, to keep it short, I would like to refer you to the comprehensive Gradle Docs for details on how to create plugins.

Here, I would like to showcase the transformation and discuss the side effects. If you have never seen a Gradle plugin from the inside, take a closer look! You will quickly spot similarities and patterns!

This is the build.gradle of the choma-java-benchmarks project as we left it in part 3:

github:49f367554114ae4f786f1b1a98b57fba

It looks like a standard benchmarking project that makes use of the jmh-gradle-plugin. But you don't want to add all of those things to your own project.
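
For illustration, such a conventional JMH setup usually boils down to something like the following sketch; the versions and the jmh configuration body are assumptions on my part, not the exact content of the gist above.

```groovy
// Rough sketch of a conventional JMH benchmark build script (not the exact gist content).
// Versions are examples only; use the ones matching your Gradle setup.
buildscript {
    repositories { jcenter() }
    dependencies {
        classpath 'com.github.jengelman.gradle.plugins:shadow:1.2.3'
        classpath 'me.champeau.gradle:jmh-gradle-plugin:0.3.1'
    }
}

apply plugin: 'java'
apply plugin: 'com.github.johnrengelman.shadow'
apply plugin: 'me.champeau.gradle.jmh'

jmh {
    // iterations, batch size, output format, ... configured here
}
```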

What YOU want to do is this:

github:a321bb5a9f90af9b567a59aa76fa7dd5

I assume the difference is clear: we replaced the originally applied plugins with our own plugin, which adds our cherry on top (more on that later).
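
Conceptually, the simplified build script boils down to the following sketch; the plugin id and the classpath coordinates are placeholders of mine, not the actually published names.

```groovy
// Hypothetical sketch of the simplified benchmark build script.
// Plugin id and buildscript coordinates are placeholders; check the CroLaBeFra
// repositories for the real ones.
buildscript {
    repositories { jcenter() }
    dependencies {
        classpath 'org.example:crolabefra-java-plugin:0.1.0' // placeholder coordinates
    }
}

apply plugin: 'crolabefra-java' // placeholder plugin id
```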

The Gradle Docs quickly lead the way to an interface and some additional details which come in handy when building your own plugins. This is what the CroLaBeFraJavaPlugin class looks like:

github:779746c16ec75c06e81d3347ac69690f

Eye-catching details:

  • It is Groovy :), sure.
  • We can apply and configure other plugins inside our own plugin (JavaPlugin, ShadowPlugin, JMHPlugin); see the sketch below!
  • It is not immediately clear how everything fits together to allow the one-liner shown above. Sorry, but it doesn't have to be, because you are here for benchmarking, not for creating Gradle plugins ;) However, if you want to take a closer look at what is necessary, please check out the git repositories of my several Gradle plugins.
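
If you are curious anyway, a minimal Groovy plugin that applies and configures other plugins follows roughly this pattern. This is a simplified sketch of mine, not the actual CroLaBeFraJavaPlugin source, and the class and task names are placeholders.

```groovy
// Simplified sketch of a Gradle plugin written in Groovy (not the actual CroLaBeFra source).
import org.gradle.api.Plugin
import org.gradle.api.Project
import org.gradle.api.plugins.JavaPlugin

class MyBenchmarkPlugin implements Plugin<Project> {
    @Override
    void apply(Project project) {
        // Apply other plugins from inside our own plugin
        project.pluginManager.apply(JavaPlugin)
        // Third-party plugins (shadow, jmh, ...) can be applied the same way,
        // as long as they are on the classpath of this plugin project.

        // Register an additional task that becomes available wherever the plugin is applied
        project.tasks.create('runJavaBenchmarks') { // placeholder task name
            group = 'benchmarking'
            description = 'Runs the Java benchmarks and collects the results'
            doLast {
                println 'Collecting benchmark results ...'
            }
        }
    }
}
```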

The CroLaBeFra Java part is really straightforward, because JMH and its already existing Gradle plugin are in place. Nevertheless, it encapsulates the micro-benchmarking framework and its configuration, so you could write your own custom strategy and create an additional plugin next to it…

… for example one for C++ code

I wrote the very same thing for Hayai benchmarks. Hayai works slightly differently compared to JMH, but the CroLaBeFra wrapper plugin behaves the same way as the Java one does.

Differences:

  • Your C++ application needs to be compiled as a library first.
  • You might need to download Hayai if you want a cool one-shot command that works right out of the clone :)
  • You need to compile your benchmarks and link them against your application library.

The CroLaBeFra C++ Plugin does all those things for you, even if you are not using the cross-language part of it. So it is also a standalone integration of Hayai benchmarks into your tool chain with just one additional 'apply plugin':

github:5194d54e77215750a58071c34702e5a1

Due to the looser coupling and the naming flexibility of the native components, you have to define some properties according to your build output (a configuration sketch follows the list):

  • projectToBenchmark: necessary to set up the Gradle project dependency on your library project, so that it is built beforehand. The dependency definition is handled by the plugin, because it has to link Hayai against it.
  • outputLibraryName: the configured name of the output library within projectToBenchmark to depend on.
  • benchmarksPath: the path to your Hayai benchmarks within the current benchmarks project.
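
Here is a hedged configuration sketch of mine: the plugin id and the extension block name are assumptions, only the three property names are taken from above, so please check the plugin sources for the real syntax.

```groovy
// Hypothetical configuration sketch for the CroLaBeFra C++ plugin.
// Plugin id and extension block name are placeholders; only the three property
// names (projectToBenchmark, outputLibraryName, benchmarksPath) come from the post.
apply plugin: 'crolabefra-cpp' // placeholder id

crolabefra {
    projectToBenchmark = ':choma-cpp'     // Gradle path of the library project to benchmark
    outputLibraryName  = 'choma'          // name of the library that project builds
    benchmarksPath     = 'src/benchmarks' // location of the Hayai benchmarks in this project
}
```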

In order to get a deeper understanding of how those things fit together, please have a look at the C++ plugin sources.

One more plugin …

You have come this far, so let's get to the end. Both plugins produce their very own results, with specific settings and different ways of execution. That makes comparison hard, but not impossible.

  • Both frameworks somehow give their benchmark results a 'name'
      • Free to choose in Hayai (benchmark group and name)
      • Predefined by class (name) and package (group) in JMH
  • Both frameworks somehow support execution time measurements in ms
      • JMH offers certain annotations to set the execution count (batch size and iterations), the measurement method and the time unit
      • Hayai measures in ns and also supports setting the execution count (batch size and iterations)
  • Is there a common output format? JSON, for instance?
      • In JMH, yes
      • In Hayai, no (not yet?). But I have created a fork and added my own Outputter to the project, so it writes a format I have under control so far. Here is the link to my fork. This fork is also downloaded by default by the CroLaBeFra C++ plugin.

A JMH specialty which is worth a discussion at this stage: a well-known characteristic of the JVM is its cold-start behavior. Have you ever seen it? Do you know its impact? It can be made visible with JMH, because there is a special mode called SingleShotTime. Normally, JMH spends quite a lot of effort to circumvent it. But my goal is to show a "worst", "average" and "best" value in my report. Why am I doing this? Because the cold-start phase exists! It is there and that's fine! No one should bash the JVM for that, because in fact it does a terrific job when optimizing the code, and that should be made visible! Consequently, the "worst" value for Java benchmarks will be quite high, but still, I think it is interesting to see within the big picture. In contrast, the compile-time optimization of common C/C++ compilers will lead to numbers which are close together, but that's fine as well and is typical for statically compiled code.
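For reference, the jmh-gradle-plugin lets you express this kind of setup in the build script roughly as follows; the property names are as I recall them from the plugin's documentation, so verify them against the plugin version you actually use.

```groovy
// Rough sketch of a JMH configuration for single-shot measurements via the jmh-gradle-plugin.
// Property names should be double-checked against the plugin version in use.
jmh {
    benchmarkMode    = ['ss']   // SingleShotTime: measure cold runs instead of steady state
    timeUnit         = 'ms'     // report results in milliseconds
    iterations       = 10       // number of measured iterations
    batchSize        = 1000000  // calls of the benchmark method per iteration
    warmupIterations = 0        // no warmup, so the cold-start behaviour stays visible
    resultFormat     = 'JSON'   // machine-readable output for later aggregation
}
```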

… to rule them all

I told you that both JMH and Hayai are able to run their benchmarks the same way, with the same identifiers, and the two previously presented plugins allow you to create those results in a single build chain. This ultimately leads to the cherry on top, the funky mothership of plugins, the CroLaBeFra mothership.

Funky Mothership - Image by Fuzheado [CC BY-SA 4.0], via Wikimedia Commons 

What it does:

  • It forms a bracket around all the language-specific Gradle projects and joins them in a parent project, to which the plugin is applied
  • It detects and runs all language-specific CroLaBeFra benchmarking tasks
  • It collects all language-specific results and brings them together
  • The results are merged together into an HTML report based on the benchmark name. Currently, the benchmark writer has to take care that the benchmark names and execution parameters match.

Et voilà:

Output example of the CroLaBeFra Mothership HTML report

The given numbers read as follows: fastest, average and worst time spent per iteration. An iteration had a batch size of 1000000 calls of the corresponding benchmark method. More is fine as well, but increasing the number of iterations instead probably makes more sense; fewer than 1000000 calls might not run for a reasonable amount of time. In general, the more iterations you can afford from a time-consumption perspective, the more precise your results will be.

This is a set of results that I grabbed on my six-year-old MacBook Pro under conditions which would make every performance expert cry - not due to their excellence, of course. However, today we are not talking about the execution environment, but about the possibility of having a one-shot command which produces a combined report just like that one.

The execution parameters:

  • Batch size and number of iterations are of course equal in both worlds
  • JMH runs in SingleShotTime mode
  • The C++ code is built with clang-700.1.81 and the flags "-O3" and "-march=native" (see the sketch below)
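
For completeness, with Gradle's native software model such compiler flags can be set roughly like this; the component name is a placeholder and the exact model syntax depends on the Gradle version you use.

```groovy
// Sketch of setting C++ compiler flags with Gradle's native software model.
// 'choma' is a placeholder component name; adapt it to your own build.
apply plugin: 'cpp'

model {
    components {
        choma(NativeLibrarySpec) {
            binaries.all {
                cppCompiler.args '-O3', '-march=native'
            }
        }
    }
}
```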

Several eye-catching things:

  • Java cold-start performance is clearly visible and it is not as awful as generally thought and taught.
  • C++ execution timings for worst/avg./best are close together as anticipated for compile-time optimisation
  • Java is sometimes faster :) This match ends with a 4:2 for C++ but ...
  • … mathSqrt should be identical in both worlds, shouldn't it?

It all depends. The mathSqrt benchmark seems to be a good showcase to visualize the execution and measurement overhead of the two different micro-benchmarking frameworks. Additionally, it is not yet proven that the processor instructions are exactly the same in both worlds (there should be the same sqrt intrinsic in the assembly, but who knows whether there is additional 'noise'?). These questions justify a new blog post, though. For today, I would like to invite you to try out my showcase project yourself.

The CroLaBeFra POC project in a nutshell

What you need:

  • GCC or Clang compiler installed
  • libSDL (e.g. via 'brew install sdl')
  • Oracle JDK 1.8 + JavaFX (or OpenJDK + OpenJFX, not tested though)
  • macOS or Linux (work in progress)

Steps to follow:

  1. Clone the POC: git clone https://github.com/bensteinert/crolabefra-setup-poc
  2. Within the project folder init all submodules: git submodule update --init --recursive
  3. Run CroLaBeFra: ./gradlew crolabefra
  4. Open the report in the mothership project: open ./build/reports/mothership/index.html

CroLaBeFra Caveats today:

  • The benchmark modules need special care so that the results from the two worlds stay roughly comparable - same batch size and iterations, proper mode selection in JMH and so on.
  • The method names need to match exactly; the groups are not used yet
  • The report does not store environment details like CPU info, used compilers or compiler settings
  • An internet connection is required, in order to fetch Hayai from github
  • The name. Well it is a working title….
  • New languages to come :-)

Cheers

Ben

