Measuring performance of Java UUID.fromString() (or lack thereof)
aka “Why is UUID.fromString() so slow?”
Similar to my recent adventure into evaluating performance of String.format() for simple String concatenation, I found another potentially meaningful case of widely used (and useful) JDK-provided functionality that seems to significantly under-perform: decoding of java.util.UUID from a String:
UUID uuid = UUID.fromString(request.getParameter("uuid"));
This showed up as a hot spot when profiling a web service. Given that decoding should not really be that complicated compared to, say, generating UUIDs, this seemed odd.
(note: I am not the first developer to have noticed this; see, for example, this blog post from 2015)
Given that I once wrote (and still maintain) the popular java-uuid-generator library, which actually predates the addition of java.util.UUID in JDK 1.5 by a year or two :), I thought I could see what is happening and whether I could find and evaluate more efficient alternatives.
And as usual, I’ll add reproducible test(s) to the https://github.com/cowtowncoder/misc-micro-benchmarks microbenchmark suite.
Before proceeding further I need to add the usual disclaimers about microbenchmarking, results and interpretations: microbenchmarking itself is hard to get right, and the results are never absolute truths. Even relative differences are highly contextual: something being 10x or 100x or 1000x slower may or may not be a problem in a given situation.
So I’ll get and show some numbers, talk about them; you may consider using that information if it makes sense for your particular usage.
Does JUG have it? How about Jackson?
Measuring the actual performance of UUID.fromString() is useful for judging how relevant the overhead is, but the other important part is to establish a potential upper bound: how fast could an alternative implementation be?
At first I thought that java-uuid-generator probably has equivalent code I could use. With a bit of digging I found it indeed has this method:
UUID uid = UUIDUtil.uuid(uuidString);
and we can see how its performance ranks compared to JDK implementation.
In addition, I know that Jackson (jackson-databind) also has a rather heavily optimized UUIDDeserializer which could be useful: not exactly as-is, but we can easily copy-paste the logic. So let’s check that out too: the Jackson codebase tends to be heavily optimized in many places and I recall having spent some time with UUID handling in particular.
I suspect there are many other implementations out there too that I could have benchmarked (there are 2 other widely used Java UUID generator libraries, both of which may well have a method for decoding) — if anyone wants to contribute an addition, feel free to submit a PR against https://github.com/cowtowncoder/misc-micro-benchmarks and I can update the results!
But for now let’s see how these initial 3 choices perform.
Test Case & Results, initial
In this case our test setup is quite simple: we pre-construct a set of 32 java.util.UUIDs using the random-number based generation method, then convert them to Strings. The test consists of rounds of constructing UUIDs back from the 32-element String array.
The test class is ValidUUIDFromString (in package com.cowtowncoder.microb.uuid) if you want to have a peek.
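To make the setup concrete, here is a minimal sketch of the data preparation and of one decoding round. It is not the actual benchmark class: the real test runs under JMH, and the class and method names below (UuidRoundTripSketch, generate, toStrings, decodeAll) are made up for illustration.

```java
import java.util.Arrays;
import java.util.UUID;

// Hypothetical sketch of the test data setup; not the real ValidUUIDFromString class
final class UuidRoundTripSketch {
    static final int COUNT = 32;

    // Pre-construct random-number based (version 4) UUIDs
    static UUID[] generate(int count) {
        UUID[] uuids = new UUID[count];
        for (int i = 0; i < count; i++) {
            uuids[i] = UUID.randomUUID();
        }
        return uuids;
    }

    // Convert the UUIDs to their canonical String form
    static String[] toStrings(UUID[] uuids) {
        String[] strings = new String[uuids.length];
        for (int i = 0; i < uuids.length; i++) {
            strings[i] = uuids[i].toString();
        }
        return strings;
    }

    // One round: decode every String in the array back into a UUID
    static UUID[] decodeAll(String[] inputs) {
        UUID[] result = new UUID[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            result[i] = UUID.fromString(inputs[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        UUID[] original = generate(COUNT);
        UUID[] decoded = decodeAll(toStrings(original));
        System.out.println("Round trip matches: " + Arrays.equals(original, decoded));
    }
}
```

In a benchmark, only the decodeAll() step would be measured; generation and String conversion happen once up front.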
Running this test on my setup (Mac Mini (2018), 3.2 GHz 6-core Intel Core i7, JDK 1.8.0_272) gives results like this (note: I trimmed names of test cases to fit results better):
Benchmark                  Mode  Cnt        Score       Error  Units
UUIDFromString.m1_JDK     thrpt   15    89175.229 ±  1899.754  ops/s
UUIDFromString.m2_JUG     thrpt   15   490052.366 ± 35351.600  ops/s
UUIDFromString.m3_manual  thrpt   15  1129569.239 ±  6054.405  ops/s
This gives us rough ratios of performance: using JDK implementation as the baseline, we see that:
- JDK approach can decode almost 3 million UUIDs per second (per core); each benchmark operation decodes all 32 Strings, so 89,175 ops/s works out to about 2.9 million UUIDs
- java-uuid-generator can decode UUIDs from Strings about 5.5 times as fast (over 15 million / sec / core)
- copy-pasted code from Jackson can decode UUIDs from Strings over 12 times as fast (about 36 million / sec / core)
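The per-second figures come from scaling the raw scores. Here is a small sketch of that arithmetic, assuming (as the test description suggests) that one benchmark operation decodes all 32 Strings; the class and method names are made up for illustration.

```java
// Hypothetical helper for converting benchmark scores into UUIDs per second
final class ThroughputMath {
    // Assumption: each benchmark operation decodes the whole 32-element array
    static final int UUIDS_PER_OP = 32;

    static double uuidsPerSecond(double opsPerSecond) {
        return opsPerSecond * UUIDS_PER_OP;
    }

    static double speedup(double score, double baseline) {
        return score / baseline;
    }

    public static void main(String[] args) {
        // Scores (ops/s) from the JDK 8 results table
        double jdk = 89175.229;
        double jug = 490052.366;
        double manual = 1129569.239;

        System.out.printf("JDK:    %.1f million UUIDs/sec%n", uuidsPerSecond(jdk) / 1e6);
        System.out.printf("JUG:    %.1f million UUIDs/sec, %.1fx baseline%n",
                uuidsPerSecond(jug) / 1e6, speedup(jug, jdk));
        System.out.printf("manual: %.1f million UUIDs/sec, %.1fx baseline%n",
                uuidsPerSecond(manual) / 1e6, speedup(manual, jdk));
    }
}
```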
So: there is indeed a significant performance overhead with the JDK (8) way of doing things. Yet even with it you need to call this method an awful lot for it to necessarily matter.
Results with JDK 14
Given that JDK 8 is getting old by now, I decided to try the current stable JDK release, JDK 14. To my surprise there was quite a big positive difference: the JDK 14 implementation is about 4x as fast as in Java 8! While not quite as fast as the alternatives, its performance is rather close to java-uuid-generator. Jackson’s optimized version is still faster but it’s only by a factor of 3 now.
Benchmark                  Mode  Cnt        Score      Error  Units
UUIDFromString.m1_JDK     thrpt   15   358738.222 ± 7188.453  ops/s
UUIDFromString.m2_JUG     thrpt   15   529761.948 ± 3866.240  ops/s
UUIDFromString.m3_manual  thrpt   15  1181016.727 ± 4541.903  ops/s
Reasons for slowness of Java 8 / UUID.fromString()
One thing that interests me, as usual, is the reason for the slowness of the Java 8 implementation. I was thinking of profiling it a bit until realizing that someone else had already looked into it: “Micro-optimization for UUID.fromString()” has a good overview of where time is spent, so I won’t repeat it here.
And as to JDK 14, I assume it uses an approach similar to what java-uuid-generator does. That approach is probably a bit less code than Jackson’s, using simple looping: Jackson, on the other hand, gains more speed by removing looping after first sanity-checking validity of the input format.
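To illustrate the "validate the shape first, then decode without looping" idea, here is a minimal sketch. It is my own simplified illustration, not the actual Jackson UUIDDeserializer code: the class name ManualUuidDecoder and its helpers are hypothetical, and unlike UUID.fromString() it only accepts the canonical 36-character form.

```java
import java.util.Arrays;
import java.util.UUID;

// Hypothetical sketch of loop-free decoding; not Jackson's actual implementation
final class ManualUuidDecoder {
    // Lookup table: value of each ASCII hex digit, -1 for anything else
    private static final int[] HEX = new int[128];
    static {
        Arrays.fill(HEX, -1);
        for (int i = 0; i < 10; i++) {
            HEX['0' + i] = i;
        }
        for (int i = 0; i < 6; i++) {
            HEX['a' + i] = 10 + i;
            HEX['A' + i] = 10 + i;
        }
    }

    static UUID decode(String s) {
        // Sanity-check the overall shape first: 36 chars, dashes at fixed offsets
        if (s.length() != 36
                || s.charAt(8) != '-' || s.charAt(13) != '-'
                || s.charAt(18) != '-' || s.charAt(23) != '-') {
            throw new IllegalArgumentException("Not a canonical UUID String: " + s);
        }
        // With the shape known, every group sits at a fixed offset: no loop needed
        long hi = ((long) int32(s, 0) << 32)
                | ((long) int16(s, 9) << 16)
                | int16(s, 14);
        long lo = ((long) int16(s, 19) << 48)
                | ((long) int16(s, 24) << 32)
                | (int32(s, 28) & 0xFFFFFFFFL);
        return new UUID(hi, lo);
    }

    private static int int32(String s, int i) {
        return (int8(s, i) << 24) | (int8(s, i + 2) << 16)
                | (int8(s, i + 4) << 8) | int8(s, i + 6);
    }

    private static int int16(String s, int i) {
        return (int8(s, i) << 8) | int8(s, i + 2);
    }

    // Decodes two hex digits; the result is negative if either digit is invalid
    private static int int8(String s, int i) {
        char c1 = s.charAt(i);
        char c2 = s.charAt(i + 1);
        int v = (c1 < 128 && c2 < 128) ? ((HEX[c1] << 4) | HEX[c2]) : -1;
        if (v < 0) {
            throw new IllegalArgumentException("Non-hex character at offset " + i + ": " + s);
        }
        return v;
    }
}
```

Usage mirrors the JDK call: UUID uuid = ManualUuidDecoder.decode(uuidString). The trade-off is more code and stricter input handling in exchange for fixed offsets and table lookups.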
Given the difference between java-uuid-generator and Jackson’s decoder, I am tempted to use code from the latter to speed up the UUIDUtil.uuid() implementation in the former, leading to the fascinating idea of releasing a new version of an almost 20-year-old library (although the oldest version at Maven Central is marked as being from 2005, the first version was actually made available in 2003 or so).
We’ll see: it seems like a nice idea if I find time to do that.