Jackson 2.18.0-rc1 overview
The first release candidate for Jackson 2.18 (2.18.0-rc1) was released a bit over a week ago. While waiting for the official 2.18.0 release (due in a week or two if all goes well), let’s see what is and will be included.
(As usual, the full list of contents can be found on the Jackson 2.18 release wiki page.)
2.18 Stats
Development took about 5 months since the 2.17.0 release, and there are about 60 changes (new features, fixes) across all official Jackson components.
This can once again be classified as a “minor minor release”, although one that is an important milestone in one specific way: it completed the last “must have” feature I considered a blocker for getting 3.0.0 ready for the Release Candidate phase of development.
pre-3.0 Must-Have: Rewrite Property Introspection
The one specific thing to focus on in 2.18 was implementing databind#4515: “Rewrite Bean Property Introspection logic in Jackson 2.x”. This has been planned for the past 8 years (Jackson 2.8 release notes from 2016 refer to “Fix Creator Introspection … deferred… attempts will be made to include in 2.9”).
And now I finally forced myself to work on absolutely no other feature before getting the rewrite done. So in May 2024 I spent ~2 weeks figuring out how to do it… and then… did it. Phew!
In hindsight I should have just done this earlier: while it was probably the most challenging thing I have done in years, it was no easier to do now than, say, 5 years earlier. Still: better late than never.
So why was this an important thing to do? The motivation for this rewrite was that there were about half a dozen bugs that were difficult if not impossible to actually fix without major changes to Property Introspection logic: specifically, unifying handling of “regular” properties (ones accessed via getter/setter/field), and that of “creator” properties (ones passed through constructor/factory-method).
Without this unification some annotations present in one of the accessors (getter/setter/field/constructor/factory-method) would not be used for other accessors. This, in turn, led to hard-to-solve problems with Java Records, as well as with Kotlin (and Lombok) data classes.
Lack of unification also sometimes prevented full functioning of “annotation-less” creators for POJOs, which should now finally work well: if you have the “parameter names” module registered and a POJO has only a single public constructor, no annotations should be needed, so the following will now work reliably:
public class MyType {
    private int a, b;

    public MyType(int a, int b) {
        this.a = a;
        this.b = b;
    }

    public int getA() { return a; }
    public int getB() { return b; }
}
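Some background on why the “parameter names” module matters here: it relies on constructor parameter names being available through reflection, which in turn requires compiling with the javac -parameters flag. The following stdlib-only sketch checks the two preconditions mentioned above: exactly one public constructor, and real parameter names present in the class file. The class and method names are hypothetical, and this is not Jackson's actual introspection code.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;

// Hypothetical demo class (not part of Jackson): inspects the information
// an "implicit creator" lookup would need from a POJO.
final class ImplicitCreatorCheck {

    // Same shape as MyType: one public constructor, no annotations
    static class Point {
        private final int a, b;

        public Point(int a, int b) {
            this.a = a;
            this.b = b;
        }

        public int getA() { return a; }
        public int getB() { return b; }
    }

    // Returns the single public constructor, or null if there is not exactly one
    static Constructor<?> singlePublicConstructor(Class<?> type) {
        Constructor<?>[] ctors = type.getConstructors(); // public constructors only
        return (ctors.length == 1) ? ctors[0] : null;
    }

    // True only if every parameter has a real name stored in the class file
    // (that is, the class was compiled with -parameters)
    static boolean hasRealParameterNames(Constructor<?> ctor) {
        for (Parameter p : ctor.getParameters()) {
            if (!p.isNamePresent()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Constructor<?> ctor = singlePublicConstructor(Point.class);
        System.out.println("parameter count: " + ctor.getParameterCount());
        System.out.println("real names present: " + hasRealParameterNames(ctor));
    }
}
```

If the second check reports false, the class was compiled without -parameters, so an annotation-less constructor would have no property names to bind to.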
This rewrite was that one specific thing on my mental list of things that MUST be done before focusing on making Jackson 3.0 pre-releases possible, during the rest of 2024 — I wanted both 2.x and 3.x codebases to have the “New Introspection” implementation.
So this was a big deal for me; I hope it will also significantly improve user experience for Creators (both annotated and implicit), by removing a category of “why doesn’t that annotation work here?!?!” bugs that were formerly unfixable.
Most-Wanted feature: Allow @JsonAnySetter via Creators
One of the remaining 9 “Most-Wanted” features of jackson-databind, databind#562, was implemented for 2.18.
It’s pretty much what you’d expect: this now works:
public class MyAnyType {
    private final int id;
    private final String name;

    @JsonCreator
    public MyAnyType(@JsonProperty("id") int id,
            // Can use either "Map" or "JsonNode"
            @JsonAnySetter JsonNode extra) {
        this.id = id;
        name = extra.path("name").asText();
        // and so on
    }
}
and aside from being able to annotate both Map<String,ValueType> and JsonNode values, it also works (like it should) with Java Record types too (and Kotlin data classes). More on this a bit later on.
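To make the binding rule concrete, here is a hand-written, stdlib-only sketch of what such a creator ends up receiving: the named property goes to its own parameter and every other entry lands in the “any” map. The class and the fromProperties helper are hypothetical and no Jackson code is involved; it only mirrors the behavior described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: a manual equivalent of a creator that takes
// one named property plus a Map of "everything else".
final class AnySetterCreatorSketch {
    final int id;
    final Map<String, Object> extra;

    AnySetterCreatorSketch(int id, Map<String, Object> extra) {
        this.id = id;
        this.extra = extra;
    }

    // "id" is bound by name; all remaining entries are passed as the extra map
    static AnySetterCreatorSketch fromProperties(Map<String, Object> props) {
        Map<String, Object> extra = new LinkedHashMap<>(props);
        Object rawId = extra.remove("id");
        // Assumption for this sketch: a missing "id" falls back to 0
        int id = (rawId == null) ? 0 : ((Number) rawId).intValue();
        return new AnySetterCreatorSketch(id, extra);
    }

    public static void main(String[] args) {
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("id", 7);
        input.put("name", "Bob");
        input.put("age", 30);
        AnySetterCreatorSketch value = fromProperties(input);
        System.out.println("id=" + value.id + ", extra=" + value.extra);
    }
}
```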
Performance: Yet More Faster Floating-Point Reads
I already wrote about another important improvement in “Jackson 2.18 even faster floating-point reads!”, but the TL;DR is that there is an additional 10–20% speedup for reading double and BigDecimal values when upgrading from 2.17 to 2.18. The speedup is due to removed String allocations on common decoding paths. No code changes are needed, so it is a free boost, although it’s generally worth enabling “Fast FP reads” as shown, for example, in “Jackson 2.16 faster BigDecimal reads”:
JsonFactory f = JsonFactory.builder()
    .enable(StreamReadFeature.USE_FAST_DOUBLE_PARSER)
    .enable(StreamReadFeature.USE_FAST_BIG_NUMBER_PARSER)
    .build();
ObjectMapper mapper = new JsonMapper(f);
CSV: support “Decorations” for things like Vector values
This is actually one of the surprisingly rare features that I personally wanted for my own use. I had been looking for datasets with sizable vectors for performance testing FP values in JSON (see above), and found something (which it seems I did NOT bookmark?!? Need to go find it again) where data was in CSV, but instead of something like:
id,vector
123,0.5,0.25,-0.1
which Jackson CSV module can handle, array values are enclosed in brackets like so:
id,vector
A123,[0.5,0.25,-0.1]
Thanks to the implementation of dataformats-text#442 for reading and dataformats-text#495 for writing such “decorated values”, these (and many similar values) can be handled. Reading could happen like so:
@JsonPropertyOrder({ "id", "vector" })
static class Embedding {
    public String id;
    public double[] vector;
}

// Mapper instance used below: schemaFor() is an instance method of CsvMapper
CsvMapper MAPPER = new CsvMapper();

// Can either construct Schema manually, or auto-detect from class like here:
CsvSchema schema = MAPPER.schemaFor(Embedding.class)
    .withHeader()
    .withArrayElementSeparator(",")
    // We can use one default decorator (there is also simple "prefix/suffix"
    // one for different start/end markers; strict/optional variants)
    .withColumn("vector",
        col -> col.withValueDecorator(CsvValueDecorators.STRICT_BRACKETS_DECORATOR));

MappingIterator<Embedding> it = MAPPER.readerFor(Embedding.class)
    .with(schema)
    .readValues(new File("data.csv"));
// Could read them all, or iterate one by one like here
while (it.hasNextValue()) {
    doSomething(it.nextValue());
}
// ... and we are done!
and writing would use a similar CsvSchema to add decorations.
Note, too, that this feature is not limited to array values: it may be used to add/remove (write/read) decorations for values of any type. So it may come in handy for handling all kinds of exotic values in CSV content.
Finally… I need to go back to testing end-to-end performance improvement with such “dense” vector data. Hope to blog about this in the near future.
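Conceptually, a strict bracket decoration is just a prefix/suffix that must be removed when reading and added back when writing. The stdlib-only sketch below shows that symmetry for a single cell value; the class and method names are hypothetical and this is not how the Jackson CSV module implements it internally.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, NOT the Jackson CSV implementation: shows the
// read/write symmetry of a strict "[...]" decoration on one cell value.
final class BracketDecorationSketch {

    // Read side: require the brackets, strip them, then split into elements
    static double[] undecorate(String cell, String elementSeparator) {
        if (cell.length() < 2 || !cell.startsWith("[") || !cell.endsWith("]")) {
            throw new IllegalArgumentException("Missing [ ] decoration: " + cell);
        }
        String inner = cell.substring(1, cell.length() - 1).trim();
        if (inner.isEmpty()) {
            return new double[0];
        }
        // Quote the separator so it is matched literally, not as a regex
        return Arrays.stream(inner.split(Pattern.quote(elementSeparator)))
                .mapToDouble(s -> Double.parseDouble(s.trim()))
                .toArray();
    }

    // Write side: join the elements and add the brackets back
    static String decorate(double[] values, String elementSeparator) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(elementSeparator);
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        double[] vector = undecorate("[0.5,0.25,-0.1]", ",");
        System.out.println(Arrays.toString(vector));
        System.out.println(decorate(vector, ","));
    }
}
```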
Overall Support for Java Record types (and similar), jackson-jr
Of the features mentioned above, 2 improved handling of Java Records (and similar types: Kotlin data classes, Lombok-generated classes): the Property Introspection rewrite fixed issues with annotation use, and the ability to use @JsonAnySetter via Creators (Records being immutable, no fields or setters can be used) allows more flexible construction.
But in addition to jackson-databind improvements, Jackson Jr also added basic support for Java Records via jackson-jr#162, so Record values can be read and written just like regular POJOs.
Avro CVEs: Apache-avro 1.11
A long-standing version problem with the Jackson Avro module was the inability to use a recent version of the Apache Avro library as a dependency (the module can decode Avro content with or without this library, but uses it for encoding): the dependency was limited to 1.8.x.
This was mostly problematic due to CVEs filed against older versions of Apache Avro. With dataformats-binary#167 the issues were fixed and now versions up to 1.11.x can be used without problems.
One more `StreamReadConstraints` option (maxTokenCount)
We have added a few “processing limit” settings to jackson-core (that is, for JsonParser and JsonGenerator) in the last couple of releases: these configurable limits offer ways to protect against potential DoS attacks.
Jackson 2.18 adds one more for the reading side: maxTokenCount.
It was added via jackson-core#1310 (see details there).
Setting this limit offers a way to limit maximum document read size by number of JSON tokens (as opposed to raw byte length, for example): it can be used to limit attempts to pass a megabyte of bogus JSON like:
[[[[[[[[[],[],[],[]]]]]]]]
where a small document can produce tons of tokens and (if using databind) use much more memory than the input document itself. By default the token count is unlimited; you can set it to, say, 10_000 tokens with:
final JsonFactory JSON_FACTORY = JsonFactory.builder()
    .streamReadConstraints(StreamReadConstraints.builder().maxTokenCount(10_000).build())
    .build();
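To see why counting tokens is a different measure from counting bytes, consider input made only of nested arrays: every bracket is its own token (start-array or end-array), so the token count is as large as the byte count. The stdlib-only sketch below mimics a token budget for that restricted kind of input; the class and method names are hypothetical and it is not how jackson-core enforces the limit.

```java
// Hypothetical stand-in for a token-limited reader. It only understands JSON
// consisting of (possibly nested) empty arrays, which is enough to show the idea.
final class TokenBudgetSketch {

    // Counts array start/end tokens ('[' and ']'); fails once the budget is exceeded
    static long countArrayTokens(String json, long maxTokens) {
        long tokens = 0;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c == '[' || c == ']') {
                if (++tokens > maxTokens) {
                    throw new IllegalStateException(
                            "Token count exceeds limit of " + maxTokens);
                }
            }
            // Commas and whitespace are separators, not tokens
        }
        return tokens;
    }

    // Builds "[[[...]]]" with the given nesting depth: 2 * depth characters
    static String nestedArrays(int depth) {
        StringBuilder sb = new StringBuilder(2 * depth);
        for (int i = 0; i < depth; i++) {
            sb.append('[');
        }
        for (int i = 0; i < depth; i++) {
            sb.append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = nestedArrays(500);
        System.out.println("chars: " + doc.length()
                + ", tokens: " + countArrayTokens(doc, 10_000));
    }
}
```

With a budget of 10_000, a document nested 6000 levels deep (12,000 characters) is rejected by this sketch even though it is tiny in terms of raw size.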
What Next?
Ok so the very next step is obviously getting 2.18.0 released: so far there has been one regression reported in 2.18.0-rc1 and that was just fixed, so there is nothing blocking the release.
But I think I will wait for a week or so to give the Jackson user/dev community a chance to do some more testing.
But beyond that, it is now finally possible to focus more on getting Jackson 3.0.0 ready for the Release Candidate phase. The branch itself has been ready for years now, passing the test suite and so on. But there are things to decide, both regarding project/repository structure (should we use separate 2.x and 3.x branches in existing repos, or create new ones?) and regarding what feature set we want. For example, are there big refactorings/renamings yet to be done? Right now all kinds of changes are possible, but once 3.0.0 is released, the API is somewhat frozen again (for the 3.x series). So breaking changes should be done before that.