Jackson 3.0 vision (Jan 2021)
The Jackson project is by now almost 14 years old: the first public version was released in the summer of 2007. A lot has happened since: two dozen “minor” releases (ones with new features), from 1.0 through to 2.12, and more than 100 patch releases (ones with only bug fixes).
But in addition there has been one transition known as the “major version upgrade”, in which the new version is not backwards-compatible with the older version. This occurred with the release of Jackson 2.0.0 in February of 2012.
Jackson Major versions: 1.x and 2.x
There have been two major Jackson versions: Jackson 1.x and 2.x.
Jackson 1.0.0 was released in May 2009 and the last version (1.9.13) was released in July 2013, so it was developed for about 4 years. Jackson 2.0.0 was released in February 2012 and the latest version (2.12.1) was just released earlier this month (January 2021); so it has been developed for almost 9 years and is still going strong.
The decision to create a new major version was driven by a desire to clean up the API, as well as to restructure things to support a new (at the time) extension concept, Jackson modules. The introduction of modules around Jackson 1.7 was a big change and triggered a quick evolution of new features, support for new data formats, 3rd party datatype packages and even support for non-Java JVM languages like Scala and Kotlin.
For more detailed dive-in through Jackson 1.x and transition ideas towards 2.x, please see the “Brief History of Jackson the JSON processor” I wrote back in 2013. It will provide useful context for this post.
Jackson 2.x, accumulating baggage
Since 2.x has been developed for almost a decade, many problems have been uncovered. Many have been resolved, but some have been more difficult to tackle: those that stem from the expansion of the feature set, from the range of supported data formats (especially XML and CSV) and from JVM languages (support for first Scala and then Kotlin has pushed the limits, and even newer JDK version features have their challenges).
This means that there is some accumulation of things that would be good to fix but that cannot be changed due to compatibility constraints: a situation quite similar to what happened with Jackson 1.x and led to the decision to pursue a new major version.
This is why I started planning for what should become another major version upgrade — meaning one that has backwards-incompatible changes where necessary — shortly after the release of Jackson 2.9.0. This was around late summer 2017. I started planning and implementing some of the changes, eventually upgrading most Jackson Github repos’ “master” branches to be for 3.0, with 2.x development continuing on “2.9” branch.
The first full set of major changes (initially mostly focusing on allowing full Builder-style ObjectMapper construction with immutable mappers) was ready towards the end of 2017, although no release candidates were made at that point or since.
But while I made good progress, I did not give up on development of Jackson 2.x minor versions. Part of the reason was the sudden rise in focus on security problems, problems surfaced by security researchers regarding Polymorphic Deserialization: this focus guided much of Jackson 2.10 development (and some of 2.11) to address the fundamental problem of “block list” approach (see “On Jackson CVEs…” for full background).
I also started actively backporting some of the features and changes that seemed doable in the 2.x branch: sometimes things that appear impossible or unfeasible under compatibility constraints turn out to be doable in a carefully orchestrated way. Jackson 2.10, in particular, contained quite a few forwards-compatibility additions, including support for Builder-style construction (albeit without true immutability, as the existing configuration methods of JsonFactory cannot yet be removed due to compatibility constraints).
This backporting allowed further work on the 2.x feature set and reduced the immediate need for the new major version, although the biggest benefit is probably a slightly smoother eventual upgrade, as users can already do a “soft migration” in 2.x. For example, it is possible to construct ObjectMapper using the new Builder-style with 2.10 and later: this will be the only way to construct mappers in 3.0.
But why exactly 3.0?
The above explanation of the challenges in implementing changes is a bit vague, so let’s expand on it.
First: even with backporting of some of the features, one crucial difference is that fundamentally you can only ADD new features, but not REMOVE (or in many cases even significantly alter) features. It is possible to deprecate various features, and that is what Jackson 2.x does, accumulating more and more @Deprecated features over time. So one of the things that a major version upgrade allows is proper “deep cleaning”: removal of all deprecated methods, classes, fields and features.
Second: even adding some features can be impossible. A good example is that of making TokenStreamFactory (format-agnostic subset of JsonFactory added in 2.10, but fully used in 3.0) immutable, so that all configuration is done during construction using Builder-style construction, and once built (using builder.build() after configuration) no changes are possible, except via the new ObjectMapper.rebuild() process that creates a new mapper instance. Since removal of existing direct configuration methods on ObjectMapper is verboten for compatibility reasons, the “feature” of Immutability cannot be added.
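To make the pattern concrete, here is a minimal sketch of builder-only configuration with an immutable result and a rebuild() step. ToyMapper, Builder and Feature are hypothetical names made up for this illustration; this is not Jackson code, and it only assumes the general shape described above (configure on a builder, build(), then rebuild() to derive a differently configured instance).

```java
import java.util.EnumSet;

// Hypothetical toy types for illustration only; these are NOT Jackson classes.
class ImmutableBuilderSketch {
    enum Feature { INDENT_OUTPUT, FAIL_ON_UNKNOWN }

    // The built object has no setters: its configuration is fixed at construction.
    static final class ToyMapper {
        private final EnumSet<Feature> enabled;

        private ToyMapper(EnumSet<Feature> enabled) {
            this.enabled = EnumSet.copyOf(enabled); // defensive copy
        }

        static Builder builder() {
            return new Builder(EnumSet.noneOf(Feature.class));
        }

        boolean isEnabled(Feature f) {
            return enabled.contains(f);
        }

        // The only way to "change" settings: start a new builder from the current state.
        Builder rebuild() {
            return new Builder(enabled);
        }
    }

    // All mutation happens here, before build() is called.
    static final class Builder {
        private final EnumSet<Feature> enabled;

        private Builder(EnumSet<Feature> base) {
            this.enabled = EnumSet.copyOf(base);
        }

        Builder enable(Feature f) { enabled.add(f); return this; }
        Builder disable(Feature f) { enabled.remove(f); return this; }
        ToyMapper build() { return new ToyMapper(enabled); }
    }

    public static void main(String[] args) {
        ToyMapper base = ToyMapper.builder().enable(Feature.INDENT_OUTPUT).build();
        ToyMapper strict = base.rebuild().enable(Feature.FAIL_ON_UNKNOWN).build();

        System.out.println(base.isEnabled(Feature.FAIL_ON_UNKNOWN));   // false: original is untouched
        System.out.println(strict.isEnabled(Feature.FAIL_ON_UNKNOWN)); // true: new instance differs
    }
}
```

The point of the sketch is what is missing: as long as the built type also has to keep public setter-style methods for compatibility, no instance can be treated as safely immutable.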
Somewhat related, another big improvement planned for 3.0, changing all Jackson exceptions from checked (extends IOException) into unchecked (extends RuntimeException), is impossible because such a change would make a lot of existing code non-compilable (although interestingly enough, remain binary compatible as exceptions are not part of method signatures: binary- but NOT source-compatible!).
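A small sketch of why this breaks source compatibility. All names here (ToyCheckedException, ToyUncheckedException, parseChecked, parseUnchecked) are hypothetical and only stand in for "an exception type that extends IOException" versus "one that extends RuntimeException"; none of them are Jackson types.

```java
import java.io.IOException;

// Hypothetical toy types for illustration only; these are NOT Jackson classes.
class UncheckedMigrationSketch {
    // "Old style": checked, so callers must catch it or declare it.
    static class ToyCheckedException extends IOException {
        ToyCheckedException(String msg) { super(msg); }
    }

    // "New style": unchecked, so callers may ignore it.
    static class ToyUncheckedException extends RuntimeException {
        ToyUncheckedException(String msg) { super(msg); }
    }

    static int parseChecked(String s) throws ToyCheckedException {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            throw new ToyCheckedException("Not a number: " + s);
        }
    }

    static int parseUnchecked(String s) {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            throw new ToyUncheckedException("Not a number: " + s);
        }
    }

    // Typical existing caller code written against the checked variant.
    static int callerOld(String s) {
        try {
            return parseChecked(s);
        } catch (IOException e) {
            // If the call above stopped throwing any checked IOException,
            // javac would reject this catch clause, because the exception
            // could never be thrown from the try block.
            return -1;
        }
    }

    // The same caller after migration: catching becomes optional.
    static int callerNew(String s) {
        try {
            return parseUnchecked(s);
        } catch (ToyUncheckedException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(callerOld("abc")); // -1
        System.out.println(callerNew("42"));  // 42
    }
}
```

Callers that only declared `throws IOException` would keep compiling; it is the explicit catch blocks, which are very common, that stop compiling.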
I will have more to say about actual feature set in another blog post, but these should suffice as examples for now.
Third: ability to change default settings, behavior, even without adding new features or removing old ones. Changing this behavior is fundamentally problematic for compatibility reasons and is best done in a major version upgrade.
This is something that users have often asked for; and something where user feedback regarding desired changes is important. It is also true that a good choice of default settings is a very important — but often underappreciated! — aspect of usability, helping developers.
The fourth reason, which is more of a personal preference, is that for historical reasons some of the naming and naming conventions originally chosen (or developed over time) have proven to be inconsistent and misleading: it would be good to correct/unify some of the naming discrepancies.
The main one is that of naming many types “Json”-something, like JsonProcessingException: with support for a dozen other formats, this now seems quite odd. Hence move towards replacements like
Similarly some concepts have multiple names:
- In Object values, key/value pairs may be called “fields”, “properties” or even “entries” — over time, term “property” has become preferred within Jackson concept, but use of “field” has existed alongside in Exception messages and some (internal) method names
- For reading we have “parsing”, “decoding” and just “reading” (and, for some, “deserialization”). Since Jackson uses “deserialization”/“serialization” for the higher-level concept (where we’d have the alternatives of “marshal”/“unmarshal”, too!), the lower-level streaming (incremental) API is focusing on the “token stream” concept and simple “read”/“write” terminology, while reducing references to “parsing” (and, on the writer side, “generation”)
With naming changes there is a bigger challenge of finding a reasonable balance between useful changes that improve readability of both code and documentation (consistent naming can help, inconsistent naming confuses), and the tendency for naming changes to be confusing on their own over time. It is important to figure out a set of changes to make without renaming everything everywhere, especially since some changes have cascading effects if followed through the whole code base.
One example of just keeping the old, somewhat misleading naming will probably be the Jackson Annotations package: renaming all annotations would probably be more hassle than it is worth. I will have something more to say about dealing with annotations in another blog post: it is an interestingly different kind of dependency, requiring somewhat different handling (IMO).
I probably forgot something else that a major upgrade allows (or is needed for) but I think this is a sufficient overview of Pros.
Or… why not 3.0?
So far I have talked about the need to go with another backwards-incompatible version upgrade as if that was a Fait Accompli, something that would obviously just make sense to do. But there are counter-arguments against making such a change. The Java platform itself, for example, has managed very long without truly backwards-incompatible changes: even with Java 9, most of the code would continue to work as-is. Depending on priorities, the Jackson project could conceivably just continue with a set of incremental changes, producing more and more 2.x minor releases.
So what are the Cons here? Downsides, concerns, trade-offs?
The first obvious challenge to me is the danger of Conflicting Major versions. If Jackson 3.0:
- Used same Maven coordinates (same Group Id AND Artifact Id), OR
- Used same Java package names (
- (with Java 9 and later) Used same Java Module names
then one could not have both Jackson 2.x and 3.x versions in the class (or module) path: every user, and (more importantly!) every framework would have to choose whether to use Jackson 2.x or 3.x, as co-existence would not be possible. While this may seem like a simple decision from a single application's perspective, it becomes VERY difficult with transitive (indirect) dependencies and frameworks. This problem would likely both delay (or prevent) adoption of the new version and (worse!) lead to significant problems for Jackson users, as the upgrade to 3.0 would have to be carefully coordinated across development stacks: something that would be very difficult with Jackson especially, due to its popularity and inclusion in most major Java web frameworks.
So, dealing with such an upgrade would be a Very Big Ask from users and the development community. This is why Jackson 2.x uses different Maven coordinates (group ids under com.fasterxml.jackson instead of the org.codehaus.jackson used by 1.x) and Java package names (similarly) to allow co-existence of two major versions.
While this approach has some downsides (users will have to change pretty much all import statements that refer to Jackson types), I think it is well worth it, as it prevents much more difficult compatibility problems when an application depends on two libraries or frameworks, one depending on Jackson 1.x (not converted) and another that has upgraded to 2.x. This situation is not necessarily optimal, but it is not a blocking conflict. Most importantly it allows for gradual, incremental upgrade by the user/developer community.
I plan to approach Jackson 3.0 similarly, although with a twist about annotations. But more on that at a later point.
The second, related concern for me is that of adoption: will users, library/framework developers upgrade to the new major version? And at what rate? Is it necessary to keep on maintaining both major branches — and if few upgrade, will there be value in more focus on 3.0(-only) development?
Having gone through the 1.x -> 2.0 upgrade, I think that there is definitely a need for continued support for 2.x, and there will probably be at least one more minor version for 2.x even after 3.0.0 is fully released. With 1.x the main reason that the development effort stopped had to do with the migration of the SCM from Codehaus (now long-defunct) to Github, as well as of the build system from Ant to Maven: I have not actually been able to release new versions for years now (since the Ant Maven plug-in used to push Maven Central releases broke at some point).
But one thing that I have noticed is that there has been long lingering usage of Jackson 1.x — the latest published version, 1.9.13, is still in relatively wide use: it only very recently fell off top-100 most popular libraries on mvnrepository.com.
On the plus side, Jackson 1.x just “seems to work” the way it does, without many user-reported problems. Over time users do move over, and that transition tends to go relatively smoothly, judging from the feedback I have received over the years.
I think the main driving force with a major version upgrade, at least in the case of Jackson, is that much of the usage is via frameworks like DropWizard (the first one fully adopting Jackson), Spring Boot, RESTeasy and Jersey. With pluggable service frameworks like these, support can be added with plug-ins/providers/extensions, so my main hope is that there will be a new “Jackson 3” provider that can be plugged in instead of the existing “Jackson 2” provider, so that users can make upgrades whenever they are ready.
But I guess we will see how this transition will go. I suspect it will take longer than 1.x-to-2.0.
At the moment I am back to working on Jackson 3.0, but will occasionally switch back to bug fixes for 2.12 (and in due time work a bit on 2.13, too).
I plan to write more about 3.0 plans soon.
But in the meantime if you are interested in seeing how 3.0 development work is proceeding, you may want to read further on:
- Jackson 3.0 release page details some of completed changes, as well as discusses planning, status
A few bigger changes have their own JSTEP entries too:
- JSTEP-1: 3.x upgrade compatibility details
- JSTEP-2: change in default settings, behavior, for 3.x
- JSTEP-3: improvements to Tree Model (JsonNode), with some changes in 2.x and others 3.0-only
- JSTEP-4: replace checked exceptions with unchecked ones
Enough for today; right now I am working on the last point (JSTEP-4) but hope to follow up in a week with more coverage of some of the Jackson 3.0 work.