Jackson 3.0: Immutability w/ Builders
aka ObjectMapper.builder().[configure].build()
First the disclaimer: as I write this (end of Jan 2021), Jackson 3.0 is still very much a work in progress (see the earlier Jackson 3.0 vision): even the first release candidate is months away. My hope is to release the final 3.0 some time during 2021. Until then you can follow progress on the master branches of the Jackson repos, build locally and kick the tires, but you should not use it for any real work.
With that out of the way, let’s look at the first major new feature that has been implemented for 3.0 (also listed on the Jackson 3.0 release planning wiki page): constructing JsonFactory and ObjectMapper instances using the Builder pattern.
What Does the Change Look Like?
The difference in construction looks like this:
// Jackson 3.x: NEW and BETTER way
JsonFactory f3 = JsonFactory.builder()
    .enable(JsonReadFeature.ALLOW_JAVA_COMMENTS)
    .build();
ObjectMapper mapper3 = JsonMapper.builder(f3)
    .enable(MapperFeature.ACCEPT_CASE_INSENSITIVE_PROPERTIES)
    .addMixIn(MyValue.class, MixinOverrides.class)
    .build();

// Jackson 2.x
JsonFactory f2 = new JsonFactory();
// 2.x factories are configured with JsonParser.Feature, not JsonReadFeature
f2.enable(JsonParser.Feature.ALLOW_COMMENTS);
ObjectMapper mapper2 = new ObjectMapper(f2);
mapper2.enable(MapperFeature.ACCEPT_CASE_INSENSITIVE_PROPERTIES);
mapper2.addMixIn(MyValue.class, MixinOverrides.class);
At first glance this does not look like a huge change: just a bit of additional code.
So what is the big deal?
The Main Problem to Solve: Config-before-Use requirement
The traditional usage pattern for stream factories (like JsonFactory) and object mappers in Jackson has consisted of two parts:
- Construction and configuration of the entity
- Actual usage (read/write)
Both are performed through the API of a single class, and the two phases MUST BE DONE SEQUENTIALLY: you must not (try to) reconfigure entities after they have been used, even just once. Configuration first, then usage. This means that the following is NOT VALID USAGE:
ObjectMapper mapper = new ObjectMapper();
// first write with standard settings:
MyValue value = new MyValue(28);
byte[] defaultOutput = mapper.writeValueAsBytes(value);
// and then we'd want different settings by applying a mix-in:
// DOES NOT WORK! DO NOT EVEN TRY.
mapper.addMixIn(MyValue.class, HidePasswordMixIn.class);
byte[] safeOutput = mapper.writeValueAsBytes(value);
What would likely happen here is that defaultOutput and safeOutput are the same: the mix-in assignment has no effect. What happened?
The reason configuration has to occur before usage is two-fold: first, configuration setting changes are not designed to be thread-safe; and second, indirect effects of configuration are not necessarily undone. Specifically, all the serializers and deserializers constructed with existing configuration settings may still be cached and reused. So when we serialized MyValue instances with certain settings, the JsonSerializer that was constructed with those settings was cached by the ObjectMapper: all future use will go through the same serializer instance, even if configuration settings have since changed to indicate different handling. Direct changes to mapper configuration may well stick (in this case we did not have thread access concerns) but do not necessarily take effect as we might expect.
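To make the caching effect concrete, here is a deliberately tiny model of it. This is not Jackson code: ToyMapper, setHidePassword() and the one-entry-per-type cache are all made up for illustration, and the only assumption carried over is that a serializer reads the settings once, when it is constructed, and is then reused.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy stand-in for a mapper; NOT Jackson code. All names are hypothetical.
class ToyMapper {
    private boolean hidePassword = false;
    // Serializers are built once per type and then reused.
    private final Map<Class<?>, Function<Object, String>> cache = new HashMap<>();

    void setHidePassword(boolean state) { hidePassword = state; }

    String write(Object value) {
        Function<Object, String> ser =
                cache.computeIfAbsent(value.getClass(), this::buildSerializer);
        return ser.apply(value);
    }

    private Function<Object, String> buildSerializer(Class<?> type) {
        // Settings are read here, at construction time, and baked in.
        final boolean hide = hidePassword;
        return v -> hide ? "{\"password\":\"***\"}" : "{\"password\":\"" + v + "\"}";
    }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        ToyMapper mapper = new ToyMapper();
        String first = mapper.write("secret");
        mapper.setHidePassword(true); // too late: the String serializer is cached
        String second = mapper.write("secret");
        System.out.println(first.equals(second)); // the change had no effect
    }
}
```

A fresh ToyMapper configured before its first write() would hide the password as expected, which is exactly the configure-then-use rule.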
This effectively means that attempts to reconfigure stream factories and ObjectMappers after use will generally not change settings successfully, and may produce new and interesting ways for code to fail.
But while the basic idea is straightforward and documented in Javadoc comments, it is easy for users to be unaware of such limitations, especially when the API itself cannot do anything to enforce the two-phase life-cycle (developers have suggested a few ways, but none works well: they would likely bring a lot of synchronization, complexity, and possibly significant performance losses in multi-threaded usage).
This is the main problem being solved with Builder-style construction.
The Other Problem to Solve: Too Big API
Aside from the correctness and usage safety problems, there is one other related problem: the full API of the Jackson 2.x ObjectMapper is big, and only grows when new configuration options are introduced.
And this is not even going into the issue of possible format-specific configuration.
Builder-style to the Rescue!
Fortunately there is a proven design pattern to solve this problem: Java Builder pattern (aka “Josh Bloch” builders, since this was described in Josh’s excellent “Effective Java” book — not the original GoF Builder pattern that is more complex). It separates API for configuration (“Builder”) from the API of actual immutable result objects, yielding multiple benefits:
- Configure-then-Use is naturally enforced: you cannot use something not yet built; you can only configure builder. Safety and ease-of-use ensue.
- Simpler API for entities: configuration API separate from actual read/write API (especially important for ObjectMapper)
- Only the Builder class is mutable; built entities can be (and in the case of Jackson, are) fully immutable. This prevents the problems with concurrent changes that allowing configuration on entities would bring, without requiring any synchronization
- Easy to support different subtypes, configuration settings: Jackson 3.x can now much more easily support format-specific configuration settings for token stream factories and object mappers
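For readers who have not met this style of builder, a minimal sketch follows. It is a generic illustration with made-up names (ToyFactory, ToyFeature), not Jackson's implementation; the point is only that the mutable part lives in Builder and the built object has no way to change.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names throughout; this is not Jackson's implementation.
enum ToyFeature { ALLOW_COMMENTS, CASE_INSENSITIVE }

final class ToyFactory {
    private final Set<ToyFeature> features;

    private ToyFactory(Builder b) {
        // Defensive copy: later changes to the builder cannot leak in.
        this.features = Collections.unmodifiableSet(EnumSet.copyOf(b.features));
    }

    static Builder builder() { return new Builder(); }

    // Read-only API: there are no setters on the built object.
    boolean isEnabled(ToyFeature f) { return features.contains(f); }

    static final class Builder {
        private final EnumSet<ToyFeature> features = EnumSet.noneOf(ToyFeature.class);

        Builder enable(ToyFeature f) { features.add(f); return this; }

        ToyFactory build() { return new ToyFactory(this); }
    }
}
```

Usage mirrors the snippet at the top of the post: ToyFactory.builder().enable(ToyFeature.ALLOW_COMMENTS).build(). Calling enable() on the builder afterwards does not affect factories that were already built.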
Also available in Jackson 2.x since 2.10!
Although the fully immutable implementation of the Builder pattern only exists in Jackson 3.0, Jackson 2.10 introduced a mostly complete facade (*) for Jackson 2.x. The original intention was to help with the eventual upgrade to Jackson 3.0 (and it should still prove useful for that), but it will hopefully prove to be a convenient construction mechanism even with Jackson 2.x.
The main difference is that in 2.x the resulting mappers and factories are still mutable, so they still have the potential problem with configuration attempts after usage.
(*) there are minor omissions because Jackson 2.10 does not require Java 8, so lambda-taking config methods could not yet be supported; this can be resolved for Jackson 2.13
Jackson Builder Bonus Pattern: Rebuild!
In addition to the basic builder-pattern, Jackson 3.0 also adds “rebuild” functionality so that you can start configuration with existing settings of a previously built entity:
JsonMapper mapper1 = ...;
// use for operation
// and then reconfigure
JsonMapper mapper2 = mapper1.rebuild()
.addMixIn(Value2.class, MixIn2.class)
.build();
This can be useful when a system requires multiple slightly differently configured mappers.
Note: it is also possible to reuse actual Builder objects but these are mutable and stateful which means that they cannot be shared across threads.
Note: not available in 2.10 as it cannot be implemented without underlying Builder state support.
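The idea behind rebuild() can be sketched in a few lines. Again this is a toy with hypothetical names (RebuildableMapper, mix-ins reduced to plain strings), not the actual Jackson classes: the built object keeps enough state to seed a new builder, and the original instance is never touched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy types; not Jackson's implementation.
final class RebuildableMapper {
    private final List<String> mixIns;

    private RebuildableMapper(List<String> mixIns) {
        this.mixIns = Collections.unmodifiableList(new ArrayList<>(mixIns));
    }

    static Builder builder() { return new Builder(new ArrayList<String>()); }

    // Start a new builder pre-populated with a copy of this instance's settings.
    Builder rebuild() { return new Builder(new ArrayList<>(mixIns)); }

    List<String> mixIns() { return mixIns; }

    static final class Builder {
        private final List<String> mixIns;

        private Builder(List<String> initial) { this.mixIns = initial; }

        Builder addMixIn(String name) { mixIns.add(name); return this; }

        RebuildableMapper build() { return new RebuildableMapper(mixIns); }
    }
}
```

With this, m1.rebuild().addMixIn("MixIn2").build() yields a second mapper that has both mix-ins, while m1 keeps only its own.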
Jackson Builder: simpler java.io.Serializable support
But wait! There is even more! Aside from other benefits, use of the Builder pattern makes it easier to keep ObjectMapper JDK serializable (java.io.Serializable) (*). The reason for this is that when mappers (and token stream factories) are JDK serialized, we will only serialize (and later deserialize) the state of the Builder object that was used, not the actual mapper.
This significantly decreases the size of serialized ObjectMapper instances and should make (de)serialization faster as well. Mappers passed this way may be rebuilt as well.
(*) yes, Jackson 2.x is and has been JDK serializable; while not generally useful it can be helpful for some distributed processing use cases
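One standard JDK technique for "serialize the settings, not the object" is the writeReplace() / readResolve() pair. The sketch below uses it on a toy class; I am not claiming this is how Jackson 3.0 implements it, and SerializableToyMapper, State and roundTrip() are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy type; not Jackson's implementation.
final class SerializableToyMapper implements Serializable {
    private static final long serialVersionUID = 1L;

    private final boolean caseInsensitive;
    // Derived, potentially large state that is never written out.
    private final transient Map<String, String> cache = new HashMap<>();

    SerializableToyMapper(boolean caseInsensitive) { this.caseInsensitive = caseInsensitive; }

    boolean isCaseInsensitive() { return caseInsensitive; }

    // Java serialization calls this and writes the returned object instead.
    private Object writeReplace() { return new State(caseInsensitive); }

    // Small settings holder, standing in for saved builder state.
    private static final class State implements Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean caseInsensitive;

        State(boolean caseInsensitive) { this.caseInsensitive = caseInsensitive; }

        // On deserialization, construct a fresh mapper from the settings.
        private Object readResolve() { return new SerializableToyMapper(caseInsensitive); }
    }

    // Helper for trying it out: serialize to bytes and read back.
    static SerializableToyMapper roundTrip(SerializableToyMapper in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream oin =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableToyMapper) oin.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The instance that comes back is a newly constructed one with the same settings and an empty cache, which is the behavior described above.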
Jackson Builder internals: module handling
One aspect of initialization that had to change internally, to support Builder-style configuration, is the interface that Jackson extension Modules use.
In Jackson 2.x Modules are basically given access to the actual ObjectMapper instance during their initialization; initialization occurs immediately when ObjectMapper.registerModule() is called.
In Jackson 3.x module initialization/registration does not happen immediately when MapperBuilder.addModule() is called: instead, references to modules are kept by the Builder, and initialization is performed in addition order when MapperBuilder.build() is called. At this point Modules are given “last” access to the Builder object to make changes to configuration, in a well-defined order.
This change should be mostly hidden from users, with one exception: modules may be added and removed during the build process, as initialization only takes effect when the .build() method is called on the builder.
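The deferred initialization described here can be modeled roughly as follows. ToyModule and ModuleAwareBuilder are hypothetical and far simpler than Jackson's real module and builder interfaces; "settings" are just strings so the ordering is easy to see.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy types; Jackson's real module and builder interfaces are richer.
interface ToyModule {
    void setup(ModuleAwareBuilder builder);
}

final class ModuleAwareBuilder {
    private final List<ToyModule> modules = new ArrayList<>();
    private final List<String> settings = new ArrayList<>();

    // Only records the module; nothing is initialized yet.
    ModuleAwareBuilder addModule(ToyModule m) { modules.add(m); return this; }

    ModuleAwareBuilder addSetting(String s) { settings.add(s); return this; }

    // Modules get access to the builder now, in the order they were added.
    // (A toy simplification: the result is just the list of settings.)
    List<String> build() {
        for (ToyModule m : new ArrayList<>(modules)) {
            m.setup(this);
        }
        return new ArrayList<>(settings);
    }
}
```

Because modules only run inside build(), a setting added directly on the builder after addModule() still ends up ahead of anything the modules contribute.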
Format-specific ObjectMapper subtypes in Jackson 3.x
One related change in Jackson 3.x (although also partly retrofitted into 2.10) is that all dataformat modules will now include their own ObjectMapper subtype (such as SmileMapper), as well as the matching TokenStreamFactory (*) implementation that was always needed to support a different underlying format. The build process itself is similar to that of JsonMapper / JsonFactory, but the benefit is that it is now much easier to support format-specific configuration right when building mapper instances.
(*) TokenStreamFactory is the immediate super class of JsonFactory in 2.x; in 3.0 other format type factories will directly extend TokenStreamFactory and not JsonFactory.
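A common way to let subtype builders add their own settings while still chaining the shared ones is a self-typed generic base builder. The sketch below shows that technique on toy classes; the names (ToyMapperBase, ToySmileMapper, BaseBuilder, SmileBuilder, sharedNames) are invented for illustration and do not correspond to real Jackson classes or features.

```java
// Hypothetical toy types; names do not correspond to real Jackson classes.
abstract class ToyMapperBase {
    final boolean failOnUnknown;

    ToyMapperBase(boolean failOnUnknown) { this.failOnUnknown = failOnUnknown; }
}

final class ToySmileMapper extends ToyMapperBase {
    final boolean sharedNames;

    ToySmileMapper(boolean failOnUnknown, boolean sharedNames) {
        super(failOnUnknown);
        this.sharedNames = sharedNames;
    }

    static SmileBuilder builder() { return new SmileBuilder(); }
}

// B is the concrete builder type, so shared methods keep returning it.
abstract class BaseBuilder<M extends ToyMapperBase, B extends BaseBuilder<M, B>> {
    protected boolean failOnUnknown = true;

    @SuppressWarnings("unchecked")
    B failOnUnknown(boolean state) { failOnUnknown = state; return (B) this; }

    abstract M build();
}

final class SmileBuilder extends BaseBuilder<ToySmileMapper, SmileBuilder> {
    private boolean sharedNames = false;

    // Format-specific setting, visible only on this builder subtype.
    SmileBuilder sharedNames(boolean state) { sharedNames = state; return this; }

    @Override
    ToySmileMapper build() { return new ToySmileMapper(failOnUnknown, sharedNames); }
}
```

With this shape, ToySmileMapper.builder().failOnUnknown(false).sharedNames(true).build() compiles even though failOnUnknown() is declared on the base builder, because it returns the concrete SmileBuilder type.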
Addendum: safe ways to reconfigure in Jackson 2.x
After reviewing the new, safer way to configure token stream factories and object mappers, let’s have a look at the mechanisms in Jackson 2.x that can also be used to allow reconfiguration of entities.
Use of .copy() method
Both JsonFactory and ObjectMapper have a .copy() method which does what the name implies: it constructs a fresh new copy with the existing settings. This instance can now be configured (before any and all use!) safely, and changes should take effect as expected.
The main challenge with this mechanism is that the interaction with Modules can be problematic: while the new copies should have empty caches, it may not be possible to change settings that Modules had.
So while copy() is a useful mechanism in Jackson 2.x, rebuild() is a simpler and more reliable mechanism to use in Jackson 3.x.
Reconfigurable ObjectReader, ObjectWriter
Some configuration settings were designed to be usable on a per-call (read/write) basis: for example, while MapperFeatures have longer-lasting effects (they are applied during construction of various helper objects), all SerializationFeature and DeserializationFeature settings are applied dynamically as needed (*).
But since we cannot change the settings of a singleton ObjectMapper for just one call, there needs to be something else to apply these changes.
This is where the ObjectReader and ObjectWriter helpers are used: they are reusable, immutable, light-weight (**) and thread-safe helper objects with a specific configuration. And you can also create new instances with slightly different configuration.
For example:
ObjectReader r = mapper.readerFor(MyValue.class);
MyValue v1 = r.readValue(new File("/tmp/foobar.json"));
// need a bit more lenient reader, so derive one that ignores unknown properties:
MyValue v2 = r.without(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
    .readValue(new File("/tmp/foobar.json"));
The API for ObjectReader and ObjectWriter is designed so that everything you can do is legal and safe: there is nothing you can only call in certain situations, or that you could not call after usage.
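The reason this style of API is always safe is that "reconfiguring" never modifies anything: each with/without call returns a new immutable instance. A bare-bones sketch of that pattern, using invented names (ToyReader, ToyReadFeature) rather than Jackson's own classes:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical toy types; not Jackson's ObjectReader.
enum ToyReadFeature { FAIL_ON_UNKNOWN, FAIL_ON_IGNORED }

final class ToyReader {
    private final Set<ToyReadFeature> enabled;

    ToyReader() { this(EnumSet.noneOf(ToyReadFeature.class)); }

    private ToyReader(EnumSet<ToyReadFeature> enabled) {
        this.enabled = Collections.unmodifiableSet(enabled);
    }

    // Never mutates this instance; returns a differently configured one.
    ToyReader with(ToyReadFeature f) {
        EnumSet<ToyReadFeature> copy = EnumSet.noneOf(ToyReadFeature.class);
        copy.addAll(enabled);
        copy.add(f);
        return new ToyReader(copy);
    }

    ToyReader without(ToyReadFeature f) {
        EnumSet<ToyReadFeature> copy = EnumSet.noneOf(ToyReadFeature.class);
        copy.addAll(enabled);
        copy.remove(f);
        return new ToyReader(copy);
    }

    boolean isEnabled(ToyReadFeature f) { return enabled.contains(f); }
}
```

Since no instance ever changes after construction, a shared base reader can be handed to any number of threads, and each can derive its own variant without coordination.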
Needless to say, use of these classes over ObjectMapper is strongly recommended: while there are some things for which the mapper must be used (like value conversions), basic value reading and writing options are all there.
(*) in fact, the division of features was made so that this holds: MapperFeatures are the things that cannot be supported dynamically, such as anything related to annotation introspection
(**) cheap to construct, unlike ObjectMapper instances, which are expensive to create