Jackson 2.10 features

(esp “Safe Default Typing” to vanquish stream of CVE patches!)

@cowtowncoder
Sep 29, 2019

Jackson 2.10.0 release is finally out, and it took (again) longer than expected — over 2 years, in fact, since 2.9.0 was released! Some of this was due to having started major work towards 3.x (`master` branch is now for 3.0.0-SNAPSHOT), but even after deciding that we still need at least one more minor version of 2.x things took their time. But better late than never.

As usual, there is a full set of 2.10 Release Notes if you want to know everything that is included — over 100 (!) issues have been resolved since last 2.9 patch.
This blog post gives an overview of goals and major additional features. I also hope to find time to write follow-ups with bit more detailed usage than what is included here.

Major Goals for 2.10

Looking back, there were 3 major goals for this minor release:

  1. Resolve the growing problem of “endless CVE patches”, a stream of fixes for reported CVEs related to “Polymorphic Deserialization” problem (described in “On Jackson CVEs… ”) that resulted in security tools forcing Jackson upgrades. 2.10 now includes “Safe Default Typing” that is hoped to resolve this problem.
  2. Evolve 2.x API towards 3.0, based on changes that were done in master, within limits of 2.x API backwards-compatibility requirements.
  3. Add JDK support for versions beyond Java 8: specifically add “module-info.class” for JDK9+, defining proper module definitions for Jackson components

Another continuing general theme is to “fill in the blanks”: increase coverage of new features added in 2.8 and 2.9 but that were not always fully supported across datatype or format modules; or did not work in certain combinations of features. This is especially relevant for “config overrides” (added in 2.8), @JsonAlias and @JsonSetter (null for deserialization) (both added in 2.9), and coercion settings (2.9).

And finally, ergonomics/usability improvements are being emphasized, to add convenience methods, missing overrides: this is especially true for JsonNode (Tree Model) improvements.

Major feature: Safe Default Typing (safe Polymorphic Deserialization)

Perhaps the biggest frustration during 2.9 development cycle has been seemingly endless stream of Vulnerability disclosures (aka CVEs) against Jackson databind module: use of security tools has forced patch versions to be applied regardless of whether users actually need a new Jackson version or not — problem being that mechanism (described in “On Jackson CVEs…”) is not even enabled by default.

Part of the problem was that Jackson documentation (and in particular Javadocs for methods needed to enable “Default Typing”) was not clear on potential security concerns users should consider. But additionally there was not much you could do, even if you understood the concerns, to limit the kinds of classes that can be deserialized using Default Typing.

2.10 solves this general problem by:

  1. Adding new Safe methods (activateDefaultTyping()) to replace old unsafe methods (see details on Issue #2195). Note: a new, different name was chosen on purpose, as this allows security tools to easily distinguish safe(r) use from older unsafe use.
  2. Deprecating old methods (enableDefaultTyping() is deprecated in 2.10.0)
  3. Adding Javadoc notes on both methods regarding security considerations

The main change in new methods is the requirement to specify a PolymorphicTypeValidator that is used to determine whether a given subtype is allowed to be deserialized. A configurable standard implementation, BasicPolymorphicTypeValidator, is included for convenience: it supports the common “allow list” approach.

Validator is applied when resolving Class Name as Type Identifier, but the check is only made on types that Jackson “does not know about” (ones not covered by explicit deserializers provided by databind or by registered datatype modules), which usually means Beans (POJOs) but not, for example, common types like Strings, Numbers, Lists, Maps or arrays. This simplifies definition and use of validators: in most cases you only need to be concerned with Java types you define, or whose inclusion you have control over.

So if all POJO types are known to be subtypes of MyValue you could safely turn on Default Typing by:

// construct validator that accepts `MyValue` or any of its subtypes
PolymorphicTypeValidator ptv =
    BasicPolymorphicTypeValidator.builder()
        .allowIfSubType(MyValue.class)
        .build();
// new style: configure through the builder, passing the validator explicitly
ObjectMapper mapper = JsonMapper.builder()
    .activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL)
    .build();

With that, you can safely allow Default Typing since none of gadget types would be allowed (as they are not subtypes of MyValue). BasicPolymorphicTypeValidator has other means to define checks, and you can also just implement PolymorphicTypeValidator yourself for ultimate flexibility.
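As a rough illustration of the “other means” mentioned above, here is a minimal sketch that allow-lists by class-name prefix instead of by base class, and then checks which values survive a round trip. The classes `Allowed` and `NotAllowed` and the helper names are hypothetical, made up for this example; the sketch assumes jackson-databind 2.10 or later on the classpath.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;
import java.io.IOException;

class DefaultTypingDemo {
    // Hypothetical value types, used only for illustration
    public static class Allowed { public int x = 1; }
    public static class NotAllowed { public int y = 2; }

    static ObjectMapper newMapper() {
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                // allow any subtype whose class name starts with this prefix
                .allowIfSubType(Allowed.class.getName())
                .build();
        return JsonMapper.builder()
                .activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL)
                .build();
    }

    /** Writes the value with Default Typing, then tries to read it back as Object. */
    static boolean roundTrips(Object value) {
        ObjectMapper mapper = newMapper();
        try {
            // class name is embedded as the type id on write
            String json = mapper.writeValueAsString(value);
            // validator is consulted on read, when the class name is resolved
            Object back = mapper.readValue(json, Object.class);
            return back.getClass() == value.getClass();
        } catch (IOException e) {
            return false; // in this sketch: type id was not accepted
        }
    }

    public static void main(String[] args) {
        System.out.println("Allowed round-trips: " + roundTrips(new Allowed()));
        System.out.println("NotAllowed round-trips: " + roundTrips(new NotAllowed()));
    }
}
```

Note that the validator only matters on the reading side: writing a value of a type that is not on the allow list still succeeds, and it is the attempt to read it back that fails.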

There is also a related configuration setting, JsonMapper.Builder.polymorphicTypeValidator(), used for the other potential case, that of polymorphic handling enabled via @JsonTypeInfo.
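For the annotation-based case, a minimal sketch could look like the following. The `Shape`, `Circle` and `Square` types are hypothetical; the validator here allows only `Circle`, so reading back a `Square` is expected to fail even though it is a legitimate subtype of `Shape`. Again this assumes jackson-databind 2.10 or later.

```java
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;
import java.io.IOException;

class AnnotatedTypingDemo {
    // Hypothetical type hierarchy; class name is used as the type id
    @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS)
    public abstract static class Shape { }
    public static class Circle extends Shape { public double radius = 1.0; }
    public static class Square extends Shape { public double side = 2.0; }

    static ObjectMapper newMapper() {
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType(Circle.class) // only Circle (and its subtypes) allowed
                .build();
        // validator applies to @JsonTypeInfo-based handling; Default Typing is not enabled
        return JsonMapper.builder()
                .polymorphicTypeValidator(ptv)
                .build();
    }

    /** Serializes the shape, then tries to read it back through the base type. */
    static boolean canRead(Shape shape) {
        ObjectMapper mapper = newMapper();
        try {
            String json = mapper.writeValueAsString(shape);
            Shape back = mapper.readValue(json, Shape.class);
            return back.getClass() == shape.getClass();
        } catch (IOException e) {
            return false; // in this sketch: subtype was not accepted
        }
    }

    public static void main(String[] args) {
        System.out.println("Circle readable: " + canRead(new Circle()));
        System.out.println("Square readable: " + canRead(new Square()));
    }
}
```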

Last thing to note is that since the new safe(r) Default Typing methods are named differently from the old methods, security tools should also be able to start detecting and flagging use of the old methods. This could improve CVE applicability detection as well.

Major feature: API Evolution (Builders, Feature splits)

By the time it was decided that maybe we still need at least one more 2.x minor release (mostly to tackle “Safe Default Typing”), a lot of work had been done towards a major new version of Jackson, 3.0.
Although much of that work cannot be back-ported (after all, the whole idea of 3.x is to be able to do things that are not allowed, under Semantic Versioning, in any other type of version upgrade), some of the things turned out to be doable either in whole or in part.

Some changes being back-ported in 2.x are incremental (some of “small features” described later on, for example), but there were 2 larger areas of improvements:

  • Use of “Builder pattern” for constructing ObjectMapper and JsonFactory instances (for 2.x, optional/alternative, for 3.0 the only way)
  • Separation of format-specific and general features

which seemed possible to backport, useful in their own right and helpful in eventual migration from Jackson 2.x to 3.0.

API Evolution: Builders

Use of Builders for constructing mapper and parser/generator factory instances is due to 3 major benefits builder/object separation allows:

  1. Separation of configuration API (builder) from actual API for object itself (read/write methods of mapper, parser/generator construction for JsonFactory)
  2. Allow use of format-specific sub-classes for format-specific configuration — this requires sub-classing, something that was already done for JsonFactory (all format backend sub-class it), but is now also true for ObjectMapper (all format modules include sub-class thereof)
  3. Possibility to make built object fully immutable

Of these, Jackson 3.0 can take full benefit (mappers and factories are finally fully immutable, freely shareable), but 2.10 still gets (1) and (2).

We already saw use of builders for ObjectMapper for enabling “Safe Default Typing”; general pattern is like so:

ObjectMapper mapper = JsonMapper.builder()
    .disable(MapperFeature.USE_ANNOTATIONS) // don't use annotations
    .addMixIn(MyValue.class, MyMixIns.class) // add mix-ins
    .build();

The nice thing here is that users can clearly see that configuration is defined during the build phase, via the Builder, and not on the mapper. With the 3.0 API, ObjectMapper itself will not even have any configuration methods, but 2.x will still need to expose them for backwards compatibility (although we can start marking them as deprecated in the next minor version).

Situation is similar with JsonFactory:

JsonFactory f = JsonFactory.builder()
    .disable(JsonFactory.Feature.INTERN_FIELD_NAMES)
    .build();

and similar limitations wrt 2.x vs 3.0 exist: in 3.0, there will actually be a general factory interface, TokenStreamFactory, and JsonFactory will become just one specific backend (TokenStreamFactory is actually added in 2.10, too, but cannot be used in place of JsonFactory due to backwards-compatibility constraints).

API Evolution: splitting Features

Configuring format-specific backends other than “default” JSON backend (default since JsonFactory is not only API but ALSO actual implementation — this is due to historical reasons) has become more problematic over time; partly because existing JsonParser.Features and JsonGenerator.Features have consisted of 2 sets of features:

  1. Generally applicable settings like JsonParser.Feature.AUTO_CLOSE_SOURCE
  2. JSON-specific settings like JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES

of which former work for all (or at least most) backends; and latter only work on JSON (or just one or two other backends).
One change in 3.0 is that parser and generator features are each split in two, which (in case of JSON) gives four feature types:

  1. StreamReadFeature (general parser features)
  2. StreamWriteFeature (general generator features)
  3. JsonReadFeature (for JSON-specific parser features)
  4. JsonWriteFeature (for JSON-specific generator features)

(for more information, including full list of features, check out jackson-core Wiki)

Similarly for most format backends there will be matching [Format]ReadFeature and [Format]WriteFeature (or, in 2.x, [Format]Parser.Feature and [Format]Generator.Feature).

For 2.10 we chose to do the split but still also support use of (now) old Features — this means that there is some duplication (3.0 will remove old features for good). Use of features is through factory builder methods:

JsonFactory f = JsonFactory.builder()
    .enable(StreamReadFeature.STRICT_DUPLICATE_DETECTION)
    .disable(JsonWriteFeature.QUOTE_FIELD_NAMES)
    .build();

and MapperBuilder allows their use, as do ObjectReader and ObjectWriter (as these are features that may be changed on a per-read/per-write basis).
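To make the mapper-level and per-read usage a bit more concrete, here is a small sketch that enables two JSON-specific read features, once through `JsonMapper.builder()` and once on an `ObjectReader`. The class and method names are made up for the example, and the `ObjectReader` variant relies on `JsonReadFeature` being accepted by `ObjectReader.with(FormatFeature)`, which is my understanding of the 2.10 API rather than something stated in this post.

```java
import com.fasterxml.jackson.core.json.JsonReadFeature;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.json.JsonMapper;
import java.io.IOException;
import java.io.UncheckedIOException;

class ReadFeatureDemo {
    // Not valid strict JSON: unquoted field name, single-quoted String value
    static final String LENIENT_DOC = "{name:'Bob'}";

    /** Features enabled once, at build time, for everything the mapper reads. */
    static String nameViaMapper() {
        JsonMapper mapper = JsonMapper.builder()
                .enable(JsonReadFeature.ALLOW_UNQUOTED_FIELD_NAMES,
                        JsonReadFeature.ALLOW_SINGLE_QUOTES)
                .build();
        try {
            return mapper.readTree(LENIENT_DOC).path("name").asText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same features, but enabled only for reads done through this ObjectReader. */
    static String nameViaReader() {
        ObjectReader reader = new JsonMapper().reader()
                .with(JsonReadFeature.ALLOW_UNQUOTED_FIELD_NAMES)
                .with(JsonReadFeature.ALLOW_SINGLE_QUOTES);
        try {
            return reader.readTree(LENIENT_DOC).path("name").asText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A mapper with default settings is expected to reject the same document. */
    static boolean strictMapperRejects() {
        try {
            new JsonMapper().readTree(LENIENT_DOC);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nameViaMapper());
        System.out.println(nameViaReader());
        System.out.println(strictMapperRejects());
    }
}
```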

Major Feature: JDK9+ module info

Something that was requested by many users — and one of the first things considered for 2.10 — was actually addition of Java module info. While Jackson 2.x itself still mostly only requires Java 7 (and for streaming jackson-core, only Java 6) — 3.0 will require at least Java 8 — many users are moving past Java 8 into Java 11 (Long-Term version) and eventually beyond. On those platforms full Module information is useful.

The initial challenge was this: since Module system is only available on Java 9 and above, but Jackson only requires Java 8 (for building, Java 7 for running); and since Oracle is only interested in supporting module information on Java, what can we do?

Solution came via the Moditect tool set, which allows multiple ways of injecting module-info.class from a Java 8 build system. In case of Jackson, we are using its Maven plug-in to take a module-info.java declaration and process it into a proper class file.

As a result, all Jackson components will now, as of version 2.10, include full module declarations.
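From the user side, a modular application would then declare its dependency roughly as follows. This is only a sketch: the module and package names under `com.example` are hypothetical, and it assumes the databind module is named `com.fasterxml.jackson.databind`. Since databind accesses POJOs reflectively, the package holding them typically needs to be opened to it.

```java
// module-info.java of a hypothetical application module
module com.example.app {
    // depend on Jackson databind by its module name
    requires com.fasterxml.jackson.databind;

    // allow databind reflective access to the (hypothetical) POJO package
    opens com.example.app.model to com.fasterxml.jackson.databind;
}
```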

One caveat we have is that although attempts have been made to test this information, core development is still done on Java 8, so we are not exactly dog-fooding it. Users’ help is needed to work around likely initial teething trouble: let us know of problems you find!

Smaller Features

Beyond “big 3”, there are other significant areas of improvement besides tons of single-issue fixes.

Jackson-jr feature expansion

With 2.10 we wanted to expand scope of jackson-jr functionality (to allow doing new things) and to increase configurability and extensibility; but strictly within context of keeping jackson-jr as lean and mean, low resource, fast-to-start-up library suitable for environments like Android, Lambda/Serverless and Spark/Flink processing. To do this we added:

  1. Support for reading “root value streams” (aka “JSON Streaming”, line-delimited JSON (LDJSON)) (#60)
  2. Allow registration of custom ValueReaders and ValueWriters (similar to Jackson databind custom deserializers, serializers) (#65)

First one is supported via new ValueIterator type; usage can be seen in ReadSequencesTest of jackson-jr-object repo:

static class Bean {
    public int id;
    public String msg;
}

File input = new File("json-stream.ldjson");
// iterator is closed automatically, releasing the underlying input
try (ValueIterator<Bean> it = JSON.std.beanSequenceFrom(Bean.class, input)) {
    while (it.hasNextValue()) {
        Bean bean = it.nextValue();
        // do something with 'bean'
    }
}

Support for custom handlers is bit more complicated, but the idea is that you will implement ReaderWriterProvider that can then supply custom ValueReaders and/or custom ValueWriters to handle both types jackson-jr could handle already (override default processing) and types it could not.
A full example would require a separate blog post: until then you may want to check the unit test CustomValueReadersTest.

Our hope is that this extension point would allow providing for simple datatype extensions for jackson-jr.

JsonNode usability improvements

Tree Model, represented by JsonNode, has not always received as much attention regarding usability as its popularity by users would suggest. With 2.10 attempts were made to improve usability in a few ways:

  • JsonNode is now JDK Serializable (implements java.io.Serializable). This is mostly important for distributed processing frameworks that rely on JDK serializability and should not be used for JSON processing code — just write it out as JSON! — but for some frameworks this is necessary. (databind#18)
  • JsonNode.toString() is now officially supported: previously it has “sort of” worked but was never guaranteed to produce valid JSON (for example, escaping was not applied to String values), and there was a plan to instead make output unlike JSON to reduce confusion. But after thinking things through what was decided was to do the opposite — to use a private JsonMapper configured with vanilla settings for serialization. This results in both fully valid JSON output AND efficient operation. (databind#2187)
  • Addition of convenience “require[d]” accessors: see (databind#2237) for details, but basically you can now validate existence of assumed paths and values without either null checks or calls to isXxx() methods; this should simplify tree value access code a lot.
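A short sketch of what the last two items can look like in practice, using `required()`, `requiredAt()` and `toString()`. The sample document shape and helper names are made up for illustration, and the sketch assumes jackson-databind 2.10 or later.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.json.JsonMapper;
import java.io.IOException;
import java.io.UncheckedIOException;

class RequiredAccessDemo {
    static final JsonMapper MAPPER = new JsonMapper();

    static JsonNode parse(String json) {
        try {
            return MAPPER.readTree(json);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Property-by-property access: fails fast if "user" or "name" is absent. */
    static String userName(String json) {
        return parse(json).required("user").required("name").asText();
    }

    /** Same idea with a JSON Pointer expression. */
    static String firstTag(String json) {
        return parse(json).requiredAt("/user/tags/0").asText();
    }

    /** Returns true if a missing path is reported via IllegalArgumentException. */
    static boolean missingPathFails(String json) {
        try {
            parse(json).requiredAt("/user/email");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    /** toString() output should itself be parseable JSON, equal to the original tree. */
    static boolean toStringRoundTrips(String json) {
        JsonNode root = parse(json);
        return parse(root.toString()).equals(root);
    }

    public static void main(String[] args) {
        String doc = "{\"user\":{\"name\":\"Bob\",\"tags\":[\"a\",\"b\"]}}";
        System.out.println(userName(doc));
        System.out.println(firstTag(doc));
        System.out.println(missingPathFails(doc));
        System.out.println(toStringRoundTrips(doc));
    }
}
```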

Misc other notable improvements

For Streaming API:

  • InputCoercionException (issue #508) and its super type StreamReadException were added to the streaming API to allow more granular handling of token-stream-level problems. This is similar to how JsonMappingException and its subtypes are used at databind level. All Jackson exceptions will still be based off of existing JsonProcessingException (which in turn, for 2.x, extends java.io.IOException)
  • Reduced memory retention for jackson-core by smaller recycled buffer size (issue #539). Jackson reuses some of its low-level buffers using combination of ThreadLocal and SoftReference: this improves efficiency especially for smaller documents. But in some cases it could lead to unnecessarily high memory retention: 2.10 changes recycling parameters to reduce worst-case retention by more than 50%
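As one possible illustration of the more granular exceptions, the sketch below reads a number as `int` and distinguishes a coercion failure (value does not fit) from other read problems. Class and method names are hypothetical; it assumes jackson-core 2.10 or later.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.exc.InputCoercionException;
import java.io.IOException;

class CoercionDemo {
    /** Returns "ok", "coercion" or "other", depending on how reading an int went. */
    static String readInt(String json) {
        JsonFactory f = new JsonFactory();
        try (JsonParser p = f.createParser(json)) {
            p.nextToken();   // advance to the first token
            p.getIntValue(); // may fail if the number does not fit in an int
            return "ok";
        } catch (InputCoercionException e) {
            // token was readable, but could not be coerced to the requested type
            return "coercion";
        } catch (IOException e) {
            // any other problem: malformed content, low-level I/O failure
            return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(readInt("42"));
        System.out.println(readInt("12345678901")); // too large for int
    }
}
```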

For Databind:

  • Global override setting for @JsonFormat(lenient) (issue #2424) — it is now possible to specify global “strict” processing as default, not just per-type or per-property definitions. Use with ObjectMapper.setDefaultLeniency(Boolean) (or Builder equivalent)
  • Force serialization for ObjectMapper.convertValue() (issue #2220): sometimes “short-cut”ting of seemingly unnecessary conversions prevented useful conversion attempts, so code will now trust the caller to know what they are doing and use full processing.
  • Addition of PropertyNamingStrategy.LOWER_DOT_CASE (issue #2241) to support names like “first.name” for what would in Java Bean be set using setFirstName().
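For the naming strategy, a minimal sketch might look like this. The `Person` class is hypothetical, and the strategy is referenced through the `PropertyNamingStrategy.LOWER_DOT_CASE` constant as named above; I believe later 2.x versions also offer it elsewhere, but that is outside what this post covers.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.PropertyNamingStrategy;
import com.fasterxml.jackson.databind.json.JsonMapper;
import java.io.IOException;
import java.io.UncheckedIOException;

class DotCaseDemo {
    // Hypothetical bean with a single camel-case property
    public static class Person {
        public String firstName = "Ada";
    }

    static ObjectMapper newMapper() {
        return JsonMapper.builder()
                .propertyNamingStrategy(PropertyNamingStrategy.LOWER_DOT_CASE)
                .build();
    }

    /** Serializes a Person; the property is written as "first.name". */
    static String write() {
        try {
            return newMapper().writeValueAsString(new Person());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a document using the dotted name back into the bean. */
    static String readFirstName(String json) {
        try {
            return newMapper().readValue(json, Person.class).firstName;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write());
        System.out.println(readFirstName("{\"first.name\":\"Grace\"}"));
    }
}
```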

For data format modules:

  • Avro: multiple fixes to support legal usages regarding Unions, alternative Record types
  • Properties: allow use of prefixes (issue #100); support direct reading “from” Map<String, String> (issue #139)
  • XML: handling of incoming xsi:nil (issue #354), assorted other bug fixes
  • YAML: use latest SnakeYAML (1.24), various minor bug fixes

Datatype modules:

  • Guava: add support for RangeSet (issue #50), various smaller bug fixes

Java 8 support:

  • Multiple improvements, fixes, to Java 8 date/time functionality, especially regarding allowed conversions, coercions.

Other JVM Languages:

  • Scala module: couple of bug fixes (mostly same as in 2.9.10)
  • Kotlin: support for easy polymorphic handling of sealed classes (issue #239)

New Module(s)

And last but not least: support was added for Eclipse Collections via a new module (jackson-datatype-eclipse-collections), added in the jackson-datatypes-collections GitHub repo. Module itself has been built since mid-2.9, but this is the first “official” release.

@cowtowncoder

Open Source developer, most known for Jackson data processor (nee “JSON library”), author of many, many other OSS libraries for Java, from ClassMate to Woodstox