Jackson 2.12 Most Wanted (4/5):

CoercionConfig system

@cowtowncoder
Dec 8, 2020

(part 4 of “Deeper Dive on Jackson 2.12” mini-series — see “Jackson 2.12 Features” for context)

After going over a couple of rather old feature requests, this one (jackson-databind#2113) is slightly more recent. It addresses requests users have made over the past couple of minor versions: Jackson is sometimes too lenient in accepting secondary representations of values.

Background: need for type coercions

With default Jackson settings, you might have a value type like:

public class State {
    public boolean enabled;
}

and expect to read it from JSON content like:

{ "enabled" : true }

but you could also have content like:

{ "enabled" : 42 }
{ "enabled" : "true" }

which would effectively be handled the same as the first example (with default Jackson configuration settings): values for enabled in both cases would be coerced into the value true.
This works because, in addition to accepting JSON boolean values, the default deserializer for boolean types in Jackson also accepts some secondary JSON types:

  • Integer numbers: 0 means false, anything else true (C-style)
  • String representations of “true” and “false”
  • … but not other types like Objects or Arrays (Jackson does not quite go to JavaScript or Perl level of truthiness :) )
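A minimal sketch of these default coercions, assuming Jackson 2.12 or later (jackson-databind) is on the classpath. The class name DefaultBooleanCoercionDemo and the helper readEnabled() are made up for illustration; State is the value type from above.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;

class DefaultBooleanCoercionDemo {
    // Same shape as the State value type shown earlier
    public static class State {
        public boolean enabled;
    }

    // Hypothetical helper: reads "enabled" using a mapper with default settings
    static boolean readEnabled(String json) {
        ObjectMapper mapper = JsonMapper.builder().build();
        try {
            return mapper.readValue(json, State.class).enabled;
        } catch (JsonProcessingException e) {
            throw new IllegalArgumentException("Could not read: " + json, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readEnabled("{ \"enabled\" : true }"));     // JSON boolean
        System.out.println(readEnabled("{ \"enabled\" : 42 }"));       // non-zero integer
        System.out.println(readEnabled("{ \"enabled\" : 0 }"));        // zero
        System.out.println(readEnabled("{ \"enabled\" : \"true\" }")); // JSON String
    }
}
```

With default settings, only the third input should produce false; the others are coerced to true.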

These value coercions (implicit type conversions from a non-matching to a matching type) have been added over time to support various use cases: the first coercion, for example, was added shortly before Jackson 1.0. At the time I was working on a Java web service that was called by a Perl client (a teammate was a Perl-head, don’t judge :) ) and Perl does not (or at least did not?) have a true boolean type and instead relied on the “truthiness” of other value types. As a consequence, the Perl client passed 1 or 0 to denote the true/false distinction. To support the Perl client it was convenient to allow such representations. Similarly, some other clients (often written in a scripting language) encode all kinds of values simply as Strings, and coercion from String is necessary for them too.

But this may not make sense for all use cases, as it can introduce various unintended error cases or hide more serious problems. There are also different philosophies regarding exactly how permissive systems should be, ranging from “almost anything goes” to “thou shalt NOT use anything but the EXACT type!”. One global definition of allowed coercions for a library like Jackson, used very widely by all kinds of people for all kinds of systems, is unlikely to work perfectly for all users.

To address this problem of differing preferences, a few DeserializationFeature (and related) options have been added to allow users more control over what is acceptable; usually to make handling less lenient than defaults. For example:

  • DeserializationFeature.ACCEPT_EMPTY_STRING_AS_NULL_OBJECT controls whether an empty String (“”) is acceptable instead of { } for POJO types (and, if so, is bound as null)
  • DeserializationFeature.ACCEPT_FLOAT_AS_INT controls whether floating-point values are acceptable to bind into integer target types (int, long, BigInteger)
  • MapperFeature.ALLOW_COERCION_OF_SCALARS controls whether “number as JSON String” coercion is allowed for primitive types (and their wrappers) (note: exactly what is “scalar” is poorly defined in this case — implies non-structured Java types, in general, but what that means is less clear)
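As a rough illustration of how two of these on/off features tighten handling, here is a sketch that assumes Jackson 2.12 or later on the classpath. The Counter type, the STRICT mapper and the canRead() helper are hypothetical names used only for this example.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.MapperFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.json.JsonMapper;

class LegacyFeatureDemo {
    // Hypothetical value type with an integer property
    public static class Counter {
        public int count;
    }

    // Mapper that is less lenient than the defaults
    static final ObjectMapper STRICT = JsonMapper.builder()
            .disable(DeserializationFeature.ACCEPT_FLOAT_AS_INT)
            .disable(MapperFeature.ALLOW_COERCION_OF_SCALARS)
            .build();

    // Returns true if the strict mapper accepts the input, false if it fails
    static boolean canRead(String json) {
        try {
            STRICT.readValue(json, Counter.class);
            return true;
        } catch (JsonProcessingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canRead("{ \"count\" : 3 }"));     // matching type
        System.out.println(canRead("{ \"count\" : 3.5 }"));   // float for int target
        System.out.println(canRead("{ \"count\" : \"3\" }")); // number as JSON String
    }
}
```

Only the first input should be accepted here. Note that each feature is a single global switch on the mapper, which is exactly the granularity limitation discussed in this section.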

In addition, the @JsonFormat annotation (and the matching “ConfigOverride” system) added the concept of “leniency” (see @JsonFormat(lenient = )), which could be used as another option to allow more coercions.

But it has become obvious that, due to the sheer number of different combinations, on/off style features do not scale well: the options above can overlap and do not give very granular control.

Solution: configurable “CoercionConfig”s

Thinking through the different relevant aspects, it seems that the ability to define 3 things:

  1. Target type expected: for the first example that would be boolean. There are two ways to specify it: either a specific Class (boolean primitive, or wrapper java.lang.Boolean) or the more general LogicalType (LogicalType.Boolean)
  2. JSON source type (CoercionInputShape): for example JSON String (CoercionInputShape.String). Note that there are also a couple of “virtual” shapes for “empty” values (EmptyArray, EmptyObject, EmptyString)
  3. Action to take (CoercionAction): from basic “allow” (TryConvert) and “fail with exception” (Fail) to two defaulting actions (AsNull and AsEmpty)

would give enough granularity to configure allowed/disallowed coercions: the first 2 define applicability and the third what happens.

Beyond these dimensions, there is also that of scope: you can specify default coercion settings based only on CoercionInputShape (most often for “empty” values), which are used unless a more specific combination is found, as well as actions specific to an actual target type.

Usage Examples

Let’s start with our first use case: we do not really like the idea of any boolean values being coerced from JSON integer values. So, we could configure our mapper like so:

ObjectMapper mapper = JsonMapper.builder().build();
mapper.coercionConfigFor(LogicalType.Boolean)
    .setCoercion(CoercionInputShape.Integer, CoercionAction.Fail);

and with that, numbers would no longer be acceptable input values for fields with logical boolean type (primitive boolean, wrapper Boolean and AtomicBoolean).
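For a self-contained version of this configuration, including imports, something like the following sketch could be used (assuming Jackson 2.12 or later; the class name and the accepts() helper are made up for illustration).

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.CoercionAction;
import com.fasterxml.jackson.databind.cfg.CoercionInputShape;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.fasterxml.jackson.databind.type.LogicalType;

class BooleanFromIntegerFailDemo {
    public static class State {
        public boolean enabled;
    }

    // Mapper that refuses to coerce JSON integers into logical booleans
    static ObjectMapper strictBooleanMapper() {
        ObjectMapper mapper = JsonMapper.builder().build();
        mapper.coercionConfigFor(LogicalType.Boolean)
                .setCoercion(CoercionInputShape.Integer, CoercionAction.Fail);
        return mapper;
    }

    // Hypothetical helper: true if the input binds to State, false on failure
    static boolean accepts(String json) {
        try {
            strictBooleanMapper().readValue(json, State.class);
            return true;
        } catch (JsonProcessingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("{ \"enabled\" : true }"));     // JSON boolean
        System.out.println(accepts("{ \"enabled\" : 42 }"));       // integer: now rejected
        System.out.println(accepts("{ \"enabled\" : \"true\" }")); // String: not configured
    }
}
```

Since only the Integer input shape was configured, coercion from String keeps its default behavior and is still accepted.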

Another common case, where we might instead want to extend allowed coercions, is the “empty String” (CoercionInputShape.EmptyString) case.
Perhaps we have a special POJO, for which some clients pass “” instead of null (a Perl client? :) ); and instead of failing with an exception (“Cannot deserialize type MyPojo from String value”) we would like to just get null.
If so, try this:

mapper.coercionConfigFor(MyPojo.class)
    .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsNull);

or, if you’d prefer getting an “empty” POJO (one created with the default constructor):

mapper.coercionConfigFor(MyPojo.class)
    .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsEmpty);

Or, to do the latter for all POJOs:

mapper.coercionConfigFor(LogicalType.POJO)
    .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsEmpty);
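To compare the per-class empty String settings side by side, here is a small sketch, assuming Jackson 2.12 or later. MyPojo is given a made-up name field with a default value so that an “empty” instance is recognizable; the helper methods are hypothetical.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.CoercionAction;
import com.fasterxml.jackson.databind.cfg.CoercionInputShape;
import com.fasterxml.jackson.databind.json.JsonMapper;

class EmptyStringPojoDemo {
    // Hypothetical POJO; the default field value marks a default-constructed instance
    public static class MyPojo {
        public String name = "default";
    }

    // Reads the JSON value "" as MyPojo; action == null means "no coercion configured"
    static MyPojo readEmptyString(CoercionAction action) throws JsonProcessingException {
        ObjectMapper mapper = JsonMapper.builder().build();
        if (action != null) {
            mapper.coercionConfigFor(MyPojo.class)
                    .setCoercion(CoercionInputShape.EmptyString, action);
        }
        return mapper.readValue("\"\"", MyPojo.class);
    }

    static boolean failsByDefault() {
        try {
            readEmptyString(null);
            return false;
        } catch (JsonProcessingException e) {
            return true; // default settings reject "" for POJOs
        }
    }

    static MyPojo readAsNull() {
        try {
            return readEmptyString(CoercionAction.AsNull);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    static MyPojo readAsEmpty() {
        try {
            return readEmptyString(CoercionAction.AsEmpty);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here readAsNull() should return null, while readAsEmpty() should return a MyPojo built with the default constructor.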

… in fact, what the hey: let’s allow ALL KINDS OF THINGS to be deserialized from empty String! This might do the trick:

mapper.coercionConfigDefaults()
    .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsEmpty);

Note: for some types the “empty value” would be null: this is determined by what the JsonDeserializer that handles the target type returns from its getEmptyValue() method.
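Defaults and type-specific settings can be combined, with the more specific one winning. The sketch below illustrates this, assuming Jackson 2.12 or later; NullablePojo, OtherPojo and the fromEmptyString() helper are hypothetical names.

```java
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.cfg.CoercionAction;
import com.fasterxml.jackson.databind.cfg.CoercionInputShape;
import com.fasterxml.jackson.databind.json.JsonMapper;

class CoercionScopeDemo {
    // Two hypothetical POJOs; only the first gets a type-specific setting
    public static class NullablePojo {
        public int id = 1;
    }

    public static class OtherPojo {
        public int id = 2;
    }

    static ObjectMapper configuredMapper() {
        ObjectMapper mapper = JsonMapper.builder().build();
        // Default for the input shape: used when nothing more specific matches
        mapper.coercionConfigDefaults()
                .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsEmpty);
        // More specific setting for one target class
        mapper.coercionConfigFor(NullablePojo.class)
                .setCoercion(CoercionInputShape.EmptyString, CoercionAction.AsNull);
        return mapper;
    }

    // Reads the JSON value "" as the given type
    static <T> T fromEmptyString(Class<T> type) {
        try {
            return configuredMapper().readValue("\"\"", type);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromEmptyString(NullablePojo.class)); // class-specific AsNull
        System.out.println(fromEmptyString(OtherPojo.class).id); // default AsEmpty
    }
}
```

With this setup, NullablePojo should come back as null, while OtherPojo falls through to the default and is created with its default constructor.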

Backwards-compatibility with existing settings

The on/off features mentioned earlier act as global defaults that apply when no specific coercion configuration has been defined. Their use will be supported for Jackson 2.x, although some or all may be dropped in the future for Jackson 3.0, depending on how well this new system works.

The @JsonFormat(lenient = ) setting is not used as part of the definition, for various reasons (including incompatible scoping). Going forward, it is meant to control only value ranges within the expected type (for example, whether non-standard textual representations like “February 31” are valid for Date/Time), and not value coercions.

Beyond these on/off features, the ultimate global defaults are quite lenient, allowing value conversions.

Actually configurable coercions: work-in-progress

One important caveat with Jackson 2.12 and coercion configuration is that deserializers will have to explicitly use the new system for settings to take effect. Although a lot of effort was spent on retrofitting existing deserializers of jackson-databind and core datatype modules, it is likely that a few cases were missed.

In addition to existing coercions that may not yet consider the new configuration, there are also cases of “potential coercion”: cases where value coercion was never considered and failure is automatic. For example, as I mentioned earlier, Jackson does not try to do anything with an Object value if a boolean is expected. But while there may not be a reasonable value coercion available, one of the other settings (“as-null”, “as-empty”; the latter meaning false for booleans) could make sense: the deserializer simply needs to start considering such use cases (that is, have handling that checks coercion settings).

In both cases, it makes sense to raise a GitHub issue against the specific component that handles the datatype in question (if you know what it is), or against jackson-databind if you are not sure of the module.

Future Work

In the near future (2.12.x patch releases) it is likely that support for value coercion configuration will extend slowly but steadily. Support for the “Empty Array” and “Empty Object” pseudo-shapes, for example, is very limited currently.

Beyond these incremental improvements, it is also possible that some aspects of type coercions will differ between format backends: for example, textual formats like CSV, Java Properties and XML need to allow more flexible coercions no matter what.
And at the other end of the format spectrum, more strictly specified formats like Avro and Protobuf may have challenges of their own (but note: since coercions are between a possibly strongly typed external representation and the definitions of POJOs, coercions can still be useful and even necessary).

@cowtowncoder

Open Source developer, most known for Jackson data processor (nee “JSON library”), author of many, many other OSS libraries for Java, from ClassMate to Woodstox