“Nothing to Declare but My Genius”: Type Safety in Decision Automation

Oscar Wilde’s retort to customs officials may be apocryphal, but oddly it has implications for decision automation. For example, is it always necessary for decisions to declare (in their design) the data type of their results? Or are there advantages to withholding this information until they are used? Should tools verify that decisions comply with these declarations, and how is this helpful?

Introduction

Architecturally, it seems vital that decision models, taken as a whole, declare the type of outcome they can generate. Such a declaration is essential for clients of any decision service based on the model, who need to understand and safely use the service’s output. For example, consider a decision service to calculate a student loan. Given a student’s details, we might reasonably expect this to yield a capped, non-negative sum of money. Similarly, a decision service to classify customer loyalty should provide one of several defined loyalty classes (e.g., bronze, silver, gold). Without this information, how would a user effectively incorporate the decision into their use case? This article explores the value of these declarations, when they should be introduced, how tools check them and when they can and should be overlooked.
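To make the idea of a declared outcome concrete, here is a minimal Python sketch of the two output contracts described above. It is an illustration only, not DMN notation: the names LoanAmount, LoyaltyClass and MAX_LOAN, and the cap of 10,000, are assumptions introduced for this example.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum

# Hypothetical cap, chosen only for illustration.
MAX_LOAN = Decimal("10000")


class LoyaltyClass(Enum):
    """The only outcomes the loyalty decision service may return."""
    BRONZE = "bronze"
    SILVER = "silver"
    GOLD = "gold"


@dataclass(frozen=True)
class LoanAmount:
    """A capped, non-negative sum of money."""
    value: Decimal

    def __post_init__(self) -> None:
        # Reject any value outside the declared range.
        if not (Decimal("0") <= self.value <= MAX_LOAN):
            raise ValueError(f"loan amount {self.value} is outside 0..{MAX_LOAN}")
```

A client that receives a LoanAmount or a LoyaltyClass knows exactly which values it must handle, which is the point of declaring the outcome type at the service boundary.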

Type Safety in Decision Modelling

Explicit Typing and Type Checking

Within a decision model, explicit typing is the process of overtly defining the data type of all inputs and outputs (information items) when the decision is created. Type checking is the process of ensuring that, when the model is used, these type constraints are obeyed. In short, it checks that every information item has content of the expected type and is used in a manner consistent with its type.

By explicitly defining all types, you improve the model’s:

  • Understandability: furnished with a definition of all the input information item types and output types, a user of a decision model can better understand the requirements and purpose of the model. By analogy to a mechanical or electronic component, a big part of understanding the behaviour of a device is to appreciate precisely what it consumes and produces. This information forms part of the ‘contract’ between a decision service and its users.
  • Integrity: explicit definitions permit type checking to be done before the model is executed. A modelling tool or execution engine can check that the operations performed on an information item are consistent with its type definition. For example, it can ensure that we don’t attempt to divide one string information item by another. Critically, this can be done before any execution, allowing us to detect and eliminate many problems before decisions are used, thereby improving the robustness of decisions.
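The integrity point can be sketched in a few lines of Python. This is a deliberately simplified stand-in for what a type checker does, not the behaviour of any particular DMN tool: InformationItem, Divide and check are hypothetical names, and only one operation is modelled. Note that check inspects declared types and never evaluates the expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationItem:
    """A named item with an explicitly declared type (hypothetical model)."""
    name: str
    type_name: str  # e.g. "number" or "string"


@dataclass(frozen=True)
class Divide:
    """An expression that divides one information item by another."""
    left: InformationItem
    right: InformationItem


def check(expr: Divide) -> list[str]:
    """Return type errors found from declarations alone, without executing."""
    errors = []
    for operand in (expr.left, expr.right):
        if operand.type_name != "number":
            errors.append(
                f"cannot divide: '{operand.name}' is declared as "
                f"{operand.type_name}, not number"
            )
    return errors
```

For example, check(Divide(InformationItem("Income", "number"), InformationItem("Postcode", "string"))) reports one error for Postcode, and it does so before any student's data has been supplied.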

How to Leave Types Implicit

DMN insists that every decision declare the type of every input and output information item. So how is it possible not to be explicit about type? The answer lies with the DMN type Any. If an information item has type Any, it can contain anything. It is implicitly typed – a black box into which any value can be placed. Many modelling tools default the type of information items to Any.

Implicit typing is very flexible. It allows you to change your decision design at any point without needing to adjust type definitions. It also allows you to omit type information for (non-executable) models where it is not appropriate. However, it is also hazardous. A decision that consumes your decision’s outcome may be expecting data of a specific type (e.g., a number) and receive something different (e.g., a list), causing it to fail or behave erratically. These mistakes can only reliably be detected when the decision is used. Furthermore, because decisions often contain rarely used edge-case logic, a mistake in a decision that has not been thoroughly tested may only be discovered after the decision service has been in production for a considerable time.
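The hazard can be illustrated with a small Python sketch in which Python's own Any plays the role of the DMN type Any. The decision names and all the figures are invented for this example. The common path returns a number, but a rarely used edge case returns a list, and nothing complains until that edge case is actually exercised.

```python
from typing import Any


def financial_responsibilities(dependants: int) -> Any:
    """Implicitly typed interim decision (all figures are hypothetical)."""
    if dependants > 5:
        # Rarely used edge case: returns an itemised list by mistake.
        return [1000, 500]
    return 200 * dependants


def maximum_loan(income: int, responsibilities: Any) -> int:
    """Consuming decision: silently assumes 'responsibilities' is a number."""
    return 9000 - income // 10 + responsibilities
```

Calling maximum_loan(20000, financial_responsibilities(2)) works as intended, whereas maximum_loan(20000, financial_responsibilities(6)) raises a TypeError at run time. Had the interim decision declared a numeric output, the mismatch could have been flagged before the model was ever used.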

Should Explicit Typing Be Applied Within Decision Models?

While type safety is essential for whole decision services, should this practice be applied to every interim decision made within a model? For example, a decision model to calculate the maximum loan amount to which a student is entitled may have many interim decisions within it: one to recognize the student’s income, another to tally their savings and perhaps another to determine their financial responsibilities. Must each of these have all its input and output types declared when the decision is first defined?

Explicit Typing of Interim Decisions: Pros and Cons

The pros of strict type checking are derived from those factors mentioned above:

  • Ease of Comprehension: The model’s interim decisions are easier to understand because each one defines the type of its inputs and outputs
  • Safer Reuse: Sub-decisions are easier and safer to reuse in other contexts because of the understanding provided by explicit typing
  • Rapid Error Detection: Any inconsistency between a decision and its subordinates will be detected quickly

However, the overhead of adding explicit interim types has the following costs:

  • Loss of Agility: A drag caused by having to define and re-define output types whenever interim decisions are changed
  • Premature Commitment: The tendency to cease model innovation early (or be reluctant to refactor it) because of the increased overhead of change
  • Irrelevant Detail: Complex explicit types may be irrelevant, and even counterproductive, in non-executable models
  • Excess Complexity: Typing every interim decision can lead to an explosion in the number of types, making the model more complex

Particularly in the early stages of creating and evolving a decision model, when the structure is changing rapidly, the need to define interim types for every decision (and business knowledge model) is arduous and counterproductive. This is especially true when types are complex (e.g., structures, nested lists). Furthermore, the effort can slow down model improvements and even make modellers reluctant to refactor their models.

Best Policy

A trade-off for executable models is to implicitly type interim decisions with complex output information items until the model matures. At that point, and certainly no later than deployment, an explicit type can be assigned to each one. Ensure you use a DMN execution engine (like RapidGen Genius) that performs thorough type checking. However, be aware that this ability is rare. None of the top five DMN modelling environments supports explicit type checking.
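One way to enforce the "no later than deployment" rule is a simple pre-deployment check that lists every decision still typed as Any. The Python sketch below assumes the model has already been loaded into plain objects; Decision and implicitly_typed are hypothetical names, not part of any DMN tool's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """An interim decision and its declared output type (hypothetical model)."""
    name: str
    output_type: str  # "Any" means the output is still implicitly typed


def implicitly_typed(decisions: list[Decision]) -> list[str]:
    """Return the names of decisions whose output type is still Any."""
    return [d.name for d in decisions if d.output_type == "Any"]


# Example model in mid-development: one interim decision is not yet typed.
model = [
    Decision("Student Income", "number"),
    Decision("Savings Total", "number"),
    Decision("Financial Responsibilities", "Any"),
]
```

A release pipeline could refuse to deploy while implicitly_typed(model) is non-empty, leaving modellers free to use Any during early development.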

Conclusion

While it is always prudent to define and check the input and output types of all information items crossing the boundary of a decision service, the choice of defining interim types is more nuanced. The best practice is to leave complex types implicit until the model is mature, but then make them explicit before deployment.

About the Author

Jan Purchase has been working in investment banking for 20 years, during which he has worked with nine of the world’s top 40 banks by market capitalization. In the last 13 years he has focused exclusively on helping clients with automated Business Decisions, Decision Modelling (in DMN) and Machine Learning. Dr Purchase specializes in delivering these concepts to financial organizations, training and mentoring them in their use, and improving the integration of predictive analytics and machine learning within compliance-based operational decisions.

Dr Purchase has published a book Real World Decision Modelling with DMN, with James Taylor, which covers their experiences of using decision management and analytics in finance. He also runs a Decision Management Blog www.luxmagi.com/blog, contributes regularly to industry conferences and is currently working on ways to improve the explainability of predictive analytics, machine learning and artificial intelligence using decision modelling.

purchase@luxmagi.com   |   @JanPurchase   |   blog.luxmagi.com