04/16/2024 | News release | Distributed by Public on 04/16/2024 07:34
The maturity model for contracts and document generation looks like this:
Stage (4) is interesting because it poses some new challenges related to template governance, naming, semantics and modularity. These challenges are similar to ones that software developers have wrestled with since at least 1958 and the development of Algol. No, I'm not that old!
In this article I will dig deeper into these challenges, and potential solutions.
What Are Templates?
First, a little background on templates.
A template in its most basic form is natural language text with embedded variables. Here is an example micro-template (aka "clause template"), expressed using Accord Project TemplateMark syntax.
The data model for this template declares that the template references two variables "from" and "to" and that they are both of type DateTime.
So, templates bind a locale-neutral data model (in this case an InsurancePeriod concept, with a "from" and a "to" DateTime) to natural language (in this case English). We should not confuse the concept of an InsurancePeriod with how it has been bound to an English paragraph. In some cases we may need to also bind the InsurancePeriod to other natural languages, such as French - perhaps because we are operating in Canada and need to express this concept in two natural languages.
Template models are locale-independent and describe the shape of data (a schema), while template natural language references a template model and may (optionally) use any of the variables defined by the template model. The semantics of the template are carried primarily by the template model, while the natural language of the template expresses the syntax of the template and verbalises the concept for humans.
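As a rough illustration of this separation, here is a sketch in Python rather than in TemplateMark. The InsurancePeriod class stands in for the template model. The two template strings, the French wording, the simplified date format and the render helper are all assumptions made for this example; none of them is an Accord Project API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class InsurancePeriod:
    """Locale-neutral model: only the shape of the data, no wording."""
    from_: datetime  # "from" is a reserved word in Python, hence the underscore
    to: datetime


# Two natural-language bindings of the same model (wording is hypothetical).
TEMPLATES = {
    "en": "The period of insurance runs from {from_} to {to}.",
    "fr": "La période d'assurance s'étend du {from_} au {to}.",
}


def render(period: InsurancePeriod, locale: str) -> str:
    """Verbalise the model in the requested locale."""
    fmt = "%Y-%m-%d"  # simplified: real locale-aware date formatting is out of scope
    return TEMPLATES[locale].format(
        from_=period.from_.strftime(fmt),
        to=period.to.strftime(fmt),
    )


period = InsurancePeriod(from_=datetime(2024, 1, 1), to=datetime(2024, 12, 31))
print(render(period, "en"))
print(render(period, "fr"))
```

The model is declared once and knows nothing about English or French; adding a third language means adding a template string, not changing the model.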
How to Modularise Templates?
The Period of Insurance template shown above is clearly designed to be used as part of a larger document. It is not a document template itself.
The figure below shows three paragraphs in a document, with the first and the last providing boilerplate text (text without variables) while the middle paragraph is the Period of Insurance template, along with its two variables.
So far, so good! As long as the user supplies the values for "to" and "from", the document can be generated from the template.
A more realistic example, however, would be one where several clause templates share variables. In the example below, the "from" variable is referenced in paragraphs 1 and 2, while the "to" variable is referenced in paragraphs 2 and 3.
This now poses some interesting challenges:
Lessons from Computer Science
Most computer programmers are (at least subconsciously!) familiar with these problems, as they are the basis for how we create modern software. The maturity model for software over the past 80 years is:
The key insight comes from mathematics and Lambda Calculus: subroutines are mathematical functions that receive inputs and produce outputs.
The mathematical operator that applies a function to a set of inputs doesn't know or care what the names of the inputs are outside of its scope.
Today, rather than creating programs that reference a fixed set of global variables, computer programmers define a set of functions that transform data, irrespective of where the data is coming from or how it is named. The semantic binding between the data and a function occurs WHEN THE FUNCTION IS CALLED, not when the global variables are defined, or even when the function is defined.
An example will, I hope, make this concrete.
The semantics of the Add function are documented (by the function author) and clear: it adds its two numeric arguments together and returns their sum. WITHIN the Add Function these arguments are referred to as x and y. The function author has created a useful function and does not need to be concerned with who is calling the function.
Somewhere else (in space and time) a programmer (who is not necessarily the function author) decides that the Add function would be useful in their program (because they read the documentation and liked the semantics and trusted the implementation). They instantiate two variables A and B and then call the function, assigning the result to variable C.
The caller of the function passed variables A and B to the function (not x and y). They do not need to know how the function is implemented, or how the variables are referred to internally.
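The Add example described above can be sketched in a few lines of Python. The function and variable names follow the text (lower-cased to suit Python conventions); the values 2 and 3 are arbitrary and chosen only for illustration.

```python
def add(x: float, y: float) -> float:
    """Return the sum of the two numeric arguments.

    Inside the function the arguments are known only as x and y.
    """
    return x + y


# Somewhere else, a caller uses their own names for their own data.
a = 2
b = 3
c = add(a, b)  # the binding a -> x, b -> y happens here, at call time
print(c)
```

Renaming a and b in the calling code, or x and y inside the function, changes nothing for the other side.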
Back to Templates
Templates are functions that take data as input and return natural language text. Templates should be CALLED, just as functions are called in our example above. It is only the caller of the function that can map the data they have at hand to the data that is required by the function they would like to call. The onus is on the CALLER of the template to understand when to use the template and its data requirements, not vice versa. Once template authors have to be aware of WHERE their templates might be used, modularity and organisational scalability break down. This includes the presence of any sort of list of global variables and their semantics.
In the figure below, the document template author has introduced two document template variables: startDate and endDate.
The author of the document template would like to use Clause Templates (1) (2) and (3) and they understand that the semantics of the startDate variable corresponds to the "from" variable in the templates, and the semantics of the endDate variable corresponds to the "to" variable. To use the clause templates, they must define the binding between the variables they have in their scope to the variables required by the clause templates they would like to use/call.
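To make the binding concrete, here is a minimal sketch in Python that treats each clause template as a function and the document template as its caller. The clause wording and the function names (commencement, period_of_insurance, expiry, policy_document) are hypothetical; only the variable names startDate, endDate, "from" and "to" come from the discussion above, adapted to Python naming.

```python
from datetime import date


def commencement(from_: date) -> str:
    """Clause template 1: needs only a 'from' date."""
    return f"Cover commences on {from_.isoformat()}."


def period_of_insurance(from_: date, to: date) -> str:
    """Clause template 2: needs both 'from' and 'to'."""
    return f"The period of insurance is from {from_.isoformat()} to {to.isoformat()}."


def expiry(to: date) -> str:
    """Clause template 3: needs only a 'to' date."""
    return f"Cover expires on {to.isoformat()}."


def policy_document(start_date: date, end_date: date) -> str:
    """Document template: binds its own variables to each clause's parameters."""
    paragraphs = [
        commencement(from_=start_date),
        period_of_insurance(from_=start_date, to=end_date),
        expiry(to=end_date),
    ]
    return "\n\n".join(paragraphs)


print(policy_document(date(2024, 1, 1), date(2024, 12, 31)))
```

The clause functions never mention start_date or end_date, and the document function decides explicitly which of its values plays the role of "from" and "to" in each call.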
One can imagine this binding being more-or-less automatic, but we should never fall into the trap of assuming that variables that are NAMED THE SAME ARE THE SAME.
Tables have legs. People have legs. Sports events have legs. Not the same thing!