Grammar and interpreter versioning

The grammar and interpreter versioning can be based on a release tag (semantic versioning) or on a hash. In either case, there should be a mechanism for checking the version. Because we do not have a mechanism to tag the textx grammar, the easiest way to capture the grammar version is to compute a hash of the grammar string. However, a hash cannot distinguish between breaking and non-breaking changes: even a negligible non-breaking change in the grammar changes the hash and thus requires a corresponding change in the interpreter. Another way is to keep a list of hash values of the grammars against which all the tests have passed.
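
A minimal sketch of the hash-list idea, assuming SHA-256 as the hash function and a plain Python set as the list of tested hashes. The grammar strings and the names grammar_hash, tested_hashes and is_tested are hypothetical and serve only as illustration:

```python
import hashlib


def grammar_hash(grammar: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoded grammar string."""
    return hashlib.sha256(grammar.encode('utf-8')).hexdigest()


# Hypothetical grammar strings, used only for illustration
tested_grammar = "Model: 'hello' name=ID;"
reformatted_grammar = "Model:  'hello' name=ID;"  # same rules, one extra space

# Hashes of all grammars against which the tests have passed
tested_hashes = {grammar_hash(tested_grammar)}


def is_tested(grammar: str) -> bool:
    """Check whether this exact grammar string is in the tested list."""
    return grammar_hash(grammar) in tested_hashes
```

Note that the reformatted grammar is rejected by is_tested although it is semantically identical, which illustrates the drawback of a pure hash: its hash has to be added to the list after the tests pass again.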

This is related to issue #67 because the version of the interpreter is coupled to the I/O schema used in the (de)serialization code.

Basically, we can maintain a compatibility mapping between grammar, interpreter and JSON schema for I/O in a table like this:

grammar version	interpreter version	JSON schema version
1	105	11
1	110	12
2	111	13
2	112	13
3	112	13

So the grammar and JSON schema versions that are compatible with a given interpreter version can be stored in a dict like this:

# version 112
compatibility = {'grammar': [2, 3], 'json schema': [13]}

This means that grammars tagged with 2 or 3 (whether loaded from persistence or created from scratch) and persistent JSON data tagged with 13 will work with version 112 of the interpreter. Therefore, the interpreter does not need its own version tag for this check.
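
The check against this dict could look roughly like the following sketch. The names CompatibilityError and check_compatibility are hypothetical, and integer tags are assumed as in the table above:

```python
# Compatibility record shipped with interpreter version 112 (values from the table)
COMPATIBILITY = {'grammar': [2, 3], 'json schema': [13]}


class CompatibilityError(Exception):
    """Raised when a grammar or JSON schema tag is not supported."""


def check_compatibility(grammar_version, schema_version=None,
                        compatibility=COMPATIBILITY):
    """Raise CompatibilityError if one of the given tags is not listed.

    schema_version may be None when no persistent data is loaded,
    e.g. when a model is created from scratch.
    """
    if grammar_version not in compatibility['grammar']:
        raise CompatibilityError(
            f"grammar version {grammar_version} not in "
            f"{compatibility['grammar']}")
    if (schema_version is not None
            and schema_version not in compatibility['json schema']):
        raise CompatibilityError(
            f"JSON schema version {schema_version} not in "
            f"{compatibility['json schema']}")


check_compatibility(grammar_version=3, schema_version=13)  # passes silently
```

The interpreter version itself does not appear in the check; only the tags carried by the grammar and by the persistent data are compared against the lists.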

The decision whether or not to increment a version after a change is critical. As a rule of thumb, if changes are made in both the interpreter and the grammar, then the grammar tag is incremented and all old grammar versions have to be removed from the list of compatible grammars. If only the grammar has been changed, then the grammar tag is incremented and added to the list of compatible grammars. If the change only affects the ordering of rules, formatting, comments etc., then no tag change is needed. The same approach can be used to maintain compatibility with the JSON schema of persistent data.
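
The rule of thumb can be written down as a small helper. This is only a sketch: the function name, the three change labels and the assumption that tags are consecutive integers are hypothetical and not part of the existing code:

```python
def next_grammar_tags(compatible, change):
    """Return the new list of compatible grammar tags after a change.

    compatible: current list of compatible grammar tags (integers)
    change: one of 'cosmetic', 'grammar_only', 'grammar_and_interpreter'
    """
    new_tag = max(compatible) + 1
    if change == 'cosmetic':
        # ordering of rules, formatting, comments: no tag change
        return list(compatible)
    if change == 'grammar_only':
        # new tag is added, old grammars stay compatible
        return list(compatible) + [new_tag]
    if change == 'grammar_and_interpreter':
        # new tag replaces all old grammar versions
        return [new_tag]
    raise ValueError(f'unknown change type: {change!r}')


print(next_grammar_tags([2], 'grammar_only'))                # [2, 3]
print(next_grammar_tags([2, 3], 'grammar_and_interpreter'))  # [4]
```

The same helper could be applied to the 'json schema' list, under the same assumptions.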

One special type of change is where the type of some parameter changes without a change in the grammar or in the JSON schema for I/O. An example is the :columns property of a Table, whose type has recently changed from Tuple (serialized as a list) to Series (serialized as an FWSeries). This change required a change in the tests. Using an old model instance with the new interpreter may lead to type errors. This can be captured only if the schema is aware of the internal type mapping used by the interpreter. This is possible if every single JSON data island has a schema specification. Then this island (document) can be validated against that schema; in this example, the list document should refer to the schema Table:column:v1 and the FWSeries document should refer to the schema Table:column:v2. This approach requires this schema metadata in all serialized objects. Therefore, a real JSON schema approach is not suitable purely for resolving compatibility. Simply tagging the implicit "JSON schema" will do the job. If we need to validate the data against the schema, then we need both the JSON schema and a reference to the relevant schema in the JSON data.
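
To illustrate what a schema reference inside each serialized document would involve, here is a sketch under stated assumptions: the key name '_schema', the helper names and the payloads are hypothetical, and the real FWSeries layout is not reproduced here. Only the tag is compared; no validation against an actual JSON schema is performed:

```python
import json

SCHEMA_KEY = '_schema'  # hypothetical key name for the schema reference


class SchemaTagError(Exception):
    """Raised when a document carries an unexpected schema tag."""


def tag_document(data, schema_tag):
    """Serialize data together with a reference to its schema."""
    return json.dumps({SCHEMA_KEY: schema_tag, 'data': data})


def load_tagged(document, expected_tag):
    """Deserialize a document and return its data if the tag matches."""
    obj = json.loads(document)
    found = obj.get(SCHEMA_KEY)
    if found != expected_tag:
        raise SchemaTagError(f'expected {expected_tag!r}, found {found!r}')
    return obj['data']


# Old instance: columns serialized as a plain list
old_doc = tag_document(['a', 'b'], 'Table:column:v1')
# New instance: placeholder payload standing in for a serialized FWSeries
new_doc = tag_document({'values': ['a', 'b']}, 'Table:column:v2')

columns = load_tagged(new_doc, 'Table:column:v2')  # accepted
# load_tagged(old_doc, 'Table:column:v2') would raise SchemaTagError
```

Every serialized object needs the extra key for this to work, which is the overhead mentioned above and the reason why a single tag for the implicit "JSON schema" is the simpler option for the compatibility check.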

Edited Nov 06, 2023 by Ivan Kondov