software – First time having users – how to deal with backwards compatability?

Well first, this is a good use for regression tests. Whenever you make a change to the file format, have saved and documented examples of files in the old format. Then make unit tests that try and load in the old format and assert that they load in correctly. This can be as simple as

def assert_can_read_v1():
    with open("legacy/v1/example_1.wtvr") as f:
        data_structure = read_logic(f)
    assert data_structure.title == "abc"
    assert data_structure.stuff == "..."

But that’s from a process perspective, how to make sure you don’t mess up. How do you actually write the code?

Well, its gonna depend on a lot of factors including on how different the files are and what kind of format things are stored in.

If things are stored in a text based, or otherwise generic, format like JSON or XML or in a data format that is built to allow extra fields in data like ProtocolBuffers or CapnProto then adding fields should be fairly simple. You just take all the places where you read the field and add some default value if the field is not there.

If you are using a format that doesn’t allow for that kind of extension or where removing fields is backwards incompatible or where you are making larger incompatible changes, you can write some predicate that tells you what “version” of file you have and dispatch to the right functionality. You can make this easier on yourself by adding explicit “version” fields to things, but that is on you.

def read_file(file):
    contents = parse_format(file)
    if contents.version == 1:
        return parse_contents_v1(contents)
    else:
        return parse_contents_v2(contents)

As a corollary, this is easier to do if you have some model in your code that is “separate” from what your config file reads into. That way you have some place to put this massaging logic.

So you could start with something like this.

import dataclasses
import json

@dataclasses.dataclass(frozen=True)
class Project:
    name: str

    @staticmethod
    def read_from_file(file_path):
        with open(file_path, "r") as f:
            contents = json.load(f)
        return Project(name=contents("name"))

And do some minor upgrades to get this

import dataclasses
import json

from typing import Optional

@dataclasses.dataclass(frozen=True)
class Project:
    name: str
    description: Optional(str)

    @staticmethod
    def read_from_file(file_path):
        with open(file_path, "r") as f:
            contents = json.load(f)
        return Project(
            name=contents("name"),
            description=contents.get("description", None)
        )

And then maybe need to do a major overhaul and end up with

import dataclasses
import json

from typing import Optional

@dataclasses.dataclass(frozen=True)
class Project:
    name: str
    description: Optional(str)

    @staticmethod
    def read_from_file(file_path):
        with open(file_path, "r") as f:
            contents = json.load(f)
        version = contents.get("version", None)
        if version is None:
            return Project(
                name=contents("name"),
                description=contents.get("description", None)
            )
        else:
            name, description = contents("stuff").split(":::", 1)
            return Project(name=name, description=description)

Does that sorta make sense? There are no hard set rules, but having the very first thing – regression tests – can help a lot.