How much should we archive for reproducible builds?

A few alternative phrasings of the question title, to give further context:

  • What should we archive as the “sources” of a given software build?
  • Should I check all transitive package dependencies into my repository?
  • Is it OK to rely on the package manager at all to reproduce a build?
  • Should I archive a ZIP file of my git repo at the release tag?
  • Should we archive the build tools?

Context: Building an application that will be installed on multiple users’ machines / devices.

OK, so here’s the problem:

Obviously, all of our source code lives in source control. However, this is NOT enough to build the software.

When you want to create a binary build of the application, you need to (a hedged sketch of these steps follows the list):

  • Install Visual Studio in the correct version (we automate this via Chocolatey).
  • (a) Check out the correct SCC “release tag”.
  • Run “the build script”, which will:
    • (b.1) Run a nuget restore against our internal package server
    • (b.2) Fetch 3rd-party sources that are not checked into the primary repo (think vcpkg or something similar)
    • (c) Build the actual software (call msbuild, in our case)
    • (d) Package the created application binaries into something that can be passed downstream
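
For concreteness, here is a minimal sketch of what such a build script could look like, written as PowerShell. Every concrete name in it (solution file, tag, package name, versions, paths) is a hypothetical placeholder, not our actual setup:

```powershell
# build.ps1 -- sketch of the steps above; all names below are placeholders.
param(
    [string]$ReleaseTag = 'v1.2.3'   # hypothetical release tag
)

# Toolchain: pin the Visual Studio build tools via Chocolatey.
# (Package name and version are illustrative, not a recommendation.)
choco install visualstudio2019buildtools --version=16.11.33 -y

# (a) Check out the release tag (a detached HEAD is fine for a build).
git checkout "tags/$ReleaseTag"

# (b.1) Restore NuGet packages from the internal feed named in NuGet.Config.
nuget restore .\MyApp.sln -ConfigFile .\NuGet.Config

# (b.2) Fetch third-party dependencies, e.g. vcpkg in manifest mode.
vcpkg install --triplet x64-windows

# (c) Build the actual software.
msbuild .\MyApp.sln /p:Configuration=Release /p:Platform=x64 /m

# (d) Package the binaries for the downstream consumers.
Compress-Archive -Path .\bin\Release\* `
                 -DestinationPath ".\artifacts\MyApp-$ReleaseTag.zip"
```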

Note: Normally, all of the above (and more) runs in our automated CI system (Jenkins, in our case).

Some here think we should create and archive a “ZIP file” just before step (c), so that we have a baseline for a “reproducible build”: we could then reproduce a given build on any dev machine without relying on our source code server and/or our package management server. More specifically, we would not rely on the scripted parts of steps (b.#), which have to get all the infrastructure settings (server names etc.) right, and those settings can change over time. A sketch of what that snapshot could look like follows.
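
To make the proposal concrete: the snapshot would be taken after (b.1)/(b.2) have run but before (c), so the archive contains the sources plus everything the restore steps fetched. A minimal sketch, again in PowerShell and with hypothetical names, assuming the restored packages live inside the working tree:

```powershell
# snapshot.ps1 -- sketch: archive the fully restored tree just before step (c).
$ReleaseTag = 'v1.2.3'   # hypothetical release tag

# Record what produced this tree, so the archive is self-describing.
"commit:  $(git rev-parse HEAD)" | Out-File build-manifest.txt
"msbuild: $(msbuild -version | Select-Object -Last 1)" |
    Out-File build-manifest.txt -Append

# Zip everything step (c) will read: sources plus restored NuGet packages
# and third-party trees. Building from this ZIP then needs no server at all.
Compress-Archive -Path .\* `
                 -DestinationPath "..\archive\MyApp-$ReleaseTag-prebuild.zip"
```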

Others here think that’s a waste of time and space: the whole build system is critical infrastructure anyway, so having something that “works without it” doesn’t make sense.

Is there an accepted norm on this?