support for potentially persisting ephemera between builds #32

@jclulow

At present, buildomat targets provide a pristine environment each time a job is started. This is the best way to provide reliable, understandable results that can be reproduced or debugged after a build completes.

Unfortunately, some software build processes are staggeringly expensive. At the expense of making the build less hermetic, it appears to be possible to persist some portion of the build environment (but not the whole environment) at the end of a successful build, to be unfurled on top of the pristine environment the next time around. This facility will require some care to avoid creating a lot of potentially quiet problems; some features we should consider are:

  • allowing this to occur for pull requests but not for builds pushed to the main branch of a repository; GitHub currently forces a new commit hash even when doing a purely fast-forward rebase merge, so our deduplication based on commit hash likely won't be a problem here unless that changes
  • if this is driven by declarative configuration in the job TOML:
    • preserving only a very specific set of files, perhaps using the same rule matching behaviour we get with output_rules directives
    • we should take care to preserve files only after a successful build
    • we will need some way to reliably invalidate any existing persisted ephemera
  • this could also be driven explicitly using the bmat control program that is available inside jobs, which was added in 6cd4797
    • perhaps one could nominate files to persist; e.g., bmat persist cargo-registry ~/.cargo/registry
    • one could unfurl the latest persistent data with, say, bmat restore cargo-registry and it would be unpacked to the location from which it was originally saved
    • these commands would direct the agent to begin managing the persistent files, or to report that it couldn't do it for whatever reason; an advantage might be that jobs could handle failure however they like, and that the operational cost of doing this would itself be visible within the job
  • jobs should report any sets of persistent ephemera that they used, including a link to download the archive itself, as well as any integrity information, or metadata like creation date, which job created the archive, etc
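
To make the persist/restore idea above more concrete, here is a rough Python sketch of what the agent might do behind commands like bmat persist and bmat restore. Nothing here exists in buildomat today: the function names, the on-disk layout of the store directory, and the metadata fields are all assumptions made purely for illustration.

```python
import hashlib
import json
import os
import tarfile
import time


def persist(name, source_dir, store_dir, job_id):
    """Archive source_dir under a nominated name and record metadata.

    Hypothetical helper; store_dir stands in for wherever the agent
    would keep persisted ephemera.
    """
    source_dir = os.path.abspath(source_dir)
    os.makedirs(store_dir, exist_ok=True)
    archive = os.path.join(store_dir, name + ".tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to source_dir so restore can re-root them.
        tar.add(source_dir, arcname=".")
    with open(archive, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    meta = {
        "name": name,
        "origin": source_dir,        # where to unpack on restore
        "sha256": digest,            # integrity information
        "created": int(time.time()),
        "created_by_job": job_id,
    }
    with open(os.path.join(store_dir, name + ".json"), "w") as f:
        json.dump(meta, f)
    return meta


def restore(name, store_dir):
    """Unpack the named archive to the location it was saved from.

    Returns the metadata on success, or None if there is nothing usable,
    so that the job can decide how to handle the failure.
    """
    meta_path = os.path.join(store_dir, name + ".json")
    archive = os.path.join(store_dir, name + ".tar.gz")
    if not (os.path.exists(meta_path) and os.path.exists(archive)):
        return None
    with open(meta_path) as f:
        meta = json.load(f)
    with open(archive, "rb") as f:
        if hashlib.sha256(f.read()).hexdigest() != meta["sha256"]:
            return None  # archive does not match its recorded digest
    os.makedirs(meta["origin"], exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(meta["origin"])
    return meta
```

In this sketch a job would call something like persist("cargo-registry", "~/.cargo/registry" expanded to a full path, ...) only after a successful build, and a later job would call restore("cargo-registry", ...) and carry on from a pristine environment if it returns None. The returned metadata is the sort of thing a job could report alongside a download link for the archive.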
