More correct support for build caches in CI #3404

Open
@klirichek

Proposal:

Right now the cache on GitHub CI is just a 'snapshot' of build artifacts of rarely changed and third-party libraries. It needs manual updating whenever any of them changes. For example, the fresh columnar headers and the roaring bitmap library are currently not in the cache, so every build on GitHub rebuilds them.

However, the GitLab version handles this case more correctly: when a new build is cached, the cache is updated, and subsequent builds see the renewed version with the latest updates.

In GitHub CI the same might be implemented using the restore-keys setting of the cache.

To complete it, we need a key that depends on the content of the cache. For example, list all the files in the cache and take md5/sha1/whatever of the concatenated names: cd cache; find . | sort | sha1sum (sorting matters, because find does not guarantee a stable order). Then append the hash to a fixed prefix, for example build_linux_debug_x86_64.
This way the final key will be something like build_linux_debug_x86_648be2067abece5ba297e2c6ff4cfeb57922b27f41.
At the end of the build we should store our cache under that key.
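
The key calculation described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name cache_key and the directory name cache are hypothetical, and the hash covers file names only (as proposed), so a file whose content changes under the same name would not change the key.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of any existing workflow):
#   $1 = cache directory, $2 = fixed key prefix
# Prints the prefix followed by the SHA-1 of the sorted list of file names.
cache_key() {
  local dir="$1" prefix="$2" hash
  # Sort with a fixed locale so the same set of files always hashes the same.
  hash=$(cd "$dir" && find . -type f | LC_ALL=C sort | sha1sum | cut -d' ' -f1)
  printf '%s%s\n' "$prefix" "$hash"
}

# Example call (assumes a directory named "cache" exists):
#   cache_key cache build_linux_debug_x86_64
```

The result keeps the fixed prefix at the front, so a restore key equal to the prefix still matches it.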

According to https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows#matching-a-cache-key, with a configuration like

    key: build_linux_debug_x86_64v
    restore-keys: |
      build_linux_debug_x86_64

restoring will first try the exact key build_linux_debug_x86_64v (which is a fake value), and then the most recent cache matching the restore key prefix, build_linux_debug_x86_64*. That step will retrieve our cache with key build_linux_debug_x86_648be2067abece5ba297e2c6ff4cfeb57922b27f41.

At the end of the build we calculate the key again. If nothing has changed, it will be the same build_linux_debug_x86_648be2067abece5ba297e2c6ff4cfeb57922b27f41, so GitHub will recognize that such a cache already exists and will not save it. But if something changed (say, a new version of columnar saved new headers), the hash will be different, and a new version of the cache will be saved.
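
The end-of-build check could look something like the sketch below. The helper names list_hash and plan_cache_save are hypothetical, and so is the idea of passing in the restored key (for instance from the cache-matched-key output of the restore action); treat it as one possible shape, not a tested workflow.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: SHA-1 of the sorted file-name list of directory $1.
list_hash() {
  (cd "$1" && find . -type f | LC_ALL=C sort | sha1sum | cut -d' ' -f1)
}

# Hypothetical helper:
#   $1 = cache directory, $2 = fixed prefix, $3 = key restored earlier (may be empty)
# Prints "key=<new key>" and "changed=true|false" in name=value form.
plan_cache_save() {
  local new_key
  new_key="$2$(list_hash "$1")"
  echo "key=${new_key}"
  if [ "${new_key}" = "${3:-}" ]; then
    echo "changed=false"   # same file list as the restored cache, nothing to save
  else
    echo "changed=true"    # file list differs, a new cache entry is needed
  fi
}

# In a workflow step the lines could be appended to the step outputs, e.g.:
#   plan_cache_save cache build_linux_debug_x86_64 "$RESTORED_KEY" >> "$GITHUB_OUTPUT"
```

A later save step could then use the key output and run only when changed is true. RESTORED_KEY here is an assumed variable name.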

I'm not that familiar with GitHub Actions, so I'm opening this issue for somebody more experienced.

Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
  • OpenAPI YAML updated and issue created to rebuild clients
