[Proposal]Introduce a Header to Indicate Image Pull Intent #564

Open
kashsham95 opened this issue Feb 10, 2025 · 3 comments

@kashsham95

kashsham95 commented Feb 10, 2025

Image pulls happen for various reasons, including image vulnerability scans and potentially other maintenance processes. These pulls do not necessarily reflect the customer's intended use of the image, yet they currently contribute to the image's pull history, giving the impression of active usage. The problem becomes more significant when customers rely on pull activity for image lifecycle management. In particular, third-party services that regularly pull images for scanning or other processes can distort the actual usage data.

Proposed idea

Introduce a new HTTP header in OCI image pulls, such as X-OCI-Pull-Intent-, to allow callers to specify the reason for the pull, for example:

X-OCI-Pull-Robot: <true|false> (for vulnerability scanning, or any other use case that's not considered a "real" pull)

The default would be false if this header is not specified. OCI-compliant registry services can use this information to differentiate between image pulls made for maintenance purposes and pulls that reflect real customer usage, such as deploying containers.
Registry behavior does not change for requests that do not set the new header.
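
To make the idea concrete, here is a minimal sketch of both sides of the exchange, assuming the `X-OCI-Pull-Robot` name from the example above. The header is only a proposal and is not part of any OCI spec; `build_manifest_request`, `is_robot_pull`, and the registry host are hypothetical names used for illustration. The request is built but never sent.

```python
from urllib.request import Request

# Hypothetical header name taken from the proposal text; not defined by any OCI spec.
PULL_ROBOT_HEADER = "X-OCI-Pull-Robot"


def build_manifest_request(registry, repo, reference, robot=False):
    """Client side: build (but do not send) a manifest GET request.

    When robot is True, the request is flagged as a non-"real" pull.
    """
    url = f"https://{registry}/v2/{repo}/manifests/{reference}"
    req = Request(url, method="GET")
    req.add_header("Accept", "application/vnd.oci.image.manifest.v1+json")
    if robot:
        req.add_header(PULL_ROBOT_HEADER, "true")
    return req


def is_robot_pull(headers):
    """Registry side: return True only when the header is present and "true".

    Header names are compared case-insensitively. A missing header or any
    other value falls back to the proposed default of False.
    """
    wanted = PULL_ROBOT_HEADER.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value.strip().lower() == "true"
    return False
```

A registry could then keep two counters, one for flagged pulls and one for everything else, and base lifecycle decisions only on the second.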

Benefits: A pull-intent header makes pull history a more accurate reflection of how images are actually used, which in turn makes image lifecycle management based on pull activity more reliable.

Risks & Considerations: If third-party tools don't adopt this header, the intent behind their pulls will remain unknown. There is also a risk that clients set the header incorrectly, for example marking real image usage as a scan/audit, which would make pull activity data inaccurate.

@tianon
Member

tianon commented Feb 10, 2025 via email

@kashsham95
Author

kashsham95 commented Feb 10, 2025

> This is usually (in other HTTP-based domains) a problem solved via user agent, right? 🤔

Yes, but the proposal is to standardize a header in the OCI spec for callers to adopt.

@sudo-bmitch
Contributor

Various scanning solutions I've seen improve their performance by scanning the layers only once and caching the SBOM associated with those layers (or at least with the assembled image). That allows future scan requests to be based on the SBOM without pulling the image or layers, only verifying that the digest is unchanged with a HEAD request to the tag.
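
The caching pattern described above might look roughly like the sketch below. It is not taken from any particular scanner: `ScanCache`, `head_digest`, and `scan_image` are hypothetical names, and the two callables are injected so the example runs without a registry. In practice `head_digest` would issue a HEAD request for the tag and read the manifest digest from the response.

```python
class ScanCache:
    """Cache SBOMs by manifest digest so unchanged tags are not pulled again."""

    def __init__(self, head_digest, scan_image):
        # head_digest: callable(tag) -> digest string; assumed to be a cheap
        # HEAD request on the manifest rather than a full pull.
        self._head_digest = head_digest
        # scan_image: callable(tag) -> SBOM; assumed to perform the full pull.
        self._scan_image = scan_image
        self._sbom_by_digest = {}
        self.full_pulls = 0

    def sbom_for(self, tag):
        """Return the SBOM for a tag, pulling only when its digest is new."""
        digest = self._head_digest(tag)
        if digest not in self._sbom_by_digest:
            self._sbom_by_digest[digest] = self._scan_image(tag)
            self.full_pulls += 1
        return self._sbom_by_digest[digest]
```

With this shape, repeated scans of an unchanged tag cost one HEAD request each and never add to the blob or manifest pull count.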

There will also be tools where the intent is unknown. Even with something like a docker pull, that could be done by a script that is copying images with a docker tag/docker push pattern. Or it could be inspecting the image config. Or it could be exporting the image into a vulnerability scanner. And none of those uses would be known at the time of the image pull. Or a mirroring utility could pull the content to an air-gapped registry where actual usage of the image is not visible to the upstream registry.

Given that, my initial concern is that this could result in different usage metrics, but not necessarily more accurate ones. If an organization controls the scanners, it could improve the usage metrics by excluding those requests specifically, by user agent or even source IP. If there are implementations that adopt this and are able to improve their metrics even further, I'd be interested in seeing them before moving forward with this.
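
As a rough illustration of the exclusion approach mentioned above, a registry operator who knows their scanners could filter access-log entries without any new header. The user-agent prefix, the network range, and the log-entry shape below are all made-up assumptions for the sake of the example.

```python
import ipaddress

# Hypothetical values: an operator would substitute their own scanners' details.
SCANNER_USER_AGENT_PREFIXES = ("example-scanner/",)
SCANNER_NETWORKS = (ipaddress.ip_network("10.20.0.0/16"),)


def is_scanner_request(user_agent, source_ip):
    """Return True when a request matches a known scanner by user agent or IP."""
    if user_agent.lower().startswith(SCANNER_USER_AGENT_PREFIXES):
        return True
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in SCANNER_NETWORKS)


def count_real_pulls(log_entries):
    """Count pulls that do not come from a known scanner.

    Each entry is assumed to be a dict with "user_agent" and "source_ip" keys.
    """
    return sum(
        1
        for entry in log_entries
        if not is_scanner_request(entry["user_agent"], entry["source_ip"])
    )
```

This only works for scanners the operator can identify, which is the limitation the proposal is trying to address for third-party tools.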
