Explicitly encourage user agents to validate cache content via integrity attributes #101

Open
@crisperdue

It has been recognized for a number of years that cryptographic hashes of content have the potential to improve the effectiveness of user agent caching. Ideas have been aired for shared caching keyed by cryptographic hashes, enabling sharing of identical content even across origins, but these have been rejected on various privacy and security grounds. In fact, browsers have been moving in the conceptually opposite direction with cache partitioning, to avoid the privacy issues that arise when multiple websites access the same CDN content and the resulting access patterns become observable.

With this background in mind, even within a conventional browser cache keyed by content URL, or a partitioned cache further keyed by the origin of the containing document, a user agent could considerably reduce the number of network roundtrips by using the SRI "integrity" attribute to validate existing browser cache content.

Scenario: Document https://example.com/A refers to resource https://example.com/C, perhaps a script, image, or stylesheet. References to C from A include a suitable "integrity" attribute. When document A is first loaded, the user agent also loads C and checks that its content matches the integrity value. Later the user returns to document A. (Assume the cached copy of C is no longer fresh, e.g. any max-age has expired.) On return to A, the user agent again needs C, which would normally trigger an HTTP request, possibly a conditional one with an "If-None-Match" header answered by a 304 response indicating that the cached content is still valid. With an integrity attribute for C, the browser has enough information to validate C without the participation of any server. If document A still refers to the same content as before, the integrity attribute will be unchanged and will match the cached C, if it is still present. If document A now refers to different content at the same URL, the integrity attribute in the new version of A will differ, so the user agent can detect that the cached content is no longer valid and re-fetch it. The same sequences apply to another document B at the same origin that refers to resource C, whether C has been updated or not.
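To make the scenario concrete, here is a minimal sketch in Python of how a user agent might use the integrity value instead of a conditional request. It is an illustration under stated assumptions, not a description of any existing browser. The names `sri_digest`, `cached_copy_is_valid`, and `load_resource`, the dict-based cache, and the `fetch` callback are all hypothetical. Choosing the strongest listed algorithm and accepting any matching token is meant to follow the general shape of SRI matching, simplified.

```python
import base64
import hashlib

# Hash algorithms usable in SRI integrity metadata, weakest to strongest.
SRI_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_digest(content: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style token: '<algorithm>-<base64 of the digest>'."""
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def cached_copy_is_valid(content: bytes, integrity: str) -> bool:
    """Check content against the strongest algorithm listed in `integrity`."""
    # Tokens are whitespace-separated; drop any '?options' suffix.
    tokens = [t.split("?", 1)[0] for t in integrity.split()]
    known = [t for t in tokens if t.split("-", 1)[0] in SRI_ALGORITHMS]
    if not known:
        return False  # nothing usable: fall back to ordinary HTTP revalidation
    strongest = max((t.split("-", 1)[0] for t in known), key=SRI_ALGORITHMS.index)
    expected = sri_digest(content, strongest)
    return any(t == expected for t in known if t.startswith(strongest + "-"))


def load_resource(cache: dict, url: str, integrity: str, fetch) -> bytes:
    """Serve from cache when the hash matches; otherwise re-fetch and verify."""
    cached = cache.get(url)
    if cached is not None and cached_copy_is_valid(cached, integrity):
        return cached  # validated locally, no network roundtrip
    body = fetch(url)  # hypothetical callback performing the HTTP request
    if not cached_copy_is_valid(body, integrity):
        raise ValueError(f"integrity mismatch for {url}")
    cache[url] = body
    return body
```

In this sketch, the first branch of `load_resource` corresponds to document A still referring to the same content, and the re-fetch path corresponds to A carrying a new integrity value for the same URL.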

A network roundtrip can take as much time as the resource transfer itself, or more, so the performance improvement can be considerable. It is comparable to the benefit of putting version identifiers in URLs and configuring the server to respond with very large max-age values for versioned URLs, as is frequently done by CDNs for well-known resources (think jQuery). Removing network roundtrips this way can also free up network connections for other resources. This use of the integrity attribute requires no server modifications, not even server configuration, and it is compatible with partitioned user agent caching.

As far as I can tell, this possibility is not well documented, and probably deserves to be called out as permissible, for the benefit of implementors of user agents as well as web developers.
