Support for zipped zarr files #189

Open
vlevasseur073 opened this issue May 11, 2025 · 3 comments
Comments

@vlevasseur073

Hello,

Are there plans to add support for zipped Zarr files (on the local filesystem as well as on cloud storage)? This would be very valuable for cloud storage. For reference, the Python library can open zipped files from cloud storage, as long as the zipped Zarr has zero compression and the archive does not contain the root folder.

Many thanks in advance for your feedback,
Vincent

@nhz2
Member

nhz2 commented May 11, 2025

Yes, you can read from a zipped Zarr file with Zarr.ZipStore.

Here is an example in the tests, where ppythonzip is a path to the file:

Zarr.jl/test/python.jl

Lines 240 to 243 in 8d65246

g = zopen(Zarr.ZipStore(Mmap.mmap(ppythonzip)))
@test g isa Zarr.ZGroup
@test g.attrs["groupatt"] == "Hi"
a1 = g["a1"]

I imagine the goal with reading from cloud storage would be to only download the needed parts of the zip file. This should be possible, but I don't know enough about cloud storage to say how easy this would be to do.
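
For anyone who wants to try this outside the test suite, here is a minimal sketch that wraps the same calls as the test snippet above. The helper name `open_zipped_zarr` and the file name in the usage comments are hypothetical; only `zopen`, `Zarr.ZipStore`, and `Mmap.mmap` come from the snippet, and this covers local files only, not cloud storage.

```julia
using Zarr
using Mmap

# Hypothetical helper (not part of Zarr.jl): open a zipped Zarr archive
# stored on the local filesystem.
function open_zipped_zarr(path::AbstractString)
    # Memory-map the archive as a byte vector instead of reading it eagerly.
    bytes = Mmap.mmap(path)
    # Wrap the bytes in a ZipStore and open it, as in test/python.jl.
    return zopen(Zarr.ZipStore(bytes))
end

# Usage, assuming a local archive named "example.zarr.zip" (hypothetical)
# that contains an array called "a1":
# g = open_zipped_zarr("example.zarr.zip")
# a1 = g["a1"]
```

Memory-mapping avoids copying the whole archive up front, which is presumably why the test does it that way.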

@vlevasseur073
Author

Many thanks for your prompt answer. For the local filesystem I had implemented something based on p7zip_jll that unzips the file into a temporary folder, but this approach looks much better. I will try it soon.
Regarding zip files on cloud storage, this is indeed useful with lazy loading, to access only the structure of the whole product. I don't know much about it either, but I think it basically reads the .zmetadata to build the template.

@mkitti
Member

mkitti commented May 12, 2025

I would take a look at the Zarr v3 sharding codec, which allows many chunks to be consolidated into a few files (or just one).
https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/index.html

I intend to implement this feature here in the near future.

3 participants