refactor v3 data types #2874
Merged: d-v-b merged 164 commits into zarr-developers:main from d-v-b:feat/fixed-length-strings on Jun 16, 2025
Changes from 21 commits
f5e3f78
modernize typing
d-v-b b4e71e2
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 3c50f54
lint
d-v-b d74e7a4
new dtypes
d-v-b 5000dcb
rename base dtype, change type to kind
d-v-b 9cd5c51
start working on JSON serialization
d-v-b 042fac1
get json de/serialization largely working, and start making tests pass
d-v-b 556e390
tweak json type guards
d-v-b b588f70
fix dtype sizes, adjust fill value parsing in from_dict, fix tests
d-v-b 4ed41c6
mid-refactor commit
d-v-b 1b2c773
working form for dtype classes
d-v-b 24930b3
remove unused code
d-v-b 703e0e1
use wrap / unwrap instead of to_dtype / from_dtype; push into v2 code…
d-v-b 3c232a4
push into v2
d-v-b b7fe986
remove endianness kwarg to methods, make it an instance variable instead
d-v-b d9b44b4
make wrapping safe by default
d-v-b bf24d69
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b c1a8566
dtype-specific tests
d-v-b 2868994
more tests, fix void type default value logic
d-v-b 9ab0b1e
fix dtype mechanics in bytescodec
d-v-b e9f5e26
Merge branch 'main' into feat/fixed-length-strings
d-v-b 6df84a9
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b e14279d
remove __post_init__ magic in favor of more explicit declaration
d-v-b 381a264
fix tests
d-v-b 6a7857b
refactor data types
d-v-b e8fd72c
start design doc
d-v-b b22f324
more design doc
d-v-b b7a231e
update docs
d-v-b 7dfcd0f
fix sphinx warnings
d-v-b 706e6b6
tweak docs
d-v-b 8fbf673
info about v3 data types
d-v-b e9aff64
adjust note
d-v-b 44e78f5
fix: use unparametrized types in direct assignment
d-v-b 60cac04
start fixing config
d-v-b 120df57
Update src/zarr/core/_info.py
d-v-b 0d9922b
add placeholder disclaimer to v3 data types summary
d-v-b 2075952
make example runnable
d-v-b 44369d6
placeholder section for adding a custom dtype
d-v-b 4f3381f
define native data type and native scalar
d-v-b c8d7680
update data type names
d-v-b 2a7b5a8
fix config test failures
d-v-b e855e54
call to_dtype once in blosc evolve_from_array_spec
d-v-b a2da99a
refactor dtypewrapper -> zdtype
d-v-b 5ea3fa4
Merge branch 'main' into feat/fixed-length-strings
d-v-b cbb159d
update code examples in docs; remove native endianness
d-v-b c506d09
Merge branch 'feat/fixed-length-strings' of github.com:d-v-b/zarr-pyt…
d-v-b bb11867
adjust type annotations
d-v-b 7a619e0
fix info tests to use zdtype
d-v-b ea2d0bf
remove dead code and add code coverage exemption to zarr format checks
d-v-b 042c9e5
fix: add special check for resolving int32 on windows
d-v-b def5eb2
add dtype entry point test
d-v-b 1b7273b
remove default parameters for parametric dtypes; add mixin classes fo…
d-v-b 60b2e9d
Merge branch 'main' into feat/fixed-length-strings
d-v-b 83f508c
Update docs/user-guide/data_types.rst
d-v-b 4ceb6ed
refactor: use inheritance to remove boilerplate in dtype definitions
d-v-b 5b9cff0
Merge branch 'feat/fixed-length-strings' of github.com:d-v-b/zarr-pyt…
d-v-b 65f0453
Merge branch 'main' into feat/fixed-length-strings
d-v-b cb0a7d4
update data types documentation, and expose core/dtype module to autodoc
d-v-b 40f0063
Merge branch 'feat/fixed-length-strings' of github.com:d-v-b/zarr-pyt…
d-v-b 9989c64
add failing endianness round-trip test
d-v-b a276c84
fix endianness
d-v-b 6285739
additional check in test_explicit_endianness
d-v-b e9241b9
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b 2bffe1a
add failing test for round-tripping vlen strings
d-v-b aa32271
route object dtype arrays to vlen string dtype when numpy > 2
d-v-b 617d3f0
relax endianness mismatch to a warning instead of an error
d-v-b 2b5fd8f
use public dtype module for docs instead of special-casing the core d…
d-v-b 1831f20
use public dtype module for docs instead of special-casing the core d…
d-v-b a427a16
silence mypy error about array indexing
d-v-b 41d7e58
add release note
d-v-b c08ffd9
fix doctests, excluding config tests
d-v-b 778d740
revert addition of linkage between dtype endianness and bytes codec e…
d-v-b 269215e
remove Any types
d-v-b 8af0ce4
add docstring for wrapper module
d-v-b df60d05
simplify config and docs
d-v-b 7f54bbf
update config test
d-v-b be83f03
fix S dtype test for v2
d-v-b 3979746
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b a210f9f
fully remove v3jsonencoder
d-v-b 8fbd29a
refactor dtype module structure
d-v-b afc9872
add timedelta64
d-v-b e1bf901
refactor time dtypes
d-v-b 45f0c88
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 890077e
widen dtype test strategies
d-v-b a3f05f0
modify structured dtype fill value rt to avoid to_dict
d-v-b 4788f05
wip: begin creating isomorphic test suite for dtypes
d-v-b d3f9204
finish common tests
d-v-b fdf17e3
wip: test infrastructure for dtypes
d-v-b 4afa42a
wip: use class-based tests for all dtypes
d-v-b 4990803
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 1458aad
fill out more tests, and adjust sized dtypes
d-v-b 9673997
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b aa11df4
wip: json schema test
d-v-b f706b46
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 52518c2
add casting tests
d-v-b 4ab1c58
use relative link for changes
d-v-b e4c89f3
typo
d-v-b e386c2b
make bytes codec dtype logic a bit more literate
d-v-b 703192c
increase deadline to 500ms
d-v-b 0fab5e5
fewer commented sections of problematic lru_store_cache section of th…
d-v-b 2f945bf
add link to gh issue about lru_cache for sharding codec
d-v-b 63a6af4
attempt to speed up hypothesis tests by reducing max array size
d-v-b 56e7c84
clean up docs
d-v-b eee0d7b
remove placeholder
d-v-b 1dc8e72
make final example section doctested and more readable
d-v-b 13ca230
revert change to auto chunking
d-v-b 2a42205
revert quotation of literal type
d-v-b 3f775c8
lint
d-v-b 5320a77
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b b525b8e
fix broken code block
d-v-b ec94878
specialize test to handle stringdtype changes coming in numpy 2.3
d-v-b 3af98aa
add docstring to _TestZDType class
d-v-b 6388203
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 6ef7924
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 1329c69
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b d8c3672
type hints
d-v-b 3f4d87a
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b d8a382a
expand changelog
d-v-b 9aa751b
tweak docstring
d-v-b e4a0372
support v3 nan strings in JSON for float dtypes
d-v-b 8a976d6
revert removal of metadata chunk grid attribute
d-v-b be0d2df
use none to denote default fill value; remove old structured tests; u…
d-v-b 8c90d2c
add item size abstraction
d-v-b 0fc653f
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b 7c58f7a
rename fixed-length string dtypes, and be strict about the numpy obje…
d-v-b 3a21845
remove vestigial use of to_dtype().itemsize()
d-v-b ce0afe3
remove another vestigial use of to_dtype().itemsize()
d-v-b e67d4dc
emit warning about unstable dtype when serializing Structured dtype t…
d-v-b 4e2a157
put string dtypes in the strings module
d-v-b a1deda6
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 528a942
make tests isomorphic to source code
d-v-b c9c8181
remove old string logic
d-v-b 1cb7734
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b d80d565
use scale_factor and unit in cast_value for datetime
d-v-b 7806563
add regression testing against v2.18
d-v-b 39219fa
truncate U and S scalars in _cast_value_unsafe
d-v-b 4a7a550
docstrings and simplification for regression tests
d-v-b 807c585
changes necessary for linting with regression tests
d-v-b 5150d60
improve method names, refactor type hints with typeddictionaries, fix…
d-v-b 9ddbe97
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b d6535d6
fix storage info discrepancy in docs
d-v-b 42e14ef
fix docstring that was troubling sphinx
d-v-b 3991406
wip: add vlen-bytes
d-v-b d7da3d9
add vlen-bytes
d-v-b c3c3288
Merge branch 'main' of github.com:zarr-developers/zarr-python into fe…
d-v-b d1feaee
Merge branch 'main' into feat/fixed-length-strings
d-v-b 3ef138a
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b 1f767e4
replace placeholder text with links to a github issue
d-v-b cf55041
refactor fixed-length bytes dtypes
d-v-b 24b6b35
more v3 unstable dtype warnings, and their exemptions from tests
d-v-b 7f099a2
Merge branch 'main' into feat/fixed-length-strings
d-v-b bf7e2c5
Merge branch 'main' into feat/fixed-length-strings
d-v-b cbb0b0d
clean up typeddicts
d-v-b 8f3aa68
Merge branch 'main' into feat/fixed-length-strings
d-v-b e885869
update docstrings
d-v-b 63de7c4
Update docs/user-guide/data_types.rst
d-v-b b069d36
refactor wrapper to allow subclasses to freely define their own type …
d-v-b ae36dbf
Merge branch 'main' of https://github.com/zarr-developers/zarr-python…
d-v-b a1f2c94
Merge branch 'feat/fixed-length-strings' of https://github.com/d-v-b/…
d-v-b b2e56c8
make method definition order consistent
d-v-b d26b695
allow structured scalars to be np.void
d-v-b 49f0062
use a common function signature for from_json by packing the object_c…
d-v-b 70da4da
fix dtype doc example
d-v-b 16b4ac6
Merge branch 'main' into feat/fixed-length-strings
d-v-b
@@ -50,6 +50,7 @@
     get_indexer,
     morton_order_iter,
 )
+from zarr.core.metadata.dtype import DTypeWrapper
 from zarr.core.metadata.v3 import parse_codecs
 from zarr.registry import get_ndbuffer_class, get_pipeline_class, register_codec
 
@@ -355,9 +356,10 @@ def __init__(
         object.__setattr__(self, "index_location", index_location_parsed)
 
         # Use instance-local lru_cache to avoid memory leaks
-        object.__setattr__(self, "_get_chunk_spec", lru_cache()(self._get_chunk_spec))
-        object.__setattr__(self, "_get_index_chunk_spec", lru_cache()(self._get_index_chunk_spec))
-        object.__setattr__(self, "_get_chunks_per_shard", lru_cache()(self._get_chunks_per_shard))
+        # TODO: fix these when we don't get hashability errors for certain numpy dtypes
+        # object.__setattr__(self, "_get_chunk_spec", lru_cache()(self._get_chunk_spec))
+        # object.__setattr__(self, "_get_index_chunk_spec", lru_cache()(self._get_index_chunk_spec))
+        # object.__setattr__(self, "_get_chunks_per_shard", lru_cache()(self._get_chunks_per_shard))
 
     # todo: typedict return type
     def __getstate__(self) -> dict[str, Any]:

Review comment on the `# TODO` line (conversation marked as resolved by d-v-b): Is this something that needs fixing before this PR is merged?

@@ -402,7 +404,9 @@ def evolve_from_array_spec(self, array_spec: ArraySpec) -> Self:
             return replace(self, codecs=evolved_codecs)
         return self
 
-    def validate(self, *, shape: ChunkCoords, dtype: np.dtype[Any], chunk_grid: ChunkGrid) -> None:
+    def validate(
+        self, *, shape: ChunkCoords, dtype: DTypeWrapper[Any, Any], chunk_grid: ChunkGrid
+    ) -> None:
         if len(self.chunk_shape) != len(shape):
             raise ValueError(
                 "The shard's `chunk_shape` and array's `shape` need to have the same number of dimensions."
@@ -483,7 +487,10 @@ async def _decode_partial_single(
 
         # setup output array
         out = shard_spec.prototype.nd_buffer.create(
-            shape=indexer.shape, dtype=shard_spec.dtype, order=shard_spec.order, fill_value=0
+            shape=indexer.shape,
+            dtype=shard_spec.dtype.unwrap(),
+            order=shard_spec.order,
+            fill_value=0,
         )
 
         indexed_chunks = list(indexer)
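The caching lines commented out in the diff run into a constraint of `functools.lru_cache`: every argument passed to the cached function must be hashable. The sketch below illustrates that constraint only. `HashableSpec`, `UnhashableSpec`, and `Codec` are hypothetical stand-ins and not zarr-python classes; the list field simply plays the role of a dtype that cannot be hashed.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class HashableSpec:
    # Hypothetical stand-in for a spec whose fields are all hashable.
    shape: tuple[int, ...]
    dtype: str


@dataclass(frozen=True)
class UnhashableSpec:
    # Hypothetical stand-in: hashing this object fails because of the list field.
    shape: tuple[int, ...]
    dtype: list[str]


class Codec:
    def __init__(self) -> None:
        self.calls = 0
        # Instance-local cache: one cache per Codec instance rather than one
        # shared cache on the class.
        self.get_chunk_spec = lru_cache()(self._get_chunk_spec)

    def _get_chunk_spec(self, spec):
        self.calls += 1
        return spec.shape


codec = Codec()
codec.get_chunk_spec(HashableSpec((2, 2), "int8"))
codec.get_chunk_spec(HashableSpec((2, 2), "int8"))  # equal key, served from cache

try:
    codec.get_chunk_spec(UnhashableSpec((2, 2), ["int8"]))
    cache_error = None
except TypeError as exc:
    # lru_cache hashes its arguments to build the cache key.
    cache_error = exc
```

Under these assumptions, the cached method is computed once for the two equal hashable specs, while the unhashable spec raises `TypeError` before the wrapped method runs.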
Review comment: calling `to_dtype` here would help avoid having to call it twice below.
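The change this comment asks for amounts to hoisting a repeated conversion into a local variable. A minimal sketch, assuming a hypothetical `FakeZDType` wrapper: only the method name `to_dtype` comes from the comment, and the class, its fields, and both functions are invented for illustration.

```python
class FakeZDType:
    """Hypothetical stand-in for a dtype wrapper; counts conversions."""

    def __init__(self, name: str, itemsize: int) -> None:
        self.name = name
        self.itemsize = itemsize
        self.conversions = 0

    def to_dtype(self) -> tuple[str, int]:
        # Assume the real conversion does non-trivial work, so repeats are wasteful.
        self.conversions += 1
        return (self.name, self.itemsize)


def evolve_twice(zdtype: FakeZDType) -> dict[str, object]:
    # Before: the conversion runs once per use.
    return {"typesize": zdtype.to_dtype()[1], "name": zdtype.to_dtype()[0]}


def evolve_once(zdtype: FakeZDType) -> dict[str, object]:
    # After: convert once and reuse the local value.
    native = zdtype.to_dtype()
    return {"typesize": native[1], "name": native[0]}
```

Both functions return the same dictionary; the second performs a single conversion instead of two.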