FIX: Revise bzero handling in dMRI data's `to_nifti()` #82

jhlegarreta · 2025-01-31T00:36:15Z

Add null gradient values to DWI data serialization if appropriate.

When the flag insert_b0 is True the previous implementation prepended the bzero attribute data to the DWI instance dataobj, but it was not adding the corresponding null gradient value to the bval/bvec file pair. This patch set prepends a null gradient to the bval/bvec file pair.

Parameterize the test_load testing function to check both cases (i.e. insert_b0 False and True).

Add additional checks to ensure that the write/read round-trip works as expected, and that the bvals and bvecs attributes have the expected values.

Fixes #79.
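The fix described above can be sketched as follows. This is a minimal illustration rather than the code in the patch: the helper name `prepend_bzero` is hypothetical, and the `(4, N)` gradient layout (three b-vector rows followed by one b-value row) is an assumption made for the example.

```python
import numpy as np


def prepend_bzero(dataobj, gradients, bzero):
    """Prepend a b=0 volume to the data and a null gradient to the table.

    Assumed layout (hypothetical, for illustration): ``dataobj`` is
    (x, y, z, N), ``bzero`` is (x, y, z), and ``gradients`` is (4, N) with
    rows 0-2 holding the b-vector and row 3 the b-value.
    """
    data = np.concatenate((bzero[..., np.newaxis], dataobj), axis=-1)
    null_gradient = np.zeros((gradients.shape[0], 1), dtype=gradients.dtype)
    table = np.concatenate((null_gradient, gradients), axis=1)
    return data, table


# Toy example: three diffusion-weighted volumes on a 2x2x2 grid.
dataobj = np.ones((2, 2, 2, 3))
bzero = np.full((2, 2, 2), 5.0)
gradients = np.array(
    [
        [1.0, 0.0, 0.0],  # x components
        [0.0, 1.0, 0.0],  # y components
        [0.0, 0.0, 1.0],  # z components
        [1000.0, 1000.0, 2000.0],  # b-values
    ]
)

data, table = prepend_bzero(dataobj, gradients, bzero)
# Volumes and gradients stay in one-to-one correspondence.
assert data.shape[-1] == table.shape[1]
```

Doing both concatenations in one place keeps the number of volumes and the number of gradient columns from drifting apart, which is the mismatch the description reports.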

jhlegarreta · 2025-01-31T00:37:40Z

This only demonstrates the issue; I wrote the code as I spent time investigating the issue, so pushing it here to move forward. Some thinking/discussion is needed to see what convention we follow so that the issue can be addressed properly.

codecov · 2025-01-31T00:40:57Z

Codecov Report

Attention: Patch coverage is 80.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 70.93%. Comparing base (e964b96) to head (38ccdad).
Report is 15 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/nifreeze/data/dmri.py | 80.00% | 1 Missing and 2 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #82      +/-   ##
==========================================
+ Coverage   70.10%   70.93%   +0.83%     
==========================================
  Files          23       23              
  Lines        1067     1132      +65     
  Branches      129      136       +7     
==========================================
+ Hits          748      803      +55     
- Misses        275      284       +9     
- Partials       44       45       +1


jhlegarreta · 2025-05-30T23:28:28Z

With the progress and refactorings made lately, I think it is fair to revisit this. @oesteban let's set aside some time to discuss this.

oesteban · 2025-06-02T07:08:02Z

Sure, please book my calendar :)

Add null gradient values to DWI data serialization if appropriate.

When the flag `insert_b0` is `True` the previous implementation prepended the `bzero` attribute data to the `DWI` instance `dataobj`, but it was not adding the corresponding null gradient value to the bval/bvec file pair. This patch set prepends a null gradient to the bval/bvec file pair.

Parameterize the `test_load` testing function to check both cases (i.e. `insert_b0` `False` and `True`).

Add additional checks to ensure that the write/read round-trip works as expected, and that the `bvals` and `bvecs` attributes have the expected values.

Take advantage of the commit to make it clear in the `DWI.to_nifti` method that it assumes that the `bzero` attribute value is never `None`.

jhlegarreta · 2025-06-08T01:40:13Z

I think I am done with this: it is ready to be reviewed/merged.

A few comments:

Do the dwi-b0_desc-avg.nii.gz, dwi-b0_desc-brain.nii.gz and dwi-b0_desc-brain_mask.nii.gz testing data files correspond to what we should expect to be contained in dwi.h5 testing data file?
That is, the check below
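```
import nibabel as nb
import numpy as np
# `datadir` and `dwi_h5` (the object loaded from dwi.h5) come from the test setup.
bzero_brain_nifti = nb.load(datadir / "dwi-b0_desc-brain.nii.gz").get_fdata().astype(np.int16)
# Zero out everything outside the brain mask before comparing.
bzero_brain_h5 = np.zeros_like(dwi_h5.bzero)
bzero_brain_h5[dwi_h5.brainmask] = dwi_h5.bzero[dwi_h5.brainmask]
assert np.allclose(bzero_brain_nifti, bzero_brain_h5)
```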
raises an error. I must be missing something if they should contain the same data.

https://gin.g-node.org/nipreps-data/tests-nifreeze
The brainmask is not serialized, so if I were to test that, the round trip would fail. To be added in a separate PR.

This block

nifreeze/src/nifreeze/data/dmri.py

Lines 390 to 394 in e964b96

    # The b=0 volumes are those that did NOT pass b0_thres
    b0_volumes = fulldata[..., ~gradmsk]
    # A simple approach is to take the median across that last dimension
    # Note that axis=3 is valid only if your data is 4D (x, y, z, volumes).
    dwi_obj.bzero = np.median(b0_volumes, axis=3)

should be put into a function so that it can be reused and changed should another strategy be preferred. To be done in a separate PR.
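One way such a function could look is sketched below. The name `estimate_bzero`, its signature, the `reducer` parameter, and the default threshold of 50 are all assumptions for illustration and not part of nifreeze.

```python
import numpy as np


def estimate_bzero(fulldata, bvals, b0_thres=50, reducer=np.median):
    """Collapse the low-b volumes of a 4D series into one reference volume.

    Hypothetical helper: ``fulldata`` is (x, y, z, volumes), ``bvals`` has one
    entry per volume, and ``reducer`` is any callable accepting ``axis``.
    """
    gradmsk = np.asarray(bvals) > b0_thres  # True for diffusion-weighted volumes
    b0_volumes = fulldata[..., ~gradmsk]
    if b0_volumes.shape[-1] == 0:
        return None  # no b=0 volumes available
    # Reduce over the last axis so the volume axis is not hard-coded as 3.
    return reducer(b0_volumes, axis=-1)


bvals = [0, 1000, 5, 1000, 0]
# Volume i is filled with the constant value i**2, so volumes 0, 2 and 4
# (the low-b ones) hold 0, 4 and 16 respectively.
fulldata = np.stack([np.full((2, 2, 2), float(i * i)) for i in range(5)], axis=-1)

bzero_median = estimate_bzero(fulldata, bvals)
bzero_mean = estimate_bzero(fulldata, bvals, reducer=np.mean)
```

Passing the reduction as an argument is what would allow another strategy (mean, a robust estimator, and so on) to be swapped in without touching the callers.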

oesteban

I've allowed myself to edit the PR's title because I think it was biasing the solution.

IMHO, #79 partially misrepresents the problem: HDF5 offers built-in serialization (and this is why we are using it, e.g., to access the data from multiple processes/threads without headaches).

I believe the issue is in reality simpler: the object's `to_nifti()` has a poor implementation. If we removed `to_nifti()` (which, honestly, doesn't sound crazy, as nothing requires this feature to be part of the data object), then we wouldn't have the problem anymore. It could be a separate function, and #79 would read like "`to_nifti()` doesn't work great".

If that argument can be agreed upon, I've reviewed the PR accordingly. I'm not extracting `to_nifti()` from the object (but it possibly makes sense to consider in the future). I'll send a PR against this branch for (IMHO) a more consistent handling of cases when `bzero` is not set but `insert_b0` is turned on.
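The comment does not spell out what the consistent handling is. One possible policy, shown purely as an assumption, is to warn and fall back to not inserting anything when no `bzero` is available. The helper name `should_insert_b0` is hypothetical.

```python
import warnings


def should_insert_b0(bzero, insert_b0):
    """Decide whether a b=0 volume can actually be inserted.

    Illustrative policy only: if insertion is requested but ``bzero`` is
    unset, emit a warning and skip the insertion instead of failing.
    """
    if insert_b0 and bzero is None:
        warnings.warn(
            "insert_b0 was requested but no bzero is set; ignoring it.",
            stacklevel=2,
        )
        return False
    return bool(insert_b0)
```

Making the decision once and reusing it for both the image data and the gradient files keeps the two outputs in agreement.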

> I think I am done with this: it is ready to be reviewed/merged.
>
> A few comments:
>
> Do the dwi-b0_desc-avg.nii.gz, dwi-b0_desc-brain.nii.gz and dwi-b0_desc-brain_mask.nii.gz testing data files correspond to what we should expect to be contained in dwi.h5 testing data file?

I don't remember, TBH. There's no need to stick with them if they do not meet our purposes in testing.

> The brainmask is not serialized, so if I were to test that, the round trip would fail. To be added in a separate PR.

I'm comfortable with this. I don't think nifreeze should operate on the mask, so there's no reason to write it out. It is part of the object because it is necessary at execution time. Recently, I've been thinking we could recommend that this not be a "brain" mask, but rather a conservative "brain parenchyma" mask (i.e., excluding CSF, and conservative because a binary dilation could be applied to ensure no brain tissue is excluded).

By the same argument that nifreeze should not deal with the generation of good/customized b=0s, I don't think it should take responsibility for the mask either. If something is given from the exterior, the responsibility for storing it when/if necessary should rest with the exterior.

> This block
>
> nifreeze/src/nifreeze/data/dmri.py
>
> Lines 390 to 394 in e964b96
>
>     # The b=0 volumes are those that did NOT pass b0_thres
>     b0_volumes = fulldata[..., ~gradmsk]
>     # A simple approach is to take the median across that last dimension
>     # Note that axis=3 is valid only if your data is 4D (x, y, z, volumes).
>     dwi_obj.bzero = np.median(b0_volumes, axis=3)
>
> should be put into a function so that it can be reused and changed should another strategy be preferred. To be done in a separate PR.

oesteban · 2025-06-08T07:36:46Z

My code suggestions are at jhlegarreta#1

One issue I noticed while reviewing is that, with `insert_b0=False`, the b-vecs/vals would not be written out!

Hopefully my suggestions make sense :)
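A rough sketch of the idea of writing the gradient files unconditionally, together with the write/read round-trip check, is given below. It is not the code in jhlegarreta#1: the helper names `save_gradients` and `load_gradients`, the `(4, N)` table layout, and the number formats are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_gradients(gradients, prefix):
    """Write ``<prefix>.bvec`` and ``<prefix>.bval`` as plain-text files.

    Assumes a (4, N) table: rows 0-2 are the b-vector, row 3 the b-value.
    The function takes no ``insert_b0`` argument, so it runs either way.
    """
    bvec_file = Path(f"{prefix}.bvec")
    bval_file = Path(f"{prefix}.bval")
    np.savetxt(bvec_file, gradients[:3], fmt="%.6f")  # 3 rows, N columns
    np.savetxt(bval_file, gradients[3:4], fmt="%.2f")  # 1 row, N columns
    return bvec_file, bval_file


def load_gradients(bvec_file, bval_file):
    """Read the two text files back into a single (4, N) table."""
    bvecs = np.loadtxt(bvec_file, ndmin=2)
    bvals = np.loadtxt(bval_file, ndmin=2)
    return np.vstack((bvecs, bvals))


gradients = np.array(
    [
        [0.0, 1.0, 0.0],  # x components
        [0.0, 0.0, 1.0],  # y components
        [0.0, 0.0, 0.0],  # z components
        [0.0, 1000.0, 2000.0],  # b-values
    ]
)

with tempfile.TemporaryDirectory() as tmpdir:
    files = save_gradients(gradients, Path(tmpdir) / "dwi")
    restored = load_gradients(*files)
```

Any null gradient would be prepended to the table before calling the writer, so the writing step itself never depends on the flag.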

fix: better handling of bzero & write out b-vecs/vals

jhlegarreta · 2025-06-08T14:49:22Z

@oesteban I am fine with this if you believe this is the way forward. I'd be grateful if you merged this as soon as I make the tests pass: we are back to the Stanford dataset issues 🤦‍♂️ (x-ref issue #149):
https://github.com/nipreps/nifreeze/actions/runs/15519578287/job/43690979304?pr=82#step:11:222

  E   dipy.data.fetcher.FetcherError: The downloaded file, /home/runner/.dipy/stanford_hardi/HARDI150.nii.gz, does not have the expected md5
  E      checksum of "0b18513b46132b4d1051ed3364f2acbc". Instead, the md5 checksum was: "24f8bbdb75cbd20fd79b099a10f58567". This
  E      could mean that something is wrong with the file or that the upstream file has been
  E      updated. You can try downloading the file again or updating to the newest version of
  E      dipy.

jhlegarreta force-pushed the FixNonDWIWeightedGradientRW branch 3 times, most recently from 94513d5 to 9be0147 Compare May 3, 2025 17:13

jhlegarreta force-pushed the FixNonDWIWeightedGradientRW branch from 9be0147 to 5f0c848 Compare May 28, 2025 00:40

jhlegarreta force-pushed the FixNonDWIWeightedGradientRW branch 2 times, most recently from 42d2d6f to 5effcf3 Compare June 8, 2025 01:29

jhlegarreta force-pushed the FixNonDWIWeightedGradientRW branch from 5effcf3 to 8413690 Compare June 8, 2025 01:30

jhlegarreta marked this pull request as ready for review June 8, 2025 01:31

jhlegarreta changed the title ~~BUG: Fix the non-weighted DWI data serialization~~ FIX: Add null gradient values to DWI data serialization if appropriate Jun 8, 2025

oesteban changed the title ~~FIX: Add null gradient values to DWI data serialization if appropriate~~ FIX: Revise bzero handling in dMRI data's to_nifti() Jun 8, 2025

oesteban reviewed Jun 8, 2025


oesteban added 2 commits June 8, 2025 09:27

fix: better handling of bzero & write out b-vecs/vals

959d98b

doc: improve docstring

4d2f73c

oesteban mentioned this pull request Jun 8, 2025

Design issues of parallelization #158

Open

Merge pull request #1 from nipreps/review/82

38ccdad

fix: better handling of bzero & write out b-vecs/vals

oesteban merged commit 5fd5833 into nipreps:main Jun 8, 2025
17 of 20 checks passed

jhlegarreta deleted the FixNonDWIWeightedGradientRW branch June 8, 2025 16:32
