Skip to content

File-access property lists API (+MPI / direct VFD support) #31

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 102 commits into from
Feb 13, 2019

Conversation

aldanor
Copy link
Owner

@aldanor aldanor commented Feb 10, 2019

  • Wrapped file-access property lists (FAPL) in a new type (plist::FileAccess) with its own builder; almost all properties from HDF5 C API are supported, including those version-gated or feature-gated (like mpio/direct).
  • Added a full suite of tests for both file-create (FCPL) and file-access (FAPL) property lists.
  • Removed bindgen from build dependencies (yay!); parse H5pubconf.h header manually with regex. This is a pretty big win since we no longer depend on clang-sys/libclang which may cause version compatibility problems with other crates at build time.
  • Added MPI support: enabled via feature = "mpio" in both hdf5-rs and libhdf5-sys crates; only available when h5_have_parallel is detected. Added missing FAPL/DXPL bindings.
  • Added direct VFD support: enabled automatically when h5_have_direct is detected.
  • Support pretty much all file drivers, including log/family/multi/split/mpio/direct. The latter two require custom-built HDF5 (with mpio driver requiring openmpi, and direct driver only available on Linux), so I've only tested them on my local box for now.
  • H5FD_*_init() calls are now mutex-locked (no more segfaults due to that).
  • Fixed a few erroneous raw bindings; removed a few unused raw bindings; added various missing bindings.
  • Move hdf5-rs one level up, so it's now actually the root crate which also happens to be a workspace (instead of being one of the workspace crates) -- this allows running tests with --features enabled.

This took a bit longer than expected (planned to just add simple wrappers for FAPL), but a lot of stuff got fixed in the process including the build system, plus we now support all imaginable HDF5 backends and a convenient framework for building and testing other property list types.

A few open questions:

  • Currently, particular property lists are accessible like plist::FileAccess. While that makes sense when spelled like so, FileAccess on its own doesn't - it's not obvious that it's a property list.

    • Should it be FileAccessPropertyList? (the base type is called PropertyList) That sounds too long though?...
    • Or FileAccessPL? That looks a bit cryptic... (less than "fapl" but still)
  • Currently, there's two version of each property getter:

    • #[doc(hidden)] pub fn get_foo(&self) -> Result<T>
    • pub fn foo(&self) -> T

    Wondering if it's ok?

    The main reason is to allow easy access to properties when you're sure your objects are fine and valid (which would be the case 99.999% of time). So, like, instead of doing plist.get_userblock().unwrap() you can just do a plist.userblock().

    This gets a little weird with more complicated types, e.g. if a struct is returned we now need to have something legible to return in case the getter failed for some reason. That being said, get_*() getters are still available if the user wants full control over possible errors (typically, the errors with getters would occur only if the plist became invalid, e.g. if you closed a file handle that the plist was attached to, or something like that).

Oh, and btw... property lists are now nicely printable:
FileAccess {
    alignment: Alignment {
        threshold: 1,
        alignment: 1
    },
    chunk_cache: ChunkCache {
        nslots: 521,
        nbytes: 1048576,
        w0: 0.75
    },
    fclose_degree: Default,
    gc_references: false,
    small_data_block_size: 2048,
    libver_bounds: LibVerBounds {
        low: Earliest,
        high: V110
    },
    elink_file_cache_size: 0,
    meta_block_size: 2048,
    page_buffer_size: PageBufferSize {
        buf_size: 0,
        min_meta_perc: 0,
        min_raw_perc: 0
    },
    evict_on_close: false,
    mdc_image_config: CacheImageConfig {
        generate_image: false,
        save_resize_status: false,
        entry_ageout: -1
    },
    sieve_buf_size: 65536,
    metadata_read_attempts: 1,
    mdc_log_options: CacheLogOptions {
        is_enabled: false,
        location: "",
        start_on_access: false
    },
    all_coll_metadata_ops: false,
    coll_metadata_write: false,
    mdc_config: MetadataCacheConfig {
        rpt_fcn_enabled: false,
        open_trace_file: false,
        close_trace_file: false,
        trace_file_name: "",
        evictions_enabled: true,
        set_initial_size: true,
        initial_size: 2097152,
        min_clean_fraction: 0.30000001192092896,
        max_size: 33554432,
        min_size: 1048576,
        epoch_length: 50000,
        incr_mode: Threshold,
        lower_hr_threshold: 0.8999999761581421,
        increment: 2.0,
        apply_max_increment: true,
        max_increment: 4194304,
        flash_incr_mode: AddSpace,
        flash_multiple: 1.0,
        flash_threshold: 0.25,
        decr_mode: AgeOutWithThreshold,
        upper_hr_threshold: 0.9990000128746033,
        decrement: 0.8999999761581421,
        apply_max_decrement: true,
        max_decrement: 1048576,
        epochs_before_eviction: 3,
        apply_empty_reserve: true,
        empty_reserve: 0.10000000149011612,
        dirty_bytes_threshold: 262144,
        metadata_write_strategy: Distributed
    },
    driver: Mpio(
        MpioDriver {
            comm: 0x00007fa0e2a3da30,
            info: 0x00007fa0e2a3e950
        }
    )
}

@aldanor
Copy link
Owner Author

aldanor commented Feb 11, 2019

(Failures on both AppVeyor and Travis are reasonable, will be fixed tonight; kind of proves that a proper test suite is not just a waste of time).

Would appreciate if anyone could take at least a very brief look at this since it's quite a lot of stuff :) (@pmarks, @Enet4?)

The next step (and the last step before 0.3.0 release) would be refactoring a major portion of File, particularly opening/creating files and dealing with fcpl/fapl properties. I'll opening a separate discussion issue for that since it would involve some bikeshedding.

@pmarks
Copy link
Contributor

pmarks commented Feb 11, 2019

@aldanor very cool! agreed that dropping the bindgen requirement is a nice win. Things look reasonable, but I have no experience with fapls and fancy IO modes, so I'm not really qualified to comment.

@aldanor
Copy link
Owner Author

aldanor commented Feb 11, 2019

Thanks for comment @pmarks :)

I guess the most user-visible change after this gets merged and subsequent changes to File get implemented, would be removing things like userblock() (which belongs to fcpl) or filebacked() (which belongs to core driver of fapl) from FileBuilder entirely (and actually probably removing the whole FileBuilder altogether).

This would make things more verbose, but more logical and organized as well (hopefully not too verbose).

So, the current

let file = FileBuilder::new()
    .userblock(1024)   // belongs to FCPL H5P API
    .driver("core")    // belongs to FAPL H5P API
    .filebacked(false) // belongs to core-driver portion of FAPL H5P API
    .mode("x")         // belongs to H5F file API
    .open("foo.h5")?;  // H5F file API as well

would be more of something like this (preliminary sketch):

let fapl = plist::FileAccess::build().core().finish()?;          // all of H5P FCPL stuff
let fcpl = plist::FileCreate::build().userblock(1024).finish()?; // all of H5P FAPL stuff
let file = File::create_excl_with(&fcpl, &fapl, "foo.h5")?;      // all of H5F file stuff

To make the latter less verbose, File constructors could accept plist builders or refs to them directly I guess (so you don't even have to call .finish()?); also, could export new_fcpl(), new_fapl() at the crate root level, etc.

(Note that most of the time you'd just simply use File::create_excl("foo.h5") or maybe File::open("foo.h5").)

@aldanor aldanor mentioned this pull request Feb 12, 2019
9 tasks
@Enet4
Copy link

Enet4 commented Feb 12, 2019

Alas, my experience with FAPL and related contents is also quite limited. But if I can provide at least some assurance, I did not notice anything reprehensible in the changes, and all tests pass on my machine.

I also agree with the idea of improving the File API. I already started using the crate for my own tools, and there are probably some interactions that could be improved.

@aldanor
Copy link
Owner Author

aldanor commented Feb 12, 2019

@Enet4 thanks for checking it out.

there are probably some interactions that could be improved

Would love to hear if there’s anything aside from what’s been already discussed (reworking file creating / opening api to be more like stdlib and integrating fcpl/ fapl in a more hdf5-like manner).

@aldanor aldanor merged commit 9d8fe6e into master Feb 13, 2019
@aldanor aldanor deleted the feature/file_access branch February 13, 2019 00:03
mulimoen pushed a commit to mulimoen/hdf5-rust that referenced this pull request Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants