Our Expectation:

What we got!!

IOR compiled and linked against PHDF5 releases from 2003 to 2019

IOR executed with 1 MB transfer size and 1 MB block size

srun --export=ALL $INSTALL_DIR/bin/ior -a HDF5 -b 1G -t 1G -C -e

Platform: H5CLUSTER v1.0.14

Instance Type: m5d.metal (96 cores, 4× NVMe, 25 Gb/s Ethernet)
ior -a HDF5 -b 10GB -t 1GB ; ior -a MPIIO -b 10GB -t 1GB
Throughput gain: 3 GB/s with each node added

Live Web page: https://www.hdfgroup.org/solutions/hdf-kita/hdf-kita-architecture/

Live Web page: https://nrel.github.io/hsds-viz/

H5CPP

the first non-intrusive persistence with MPI-IO and KITA™

for modern C++

Created by Steven Varga in cooperation with Gerd Heber, The HDF Group

Online version of this presentation: http://presentation.h5cpp.org

Take a header file with a POD struct


typedef unsigned long long int MyUInt;
namespace sn {
	namespace example {
		struct Record {
			MyUInt        field_01;
			char          field_02;
			double        field_03[3];
			other::Record field_04[4];
		};
	}
}
  • typedefs are fine
  • nested namespaces are OK
  • `char field_02` is mapped to H5T_NATIVE_CHAR
  • `double field_03[3]` becomes H5Tarray_create(H5T_NATIVE_DOUBLE, 1, ...)
  • first `other::Record` is parsed: type_hid_t = ...
  • then the generated type is used: H5Tarray_create(type_hid_t, ...)
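
The snippet references `other::Record` from a separate header; a minimal sketch of what it could look like (the exact layout is an assumption, chosen to match the seven-element H5T_NATIVE_FLOAT array in the generated descriptor below):

namespace other {
	struct Record {
		float  idx[7];    // fixed-size array member: maps to H5Tarray_create(H5T_NATIVE_FLOAT, 1, {7})
		double value;
	};
}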

Write your program

Write your C++ program as if `generated.h` were already written:
#include "some_header_file.h"
#include <h5cpp/core>
	#include "generated.h"
#include <h5cpp/io>
int main(){
	std::vector<sn::example::Record> stream =
		...
	h5::fd_t fd = h5::create("example.h5",H5F_ACC_TRUNC);
	h5::pt_t pt = h5::create<sn::example::Record>(
		fd, "stream of struct",
		h5::max_dims{H5S_UNLIMITED,7}, h5::chunk{4,7} | h5::gzip{9} );
	...
}
  • sandwich the not-yet-existing `generated.h` between the H5CPP headers
  • write the translation unit (TU) as usual
  • using the POD type with one of the H5CPP CRUD-like operators h5::create | h5::write | h5::read | h5::append will trigger the `h5cpp` compiler to generate the code

A header file with HDF5 Compound Type descriptors:

#ifndef H5CPP_GUARD_ErRrk
#define H5CPP_GUARD_ErRrk
namespace h5{
    template<> hid_t inline register_struct<sn::example::Record>(){
        hsize_t at_00_[] ={7};            hid_t at_00 = H5Tarray_create(H5T_NATIVE_FLOAT,1,at_00_);
        hsize_t at_01_[] ={3};            hid_t at_01 = H5Tarray_create(H5T_NATIVE_DOUBLE,1,at_01_);
        hid_t ct_00 = H5Tcreate(H5T_COMPOUND, sizeof (sn::typecheck::Record));
        H5Tinsert(ct_00, "_char",	HOFFSET(sn::typecheck::Record,_char),H5T_NATIVE_CHAR);
		...
		H5Tclose(at_03); H5Tclose(at_04); H5Tclose(at_05); 
        return ct_02;
    };
}
H5CPP_REGISTER_STRUCT(sn::example::Record);
#endif
  • random include guards
  • defined within the `h5` namespace
  • template specialization for the h5::operators
  • compound types are created recursively
  • the h5::operators call the template specialization when they need it
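
The same mechanism can also be written by hand when running the `h5cpp` compiler is not an option; a minimal sketch for a simple POD (`pod_t` is made up for illustration):

struct pod_t { int x; double y; };        // made-up POD for illustration
namespace h5 {
    template<> hid_t inline register_struct<pod_t>(){
        hid_t ct = H5Tcreate(H5T_COMPOUND, sizeof(pod_t));
        H5Tinsert(ct, "x", HOFFSET(pod_t, x), H5T_NATIVE_INT);
        H5Tinsert(ct, "y", HOFFSET(pod_t, y), H5T_NATIVE_DOUBLE);
        return ct;                        // caller closes the returned handle
    }
}
H5CPP_REGISTER_STRUCT(pod_t);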

CAPI hid_t Conversion Policy

tightens control over how the HDF5 C API and H5CPP calls interact. In the default case, seamless conversion permits handles from the two APIs to be used interchangeably.
#define H5CPP_CONVERSION_IMPLICIT
#define H5CPP_CONVERSION_EXPLICIT
#define H5CPP_CONVERSION_FROM_CAPI_DISABLED
#define H5CPP_CONVERSION_TO_CAPI_DISABLED
h5::fd_t fd = h5::open( ... );
H5capi_call( static_cast<hid_t>(fd), ... );

hid_t capi_fd = H5Fopen( ... );
h5::read(capi_fd, "dataset", ...);
  • seamless conversion between H5CPP and the C API is allowed; this is the default behaviour
  • passing from/to the C API is possible with `static_cast` only
  • compile-time error: the h5::operators will not take a C API hid_t
  • compile-time error: no conversion from h5::hid_t<T> to a C API hid_t
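
The policy is selected by defining one of the macros above before including the H5CPP headers; a sketch of the explicit policy (file name and intent query are illustrative):

#define H5CPP_CONVERSION_EXPLICIT        // must precede the H5CPP headers
#include <h5cpp/all>
...
h5::fd_t fd = h5::open("example.h5", H5F_ACC_RDONLY);
hid_t capi = static_cast<hid_t>(fd);     // OK: an explicit cast is permitted
unsigned intent;
H5Fget_intent(capi, &intent);            // plain C API call on the cast handle
// H5Fget_intent(fd, &intent);           // would not compile: no implicit conversion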

The Packet Table:

#include <h5cpp/core>
    #include "generated.h"
#include <h5cpp/io>
int main(...){
    std::vector<sn::example::Record> stream = ... ;
    h5::fd_t fd = h5::create("example.h5", H5F_ACC_TRUNC);
    h5::pt_t pt = h5::create<sn::example::Record>(fd, "stream of struct",
                    h5::max_dims{H5S_UNLIMITED,7}, h5::chunk{4,7} | h5::gzip{9} );
    for( auto record : stream )
        h5::append(pt, record);
}
  • works with H5CPP compiler-assisted reflection
  • h5::ds_t automatically converts to h5::pt_t (see the sketch below)
  • the same property lists may be used as with the HDF5 pipeline
  • use the low-latency h5::append in tight loops
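
Because of the h5::ds_t to h5::pt_t conversion, an already existing dataset can be appended to as well; a minimal sketch reusing the names from above:

h5::fd_t fd = h5::open("example.h5", H5F_ACC_RDWR);
h5::pt_t pt = h5::open(fd, "stream of struct");   // the returned h5::ds_t converts to h5::pt_t
h5::append(pt, record);                           // record is an sn::example::Record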

Experimental high throughput pipeline

is a new H5CPP-specific feature where the code path avoids the traditional HDF5 C API processing pipeline. Instead, an L0-L3 cache-aware blocking mechanism based on the Level 3 BLAS blocking algorithm is used, and the data is delegated to the direct chunk read | write calls introduced in C API v1.10.4.
h5::ds_t ds = h5::open(fd,"movie", 
    h5::high_throughput);

h5::pt_t pt = h5::create(fd,"append scalar",
    h5::max_dims{H5S_UNLIMITED,nrows,ncols}, h5::chunk{1,nrows,ncols},
    h5::high_throughput );
  • may be activated with h5::high_throughput dxpl
  • works with all h5::operators
This feature currently doesn't support any filters, but it is bare-metal fast.
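
The handle opened above works with the regular operators; a sketch overwriting frame k of the "movie" dataset (assuming float elements; frame and k are illustrative names):

std::vector<float> frame(nrows * ncols);
h5::write( ds, frame,                    // any h5::operator accepts the handle
    h5::count{1, nrows, ncols}, h5::offset{k, 0, 0} );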

Attributes:

do the right thing and come with an easy-to-use operator. Here are some examples:

h5::ds_t ds = h5::write(fd,"some dataset with attributes", ... );
ds["att_01"] = 42 ;
ds["att_02"] = {1.,3.,4.,5.};
ds["att_03"] = {'1','3','4','5'};
ds["att_04"] = {"alpha", "beta","gamma","..."};
ds["att_05"] = "const char[N]";
ds["att_06"] = u8"const char[N]áééé";
ds["att_07"] = std::string( "std::string");
ds["att_08"] = record; // pod/compound datatype
ds["att_09"] = vector; // vector of pod/compound type
ds["att_10"] = matrix; // linear algebra object
  • obtain a handle by h5::create | h5::open | h5::write
  • rank-N objects, even compound types when the h5cpp compiler is used
  • arrays of various element types
  • strings are mapped to rank-0 variable-length character types

This work is still under development; please provide feedback...
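
Reading an attribute back is symmetric; a sketch assuming the h5::aread operator of the attribute API:

auto values = h5::aread<std::vector<double>>(ds, "att_02");   // reads back {1., 3., 4., 5.}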

MPI-IO made easy

H5CPP with parallel HDF5
#include <mpi.h>
#include <h5cpp/all>
...
int main(int argc, char** argv) {
    ...
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    ...
    { /* for H5CPP, see next slide */ }
    ...
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
}
  • start your program as usual
  • write the MPI boilerplate

MPI-IO made easy

H5CPP with parallel HDF5
{
std::vector<double> v(nrows);            // element type double chosen for illustration
auto fd = h5::create("hdf5_container_name.h5", H5F_ACC_TRUNC,
               h5::fcpl,
	           h5::mpiio({mpi_comm, mpi_info}) );
h5::write( fd, "dataset", v,
	h5::chunk{nrows,1}, h5::current_dims{nrows,size}, h5::count{nrows,1},
	h5::offset{0,rank}, h5::block{1,1}, h5::stride{1,1},
	h5::collective );
}
  • use a code block to activate H5CPP RAII
  • arbitrary objects work, as long as you can get a pointer to the memory
  • pass `h5::mpiio` file access property list
  • control IO with `rank`
  • set `h5::independent` | `h5::collective`
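
Reading the per-rank slab back mirrors the write; a sketch under the same layout (the raw-pointer overload of h5::read is assumed):

std::vector<double> u(nrows);
h5::read( fd, "dataset", u.data(),
	h5::count{nrows,1}, h5::offset{0,rank},
	h5::collective );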

MPI-IO diagnostics

H5CPP with parallel HDF5
...
h5::dxpl_t dxpl = h5::collective;

h5::write( fd, "dataset", v,
	h5::chunk{nrows,1}, h5::current_dims{nrows,size}, h5::count{nrows,1},
	h5::offset{0,rank}, h5::block{1,1}, h5::stride{1,1},
	dxpl );

std::cout << dxpl <<"\n";
...
  • make a copy of the h5::collective | h5::independent properties
  • pass the mutable dxpl to the IO operators
  • print the result to std::cout

Live Webpage: http://h5cpp.org