H5CPP Type System#

some text


| operator | data space | data type | dimension |
|---|---|---|---|
| h5::write(fd, "path", {"1","2","...","N"}) | array | H5T_ARRAY { [N] H5T_C_S1 VL } | N |
| h5::write(fd, "path", array) | array | H5T_ARRAY { [N] H5T_C_S1 VL } | N |
| h5::write(fd, "path", vector) | hypercube | H5T_C_S1 VL | N |
| h5::write(fd, "path", vector, h5::current_dims{M}, h5::max_dims{N}, h5::offset{K}) | hypercube | H5T_C_S1 VL | N |

fixed length#

const char var[][10] = {"A","B","C","...","Z"};
auto ds = h5::create<char[5][10]>(this->fd, this->name);
h5::write(ds, var);

variable length#

Type Traits for internal use#

Within the h5::impl namespace, the following traits are defined:

template data;
template std::array size( const T& ref );
template struct get { static inline T ctor( std::array dims ){ return T(); }};

| sequence containers | c++ layout | shape |
|---|---|---|
| std::array<T> | standard | vector |
| std::vector<T> | standard | hypercube |
| std::deque<T> | no | ragged |
| std::forward_list | no | ragged |
| std::list<T> | no | ragged |

| associative containers | c++ layout | shape |
|---|---|---|

| HDF5 data type | STL like containers | dataset shape |
|---|---|---|
| bitfield | std::vector<bool> | ragged |
| VL string | sequential<std::string> | ragged |
| arithmetic | std::vector | regular |

| STL like objects | HDF5 data type | dataset shape |
|---|---|---|

Objects and type classification#

arithmetic ::= (signed | unsigned)[char | short | int | long | long int | long long int]
                      | [float | double | long double]
reference ::= [ pointer | R value reference | PR value reference]
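
The arithmetic category above corresponds closely to what the standard library reports through `std::is_arithmetic`. The sketch below is only an illustration of that correspondence: `is_arithmetic_cell` is a hypothetical helper name, not an H5CPP identifier, and note that `std::is_arithmetic` also accepts `bool` and the wide character types, which the grammar does not list.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (not part of H5CPP): true for types the standard
// library classifies as arithmetic, i.e. integral or floating point.
template <class T>
constexpr bool is_arithmetic_cell() {
    return std::is_arithmetic<T>::value;
}

// Integral and floating point types qualify, with or without cv-qualifiers.
static_assert(is_arithmetic_cell<unsigned short>(), "integral type");
static_assert(is_arithmetic_cell<long long int>(), "integral type");
static_assert(is_arithmetic_cell<const long double>(), "floating point type");

// Pointers, references and class types do not.
static_assert(!is_arithmetic_cell<int*>(), "pointer");
static_assert(!is_arithmetic_cell<int&&>(), "rvalue reference");
static_assert(!is_arithmetic_cell<std::string>(), "class type");
```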

What you need to know of C++ data types#

The way objects are arranged in memory is called the layout. The C++ memory model is more relaxed than the one in C or Fortran; therefore, one can't assume that class members are arranged contiguously, or even that they appear in the order in which they were defined. Data transfer operations in HDF5 require a contiguous memory arrangement, which creates a mismatch between the two systems. C++ objects may be categorized by memory layout, such as:

some text
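
As a rough illustration of how layout categories can be detected at compile time, the sketch below uses the standard `<type_traits>` queries. The three record types and the helper `can_transfer_directly` are hypothetical names chosen for this example; they are not H5CPP identifiers, and the rule the helper encodes is an assumption about what a raw, single-buffer transfer needs.

```cpp
#include <type_traits>
#include <vector>

struct pod_record {      // plain members, one access level
    int id;
    double value;
};

class mixed_access {     // members under different access control
public:
    int a;
private:
    int b;
};

struct owns_heap {       // the vector keeps its elements on the heap
    std::vector<double> data;
};

// Hypothetical rule of thumb: an object can be handed over as one raw
// buffer only if it is both standard layout and trivially copyable.
template <class T>
constexpr bool can_transfer_directly() {
    return std::is_standard_layout<T>::value &&
           std::is_trivially_copyable<T>::value;
}

static_assert(can_transfer_directly<pod_record>(), "POD struct");
static_assert(!std::is_standard_layout<mixed_access>::value, "mixed access control");
static_assert(!std::is_trivially_copyable<owns_heap>::value, "owns heap storage");
```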

Some C++ classes are treated specially, as they almost fulfill the standard layout requirements. Linear algebra libraries with data structures supporting BLAS/LAPACK calls, e.g. arma::Mat<T>, or STL-like objects with contiguous memory layout such as std::vector<T> and std::array<T,N>, may be converted into a standard layout type by providing shim code that grabs a pointer to the underlying memory and its size. Indeed, previous versions of H5CPP supported only objects whose underlying data could be easily obtained.
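
The shim idea can be sketched with two small overload sets: one returning a pointer to the contiguous storage, the other returning the extents. The namespace `sketch` and the exact signatures are assumptions made for illustration; they are not the actual h5::impl declarations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {  // hypothetical namespace, not h5::impl

// Pointer to the first element of the contiguous storage.
template <class T>
const T* data(const std::vector<T>& ref) { return ref.data(); }

template <class T, std::size_t N>
const T* data(const std::array<T, N>& ref) { return ref.data(); }

// Extent of the single dimension, returned as a rank-1 shape.
template <class T>
std::array<std::size_t, 1> size(const std::vector<T>& ref) { return {{ref.size()}}; }

template <class T, std::size_t N>
std::array<std::size_t, 1> size(const std::array<T, N>&) { return {{N}}; }

}  // namespace sketch

// Usage: both containers are reduced to a (pointer, extent) pair.
inline void shim_example() {
    std::vector<double> v{1.0, 2.0, 3.0};
    assert(sketch::data(v) == v.data());
    assert(sketch::size(v)[0] == 3);
}
```

With such a pair in hand, a transfer routine no longer needs to know which container it was given, only where the data starts and how many elements follow.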

HDF5 dataset shapes#

Scalars, Vectors, Matrices and Hypercubes#

some text some text some text
Scalars, vectors, matrices and hypercubes are the most frequently used objects, and their cells may take up any fixed-size data format. STL-like sequential and set containers, as well as C++ built-in arrays, may be mapped to HDF5 data structures of 0 to 7 dimensions that are homogeneous and regular in shape. Note that std::array<T,N> requires the size N to be known at compile time; therefore, it is only suitable for partial IO read operations.
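
For C++ built-in arrays the shape is already encoded in the type, so it can be recovered with `std::rank` and `std::extent`. The following is a minimal sketch of that idea; `shape_of` and `fill_dims` are hypothetical helpers written for this example, and the 7-dimension check simply mirrors the limit stated above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Recursively copies the extent of dimension I into dims[I].
template <class T, std::size_t I, std::size_t R>
struct fill_dims {
    static void apply(std::array<std::size_t, R>& dims) {
        dims[I] = std::extent<T, I>::value;
        fill_dims<T, I + 1, R>::apply(dims);
    }
};

// End of recursion: all R dimensions have been visited.
template <class T, std::size_t R>
struct fill_dims<T, R, R> {
    static void apply(std::array<std::size_t, R>&) {}
};

// Hypothetical helper: returns the extents of a built-in array type.
// A non-array type has rank 0 and yields an empty shape (a scalar).
template <class T>
std::array<std::size_t, std::rank<T>::value> shape_of() {
    static_assert(std::rank<T>::value <= 7, "at most 7 dimensions are mapped");
    std::array<std::size_t, std::rank<T>::value> dims{};
    fill_dims<T, 0, std::rank<T>::value>::apply(dims);
    return dims;
}

inline void shape_example() {
    const auto dims = shape_of<char[5][10]>();  // 5 rows of 10 chars
    assert(dims.size() == 2);
    assert(dims[0] == 5);
    assert(dims[1] == 10);
}
```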

T::= arithmetic | pod_struct

Ragged Arrays#

VL datatypes are designed to allow the amount of data stored in each element of a dataset to vary. This change could be over time as new values, with different lengths, were written to the element. Or, the change can be over "space" - the dataset's space, with each element in the dataset having the same fundamental type, but different lengths. "Ragged arrays" are the classic example of elements that change over the "space" of the dataset. If the elements of a dataset are not going to change over "space" or time, a VL datatype should probably not be used.

T::= arithmetic | pod_struct
element_t ::= std::string | std::vector<T> | std::list<T> | std::forward_list<T>
some text

Sequences of variable lengths are mapped to HDF5 ragged arrays, a data structure whose fastest growing dimension has variable length. The C++ equivalent is a container within a sequential container, with embedding limited to one level.
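
One way to picture the mapping is to reduce each inner container to a length and a pointer, which is the information HDF5's `hvl_t` carries for a variable-length element. The sketch below defines a local stand-in, `vl_element`, so that it compiles without the HDF5 headers; the struct and the helper `describe_rows` are hypothetical names for this example, not H5CPP identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in with the same idea as HDF5's hvl_t: a length plus a pointer.
struct vl_element {
    std::size_t len;
    const void* p;
};

// Builds one descriptor per row. The rows keep ownership of their data,
// so `rows` must outlive the returned descriptors.
template <class T>
std::vector<vl_element> describe_rows(const std::vector<std::vector<T>>& rows) {
    std::vector<vl_element> out;
    out.reserve(rows.size());
    for (const auto& row : rows)
        out.push_back(vl_element{row.size(), row.data()});
    return out;
}

inline void ragged_example() {
    const std::vector<std::vector<int>> rows{{1, 2, 3}, {4}, {}};
    const auto d = describe_rows(rows);
    assert(d.size() == 3);   // one descriptor per row
    assert(d[0].len == 3);   // rows differ in length
    assert(d[2].len == 0);
}
```

Inner containers without contiguous storage, such as std::list<T>, would first have to be copied into a contiguous buffer before a descriptor like this could point at them.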

Mapping C++ Non Standard Layout Classes#

Since the member variables of such classes occupy non-consecutive memory locations, the data transfer needs to be broken into multiple pieces.
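
A minimal sketch of one way to break the transfer into pieces: gather each member into its own contiguous column, so that every column is a homogeneous sequence on its own. The types `sample` and `columns` and the function `split` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record; the std::string member keeps its characters on the
// heap, so a vector of these records is not one contiguous block of data.
struct sample {
    int id;
    double value;
    std::string label;
};

// One contiguous, homogeneous column per member.
struct columns {
    std::vector<int> id;
    std::vector<double> value;
    std::vector<std::string> label;
};

// Copies each member of every record into its matching column.
inline columns split(const std::vector<sample>& records) {
    columns out;
    out.id.reserve(records.size());
    out.value.reserve(records.size());
    out.label.reserve(records.size());
    for (const auto& r : records) {
        out.id.push_back(r.id);
        out.value.push_back(r.value);
        out.label.push_back(r.label);
    }
    return out;
}

inline void split_example() {
    const std::vector<sample> recs{{1, 0.5, "a"}, {2, 1.5, "b"}};
    const columns c = split(recs);
    assert(c.id.size() == 2);
    assert(c.value[1] == 1.5);
    assert(c.label[0] == "a");
}
```

The arithmetic columns can then be handed over as regular one-dimensional sequences, while the string column remains a variable-length sequence.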

multiple homogeneous datasets#

some text

single dataset: compound data type#

some text

multiple records