|
HighFive 3.2.0
HighFive - Header-only C++ HDF5 interface
|
First clone the repository and remember the --recursive:
The instructions to recover if you forgot are:
One remark on submodules: each HighFive commit expects that the submodules are at a particular commit. The catch is that performing git checkout will not update the submodules automatically. Hence, sometimes a git submodule update --recursive might be needed to checkout the expected version of the submodules.
The instructions for compiling with examples and unit-tests are:
You might want to add:
-DHIGHFIVE_TEST_BOOST=On or other optional dependencies on,-DHIGHFIVE_MAX_ERRORS=3 to only show the first three errors.Generic CMake reminders:
-DCMAKE_INSTALL_PREFIX defines where HighFive will be installed,-DCMAKE_PREFIX_PATH defines where *Config.cmake files are found.There's numerous HDF5 features that haven't been wrapped yet. HighFive is a collaborative effort to slowly cover ever larger parts of the HDF5 library. The process of contributing is to fork the repository and then create a PR. Please ensure that any new API is appropriately documented and covered with tests.
The project is formatted using clang-format version 12.0.1 and CI will complain if a commit isn't formatted accordingly. The .clang-format is at the root of the git repository. Conveniently, clang-format is available via pip.
Formatting the entire code base can be done with:
which will install the required version of clang-format in a venv called .clang-format-venv.
To format only the changed files git-clang-format can be used.
Before releasing a new version perform the following:
CHANGELOG.md and AUTHORS.txt as required.CMakeLists.txt and include/highfive/H5Version.hpp.At this point there should be a commit on master. Now create the release.
Tag: v${VERSION} Title: v${VERSION} Body: copy-paste CHANGELOG.md
Next:
*.tar.gz) and compute its SHA256.Input array of any dimension and type can be generated using the template class DataGenerator. For example:
Generates an std::vector<std::array<double, 2>> initialized with suitable values.
If "suitable" isn't specific enough, one can specify a callback:
The dims can be generated via testing::DataGenerator::default_dims or by using testing::DataGenerator::sanitize_dims. Remember, that certain containers are fixed size and that we often compute the number of elements by multiplying the dims.
To generate a single "suitable" element use template class DefaultValues, e.g.
To access a particular element from an unknown container use the following trait:
Use testing::DataGenerator::allocate to allocate an array (without filling it) and testing::copy to copy an array from one type to another. There's testing::ravel, testing::unravel and testing::flat_size to compute the position in a flat array from a multi-dimensional index, the reverse and the number of element in the multi-dimensional array.
Due to how HighFive is written testing DataSet and Attribute often requires duplicating the entire test code because somewhere a createDataSet must be replaced with createAttribute. Use testing::AttributeCreateTraits and testing::DataSetCreateTraits. For example,
All tests for reading/writing whole multi-dimensional arrays to datasets or attributes belong in tests/unit/test_all_types.cpp. This includes write/read cycles; checking all the generic edges cases, e.g. empty arrays and mismatching sizes; and checking non-reallocation.
Read/Write cycles are implemented in two distinct checks. One for writing and another for reading. When checking writing we read with a "trusted" multi-dimensional array (a nested std::vector), and vice versa when checking reading. This matters because certain bugs, like writing a column major array as if it were row-major can't be caught if one reads it back into a column-major array.
Remember, std::vector<bool> is very different from all other std::vectors.
Every container template<class T> C; should at least be checked with all of the following Ts that are supported by the container: bool, double, std::string, std::vector, std::array. The reason is bool and std::string are special, double is just a POD, std::vector requires dynamic memory allocation and std::array is statically allocated.
Similarly, each container should be put inside an std::vector and an std::array.
Write-read cycles for scalar values should be implemented in tests/unit/tests_high_five_scalar.cpp.
Unit-tests related to checking that DataType API, go in tests/unit/tests_high_data_type.cpp.
Check related to empty arrays to in tests/unit/test_empty_arrays.cpp.
Anything selection related goes in tests/unit/test_high_five_selection.cpp. This includes things like ElementSet and HyperSlab.
Regular write-read cycles for strings are performed along with the other types, see above. This should cover compatibility of std::string with all containers. However, additional testing is required, e.g. character set, padding, fixed vs. variable length. These all go in tests/unit/test_string.cpp.
If containers, e.g. Eigen::Matrix require special checks those go in files called tests/unit/test_high_five_*.cpp where * is eigen for Eigen.
In HighFive we make assumptions about the memory layout of certain types. For example, we assume that
is a sensible thing to do. We assume similar about bool and details::Boolean. These types of tests go into tests/unit/tests_high_five_memory_layout.cpp.
Anything H5Easy related goes in files with the appropriate name.
What's left goes in tests/unit/test_high_five_base.cpp. This covers opening files, groups, dataset or attributes; checking certain pathological edge cases; etc.
In tests/cmake_integration we test that HighFive can be used in downstream projects; and that those project can, in turn, be used.
We'll refer to the process of embedding a copy of HighFive into a consuming library as vendoring. The vendoring strategies will include the strategy to not vendor.
There's two broad strategies for integrating HighFive into other code: finding or vendoring. When finding HighFive, the assumption by the consumer is that it's been installed somewhere and it can find it. When vendoring, the consumer brings their own copy of HighFive and uses it. The different vendoring strategies are:
find_package (or find_dependency when called from *Config.cmake).add_subdirectory is used to bring HighFive and all it's targets into the consumer.These refer to downstream projects picking different HighFive targets to "link" with HighFive. There's four: two regular targets, a target that only adds -I <dir> and one that skips all HighFive CMake code.
There are several ways of indicating where to find a package:
HighFiveConfig.There's two types of directories where a dependency can be located:
cmake --install build.HighFive_ROOT.export()Furthermore, there's export(...). Documentation describes it as being useful for cross-compilation, when one wants to have a set of host tool along with a library compiled for the device. It seems we don't need it in HighFive and can make it and easily write test consumers work perfectly without.
However, if one of our consumers adds export(...) to their CMakeLists.txt then their build breaks, complaining about missing HighFive targets (and it seems they can't "fix it up" on their end because then CMake complains that there's duplicate exported HighFive related targets).
The second way omitting the missing export() can break downstream projects is if they attempt to use HighFive's build directory (not install directory) as HighFive_ROOT (or CMAKE_PREFIX_PATH).
This is the idea that libraries behave differently depending on whether they use HighFive in their distributed headers; or not. If they don't they could attempt to hide the fact that they use HighFive from their users.
Currently, we don't check that libraries can hide the use of HighFive.
In what follows application will refer to code that compiles into an executable. While libraries refer to code that compiles into a binary that other developers will link to in their library or application.
Libraries and applications that have dependencies that use HighFive must be able to agree on a common version of HighFive they want to use. Otherwise, during the final linking phase, multiple definitions of the same HighFive symbols will exist, and they linker will pick one (arbitrarily); which is only safe if all definitions are identical.
The script to check everything is unwieldy. Here's a summary of what it attempts to do.
There's three downstream project in play: dependent_library is a library that uses HighFive in its public API, application is an executable that uses HighFive directly, test_dependent_library is an application that uses dependent_library its purpose is to check that our users can write CMake code that makes them integrate well with other projects.
The conceptually easy choices are:
find_package since it's the easiest way of injecting a common version of HighFive everywhere.Since we can't (and don't want to) force our consumers to use find_package and ban vendoring, we have to test what happens when libraries vendor HighFive. (Many of these are likely sources of headache if you try to figure out which code is when using which of the multiple copies of HighFive involved; and how they decide to use that version.)
We'll assume that there's some way that all involved projects agree on a single version of HighFive (i.e. the exact same source code or git commit).
The dependent_library will integrate HighFive using any of the following strategies: external in two variations one from HighFive's install dir and the other from HighFive's build dir; submodule in two variations which uses add_subdirectory once with and once without EXCLUDE_FROM_ALL, fetch_content in one variation.
The test_dependent_library itself always incorporates dependent_library using find_package (Config mode). Since that layer might choose any of the strategies of integrating HighFive, we again check several. Additionally, test_dependent_library isn't (and probably shouldn't be) required to integrate HighFive directly: option none.
Imagine a script that tries all combinations of the above; and attempts to only provide hints for the HighFive package location when needed, e.g. for none there's a find_dependency in Hi5DependentConfig that needs to be told where to look. This is what test_cmake_integration.sh attempts to do.