Online visualisation, exploration and analysis of HDF5 files with H5Web

PaNOSC - Photon and Neutron Science
Partners:
ESRF, HDF Group

H5Web is a web tool for easily browsing, inspecting and visualising data in HDF5 files. It was first integrated into the ESRF’s data portal and JupyterLab, and has since been integrated into multiple other data portals. It is also available as a web service and as an extension for the widely used development tool VS Code. HDF5 is a format for storing data and metadata in a file-system-like hierarchy. It is maintained and managed by the HDF Group.
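To make the "file-system-like" layout concrete, here is a minimal sketch in Python. It assumes the h5py and NumPy packages, which this page does not mention, and every file, group, dataset and attribute name in it is made up for illustration. It writes a small file and then lists each object with its metadata, roughly the kind of tree a viewer lets you browse.

```python
import os
import tempfile

import h5py
import numpy as np


def describe(h5_path):
    """Return one line per object in the file: path, kind and attributes."""
    lines = []

    def visit(name, obj):
        # Groups behave like folders, datasets like files holding arrays.
        kind = "group" if isinstance(obj, h5py.Group) else "dataset"
        attrs = {key: str(value) for key, value in obj.attrs.items()}
        lines.append(f"/{name} ({kind}) {attrs}")

    with h5py.File(h5_path, "r") as h5file:
        h5file.visititems(visit)
    return lines


# Build a small example file (all names below are hypothetical).
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as h5file:
    scan = h5file.create_group("scan_1")
    scan.attrs["sample"] = "hypothetical sample"  # metadata on a group
    image = scan.create_dataset("image", data=np.arange(12.0).reshape(3, 4))
    image.attrs["units"] = "counts"  # metadata on a dataset

for line in describe(path):
    print(line)
```

Groups play the role of folders, datasets the role of files, and attributes carry the metadata attached to either.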

Video presentations 
European HDF Users Group meeting in July 2021
Jupyter Community Call of April 2021
 

H5Web is an HDF5 file viewer written in React that uses WebGL for its visualisations. Its components, which include the main visualisation components (LineVis, HeatmapVis, etc.) as well as some of their lower-level building blocks, are exported in a library called @h5web/lib.
You can play around with the visualisation components in the Storybook documentation and reuse them by installing the NPM package @h5web/lib. The viewer application as a whole is available in the NPM package @h5web/app.

Some of the H5Web tools developed and maintained by PaNOSC:

- jupyterlab-h5web, a JupyterLab extension to explore and visualise HDF5 file contents, based on H5Web. Repo for installation and usage: GitHub – silx-kit/jupyterlab-h5web
- vscode-h5web, the VS Code extension; the code is at https://github.com/silx-kit/vscode-h5web
- myhdf5, an online web service for viewing HDF5 files without installing any software or transferring data - https://myhdf5.hdfgroup.org

H5Web has become the de facto solution for viewing HDF5 files on the web. Using a web technology such as H5Web means that users don’t have to download large HDF5 files to their machines when running code on remote servers. The viewer runs in JupyterLab and also supports the NeXus format, which is very common at PaN facilities.
HDF5 is used as a binary container in many scientific communities, e.g. photon science, earth sciences, satellite imaging, genomics, microscopy and astronomy: basically any science producing large data volumes.
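For readers unfamiliar with NeXus, the sketch below shows what such a file can look like. It is a NeXus file in its simplest form: an HDF5 file whose groups carry a few conventional attributes (`NX_class`, `default`, `signal`, `axes`) that tell a NeXus-aware viewer which dataset to plot by default. The example assumes the h5py and NumPy packages, which this page does not mention. The helper names `write_nexus_example` and `find_default_signal`, the dataset names and the values are all hypothetical, and the code does not show how H5Web itself resolves these attributes.

```python
import os
import tempfile

import h5py
import numpy as np


def write_nexus_example(h5_path):
    """Write a minimal file following the NeXus NXdata plotting convention."""
    with h5py.File(h5_path, "w") as h5file:
        h5file.attrs["default"] = "entry"  # root points to the default entry
        entry = h5file.create_group("entry")
        entry.attrs["NX_class"] = "NXentry"
        entry.attrs["default"] = "data"  # entry points to the default NXdata
        data = entry.create_group("data")
        data.attrs["NX_class"] = "NXdata"
        data.attrs["signal"] = "intensity"  # dataset to plot
        data.attrs["axes"] = np.array(["angle"], dtype=h5py.string_dtype())
        angle = np.linspace(0.0, 90.0, 10)  # made-up values
        data.create_dataset("angle", data=angle)
        data.create_dataset("intensity", data=np.cos(np.radians(angle)) ** 2)


def find_default_signal(h5_path):
    """Follow the `default` attributes and return the signal dataset path."""
    with h5py.File(h5_path, "r") as h5file:
        group = h5file
        while "default" in group.attrs:
            group = group[group.attrs["default"]]
        return f"{group.name}/{group.attrs['signal']}"


nexus_path = os.path.join(tempfile.mkdtemp(), "nexus_example.h5")
write_nexus_example(nexus_path)
signal_path = find_default_signal(nexus_path)
print(signal_path)
```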

Users can browse, view, evaluate and download data from the same application.
Other sites using the same portal as the ESRF (ICAT backend + ICATplus frontend) deploy the solution to view data immediately. Others have adopted the backend and are integrating H5Web into their own data portals, e.g. the Australian Synchrotron data portal.
The multinational company GPixel, which uses the HDF5 file format extensively for validation, characterisation and volume production testing of image sensors, has adopted two software packages developed within PaNOSC: H5Web, the web-based HDF5 file viewer, and H5Grove, the Python package for serving HDF5 files. The VS Code extension has been installed by thousands of individual users. Some comments and praise from users can be found in the Visual Studio marketplace.
Adoption of H5Web outside of the PaNOSC community is a great sign that the Science Cluster has achieved one of its initial goals of providing generic-enough visualisations for use in other scientific fields. Feedback from external communities is invaluable to the growth of the H5Web ecosystem, as uncovering and resolving bugs and new use cases helps build better software.