HDF5
PIConGPU uses HDF5 as its primary output format for simulation data. HDF5 is a hierarchical binary data format for large datasets. We use libSplash for HDF5 input/output, which simplifies data access for large-scale simulations. Although libSplash writes the data, any HDF5-based tool or library can read and analyze it.
In PIConGPU, most simulation data is stored on disk using the HDF5Writer plugin. It uses libSplash to create parallel HDF5 files, which means that data from all MPI processes is stored in a single file using MPI I/O to allow for high scalability and easy post-processing. Please note that MPI I/O primarily targets parallel file systems such as Lustre or GPFS. Thus, I/O performance may not be optimal if used on a non-parallel file system.
The HDF5Writer plugin produces one parallel HDF5 file per output timestep; how often output is written depends on the notification period for this plugin, which can be specified using --hdf5.period. File names are of the form <prefix>_<timestep>.h5. Each file contains a hierarchy of HDF5 groups, which are comparable to UNIX directories and contain datasets and attributes. Attributes can be attached to both groups and datasets. If not specified otherwise, files are written to the simOutput directory of your simulation run. The common group structure within a file looks like this:
/data/
--<timestep>/ <-- timestep group with global attributes of this timestep
----fields/ <-- all fields
------FieldE/ <-- group for a field (of vectors)
--------x <-- x-dimension component of this field
--------y
------FieldB/
------EnergyDensity_e <-- scalar field dataset
...
----particles/ <-- all particle species
------e/ <-- electrons
--------_ghosts <-- internal ghost cell information for moving window
--------globalCellIdx/ <-- group of global cell IDs
----------x <-- x-dimension component of cell IDs
----------y
--------momentum/
--------position/
--------weighting <-- scalar weighting dataset for electrons
------i/ <-- ions
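As a minimal sketch of the naming scheme and group layout above, the following Python helpers extract the timestep from an output file name and build the internal path of a dataset. The function names, the example prefix simData and the regular expression are assumptions made for illustration; they are not part of PIConGPU or libSplash.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern for "<prefix>_<timestep>.h5": the timestep is taken to be
# the last run of digits before the extension.
_FILE_RE = re.compile(r"^(?P<prefix>.+)_(?P<timestep>\d+)\.h5$")


def parse_output_name(file_name):
    """Split '<prefix>_<timestep>.h5' into (prefix, timestep)."""
    match = _FILE_RE.match(file_name)
    if match is None:
        raise ValueError(f"not an HDF5Writer-style file name: {file_name!r}")
    return match.group("prefix"), int(match.group("timestep"))


def dataset_path(timestep, *parts):
    """Build a path below /data/<timestep>/ inside the file."""
    return str(PurePosixPath("/data", str(timestep), *parts))
```

For example, dataset_path(1000, "fields", "FieldE", "x") yields /data/1000/fields/FieldE/x, which any HDF5 tool can use to address the x component of the electric field at timestep 1000.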
PIConGPU can restart simulations from checkpoint data. With HDF5Writer, checkpoint files are identical to standard simulation output, except that they must contain ghost information when the moving window is enabled for the simulation run.
Ghost datasets are stored in the special _ghosts group in the HDF5 file and contain the guard cells of the simulation grid (recall that the simulation grid is split into core, border and guard cells). They are called ghosts because they do not exist in the physical domain, so they should not be used for visualization or analysis. However, they are required to enable restarts.
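One way for an analysis script to respect this is to skip every entry named _ghosts while traversing a file. The sketch below works on any nested mapping that offers keys() and item access, which includes h5py groups. The helper name iter_datasets is hypothetical, and a nested dict with placeholder values stands in for an open file.

```python
def iter_datasets(group, prefix="", skip=("_ghosts",)):
    """Yield (path, node) for every leaf below group, skipping ghost entries.

    A node counts as a group if it has a keys() method (true for dicts and
    for h5py groups); anything else is treated as a dataset.
    """
    for name in sorted(group.keys()):
        if name in skip:
            continue  # ghost data is only needed for restarts
        node = group[name]
        path = f"{prefix}/{name}"
        if hasattr(node, "keys"):
            yield from iter_datasets(node, path, skip)
        else:
            yield path, node


# Stand-in for the particles group of one timestep (values are placeholders).
particles = {
    "e": {
        "_ghosts": {"weighting": [0.0]},
        "position": {"x": [0.1, 0.2], "y": [0.3, 0.4]},
        "weighting": [1.0, 1.0],
    }
}
paths = [path for path, _ in iter_datasets(particles)]
```

Here paths contains only the physical datasets of species e; nothing below _ghosts is visited.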
There are several tools and libraries that can be used to access your simulation result files. Sophisticated tools for post-processing, analysis and data visualization are explained here. Additional tools are:
- HDFView is a Java application for inspecting HDF5 files in a graphical user interface.
- splash2txt comes with PIConGPU and uses libSplash to convert PIConGPU output files to ASCII text. It can be used when simple text conversion is necessary, but it does not allow scalable analysis of very large datasets.
- h5py is a Python module for accessing HDF5 files. It may also not be suited to very large datasets.
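As a rough illustration of the h5py route, the sketch below reads one component of a vector field into memory. It assumes the group layout described above; the function names, the file name and the timestep in the usage note are hypothetical. h5py is imported lazily inside load_from_file, so read_component can also be tried on a plain nested dict.

```python
def read_component(h5file, timestep, field, component):
    """Return one field component, read fully into memory.

    h5file is an open h5py.File, or any nested mapping with the same layout.
    Slicing with [:] loads the whole dataset at once, which is why this
    approach does not scale to very large datasets.
    """
    dataset = h5file["data"][str(timestep)]["fields"][field][component]
    return dataset[:]


def load_from_file(path, timestep, field, component):
    """Open an output file read-only and read one field component from it."""
    import h5py  # third-party dependency, needed only for real files

    with h5py.File(path, "r") as h5file:
        return read_component(h5file, timestep, field, component)
```

A call such as load_from_file("simOutput/simData_1000.h5", 1000, "FieldE", "x") would then return the x component of the electric field, provided a file with that name and timestep exists.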