Data Format
The bp format was designed to be largely convertible to both HDF5 and netCDF while relaxing some of their consistency requirements to improve write and read performance on both serial and parallel file systems. To keep the format forward-looking, all offsets and sizes of potentially large items are stored as 64-bit values.
Terms:
- Process Group - all output from a single process for a single ADIOS group
- Process Group Index - an index of the Process Groups in the bp file, with some attributes to aid in identifying each one
- Vars Index - an index of all vars in the bp file, with a direct access offset and a series of characteristics for each instance of a var
- Attributes Index - an index of all attributes in the bp file with direct access offsets
- Index offsets - the offsets of the 3 indexes above, stored together with a version and endianness flag
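As a rough illustration of the last entry, the three index offsets and the version and endianness flag could be packed into one fixed-size record. The sketch below is not the actual bp byte layout: the field order, the 32-bit flag word, the bit chosen for endianness, and the names `pack_offsets` and `unpack_offsets` are all assumptions made for illustration. Only the use of 64-bit offsets comes from the description above.

```python
import struct

# Hypothetical layout (an assumption, not the real bp layout):
# three unsigned 64-bit offsets, then one unsigned 32-bit word whose
# top bit marks big-endian data and whose remaining bits hold the version.
# The record itself is always written little-endian in this sketch.
RECORD = struct.Struct("<QQQI")
BIG_ENDIAN_BIT = 0x80000000
VERSION_MASK = 0x7FFFFFFF


def pack_offsets(pg_index, vars_index, attrs_index, version, big_endian=False):
    """Pack the three index offsets and the version/endianness word."""
    flag = (version & VERSION_MASK) | (BIG_ENDIAN_BIT if big_endian else 0)
    return RECORD.pack(pg_index, vars_index, attrs_index, flag)


def unpack_offsets(raw):
    """Return (pg_index, vars_index, attrs_index, version, big_endian)."""
    pg_index, vars_index, attrs_index, flag = RECORD.unpack(raw)
    return (pg_index, vars_index, attrs_index,
            flag & VERSION_MASK, bool(flag & BIG_ENDIAN_BIT))
```

A reader using this hypothetical layout would read the record first and then seek directly to whichever index it needs. Because the offsets are 64-bit, they can point past the 4 GiB limit of 32-bit offsets.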
Performance Implications:
- On writing, all processes synchronize offsets on open and then only once more on close, for index creation (handled by process 0).
- On reading, all processes synchronize offsets on open.
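The text does not spell out how the offsets are synchronized. One common way to give every writer a non-overlapping region is an exclusive prefix sum over the sizes of the Process Groups, sketched below in a single process. The function name `assign_offsets` is hypothetical, and whether ADIOS uses exactly this scheme is an assumption. In a parallel code the same result would typically come from a collective scan or gather.

```python
from itertools import accumulate


def assign_offsets(sizes, base=0):
    """Exclusive prefix sum over per-process output sizes.

    Returns (offsets, end): offsets[i] is where process i starts writing,
    and end is the first byte after the last Process Group, which is where
    an index could be appended on close.
    """
    # accumulate(..., initial=0) yields 0, s0, s0+s1, ..., total
    running = list(accumulate(sizes, initial=0))
    offsets = [base + start for start in running[:-1]]
    end = base + running[-1]
    return offsets, end
```

With this scheme each process needs only one exchange of sizes before it can write independently, which matches the single synchronization on open described above.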
Each Process Group consists of a short header describing the output as a whole, including how the data was generated. This is followed by a list of the variables provided, each encoded separately with limited linking. Finally, there is a series of attributes, with limited linking to the variables for complex or dynamic values.
The Process Group Index lists, for each Process Group, its name, process ID, timestep, and location in the file.
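As a minimal sketch of how such an index might be held in memory and queried, the example below models one entry with the four fields named above. The class `ProcessGroupEntry`, its field names, and the helper `find_process_groups` are hypothetical and do not reflect the on-disk encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessGroupEntry:
    name: str          # name of the Process Group
    process_id: int    # ID of the process that wrote it
    timestep: int      # timestep the output belongs to
    file_offset: int   # where the Process Group starts in the file


def find_process_groups(index, name, timestep):
    """Return the entries for one name and timestep, ordered by process ID."""
    hits = [e for e in index if e.name == name and e.timestep == timestep]
    return sorted(hits, key=lambda e: e.process_id)
```

A reader could use the returned `file_offset` values to seek straight to the Process Groups it needs without scanning the rest of the file.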
The Vars Index lists, for each unique variable, the name of the group it belongs to, its name and path, its datatype, and a series of data characteristics. The data characteristics initially consist of one or more of the following:
- offset in file (for all variables)
- array dimensions (for arrays only)
- minimum value (for arrays only)
- maximum value (for arrays only)
- value (for scalars only)
Other characteristics will be added over time.
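The characteristics listed above can be sketched as a small function that builds the record for one instance of a variable. This is an illustration only: the function name `characteristics`, the dictionary keys, and the use of nested Python lists to stand in for arrays are assumptions, not the bp encoding.

```python
def _flatten(values):
    """Yield the scalar elements of a nested list."""
    for v in values:
        if isinstance(v, list):
            yield from _flatten(v)
        else:
            yield v


def _dimensions(values):
    """Return the dimensions of a rectangular nested list."""
    dims = []
    while isinstance(values, list):
        dims.append(len(values))
        values = values[0] if values else None
    return dims


def characteristics(offset, data):
    """Build the characteristics for one instance of a variable.

    `data` is a scalar, or a rectangular nested list standing in for an array.
    """
    entry = {"offset": offset}              # recorded for all variables
    if isinstance(data, list):
        entry["dimensions"] = _dimensions(data)
        flat = list(_flatten(data))
        if flat:                            # min/max are undefined when empty
            entry["min"] = min(flat)
            entry["max"] = max(flat)
    else:
        entry["value"] = data               # scalars keep their value
    return entry
```

Keeping the minimum and maximum in the index lets a reader decide whether an array is of interest, for example whether it contains values in a given range, before reading any of its data.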
The Attributes Index is similar to the Vars Index, except that it covers attributes rather than variables.
For more detailed information, please contact us.