Proposal for AliceO2 online/in-memory formats
Mikolaj Krzewicki, David Rohr, Matthias Richter

CWG4 Meeting Feb 05 2016

M. Krzewicki, D. Rohr, M. Richter

AliceO2 online/in-memory formats

CWG4 2016-02-05


Outcome of discussions so far

- The format consists of header information and payload. The payload is not touched by the framework.
- A multi-part approach is used for the transport, ideally implemented by the transport framework (FairMQ)
  → The format consists of a sequence of separate blocks
- The transport framework is required to preserve the sequence of parts
  - Pure design decision: this functionality fits better in the transport framework layer, and some underlying transport layers support it right away
- Multi-header: each data part is preceded by a corresponding header part
  - The header part supports a header stack
- Appropriate interface methods support access to header and payload in both fixed-order and vectorized paradigms
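The multi-part layout can be pictured as an ordered sequence in which every payload part directly follows its header part. The following is a minimal sketch under that assumption only. `Part`, `MultiPart`, `appendBlock`, `blockCount`, `headerOf` and `payloadOf` are hypothetical names for illustration, and `std::string` merely stands in for a transport message part; none of this is the FairMQ API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: one message part and an ordered multi-part message.
using Part = std::string;
using MultiPart = std::vector<Part>;

// Append one data block as two consecutive parts: header first, payload second.
inline void appendBlock(MultiPart& msg, Part header, Part payload) {
  msg.push_back(std::move(header));
  msg.push_back(std::move(payload));
}

// A sequence is well formed if every header part has a payload part after it.
inline bool isWellFormed(const MultiPart& msg) { return msg.size() % 2 == 0; }

// Number of complete (header, payload) blocks.
inline std::size_t blockCount(const MultiPart& msg) {
  assert(isWellFormed(msg));
  return msg.size() / 2;
}

// Fixed-position access: block i occupies parts 2*i and 2*i + 1.
// This only works because the transport preserves the order of parts.
inline const Part& headerOf(const MultiPart& msg, std::size_t i) { return msg.at(2 * i); }
inline const Part& payloadOf(const MultiPart& msg, std::size_t i) { return msg.at(2 * i + 1); }
```

The index arithmetic in `headerOf` and `payloadOf` is the reason the ordering guarantee matters: if parts could be reordered in transit, a payload could no longer be matched to its header by position.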


General header format

- Starts with basic header information, never serialized, with a unique version number (detailed format on the following slides)
- Enforce a strict policy: no changes to existing members (e.g. width) or to the sequence of members; new members can only be appended
- All basic header structs are defined with fixed endianness and padding
- Handlers for inhomogeneous systems will be provided at compile time
  - Strategy: “keep concept open for implementation of handlers but do not solve problems we don’t have at the moment.”
- Header-stack concept: optional headers can follow the basic header
  - A next header is indicated in a flag member of the preceding header
  - Optional headers consist of a fixed NextHeaderDescription and a variable NextHeaderContent


Data Header

    #include <cstdint>

    /**
     * Basic struct for the payload meta data
     * (defined first because DataHeader_t embeds it by value)
     */
    struct PayloadMetaData_t {
      /** Subsystem or detector */
      char mDataOrigin[3+1];
      /** Data description, e.g. raw, clusters, tracks */
      char mDataDescriptor[15+1];
      /** A system or detector specific sub specification */
      int64_t mSubSpec;
    };

    struct DataHeader_t {
      /** 4 bytes of a magic string */
      int32_t mMagicString;
      /** size of the struct */
      int32_t mStructSize;
      /** header version, bookkeeping in the software */
      int32_t mHeaderVersion;
      /** flags, one of the bits to indicate that a sub header is following */
      int32_t mFlags;
      /** size of payload in memory */
      int64_t mPayloadSize;
      /** payload serialization method for transfer */
      char mPayloadSerializationMethod[7+1];
      /** data payload meta data: type, origin, specification */
      PayloadMetaData_t mMetaData;
    };

See next slides for detailed comments
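One way to enforce the "fixed padding, append-only" policy is to pin the layout with compile-time checks. The sketch below repeats the two structs and asserts their sizes and offsets. The numeric values (32 and 64 bytes, offsets 16, 24 and 32) are an assumption that holds when `int64_t` is 8-byte aligned, as on x86-64 and AArch64; they are not stated in the proposal. `makeHeader` is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct PayloadMetaData_t {
  char mDataOrigin[3 + 1];
  char mDataDescriptor[15 + 1];
  int64_t mSubSpec;
};

struct DataHeader_t {
  int32_t mMagicString;
  int32_t mStructSize;
  int32_t mHeaderVersion;
  int32_t mFlags;
  int64_t mPayloadSize;
  char mPayloadSerializationMethod[7 + 1];
  PayloadMetaData_t mMetaData;
};

// The header must be plain data so it can be copied byte by byte.
static_assert(std::is_standard_layout<DataHeader_t>::value, "plain layout expected");
static_assert(std::is_trivially_copyable<DataHeader_t>::value, "must be memcpy-safe");

// Assumed layout for ABIs with 8-byte aligned int64_t (not from the proposal).
static_assert(sizeof(PayloadMetaData_t) == 32, "unexpected padding in meta data");
static_assert(offsetof(PayloadMetaData_t, mSubSpec) == 24, "mSubSpec moved");
static_assert(sizeof(DataHeader_t) == 64, "unexpected padding in data header");
static_assert(offsetof(DataHeader_t, mPayloadSize) == 16, "mPayloadSize moved");
static_assert(offsetof(DataHeader_t, mMetaData) == 32, "mMetaData moved");

// Hypothetical helper: fill the self-describing size fields of a fresh header.
inline DataHeader_t makeHeader(int64_t payloadSize) {
  assert(payloadSize >= 0);
  DataHeader_t h{};  // zero-initialize all members, including the char fields
  h.mStructSize = static_cast<int32_t>(sizeof(DataHeader_t));
  h.mPayloadSize = payloadSize;
  return h;
}
```

With checks of this kind, reordering or resizing an existing member breaks the build, while appending a member only requires updating the expected total size together with the header version.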


Some comments

- Even though the in-memory format is not primarily intended for permanent storage, it is good to think about evolution and compatibility.
  - Unique header version; the struct size is included as a consistency check and to facilitate later implementation of conversion handlers
  - The valid bits in the mFlags field and their meaning are defined by the header version
- Why a magic string?
  - It makes identification of a header simpler, e.g. after a data corruption; a great help for low-level debugging
- The format is not primarily intended to be used for storage, but with a little overhead it is also suited for that. This will be used for debugging when data needs to be dumped to storage at different points in the hierarchy
  - PayloadSize is redundant information, to be used for integrity checks; it is mandatory for data dumped to disk
- The payload serialization method is defined in the header; this allows building common functionality and lets the framework choose the right tool for de-serialization
  - Various serialization methods will be used; a uniform serialization concept is still to be defined (example: ROOT serialization embedded into the current HLT data flow)
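The roles of the magic string, the struct size and the redundant payload size can be illustrated with a small integrity check on a dumped block. This is a sketch under stated assumptions: the dump is laid out as the basic header immediately followed by the payload, with no optional headers. The magic value `kMagic`, the enum `DumpCheck` and the function `checkDumpedBlock` are hypothetical and not part of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct PayloadMetaData_t {
  char mDataOrigin[3 + 1];
  char mDataDescriptor[15 + 1];
  int64_t mSubSpec;
};

struct DataHeader_t {
  int32_t mMagicString;
  int32_t mStructSize;
  int32_t mHeaderVersion;
  int32_t mFlags;
  int64_t mPayloadSize;
  char mPayloadSerializationMethod[7 + 1];
  PayloadMetaData_t mMetaData;
};

constexpr int32_t kMagic = 0x4F324844;  // hypothetical value, not from the proposal

enum class DumpCheck { Ok, TooShort, BadMagic, BadStructSize, BadPayloadSize };

// Check a dumped block assumed to be laid out as [DataHeader_t][payload bytes].
inline DumpCheck checkDumpedBlock(const unsigned char* data, std::size_t size) {
  assert(data != nullptr);
  if (size < sizeof(DataHeader_t)) return DumpCheck::TooShort;

  DataHeader_t h;
  std::memcpy(&h, data, sizeof(h));  // copy out: the buffer may be unaligned

  // 1. The magic string identifies the start of a header.
  if (h.mMagicString != kMagic) return DumpCheck::BadMagic;
  // 2. The struct size must match what this software version expects.
  if (h.mStructSize != static_cast<int32_t>(sizeof(DataHeader_t))) {
    return DumpCheck::BadStructSize;
  }
  // 3. The redundant payload size must agree with the bytes actually present.
  const std::size_t actualPayload = size - sizeof(DataHeader_t);
  if (h.mPayloadSize < 0 ||
      static_cast<std::uint64_t>(h.mPayloadSize) != actualPayload) {
    return DumpCheck::BadPayloadSize;
  }
  return DumpCheck::Ok;
}
```

A real implementation would also dispatch on `mHeaderVersion` and accept larger struct sizes from newer versions; that part is left out here because the conversion handlers are not yet defined.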


Header Stack

- Design an open concept of headers which can accommodate extension requests like those we had in the past (e.g. more trigger bits, new detectors, more detector links)
- Indicate in the header flags that a next header follows immediately after the current header
- Additional headers consist of the basic header information NextHeaderDescription and the NextHeaderContent
- The header payload can be serialized

    struct NextHeaderDescription_t {
      /** size of this next header description */
      int32_t mStructSize;
      /** size of the next header payload */
      int32_t mNextHeaderContentSize;
      /** Common flags for all next-headers. For the moment, we only need one
          bit to indicate an additional next header */
      int32_t mFlags;
      /** Descriptor */
      char mHeaderDescriptor[15+1];
      /** serialization method */
      char mSerializationMethod[7+1];
    };
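Walking the header stack could look like the sketch below. It assumes that each optional header is stored as a `NextHeaderDescription_t` immediately followed by `mNextHeaderContentSize` bytes of content, and that the "next header follows" flag is bit 0. The bit position `kHasNextHeader`, the `StackEntry` type and the `walkHeaderStack` function are hypothetical; the proposal only states that one flag bit is used.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

struct NextHeaderDescription_t {
  int32_t mStructSize;
  int32_t mNextHeaderContentSize;
  int32_t mFlags;
  char mHeaderDescriptor[15 + 1];
  char mSerializationMethod[7 + 1];
};

constexpr int32_t kHasNextHeader = 0x1;  // hypothetical bit assignment

struct StackEntry {
  std::string descriptor;        // copy of mHeaderDescriptor
  const unsigned char* content;  // points into the caller's buffer
  std::size_t contentSize;
};

// firstFlags: mFlags of the basic header; data/size: bytes following that header.
inline std::vector<StackEntry> walkHeaderStack(int32_t firstFlags,
                                               const unsigned char* data,
                                               std::size_t size) {
  assert(data != nullptr || size == 0);
  std::vector<StackEntry> entries;
  int32_t flags = firstFlags;
  std::size_t offset = 0;  // invariant: offset <= size
  while (flags & kHasNextHeader) {
    NextHeaderDescription_t d;
    if (size - offset < sizeof(d)) throw std::runtime_error("truncated description");
    std::memcpy(&d, data + offset, sizeof(d));
    if (d.mStructSize < static_cast<int32_t>(sizeof(d)) || d.mNextHeaderContentSize < 0) {
      throw std::runtime_error("inconsistent sizes in description");
    }
    const std::size_t structSize = static_cast<std::size_t>(d.mStructSize);
    const std::size_t contentSize = static_cast<std::size_t>(d.mNextHeaderContentSize);
    if (structSize > size - offset || contentSize > size - offset - structSize) {
      throw std::runtime_error("truncated content");
    }
    // The descriptor is zero terminated by policy; bound the search anyway.
    const char* begin = d.mHeaderDescriptor;
    const char* end = std::find(begin, begin + sizeof(d.mHeaderDescriptor), '\0');
    entries.push_back(StackEntry{std::string(begin, end),
                                 data + offset + structSize, contentSize});
    offset += structSize + contentSize;  // skip description and content
    flags = d.mFlags;                    // the flag of this header controls the next step
  }
  return entries;
}
```

Using `mStructSize` rather than `sizeof` to skip the description keeps the walk valid when a later version appends members to the description struct.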


Description Ids

- HLT uses 4/8-byte char fields to indicate data origin and description
- This has proven very useful, but needs a few adjustments to avoid confusion
- More intuitive operators to be implemented
- New policy: always zero terminated
- Longer description field; comparison operations use arithmetic operations
- Optimize operations with meta data

    /**
     * Basic struct for the payload meta data
     */
    struct PayloadMetaData_t {
      /** Subsystem or detector */
      char mDataOrigin[3+1];
      /** Data description, e.g. raw, clusters, tracks */
      char mDataDescriptor[15+1];
      /** A system or detector specific sub specification */
      int64_t mSubSpec;
    };
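"Comparison operations use arithmetic operations" can be read as comparing the 16-byte descriptor as two 64-bit integers instead of calling a string comparison. The following is a minimal sketch of that idea; the wrapper type `DataDescriptor` and its operators are hypothetical and not part of the proposal. The integer comparison is only valid if all bytes after the text are zero, which the constructor guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical wrapper around a 15+1 byte description field.
struct DataDescriptor {
  char str[15 + 1];

  // Copy at most 15 characters. The array is zeroed first, so the unused
  // bytes and the final byte are always 0 ("always zero terminated").
  explicit DataDescriptor(const char* text) : str{} {
    assert(text != nullptr);
    std::strncpy(str, text, sizeof(str) - 1);
  }

  // Compare the 16 bytes as two 64-bit integers. memcpy avoids aliasing
  // problems and is typically compiled to two plain loads.
  bool operator==(const DataDescriptor& other) const {
    std::uint64_t a[2];
    std::uint64_t b[2];
    std::memcpy(a, str, sizeof(a));
    std::memcpy(b, other.str, sizeof(b));
    return a[0] == b[0] && a[1] == b[1];
  }

  bool operator!=(const DataDescriptor& other) const { return !(*this == other); }
};

static_assert(sizeof(DataDescriptor) == 16, "descriptor must stay 16 bytes");
```

Equality is independent of byte order, so this works on both little- and big-endian machines; an ordering operator built the same way would not be.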


Unified header and payload navigation

- Regardless of what we choose, there is some complexity which we can encapsulate to make life easier for users
- Goal: easy navigation and access to headers, next headers, and payload using standard C++/STL tools
- Define an API and functional modules to ...
  - encapsulate the connection between the in-memory format and frames in the message queuing system
  - provide direct access to headers and payload of all blocks
  - implement an iterator concept for access
  - support an optional configurable layout of data blocks in order to allow access at fixed positions
  - optionally use (de)serializer factories to instantiate serialization tools for the device
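As an illustration of the iterator concept, the sketch below wraps an ordered list of parts in a view that supports range-based for loops, standard algorithms and access at fixed positions. `Part`, `BlockRef` and `BlockView` are hypothetical names, `std::string` only stands in for a message part, and the alternating header/payload order is assumed; this is not a proposed API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

using Part = std::string;  // hypothetical stand-in for one message part

// One data block: references to its header part and its payload part.
struct BlockRef {
  const Part& header;
  const Part& payload;
};

// Read-only view over parts ordered as [h0, p0, h1, p1, ...].
// The view does not own the parts; they must outlive it.
class BlockView {
 public:
  explicit BlockView(const std::vector<Part>& parts) : mParts(parts) {
    assert(parts.size() % 2 == 0);  // every payload needs its header
  }

  class iterator {
   public:
    using iterator_category = std::input_iterator_tag;
    using value_type = BlockRef;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = BlockRef;

    iterator(const std::vector<Part>* parts, std::size_t index)
        : mParts(parts), mIndex(index) {}
    BlockRef operator*() const {
      return BlockRef{(*mParts)[2 * mIndex], (*mParts)[2 * mIndex + 1]};
    }
    iterator& operator++() { ++mIndex; return *this; }
    iterator operator++(int) { iterator tmp = *this; ++mIndex; return tmp; }
    bool operator==(const iterator& o) const { return mIndex == o.mIndex; }
    bool operator!=(const iterator& o) const { return mIndex != o.mIndex; }

   private:
    const std::vector<Part>* mParts;
    std::size_t mIndex;
  };

  iterator begin() const { return iterator(&mParts, 0); }
  iterator end() const { return iterator(&mParts, size()); }
  std::size_t size() const { return mParts.size() / 2; }

  // Access at a fixed position, for configurations with a known block layout.
  BlockRef operator[](std::size_t i) const { return *iterator(&mParts, i); }

 private:
  const std::vector<Part>& mParts;
};
```

With such a view, user code can write `for (BlockRef b : view)` or call `std::find_if(view.begin(), view.end(), ...)` without knowing how blocks map onto transport frames.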


Where does the timeframe specific information go?

- Proposal: no specific timeframe header, but a data block like all the others
  - Likely to be a serialized data structure containing the trigger information and other relevant information
- Payload-specific information, e.g. trigger information of a data set from a triggered detector within the timeframe, is not timeframe specific and can be added to the header stack


What’s next

- Conclude within the working group and O2 on a format, hopefully today or within the next few days
- Standalone sample implementation
- User API definition
- Prepare Alice Note

Thanks!
