Skip to content

Conversation

@jblomer
Copy link
Contributor

@jblomer jblomer commented Dec 3, 2025

Transparently translates a leaf count array in a custom class in an RVec (read & write). Classes with leaf count arrays are used, e.g., in the ALICE RAW EDM but most likely there are other relevant cases, too.

EDIT: also needed for LHCb DST files.

Since this approach diverges the in-memory representation from the on-disk representation, we should discuss if we actually want this.

In any case, we probably want the first commit that detects leaf count arrays in classes and fails gracefully in such a case.

@jblomer jblomer self-assigned this Dec 3, 2025
@jblomer jblomer force-pushed the ntuple-leafcount-class branch from 57e7292 to 614cc6c Compare December 3, 2025 15:52
@github-actions
Copy link

github-actions bot commented Dec 4, 2025

Test Results

    22 files      22 suites   3d 19h 25m 52s ⏱️
 3 758 tests  3 758 ✅ 0 💤 0 ❌
75 646 runs  75 646 ✅ 0 💤 0 ❌

Results for commit 83ebc41.

♻️ This comment has been updated with latest results.


struct LeafCountInClass : public LeafCountInClassBase {
unsigned char *fPayload1 = nullptr; //[fSize];
unsigned char *fPayload2 = nullptr; //[fSize];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing the context behind this PR, but looking at this class makes me wonder if we're subtly introducing support for raw pointers in RNTuple on-disk format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One could "emulate" a raw pointer by using fSize = 1, but since this only works for owned payloads it's not conceptually different from supporting unique_ptr (I believe the problematic part of raw pointers is the fact that they may not be owning and reference data outside the parent class)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that on-disk this is stored as an RVec<char>. This PR adds support for a peculiar memory layout of a vector, if you will.

@pcanal pcanal marked this pull request as draft December 18, 2025 22:12
friend class RClassField; // only class fields are able to construct this

private:
std::size_t fItemSize{0}; ///< The size of a child field's item
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better or worse, the leaf count is currently limited (hard coded) to 32bits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member does not store the count (size) of the collection but the size of the collection item (as in sizeof(T) for std::vector<T>).

Copy link
Member

@pcanal pcanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM code wise. Inclusion shall be discussed at next ROOT I/O workshop.

@jblomer jblomer force-pushed the ntuple-leafcount-class branch from 614cc6c to 83ebc41 Compare January 21, 2026 09:58
@jblomer jblomer marked this pull request as ready for review January 21, 2026 09:58
New field for internal use: maps an on-disk collection to a leaf count
array in a class. Writes RVec collection. Must only be used inside a
class field and thus can only be constructed by RClassField.
Maps leaf count arrays in classes to RVecs using
Internal::RLeafCountArrayField
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants