Conversation

@brancz
Contributor

@brancz brancz commented Jan 14, 2026

Which issue does this PR close?

Closes #9174

What changes are included in this PR?

Implementation and tests. It's mostly copied from List.

Are these changes tested?

Yes, see unit tests.

Are there any user-facing changes?

No, purely additive.

@alamb @Jefffrey

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 14, 2026
Contributor

@friendlymatthew friendlymatthew left a comment

Hi, do you mind rebasing? I think CI is failing because this needs to be rebased on the latest master; the base branch is missing the encoded_len fn that was added recently.

Otherwise, the implementation makes sense to me

@brancz brancz force-pushed the row-list-view branch 2 times, most recently from cd2a465 to 29f13ef Compare January 15, 2026 09:13
@brancz
Contributor Author

brancz commented Jan 15, 2026

Very confused because both this PR and #9175 are based off of latest main, and the tip of the other PR seems to work fine.

@brancz
Contributor Author

brancz commented Jan 16, 2026

Figured it out: I did actually need to change something here after the rebase. I also refactored the use in both list types.

list_size += 1;
}
}
O::from_usize(child_count).expect("overflow");
Contributor

Is this to force a panic?

Contributor Author

yes, same as the other assertion, this is consistent with what we do for regular lists as well

Contributor

Personally I feel it makes more sense to return an error here since the function already supports that
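
To make the two options in this thread concrete, here is a minimal std-only sketch of the same overflow check written both ways. It is not the PR's code: `OffsetOverflow`, `checked_offset_or_panic`, and `checked_offset` are hypothetical names, and `i32::try_from` stands in for the generic `O::from_usize(child_count)` call in the diff.

```rust
use std::fmt;

/// Hypothetical error type, standing in for whatever error the real
/// function already returns.
#[derive(Debug, PartialEq)]
struct OffsetOverflow {
    child_count: usize,
}

impl fmt::Display for OffsetOverflow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "child count {} does not fit in the offset type",
            self.child_count
        )
    }
}

impl std::error::Error for OffsetOverflow {}

/// Panicking variant: mirrors the `expect("overflow")` style in the diff.
fn checked_offset_or_panic(child_count: usize) -> i32 {
    i32::try_from(child_count).expect("overflow")
}

/// Fallible variant: reports the same condition to the caller as an error.
fn checked_offset(child_count: usize) -> Result<i32, OffsetOverflow> {
    i32::try_from(child_count).map_err(|_| OffsetOverflow { child_count })
}
```

Both variants detect the same condition; the difference is only whether the caller gets a chance to handle it. Whichever is chosen, applying it to both List and ListView keeps the two code paths consistent.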

.collect();

let child = unsafe { converter.convert_raw(&mut child_rows, validate_utf8) }?;
assert_eq!(child.len(), 1);
Contributor

Should we return an error here since the function returns a result anyway?

Contributor Author

we perform the exact same assertion in the regular lists, so I did this for consistency and it seems like a good idea since something went pretty spectacularly wrong if this isn't true


/// Computes the minimum offset and maximum end (offset + size) for a ListView array.
/// Returns (min_offset, max_end) which can be used to slice the values array.
fn compute_list_view_bounds<O: OffsetSizeTrait>(array: &GenericListViewArray<O>) -> (usize, usize) {
Contributor

This function seems oddly placed; should be lower down instead of in the middle of the mod declarations?

Contributor Author

agreed, will move

Contributor Author

moved it further down in the file, but not sure I love that either, maybe move it to the list file? what do you think?

Contributor

We can always put it down near row_lengths where it won't intrude in the middle of Codec here

_ => unreachable!(),
};

let null_buffer = NullBuffer::new(BooleanBuffer::new(nulls.into(), 0, rows.len()));
Member

Maybe use new_unchecked as you already have the null count and we want to avoid calculating it twice

Suggested change
- let null_buffer = NullBuffer::new(BooleanBuffer::new(nulls.into(), 0, rows.len()));
+ let null_buffer = NullBuffer::new_unchecked(BooleanBuffer::new(nulls.into(), 0, rows.len()), null_count);

Contributor Author

good catch, done!
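
For context on the suggestion above, here is a minimal std-only sketch of why passing an already-known null count avoids a second pass over the bitmap. `Validity`, `new`, `new_with_null_count`, and `build_validity` are hypothetical stand-ins and not arrow-rs APIs. In arrow-rs itself, `NullBuffer::new_unchecked` appears to be declared as an `unsafe fn`, so the real call site would likely need an `unsafe` block and a `null_count` that is known to be correct; that is worth checking against the arrow-buffer docs.

```rust
/// Hypothetical stand-in for a validity bitmap with a cached null count.
struct Validity {
    bits: Vec<u8>, // bit i set => row i is valid (least significant bit first)
    len: usize,
    null_count: usize,
}

impl Validity {
    /// Checked-style constructor: derives the null count by scanning the bitmap.
    fn new(bits: Vec<u8>, len: usize) -> Self {
        let valid = (0..len)
            .filter(|&i| (bits[i / 8] & (1u8 << (i % 8))) != 0)
            .count();
        Self { bits, len, null_count: len - valid }
    }

    /// Unchecked-style constructor: trusts the caller's null count, no scan.
    fn new_with_null_count(bits: Vec<u8>, len: usize, null_count: usize) -> Self {
        Self { bits, len, null_count }
    }
}

/// Builds the bitmap and counts nulls in the same pass over the rows.
fn build_validity(rows: &[Option<i32>]) -> Validity {
    let mut bits = vec![0u8; (rows.len() + 7) / 8];
    let mut null_count = 0;
    for (i, row) in rows.iter().enumerate() {
        match row {
            Some(_) => bits[i / 8] |= 1 << (i % 8),
            None => null_count += 1,
        }
    }
    Validity::new_with_null_count(bits, rows.len(), null_count)
}
```

The trade-off is that the unchecked path is only as correct as the count it is handed, which is why the scanning constructor is the safer default when the count is not already available.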


if size > 0 {
min_offset = min_offset.min(offset);
max_end = max_end.max(end);
Member

Maybe you can break early once you have reached the widest possible bounds (a minimum offset of 0 and the largest possible end)

Contributor Author

done (although I wonder how often this is better in practice while adding an additional branch for a case that might happen incredibly rarely)
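
A rough sketch of the bounds computation with the early exit discussed here, written against plain slices so it runs without arrow. `list_view_bounds` is a hypothetical name, and the `values_len` parameter and the `(0, 0)` result for all-empty input are assumptions, not necessarily what the PR does. It also assumes every view lies within `values_len`.

```rust
/// Returns (min_offset, max_end) over all non-empty views, or (0, 0) when
/// every view is empty. `values_len` is the length of the child values.
fn list_view_bounds(offsets: &[usize], sizes: &[usize], values_len: usize) -> (usize, usize) {
    let mut min_offset = usize::MAX;
    let mut max_end = 0usize;

    for (&offset, &size) in offsets.iter().zip(sizes) {
        // Empty views do not reference any values, so they are skipped.
        if size > 0 {
            min_offset = min_offset.min(offset);
            max_end = max_end.max(offset + size);
            // Early exit: the bounds already span the whole values array,
            // so no later view can widen them.
            if min_offset == 0 && max_end == values_len {
                break;
            }
        }
    }

    if min_offset == usize::MAX {
        (0, 0)
    } else {
        (min_offset, max_end)
    }
}
```

The early exit adds one comparison per non-empty view and only pays off when some prefix of the views already covers the full values array, which is the trade-off raised in the reply above.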

Comment on lines 3349 to 3357
#[test]
fn test_list_view() {
test_single_list_view::<i32>();
}

#[test]
fn test_large_list_view() {
test_single_list_view::<i64>();
}
Member

Please add nested tests like the regular list

Contributor Author

done

test_nested_list::<i64>();
}

fn test_single_list_view<O: OffsetSizeTrait>() {
Member

Please add more tests that take advantage of the fact that this is a view, namely:

  • both lists point to the same values.
  • unordered offsets (one item starts at offset x and a later item starts at offset y, where y is before x).
  • list 1's items cover list 2's items and a little more (e.g. list 1 has offset 10 and size 5, and list 2 has offset 12 and size 2).

Contributor Author

@brancz brancz Jan 20, 2026

done (added all cases in one test; let me know if you prefer separate tests)
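
The three layouts requested above can be illustrated with plain slices. This is a sketch only: `resolve_views` is a hypothetical helper that reads each (offset, size) pair against a shared values slice, and it says nothing about how the PR's test is actually written.

```rust
/// Resolves each (offset, size) view against `values`, the way a reader of a
/// list view would. Assumes every view lies within `values`.
fn resolve_views<'a>(values: &'a [i32], offsets: &[usize], sizes: &[usize]) -> Vec<&'a [i32]> {
    offsets
        .iter()
        .zip(sizes)
        .map(|(&offset, &size)| &values[offset..offset + size])
        .collect()
}

/// Walks through the three view-specific layouts from the review comment.
fn demo_view_layouts() {
    let values: Vec<i32> = (0..20).collect();

    // 1. Both lists point to the same values.
    let shared = resolve_views(&values, &[3, 3], &[2, 2]);
    assert_eq!(shared[0], shared[1]);

    // 2. Unordered offsets: the second list starts before the first.
    let unordered = resolve_views(&values, &[10, 2], &[2, 3]);
    assert_eq!(unordered[0], [10, 11]);
    assert_eq!(unordered[1], [2, 3, 4]);

    // 3. Containment: list 1 (offset 10, size 5) covers list 2 (offset 12, size 2).
    let nested = resolve_views(&values, &[10, 12], &[5, 2]);
    assert_eq!(nested[0], [10, 11, 12, 13, 14]);
    assert_eq!(nested[1], [12, 13]);
}
```

None of these layouts can be expressed with a regular list's monotonically increasing offsets, which is why they are worth covering separately for the view type.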

ListArray::new(field, offsets, values, Some(nulls))
}

fn generate_column(len: usize) -> ArrayRef {
Member

Please add the list view and large list view here as well, similar to how list and large list are handled.
Don't forget to increase the random range so that it covers the new values.

Contributor Author

this was a great idea, it actually caught something
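
The "increase the random range" reminder matters because a generator that picks a column type by matching on a random integer never reaches new arms unless the range grows with them. A hypothetical sketch of that shape follows; `ColumnKind`, `NUM_KINDS`, and `column_kind` are illustrative names, the real `generate_column` covers many more types, and the random draw itself is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnKind {
    List,
    LargeList,
    ListView,
    LargeListView,
}

/// Number of match arms below; the random choice must be drawn from
/// `0..NUM_KINDS`, otherwise the last arms are never exercised.
const NUM_KINDS: u32 = 4;

/// Maps a random choice to a column kind (hypothetical dispatch).
fn column_kind(choice: u32) -> ColumnKind {
    match choice {
        0 => ColumnKind::List,
        1 => ColumnKind::LargeList,
        2 => ColumnKind::ListView,
        3 => ColumnKind::LargeListView,
        _ => unreachable!("choice must be drawn from 0..NUM_KINDS"),
    }
}
```

Tying the range to a single constant next to the match makes it harder for the two to drift apart when another type is added.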

@brancz
Contributor Author

brancz commented Jan 20, 2026

@rluvaton thoughts on the assertion vs. returning an error? I don't feel too strongly one way or another, but I do think we should be consistent for both List and ListView.

Labels

arrow Changes to the arrow crate

Development

Successfully merging this pull request may close these issues.

Support formatting ListView

4 participants