Skip to content

feat/perf: optimize AddRange in RoaringPositionBitmap#608

Open
Baunsgaard wants to merge 13 commits intoapache:mainfrom
Baunsgaard:optimize-addrange-roaring-bitmap
Open

feat/perf: optimize AddRange in RoaringPositionBitmap#608
Baunsgaard wants to merge 13 commits intoapache:mainfrom
Baunsgaard:optimize-addrange-roaring-bitmap

Conversation

@Baunsgaard
Copy link
Copy Markdown

Replace the per-position loop in RoaringPositionBitmap::AddRange with direct calls to roaring::Roaring::addRange, matching the optimization in the Java implementation (apache/iceberg#15791).

Added tests for single-position ranges, large contiguous ranges, three-key spanning ranges, and invalid inputs.

Replace the per-position loop in AddRange with direct
roaring::Roaring::addRange calls. For ranges within a
single 32-bit key, a single addRange call suffices. For
ranges spanning multiple keys, the range is split into
the first partial key, full middle keys, and the last
partial key.
AddRange() now creates run-length encoded containers directly,
leaving nothing for Optimize() to compress. Use individual Add()
calls to produce array containers that Optimize() can convert.
Comment on lines 68 to 69
/// If pos_start >= pos_end, this method does nothing.
/// \note Invalid positions are silently ignored
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rephrase may be state pos_start >= pos_end is considered invalid.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made it invalid if pos_start > pos_end, technically I can imagine cases where pos_start == pos_end, therefore i keep that as a no-op.

Let me know what you think.

if (pos_start >= pos_end) {
return;
}
if (pos_start < 0 || pos_end - 1 > kMaxPosition) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not clear to me this is correct, don't we need to fill in valid values?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

We need to ignore invalid positions but set values for valid ones.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, i follow the style of the normal operation, to ignore invalid positions, and clamp inputs to correct values.

However, this is inconsistent with the behaviour of core Java, where positions gets validated, and errors out if they are out of range.

if (pos_start >= pos_end) {
return;
}
if (pos_start < 0 || pos_end - 1 > kMaxPosition) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

We need to ignore invalid positions but set values for valid ones.

void Add(int64_t pos);

/// \brief Sets a range of positions [pos_start, pos_end).
/// If pos_start >= pos_end, this method does nothing.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This contradicts the note sentence below.

Combine the empty-range check and the bounds check into one
conditional to reduce nesting and improve readability.
No behavior change.
When the requested range extends beyond valid bounds [0, kMaxPosition],
clamp to the valid portion and set those positions. This is consistent
with Add() which silently ignores individual invalid positions -- a
partially-valid range should still set its valid subset.

Update tests to verify clamping behavior for negative start and
beyond-max-position end values.
Replace the branching if/else (single-key vs multi-key) with a
unified loop from start_key to end_key. The boundary conditions
for the first and last key are handled inline with ternary
expressions. This is easier to maintain and reason about.
The previous doc said "does nothing" for empty ranges while the
note said "Invalid positions are silently ignored", which was
contradictory. Rewrite to document the clamping parameters and
clarify that out-of-bound positions are silently ignored.
Replace individual TEST() functions for reversed-range and
equal-start-end cases with a parameterized test suite. Add
additional edge cases (zero-length at zero, both-negative) for
broader coverage.
A reversed range (start > end) indicates a bug in the caller and
should fail fast with std::invalid_argument rather than silently
succeed. An empty range (start == end) remains a valid no-op.

This makes AddRange behavior consistent across the Java, C++, and
Rust Iceberg implementations.
Adjust line breaks and alignment to satisfy clang-format.
@Baunsgaard
Copy link
Copy Markdown
Author

Thanks for the reviews! Addressed all feedback, let me know if i missed anything or any changes are required.

As a reminder the related PRs are:

Benchmark results

I was asked for Benchmark results so i thought i would post some on each PR.

AddRange() (native CRoaring addRange) vs a loop of individual
Add() calls (Release build, GCC 14, 5 iterations after 3 warmup):

Scenario Before (Add loop) After (AddRange) Speedup
100 positions 1.1 µs 73.6 ns 15x
10k positions 91.1 µs 70.6 ns 1,290x
200k positions 1.60 ms 188.2 ns 8,475x
1M positions 8.80 ms 634.2 ns 13,880x
10k cross-key boundary 97.3 µs 102.4 ns 951x

The cross-key boundary scenario inserts 10k positions spanning the 32-bit key boundary (e.g., (1LL << 32) - 5000 to (1LL << 32) + 5000). This exercises the key-splitting logic where the range is divided across two internal 32-bit Roaring bitmaps.

The old test called AddRange near kMaxPosition (0x7FFFFFFE80000000),
which required allocating ~2 billion internal bitmap slots and took
~180 seconds. CI killed the util_test binary before it finished.

Replace with a range entirely beyond kMaxPosition so clamping makes
the range empty — no allocation needed, completes in < 1 ms.
The pos_start == pos_end early return is unnecessary because
the post-clamp guard (pos_start >= pos_end) already handles it.
After the throw for reversed ranges, clamping can only shrink
the range, so the only empty case is equality.
for (int64_t pos = pos_start; pos < pos_end; ++pos) {
Add(pos);
if (pos_start > pos_end) {
throw std::invalid_argument("AddRange requires pos_start <= pos_end, got [" +
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return Status instead of throwing exception if you really think this is an invalid case. IMHO, we need to be consistent with line 123 and line 140 to return on invalid inputs.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is an invalid input, however for consistency with other methods in the c++ implementation, then both versions are valid.

Before my changes in the PR to the Java base, it also would not throw any exceptions. since it used the same for loop definition. To keep the existing behaviour, I have removed the exception throwing.

Replace the std::invalid_argument throw with a silent return on
reversed ranges (pos_start > pos_end), consistent with how Add()
handles invalid positions. The clamping already reduces the range
to empty when start > end after clamping, so the explicit throw
is removed entirely.
@Baunsgaard Baunsgaard requested a review from wgtmac March 31, 2026 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants