l1/reader: Add lookahead for object metadata#29673
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds optional object metadata (object ID, footer position, object size) to the get_extent_metadata_forwards metastore API, enabling the L1 reader to populate a lookahead buffer of object metadata. This optimization reduces RPC calls from N to 1 for N objects by batching metadata lookups.
Changes:
- Added
include_object_metadataparameter toget_extent_metadata_forwardsAPI with corresponding fields inextent_metadataRPC type - Implemented lookahead buffer in L1 reader to prefetch multiple objects' metadata in a single RPC
- Added
lookahead_objectsconfiguration parameter to control prefetch behavior
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/v/cloud_topics/log_reader_config.h | Added lookahead_objects configuration field for controlling metadata prefetch |
| src/v/cloud_topics/level_one/metastore/rpc_types.h | Extended extent_metadata and get_extent_metadata_request with object metadata fields |
| src/v/cloud_topics/level_one/metastore/metastore.h | Updated interface with include_object_metadata parameter and added documentation |
| src/v/cloud_topics/level_one/metastore/simple_metastore.h | Added include_object_metadata parameter to method signatures |
| src/v/cloud_topics/level_one/metastore/simple_metastore.cc | Implemented object metadata lookup logic |
| src/v/cloud_topics/level_one/metastore/replicated_metastore.h | Updated method signature with new parameter |
| src/v/cloud_topics/level_one/metastore/replicated_metastore.cc | Forwarded include_object_metadata parameter and updated RPC conversion |
| src/v/cloud_topics/level_one/domain/simple_domain_manager.cc | Updated domain manager to pass through object metadata in conversions |
| src/v/cloud_topics/level_one/domain/db_domain_manager.cc | Implemented object metadata lookup in DB-backed implementation |
| src/v/cloud_topics/level_one/frontend_reader/level_one_reader.h | Added lookahead buffer and helper methods for managing it |
| src/v/cloud_topics/level_one/frontend_reader/level_one_reader.cc | Implemented lookahead buffer logic and refactored object lookup |
| src/v/cloud_topics/level_one/metastore/tests/simple_metastore_test.cc | Added comprehensive tests for object metadata feature |
| src/v/cloud_topics/level_one/frontend_reader/tests/reader_test.cc | Added tests for lookahead functionality with single and multiple objects |
| src/v/cloud_topics/level_one/frontend_reader/tests/l1_reader_fixture.h | Extended test fixture to support lookahead parameter |
| kafka::offset offset, model::timeout_clock::time_point /*deadline*/) { | ||
| std::optional<l1::metastore::object_response> | ||
| level_one_log_reader_impl::consume_lookahead_buffer(kafka::offset offset) { | ||
| // Discard stale entries whose data is entirely before the requested |
There was a problem hiding this comment.
Is this possible because of the caching stuff? e.g we might reuse a reader for serving a request at offset 50 from a cached reader which has/had extents [30,39],[40,49],[50,100] prefetched or something?
There was a problem hiding this comment.
This is Claude being very defensive. I don't expect we ever hit this but it is the correct thing to do if somehow the lookahead is behind where we want to read from.
|
Force push to rebase on dev. More fixes in second push. |
Add an `include_object_metadata` flag to the get_extent_metadata_forwards RPCs. When set, the domain manager does per-extent object point lookups and populates oid, footer_pos, and object_size on the extent_metadata response. The information is only included when the flag is set because of the additional cost of the per-extent lookups. The three object fields are wrapped in std::optional<extent_object_info> so that when the flag is not set, only a single false byte is encoded instead of three default-valued fields. If an extent references an object that cannot be found in the local metastore, an error is logged and an exception is thrown, since this indicates a serious data integrity problem. This will be used in a follow-up to add object metadata prefetching to the L1 reader.
This changes how the L1 reader gets object metadata. Instead of calling get_first_ge using the reader's start offset or the next offset after the last offset of the current object, it uses the forward extent iterator to get multiple objects' worth at a time. What's the advantage? Reading N objects before required N get_first_ge requests, which are full metastore RPCs, and each get_first_ge required 2 lookups in the metastore. The new lookahead buffer requires 1 RPC, which requires 1 range read for N extents and N lookups for the object metadata. This is a big reduction in metastore RPCs if readers are reading multiple objects worth of data at once, and equivalent if only one object's info is needed. That said, this is probably not helpful for the typical Kafka fetch path, because each L1 object should contain many fetch requests worth of data (assuming sensible configuration). It might be helpful for some internal readers. Therefore, the default lookahead only grabs one object, so its behavior reduces to the previous behavior (look up the next object when we need it).
|
Force push to address Tyler's feedback. |
| .build()); | ||
| auto add_res | ||
| = m.add_objects(os, terms_builder().add(tid_a, 0_tm, 0_o).build()).get(); | ||
| ASSERT_TRUE(add_res.has_value()); |
There was a problem hiding this comment.
this test consistently fails for me both locally and in CI
Revert "Merge pull request #29673 from redpanda-data/l1-readahead"
Re-applies the changes from #29673 (reverted in #29757) with two fixes: 1. TestGetExtentMetadataForwardsWithObjectMetadata now calls preregister_objects before add_objects, required after #29739 made preregistration mandatory. 2. l1_reader_cache::evict_stale() had a co_await inside its iteration loop. During suspension, take_reader() on the fetch scheduling group could erase entries from the intrusive list, invalidating the iterator. Fix by collecting stale readers without yielding, then closing them after the loop — matching the pattern already used by stop().
This adds optional object metadata information to the
get_extent_metadata_forwardsmetastore API, which allows the L1 reader to populate a lookahead buffer of object metadata. The object metadata is optionally added because it requires additional lookups to populate, and users of the extent iterator that don't need this metadata shoudn't pay for it (and it seemed like overkill to add a second almost-identical call).The previous one-at-a-time
get_first_geusage required N RPCs for N objects, and each RPC required 2 LSM lookups (2N total). For N objects, this new method requires 1 RPC, and the RPC requires 1 range scan and N point lookups. This is a win for N > 1, especially on RPC count, and the cost is the same for N = 1.However, in the typical Kafka fetch case, I don't think this lookahead is helpful, because one L1 object should have many fetches' worth of data, so the lookahead is not worth it most of the time. Still, it might be helpful for some internal readers that stream a lot of batches through one L1 reader instance.
I hope this doesn't conflict too hard with the caching stuff. It's conceptually orthogonal.
Backports Required
Release Notes