Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
Prerequisites:
- TiDB cluster with next-gen / disaggregated storage: TiKV + tikv-worker + TiFlash compute, with columnar enabled (ENABLE_NEXT_GEN=1, ENABLE_NEXT_GEN_COLUMNAR=1).
- TiFlash replica on the test table (AVAILABLE = 1).
SQL aligns with tests/fullstack-test2/clustered_index/query.test (int handle section, line 37).
DROP TABLE IF EXISTS test.t_1;
CREATE TABLE test.t_1 (
a BIGINT PRIMARY KEY CLUSTERED,
col INT
);
INSERT INTO test.t_1 VALUES
(-9223372036854775808, 1),
(9223372036854775807, 2),
(0, 3);
ALTER TABLE test.t_1 SET TIFLASH REPLICA 1;
-- Wait until information_schema.tiflash_replica.AVAILABLE = 1 for test.t_1
SET SESSION tidb_isolation_read_engines = 'tiflash';
-- query.test:37
SELECT * FROM test.t_1 WHERE a > -9223372036854775808;
-- Sanity: full table (also missing MAX on columnar)
SELECT * FROM test.t_1 ORDER BY a;
SELECT COUNT(*) FROM test.t_1;
Compare with SET SESSION tidb_isolation_read_engines = 'tikv' on the same table.
Optional (MPP):
SET tidb_enforce_mpp = 1;
EXPLAIN ANALYZE SELECT * FROM test.t_1 WHERE a > -9223372036854775808;
2. What did you expect to see? (Required)
Same results as TiKV / the integration test (query.test):
Line 37 — SELECT * FROM test.t_1 WHERE a > -9223372036854775808:
+---------------------+------+
| a | col |
+---------------------+------+
| 0 | 3 |
| 9223372036854775807 | 2 |
+---------------------+------+
2 rows in set
Full table — 3 rows: INT64_MIN / 1, 0 / 3, INT64_MAX / 2; COUNT(*) = 3.
3. What did you see instead (Required)
On TiFlash (columnar / disaggregated read path), queries succeed without SQL error but omit the row whose clustered PK is 9223372036854775807.
Line 37 (a > INT64_MIN):
| Engine | Rows | Result |
|---|---|---|
| TiKV | 2 | 0/3, 9223372036854775807/2 |
| TiFlash | 1 | 0/3 only |
Full table (SELECT * ORDER BY a):
| Engine | Rows | Handles present |
|---|---|---|
| TiKV | 3 | INT64_MIN, 0, INT64_MAX |
| TiFlash | 2 | INT64_MIN, 0 only |
| TiFlash COUNT(*) | 2 | - |
Related predicates on TiFlash (same missing MAX row):
- WHERE a >= -9223372036854775808: 2 rows (INT64_MIN, 0), not 3.
- WHERE a >= 9223372036854775807: 0 rows (expected 1).
EXPLAIN ANALYZE (MPP) may show TableFullScan on TiFlash with actRows = 2, consistent with storage returning only two handles.
This is not the enum-PK columnar bug (#10851) and not the partition-table _tidb_tid schema mismatch on transaction scans.
Investigation summary (TiFlash / proxy / columnar)
1. TiKV key encoding vs proxy decode (handle in record keys)
- TiDB/TiKV int handle keys use memcomparable encode_i64 / decode_i64 (tidb_query_datatype::codec::table::encode_row_key / decode_int_handle), aligned with TiFlash RecordKVFormat::encodeInt64.
- Boundary roundtrip is correct, e.g. INT64_MAX → key suffix ffffffffffffffff.
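The boundary check above can be reproduced with a minimal sketch of a memcomparable i64 encoding. It assumes the usual scheme of flipping the sign bit and writing the result big-endian; the function names below are illustrative stand-ins, not the actual tidb_query_datatype API.

```rust
/// Sign-bit mask that maps i64 onto an order-preserving u64.
const SIGN_MASK: u64 = 0x8000_0000_0000_0000;

/// Encode an i64 so that bytewise comparison matches numeric order.
/// Hypothetical helper; assumed to mirror what encode_i64 produces.
fn encode_i64_memcomparable(v: i64) -> [u8; 8] {
    ((v as u64) ^ SIGN_MASK).to_be_bytes()
}

/// Inverse of `encode_i64_memcomparable`.
fn decode_i64_memcomparable(b: [u8; 8]) -> i64 {
    (u64::from_be_bytes(b) ^ SIGN_MASK) as i64
}
```

Under this scheme INT64_MAX encodes to eight 0xff bytes and INT64_MIN to eight 0x00 bytes, so the encoding itself leaves no room for a key that sorts after INT64_MAX within the same 8-byte suffix.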
2. Columnar pack storage (logical handle)
- RowToColumnarReader::push_handle_from_key decodes the KV key with decode_int_handle, then stores int_handle.to_le_bytes() in columnar packs (LE logical value, not the memcomparable key bytes).
- Rows INT64_MIN and 0 are returned with correct a values on TiFlash, so pack decode/display for those handles works.
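Because packs hold the little-endian logical value rather than the memcomparable key bytes, any ordering over stored handles has to decode them first. The sketch below only illustrates that general property of to_le_bytes; both helper names are hypothetical and nothing here is taken from the proxy code.

```rust
use std::cmp::Ordering;

/// Returns true when bytewise order of the LE encodings agrees with
/// numeric order for this particular pair of values.
fn le_bytes_preserve_order(a: i64, b: i64) -> bool {
    a.cmp(&b) == a.to_le_bytes().cmp(&b.to_le_bytes())
}

/// Compare two stored 8-byte LE handles by decoding them to i64 first.
fn cmp_le_handles(a: &[u8; 8], b: &[u8; 8]) -> Ordering {
    i64::from_le_bytes(*a).cmp(&i64::from_le_bytes(*b))
}
```

For example, 1 and 256 compare in the wrong order bytewise in LE form, while the decoded comparison is correct for every pair, including INT64_MIN against INT64_MAX.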
3. Read path: region range → scan bounds
- TiFlash passes region KeyRanges to proxy (StorageDisaggregatedColumnar → fn_get_columnar_reader).
- kvengine/src/read.rs update_range_handle sets start_handle / end_handle via decode_int_handle; if the region end is the next table prefix, end_handle = None (no upper bound).
- ColumnarMvccReader filters with handle >= end_int_handle when end_int_handle is set (half-open interval). If any caller passes Some(INT64_MAX) as the handle upper bound, the MAX row itself is excluded. This is worth auditing for pushed ranges / cop ranges that use encode_row_key(table_id, i64::MAX) as an exclusive end key.
For the reproduced cluster, test.t_1 had a single region and full-table scan still returned only 2 rows, so the primary issue is not a per-query region end decode alone.
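The half-open hazard described in this section can be shown with a small model. This is a sketch under the assumption that the reader treats the range as [start, end) with None meaning unbounded; HandleRange and scan are hypothetical names, not types from kvengine.

```rust
/// Half-open handle range [start, end); `end = None` means no upper bound.
#[derive(Debug, Clone, Copy)]
struct HandleRange {
    start: i64,
    end: Option<i64>,
}

impl HandleRange {
    /// A handle is rejected when it is below `start` or at/above `end`.
    fn contains(&self, handle: i64) -> bool {
        if handle < self.start {
            return false;
        }
        match self.end {
            Some(end) => handle < end,
            None => true,
        }
    }
}

/// Keep only the handles that fall inside the range.
fn scan(rows: &[i64], range: HandleRange) -> Vec<i64> {
    rows.iter().copied().filter(|h| range.contains(*h)).collect()
}
```

With the three handles from the repro, an end of Some(i64::MAX) yields only INT64_MIN and 0, while an end of None yields all three. That matches the symptom, but as noted above it does not by itself explain the single-region full-table case.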
4. Suspected root cause: handle index sentinel at i64::MAX
In contrib/tiflash-proxy-columnar/components/kvengine/src/table/columnar/builder.rs, finish_table appends a sentinel handle to handle_index so pack seek can load the last pack:
// if the handle is already i64::MAX, there is no next handle.
if next_handle != i64::MAX {
    self.handle_builder.handle_index.push((next_handle + 1).to_le_bytes().to_vec());
}
When the table's maximum clustered PK is 9223372036854775807, next_handle + 1 would overflow, so the guard skips the sentinel and handle_index gets no closing entry after the last pack. HandleIndex::search_pack_idx + ColumnarTableReader::seek may then fail to read the last pack's row. This matches query.test, which explicitly inserts INT64_MAX as a PK.
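To make the suspected mechanism concrete, here is a toy model of a handle index whose lookup needs both a lower and an upper boundary for a pack. It is an assumption-laden sketch: the real HandleIndex::search_pack_idx may work differently, the index here holds i64 values instead of LE byte vectors, and build_handle_index and search_pack_idx_toy are hypothetical names.

```rust
/// Build a toy handle index: the first handle of every pack, followed by a
/// closing sentinel of `last_handle + 1`. The sentinel is skipped when the
/// addition would overflow, mirroring the `!= i64::MAX` guard quoted above.
fn build_handle_index(pack_first_handles: &[i64], last_handle: i64) -> Vec<i64> {
    let mut index = pack_first_handles.to_vec();
    if let Some(next) = last_handle.checked_add(1) {
        index.push(next);
    }
    index
}

/// Toy lookup (assumed behavior): a pack is found only when the index has an
/// entry on each side of the handle, i.e. index[i] <= handle < index[i + 1].
fn search_pack_idx_toy(index: &[i64], handle: i64) -> Option<usize> {
    index
        .windows(2)
        .position(|w| w[0] <= handle && handle < w[1])
}
```

In this model, three single-row packs starting at INT64_MIN, 0, and INT64_MAX produce an index with no closing entry, so lookups for INT64_MIN and 0 succeed and the lookup for INT64_MAX finds nothing. Whether the production reader fails in exactly this way still needs to be confirmed against the code.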
5. Write path
- Columnar L2 is built via RowToColumnarReader + ColumnarTableBuilder (compaction.rs transform_for_columnar / convert_row_file_to_columnar_file), with set_unbounded_handle_range() for ingest.
Fix direction (proposal):
- In builder.rs, when max_handle == i64::MAX, append a non-overflowing upper sentinel for handle_index (e.g. a byte sequence that sorts after all row handles in memcomparable order, or a dedicated table-end marker used only for index seek).
- Add a regression test: three rows (INT64_MIN, 0, INT64_MAX), covering columnar build + read via disaggregated cop / read_block.
- Audit set_int_handle_range / end_int_handle so exclusive end semantics do not drop the MAX row when the cop range end key is encode_row_key(t, i64::MAX).
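One possible shape for the "dedicated table-end marker" option is sketched below. It is only an illustration of the idea: Boundary, closing_boundary, and search_pack_idx_bounded are hypothetical names, and the actual handle_index stores byte vectors, so a real fix would need an equivalent on-disk representation.

```rust
/// Exclusive upper boundary that closes the last pack in a toy handle index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Boundary {
    /// Ordinary exclusive bound that still fits in i64.
    Handle(i64),
    /// Marker that sorts after every i64 handle, including i64::MAX.
    TableEnd,
}

impl Boundary {
    /// True when `handle` lies strictly below this boundary.
    fn is_above(&self, handle: i64) -> bool {
        match self {
            Boundary::Handle(b) => handle < *b,
            Boundary::TableEnd => true,
        }
    }
}

/// Pick the closing boundary without ever computing i64::MAX + 1.
fn closing_boundary(last_handle: i64) -> Boundary {
    match last_handle.checked_add(1) {
        Some(next) => Boundary::Handle(next),
        None => Boundary::TableEnd,
    }
}

/// Find the pack whose first handle is the largest one <= `handle`.
/// `pack_first_handles` must be sorted ascending.
fn search_pack_idx_bounded(
    pack_first_handles: &[i64],
    closing: Boundary,
    handle: i64,
) -> Option<usize> {
    if !closing.is_above(handle) {
        return None;
    }
    // Number of packs that start at or before `handle`.
    let n = pack_first_handles.partition_point(|first| *first <= handle);
    n.checked_sub(1)
}
```

The three-row case from the proposed regression test resolves every handle here, including INT64_MAX, while handles at or past an ordinary closing bound are still rejected.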
Log analysis (mycli repro without DROP TABLE)
Finished reading remote snapshot through proxy, rows=N (StorageDisaggregatedColumnar.cpp, RNProxyInputStream destructor) counts rows delivered from proxy before schema projection (AddExtraTableIDColumnTransformAction::totalRows()).
| Query | Finished reading proxy snapshots | Finished reading remote snapshot through proxy | MPP TableFullScan outbound_rows |
|---|---|---|---|
| a > INT64_MIN | rows=1 | rows=1 | 1 |
| Full table | rows=2 | rows=2 | 2 |
| COUNT(*) | rows=2 | rows=2 | 2 |
Conclusions from logs:
- Bug is at proxy → TiFlash boundary (MPP outbound_rows equals proxy rows), not post-projection loss.
- Unlike enum-PK columnar bugs (rows=2 with wrong values), this is row loss (TiKV 3 vs TiFlash 2 on full scan).
- Proxy snapshot metadata: columnar_creates.biggest ends with ...FFFFFFFFFFFFFF, but read_block returns only 2 rows, so the metadata upper bound and the readable rows disagree (supports handle-index / last-pack read issue at i64::MAX).
- filter_conditions: [] on proxy reads, so the predicate is not pushed to the columnar filter; the row count difference comes from scan range + stored data.
Example TiFlash log (line 37):
[INFO] Finished reading proxy snapshots, rows=1 cost=0.001s
[INFO] Finished reading remote snapshot through proxy, rows=1 bytes=13 read_cost=0.001s deserialize_cost=0.000s
4. What is your TiFlash version? (Required)
- Environment: Next-gen / disaggregated with columnar (ENABLE_NEXT_GEN=1, ENABLE_NEXT_GEN_COLUMNAR=1), tiflash-proxy-columnar + cloud-storage-engine (tikv-server / tikv-worker).
- Affected component: kvengine columnar (columnar/builder.rs handle index, columnar/reader.rs seek/MVCC range) → TiFlash StorageDisaggregatedColumnar / RNProxyInputStream (not classic DeltaMerge storage).
- Reference test: tests/fullstack-test2/clustered_index/query.test line 37.
Note for reviewers: After a proxy fix, rebuild TiFlash (proxy is linked into the columnar build), restart TiFlash, and rebuild or re-compact columnar files for the table before re-running SQL verification.