
*: support new columnar data source #10842

Open
yongman wants to merge 4 commits into pingcap:master from yongman:pick-columnar-to-master

Conversation

yongman (Member) commented May 13, 2026

What problem does this PR solve?

Issue Number: close #10844

Problem Summary:

What is changed and how it works?


Add the new columnar storage as a data source for TiDB X.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Manual Test

  1. Build TiFlash with the CMake option -DENABLE_COLUMNAR_DISAGG=ON to enable building with tiflash-proxy-columnar.
  2. Run the TiFlash compute node with the option use_columnar=true in its config.
  3. Run TiFlash workloads.
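For step 2, a sketch of the compute-node config file (the use_columnar key is parsed as flash.use_columnar per the walkthrough; the surrounding layout is an assumption):

```toml
# Hypothetical compute-node config sketch; only use_columnar is new in this PR.
[flash]
use_columnar = true
```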
MySQL root@localhost:test> show create table t1;
+-------+-------------------------------------------------------------+
| Table | Create Table                                                |
+-------+-------------------------------------------------------------+
| t1    | CREATE TABLE `t1` (                                         |
|       |   `a` int DEFAULT NULL                                      |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+-------------------------------------------------------------+

1 row in set
Time: 0.019s
MySQL root@localhost:test> SET @@session.tidb_isolation_read_engines='tiflash';
Query OK, 0 rows affected
Time: 0.001s
MySQL root@localhost:test> explain analyze select * from t1;
+---------------------+---------+---------+--------------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+-----------+------+
| id                  | estRows | actRows | task         | access object | execution info                                                                                                                                                          | operator info                        | memory    | disk |
+---------------------+---------+---------+--------------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+-----------+------+
| TableReader_10      | 5.00    | 5       | root         |               | time:61.7ms, open:10.4ms, close:12µs, loops:2, RU:0.00, cop_task: {num: 2, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00}, fetch_resp_duration: 51.2ms | MppVersion: 3, data:ExchangeSender_9 | 686 Bytes | N/A  |
| └─ExchangeSender_9  | 5.00    | 5       | mpp[tiflash] |               | tiflash_task:{time:40.8ms, loops:1, threads:1}, tiflash_network: {inner_zone_send_bytes: 56}                                                                            | ExchangeType: PassThrough            | N/A       | N/A  |
|   └─TableFullScan_8 | 5.00    | 5       | mpp[tiflash] | table:t1      | tiflash_task:{time:40.8ms, loops:1, threads:1}                                                                                                                          | keep order:false, stats:pseudo       | N/A       | N/A  |
+---------------------+---------+---------+--------------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+-----------+------+

3 rows in set
Time: 0.073s
MySQL root@localhost:test>

Summary by CodeRabbit

  • New Features

    • Added columnar disaggregated reads via a dedicated proxy path.
    • New runtime config to enable/opt into columnar disaggregated behavior.
  • Chores

    • Added dedicated submodule and build-time option to control columnar proxy support.
    • Build now conditionally includes/excludes columnar disaggregated code.
  • Bug Fixes

    • Improved proxy read error handling with clearer retry and snapshot semantics.

Review Change Stack

Signed-off-by: yongman <yming0221@gmail.com>
ti-chi-bot Bot (Contributor) commented May 13, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. labels May 13, 2026
ti-chi-bot Bot (Contributor) commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hehechen for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 13, 2026
coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0b5a078d-24e7-4fcf-9059-9a5c820b22cd

📥 Commits

Reviewing files that changed from the base of the PR and between 77c5a0d and 013f968.

📒 Files selected for processing (2)
  • dbms/src/DataTypes/DataTypeDecimal.cpp
  • dbms/src/Storages/StorageDisaggregatedRemote.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • dbms/src/Storages/StorageDisaggregatedRemote.cpp
  • dbms/src/DataTypes/DataTypeDecimal.cpp

📝 Walkthrough

Walkthrough

Adds an opt-in columnar disaggregated read path: new CMake option and submodule, propagates use_columnar through config and server startup, adjusts proxy config/start logic, routes StorageDisaggregated reads to a new proxy-backed columnar implementation, and provides fallback stubs when the feature is disabled.

Changes

Columnar Disaggregated Storage Feature

  • Build and Submodule Setup (.gitmodules, CMakeLists.txt, cmake/find_tiflash_proxy.cmake, contrib/tiflash-proxy-cmake/CMakeLists.txt, contrib/tiflash-proxy-columnar, dbms/CMakeLists.txt, dbms/src/Common/config.h.in): New ENABLE_COLUMNAR_DISAGG CMake option, new contrib/tiflash-proxy-columnar submodule and pinned commit, proxy source selection and SERVERLESS_PROXY detection, jemalloc feature selection update, and conditional inclusion of columnar sources.
  • Configuration Propagation (dbms/src/Core/TiFlashDisaggregatedMode.h, dbms/src/Core/TiFlashDisaggregatedMode.cpp, dbms/src/Interpreters/SharedContexts/Disagg.h, dbms/src/Interpreters/Context.cpp, dbms/src/Server/Server.cpp): Adds use_columnar to DisaggOptions, parses flash.use_columnar, and propagates the flag into shared context and server startup wiring.
  • Proxy Configuration and Startup Wiring (dbms/src/Storages/KVStore/ProxyStateMachine.h, dbms/src/Common/ErrorCodes.cpp): TiFlashProxyConfig accepts use_columnar; try-parse logic supports init-only proxy startup when columnar is enabled; getArgs() omits empty arg values; blacklist-file gating updated; new COLUMNAR_SNAPSHOT_ERROR error code added.
  • Storage Infrastructure & Helpers (dbms/src/Storages/KVStore/KVStore.h, dbms/src/Storages/KVStore/KVStore.cpp, dbms/src/Storages/KVStore/TMTContext.cpp, dbms/src/DataStreams/AddExtraTableIDColumnTransformAction.h, dbms/src/DataStreams/AddExtraTableIDColumnTransformAction.cpp, dbms/src/DataTypes/DataTypeDecimal.cpp): KVStore adds restoreProxyHelper(). TMTContext preserves kvstore when columnar is enabled and restores the proxy helper on restore. The extra-table-id transform gains fill(); Decimal256 deserialization adds a proxy-compatible decode fast path.
  • Storage Read Path Routing (dbms/src/Storages/StorageDisaggregated.h, dbms/src/Storages/StorageDisaggregated.cpp): The StorageDisaggregated constructor short-circuits in columnar mode; both read() overloads delegate to readThroughProxy() when isReadColumnar() is true. getExtraCastExpr() and extraCast() are extended to consider pushed-down filter columns; includes for DAGUtils and Disagg added.
  • Columnar Read Component Type Declarations (dbms/src/Storages/StorageDisaggregatedColumnar.h): Adds declarations for the RNProxyReader, RNProxyReadTask, RNProxyInputStream, and RNProxySourceOp types with factory methods, options, and lifecycle interfaces.
  • Columnar Proxy Read Implementation (dbms/src/Storages/StorageDisaggregatedColumnar.cpp): Implements isReadColumnar, both readThroughProxy overloads, merging of pushed-down filters, generated-column placeholder execution, RNProxyReader creation with error mapping (region/snapshot/lock), RNProxyReadTask building with region validation and concurrency sizing, RNProxyInputStream block deserialization and accounting, and RNProxySourceOp operator execution.
  • Decimal256 Proxy Decode (dbms/src/DataTypes/DataTypeDecimal.cpp): Adds a decodeProxyDecimal256 helper and a per-element deserialize fast path for Decimal256 used by proxy columnar deserialization.
  • Fallback Implementations for Disabled Build (dbms/src/Storages/StorageDisaggregatedRemote.cpp): When !ENABLE_COLUMNAR_DISAGG, isReadColumnar() returns false, the readThroughProxy() overloads abort with a runtime check, and filterConditionsWithPushedDownFilters() forwards to the existing filterConditions().

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

approved, lgtm

Suggested reviewers

  • CalvinNeo
  • JinheLin
  • yudongusa

Poem

🐰 I hopped a branch and added a gate,
A proxy path for columnar fate,
Submodule sewn and flags aligned,
Readers, tasks, and streams combined,
Now data hops through cloud and crate!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 10.29%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check (✅ Passed): The PR title '*: support new columnar data source' clearly describes the main change—adding support for a new columnar storage data source as evidenced by the extensive codebase additions across multiple layers.
  • Description check (✅ Passed): The PR description follows the template structure, includes the linked issue (#10844), references testing (unit, integration, and manual), provides manual test steps with example output, and includes release notes.
  • Linked Issues check (✅ Passed): The PR implementation fully supports the linked issue #10844 objective to add new columnar storage as a data source, including build-time configuration (ENABLE_COLUMNAR_DISAGG), runtime configuration (use_columnar flag), and a complete read path implementation through the new RNProxy reader components.
  • Out of Scope Changes check (✅ Passed): All changes are directly related to implementing the columnar disaggregated storage feature: build system integration, configuration options, proxy helper restoration, deserialization support, and the complete read pipeline through RNProxy components. No unrelated code modifications are present.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: yongman <yming0221@gmail.com>
@yongman yongman marked this pull request as ready for review May 14, 2026 08:17
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 14, 2026
pantheon-ai Bot commented May 14, 2026

@yongman I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
dbms/src/Storages/KVStore/ProxyStateMachine.h (1)

132-132: ⚖️ Poor tradeoff

Consider wrapping init_only logic in conditional compilation.

The init_only variable (line 135) and its assignment logic (lines 139-151) are only used within the #if SERVERLESS_PROXY == 1 block (lines 205-206). When SERVERLESS_PROXY is not defined, this variable and the associated logic are unused. Consider wrapping the entire init_only variable declaration and assignment in #if SERVERLESS_PROXY == 1 for cleaner conditional compilation.

♻️ Suggested refactor
     bool tryParseFromConfig(
         const Poco::Util::LayeredConfiguration & config,
         const DisaggregatedMode disaggregated_mode,
         const bool use_autoscaler,
         const bool use_columnar,
         const LoggerPtr & log)
     {
-        bool init_only = false;
         // tiflash_compute doesn't need proxy except when using columnar.
         if (disaggregated_mode == DisaggregatedMode::Compute && use_autoscaler)
         {
             if (use_columnar)
             {
+#if SERVERLESS_PROXY == 1
+                bool init_only = true;
                 LOG_INFO(
                     log,
                     "TiFlash Proxy will start because columnar is enabled with AutoScale Disaggregated Compute Mode "
                     "specified.");
-                init_only = true;
+#else
+                LOG_INFO(log, "TiFlash Proxy will not start because SERVERLESS_PROXY is not enabled.");
+                return false;
+#endif
             }

Also applies to: 135-151

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/KVStore/ProxyStateMachine.h` at line 132, The variable
init_only and its assignment logic are only used under the SERVERLESS_PROXY
feature flag; to avoid unused-code when SERVERLESS_PROXY != 1, move the
declaration and assignment of init_only inside a `#if` SERVERLESS_PROXY == 1 /
`#endif` block. Locate the init_only declaration and the block that assigns it
(inside the ProxyStateMachine constructor or where init_only is defined) and
wrap both the declaration and the subsequent assignment logic in the conditional
compilation so the symbol is only compiled when SERVERLESS_PROXY is enabled.
dbms/src/Storages/StorageDisaggregatedColumnar.h (1)

234-234: 💤 Low value

Remove or document the commented-out field.

The commented line //double duration_wait_ready_task_sec = 0; should either be removed if not needed or uncommented with documentation if planned for future use.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.h` at line 234, Remove or
properly document the commented-out field `duration_wait_ready_task_sec` in
StorageDisaggregatedColumnar: either delete the line `//double
duration_wait_ready_task_sec = 0;` if it's unused, or uncomment it and add a
brief comment explaining its purpose and units (e.g., "time spent waiting for
ready tasks in seconds") and why it's kept for future use so its intent is clear
to readers and static analysis.
dbms/src/Storages/StorageDisaggregatedColumnar.cpp (2)

221-224: 💤 Low value

Consider improving readability of the boolean parameter.

The false parameter in convertTimeZoneByOffset lacks context. If the API doesn't support named parameters, consider adding a brief comment explaining what the boolean controls.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 221 - 224,
The anonymous boolean literal `false` passed to convertTimeZoneByOffset reduces
readability; update the call site where timezone_info.timezone_offset is handled
to either replace `false` with a clearly named constant/enum (e.g., use a
constant like IGNORE_DST or DONT_APPLY_DST) or add an inline comment describing
the flag's meaning (for example: /* apply_dst = false */) so future readers
immediately understand what that parameter controls; locate the call to
convertTimeZoneByOffset near convertTimeZone and timezone_info.timezone_offset
in StorageDisaggregatedColumnar.cpp to make the change.

566-566: ⚡ Quick win

Use fmt-style Exception constructor.

These exceptions use the old-style constructor. As per coding guidelines, prefer the fmt-style constructor with error code first: throw Exception(ErrorCodes::SOME_CODE, "Message").

♻️ Suggested fix
-        throw Exception("lock error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "lock error");
-        throw Exception("pd client error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "pd client error");
-        throw Exception("unknown error type", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "unknown error type");

As per coding guidelines: "Prefer the fmt-style constructor for DB::Exception with error code first: throw Exception(ErrorCodes::SOME_CODE, "Message with {}", arg);"

Also applies to: 571-571, 576-576

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` at line 566, Replace
old-style DB::Exception construction with the fmt-style (error code first) for
the thrown exceptions in StorageDisaggregatedColumnar.cpp: change the calls that
currently use throw Exception("lock error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR)
(and the other two similar occurrences) to use throw
Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "lock error") so the error code
is the first argument; ensure you update each instance that uses the old-ordered
arguments to the new fmt-style Exception(ErrorCodes::..., "message") form.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmake/find_tiflash_proxy.cmake`:
- Around line 63-64: Replace the invalid CMake message mode "FATAL" with the
correct "FATAL_ERROR" in the conditional that checks for the next-gen
tiflash-proxy; specifically update the message(...) call inside the if
(ENABLE_NEXT_GEN AND NOT ENABLE_COLUMNAR_DISAGG AND NOT EXISTS
".../contrib/tiflash-proxy-next-gen/.../cloud_helper.rs") branch so it uses
FATAL_ERROR (matching other occurrences like the one near line 58 and
find_tipb.cmake) to ensure configuration stops when the file is missing.

---

Nitpick comments:
In `@dbms/src/Storages/KVStore/ProxyStateMachine.h`:
- Line 132: The variable init_only and its assignment logic are only used under
the SERVERLESS_PROXY feature flag; to avoid unused-code when SERVERLESS_PROXY !=
1, move the declaration and assignment of init_only inside a `#if`
SERVERLESS_PROXY == 1 / `#endif` block. Locate the init_only declaration and the
block that assigns it (inside the ProxyStateMachine constructor or where
init_only is defined) and wrap both the declaration and the subsequent
assignment logic in the conditional compilation so the symbol is only compiled
when SERVERLESS_PROXY is enabled.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp`:
- Around line 221-224: The anonymous boolean literal `false` passed to
convertTimeZoneByOffset reduces readability; update the call site where
timezone_info.timezone_offset is handled to either replace `false` with a
clearly named constant/enum (e.g., use a constant like IGNORE_DST or
DONT_APPLY_DST) or add an inline comment describing the flag's meaning (for
example: /* apply_dst = false */) so future readers immediately understand what
that parameter controls; locate the call to convertTimeZoneByOffset near
convertTimeZone and timezone_info.timezone_offset in
StorageDisaggregatedColumnar.cpp to make the change.
- Line 566: Replace old-style DB::Exception construction with the fmt-style
(error code first) for the thrown exceptions in
StorageDisaggregatedColumnar.cpp: change the calls that currently use throw
Exception("lock error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR) (and the other two
similar occurrences) to use throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR,
"lock error") so the error code is the first argument; ensure you update each
instance that uses the old-ordered arguments to the new fmt-style
Exception(ErrorCodes::..., "message") form.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.h`:
- Line 234: Remove or properly document the commented-out field
`duration_wait_ready_task_sec` in StorageDisaggregatedColumnar: either delete
the line `//double duration_wait_ready_task_sec = 0;` if it's unused, or
uncomment it and add a brief comment explaining its purpose and units (e.g.,
"time spent waiting for ready tasks in seconds") and why it's kept for future
use so its intent is clear to readers and static analysis.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0082d8d8-0032-40bb-9aac-04d5ea134b44

📥 Commits

Reviewing files that changed from the base of the PR and between ed4e382 and ed0a0a0.

📒 Files selected for processing (25)
  • .gitmodules
  • CMakeLists.txt
  • cmake/find_tiflash_proxy.cmake
  • contrib/tiflash-proxy-cmake/CMakeLists.txt
  • contrib/tiflash-proxy-columnar
  • dbms/CMakeLists.txt
  • dbms/src/Common/ErrorCodes.cpp
  • dbms/src/Common/config.h.in
  • dbms/src/Core/TiFlashDisaggregatedMode.cpp
  • dbms/src/Core/TiFlashDisaggregatedMode.h
  • dbms/src/DataStreams/AddExtraTableIDColumnTransformAction.cpp
  • dbms/src/DataStreams/AddExtraTableIDColumnTransformAction.h
  • dbms/src/DataTypes/DataTypeDecimal.cpp
  • dbms/src/Interpreters/Context.cpp
  • dbms/src/Interpreters/SharedContexts/Disagg.h
  • dbms/src/Server/Server.cpp
  • dbms/src/Storages/KVStore/KVStore.cpp
  • dbms/src/Storages/KVStore/KVStore.h
  • dbms/src/Storages/KVStore/ProxyStateMachine.h
  • dbms/src/Storages/KVStore/TMTContext.cpp
  • dbms/src/Storages/StorageDisaggregated.cpp
  • dbms/src/Storages/StorageDisaggregated.h
  • dbms/src/Storages/StorageDisaggregatedColumnar.cpp
  • dbms/src/Storages/StorageDisaggregatedColumnar.h
  • dbms/src/Storages/StorageDisaggregatedRemote.cpp

Comment on lines +63 to 64
if (ENABLE_NEXT_GEN AND NOT ENABLE_COLUMNAR_DISAGG AND NOT EXISTS "${TiFlash_SOURCE_DIR}/contrib/tiflash-proxy-next-gen/proxy_components/proxy_ffi/src/cloud_helper.rs")
message(FATAL "Can't find next-gen tiflash-proxy")

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify valid CMake message modes and locate invalid usage in this repo.
cmake --help-command message | rg -n "FATAL_ERROR|SEND_ERROR|WARNING|STATUS|NOTICE|VERBOSE|DEBUG|TRACE"
rg -n '\bmessage\s*\(\s*FATAL\b' cmake/find_tiflash_proxy.cmake

Repository: pingcap/tiflash

Length of output: 179


🌐 Web query:

CMake message command FATAL_ERROR valid modes

💡 Result:

In CMake, the command is message() and the “mode” argument selects the message type. The valid modes for message() are: [1][2] 1) (none) / omitted: Important information 2) STATUS: Incidental information 3) WARNING: CMake Warning, continue processing 4) AUTHOR_WARNING: CMake Warning (dev), continue processing 5) SEND_ERROR: CMake Error, continue processing, but skip generation 6) FATAL_ERROR: CMake Error, stop processing and generation 7) DEPRECATION: CMake Deprecation Error or Warning (depending on whether CMAKE_ERROR_DEPRECATED / CMAKE_WARN_DEPRECATED is enabled) What FATAL_ERROR does: it “stops processing and generation” and CMake returns a non-zero exit code. [1][3]



🏁 Script executed:

# Search for message() usage patterns in cmake files to see what modes are used
rg -n '\bmessage\s*\(' cmake/ --type cmake -A 1 | head -100

Repository: pingcap/tiflash

Length of output: 6289


Use FATAL_ERROR instead of FATAL in message()

Line 64 uses an invalid CMake message mode. The valid modes are: STATUS, WARNING, AUTHOR_WARNING, SEND_ERROR, FATAL_ERROR, and DEPRECATION. FATAL is not recognized, so this check will not stop configuration as intended when the next-gen proxy is missing. Replace it with FATAL_ERROR, which is the correct mode used elsewhere in the codebase (e.g., lines 58, and similar files like find_tipb.cmake).

Proposed fix
-    message(FATAL "Can't find next-gen tiflash-proxy")
+    message(FATAL_ERROR "Can't find next-gen tiflash-proxy")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 if (ENABLE_NEXT_GEN AND NOT ENABLE_COLUMNAR_DISAGG AND NOT EXISTS "${TiFlash_SOURCE_DIR}/contrib/tiflash-proxy-next-gen/proxy_components/proxy_ffi/src/cloud_helper.rs")
-    message(FATAL "Can't find next-gen tiflash-proxy")
+    message(FATAL_ERROR "Can't find next-gen tiflash-proxy")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmake/find_tiflash_proxy.cmake` around lines 63 - 64, Replace the invalid
CMake message mode "FATAL" with the correct "FATAL_ERROR" in the conditional
that checks for the next-gen tiflash-proxy; specifically update the message(...)
call inside the if (ENABLE_NEXT_GEN AND NOT ENABLE_COLUMNAR_DISAGG AND NOT
EXISTS ".../contrib/tiflash-proxy-next-gen/.../cloud_helper.rs") branch so it
uses FATAL_ERROR (matching other occurrences like the one near line 58 and
find_tipb.cmake) to ensure configuration stops when the file is missing.

Signed-off-by: yongman <yming0221@gmail.com>
coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
dbms/src/Storages/StorageDisaggregatedColumnar.cpp (2)

821-824: ⚡ Quick win

Log the swallowed destructor exception.

Keeping the destructor non-throwing is right, but silently discarding the failure makes accounting/logging issues invisible. tryLogCurrentException keeps the behavior and preserves the signal.

🪵 Suggested change
     catch (...)
     {
-        // Destructors must not throw.
+        tryLogCurrentException(log, "Failed to finalize proxy read accounting");
     }
As per coding guidelines, "In broad `catch (...)` paths, prefer `tryLogCurrentException(log, "context")` to avoid duplicated exception-formatting code"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 821 - 824,
The destructor currently swallows all exceptions with a bare catch(...) in
StorageDisaggregatedColumnar (or the related destructor shown) — preserve the
non-throwing behavior but log the exception by calling
tryLogCurrentException(log,
"StorageDisaggregatedColumnar::~StorageDisaggregatedColumnar") (or an
appropriate context string) inside the catch block; ensure you reference the
existing logger variable used in this translation unit and do not rethrow so
destructors remain noexcept.

555-577: ⚡ Quick win

Use the error-code-first DB::Exception constructor here.

These new throws use the legacy (message, code) overload. Please switch them to the repo-standard (code, fmt, ...) form for consistency.

🔧 Suggested cleanup
-        throw Exception("lock error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "lock error");
@@
-        throw Exception("pd client error", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "pd client error");
@@
-        throw Exception("unknown error type", ErrorCodes::COLUMNAR_SNAPSHOT_ERROR);
+        throw Exception(
+            ErrorCodes::COLUMNAR_SNAPSHOT_ERROR,
+            "unknown error type");
@@
-        throw Exception("read_block failed in tiflash-proxy", ErrorCodes::LOGICAL_ERROR);
+        throw Exception(ErrorCodes::LOGICAL_ERROR, "read_block failed in tiflash-proxy");
As per coding guidelines, "Prefer the fmt-style constructor for `DB::Exception` with error code first: `throw Exception(ErrorCodes::SOME_CODE, "Message with {}", arg);`"

Also applies to: 849-851

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 555 - 577,
Replace the legacy DB::Exception constructor calls that pass (message, code)
with the fmt-style, error-code-first form; specifically, in the branches
handling ColumnarReaderErrorType::LockedError, ::PdClientError and the default
non-OK branch (where current throws are throw Exception("lock error",
ErrorCodes::COLUMNAR_SNAPSHOT_ERROR), throw Exception("pd client error",
ErrorCodes::COLUMNAR_SNAPSHOT_ERROR) and throw Exception("unknown error type",
ErrorCodes::COLUMNAR_SNAPSHOT_ERROR)), change them to throw
Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "lock error") / throw
Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "pd client error") / throw
Exception(ErrorCodes::COLUMNAR_SNAPSHOT_ERROR, "unknown error type, error_type
{}", uint8_t(columnar_reader.error_type)) respectively; apply the same pattern
to the analogous throws near the other occurrence that uses the old overload.
Nitpick comments:
In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp`:
- Around line 821-824: The destructor currently swallows all exceptions with a
bare catch(...) in StorageDisaggregatedColumnar (or the related destructor
shown) — preserve the non-throwing behavior but log the exception by calling
tryLogCurrentException(log,
"StorageDisaggregatedColumnar::~StorageDisaggregatedColumnar") (or an
appropriate context string) inside the catch block; ensure you reference the
existing logger variable used in this translation unit and do not rethrow so
destructors remain noexcept.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: db0c5d76-3eaf-4ef6-8e24-116bf2b71c3a

📥 Commits

Reviewing files that changed from the base of the PR and between ed0a0a0 and 77c5a0d.

📒 Files selected for processing (2)
  • dbms/src/Storages/KVStore/ProxyStateMachine.h
  • dbms/src/Storages/StorageDisaggregatedColumnar.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • dbms/src/Storages/KVStore/ProxyStateMachine.h

Comment on lines +241 to +277
BlockInputStreams StorageDisaggregated::readThroughProxy(const Context & context, unsigned num_streams)
{
    DAGPipeline pipeline;
    const UInt64 start_ts = sender_target_mpp_task_id.gather_id.query_id.start_ts;
    auto [remote_table_ranges, region_num] = buildRemoteTableRanges();
    const auto generated_column_infos = genGeneratedColumnInfosForDisaggregatedRead(table_scan);
    auto read_proxy_tasks = RNProxyReadTask::buildProxyReadTaskWithBackoff(
        log,
        context,
        start_ts,
        table_scan,
        filter_conditions,
        remote_table_ranges,
        num_streams);
    for (auto & task : read_proxy_tasks)
    {
        auto streams = task->getInputStreams();
        pipeline.streams.insert(pipeline.streams.end(), streams.begin(), streams.end());
    }
    // Avoid reading generated columns from proxy, generate placeholders locally.
    executeGeneratedColumnPlaceholder(generated_column_infos, log, pipeline);
    NamesAndTypes source_columns;
    source_columns.reserve(table_scan.getColumnSize());
    const auto & stream_header = pipeline.firstStream()->getHeader();
    for (const auto & col : stream_header)
    {
        source_columns.emplace_back(col.name, col.type);
    }
    analyzer = std::make_unique<DAGExpressionAnalyzer>(std::move(source_columns), context);

    // Handle duration/timestamp cast for proxy path.
    // We still execute pushed-down filters on RN side, so timestamp columns in those filters
    // must also be converted from UTC to session timezone.
    extraCast(*analyzer, pipeline, /*include_pushed_down_filter_columns=*/true);
    // Handle filter
    filterConditionsWithPushedDownFilters(*analyzer, pipeline);
    return pipeline.streams;
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle empty scans before sizing readers and building headers.

If region splitting returns no remote regions, Line 705 makes real_num_streams == 0 and Line 708 divides by zero. Even after guarding that, Line 264 and Line 314 still assume a non-empty task/header. Please return early for the empty-scan case before computing regions_per_reader and before creating the analyzer.

💡 Suggested guard rails
 BlockInputStreams StorageDisaggregated::readThroughProxy(const Context & context, unsigned num_streams)
 {
     DAGPipeline pipeline;
@@
     auto read_proxy_tasks = RNProxyReadTask::buildProxyReadTaskWithBackoff(
         log,
         context,
         start_ts,
         table_scan,
         filter_conditions,
         remote_table_ranges,
         num_streams);
+    if (read_proxy_tasks.empty())
+        return {};
+
     for (auto & task : read_proxy_tasks)
     {
         auto streams = task->getInputStreams();
         pipeline.streams.insert(pipeline.streams.end(), streams.begin(), streams.end());
     }
@@
 void StorageDisaggregated::readThroughProxy(
     PipelineExecutorContext & exec_context,
     PipelineExecGroupBuilder & group_builder,
     const Context & context,
     unsigned num_streams)
 {
@@
     auto read_proxy_tasks = RNProxyReadTask::buildProxyReadTaskWithBackoff(
         log,
         context,
         start_ts,
         table_scan,
         filter_conditions,
         remote_table_ranges,
         num_streams);
+    if (read_proxy_tasks.empty())
+        return;
+
     const auto generated_column_infos = genGeneratedColumnInfosForDisaggregatedRead(table_scan);
@@
     unsigned region_num = all_remote_regions_by_region.size();
     unsigned physical_table_num = physical_table_ids.size();
+    if (region_num == 0 || num_streams == 0)
+        return tasks;
+
     unsigned real_num_streams = std::min(num_streams, region_num);

Also applies to: 281-324, 703-708

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 241 - 277,
In StorageDisaggregated::readThroughProxy, guard against empty remote regions /
no proxy tasks and empty pipeline before doing any sizing or header work: after
calling buildRemoteTableRanges(), if remote_table_ranges is empty (or
region_num==0) return an empty BlockInputStreams immediately; after building
read_proxy_tasks and populating pipeline.streams, if pipeline.streams.empty()
return early as well before accessing pipeline.firstStream() or creating
analyzer; ensure these early returns happen before any divisions
(regions_per_reader) or calls that assume a non-empty header (analyzer,
extraCast, filterConditionsWithPushedDownFilters) so downstream code won’t
dereference empty streams or divide by zero.

Comment on lines +936 to +1003
OperatorStatus RNProxySourceOp::readImpl(Block & block)
{
    if unlikely (done)
    {
        block = {};
        return OperatorStatus::HAS_OUTPUT;
    }

    if (t_block.has_value())
    {
        std::swap(block, t_block.value());
        t_block.reset();
        return OperatorStatus::HAS_OUTPUT;
    }

    return current_reader_idx < 0 ? OperatorStatus::IO_IN : awaitImpl();
}

OperatorStatus RNProxySourceOp::awaitImpl()
{
    if unlikely (done || t_block.has_value())
    {
        return OperatorStatus::HAS_OUTPUT;
    }

    if unlikely (current_reader_idx < 0)
    {
        current_reader_idx = 0;
    }

    return OperatorStatus::IO_IN;
}

OperatorStatus RNProxySourceOp::executeIOImpl()
{
    if unlikely (done || t_block.has_value())
    {
        return OperatorStatus::HAS_OUTPUT;
    }

    if unlikely (current_reader_idx < 0)
    {
        return awaitImpl();
    }

    FilterPtr filter_ignored = nullptr;
    Stopwatch w{CLOCK_MONOTONIC_COARSE};
    Block block = task->getProxyReaders()[current_reader_idx]->getInputStream()->read(filter_ignored, false);
    duration_read_sec += w.elapsedSeconds();
    if likely (block && block.rows() > 0)
    {
        total_rows += block.rows();
        t_block.emplace(std::move(block));
        return OperatorStatus::HAS_OUTPUT;
    }
    else
    {
        if (current_reader_idx == Int32(task->getProxyReaders().size() - 1))
        {
            done = true;
        }
        else if (current_reader_idx < Int32(task->getProxyReaders().size() - 1))
        {
            ++current_reader_idx;
        }
        // Current stream is drained, try to read from next stream.
        return awaitImpl();
    }
⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf 'Definitions of likely/unlikely:\n'
rg -n -C2 '#\s*define\s+(likely|unlikely)\b|constexpr\s+.*\b(likely|unlikely)\b|inline\s+.*\b(likely|unlikely)\b'

printf '\nNonstandard usages in this file:\n'
rg -n -C1 '\bif\s+(likely|unlikely)\s*\(' dbms/src/Storages/StorageDisaggregatedColumnar.cpp

printf '\nCanonical repo-wide usages:\n'
rg -n -C1 '\bif\s*\(\s*(likely|unlikely)\s*\(' -g '*.{cpp,h,hpp,cc}'

Repository: pingcap/tiflash

Length of output: 50372


Fix the likely / unlikely syntax: these are function-like macros, and the condition must be passed inside their parentheses.

The macros likely(x) and unlikely(x) are defined in libs/libcommon/include/common/likely.h and expand to __builtin_expect(). The argument should be supplied directly, as in if (likely(...)), not if likely (...); even where the preprocessor happens to accept the spaced form, it deviates from the convention used everywhere else and breaks static analysis (Cppcheck reports a syntax error at line 938 below).

The rest of the codebase consistently uses the canonical form if (unlikely(...)) / if (likely(...)). This file has six violations, at lines 938, 956, 961, 971, 976, and 985.

Required fixes
-    if unlikely (done)
+    if (unlikely(done))
@@
-    if unlikely (done || t_block.has_value())
+    if (unlikely(done || t_block.has_value()))
@@
-    if unlikely (current_reader_idx < 0)
+    if (unlikely(current_reader_idx < 0))
@@
-    if unlikely (done || t_block.has_value())
+    if (unlikely(done || t_block.has_value()))
@@
-    if unlikely (current_reader_idx < 0)
+    if (unlikely(current_reader_idx < 0))
@@
-    if likely (block && block.rows() > 0)
+    if (likely(block && block.rows() > 0))
🧰 Tools
🪛 Cppcheck (2.20.0)

[error] 938-938: syntax error

(syntaxError)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dbms/src/Storages/StorageDisaggregatedColumnar.cpp` around lines 936 - 1003,
In RNProxySourceOp::readImpl, RNProxySourceOp::awaitImpl and
RNProxySourceOp::executeIOImpl replace all occurrences of the function-like
macros written without parentheses (e.g. "if likely (cond)" or "if unlikely
(cond)") with the canonical form that calls the macro around the condition (e.g.
"if (likely(cond))" / "if (unlikely(cond))"); ensure every check such as done,
t_block.has_value(), current_reader_idx < 0, and block && block.rows() uses the
corrected syntax so the macros expand to __builtin_expect properly.

Signed-off-by: yongman <yming0221@gmail.com>

Labels

release-note-none: Denotes a PR that doesn't merit a release note.
size/XXL: Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support new columnar storage as data source

1 participant