[metrics] Add client metrics and prometheus example#617

Merged
leekeiabstraction merged 3 commits into apache:main from charlesdong1991:metrics-doc-update-and-prometheus-example
Jun 17, 2026

Conversation

@charlesdong1991 charlesdong1991 commented Jun 14, 2026

Contributor

Purpose

Document the implemented connection/writer/scanner client metrics and show how to collect them with a Prometheus exporter.

Linked issue: close #618

@charlesdong1991

Contributor Author

cc @fresh-borzoni @leekeiabstraction PTAL! 🙏

@leekeiabstraction leekeiabstraction left a comment

Contributor

TY for the PR. Left a question

Comment thread crates/examples/src/example_prometheus_metrics.rs

@leekeiabstraction leekeiabstraction left a comment

Contributor

TY for clarifying. Left a couple more qs.


let scan_records = log_scanner.poll(Duration::from_secs(1)).await?;
let mut count = 0;
for record in scan_records {

Contributor

We are only verifying metrics for sent records. Do we need the loop for polling records? If not, it might be worth removing this loop for a more concise example.

Contributor Author

We still need the poll because all scanner metrics are recorded inside poll(), but you are right that the per-field decode loop drives no metrics, so I replaced it!

The client caches metric handles the first time a writer or scanner is created,
binding them to whichever recorder is installed at that moment. Install your global
recorder **before** calling `FlussConnection::new` (ideally as the very first thing
in `main`). If you install it after creating a writer or scanner, those metrics will

Contributor

This behaviour might catch users off guard. Is it worth returning an error instead?

Contributor Author

I think returning an error isn't workable because "no recorder installed" is a supported mode that most users run in, so writer/scanner creation can't fail just because no recorder is present.

Also, the metrics facade has no hook to detect "installed too late". The only alternative is to stop caching handles and re-resolve them per record, which would add hot-path cost, so we should try to avoid it.

WDYT? I can also add some more explanation on that in this doc 🙏
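
To make the trade-off in this thread concrete, here is a minimal standard-library sketch of handle caching. It is not the Fluss client or the real metrics facade: `Facade`, `Recorder`, `CounterHandle`, `Writer`, and `observed_counts` are all hypothetical names, and the sketch assumes that a handle resolved while no recorder is installed behaves as a silent no-op.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a metrics recorder: a single shared counter.
#[derive(Default)]
struct Recorder {
    records_sent: Arc<AtomicU64>,
}

/// Hypothetical stand-in for the recorder slot of a metrics facade.
#[derive(Default)]
struct Facade {
    installed: Mutex<Option<Arc<Recorder>>>,
}

impl Facade {
    fn install(&self, recorder: Arc<Recorder>) {
        *self.installed.lock().unwrap() = Some(recorder);
    }

    /// Resolve a counter handle against whatever is installed right now.
    fn counter(&self) -> CounterHandle {
        match self.installed.lock().unwrap().as_ref() {
            Some(r) => CounterHandle(Some(Arc::clone(&r.records_sent))),
            // Assumption: with no recorder installed, the handle is a no-op.
            None => CounterHandle(None),
        }
    }
}

/// A cached handle: either bound to a recorder's counter or a no-op.
struct CounterHandle(Option<Arc<AtomicU64>>);

impl CounterHandle {
    fn increment(&self, n: u64) {
        if let Some(counter) = &self.0 {
            counter.fetch_add(n, Ordering::Relaxed);
        }
    }
}

/// Hypothetical writer that resolves its handle once, at construction.
struct Writer {
    sent: CounterHandle,
}

impl Writer {
    fn new(facade: &Facade) -> Self {
        Writer { sent: facade.counter() }
    }

    fn send(&self) {
        // Hot path: no recorder lookup, only an atomic add (or nothing).
        self.sent.increment(1);
    }
}

/// Returns the counts a recorder sees when installed (early, late).
fn observed_counts() -> (u64, u64) {
    // Early: install the recorder, then create the writer.
    let facade = Facade::default();
    let recorder = Arc::new(Recorder::default());
    facade.install(Arc::clone(&recorder));
    let writer = Writer::new(&facade);
    writer.send();
    writer.send();
    let early = recorder.records_sent.load(Ordering::Relaxed);

    // Late: create the writer first, then install the recorder.
    let facade = Facade::default();
    let writer = Writer::new(&facade);
    let recorder = Arc::new(Recorder::default());
    facade.install(Arc::clone(&recorder));
    writer.send();
    writer.send();
    let late = recorder.records_sent.load(Ordering::Relaxed);

    (early, late)
}
```

In this model the late-installed recorder never sees the writer's sends, and neither `Writer::new` nor `send` has any way to notice. Re-resolving through `Facade::counter` on every `send` would fix that, but it would put a lock acquisition on the per-record path, which is the hot-path cost mentioned above.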

@leekeiabstraction leekeiabstraction left a comment

Contributor

Ty for the PR and addressing comments/answering questions. LGTM!

@leekeiabstraction leekeiabstraction merged commit 3187eb4 into apache:main Jun 17, 2026
12 checks passed

Development

Successfully merging this pull request may close these issues.

[metrics] Add client metrics reference and prometheus example
