Skip to content

feat: Add support for reading OTel ProcessContext from profiled process memory#1181

Merged
christos68k merged 22 commits intoopen-telemetry:mainfrom
DataDog:nsavoire/otel_process_context
Apr 15, 2026
Merged

feat: Add support for reading OTel ProcessContext from profiled process memory#1181
christos68k merged 22 commits intoopen-telemetry:mainfrom
DataDog:nsavoire/otel_process_context

Conversation

@nsavoire
Copy link
Copy Markdown
Contributor

@nsavoire nsavoire commented Feb 17, 2026

Implement reading of OpenTelemetry ProcessContext (OTEP #4719) from profiled processes via shared memory regions. The SDK publishes resource attributes into a named memory mapping ([anon:OTEL_CTX] or /memfd:OTEL_CTX) with a 32-byte header containing signature, version, payload size, timestamp, and a pointer to a protobuf-encoded payload.

Changes:

  • Define ProcessContext protobuf message (resource + extra_attributes)
  • Add processcontext.go to read and validate the shared memory header, deserialize the protobuf payload, and handle concurrent update retries
  • Recognize OTEL_CTX mappings in parseMappings and store ProcessContextInfo in ProcessMeta
  • Integrate ProcessContext reading into processmanager's synchronizeMappings
  • Add unit tests with mock reader and integration tests using real named anonymous and memfd mappings

This PR only addresses reading process context but does yet propagate it to profile.
The goal is to gather early feedback.

Implementation notes:

  • Currently process context read is attempted upon each call to synchronizeMappings. If process context is published later on, then it might be missed if no other call to synchronizeMappings occurs. A possibility would be to hook prctl calls to trigger process context reads.
  • If process context read fails because a concurrent update is detected, then read is retried up to 3 times. If it still fails, for now there is no mechanism to schedule a future read attempt. Note that implementing a prctl hook would also solve this by triggering an update later on.

Remaining work:

  • Merge resource attributes between process context / process env variables and containerID
  • Split profiles by resource attributes
  • Implement reading thread context on top of process context

Comment thread process/processcontext.go Fixed
@nsavoire nsavoire force-pushed the nsavoire/otel_process_context branch from fa819a1 to bc31d04 Compare February 20, 2026 15:43
Copy link
Copy Markdown

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick pass! Exciting to see this here!

Comment thread process/processcontext.go Outdated
Comment thread process/processcontext_test.go Outdated
Comment thread process/processcontext_test.go
@nsavoire nsavoire force-pushed the nsavoire/otel_process_context branch from 0330e99 to 4da59ec Compare March 4, 2026 14:24
@nsavoire nsavoire force-pushed the nsavoire/otel_process_context branch 3 times, most recently from f2605a3 to 83f7642 Compare March 13, 2026 19:14
…ss memory

Implement reading of OpenTelemetry ProcessContext (OTEP #4719) from
profiled processes via shared memory regions. The SDK publishes resource
attributes into a named memory mapping ([anon:OTEL_CTX] or /memfd:OTEL_CTX)
with a 32-byte header containing signature, version, payload size, timestamp,
and a pointer to a protobuf-encoded payload.

Changes:
- Define ProcessContext protobuf message (resource + extra_attributes)
- Add processcontext.go to read and validate the shared memory header,
  deserialize the protobuf payload, and handle concurrent update retries
- Recognize OTEL_CTX mappings in parseMappings and store ProcessContextInfo
  in ProcessMeta
- Integrate ProcessContext reading into processmanager's synchronizeMappings
- Add unit tests with mock reader and integration tests using real named
  anonymous and memfd mappings
@nsavoire nsavoire force-pushed the nsavoire/otel_process_context branch from 83f7642 to 5bb52d7 Compare March 13, 2026 19:16
@nsavoire nsavoire marked this pull request as ready for review March 16, 2026 12:57
@nsavoire nsavoire requested review from a team as code owners March 16, 2026 12:57
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5bb52d716f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread process/process.go Outdated
Copy link
Copy Markdown
Member

@christos68k christos68k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did one pass, have a few more things to look at / test

Comment thread process/processcontext.go Outdated
Comment thread processcontext/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread processmanager/processinfo.go Outdated
Comment thread process/process.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread process/processcontext.go Outdated
Comment thread processcontext/processcontext.go Outdated
Comment thread processcontext/processcontext.go Outdated
Comment thread processcontext/processcontext.go Outdated
Comment thread processcontext/processcontext.go
Comment thread proto/processcontext/processcontext.proto Outdated
Comment thread process/processcontext.go Fixed
Comment thread processmanager/processinfo.go Fixed
Copy link
Copy Markdown

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 This looks great! I like that although the spec itself has some complex + subtle details, the implementation ends up looking quite sane and mostly straightforward

Comment thread process/process.go Outdated
Comment thread processcontext/proto/processcontext.proto Outdated
Comment thread processcontext/proto/processcontext.proto
Copy link
Copy Markdown
Contributor

@rogercoll rogercoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm, I will do another review once the ProcessContextInfo is forwarded down the pipeline

Comment thread processmanager/processinfo.go Outdated
Comment thread processcontext/processcontext.go Outdated
ContextMappingAnonNamed = "[anon:OTEL_CTX]"

// Default maximum retries for concurrent updates
DefaultMaxRetries = 3
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be exported?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read() takes maxRetries as a parameter (per @christos68k 's suggestion), so this provides a default value for external callers (e.g. processmanager/processinfo.go). Alternatively, I could use the zero value to indicate "use the default" and keep the constant unexported. What do you prefer?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO since no caller needs a different value, just drop the parameter and use a non exported constant internally (e.g defaultMaxRetries). If we need/expect other callers to set a different value for the retries, wdyt if we use the Options pattern instead? (or we could add the Options later as being non api breaking) E.g:

  type options struct {
      maxRetries int
  }
  type Option func(*options)
  func WithMaxRetries(n int) Option {
      return func(o *options) { o.maxRetries = n }
  }
  func Read(addr libpf.Address, rm remotememory.RemoteMemory, lastPublishedAtNs
  uint64, opts ...Option) (Info, error) {

cc @christos68k

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you a preference @christos68k ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go for the simpler option of using 0 as default, allowing for caller overrides, unless we know we'll have more than one options in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that I called the parameter maxRetries but it is implemented in fact as maxAttempts.
I renamed it, and when passed 0, it uses the default number of read attempts.

ivoanjo added a commit to ivoanjo/opentelemetry-proto that referenced this pull request Apr 2, 2026
As open-telemetry/opentelemetry-specification#4719
looks to be merged soon, it came up as we were implementing this spec
in the OTel eBPF Profiler
(open-telemetry/opentelemetry-ebpf-profiler#1181)
that it'd be nice to stop copy-pasting the `process_context.proto` and
it's time to add it to the proper place.

This is my first contribution to this repo so please do point out if
I missed something! I didn't touch the collector parts since this
message is not expected to be processed by the collector directly.
@ivoanjo
Copy link
Copy Markdown

ivoanjo commented Apr 2, 2026

(I meant to approve on my previous pass but pressed comment instead of approve)

nsavoire and others added 2 commits April 2, 2026 23:18
@nsavoire nsavoire force-pushed the nsavoire/otel_process_context branch from 4e747eb to 817dfc7 Compare April 3, 2026 16:50
@nsavoire nsavoire requested review from florianl and rogercoll April 8, 2026 09:11
Copy link
Copy Markdown
Contributor

@rogercoll rogercoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the code LGTM, but the ProcessContextInfo it's never consumed downstream. This makes the processcontext feature effectively a no-op in the current changeset. Is there a follow-up PR planned to wire ProcessContextInfo through the tracer, reporter and pdata path?

@nsavoire
Copy link
Copy Markdown
Contributor Author

Overall the code LGTM, but the ProcessContextInfo it's never consumed downstream. This makes the processcontext feature effectively a no-op in the current changeset. Is there a follow-up PR planned to wire ProcessContextInfo through the tracer, reporter and pdata path?

Indeed this PR addresses only detecting/reading process context. I have a follow-up PR planned once this PR is merged (here in draft state).

@nsavoire
Copy link
Copy Markdown
Contributor Author

Thanks for the feedback ! I think I have addressed most of the comments, let me know what I can do to progress on this PR.

Thanks for doing the work @nsavoire. I think the PR is in a good state right now. Some notes:

Hi @christos68k, some of the blockers you mentioned should now be resolved:

Let me know if there's anything else needed from my side to move this forward !

Copy link
Copy Markdown
Member

@christos68k christos68k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final batch of nits

Comment thread process/process.go Outdated
Comment thread processcontext/v1development/processcontext.pb.go
Comment thread processcontext/processcontext.go Outdated
Comment thread processcontext/processcontext.go Outdated
import (
"encoding/binary"
"errors"
"fmt"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"fmt"
"fmt"
"regexp"

Comment thread processcontext/processcontext.go Outdated
nsavoire and others added 3 commits April 13, 2026 21:04
Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
Rename the parameter to reflect that it controls total attempts, not
retries. Unexport the constant and treat 0 as "use default" so callers
don't need to reference an internal tuning knob.
…fd paths

IsContextMapping now takes an isExecutable parameter to reject
executable mappings.
Also adds explicit handling for "/memfd:OTEL_CTX (deleted)" paths
instead of relying on iterateMappings to trim the suffix.
Copy link
Copy Markdown
Member

@christos68k christos68k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

We'll push for open-telemetry/opentelemetry-proto#783 merge today (so I won't merge this right now PR still needs one more approval from an approver/maintainer).

Copy link
Copy Markdown
Member

@florianl florianl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we don't get open-telemetry/opentelemetry-proto#783 merged soonish. So I have just minor nits left and I think we should merge and continue from here.

Comment thread processcontext/processcontext.go Outdated
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
@christos68k christos68k merged commit 4b03a72 into open-telemetry:main Apr 15, 2026
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants