ebpf: support Go cgo stack unwinding through goroutine stack#1331
ebpf: support Go cgo stack unwinding through goroutine stack#1331alban wants to merge 2 commits intoopen-telemetry:mainfrom
Conversation
After unwinding through runtime.asmcgocall in Go+cgo processes, the return address may point to the goroutine stack (anonymous mmap at 0xc000xxxxxx) which is not tracked in pid_page_to_mapping_info. This causes ERR_NATIVE_NO_PID_PAGE_MAPPING and truncates the stack trace, losing the Go frames above the cgo call. Add a fallback in get_next_unwinder_after_native_frame() that detects known Go processes (via the existing go_labels_procs map) and uses frame-pointer-based unwinding to traverse the goroutine stack back to the Go binary text section. This enables full Go stack trace resolution through cgo: runtime.asmcgocall -> runtime.cgocall -> main._Cfunc_... -> Go caller frames -> runtime.main -> runtime.goexit Based-on-patch-by: Burak Ok <burakok@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add support for the go_labels_procs BPF map in the coredump test
harness, enabling the Go cgo stack unwinding fallback to work in tests.
Previously, go_labels_procs always returned NULL, preventing the
frame-pointer unwinding fallback from triggering.
Add a Go+cgo coredump test case that exercises the full cgo stack:
libc (sleep) -> runtime.asmcgocall -> runtime.cgocall ->
main._Cfunc_c_outer -> main.goCallC -> main.goMiddle ->
main.goOuter -> main.main -> runtime.main -> runtime.goexit
Test results with coredump-based tests (go test -a ./tools/coredump/):
Test 1: go-1.24.1-hello WITHOUT cgo fix → PASS (no regression on pure Go)
Test 2: go-cgo-test WITHOUT cgo fix → FAIL
Main thread stack ends with:
runtime.asmcgocall+0 in asm_amd64.s:925
<unwinding aborted due to error native_no_pid_page_mapping>
Test 3: go-1.24.1-hello WITH cgo fix → PASS (no regression on pure Go)
Test 4: go-cgo-test WITH cgo fix → PASS
Main thread stack now shows the full cgo call chain:
libc.so.6 (sleep) -> runtime.asmcgocall -> runtime.cgocall ->
main._Cfunc_c_outer -> main.goCallC -> main.goMiddle ->
main.goOuter -> main.main -> runtime.main -> runtime.goexit
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
72be576 to
c389231
Compare
christos68k
left a comment
There was a problem hiding this comment.
Thanks! Left some nits, looks good to me.
| if (state->pc != 0) { | ||
| error = resolve_unwind_mapping(record, unwinder); | ||
| if (!error) { | ||
| DEBUG_PRINT("Go FP unwinding succeeded, new pc=%llx", state->pc); |
There was a problem hiding this comment.
We can add a goto here to avoid duplicating lines of code
| @@ -516,6 +516,31 @@ get_next_unwinder_after_native_frame(PerCPURecord *record, int *unwinder) | |||
|
|
|||
| DEBUG_PRINT("==== Resolve next frame unwinder: frame %d ====", record->trace.num_frames); | |||
| ErrorCode error = resolve_unwind_mapping(record, unwinder); | |||
There was a problem hiding this comment.
resolve_unwind_mapping can set error_metric when it returns ERR_NATIVE_NO_PID_PAGE_MAPPING, which we should probably clear if we return ERR_OK in this branch.
| // is not tracked in pid_page_to_mapping_info. Try frame pointer unwinding to get | ||
| // back to the Go binary's text section. | ||
| u32 pid = record->trace.pid; | ||
| if (bpf_map_lookup_elem(&go_labels_procs, &pid)) { |
There was a problem hiding this comment.
We're paying this cost for every process here, not just Go. Probably not worth worrying about at this point.
| @@ -0,0 +1,100 @@ | |||
| { | |||
There was a problem hiding this comment.
Is it possible to add an additional test case for arm64 ?
|
|
||
| DEBUG_PRINT("==== Resolve next frame unwinder: frame %d ====", record->trace.num_frames); | ||
| ErrorCode error = resolve_unwind_mapping(record, unwinder); | ||
| if (error == ERR_NATIVE_NO_PID_PAGE_MAPPING) { |
There was a problem hiding this comment.
I think it is better to not do this in ebpf. If frame pointer is valid, the go stack delta extraction should generate the command to use frame pointer. No need to add extra complexity to ebpf here.
There was a problem hiding this comment.
See:
This allows adjusting generated command per go function.
There was a problem hiding this comment.
So the asmcgocall for x86-64 is at https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s#L919 and for arm64 at https://github.com/golang/go/blob/master/src/runtime/asm_arm64.s
Seems that it can be called from Goroutine context and scheduler context - the Goroutine context switches stacks, and the scheduler context does not as its in correct stack already.
The stack pointer recovery is at https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s#L969-L973 and indicates that FP based unwinding would not give right answer. Instead its recovered from the the g.
This is likely due to the fact that the CGO code could call Go code and have the per-g stack moved/resized. I suspect we'd need to do the same here.
Probably needs similar custom command as the other go stack switching primitives being implemented in the other PR.
There was a problem hiding this comment.
From my understanding of the amd64 assembly, we could create a new unwind command for asmcgocall. It would be very similar to the existing systemstack one, since both functions share the same mechanism: they call gosave_systemstack_switch to save the goroutine context into g.sched before switching to the g0 stack. The gobuf layout is identical, so the recovery logic (read sched.sp, dereference to get PC/FP/SP) can be reused.
Both functions also have a "no-switch" path when already running on the system stack (noswitch for systemstack, nosave for asmcgocall). The key difference is how these paths call the target function:
systemstackuses aJMP(tail call) in itsnoswitchpath - this removessystemstackfrom the call stack entirely, so the unwinder never encounters a return address pointing into it. The unwind command never fires in this case, which is the correct behavior.asmcgocalluses aCALLin itsnosavepath - this leaves a return address on the stack pointing back intoasmcgocall. During unwinding, the profiler would see this frame and apply the unwind command, which would attempt to read gobuf - but gosave_systemstack_switch was never called in this path, so gobuf may contain stale data from a previous call.
To handle both paths correctly, we could potentially do the following (need to dig more in depth into it):
- switch path: cross to the goroutine stack using the gobuf recovery logic, same as systemstack.
- nosave path: fall back to normal frame-pointer unwinding. Since everything stays on the same stack (g0) in the nosave path, the FP chain is intact and regular unwinding gives a correct trace.
There was a problem hiding this comment.
See also #1313 (comment)
Didn't realize that earlier, but apparently the linker injects the frame pointer. Perhaps it applies only if frame pointer is enabled build time.
So probably frame pointer works assuming new enough Go and frame pointers enabled.
For universal solution, this should use the system stack approach of having similar (or the same) unwind command that resolves the old stack from G.
Problem
When profiling Go processes that use cgo, stack unwinding fails after
runtime.asmcgocall. The return address points to the Go goroutine stack(anonymous mmap at
0xc000xxxxxx), which is not tracked inpid_page_to_mapping_info. This causesERR_NATIVE_NO_PID_PAGE_MAPPINGand truncates the stack trace — all Go frames above the cgo boundary are lost.
Solution
Add a fallback in
get_next_unwinder_after_native_frame(): whenresolve_unwind_mappingreturnsERR_NATIVE_NO_PID_PAGE_MAPPING, checkif the process is a known Go process (via the existing
go_labels_procsmap) and try frame-pointer-based unwinding to traverse the goroutine
stack back to the Go binary text section.
Also add
go_labels_procsmap support in the coredump test harness(previously hardcoded to return NULL), and a Go+cgo coredump test case.
Based on a patch by @burak-ok (burak-ok@fc2edfaf).
Test results
Coredump-based tests (
go test -a ./tools/coredump/):Without fix, the main thread stack ends at:
With fix, the full cgo call chain is resolved:
Details
Without the fix:
With the fix:
Note for reviewers
The coredump test case modules need to be uploaded to the module store
before CI can run the go-cgo-test. Please help with
./coredump upload -allif you have write access to the OCI bucket.