Add LuaJIT interpreter#1236

Draft
umanwizard wants to merge 2 commits into open-telemetry:main from parca-dev:lua-upstream

Conversation

@umanwizard
Contributor

This PR contains an interpreter for LuaJIT code, which is primarily useful in practice due to OpenResty. It has been used in Parca/Polar Signals since Nov. 2024.

A thorough explanation of the approach used as well as an introduction to Lua internals can be found at the accompanying blog post: https://www.polarsignals.com/blog/posts/2024/11/13/lua-unwinding .

@umanwizard umanwizard requested review from a team as code owners March 9, 2026 19:31
@umanwizard umanwizard marked this pull request as draft March 9, 2026 19:50
@umanwizard
Contributor Author

Marking as draft until I fix the CI failures and add coredump tests. No major structural changes are expected, so people can feel free to start reviewing if they wish to.

@umanwizard umanwizard force-pushed the lua-upstream branch 2 times, most recently from 89efb8a to faca60a Compare March 9, 2026 20:46
@christos68k
Member

Many thanks!

@christos68k
Member

christos68k commented Mar 9, 2026

I have some personal projects using OpenResty (and also some projects using just their fork of LuaJIT). I'll try to run some workloads and exercise this PR this week, but I may run out of time as I'm also flying back to the US, so expect a review from me within a week or two.

Contributor

@fabled fabled left a comment


Awesome stuff! I did a first pass on the high-level things that seem to stand out for questions/discussion.

This might take a while to get fully reviewed as it is huge. Would it make sense to submit some smaller chunks in parallel as separate PRs that we could merge quickly, to reduce rebase conflicts?

Perhaps as a first step, just allocate the frame types and add the strings for them, etc. A second step could add stub files for the interpreter plugin and eBPF portion; ideally those could just resolve all JIT frames as <jit> or similar without any Lua specifics (or fail the trace with an unsupported-JIT error). Then the rest of the PR probably won't need to change much during rebases.

Comment thread x86helpers/x86_helpers.go
@@ -0,0 +1,33 @@
// Copyright 2024 The Parca Authors
Contributor


Should go to asm/amd. @christos68k @florianl should this be also Copyright The OpenTelemetry Authors ?

Member


Yes - copyright should be Copyright The OpenTelemetry Authors like all other files.

if unseen {
	log.Infof("New LuaJIT instance detected: %v", g)
	if l.ebpf.CoredumpTest() {
		return interpreter.ErrLJRestart
Contributor


I don't see this error tested for anywhere. Is it something that just triggers the right thing in some error-handling path? Could the intent of this be clarified somewhere?

Contributor


The LuaJIT unwinder isn't a one-pass unwinder: it first needs to detect the "g" and use that in userland to extract the JIT trace information, and only then can it unwind over the JIT stack frames. This restart mechanism was an aborted attempt to add coredump tests for the LuaJIT unwinder: we would error out on the first pass after extracting the trace stack size information, and the second pass would succeed. It worked, but in the end we decided the integration tests gave us enough coverage. We left it in because we anticipated needing to add some coredump tests in order to land this patch, so thoughts are welcome on how we should proceed.

Comment on lines +254 to +260
// For Lua we need the caller and callee to process a frame.
// The callee_pt is a pointer to the GCproto of the function being called, the
// callee_pc is an index into its bytecode. The caller_pt is the
// GCproto of the calling function and the caller_pc is the index into its
// bytecode which we will walk backwards in userland to figure out a name for the
// callee. The callee_pc is for information purposes only, so the user can see where
// execution was.
Contributor


So glad to see the ugly hack of the interpreter plugin's Symbolize caching these values internally go away. I assume this now correctly makes the data cacheable, as frames are cached individually instead of per-trace?

Comment on lines +484 to +494
func (l *luajitInstance) GetAndResetMetrics() ([]metrics.Metric, error) {
return nil, nil
}

func (l *luajitInstance) ReleaseResources() error {
return nil
}

func (l *luajitInstance) UpdateLibcInfo(ebpf interpreter.EbpfHandler, pid libpf.PID, info libc.LibcInfo) error {
return nil
}
Contributor


These could be removed if you change luajitInstance to embed interpreter.InstanceStubs and inherit the default implementations.

#define FRAMES_PER_WALK_LUAJIT_STACK 15

#if defined(__x86_64__)
#define DISPATCH r14
Contributor


As a general note, this file has a lot of macro names without a LuaJIT-specific prefix such as LUAJIT_ or LJ_. While the code seems to compile and not clash with anything, this might be slightly problematic for long-term maintenance in the coredump test suite, because all the *.ebpf.c files are included in a single compilation unit. Minimizing macro/function name clashes would be appreciated.


///////// END code copied from luajit2 sources.

static inline __attribute__((__always_inline__)) TValue *frame_prevl(TValue *f, TValue frame_val)
Contributor


We now have EBPF_INLINE, which should be used here instead of the attribute.

return nil, err
}

luaInterp, err := extractInterpreterBounds(info.Deltas(), cframeSize)
Contributor


Instead of stashing all stack deltas, could this use the same construct as the v8 plugin and use the elfunwindinfo.NewEhFrameTable API? I think it would improve maintainability if this did not directly depend on stack deltas; see the locateSnapshotArea function in interpreters/nodev8/v8.go. The thing is that stack deltas go through additional manipulation and possible merging, so you could instead look directly at the FDE sizes and match the FDE using code and/or FDE data.

Member

@christos68k christos68k left a comment


Did a very quick first pass, mostly focused on structure.

)

// Run
func TestIntegration(t *testing.T) {
Member


Could we move these integration tests into a separate package (e.g. similar to golabels, interpreter/luajit/integrationtests) and build-qualify them with integration?
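A sketch of what the build-qualified test file header might look like; the directory, package name, and tag are assumptions based on the suggestion above, not the repo's actual layout.

```go
//go:build integration

// Hypothetical location: interpreter/luajit/integrationtests/integration_test.go
// Compiled and run only when the tag is set, e.g.:
//
//	go test -tags integration ./interpreter/luajit/integrationtests/...
package integrationtests

import "testing"

func TestIntegration(t *testing.T) {
	// ... start the OpenResty workload and profile it ...
}
```

Without the tag, `go test ./...` skips the package entirely, so the integration tests never slow down the ordinary unit-test run.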

@@ -0,0 +1,61 @@
worker_processes 1;
Member


(Also see my other comment regarding the integration test itself.)

We should probably move this (and all test .lua files) to a different directory, e.g. interpreter/luajit/integrationtests/testdata

"description": "BEAM: Ran out of iterations searching for the current code header"
},
{
"id": 7007,
Member


The convention here is to reserve a range of errors e.g. 5xxx for V8, 6xxx for .NET, 7xxx for BEAM:

Suggested change
"id": 7007,
"id": 8000,

Comment thread tools/dis2go.py
@@ -0,0 +1,66 @@
#!/usr/bin/env python3
Member


Nit: Do we need this checked into the repo? If so, maybe add a comment clarifying how it can help.

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@dalehamel
Contributor

Hey, a heads-up: I'm trying to get this working on our OpenResty-based load balancers, and maybe it is a scale issue, but I'm running into:

{"time":"2026-04-23T22:48:27.124399773Z","level":"ERROR","msg":"Error adding trace(5250): update: no space left on device"}
{"time":"2026-04-23T22:48:27.12440761Z","level":"ERROR","msg":"Error adding trace(6331): update: no space left on device"}

And if I have the LuaJIT tracer enabled, no native profiles show up at all (not even kernel idle frames), so I'm wondering if some shared resource is getting completely used up by the unwinder struggling to keep up?

I'm trying to tweak it in Shopify#42, if I can get to the bottom of it I may have some patches to propose. I do think there may be some legitimate map leakage going on as well.

Curious if you've hit this, and at what scale it's been tested at?

@umanwizard
Contributor Author

We have several unwinder fixes which I will cherry-pick to this PR either this weekend or on Monday. But I don't think any of them address map leakage; do you know more about when/why that's happening?

@dalehamel
Contributor

dalehamel commented Apr 24, 2026

Thanks @umanwizard! We've been running this on production ingress-nginx nodes (OpenResty with LuaJIT 2.1.1710398010) and wanted to share our findings. The good news: profiles come through fine, including Lua-level frames.

The initial issue we reported (zero samples) turned out to be a problem on our ingestion side: our profile ingester didn't have a match arm for the new "luajit" frame kind, so it was silently dropping entire profile batches (i.e., my bad, completely a bug on our end). Once we fixed that, the LuaJIT unwinder worked mostly as expected, but I think there was still a legitimate issue with map pressure at scale.

For context on our workload: our ingress-nginx tier uses OpenResty's Lua scripting extensively, including the "Sorting Hat" routing module, dynamic storefront traffic routing, and per-request plugin chains for scriptable load balancing. The nodes run multiple nginx worker processes, each with 600-800+ JIT traces, which creates significant map pressure.

1. pid_page_to_mapping_info map pressure (more critical, though likely a scale issue)

The main issue we hit at scale is pid_page_to_mapping_info map exhaustion. During SynchronizeMappings, processVMs inserts LPM trie prefixes for each trace's mcode region. With multiple workers × hundreds of traces each, the entries consumed by LuaJIT trace prefixes alone can use a large fraction of the default 1M-entry map. When it fills up, processNewMapping fails for other PIDs on the node, producing ERROR_4012: native_no_pid_page_mapping errors in the profile.

In logs, this manifests as a flood of ENOSPC errors during trace sync:

{"level":"ERROR","msg":"Error adding trace(6874): update: no space left on device"}
{"level":"ERROR","msg":"Error adding trace(5702): update: no space left on device"}
{"level":"ERROR","msg":"Error adding trace(5302): update: key already exists"}
{"level":"ERROR","msg":"Error adding trace(1578): update: no space left on device"}
{"level":"ERROR","msg":"Error adding trace(3719): update: no space left on device"}
{"level":"ERROR","msg":"Error adding trace(4504): update: no space left on device"}

The flamegraph output looks visually similar with and without the fix (profiles do flow either way), but the profiler logs show a clear difference: with the default 1M map, we see continuous ENOSPC errors and the resulting ERROR_4012 error frames in the profile. With the raised map size, the ENOSPC errors disappear entirely.

We fixed this by raising pidPageMappingInfoSize from 1 << 20 (1M) to 1 << 22 (4M):

Shopify@c1dc372

Note that tracer/tracer.go also runtime-clamps this map, so both the constant and the loader override need to be raised.

This may not affect smaller deployments, but any environment with many nginx workers and heavy JIT trace activity will likely hit the same cap. However, this is probably contentious, as it changes a shared map outside the scope of the LuaJIT interpreter, so it should probably be a separate discussion with the maintainers. The LuaJIT interpreter does seem able to trigger the problem at scale, though.

2. GCproto symbolization failures (non-critical, informational)

We also observed invalid GCproto object errors during Lua frame symbolization. The proto address captured by the eBPF unwinder (GCfuncL.pc - sizeof(GCproto)) can become stale by the time Go reads it via process_vm_readv, because LuaJIT's GC can free and recycle the memory in between. In logs this looks like:

{"level":"DEBUG","msg":"symbolization failed for PID 3381553, frame type 13: invalid GCproto object: pt=0x7f9a1354caf4 sizept=32666 end=0x7f9a13554a8e lineinfo=0x6700000066 uvinfo=0x6900000068 varinfo=0x6b0000006a sizebc=482572562 chunkname=0x6300000062"}
{"level":"DEBUG","msg":"symbolization failed for PID 3381394, frame type 13: invalid GCproto object: pt=0x7f9a1354b4f0 sizept=0 end=0x7f9a1354b4f0 lineinfo=0x0 uvinfo=0x0 varinfo=0x56463cfee770 sizebc=0 chunkname=0x0"}

The sequential byte values in the first example (lineinfo=0x67..., uvinfo=0x69..., chunkname=0x63...) confirm the memory was recycled: these are ASCII characters, not valid pointers. We confirmed via LuaJIT FFI on the production container that all struct offsets match (sizeof(GCproto)=104, sizeof(GCstr)=24), so this is not a layout or ABI issue, just a timing race.

We experimented with pre-caching GCproto objects during trace sync by reading GCtrace.startpt (at offset 0x40):

Shopify@8b57f24

However, in our A/B/C testing this didn't produce much of a visible difference in the flamegraph: the affected Lua frames show up as placeholder names either way, and the native C functions (lj_gc_step, hashkey, lua_pcall, ngx_*) are symbolized correctly from libluajit-5.1.so debug symbols regardless. So this fix is not critical; mentioning it here for awareness in case others hit the same debug output and wonder about it.

A (Shopify/opentelemetry-ebpf-profiler#43, which is just this branch rebased on latest main):

[flamegraph screenshot]

B: A above, + the map fix:

The logs are noisy, but the output looks fine. The difference in shape from A above makes me think that it probably includes traces that were dropped in A due to the map pressure, as B and C look nearly identical.

[flamegraph screenshot]

C: B + the GCproto / GCtrace fixes:

Gets rid of the log issues, but the end result otherwise looks identical to B:

[flamegraph screenshot]

Our fork with both fixes is at Shopify/opentelemetry-ebpf-profiler#42.

As your branch becomes ready for review, I'd be happy to test it against our production traffic, or after it lands maybe we can figure out a PR to tweak the map management.
