Skip to content

Reworked threading model.#1652

Merged
bghgary merged 28 commits into
masterfrom
rework-thread-model
Jun 9, 2026
Merged

Reworked threading model.#1652
bghgary merged 28 commits into
masterfrom
rework-thread-model

Conversation

@bkaradzic-microsoft

@bkaradzic-microsoft bkaradzic-microsoft commented Apr 6, 2026

Copy link
Copy Markdown
Member

[Updated by Copilot on behalf of @bghgary]

Context

Replaces the old per-thread encoder model (GetEncoderForThread/EndEncoders + UpdateToken wrapping SafeTimespanGuarantor) with a single frame encoder owned by DeviceImpl, synchronized by a FrameCompletionScope RAII gate. SafeTimespanGuarantor, UpdateToken, and the Update classes are removed. DeviceUpdate remains as a no-op shim — removing it is an API break and a separate PR (#1653).

Model

StartRenderingCurrentFrame acquires the encoder (bgfx::begin) and opens the gate; FinishRenderingCurrentFrame waits for all FrameCompletionScopes to release, closes the gate, ends the encoder, and submits the frame. All consumers (NativeEngine, Canvas, NativeXr) read the shared encoder via GetActiveEncoder(). SubmitCommands and ReadTexture each hold a stack-scoped FrameCompletionScope, so the encoder can't be ended mid-work; when invoked outside a frame (e.g. an XHR callback) the scope blocks until the next frame opens.

Main Thread                       JS Thread                 bgfx render thread
  StartRenderingCurrentFrame()
    m_frameEncoder = bgfx::begin(true)
    m_frameBlocked = false; CV notify
    tick frameStartDispatcher  -----> RAF callbacks
                                        submitCommands()
                                        AcquireFrameCompletionScope (immediate; frame open)
                                        GetEncoder() -> m_frameEncoder; draw...
                                        scope released; pendingScopes-- -> CV notify
  FinishRenderingCurrentFrame()
    wait scopes == 0; m_frameBlocked = true
    tick beforeRenderDispatcher
    bgfx::end(m_frameEncoder)
    Frame() -> bgfx::frame()  ------------------------------> GPU submit & render
    tick afterRenderDispatcher

Synchronization rules

  1. Single encoder, acquired before the gate opens (mutex gives memory ordering) and ended only after all scopes release.
  2. SubmitCommands/ReadTexture always hold a scope — the encoder can't be ended mid-command, regardless of microtask draining or reentrancy.
  3. Apps must bracket any main-thread wait for JS work with Start/Finish, or SubmitCommands blocks forever waiting for the gate (HeadlessScreenshotApp, StyleTransferApp, UnitTests do this).
  4. Canvas::Flush acquires a scope when called outside a frame, then discards residual encoder state before nanovg rendering (the shared encoder carries NativeEngine state).

Notes

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reworks the graphics threading/synchronization model by replacing per-thread encoders and the SafeTimespan/UpdateToken mechanism with a single per-frame bgfx encoder owned by DeviceImpl and a new FrameCompletionScope gate/counter for cross-thread frame completion synchronization.

Changes:

  • Removed SafeTimespanGuarantor, UpdateToken/Update, and per-thread encoder management; introduced a single frame encoder lifecycle in DeviceImpl.
  • Added FrameCompletionScope + frame-start/before-render/after-render scheduling to coordinate JS-thread work with bgfx::frame().
  • Updated NativeEngine/NativeXR/Canvas/FrameBuffer APIs to use GetActiveEncoder() and internal encoder acquisition patterns instead of passing encoders through call chains.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
Core/Graphics/Source/DeviceImpl.h Adds single-frame encoder pointer and frame gate (mutex/CV + pending scope count).
Core/Graphics/Source/DeviceImpl.cpp Implements frame open/close sequencing, scope blocking, and frame-start dispatcher tick.
Core/Graphics/InternalInclude/Babylon/Graphics/DeviceContext.h Replaces Update/UpdateToken with FrameCompletionScope, adds frame-start scheduler + encoder accessors.
Core/Graphics/Source/DeviceContext.cpp Implements FrameCompletionScope RAII and forwards new scheduling/encoder methods.
Core/Graphics/Include/Shared/Babylon/Graphics/Device.h Converts DeviceUpdate into a no-op compatibility shim and inlines GetUpdate.
Core/Graphics/Source/Device.cpp Removes old GetUpdate implementation.
Core/Graphics/InternalInclude/Babylon/Graphics/FrameBuffer.h Removes encoder parameters from bind/unbind/viewport/scissor APIs.
Core/Graphics/Source/FrameBuffer.cpp Updates FrameBuffer internals to acquire view IDs without needing an encoder parameter.
Plugins/NativeEngine/Source/NativeEngine.h Switches to active-encoder access + outside-frame scope holding; adds begin/end frame no-ops.
Plugins/NativeEngine/Source/NativeEngine.cpp Updates command submission and RAF scheduling to use frame-start scheduling + scopes; updates ReadTexture blit scheduling.
Plugins/NativeEngine/Source/PerFrameValue.h Adjusts API to remove encoder parameters (but currently has a type-signature issue).
Plugins/NativeXr/Source/NativeXrImpl.h Removes stored Update member from XR session state.
Plugins/NativeXr/Source/NativeXrImpl.cpp Migrates XR scheduling to frame-start + scopes; updates encoder usage.
Polyfills/Canvas/Source/Context.h Removes stored Update member.
Polyfills/Canvas/Source/Context.cpp Uses GetActiveEncoder() and acquires a FrameCompletionScope when flushing outside the frame cycle.
Polyfills/Canvas/Source/nanovg/nanovg_babylon.cpp Updates FrameBuffer binding calls to new signature (no encoder param).
Apps/HeadlessScreenshotApp/Win32/App.cpp Wraps startup wait with Start/Finish frame calls to keep gate open during startup.
Apps/StyleTransferApp/Win32/App.cpp Same startup framing change as HeadlessScreenshotApp.
Core/Graphics/CMakeLists.txt Removes SafeTimespanGuarantor sources from build.
Comments suppressed due to low confidence (1)

Plugins/NativeEngine/Source/PerFrameValue.h:33

  • PerFrameValue is templated on T, but Set currently takes a bool parameter. This makes the template misleading and will break or silently convert if PerFrameValue is ever instantiated with a non-bool type. Change Set to accept T (e.g., const T& or T) to match the class template parameter.
        T Get() const
        {
            return m_value;
        }

        void Set(bool value)
        {
            m_value = value;
            if (!m_isResetScheduled)
            {

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Plugins/NativeXr/Source/NativeXrImpl.cpp
Comment thread Core/Graphics/Source/DeviceImpl.cpp
Comment thread Core/Graphics/Source/DeviceImpl.cpp
…ompletionScope instead of deferred member release.
…cquired FrameCompletionScope in SubmitCommands.
…enderScheduler fires one frame late when called outside RAF.
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Apr 23, 2026
Matches the established pattern from ExternalTexture.AddToContextAsyncAndUpdate
on master: wait for JS work to complete while the frame is still open, then
Finish. The new CreateForJavaScript and RenderTextureArray tests in this PR
had accidentally dropped this pattern, which works on master but deadlocks
under the reworked threading model in BabylonJS#1652 (Finish closes the frame gate
while JS is still acquiring FrameCompletionScopes).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Apr 29, 2026
Hypothesis test for the 4-30x Linux JSC CI runtime variance on rework-thread-model:

The new FrameCompletionScope gate model on this branch makes JS-thread SubmitCommands legal only between StartRenderingCurrentFrame and FinishRenderingCurrentFrame. The previous test pump called Start and Finish back-to-back with the wait_for AFTER Finish — collapsing the open window to ~zero. JS-thread work then had to win a scheduler race to land its scope before the pump thread re-acquired the mutex and closed the gate. On contended runners, JS thread loses that race many times per second, producing 4-30x runtime variance.

Restructure: open the gate, then wait_for, then briefly close+reopen each iteration. Gate is open ~99% of the time so JS thread never has to race.

If this commit reduces variance to master-baseline (5-10 min) consistently, the production fix belongs in BabylonJS#1652 itself (either invert the consumer pattern or eliminate the gate).

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Apr 29, 2026
Root cause of the 4-30x Linux JSC CI runtime variance proven on rework-thread-model-perf branch (4 baseline runs hit 30-60min, 4 fix runs ran Unit Tests in 3 sec).

On rework-thread-model the new FrameCompletionScope gate model makes JS-thread SubmitCommands legal only between StartRenderingCurrentFrame and FinishRenderingCurrentFrame. The previous test pump called Start and Finish back-to-back with the wait_for AFTER Finish, collapsing the open window to ~zero. JS-thread work then had to win a scheduler race to land its scope before the pump thread re-acquired the mutex and closed the gate. On contended runners JS lost that race repeatedly, accumulating per-poll 16ms+frame() stalls into 30+ minute runtimes.

Restructure: open the gate before the loop, then for each iteration wait_for(16ms) -> Finish -> Start. Gate is open ~99% of the time so JS thread never has to race.

Validation: rework-thread-model-perf branch (BabylonJS#1652 base + this fix only) ran Unit Tests in 3 seconds across all reachable jobs in 4 parallel runs. The previous baseline showed Unit Tests timing out at 30-60 min on multiple runs.

Note: Validation Tests (Playground pixel diff) is a separate flaky issue unrelated to this fix.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary and others added 5 commits May 4, 2026 15:38
…fresh docs

The std::move(scope) capture in 7f98c7c produced a move-only lambda.
Dispatchable at the AppRuntime layer is move-only and accepts it, but
JsRuntime downstream type-erases the callable through std::function
which requires copy-constructibility:

  std_function.h:439:18: error: static assertion failed due to
  requirement 'is_copy_constructible<...>::value':
  std::function target must be copy-constructible

MSVC accepted this; gcc 14 (Ubuntu CI) rejects it. Wrap the scope in
shared_ptr so the lambda is copyable, matching the existing RAF
dispatcher pattern in ScheduleRequestAnimationFrameCallbacks.

Also refresh the comments on FrameCompletionScope and
AcquireFrameCompletionScope. The previous class-level comment listed
three usage patterns, but one of them ("NativeEngine::GetEncoder()
acquires lazily") didn't match the code — GetEncoder doesn't acquire a
scope. Replace with the two patterns that are actually in use:

  1. JS-frame scoped via m_runtime.Dispatch capture (RAF, SubmitCommands)
  2. Block scoped on the stack (Canvas::Flush fallback, ReadTextureAsync)

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Schedule the FrameCompletionScope-releasing lambda before the command
stream is processed, not after. Equivalent in the success path, but
makes the JS-frame coverage exception-safe — if the command processing
throws, the lambda is already queued and the scope survives to the next
dispatch cycle, instead of being destroyed at the throw and dropping
the count to 0 on the render thread's view.

Also drops the explicit local in favour of an inline make_shared in
the lambda capture — the local was only there to be captured.

[Created by Copilot on behalf of @bghgary]

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The MSAA tests deadlocked after merging master into rework-thread-model.

The render thread waits on startupDone while the JS thread runs the
AddToContextAsync .then() callback. That callback calls startup() ->
new NativeEngine() / new Scene() / new RenderTargetTexture(), which
eventually hits SubmitCommands. SubmitCommands now synchronously
acquires a FrameCompletionScope, which blocks while m_frameBlocked
is true. The gate only opens on the next StartRenderingCurrentFrame --
but the render thread is stuck on startupDone. Deadlock.

Fix: open frame 2 before waiting on startupDone, and reuse the same
frame for renderFrame(). Any JS work that touches the engine has to run
while a frame is in progress under the new model.

This workaround is temporary. Once #1646 lands, the test should migrate
from AddToContextAsync to the new synchronous CreateForJavaScript, at
which point startup() runs in the same JS task as the texture creation
and the cross-frame dance disappears entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request May 29, 2026
With BabylonJS#1646's synchronous ExternalTexture::CreateForJavaScript and BabylonJS#1652's
new threading model both in this branch, the original test pattern is
both unnecessary and broken:

- Unnecessary: AddToContextAsync's two-frame dance (queue in frame 1,
  wait for bgfx::frame() to resolve the promise, run startup in frame 2)
  exists only because the texture wrap was async. CreateForJavaScript is
  synchronous so the wrap, startup(), and renderFrame() all run in the
  same JS task inside a single frame.
- Broken: under the new model, SubmitCommands synchronously acquires a
  FrameCompletionScope which blocks when no frame is in progress.
  The old between-frames startup() pattern deadlocks.

Collapse to a single frame + single dispatch that wraps the texture,
runs startup(), and chains renderFrame()'s promise resolution. The host
app pattern host apps should follow with the new API.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request May 29, 2026
BabylonJS#1689 added the rendering-enabled check to DeviceImpl::UpdateDevice on
master. Both BabylonJS#1646 and BabylonJS#1652 also have it. The check was dropped during
this branch's merge with the older partner-test base, causing
Device.UpdateDeviceThrowsWhenRenderingEnabled to fail in CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request May 29, 2026
Same fix as the MSAA migration: AddToContextAsync's between-frames .then()
pattern deadlocks under BabylonJS#1652's threading model because SubmitCommands
synchronously acquires a FrameCompletionScope which blocks when no frame
is in progress.

CreateForJavaScript runs synchronously inside the same dispatch lambda,
so startup() / restoreTexture() execute in the same JS task as the wrap.
No cross-frame .then() dance, no deadlock.

Without this, all four Win32 D3D11 jobs hit the 60-minute timeout on the
'ExternalTexture.RestoreAfterDeviceLoss' test (added in BabylonJS#1716, merged
into master after this branch's last sync).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary and others added 2 commits June 8, 2026 16:39
# Conflicts:
#	Plugins/NativeEngine/Source/NativeEngine.cpp
#	Polyfills/Canvas/Source/Context.cpp
The test drives device loss/restore via the async AddToContextAsync + manual
frame-pump pattern, which deadlocks under the single-frame-encoder model in
this PR. It is re-enabled in #1646 after migration to the synchronous
CreateForJavaScript API.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaces the earlier skip. Under the single-frame-encoder model the async
AddToContextAsync .then() runs startup()/restoreTexture() on the JS thread,
whose SubmitCommands blocks until a frame is in progress. Open the next frame
before each blocking wait on the async result (and reuse it for the following
renderFrame), mirroring the workaround already applied to
Tests.ExternalTexture.Msaa.cpp. Obsoleted once #1646 migrates these tests to
the synchronous CreateForJavaScript API.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@bghgary bghgary left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Reviewed by Copilot on behalf of @bghgary]

LGTM.

@bghgary bghgary merged commit 6bb8014 into master Jun 9, 2026
28 checks passed
@bkaradzic-microsoft

Copy link
Copy Markdown
Member Author

🎉

bghgary added a commit to bghgary/BabylonNative that referenced this pull request Jun 9, 2026
Conflict resolutions (final state matches the BabylonJS#1674 integration reference plus
the layerIndex work from the Teams branch):
- Apps (HeadlessScreenshot/PrecompiledShaderTest/StyleTransfer): combine BabylonJS#1652's
  frame-pump (open a startup frame so CreateForJavaScript's blit has an active
  frame) with BabylonJS#1646's synchronous CreateForJavaScript migration.
- Tests.ExternalTexture.Msaa.cpp: BabylonJS#1646's synchronous CreateForJavaScript
  migration supersedes the frame-pump workaround BabylonJS#1652 added.
- NativeEngine::SetTexture: combine BabylonJS#1652's GetEncoder() with BabylonJS#1646's layerIndex
  single-slice view overload.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Jun 9, 2026
The master merge brought in BabylonJS#1716's RestoreAfterDeviceLoss test using the
async AddToContextAsync + frame-pump pattern (added by BabylonJS#1652). With BabylonJS#1646's
synchronous CreateForJavaScript, that test runs startup() in the same JS task
as the texture wrap, so the frame-pump is unnecessary; migrate it to match the
other ExternalTexture tests.

Also correct a now-stale comment in AddToContextAsyncAndUpdate: the deprecated
AddToContextAsync shim resolves synchronously via CreateForJavaScript, so the
frame finish no longer drives its completion.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Jun 9, 2026
Conflict resolutions (final state matches the BabylonJS#1674 integration reference):
- Apps (HeadlessScreenshot/PrecompiledShaderTest/StyleTransfer): combine BabylonJS#1652's
  frame-pump (open a startup frame so CreateForJavaScript's blit has an active
  frame) with BabylonJS#1646's synchronous CreateForJavaScript migration.
- Tests.ExternalTexture.Msaa.cpp: BabylonJS#1646's synchronous CreateForJavaScript
  migration supersedes the frame-pump workaround BabylonJS#1652 added.
- NativeEngine::SetTexture: combine BabylonJS#1652's GetEncoder() with BabylonJS#1646's layerIndex
  single-slice view overload.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bghgary added a commit to bghgary/BabylonNative that referenced this pull request Jun 9, 2026
The master merge brought in BabylonJS#1716's RestoreAfterDeviceLoss test using the
async AddToContextAsync + frame-pump pattern (added by BabylonJS#1652). With BabylonJS#1646's
synchronous CreateForJavaScript, that test runs startup() in the same JS task
as the texture wrap, so the frame-pump is unnecessary; migrate it to match the
other ExternalTexture tests.

Also correct a now-stale comment in AddToContextAsyncAndUpdate: the deprecated
AddToContextAsync shim resolves synchronously via CreateForJavaScript, so the
frame finish no longer drives its completion.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft added a commit that referenced this pull request Jun 10, 2026
…1744)

## Problem

After syncing to the latest master (which includes #1739 divisor-driven
per-instance vertex attributes and the #1652 thread-model rework), a
number of previously-excluded Playground validation tests now render
correctly. They were left disabled because they used to crash/hang or
pixel-diff against their references on Babylon Native.

## Change

Re-enable **16** `validation_native.js` tests in
`Apps/Playground/Scripts/config.json`. No code changes.

## Verification

Each candidate was first confirmed passing **in isolation** on both
backends, then the **full sweep** was re-run on both to ensure no
order-dependent state-leak cascade (the runner aborts the whole sweep on
the first hard failure):

- **Win32 D3D11:** `ran=289 passed=289 failed=0` (was 273)
- **OpenGL (ANGLE):** `ran=280 passed=280 failed=0`

Stencil/scissor tests that pass in isolation but leak GPU state into a
later test (`cube-with-holes-using-stencil-buffer`, `Scissor test with
0.9 hardware scaling`) were intentionally **kept excluded** to avoid
reintroducing a cascade.

## Re-enabled tests

- Area Lights - Standard Material
- Area Lights - PBR
- GLTF Node visibility test
- Export GLTF Extension KHR_texture_transform
- simple-render-target-with-blue-spheres
- pillars-sphere-and-torus-with-PCSS-shadows
- torus-knot-mirror
- simple-sphere-in-4-mirrors
- procedural-texture-with-node-material
- simple-custom-shader
- change-texture-of-material
- OpenPBR Thin Film Weight VS IOR - Analytic Lights
- OpenPBR Thin Film Thickness VS IOR - Analytic Lights
- OpenPBR Thin Film and Specular Weight - Analytic Lights
- OpenPBR Thin Film and Specular Weight Metals - Analytic Lights
- WebGPU async pipeline pre-warming

## Notes

Validated locally on **Win32 D3D11** and **OpenGL** only; Metal/iOS not
validated locally — worth watching Mac CI.

---------

Co-authored-by: Branimir Karadzic <branimirkaradzic@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft pushed a commit that referenced this pull request Jun 10, 2026
DeviceUpdate was reduced to a no-op compatibility shim when the threading
model was reworked (#1652) -- frame synchronization is now handled by
FrameCompletionScope inside StartRenderingCurrentFrame/
FinishRenderingCurrentFrame. This removes the deprecated shim
(Graphics::DeviceUpdate and Device::GetUpdate) and all of its now-redundant
callers across the apps, platform shells, and unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft pushed a commit that referenced this pull request Jun 10, 2026
DeviceUpdate was reduced to a no-op compatibility shim when the threading
model was reworked (#1652) -- frame synchronization is now handled by
FrameCompletionScope inside StartRenderingCurrentFrame/
FinishRenderingCurrentFrame. This removes the deprecated shim
(Graphics::DeviceUpdate and Device::GetUpdate) and all of its now-redundant
callers across the apps, platform shells, and unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft pushed a commit that referenced this pull request Jun 10, 2026
DeviceUpdate became a no-op compatibility shim when the threading model was
reworked (#1652) -- frame synchronization is now handled by
FrameCompletionScope inside StartRenderingCurrentFrame/
FinishRenderingCurrentFrame.

This removes all of its now-redundant callers across the apps, platform shells
(Win32/UWP/X11/Android/iOS/macOS/visionOS), and unit tests, and marks
Graphics::DeviceUpdate and Device::GetUpdate [[deprecated]] so any remaining
external callers get a warning. The shim itself is kept for source
compatibility and will be removed in a future PR.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft pushed a commit that referenced this pull request Jun 10, 2026
DeviceUpdate became a no-op compatibility shim when the threading model was
reworked (#1652) -- frame synchronization is now handled by
FrameCompletionScope inside StartRenderingCurrentFrame/
FinishRenderingCurrentFrame.

This removes all of its now-redundant callers across the apps, platform shells
(Win32/UWP/X11/Android/iOS/macOS/visionOS), and unit tests, and marks
DeviceUpdate::Start/Finish/RequestFinish and Device::GetUpdate [[deprecated]]
so any remaining external callers get a warning. The shim itself is kept for
source compatibility and will be removed in a future PR. The type is left
un-deprecated so GetUpdate can keep returning it without a self-warning.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft added a commit that referenced this pull request Jun 11, 2026
`DeviceUpdate` became a no-op compatibility shim when the threading
model was reworked (#1652) — frame synchronization is now handled by
`FrameCompletionScope` inside
`StartRenderingCurrentFrame`/`FinishRenderingCurrentFrame`.

This PR:
- **Removes all of its now-redundant callers** across the apps, platform
shells (Win32/UWP/X11/Android/iOS/macOS/visionOS), and unit tests.
- **Marks `DeviceUpdate::Start`/`Finish`/`RequestFinish` and
`Device::GetUpdate` `[[deprecated]]`** so any remaining external callers
get a warning. The type itself is left un-deprecated so `GetUpdate` can
keep returning it without a self-warning (no compiler-specific pragmas
needed). The shim implementation is **kept** for source compatibility
and will be removed in a future PR.

Rebuilt fresh on top of current `master` (the original branch predated
the squash-merge of #1652). Verified locally on Win32: Playground and
UnitTests build clean with no deprecation-warning leakage, and all
UnitTests pass.

Co-authored-by: Branimir Karadžić <branimirk@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft pushed a commit to bkaradzic-microsoft/BabylonNative that referenced this pull request Jun 12, 2026
LoadTextureFromImage (and the single-mip-per-face branch of
LoadCubeTextureFromImages) drove its mip loop with the condition
`mip < image->m_numMips`, re-reading image->m_numMips every iteration.

On the last iteration the mip's bgfx::makeRef hands ownership of `image` to
bgfx's release queue (releaseFn calls bimg::imageFree(image)). bgfx may run
that release on the render thread during bgfx::frame() as soon as the
Update2D/UpdateCube command is submitted. The loop then evaluates its
exit condition by dereferencing image->m_numMips on memory that has just
been freed; if the stale read happens to exceed mip, the loop re-enters
imageGetRawData() on the freed container, whose garbage m_format indexes
s_imageBlockInfo out of bounds -> intermittent access violation
(bimg/image.cpp). This is a use-after-free (CWE-416), reproduced as a rare
(~2-4%) crash on texture-loading validation tests.

Note: bgfx resource creation/update are themselves mutex-guarded
(m_resourceApiLock, also held by bgfx::frame()) and are safe to call from a
worker thread; the bug is BabylonNative dereferencing `image` after
transferring its ownership to bgfx's async release queue, not a bgfx race.

Fix: cache image->m_numMips in the loop initializer
(`for (uint8_t mip = 0, numMips = image->m_numMips; mip < numMips; ++mip)`) so
the loop bound is not re-read from `image` after the freeing submit. This
restores verbatim the change made by commit e8ee5f7 (BabylonJS#1628) that was reverted
by BabylonJS#1652.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bkaradzic-microsoft added a commit that referenced this pull request Jun 12, 2026
)

## Problem

`NativeEngine::LoadTextureFromImage` (and the single-mip-per-face branch
of
`LoadCubeTextureFromImages`) drove the mip-upload loop with
`for (uint8_t mip = 0; mip < image->m_numMips; ++mip)`, re-reading
`image->m_numMips` for the exit condition on every iteration.

On the **last** iteration the mip is submitted to bgfx via
`bgfx::makeRef(imageMip.m_data, imageMip.m_size, releaseFn, image)`,
where
`releaseFn` calls `bimg::imageFree(image)`. bgfx invokes that release
once it
has consumed the memory — which happens on the render thread inside
`bgfx::frame()`, potentially **immediately** after
`Update2D`/`UpdateCube`
submits the command. The loop then evaluates `mip < image->m_numMips` by
dereferencing `image` *after it may already have been freed*. If the
stale
`m_numMips` read happens to be greater than `mip`, the loop re-enters
`bimg::imageGetRawData(*image, ...)` on the freed container; its garbage
`m_format` indexes `s_imageBlockInfo` out of bounds → an intermittent
access
violation.

This is a **use-after-free (CWE-416)**, reproduced as a rare (~2–4%)
crash on
texture-loading validation tests (e.g. an NME particle test that loads a
ramp
texture while the render loop is active).

### Not a bgfx threading bug
bgfx resource creation/update **are** mutex-guarded
(`m_resourceApiLock`, which
`bgfx::frame()` also holds), so calling them from the decode worker
thread is
safe and supported. The defect is purely that BabylonNative keeps
dereferencing `image` after handing its ownership to bgfx's asynchronous
release queue.

## Fix

Cache `image->m_numMips` in the loop initializer
(`for (uint8_t mip = 0, numMips = image->m_numMips; mip < numMips;
++mip)`) so
the loop bound is no longer re-read from `image` after the freeing
submit.

This restores **verbatim** the change made in

[`e8ee5f7`](e8ee5f7)
(#1628) — see Regression history below.

## Regression history

This is a **re-fix of a previously-fixed bug**. The exact same race was
fixed
in April by commit

[`e8ee5f7`](e8ee5f7)
("Fixed race condition when passing data to bgfx. Issue #1398", #1628),
which
changed both loops to `for (uint8_t mip = 0, numMips = image->m_numMips;
...)`.

That fix was inadvertently reverted by commit

[`6bb8014`](6bb8014)
("Reworked threading model.", #1652), which restored the original
`for (uint8_t mip = 0; mip < image->m_numMips; ++mip)` form in both
`LoadTextureFromImage` and `LoadCubeTextureFromImages`, reintroducing
the
use-after-free. This PR re-applies the original fix unchanged.

## Validation

- Stress-tested an affected texture-loading validation test **3000×
headless**
(30 × 100, 5-way concurrent for extra scheduler pressure): **0 crashes**
(baseline crashes ~8% / ~1-in-12). At that baseline rate P(0 in 3000) is
  effectively zero, so the use-after-free is eliminated.
- Particle validation subset: no regressions.

Co-authored-by: Branimir Karadzic <branimirkaradzic@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants