Skip to content

[Audit][Medium] captureScreenshot bypasses submitGuarded mutex and swallows errors #709

@MichaelFisher1997

Description

@MichaelFisher1997

🔍 Module Scanned

src/engine/graphics/vulkan/ (automated audit scan)

📝 Summary

The captureScreenshot function in screenshot.zig directly calls vkQueueSubmit on the graphics queue without going through the thread-safe submitGuarded wrapper that all other queue submissions use. Additionally, it swallows all errors with a generic catch-all, making debugging impossible.

📍 Location

  • File: src/engine/graphics/vulkan/screenshot.zig:157
  • Function/Scope: captureScreenshot

🔴 Severity: Medium

  • Critical: Crashes, data corruption, GPU device loss
  • High: Memory leaks, race conditions, incorrect rendering, broken features
  • Medium: Performance degradation, missing error handling, suboptimal patterns
  • Low: Code style, dead code, minor improvements

💥 Impact

If captureScreenshot is ever called concurrently with the normal render loop (e.g., from a background thread, or if the screenshot path is triggered during a frame), it would race with the guarded submitGuarded calls in FrameManager.endFrame(), breaking the mutex-based serialization guarantee and causing undefined Vulkan behavior.

Even if called only from the main thread, the function silently swallows all errors via a catch-all, meaning users cannot diagnose why screenshot capture failed.

🔎 Evidence

screenshot.zig:157:

Utils.checkVk(c.vkQueueSubmit(device.queue, 1, &submit_info, fence)) catch {
    log.log.err("screenshot: failed to submit command buffer", .{});
    return false;
};

Compare to the proper pattern in frame_manager.zig:195:

try self.vulkan_device.submitGuarded(submit_info, self.in_flight_fences[self.current_frame]);

The submitGuarded function (in modules/engine-graphics/src/vulkan_device.zig:463-477) properly:

  1. Acquires device.mutex before submission
  2. Checks for VK_ERROR_DEVICE_LOST and logs detailed fault info
  3. Returns a typed error for proper handling

screenshot.zig does none of this - it directly submits without mutex protection and catches all errors to a boolean false.

🛠️ Proposed Fix

Modify captureScreenshot to use the device mutex around the queue submission:

device.mutex.lock();
const result = c.vkQueueSubmit(device.queue, 1, &submit_info, fence);
device.mutex.unlock();

if (result == c.VK_ERROR_DEVICE_LOST) {
    log.log.err("screenshot: GPU device lost during command buffer submit", .{});
    return false;
}
if (result != c.VK_SUCCESS) {
    log.log.err("screenshot: vkQueueSubmit failed with result: {}", .{result});
    return false;
}

Also consider surfacing the error rather than returning bool so callers can handle specific failure modes.

✅ Acceptance Criteria

  • captureScreenshot acquires device.mutex before calling vkQueueSubmit
  • VK_ERROR_DEVICE_LOST is handled specially to provide diagnostic info
  • Error is propagated (or at minimum logged with specific error code) rather than swallowed
  • The fix has been verified with nix develop --command zig build test

📚 References

  • modules/engine-graphics/src/vulkan_device.zig:463-477submitGuarded implementation showing correct pattern
  • src/engine/graphics/vulkan/frame_manager.zig:195 — Correct usage of submitGuarded
  • Vulkan spec §7.2 "Thread Safety" — Queue submission synchronization requirements

Metadata

Metadata

Assignees

No one assigned

    Labels

    automated-auditIssues found by automated opencode audit scansbugSomething isn't workingdocumentationImprovements or additions to documentationenhancementNew feature or requesthotfixquestionFurther information is requested

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions