Skip to content

Commit 7c1713f

Browse files
committed
Lua: switch to recursive mutex in lua_lock()
Previously, calling a Lua C API function that throws an error (e.g. with any of the `lua_error()` family of functions, which use `luaD_throw()` internally) in a function that also uses our own `StackUnwinder` would deadlock, because: - `luaD_throw()` calls `lua_lock()` but apparently expects something else to call `lua_unlock()` - `~StackUnwinder()` calls `lua_settop()` to rewind the stack, which internally also calls `lua_lock()` and `lua_unlock()`. This is called before the mutex can be unlocked. This was the root cause of the issue behind #5040, #5055, #5056 - `GetVector()` was called with an invalid index (only when invoked from `gui/launcher`, because this changed the data on the Lua stack) and threw an exception, which caused the `~StackUnwinder()` destructor to run while the Lua state was still locked. Other things to note: - We control the locking/unlocking implementation - the default, defined in `llimits.h`, is a _no-op_. - @ab9rf points out that the `CRITICAL_SECTION` API we're using on Windows already appears to be equivalent to a recursive mutex, so this issue likely would not have occurred there. The issue can be reproduced relatively easily with a simple test command, e.g.: ```diff @@ -716,6 +723,15 @@ void ls_helper(color_ostream &con, const std::vector<std::string> &params) { command_result Core::runCommand(color_ostream &con, const std::string &first_, std::vector<std::string> &parts) { std::string first = first_; + if (first == "luaerror") { + auto L = Lua::Core::State; + Lua::StackUnwinder top(L); + + luaL_error(L, "test error"); + + return CR_OK; + } + CommandDepthCounter counter; if (!counter.ok()) { ``` In `gui/launcher`: - before this change, invoking `luaerror` would hang DF. - after this change, `luaerror` prints `nil` in red to the native console and does nothing. This comes from the last-resort error handler in `LuaTools.cpp:report_error()`, and is equivalent to how other unexpected errors are handled in code invoked from Lua. Not ideal, but better than a crash. From the native console, invoking `luaerror` crashes DF in both cases (SIGABRT), and logs `PANIC: unprotected error in call to Lua API (test error)` to `stderr.log`. The difference in behavior is because `report_error()` above is part of the error handling system present when code is called _from Lua_, through variants of `SafeCall`. There is no Lua layer involved in the native console. (Note: the same crash would have been observed in the original issue in #5056 et. al. if the error had occurred when invoked through the console.)
1 parent b644258 commit 7c1713f

1 file changed

Lines changed: 11 additions & 2 deletions

File tree

depends/lua/include/dfhack_llimits.h

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,8 +54,17 @@ struct lua_extra_state {
5454
#define lua_lock(L) EnterCriticalSection(luai_mutex(L))
5555
#define lua_unlock(L) LeaveCriticalSection(luai_mutex(L))
5656
#else
57-
#define luai_userstateopen(L) luai_mutex(L) = (mutex_t*)malloc(sizeof(mutex_t)); *luai_mutex(L) = PTHREAD_MUTEX_INITIALIZER
58-
#define luai_userstateclose(L) lua_unlock(L); pthread_mutex_destroy(luai_mutex(L)); free(luai_mutex(L))
57+
#define luai_userstateopen(L) do { \
58+
luai_mutex(L) = (mutex_t*)malloc(sizeof(mutex_t)); \
59+
pthread_mutexattr_t attr; \
60+
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE); \
61+
pthread_mutex_init(luai_mutex(L), &attr); \
62+
} while (0)
63+
#define luai_userstateclose(L) do { \
64+
lua_unlock(L); \
65+
pthread_mutex_destroy(luai_mutex(L)); \
66+
free(luai_mutex(L)); \
67+
} while (0)
5968
#define lua_lock(L) pthread_mutex_lock(luai_mutex(L))
6069
#define lua_unlock(L) pthread_mutex_unlock(luai_mutex(L))
6170
#endif

0 commit comments

Comments
 (0)