[3.14] gh-148284: Block inlining of gigantic functions in ceval.c for clang 22 #148326

Closed
Fidget-Spinner wants to merge 4 commits into python:3.14 from Fidget-Spinner:block_inlining_of_gigantic_functions

@Fidget-Spinner Fidget-Spinner commented Apr 10, 2026

On clang 22, the inliner appears to be too aggressive with _PyEval_EvalFrameDefault in the computed-goto interpreter. Combined with some strange interaction with the stackref buffer, the function requires 40 kB of stack space (!!!) instead of the usual 1-2 kB.

This limits inlining in ceval.c to functions that use at most 512 bytes of stack space (about 1/4 of the normal usage). I checked the disassembly, and the new function uses about 2 kB of stack.
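For readers unfamiliar with the failure mode, here is a minimal sketch of its general shape, not the actual ceval.c code. The names `format_slow_path` and `run_ops` are hypothetical. The assumption is that a rarely used helper with a large local buffer, once inlined, adds its buffer to the frame of the hot loop that calls it; building with something like `clang -O2 -finline-max-stacksize=512` should keep such a helper out of line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cold helper. Its 4096-byte local buffer is well above a
 * 512-byte limit, so with -finline-max-stacksize=512 clang is expected to
 * leave it as a separate call rather than merge its frame into the caller. */
static size_t
format_slow_path(const char *msg)
{
    char scratch[4096];
    size_t n = strlen(msg);
    if (n >= sizeof(scratch)) {
        n = sizeof(scratch) - 1;
    }
    memcpy(scratch, msg, n);
    scratch[n] = '\0';
    return strlen(scratch);
}

/* Hypothetical hot loop standing in for a bytecode dispatch loop.
 * Opcode 0 adds 1, opcode 1 takes the slow path, any other opcode stops. */
static size_t
run_ops(const unsigned char *ops, size_t count)
{
    size_t acc = 0;
    for (size_t i = 0; i < count; i++) {
        switch (ops[i]) {
        case 0:
            acc += 1;
            break;
        case 1:
            acc += format_slow_path("slow");
            break;
        default:
            return acc;
        }
    }
    return acc;
}
```

The behavior of the program is the same with or without the flag; only the stack frame size of the caller is expected to differ, which can be checked in the disassembly.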

This will need a forward-port to 3.15 too.

@Fidget-Spinner Fidget-Spinner changed the title [3.14] gh-148284: Block inlining of gigantic functions in ceval.c [3.14] gh-148284: Block inlining of gigantic functions in ceval.c for clang 22 Apr 10, 2026
@Fidget-Spinner Fidget-Spinner requested a review from vstinner April 10, 2026 12:14
@vstinner

> This will need a forward-port to 3.15 too.

If possible, I would prefer to fix the main branch first, and then backport to 3.14.

@Fidget-Spinner

> If possible, I would prefer to fix the main branch first, and then backport to 3.14.

OK, I need to test on main first to see if there are any issues there.

else
CFLAGS_CEVAL=""
fi
AC_SUBST([CFLAGS_CEVAL])

Is there a way to make this a private variable? I don't feel good exposing it for now.

Comment on lines +7250 to +7253
// See gh-148284:
// Clang 22 seems to have interactions with inlining and the stackref buffer
// which cause 40kB of stack usage on x86-64 in buggy versions of _PyEval_EvalFrameDefault
// in computed goto interpreter. The normal usage seen is normally 1-2kB.

Reformat to 80 columns and minor cleanup:

Suggested change
- // See gh-148284:
- // Clang 22 seems to have interactions with inlining and the stackref buffer
- // which cause 40kB of stack usage on x86-64 in buggy versions of _PyEval_EvalFrameDefault
- // in computed goto interpreter. The normal usage seen is normally 1-2kB.
+ // See gh-148284: Clang 22 seems to have interactions with inlining
+ // and the stackref buffer which cause 40 kB of stack usage on x86-64
+ // in buggy versions of _PyEval_EvalFrameDefault() in computed goto
+ // interpreter. The normal usage seen is normally 1-2 kB.

Comment on lines +7267 to +7268
// This number should be tuned to follow the C stack consumption
// in _PyEval_EvalFrameDefault on computed goto interpreter.

I suggest adding "Suppress inlining of functions whose stack size exceeds 512 bytes." to explain the purpose of this option:

Suggested change
- // This number should be tuned to follow the C stack consumption
- // in _PyEval_EvalFrameDefault on computed goto interpreter.
+ // gh-148284: Suppress inlining of functions whose stack size exceeds
+ // 512 bytes. This number should be tuned to follow the C stack
+ // consumption in _PyEval_EvalFrameDefault() on computed goto
+ // interpreter.

The Clang option is documented at: https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-finline-max-stacksize

@vstinner

In the past, when I tried to reduce stack memory consumption on Python function calls, I used Py_NO_INLINE with success. But annotating the functions used by ceval.c with Py_NO_INLINE seems harder than using the generic -finline-max-stacksize=512 option. I'm just a little bit worried that a similar issue could occur with other compilers (GCC, MSVC).

In ceval.c, I see that get_exception_handler() is already annotated with Py_NO_INLINE.
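To make the per-function alternative concrete, here is a minimal sketch of a no-inline annotation. It assumes a macro in the spirit of Py_NO_INLINE; the name `MY_NO_INLINE` and the functions `sum_table_cold` and `step` are hypothetical and are not taken from CPython.

```c
#include <stddef.h>

/* Hypothetical stand-in for a macro like Py_NO_INLINE: it expands to the
 * compiler-specific "do not inline" attribute, or to nothing when the
 * compiler is not recognized. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_NO_INLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define MY_NO_INLINE __declspec(noinline)
#else
#  define MY_NO_INLINE
#endif

/* Rarely taken path with a 4 KiB local array (assuming 4-byte int).
 * The annotation keeps that array out of the caller's stack frame. */
static MY_NO_INLINE int
sum_table_cold(int seed)
{
    int table[1024];
    for (size_t i = 0; i < 1024; i++) {
        table[i] = seed + (int)i;
    }
    int total = 0;
    for (size_t i = 0; i < 1024; i++) {
        total += table[i];
    }
    return total;
}

/* Hypothetical hot function: the common case touches no large locals. */
static int
step(int x)
{
    if (x < 0) {
        return sum_table_cold(x);
    }
    return x + 1;
}
```

The trade-off matches the concern raised above: an annotation works across GCC, Clang, and MSVC but has to be applied to each heavy function by hand, whereas -finline-max-stacksize=512 covers the whole file but is specific to Clang.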

@Fidget-Spinner

@vstinner I moved this PR to target main instead: #148334
