Skip to content

Subinterpreters don't properly clean up threads #128639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ZeroIntensity opened this issue Jan 8, 2025 · 3 comments
Open

Subinterpreters don't properly clean up threads #128639

ZeroIntensity opened this issue Jan 8, 2025 · 3 comments
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Jan 8, 2025

Crash report

What happened?

I found this issue a little while back, but I'm finally getting around to fixing it.

Currently, subinterpreter finalization assumes that there's only one thread left. So, any remaining threads--daemon or non-daemon--crash the interpreter upon finalizing:

import _interpreters

interp = _interpreters.create()
source = """
import threading
import time

def hello():
    time.sleep(1)

t = threading.Thread(target=hello)
t.start()
"""
_interpreters.run_string(interp, source)

This results in an assertion failure on my end:

python: Python/pystate.c:1969: tstate_activate: Assertion `!tstate->_status.bound_gilstate || tstate == gilstate_tss_get((tstate->interp->runtime))' failed.

So, there's a few things that need to get fixed:

  • The finalization process shouldn't assume that there's one thread. Instead, it should just dive straight into Py_EndInterpreter, which will properly clean things up.
  • It also shouldn't try to manually delete threads via attaching to them and then calling PyThreadState_Delete. That should also happen in Py_EndInterpreter, after all threads have finished.
  • Subinterpreter finalization itself happens way too late. _PyRuntimeState_SetFinalizing has been set, meaning that all threads will be already blocked, and thus cannot shutdown via threading._shutdown, resulting in a deadlock. finalize_subinterpreters should be called before that happens.

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

No response

Linked PRs

@ZeroIntensity ZeroIntensity added extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump topic-subinterpreters 3.13 bugs and security fixes 3.14 bugs and security fixes labels Jan 8, 2025
@ericsnowcurrently
Copy link
Member

FWIW, I thought I had created an issue about cleaning up lingering thread states automatically. Would you mind double-checking and closing it, if so?

@ZeroIntensity ZeroIntensity marked this as a duplicate of #113146 May 19, 2025
ericsnowcurrently pushed a commit that referenced this issue May 19, 2025
…-128640)

Incidentally, this also fixed the warning not showing up if a subinterpreter wasn't
cleaned up via _interpreters.destroy. I had to update some of the tests as a result.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 19, 2025
…on (pythongh-128640)

Incidentally, this also fixed the warning not showing up if a subinterpreter wasn't
cleaned up via _interpreters.destroy. I had to update some of the tests as a result.
(cherry picked from commit 9859791)

Co-authored-by: Peter Bierma <[email protected]>
ZeroIntensity added a commit to ZeroIntensity/cpython that referenced this issue May 19, 2025
ericsnowcurrently pushed a commit that referenced this issue May 19, 2025
…tion (gh-128640)" (gh-134256)

This reverts commit 9859791.

The original change broke the iOS and android buildbots, where the tests are run single-process.
@ZeroIntensity
Copy link
Member Author

For those who aren't me or Eric: we had to revert the fix because it breaks the test suite when run with --single-process. I'm still trying to figure out why.

@ZeroIntensity
Copy link
Member Author

@ericsnowcurrently I finally figured out the buildbot failures. This is an awesome bug.

Here's a repro that crashes with my patch applied:

import _interpreters
import types
import time

interp = _interpreters.create(
    types.SimpleNamespace(
        use_main_obmalloc=False,
        allow_fork=False,
        allow_exec=False,
        allow_threads=True,
        allow_daemon_threads=True,
        check_multi_interp_extensions=True,
        gil='own',
    )
)
_interpreters.exec(interp, f"""if True:
    import threading
    import time

    def task():
        time.sleep(1)

    threads = [threading.Thread(target=task, daemon=True) for _ in range(3)]
    for t in threads:
        t.start()
    """)
_interpreters.destroy(interp)
time.sleep(2)

The problem is that something subtly changed with daemon thread shutdown in-between writing my patch and merging it. So, after the time.sleep() call, the thread would still be alive and crash. Unfortunately for me, I randomly chose 100 seconds as the sleep time, so if the process exited within those 100 seconds, nothing would crash. I'll put up the fixed version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-crash A hard crash of the interpreter, possibly with a core dump
Projects
Status: Todo
Development

No branches or pull requests

2 participants