Automatically clean thread and loop on fork #1790

martindurant · 2025-02-14T14:07:44Z

No description provided.

mxmlnkn · 2025-06-12T08:52:49Z

fsspec/asyn.py

+    global lock
+    loop[0] = None
+    iothread[0] = None
+    lock = None


Are you sure this works? Why is only the lock declared as global? Furthermore, there is no lock in this file, only _lock and get_lock; this was changed in ffe57d6. Also, this function is basically identical to reset_lock, so why not use that function?

mxmlnkn · 2025-06-12T10:33:38Z

This is a step in the right direction. I need something like this because fusepy forks to daemonize itself.
Even better would be some method to close all event loops and threads before forking!

Currently, I am trying this:

import threading

try:
    import fsspec.asyn

    def tryCloseFSSpecIOBeforeFork() -> None:
        try:
            # We only cover a single use case: Only the main thread and a single fsspecIO thread are running.
            # Then, we can close it presumably safely. Anything else would possibly not be thread-safe.
            # See comments below about resetting the lock.
            if (
                len(threading.enumerate()) != 2
                or not all(thread.name in ["MainThread", "fsspecIO"] for thread in threading.enumerate())
                or fsspec.asyn.iothread[0] is None
            ):
                return

            # The lock was changed 3 years ago to a _lock and get_lock singleton.
            # https://github.com/fsspec/filesystem_spec/commit/ffe57d6eabe517b4c39c27487fc45b804d314b58
            # But acquiring the lock does not help us anyway to fix thread-safety in case other threads have
            # a reference to the event loop after a call to get_loop. Therefore, do the threading checks above
            # and don't bother with locking.
            # These are lists with a single element, which is None at first. Why?
            ioThread = fsspec.asyn.iothread[0]
            eventLoop = fsspec.asyn.loop[0]

            if eventLoop is not None:  # Should always be true because else iothread[0] would also be None.
                # Calling eventLoop.stop() directly does not work for some reason.
                # Probably needs to be called on the executing thread.
                eventLoop.call_soon_threadsafe(eventLoop.stop)
                if ioThread is not None and ioThread.is_alive():
                    ioThread.join()

            # This should be safe as long as no other thread is using the event loop.
            # This should not be done if there are potentially other threads using the event loop
            # because get_loop only accounts for race conditions during event loop creation, but after that,
            # it simply returns the event loop to be used without any lock, i.e., changing or deleting the
            # event loop is not thread-safe!
            # https://github.com/fsspec/filesystem_spec/blob/26f1ea75351e39a80b29b27bea792351f3e8da9f/
            #   fsspec/asyn.py#L141
            # The next call to fsspec.asyn.get_loop should simply recreate a new thread and event loop.
            # But, for my use case, the next call would only be after a fork anyway.
            # Normally, there is no reason to reset the lock on this thread. However, this is to be used before
            # forking and we even check against other threads running above, so it should be safe to reset the
            # lock.
            # See also https://github.com/fsspec/filesystem_spec/pull/1790
            reset_after_fork = getattr(fsspec.asyn, 'reset_after_fork', None)
            reset_lock = getattr(fsspec.asyn, 'reset_lock', None)
            if reset_lock:
                reset_lock()
            elif reset_after_fork:
                # Theoretically, this call is redundant because fsspec registers it to be called on fork if it exists.
                reset_after_fork()
            else:
                fsspec.asyn.iothread[0] = None
                fsspec.asyn.loop[0] = None
                fsspec.asyn.lock = None

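        except Exception:
            # Best effort: ignore any cleanup errors.
            pass

except ImportError:
    # Assumed fallback: without fsspec there is nothing to clean up.
    def tryCloseFSSpecIOBeforeFork() -> None:
        pass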
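
The stop-then-join step in the snippet above can be tried in isolation. The following is a minimal sketch using a plain asyncio loop on a daemon thread, not fsspec's own code; `start_loop_thread` and `stop_loop_thread` are hypothetical helper names introduced only for illustration.

```python
import asyncio
import threading


def start_loop_thread(name="demoIO"):
    """Run a fresh event loop forever on a daemon thread (hypothetical helper)."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, name=name, daemon=True)
    thread.start()
    return loop, thread


def stop_loop_thread(loop, thread, timeout=5.0):
    """Stop the loop from another thread, wait for its thread, then close it."""
    # loop.stop() is not thread-safe, so schedule it on the loop's own thread.
    loop.call_soon_threadsafe(loop.stop)
    thread.join(timeout)
    if thread.is_alive():
        return False  # Still running; closing the loop now would raise.
    loop.close()
    return True


loop, thread = start_loop_thread()
stopped = stop_loop_thread(loop, thread)
```

Scheduling `loop.stop` through `call_soon_threadsafe` is what makes the subsequent `join()` return; this only covers the case where no other thread still holds a reference to the loop.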

Also, consider the comment here:

def reset_lock():
    """Reset the global lock.

    This should be called only on the init of a forked process to reset the lock to
    None, enabling the new forked process to get a new lock.
    """

I think you could also test for correct usage by checking that there are no threads started for the current process, e.g., the way I do it in ratarmount to check before forking:

# Note that this will not detect threads started in shared libraries, only those started via "threading".
if not foreground and len(threading.enumerate()) > 1:
    threadNames = [thread.name for thread in threading.enumerate() if thread.name != "MainThread"]
    # Fix FUSE hangs with: https://unix.stackexchange.com/a/713621/111050
    raise ValueError(
        "Daemonizing FUSE into the background may result in errors or unkillable hangs because "
        f"there are threads still open: {', '.join(threadNames)}!\nCall ratarmount with -f or --foreground."
    )

mxmlnkn · 2025-06-12T11:52:45Z

Unfortunately, the above does not work because it triggers the "This class is not fork-safe" error. Is that check still necessary after this PR? I get this for the http:// backend, while it seems to work for the github:// backend.
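
For context, a check of this kind is commonly implemented as a PID comparison: the object records the process ID at construction and refuses to hand out its loop from any other process. The sketch below assumes that design; `ForkGuarded` is a hypothetical stand-in, not fsspec's actual class, and the fork is simulated by faking the recorded PID.

```python
import os


class ForkGuarded:
    """Hypothetical class that remembers which process created it."""

    def __init__(self):
        self._pid = os.getpid()
        self._loop = object()  # Stand-in for a per-process event loop.

    @property
    def loop(self):
        # After a fork, the child sees a different PID than the one recorded.
        if self._pid != os.getpid():
            raise RuntimeError("This class is not fork-safe")
        return self._loop


guard = ForkGuarded()
same_process_ok = guard.loop is not None

# Simulate access from a forked child by faking the recorded PID.
guard._pid = -1
try:
    guard.loop
    raised = False
except RuntimeError:
    raised = True
```

Under this assumption, resetting module-level state before or after the fork would not silence the error for instances created in the parent, since each instance carries its own recorded PID.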

martindurant · 2025-06-12T14:09:52Z

Please submit your ideas as a new PR so that we can discuss it in isolation.

> better would be some method to close all event loops and threads before forking

Not everyone will want this, sometimes the original process wants to keep using those resources.

> This class is not fork-safe. Is that check still necessary after this PR

If we are sure the child process can work, setting up its new resources, then we no longer need the check.

> Why is only the lock declared as global.

The other objects are mutated in-place.

> Furthermore, there is no lock in this file, only _lock

This may well be a typo.

Automatically clean thread and loop on fork

b97fff7

martindurant mentioned this pull request Feb 14, 2025

Improve Performance on LocalFileSystem.ls() if detail=False #1789

Merged

martindurant added 2 commits March 12, 2025 11:33

Merge branch 'master' into fork-safe

5825017

for win

785af5f

martindurant merged commit 6b85a47 into fsspec:master Mar 12, 2025
10 checks passed

mxmlnkn reviewed Jun 12, 2025

mxmlnkn mentioned this pull request Jun 12, 2025

[Feature Request] Fork Safety for async filesystems #835

Open

martindurant deleted the fork-safe branch June 12, 2025 14:10
