Skip to content

GH-91636: Clear weakrefs created by finalizers. #136401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 13 additions & 7 deletions Lib/test/test_gc.py
Original file line number Diff line number Diff line change
Expand Up @@ -262,9 +262,11 @@ class Cyclic(tuple):
# finalizer.
def __del__(self):

# 5. Create a weakref to `func` now. If we had created
# it earlier, it would have been cleared by the
# garbage collector before calling the finalizers.
# 5. Create a weakref to `func` now. In previous
# versions of Python, this would avoid having it
# cleared by the garbage collector before calling
# the finalizers. Now, weakrefs get cleared after
# calling finalizers.
self[1].ref = weakref.ref(self[0])

# 6. Drop the global reference to `latefin`. The only
Expand Down Expand Up @@ -293,14 +295,18 @@ def func():
# which will find `cyc` and `func` as garbage.
gc.collect()

# 9. Previously, this would crash because `func_qualname`
# had been NULL-ed out by func_clear().
# 9. Previously, this would crash because the weakref
# created in the finalizer revealed the function after
# `tp_clear` was called and `func_qualname`
# had been NULL-ed out by func_clear(). Now, we clear
# weakrefs to unreachable objects before calling `tp_clear`
# but after calling finalizers.
print(f"{func=}")
"""
# We're mostly just checking that this doesn't crash.
rc, stdout, stderr = assert_python_ok("-c", code)
self.assertEqual(rc, 0)
self.assertRegex(stdout, rb"""\A\s*func=<function at \S+>\s*\z""")
# The `func` global is None because the weakref was cleared.
self.assertRegex(stdout, rb"""\A\s*func=None""")
self.assertFalse(stderr)

@refcount_test
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
While performing garbage collection, clear weakrefs to unreachable objects
that are created during running of finalizers. If those weakrefs were are
not cleared, they could reveal unreachable objects.
25 changes: 22 additions & 3 deletions Python/gc.c
Original file line number Diff line number Diff line change
Expand Up @@ -870,7 +870,7 @@ move_legacy_finalizer_reachable(PyGC_Head *finalizers)
* no object in `unreachable` is weakly referenced anymore.
*/
static int
handle_weakrefs(PyGC_Head *unreachable, PyGC_Head *old)
handle_weakrefs(PyGC_Head *unreachable, PyGC_Head *old, bool allow_callbacks)
{
PyGC_Head *gc;
PyObject *op; /* generally FROM_GC(gc) */
Expand All @@ -879,7 +879,9 @@ handle_weakrefs(PyGC_Head *unreachable, PyGC_Head *old)
PyGC_Head *next;
int num_freed = 0;

gc_list_init(&wrcb_to_call);
if (allow_callbacks) {
gc_list_init(&wrcb_to_call);
}

/* Clear all weakrefs to the objects in unreachable. If such a weakref
* also has a callback, move it into `wrcb_to_call` if the callback
Expand Down Expand Up @@ -935,6 +937,11 @@ handle_weakrefs(PyGC_Head *unreachable, PyGC_Head *old)
_PyObject_ASSERT((PyObject *)wr, wr->wr_object == op);
_PyWeakref_ClearRef(wr);
_PyObject_ASSERT((PyObject *)wr, wr->wr_object == Py_None);

if (!allow_callbacks) {
continue;
}

if (wr->wr_callback == NULL) {
/* no callback */
continue;
Expand Down Expand Up @@ -987,6 +994,10 @@ handle_weakrefs(PyGC_Head *unreachable, PyGC_Head *old)
}
}

if (!allow_callbacks) {
return 0;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is worth to add an assert here for old != NULL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar question: The old and allow_callbacks arguments are orthogonal even in older versions?

/* Invoke the callbacks we decided to honor. It's safe to invoke them
* because they can't reference unreachable objects.
*/
Expand Down Expand Up @@ -1737,7 +1748,7 @@ gc_collect_region(PyThreadState *tstate,
}

/* Clear weakrefs and invoke callbacks as necessary. */
stats->collected += handle_weakrefs(&unreachable, to);
stats->collected += handle_weakrefs(&unreachable, to, true);
gc_list_validate_space(to, gcstate->visited_space);
validate_list(to, collecting_clear_unreachable_clear);
validate_list(&unreachable, collecting_set_unreachable_clear);
Expand All @@ -1751,6 +1762,14 @@ gc_collect_region(PyThreadState *tstate,
gc_list_init(&final_unreachable);
handle_resurrected_objects(&unreachable, &final_unreachable, to);

/* Clear weakrefs to objects in the unreachable set. No Python-level
* code must be allowed to access those unreachable objects. During
* delete_garbage(), finalizers outside the unreachable set might run
* and create new weakrefs. If those weakrefs were not cleared, they
* could reveal unreachable objects. Callbacks are not executed.
*/
handle_weakrefs(&final_unreachable, NULL, false);

/* Call tp_clear on objects in the final_unreachable set. This will cause
* the reference cycles to be broken. It may also cause some objects
* in finalizers to be freed.
Expand Down
20 changes: 16 additions & 4 deletions Python/gc_free_threading.c
Original file line number Diff line number Diff line change
Expand Up @@ -1492,9 +1492,9 @@ move_legacy_finalizer_reachable(struct collection_state *state)
}

// Clear all weakrefs to unreachable objects. Weakrefs with callbacks are
// enqueued in `wrcb_to_call`, but not invoked yet.
// optionally enqueued in `wrcb_to_call`, but not invoked yet.
static void
clear_weakrefs(struct collection_state *state)
clear_weakrefs(struct collection_state *state, bool enqueue_callbacks)
{
PyObject *op;
WORKSTACK_FOR_EACH(&state->unreachable, op) {
Expand Down Expand Up @@ -1526,6 +1526,10 @@ clear_weakrefs(struct collection_state *state)
_PyWeakref_ClearRef(wr);
_PyObject_ASSERT((PyObject *)wr, wr->wr_object == Py_None);

if (!enqueue_callbacks) {
continue;
}

// We do not invoke callbacks for weakrefs that are themselves
// unreachable. This is partly for historical reasons: weakrefs
// predate safe object finalization, and a weakref that is itself
Expand Down Expand Up @@ -2211,7 +2215,7 @@ gc_collect_internal(PyInterpreterState *interp, struct collection_state *state,
interp->gc.long_lived_total = state->long_lived_total;

// Clear weakrefs and enqueue callbacks (but do not call them).
clear_weakrefs(state);
clear_weakrefs(state, true);
_PyEval_StartTheWorld(interp);

// Deallocate any object from the refcount merge step
Expand All @@ -2222,11 +2226,19 @@ gc_collect_internal(PyInterpreterState *interp, struct collection_state *state,
call_weakref_callbacks(state);
finalize_garbage(state);

// Handle any objects that may have resurrected after the finalization.
_PyEval_StopTheWorld(interp);
// Handle any objects that may have resurrected after the finalization.
err = handle_resurrected_objects(state);
// Clear free lists in all threads
_PyGC_ClearAllFreeLists(interp);
if (err == 0) {
// Clear weakrefs to objects in the unreachable set. No Python-level
// code must be allowed to access those unreachable objects. During
// delete_garbage(), finalizers outside the unreachable set might
// run and create new weakrefs. If those weakrefs were not cleared,
// they could reveal unreachable objects.
clear_weakrefs(state, false);
}
_PyEval_StartTheWorld(interp);

if (err < 0) {
Expand Down
Loading