Commit 8759c5d

BUG: Series.replace with CoW when made from an Index (#61972)
1 parent b2fbf09 commit 8759c5d

3 files changed: +27 -8 lines changed


doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -941,6 +941,7 @@ Other
 - Bug in Dataframe Interchange Protocol implementation was returning incorrect results for data buffers' associated dtype, for string and datetime columns (:issue:`54781`)
 - Bug in ``Series.list`` methods not preserving the original :class:`Index`. (:issue:`58425`)
 - Bug in ``Series.list`` methods not preserving the original name. (:issue:`60522`)
+- Bug in ``Series.replace`` when the Series was created from an :class:`Index` and Copy-On-Write is enabled (:issue:`61622`)
 - Bug in printing a :class:`DataFrame` with a :class:`DataFrame` stored in :attr:`DataFrame.attrs` raised a ``ValueError`` (:issue:`60455`)
 - Bug in printing a :class:`Series` with a :class:`DataFrame` stored in :attr:`Series.attrs` raised a ``ValueError`` (:issue:`60568`)
 - Fixed bug where the :class:`DataFrame` constructor misclassified array-like objects with a ``.name`` attribute as :class:`Series` or :class:`Index` (:issue:`61443`)

pandas/core/internals/blocks.py

Lines changed: 15 additions & 8 deletions
@@ -10,7 +10,6 @@
     final,
 )
 import warnings
-import weakref
 
 import numpy as np
 
@@ -863,14 +862,22 @@ def replace_list(
                 )
 
                 if i != src_len:
-                    # This is ugly, but we have to get rid of intermediate refs
-                    # that did not go out of scope yet, otherwise we will trigger
-                    # many unnecessary copies
+                    # This is ugly, but we have to get rid of intermediate refs. We
+                    # can simply clear the referenced_blocks if we already copied,
+                    # otherwise we have to remove ourselves
+                    self_blk_ids = {
+                        id(b()): i for i, b in enumerate(self.refs.referenced_blocks)
+                    }
                     for b in result:
-                        ref = weakref.ref(b)
-                        b.refs.referenced_blocks.pop(
-                            b.refs.referenced_blocks.index(ref)
-                        )
+                        if b.refs is self.refs:
+                            # We are still sharing memory with self
+                            if id(b) in self_blk_ids:
+                                # Remove ourselves from the refs; we are temporary
+                                self.refs.referenced_blocks.pop(self_blk_ids[id(b)])
+                        else:
+                            # We have already copied, so we can clear the refs to avoid
+                            # future copies
+                            b.refs.referenced_blocks.clear()
                 new_rb.extend(result)
             rb = new_rb
         return rb
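
The bookkeeping change in this hunk can be illustrated outside pandas. What follows is a minimal sketch, not pandas code: `ToyRefs`, `ToyBlock`, `drop_old`, and `drop_new` are hypothetical names that only mimic the `refs.referenced_blocks` list of weak references seen in the diff. It assumes the failure mode is a result block that shares `refs` with `self` but is not itself registered in that list, in which case the removed `list.index` lookup raises `ValueError`, while the added id-based lookup simply skips the block.

```python
import weakref


class ToyRefs:
    """Hypothetical stand-in for the object stored in ``Block.refs``."""

    def __init__(self):
        self.referenced_blocks = []

    def add(self, obj):
        # Track ``obj`` through a weak reference, as the diff's list does.
        self.referenced_blocks.append(weakref.ref(obj))


class ToyBlock:
    """Hypothetical stand-in for a block; it only carries ``refs``."""

    def __init__(self, refs):
        self.refs = refs


def drop_old(result):
    # Removed logic: assumes every result block is registered in its refs.
    for b in result:
        ref = weakref.ref(b)
        b.refs.referenced_blocks.pop(b.refs.referenced_blocks.index(ref))


def drop_new(self_blk, result):
    # Added logic: map live referents to their positions, then pop only
    # the blocks that are actually registered.
    self_blk_ids = {
        id(r()): i for i, r in enumerate(self_blk.refs.referenced_blocks)
    }
    for b in result:
        if b.refs is self_blk.refs:
            # Still sharing memory with ``self_blk``.
            if id(b) in self_blk_ids:
                self_blk.refs.referenced_blocks.pop(self_blk_ids[id(b)])
        else:
            # Already copied: the new refs can be cleared outright.
            b.refs.referenced_blocks.clear()


shared = ToyRefs()
owner = ToyBlock(shared)
shared.add(owner)

# A block that shares ``refs`` with ``owner`` but was never registered.
stray = ToyBlock(shared)

try:
    drop_old([stray])
    old_failed = False
except ValueError:
    # list.index() cannot find a weak reference to ``stray``.
    old_failed = True

# The id-based variant leaves the list untouched instead of raising.
drop_new(owner, [stray])
```

In this toy setup `old_failed` ends up `True`, and `shared.referenced_blocks` still holds only the reference to `owner` after `drop_new` runs.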

pandas/tests/series/methods/test_replace.py

Lines changed: 11 additions & 0 deletions
@@ -3,6 +3,8 @@
 import numpy as np
 import pytest
 
+import pandas.util._test_decorators as td
+
 import pandas as pd
 import pandas._testing as tm
 from pandas.core.arrays import IntervalArray
@@ -715,3 +717,12 @@ def test_replace_all_NA(self):
         result = df.replace({r"^#": "$"}, regex=True)
         expected = pd.Series([pd.NA, pd.NA])
         tm.assert_series_equal(result, expected)
+
+
+@td.skip_if_no("pyarrow")
+def test_replace_from_index():
+    # https://github.com/pandas-dev/pandas/issues/61622
+    idx = pd.Index(["a", "b", "c"], dtype="string[pyarrow]")
+    expected = pd.Series(["d", "b", "c"], dtype="string[pyarrow]")
+    result = pd.Series(idx).replace({"z": "b", "a": "d"})
+    tm.assert_series_equal(result, expected)
