Commit 994cdf2

Fix describe() for ExtensionArrays with multiple internal dtypes
1 parent 8de38e8 commit 994cdf2

3 files changed: +30 -0 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -927,6 +927,7 @@ Other
 - Bug in :meth:`Index.sort_values` when passing a key function that turns values into tuples, e.g. ``key=natsort.natsort_key``, would raise ``TypeError`` (:issue:`56081`)
 - Bug in :meth:`MultiIndex.fillna` error message was referring to ``isna`` instead of ``fillna`` (:issue:`60974`)
 - Bug in :meth:`Series.describe` where median percentile was always included when the ``percentiles`` argument was passed (:issue:`60550`).
+- Bug in :meth:`Series.describe` where statistics with multiple dtypes for ExtensionArrays were coerced to ``float64`` which raised a ``DimensionalityError``` (:issue:`61707`)
 - Bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
 - Bug in :meth:`Series.dt` methods in :class:`ArrowDtype` that were returning incorrect values. (:issue:`57355`)
 - Bug in :meth:`Series.isin` raising ``TypeError`` when series is large (>10**6) and ``values`` contains NA (:issue:`60678`)

pandas/core/methods/describe.py

Lines changed: 6 additions & 0 deletions
@@ -251,6 +251,12 @@ def describe_numeric_1d(series: Series, percentiles: Sequence[float]) -> Series:
                 import pyarrow as pa
 
                 dtype = ArrowDtype(pa.float64())
+
+        elif any(type(item) != type(d[0]) for item in d):
+            # GH61707: describe() doesn't work on EAs
+            # when series entries cannot be cast to float64, set dtype=None
+            dtype = None
+
         else:
             dtype = Float64Dtype()
     elif series.dtype.kind in "iufb":
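
To make the new `elif` condition concrete, here is a minimal sketch of the exact-type comparison it performs. The helper name `has_mixed_types` and the sample lists are hypothetical. In the diff the list is called `d`, and its construction is not shown in this hunk, so the values below are stand-ins modeled on the statistics the accompanying test expects.

```python
from decimal import Decimal


def has_mixed_types(stats):
    """Return True if any entry's exact type differs from the first entry's type.

    Hypothetical helper mirroring the condition
    `any(type(item) != type(d[0]) for item in d)` from the diff.
    """
    first_type = type(stats[0])
    return any(type(item) != first_type for item in stats)


# Stand-in statistics: an int count followed by Decimal values.
decimal_stats = [3, Decimal("2.5"), Decimal("1"), Decimal("3")]
# Stand-in statistics that all share one type.
float_stats = [3.0, 2.0, 1.0, 3.0]
# An int count next to float values.
int_and_float_stats = [3, 2.0, 1.0, 3.0]

print(has_mixed_types(decimal_stats))        # True
print(has_mixed_types(float_stats))          # False
print(has_mixed_types(int_and_float_stats))  # True
```

Because the comparison uses `type(...) != type(...)` rather than `isinstance`, an `int` next to `float` values also registers as mixed. Whether that case arises in practice depends on how `d` is populated upstream, which this hunk does not show.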

pandas/tests/series/methods/test_describe.py

Lines changed: 23 additions & 0 deletions
@@ -92,6 +92,29 @@ def test_describe_empty_object(self):
         # ensure NaN, not None
         assert np.isnan(result.iloc[2])
         assert np.isnan(result.iloc[3])
+
+    def test_series_cast_to_float64_fails(self):
+        # https://github.com/pandas-dev/pandas/issues/61707
+        from decimal import Decimal
+
+        from pandas.tests.extension.decimal import to_decimal
+
+        s = Series(to_decimal([1, 2.5, 3]), dtype="decimal")
+
+        expected = Series(
+            [
+                3,
+                Decimal("2.166666666666666666666666667"),
+                Decimal("0.8498365855987974716713706849"),
+                Decimal("1"),
+                Decimal("3"),
+            ],
+            index=["count", "mean", "std", "min", "max"],
+            dtype="object",
+        )
+
+        result = s.describe(percentiles=[])
+        tm.assert_series_equal(result, expected)
 
     def test_describe_with_tz(self, tz_naive_fixture):
         # GH 21332
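
The expected Series in the new test mixes an `int` count with `Decimal` statistics, which is why it is declared with `dtype="object"`. The following sketch uses only the standard-library `decimal` module, not pandas or the `to_decimal` test helper, to show where that mix of types comes from. It assumes the default decimal context (28 significant digits); the variable names are illustrative and do not appear in the commit.

```python
from decimal import Decimal, getcontext

# The same three inputs the test passes to to_decimal, written as Decimals.
values = [Decimal(1), Decimal("2.5"), Decimal(3)]

count = len(values)         # a plain Python int
mean = sum(values) / count  # Decimal / int stays a Decimal at context precision

stats = [count, mean, min(values), max(values)]

print(getcontext().prec)                  # 28 under the default context
print([type(x).__name__ for x in stats])  # ['int', 'Decimal', 'Decimal', 'Decimal']
print(mean)
```

Under the default context the printed mean carries 28 significant digits, the same precision as the `mean` literal in the test's expected values.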
