From b22fc9fa66b26dcbaff0712b6baf73e88a14059c Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Wed, 9 Jul 2025 23:46:12 +0530
Subject: [PATCH 1/6] DOC: Clarify broadcasting behavior when using lists in DataFrame arithmetic (GH18857)

---
 doc/source/user_guide/basics.rst  |  5 +++++
 doc/source/user_guide/dsintro.rst | 13 +++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index 8155aa0ae03fa..fbfdec6af8759 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,6 +209,11 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
+Be careful when using raw Python lists in binary operations with DataFrames.
+Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
+Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
+To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
+See also: :ref:`numpy broadcasting `
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 89981786d60b5..c635d9157f557 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -650,6 +650,19 @@ row-wise. For example:
 
    df - df.iloc[0]
 
+When using a Python list in arithmetic operations with a DataFrame, the behavior is not element-wise broadcasting.
+Instead, the list is treated as a single object and the operation is performed column-wise, resulting in unexpected output (e.g. arrays inside each cell).
+
+.. ipython:: python
+
+   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
+
+   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
+
+   df + np.array([1, 2, 3])  # Correct broadcasting
+
+   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.
 

From 7bcf683e63fbfb69601236e41efd5c0a495a2843 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:01:03 +0530
Subject: [PATCH 2/6] DOC: Fix external link formatting in basics.rst

---
 doc/source/user_guide/basics.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index fbfdec6af8759..e25dca0da91c9 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,7 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: :ref:`numpy broadcasting `
+See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From ec00318d45485321b5fd91a4afcdf715fca30510 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:03:55 +0530
Subject: [PATCH 3/6] DOC: Removed external link in basics.rst

---
 doc/source/user_guide/basics.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index e25dca0da91c9..d35cc06db9f06 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,6 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From 0e718bc6827ee13335fdf3dc8c4fa58ba177531a Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Tue, 15 Jul 2025 13:23:57 +0530
Subject: [PATCH 4/6] Comment changes

---
 doc/source/user_guide/dsintro.rst | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index c635d9157f557..59938a66bb507 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -655,13 +655,7 @@ Instead, the list is treated as a single object and the operation is performed c
 
 .. ipython:: python
 
-   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
-
-   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
-
-   df + np.array([1, 2, 3])  # Correct broadcasting
-
-   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+   df + np.array([1, 2, 3])
 
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.

From f10c147377e1004b89b22ef5d4d4c771cafe09a8 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Mon, 21 Jul 2025 11:15:13 +0530
Subject: [PATCH 5/6] Changes as per comment

---
 doc/source/user_guide/dsintro.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 59938a66bb507..385238c12f423 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -650,12 +650,12 @@ row-wise. For example:
 
    df - df.iloc[0]
 
-When using a Python list in arithmetic operations with a DataFrame, the behavior is not element-wise broadcasting.
-Instead, the list is treated as a single object and the operation is performed column-wise, resulting in unexpected output (e.g. arrays inside each cell).
+Use .add(array, axis=0) to apply row-wise broadcasting when the array length matches the number of rows —
+this ensures element-wise operations are performed across each row, rather than mistakenly aligning with columns.
 
 .. ipython:: python
 
-   df + np.array([1, 2, 3])
+   df.add(np.array([1, 2, 3]), axis=0)
 
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.

From d438e8b4d8dfa9b100cfb58922d0edb6094b605e Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Mon, 21 Jul 2025 20:51:59 +0530
Subject: [PATCH 6/6] Made the changes

---
 doc/source/user_guide/basics.rst | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index d35cc06db9f06..6a496de3b8e53 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,11 +209,12 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
-Be careful when using raw Python lists in binary operations with DataFrames.
-Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
-Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
-To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
+Use .add(array, axis=0) to broadcast values row-wise, ensuring each element in the array is
+applied to the corresponding row. This avoids accidental column alignment and preserves expected element-wise behavior.
+
+.. ipython:: python
+
+   df.add(np.array([1, 2, 3]), axis=0)
 
 .. ipython:: python
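
To illustrate the `axis=0` wording that patches 5 and 6 introduce, here is a minimal sketch contrasting row matching with the default column matching. The 3x2 `frame` and the names `per_row` and `per_col` are hypothetical, chosen so that one array has one value per row and the other one value per column. The `df` used in the user guide is defined elsewhere and may have a different shape.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 frame (assumption: not the `df` from the user guide).
frame = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

per_row = np.array([10, 20, 30])  # length 3: one value per row
per_col = np.array([100, 200])    # length 2: one value per column

# axis=0 (equivalently axis="index") matches the array against the rows,
# so 10 is added to every cell of row 0, 20 to row 1, 30 to row 2.
by_rows = frame.add(per_row, axis=0)

# The `+` operator uses the default axis="columns", so the array is
# matched against the columns: 100 is added to "A", 200 to "B".
by_cols = frame + per_col

print(by_rows)
print(by_cols)
```

With the default column matching, a 1-D array has to have one entry per column, which is why a length-3 array meant as per-row values needs `axis=0` on a frame like this one.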