BUG: Fix inconsistency when read_csv reads MultiIndex with empty values (#59560) #62644

allamlobna · 2025-10-10T18:44:34Z

Closes #59560 (BUG: inconsistency when read_csv reads MultiIndex with empty values)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Summary

This PR fixes an inconsistency where read_csv replaced empty
MultiIndex column level values with automatically generated
"Unnamed: x_level_y" placeholders.

Empty values are now preserved as empty strings (""),
matching how empty values in a MultiIndex index are handled
and giving consistent roundtrip results between to_csv and read_csv.

Changes

Added _clean_column_levels() helper to normalize MultiIndex
column labels in BaseParser.
Updated _extract_multi_indexer_columns() to use it.
Adjusted test_multi_index_unnamed() expectations.
Added regression test for GH#59560 in test_header.py.
Added whatsnew entry under Bug Fixes → IO.
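
To make the intended normalization concrete, here is a minimal standalone sketch. It is not the code in this PR: the function name `clean_column_levels_sketch`, the regular expression, and the assumption that placeholders always have the exact shape "Unnamed: x_level_y" are all hypothetical and only illustrate the idea of mapping generated placeholders back to "".

```python
import re

# Assumed placeholder shape: "Unnamed: <column position>_level_<level number>".
# This pattern is an illustration, not taken from the pandas source.
_PLACEHOLDER = re.compile(r"^Unnamed: \d+_level_\d+$")


def clean_column_levels_sketch(columns):
    """Return MultiIndex-style column tuples with placeholders replaced by ""."""
    cleaned = []
    for col in columns:
        cleaned.append(
            tuple(
                "" if isinstance(label, str) and _PLACEHOLDER.match(label) else label
                for label in col
            )
        )
    return cleaned


columns = [("a", "Unnamed: 0_level_1"), ("a", "x"), ("Unnamed: 2_level_0", "y")]
print(clean_column_levels_sketch(columns))
# → [('a', ''), ('a', 'x'), ('', 'y')]
```

One limitation of a pattern-based approach like this: a header cell that literally contains text of the form "Unnamed: 0_level_1" would also be blanked, so a real implementation would likely need to track which cells were empty in the file rather than match on the generated name.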

Impact

Aligns column and index behavior for MultiIndex CSVs.
No change for single-level headers.
Tested successfully with both the C and Python parsers.


allamlobna · 2025-10-10T19:56:12Z

Hi, this PR currently fixes GH#59560, which reports inconsistent handling of empty MultiIndex level values in read_csv.

Summary of changes:

Added _clean_column_levels() to normalize empty or automatically generated "Unnamed: x_level_y" placeholders when reading MultiIndex columns.
Hooked this into _extract_multi_indexer_columns() in the CSV parser so empty header cells are preserved as "" instead of "Unnamed:".

Observed side effect:
Because this normalization happens in shared column-cleaning logic, it also affects other I/O readers (Excel and HTML), which now return empty strings instead of "Unnamed:" placeholders.
Several tests in those modules fail as a result, since they expect the old "Unnamed:" behavior.

Question for maintainers:
Would you prefer that I:

Limit the scope to CSV in this PR (restore previous behavior for Excel/HTML to keep this focused on GH#59560),

or

Extend the change across all I/O readers and update the corresponding parser tests for consistency in this same PR?

I’m happy to take either direction; I just wanted to check what’s preferable from a review and scope standpoint.

allamlobna added 2 commits October 10, 2025 18:31

BUG: Fix inconsistency when read_csv reads MultiIndex with empty valu…

aa3940e

…es (pandas-dev#59560)

TST: apply Ruff auto-fix

a399a8e

allamlobna force-pushed the bugfix/clean-empty-vals-multiindex branch from 6a40ca3 to a399a8e Compare October 10, 2025 19:08

allamlobna marked this pull request as draft October 10, 2025 20:09
