## Description
This PR fixes an inconsistency in `pd.api.types.is_string_dtype()` when passed a Categorical series directly versus the dtype of that series.

Currently:
```python
import pandas as pd

# Note: this is a pd.Categorical array, not a pd.Series
cat = pd.Categorical(['A', 'B', 'C'])
print(f"is_string_dtype(cat): {pd.api.types.is_string_dtype(cat)}")  # True
print(f"is_string_dtype(cat.dtype): {pd.api.types.is_string_dtype(cat.dtype)}")  # False
```

When the Categorical object itself is passed, the function checks whether its categories are strings and returns `True`. When the `CategoricalDtype` is passed directly, no such check on the categories happens and the function returns `False`, so the same data yields two different answers.

## Fix
The fix adds explicit handling for CategoricalDtype in both cases (series and dtype) to ensure consistent behavior.
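
The diff itself is not shown here, so the following is only a sketch of what explicit `CategoricalDtype` handling could look like. It is written as a standalone wrapper, not as a patch to pandas. The name `is_string_dtype_consistent` is hypothetical, and treating a categorical as string-like when `categories.inferred_type == "string"` is an assumption chosen so that the dtype case agrees with the array case.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def is_string_dtype_consistent(arr_or_dtype) -> bool:
    """Hypothetical wrapper: give the same answer for an array and its dtype."""
    # Resolve array-likes (Categorical, Series, ...) to their dtype;
    # objects without a .dtype attribute are assumed to be dtypes already.
    dtype = getattr(arr_or_dtype, "dtype", arr_or_dtype)

    if isinstance(dtype, pd.CategoricalDtype):
        categories = dtype.categories
        if categories is None:
            # A bare CategoricalDtype() has no categories to inspect.
            return False
        # Assumption: a categorical counts as string-like when its
        # categories are inferred to be strings.
        return categories.inferred_type == "string"

    # Everything else is delegated to the existing pandas function.
    return is_string_dtype(arr_or_dtype)


cat = pd.Categorical(["A", "B", "C"])
print(is_string_dtype_consistent(cat) == is_string_dtype_consistent(cat.dtype))
```

With this approach the decision depends only on the resolved dtype, so passing the array or its dtype cannot diverge for categoricals.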

## Test Plan
Added a new test file `test_categorical_string_dtype.py` with tests that verify the consistent behavior for both Categorical series and their dtypes.

Fixes pandas-dev#62109

@MengAiDev changed the title from "# BUG: Fix inconsistent results for pd.api.types.is_string_dtype() wi…" to "Fix inconsistent results for pd.api.types.is_string_dtype" on Aug 15, 2025

@MengAiDev closed this on Aug 15, 2025

@MengAiDev deleted the `fix-is-string-dtype-categorical` branch on August 15, 2025 at 07:45

Linked issue: BUG: pd.api.types.is_string_dtype() returns inconsistent results for Categorical series vs dtype