Skip to content

Commit 3033eea

Browse files
authored
Fix #1334: Update DataFrame.from_records signature and add tests (#1335)
* feat: add type hints and tests for DataFrame.from_records method * Fix DataFrame.from_records type annotations and add pd.Index test- Change np.ndarray to np_2d_array for data parameter- Change SequenceNotStr[str] to ListLike for columns and exclude parameters- Add test case with pd.Index as columns parameterAddresses review feedback from Dr-Irv * Fix DataFrame.from_records type annotations- Update data parameter to use Sequence[SequenceNotStr] | Sequence[Mapping[str, Scalar]] | Mapping[str, Sequence[Scalar]]- Update columns and exclude parameters to use ListLike | None = None- Update index parameter to use SequenceNotStr for better type precisionAddresses review feedback from Dr-Irv on issue #1334 * feat: add type hints and tests for DataFrame.from_records method- Add np_2darray support to data parameter type annotation- Add comprehensive tests for DataFrame.from_records in test_frame.py- Fix NumPy 2.0 compatibility in test (S1 instead of a1)- Test covers np.ndarray, list of tuples, pd.Index columns, and structured arrays- Addresses GitHub issue #1334 * fix: improve DataFrame.from_records type annotations - Update data parameter types to accept Sequence[Mapping[str, Any]] and Mapping[str, SequenceNotStr[Any]] - Add comprehensive tests for np.ndarray, tuples, and mapping inputs - Address GitHub issue #1334 per Dr-Irv feedback - All 207 DataFrame tests pass with no issues * fix: enhance DataFrame.from_records type annotations per issue #1334 The main changes include: - Updated data parameter types from overly restrictive Scalar to more flexible Any types - Added .reshape(2, 2) to numpy array test to handle CI compatibility issues across different numpy versions - Included a test for mapping of sequences using DataFrame constructor (which seems to be the right approach for that data type) All 207 DataFrame tests still pass * fix: enhance DataFrame.from_records type annotations per issue #1334 Addresses Dr-Irv's feedback: - Updated data parameter types from restrictive Scalar to flexible Any - Added .reshape(2, 2) to numpy test for CI compatibility - Added proper dictionary tests (list and single) without tuple conversion - Added Mapping[str, Any] type support for single dictionaries - Used DataFrame constructor for mapping sequences test All tests pass. * fix: change index parameter to list[str] for type compatibility * fix: update DataFrame.from_records index parameter to accept Hashable values - Change index parameter type from SequenceNotStr[str] to SequenceNotStr[Hashable] - Apply black formatting to test files - Resolves CI type checking issues per Dr-Irv feedback This allows index parameter to accept integers and other hashable values, not just strings, matching pandas runtime behavior. * fix: use DataFrame.from_records instead of DataFrame constructor for mapping dict test. Applied black formatting and pre-commit fixes
1 parent 11424d5 commit 3033eea

File tree

2 files changed

+132
-4
lines changed

2 files changed

+132
-4
lines changed

pandas-stubs/core/frame.pyi

Lines changed: 10 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -545,10 +545,16 @@ class DataFrame(NDFrame, OpsMixin, _GetItemHack):
545545
@classmethod
546546
def from_records(
547547
cls,
548-
data,
549-
index=...,
550-
exclude: SequenceNotStr[str] | None = None,
551-
columns: SequenceNotStr[str] | None = None,
548+
data: (
549+
np_2darray
550+
| Sequence[SequenceNotStr]
551+
| Sequence[Mapping[str, Any]]
552+
| Mapping[str, Any]
553+
| Mapping[str, SequenceNotStr[Any]]
554+
),
555+
index: str | SequenceNotStr[Hashable] | None = None,
556+
columns: ListLike | None = None,
557+
exclude: ListLike | None = None,
552558
coerce_float: bool = False,
553559
nrows: int | None = None,
554560
) -> Self: ...

tests/test_frame.py

Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4688,3 +4688,125 @@ def test_unstack() -> None:
46884688
),
46894689
pd.DataFrame,
46904690
)
4691+
4692+
4693+
def test_from_records() -> None:
4694+
4695+
# test with np.ndarray
4696+
arr = np.array([[1, "a"], [2, "b"]], dtype=object).reshape(2, 2)
4697+
check(assert_type(pd.DataFrame.from_records(arr), pd.DataFrame), pd.DataFrame)
4698+
4699+
# testing with list of tuples
4700+
data_tuples = [(1, "a"), (2, "b"), (3, "c")]
4701+
check(
4702+
assert_type(
4703+
pd.DataFrame.from_records(data_tuples, columns=["id", "name"]),
4704+
pd.DataFrame,
4705+
),
4706+
pd.DataFrame,
4707+
)
4708+
4709+
# testing with pd.Index as columns parameter
4710+
check(
4711+
assert_type(
4712+
pd.DataFrame.from_records(data_tuples, columns=pd.Index(["id", "name"])),
4713+
pd.DataFrame,
4714+
),
4715+
pd.DataFrame,
4716+
)
4717+
4718+
# Testing with list of tuples (instead of structured array for type compatibility)
4719+
data_array_tuples = [(1, "a"), (2, "b")]
4720+
check(
4721+
assert_type(
4722+
pd.DataFrame.from_records(data_array_tuples, columns=["id", "name"]),
4723+
pd.DataFrame,
4724+
),
4725+
pd.DataFrame,
4726+
)
4727+
4728+
# testing with list of dictionaries
4729+
data_dict_list = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
4730+
check(
4731+
assert_type(
4732+
pd.DataFrame.from_records(data_dict_list, columns=["id", "name"]),
4733+
pd.DataFrame,
4734+
),
4735+
pd.DataFrame,
4736+
)
4737+
4738+
# test with single dictionary
4739+
data_single_dict = {"id": 1, "name": "a"}
4740+
check(
4741+
assert_type(
4742+
pd.DataFrame.from_records(data_single_dict, index=["0"]), pd.DataFrame
4743+
),
4744+
pd.DataFrame,
4745+
)
4746+
4747+
# testing with mapping of sequences
4748+
data_mapping_dict = {"id": [1, 2], "name": ["a", "b"]}
4749+
check(
4750+
assert_type(pd.DataFrame.from_records(data_mapping_dict), pd.DataFrame),
4751+
pd.DataFrame,
4752+
)
4753+
4754+
# Testing with index parameter as string
4755+
check(
4756+
assert_type(
4757+
pd.DataFrame.from_records(data_tuples, columns=["id", "name"], index="id"),
4758+
pd.DataFrame,
4759+
),
4760+
pd.DataFrame,
4761+
)
4762+
4763+
# Testing with index parameter as sequence
4764+
check(
4765+
assert_type(
4766+
pd.DataFrame.from_records(
4767+
data_tuples, columns=["id", "name"], index=["id"]
4768+
),
4769+
pd.DataFrame,
4770+
),
4771+
pd.DataFrame,
4772+
)
4773+
4774+
# Testing with exclude parameter
4775+
check(
4776+
assert_type(
4777+
pd.DataFrame.from_records(
4778+
[(1, "a", "extra"), (2, "b", "extra")],
4779+
columns=["id", "name", "extra"],
4780+
exclude=["extra"],
4781+
),
4782+
pd.DataFrame,
4783+
),
4784+
pd.DataFrame,
4785+
)
4786+
4787+
# Testing with all parameters
4788+
check(
4789+
assert_type(
4790+
pd.DataFrame.from_records(
4791+
data_tuples,
4792+
index=None,
4793+
columns=["id", "name"],
4794+
exclude=None,
4795+
coerce_float=True,
4796+
nrows=2,
4797+
),
4798+
pd.DataFrame,
4799+
),
4800+
pd.DataFrame,
4801+
)
4802+
4803+
# Testing parameter order
4804+
check(
4805+
assert_type(
4806+
pd.DataFrame.from_records(
4807+
data_tuples, columns=["id", "name"], exclude=None
4808+
),
4809+
pd.DataFrame,
4810+
),
4811+
pd.DataFrame,
4812+
)

0 commit comments

Comments
 (0)