API: timestamp resolution inference: default to microseconds when possible #62031
@@ -3770,3 +3769,77 @@ def test_to_datetime_wrapped_datetime64_ps():
        ["1970-01-01 00:00:01.901901901"], dtype="datetime64[ns]", freq=None
    )
    tm.assert_index_equal(result, expected)


class TestToDatetimeInferUnit:
I assume this is already tested elsewhere (possibly in several places), but while developing I just wrote some simple tests for the different cases I encountered.
For example, I now see that pandas/tests/tslibs/test_array_to_datetime.py::TestArrayToDatetimeResolutionInference
and pandas/tests/tslibs/test_strptime.py::TestArrayStrptimeResolutionInference
are failing, so I can integrate these tests there.
@jbrockmendel would you have time to give this a review?
Yes, but it's in line behind a few other reviews I owe.
Draft PR for #58989
This should already make sure that we consistently use 'us' when converting non-numeric data in pd.to_datetime and pd.Timestamp, but if we want to do this, this PR still requires updating lots of tests and docs (and whatsnew) and cleaning up.

Currently the changes here will ensure that we use microseconds more consistently when inferring the resolution while creating datetime64 data. There are two exceptions. First, if the data do not fit in the microsecond range, a different unit is used: ms or s when the values are out of bounds, ns when the values have nanosecond (or finer) precision. Second, if the input data already have a resolution defined (Timestamp objects or numpy datetime64 data), that resolution is kept.
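As a rough illustration of the rule described above, here is a minimal pure-Python sketch. It is not the pandas implementation: `infer_unit` and its parameters are hypothetical names made up for this example, and it only models the int64 range check and the sub-microsecond check. It does not model inputs that already carry a resolution, it truncates sub-unit digits when falling back to ms or s, and it does not check whether a value with nanosecond precision is itself out of bounds for ns.

```python
# Hypothetical sketch of the inference rule; not pandas code.
INT64_MAX = 2**63 - 1  # datetime64 values are stored as int64 counts of the unit


def infer_unit(epoch_seconds: int, subsecond_ns: int = 0) -> str:
    """Pick a unit for a value given as whole seconds since the epoch
    plus a sub-second part in nanoseconds (0 <= subsecond_ns < 10**9)."""
    # Anything finer than a microsecond needs nanosecond resolution.
    if subsecond_ns % 1000 != 0:
        return "ns"
    # Otherwise prefer microseconds, falling back to coarser units
    # only when the value would overflow int64.
    for unit, per_second in (("us", 10**6), ("ms", 10**3), ("s", 1)):
        ns_per_unit = 10**9 // per_second
        value = epoch_seconds * per_second + subsecond_ns // ns_per_unit
        if -INT64_MAX <= value <= INT64_MAX:
            return unit
    raise OverflowError("value does not fit in any supported unit")


print(infer_unit(0))             # plain second-precision value
print(infer_unit(0, 1))          # one nanosecond past the epoch
print(infer_unit(10**13))        # too far out for microseconds
```

The loop order encodes the preference: microseconds first, then coarser units only as a fallback for out-of-bounds values.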