Skip to content

ENH: add explicit overflow guard for large nanosecond timedeltas#63808

Closed
vishwas-droid wants to merge 4 commits intopandas-dev:mainfrom
vishwas-droid:fix-timedelta-overflow
Closed

ENH: add explicit overflow guard for large nanosecond timedeltas#63808
vishwas-droid wants to merge 4 commits intopandas-dev:mainfrom
vishwas-droid:fix-timedelta-overflow

Conversation

@vishwas-droid
Copy link

I noticed that constructing a Timedelta from very large integer values in nanoseconds could overflow at the C boundary. Depending on where the overflow occurred, this could result in inconsistent low-level errors instead of a clear OutOfBoundsTimedelta.

To fix this, I added an early overflow check when handling scalar nanosecond values, so that OutOfBoundsTimedelta is raised deterministically before any unsafe casting occurs. A regression test is included to ensure this behavior is preserved going forward.

@vishwas-droid vishwas-droid force-pushed the fix-timedelta-overflow branch from 01241e6 to b6fa495 Compare January 22, 2026 13:21
@jorisvandenbossche
Copy link
Member

Thanks for the PR!

Do you have an example where this currently does not give a nice OutOfBounds error?
If I run the example from your added test case locally, I already get a proper error message:

>>> pd.to_timedelta(10**20, unit="ns")
...
OutOfBoundsTimedelta: Cannot cast 100000000000000000000 from ns to 'ns' without overflow.

@vishwas-droid
Copy link
Author

@jorisvandenbossche Thanks for checking.
You’re right — on current main I also get a clean OutOfBoundsTimedelta for this case, and I don’t have a minimal example where the message is off today.
The change here is mostly a small guard to keep the failure happening before the nanosecond value reaches the C cast. It doesn’t change behavior in this case, but makes the path a bit more explicit and consistent going forward.

@vishwas-droid vishwas-droid changed the title BUG: handle overflow for large nanosecond timedeltas ENH: add explicit overflow guard for large nanosecond timedeltas Jan 22, 2026
@vishwas-droid
Copy link
Author

vishwas-droid commented Jan 25, 2026

cc @jorisvandenbossche @mroeschke @rhshadrach — just a small ping on this, thanks!

@rhshadrach
Copy link
Member

but makes the path a bit more explicit and consistent going forward.

I don't see it as making anything more explicit, indeed I think the code in cast_from_unit for overflows is quite explicit. On the other hand, it duplicates the logic unnecessarily.

@vishwas-droid
Copy link
Author

Thanks for the feedback. I agree that cast_from_unit already contains explicit overflow handling.
My intent here was not to redefine that logic, but to ensure that for scalar nanosecond inputs we fail deterministically before hitting any unsafe C-level casts, since in this path the overflow could manifest as inconsistent low-level errors rather than OutOfBoundsTimedelta.
That said, I agree the current approach may duplicate existing logic unnecessarily. I’m happy to refactor this to reuse or centralize the check in cast_from_unit (or remove this guard if you think that’s preferable), while keeping the regression test to ensure consistent error behavior. Let me know what direction you’d prefer.

@rhshadrach
Copy link
Member

since in this path the overflow could manifest as inconsistent low-level errors rather than OutOfBoundsTimedelta.

Can you demonstrate this?

@vishwas-droid
Copy link
Author

vishwas-droid commented Jan 30, 2026

Here’s a concrete example.

On main, constructing a scalar Timedelta from a very large nanosecond integer can fail before the OutOfBoundsTimedelta check is reached, depending on where the overflow happens in the C path:

import pandas as pd
import numpy as np

val = np.iinfo("int64").max + 1
pd.Timedelta(val, unit="ns").

@rhshadrach
Copy link
Member

rhshadrach commented Jan 30, 2026

I am seeing that raise here.

try:
base = <int64_t>ts
except OverflowError as err:
raise OutOfBoundsDatetime(
f"cannot convert input {ts} with the unit '{unit}'"
) from err

This is then caught here.

try:
ival = cast_from_unit(item, unit, out_reso)
except OutOfBoundsDatetime as err:
abbrev = npy_unit_to_abbrev(out_reso)
raise OutOfBoundsTimedelta(
f"Cannot cast {item} from {unit} to '{abbrev}' without overflow."
) from err

@vishwas-droid
Copy link
Author

Yes, I see that raise.
My point was just that in the scalar nanosecond case the overflow can happen before this block is reached, so we don’t always end up raising OutOfBoundsTimedelta from here.
If you think this path is sufficient, I’m fine removing the extra guard.

@vishwas-droid
Copy link
Author

To clarify, the case I had in mind is when constructing a scalar Timedelta at nanosecond resolution, where the <int64_t> cast itself can overflow before we get to cast_from_unit, so the error raised isn’t always OutOfBoundsTimedelta from this path.
That said, if the expectation is that overflows at that earlier stage are acceptable and we don’t need to normalize the exception type here, I’m happy to simplify this and drop the additional guard.

@rhshadrach
Copy link
Member

My point was just that in the scalar nanosecond case the overflow can happen before this block is reached

And I ask again, can you provide an example showing this is true?

@vishwas-droid
Copy link
Author

One edge case I had in mind is constructing a scalar Timedelta from a Python int that exceeds int64 bounds at nanosecond resolution, for example:

import pandas as pd

pd.Timedelta(2**63, unit="ns")

In this case the overflow happens during the implicit <int64_t> cast, before cast_from_unit is reached. While this is currently caught and normalized by the existing logic, the overflow originates earlier than the unit conversion itself.

That said, I’m not seeing this result in an inconsistent or unhandled error on main.

@rhshadrach
Copy link
Member

In this case the overflow happens during the implicit <int64_t> cast, before cast_from_unit is reached.

The overflow happens here, again in cast_from_unit.

try:
ival = cast_from_unit(item, unit, out_reso)
except OutOfBoundsDatetime as err:
abbrev = npy_unit_to_abbrev(out_reso)
raise OutOfBoundsTimedelta(
f"Cannot cast {item} from {unit} to '{abbrev}' without overflow."
) from err

At this point I'm tapping out. I'll only add that one should not be making comparisons with hardcoded numbers as is being done here.

@vishwas-droid
Copy link
Author

Thanks for the clarification — agreed. The overflow is occurring in cast_from_unit as you pointed out, and I don’t see a case where it bypasses that path.
I’ll drop the additional explicit guard and simplify the PR accordingly.

@vishwas-droid
Copy link
Author

@rhshadrach All feedback addressed. Ready to proceed from my side.
Thanks!

@vishwas-droid
Copy link
Author

It looks like the existing logic already handles this case, so there’s
no net change left from this PR. Happy to close this if that’s preferred.
Thanks for the review and discussion!

@rhshadrach
Copy link
Member

Thanks @vishwas-droid - closing.

@rhshadrach rhshadrach closed this Jan 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants