ENH: add explicit overflow guard for large nanosecond timedeltas#63808
ENH: add explicit overflow guard for large nanosecond timedeltas#63808vishwas-droid wants to merge 4 commits intopandas-dev:mainfrom
Conversation
01241e6 to
b6fa495
Compare
|
Thanks for the PR! Do you have an example where this currently does not give a nice OutOfBounds error? |
|
@jorisvandenbossche Thanks for checking. |
|
cc @jorisvandenbossche @mroeschke @rhshadrach — just a small ping on this, thanks! |
I don't see it as making anything more explicit, indeed I think the code in |
|
Thanks for the feedback. I agree that cast_from_unit already contains explicit overflow handling. |
Can you demonstrate this? |
|
Here’s a concrete example. On main, constructing a scalar Timedelta from a very large nanosecond integer can fail before the OutOfBoundsTimedelta check is reached, depending on where the overflow happens in the C path: import pandas as pd
import numpy as np
val = np.iinfo("int64").max + 1
pd.Timedelta(val, unit="ns"). |
|
I am seeing that raise here. pandas/pandas/_libs/tslibs/conversion.pyx Lines 218 to 223 in 924573a This is then caught here. pandas/pandas/_libs/tslibs/timedeltas.pyx Lines 303 to 309 in 924573a |
|
Yes, I see that raise. |
|
To clarify, the case I had in mind is when constructing a scalar Timedelta at nanosecond resolution, where the <int64_t> cast itself can overflow before we get to cast_from_unit, so the error raised isn’t always OutOfBoundsTimedelta from this path. |
And I ask again, can you provide an example showing this is true? |
|
One edge case I had in mind is constructing a scalar import pandas as pd
pd.Timedelta(2**63, unit="ns")In this case the overflow happens during the implicit That said, I’m not seeing this result in an inconsistent or unhandled error on main. |
The overflow happens here, again in pandas/pandas/_libs/tslibs/timedeltas.pyx Lines 303 to 309 in 1058c32 At this point I'm tapping out. I'll only add that one should not be making comparisons with hardcoded numbers as is being done here. |
|
Thanks for the clarification — agreed. The overflow is occurring in cast_from_unit as you pointed out, and I don’t see a case where it bypasses that path. |
|
@rhshadrach All feedback addressed. Ready to proceed from my side. |
|
It looks like the existing logic already handles this case, so there’s |
|
Thanks @vishwas-droid - closing. |
I noticed that constructing a Timedelta from very large integer values in nanoseconds could overflow at the C boundary. Depending on where the overflow occurred, this could result in inconsistent low-level errors instead of a clear OutOfBoundsTimedelta.
To fix this, I added an early overflow check when handling scalar nanosecond values, so that OutOfBoundsTimedelta is raised deterministically before any unsafe casting occurs. A regression test is included to ensure this behavior is preserved going forward.