Hi,
I found what looks like a bug in to_date() / to_timestamp(). (Saw it in master branch)
Inputs with month = 00 or day = 00 are accepted silently and normalized to
January / day 1, instead of being rejected as out of range.
Simple repro steps:
SELECT to_date('2024-00-15', 'YYYY-MM-DD');
SELECT to_date('2024-01-00', 'YYYY-MM-DD');
SELECT to_timestamp('2024-00-15', 'YYYY-MM-DD');
SELECT to_timestamp('2024-01-00', 'YYYY-MM-DD');
Observed results here:
to_date('2024-00-15', 'YYYY-MM-DD') -> 2024-01-15
to_date('2024-01-00', 'YYYY-MM-DD') -> 2024-01-01
to_timestamp('2024-00-15', 'YYYY-MM-DD') -> 2024-01-15 00:00:00+05:30
to_timestamp('2024-01-00', 'YYYY-MM-DD') -> 2024-01-01 00:00:00+05:30
I would expect all four calls to error, similar to how an invalid date literal
is rejected.
Looking at the root cause in src/backend/utils/adt/formatting.c, the issue seems
rooted in the interaction between ZERO_tm() and do_to_timestamp().
When to_date()/to_timestamp() initializes pg_tm, it uses ZERO_tm() which
defaults tm->tm_mday = 1 and tm->tm_mon = 1. The TmFromChar struct (tmfc),
which is used to collect the parsed inputs, stores these fields as plain
integers without independent presence flags (has_mm or has_dd have booleans).
In do_to_timestamp(), the parsed values are applied back into the pg_tm struct
using simple truthiness checks:
if (
tmfc.mm) tm->tm_mon =
tmfc.mm;
if (tmfc.dd) tm->tm_mday = tmfc.dd;
Because an explicitly parsed 00 evaluates to false, the zeros are never copied
into pg_tm. Thus, the 1 defaults from ZERO_tm() remain untouched, and the
subsequent validation function ValidateDate() never actually sees the 0 to throw
an out-of-bounds error.
Fixing this likely requires adding boolean flags to TmFromChar to distinguish
between an omitted field and an explicitly parsed 0?
Regards,
Ayush