i am looking at some corner case which might also cause troubles for
other people.
consider the following:
SELECT some_timestamp::date FROM very_large_table GROUP BY
some_timestamp::date
my very_large_table is around 1billion entries.
the problem is: the planner has a problem here as it is taking the
(correct) estimates for timestamp. this avoids a HashAggregate
because the dataset seems to large for work_mem.
what the planner cannot know is that the number of days is quite
limited (in my case around 1000 different values).
i wonder how to teach the planner to take the cast into consideration.
at the moment the planner uses the per column statistics - it cannot
know that the cast might change the number of different values.
how about the following?
Command: CREATE CAST
Description: define a new cast
Syntax:
CREATE CAST (sourcetype AS targettype) [USING SELECTIVITY number | funcname(argtypes)] WITH FUNCTION funcname
(argtypes) [ AS ASSIGNMENT | AS IMPLICIT ]
if it was possible to assign a constant or some function to the cast
i think we could make the example used above work. by default no
costs are changed. if somebody is doing some fancy query it would be
possible to tweak GOUOP BY planning by assigning some cleverly
written function or a constant to the scenery.
a constant would be useful in terms of casts to boolean or so.
does anybody have an idea which could help solving this issue?
best regards,
hans