Re: Planner estimates and cast operations ,... - Mailing list pgsql-hackers
From | Hans-Juergen Schoenig |
---|---|
Subject | Re: Planner estimates and cast operations ,... |
Date | |
Msg-id | 14AF07E7-CEFC-4BC3-96C5-CFC394D7DE87@cybertec.at Whole thread Raw |
In response to | Re: Planner estimates and cast operations ,... (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Planner estimates and cast operations ,...
|
List | pgsql-hackers |
hi tom ... i thought about creating an index on the expression but the problem is that this is hardly feasable. in 8.0 (what i have here) this would block the table and i would run out of disk space as well. this is a 600 gb biest :( what about the planner approach? this would solve the problem for some other issues as well. an index might not be flexible enough :(. many thanks, hans On Sep 4, 2006, at 4:57 PM, Tom Lane wrote: > Hans-Juergen Schoenig <postgres@cybertec.at> writes: >> consider the following: > >> SELECT some_timestamp::date FROM very_large_table GROUP BY >> some_timestamp::date > >> my very_large_table is around 1billion entries. >> the problem is: the planner has a problem here as it is taking the >> (correct) estimates for timestamp. this avoids a HashAggregate >> because the dataset seems to large for work_mem. >> what the planner cannot know is that the number of days is quite >> limited (in my case around 1000 different values). >> i wonder how to teach the planner to take the cast into >> consideration. > > Create an index on that expression. > > regression=# create table foo(x) as select x * '864 sec'::interval > + now()::timestamp from generate_series(1,10000) x; > SELECT > regression=# analyze foo; > ANALYZE > regression=# explain select x::date from foo group by x::date; > QUERY PLAN > --------------------------------------------------------------- > HashAggregate (cost=205.00..330.00 rows=10000 width=8) > -> Seq Scan on foo (cost=0.00..180.00 rows=10000 width=8) > (2 rows) > > regression=# create index fooi on foo((x::date)); > CREATE INDEX > regression=# analyze foo; > ANALYZE > regression=# explain select x::date from foo group by x::date; > QUERY PLAN > --------------------------------------------------------------- > HashAggregate (cost=205.00..206.26 rows=101 width=8) > -> Seq Scan on foo (cost=0.00..180.00 rows=10000 width=8) > (2 rows) > > regression=# > > I had to cheat a little bit here: I tried to do this example with a > timestamptz column, and the index creation failed because > timestamptz to > date isn't immutable (it depends on TimeZone). If yours is too, you > could perhaps do something involving AT TIME ZONE to generate an > immutable conversion to date. > > It would perhaps make sense to provide a way to cue ANALYZE to compute > stats on expressions that aren't actually being indexed, but I see no > good reason to limit our attention to cast expressions. > > regards, tom lane
pgsql-hackers by date: