Thread: Full table scan: 300 million rows
Hi,
I have the following query:
I posted the explain plan as an image (can't figure out how to export it from the pgadmin3):
http://i.imgur.com/m6YIV.png
I have yet to let this query finish.
Any ideas how I can speed it up?
Thank you!
I have the following query:
Select avg(d.amount) AS amount, y.year
From year_ref y
Join month_ref m
On m.year_ref_id = y.id
Join daily d
On d.month_ref_id = m.id
Where y.year Between 1980 And 2000
And m.month = 12
And m.category_id = '001'
And d.daily_flag_id <> 'M'
And exists (
Select 1
From station s1
inner Join city c1
On c1.id = 10663
Where 6371.009
* SQRT(
POW(RADIANS(c1.latitude_decimal - s1.latitude_decimal), 2)
+ (
COS(RADIANS(c1.latitude_decimal + s1.latitude_decimal) / 2)
* POW(RADIANS(c1.longitude_decimal - s1.longitude_decimal), 2)
)
) <= 50
And S1.station_district_id = y.station_district_id
)
Group By y.year
From year_ref y
Join month_ref m
On m.year_ref_id = y.id
Join daily d
On d.month_ref_id = m.id
Where y.year Between 1980 And 2000
And m.month = 12
And m.category_id = '001'
And d.daily_flag_id <> 'M'
And exists (
Select 1
From station s1
inner Join city c1
On c1.id = 10663
Where 6371.009
* SQRT(
POW(RADIANS(c1.latitude_decimal - s1.latitude_decimal), 2)
+ (
COS(RADIANS(c1.latitude_decimal + s1.latitude_decimal) / 2)
* POW(RADIANS(c1.longitude_decimal - s1.longitude_decimal), 2)
)
) <= 50
And S1.station_district_id = y.station_district_id
)
Group By y.year
I posted the explain plan as an image (can't figure out how to export it from the pgadmin3):
http://i.imgur.com/m6YIV.png
I have yet to let this query finish.
Any ideas how I can speed it up?
Thank you!
David Jarvis <thangalin@gmail.com> wrote: > Hi, > > I have the following query: > > Select avg(d.amount) AS amount, y.year > From year_ref y > Join month_ref m > On m.year_ref_id = y.id > Join daily d > On d.month_ref_id = m.id > Where y.year Between 1980 And 2000 > And m.month = 12 > And m.category_id = '001' > And d.daily_flag_id <> 'M' > And exists ( I think, you have a bad table design: you have splitted a date into year, month and day, and stored that in different tables. If i were you, i would use a regular date-field and indexes like: test=# create table d (d date); CREATE TABLE test=*# create index idx_d_year on d(extract (year from d)); CREATE INDEX test=*# create index idx_d_month on d(extract (month from d)); CREATE INDEX Your query with this structure: select ... from table where extract(year from d) between 1980 And 2000 and extract(month from d) = 12 and daily_flag_id ... This can use the indexes and avoid the seq-scan. You can also use table-partitioning and constraint exclusion (for instance, one table per month) > http://i.imgur.com/m6YIV.png > > I have yet to let this query finish. > > Any ideas how I can speed it up? Do you have an index on daily.daily_flag_id ? Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°