Re: Performance woes relating to DISTINCT (I think) - Mailing list pgsql-general
From | boinger |
---|---|
Subject | Re: Performance woes relating to DISTINCT (I think) |
Date | |
Msg-id | 9e6d8b53050927070755640862@mail.gmail.com Whole thread Raw |
In response to | Re: Performance woes relating to DISTINCT (I think) (Dawid Kuroczko <qnex42@gmail.com>) |
Responses |
Re: Performance woes relating to DISTINCT (I think)
|
List | pgsql-general |
On 9/27/05, Dawid Kuroczko <qnex42@gmail.com> wrote: > > QUERY PLAN > > -> GroupAggregate (cost=0.00..85168.65 rows=11 > width=22) > > (actual time=3149.916..45578.292 rows=515 loops=1) > > Hmm, planner expected 11 rows, got 515 > > > > (cost=0.00..85167.23 rows=107 width=22) (actual > > time=3144.908..45366.147 rows=29893 loops=1) > > > planner expected 107 rows, got 29893... > I guess the problem here is that planner has wrong idea how your > data looks. Try doing two things: > > VACUUM ANALYZE; > (of tables in question or whole database) > > If that doesn't help, do increase the statistics target. By default > PostgreSQL > keeps 10 samples, but you might want to increase it to 50 or even 100. > And then rerun VACUUM ANALYZE. > > If it doesn't help -- please repost the new query plan once again. I actually kind of inadvertently "fixed" it. I threw my hands up and thought to myself "FINE! If it's going to take that long, at least it can do all the joins and whatnot instead of having to loop back and do separate queries" So, I piled in everything I needed it to do, and now it's inexplicably (to me) fast (!?). I'm still running a full VACUUM ANALYZE on your recommendation...maybe shave a few more ms off. Here's what I have, now (pre-vacuum): SQL: SELECT tasks_applied.modcode AS modcode, tasks_applied.seid AS seid, tasks_applied.yearcode AS yearcode, vin_years.year AS year, COUNT(DISTINCT(tid)) AS task_count FROM "SS_valid_modelyears", tasks_applied, vin_years WHERE cid=0 AND tasks_applied.seid='500001' AND "SS_valid_modelyears".modcode=tasks_applied.modcode AND "SS_valid_modelyears".year=vin_years.year AND tasks_applied.yearcode=vin_years.yearcode AND "SS_valid_modelyears".valid=1 GROUP BY tasks_applied.seid, vin_years.year, tasks_applied.modcode, "SS_valid_modelyears".shortname, tasks_applied.yearcode ORDER BY tasks_applied.seid ASC, vin_years.year ASC QUERY PLAN: GroupAggregate (cost=201.39..201.42 rows=1 width=69) (actual time=80.383..80.386 rows=1 loops=1) -> Sort (cost=201.39..201.40 rows=1 width=69) (actual time=79.737..79.898 rows=59 loops=1) Sort Key: tasks_applied.seid, vin_years."year", tasks_applied.modcode, "SS_valid_modelyears".shortname, tasks_applied.yearcode -> Nested Loop (cost=1.38..201.38 rows=1 width=69) (actual time=72.599..78.765 rows=59 loops=1) -> Hash Join (cost=1.38..165.15 rows=6 width=61) (actual time=0.530..18.881 rows=1188 loops=1) Hash Cond: ("outer"."year" = "inner"."year") -> Seq Scan on "SS_valid_modelyears" (cost=0.00..163.54 rows=36 width=56) (actual time=0.183..9.202 rows=1188 loops=1) Filter: ("valid" = 1) -> Hash (cost=1.30..1.30 rows=30 width=9) (actual time=0.230..0.230 rows=0 loops=1) -> Seq Scan on vin_years (cost=0.00..1.30 rows=30 width=9) (actual time=0.019..0.116 rows=30 loops=1) -> Index Scan using strafe_group on tasks_applied (cost=0.00..6.02 rows=1 width=22) (actual time=0.042..0.043 rows=0 loops=1188) Index Cond: ((("outer".modcode)::text = (tasks_applied.modcode)::text) AND (tasks_applied.yearcode = "outer".yearcode) AND (tasks_applied.seid = 500001)) Filter: (cid = 0) Total runtime: 80.764 ms
pgsql-general by date: