Automatic analyze on select into - Mailing list pgsql-performance
From | Jim C. Nasby |
---|---|
Subject | Automatic analyze on select into |
Date | |
Msg-id | 20030426101537.F66185@flake.decibel.org Whole thread Raw |
Responses |
Re: Automatic analyze on select into
|
List | pgsql-performance |
BEGIN; SET LOCAL enable_seqscan = off; SELECT id, team_id, sum(work_units) AS work_units INTO TEMP email_contrib_summary FROM email_contrib WHERE project_id = :ProjectID GROUP by id, team_id ; COMMIT; inserts 29000 rows... UPDATE email_contrib_summary SET id = sp.retire_to FROM stats_participant sp WHERE sp.id = email_contrib_summary.id AND sp.retire_to >= 0 AND (sp.retire_date >= (SELECT ps.last_date FROM project_statsrun ps WHERE ps.project_id = :ProjectID) OR sp.retire_date IS NULL) ; Nested Loop (cost=0.00..5475.20 rows=982 width=54) (actual time=25.54..2173363.11 rows=29181 loops=1) InitPlan -> Seq Scan on project_statsrun ps (cost=0.00..1.06 rows=1 width=4) (actual time=0.06..0.07 rows=1 loops=1) Filter: (project_id = 8) -> Seq Scan on email_contrib_summary (cost=0.00..20.00 rows=1000 width=46) (actual time=25.11..1263.26 rows=29753 loops=1) -> Index Scan using stats_participant__participantretire_id on stats_participant sp (cost=0.00..5.44 rows=1 width=8)(actual time=2.16..72.93 rows=1 loops=29753) Index Cond: ((sp.retire_to >= 0) AND (sp.id = "outer".id)) Filter: ((retire_date >= $0) OR (retire_date IS NULL)) Total runtime: 2174315.61 msec GAH! 45 minutes to update 29k rows! BUT, if I do Hash Join (cost=41104.03..42410.14 rows=29166 width=38) (actual time=8391.81..10925.07 rows=29181 loops=1) Hash Cond: ("outer".id = "inner".id) InitPlan -> Seq Scan on project_statsrun ps (cost=0.00..1.06 rows=1 width=4) (actual time=0.05..0.06 rows=1 loops=1) Filter: (project_id = 8) -> Seq Scan on email_contrib_summary (cost=0.00..496.01 rows=29701 width=30) (actual time=0.20..387.95 rows=29753 loops=1) -> Hash (cost=13939.69..13939.69 rows=394217 width=8) (actual time=8390.72..8390.72 rows=0 loops=1) -> Seq Scan on stats_participant sp (cost=0.00..13939.69 rows=394217 width=8) (actual time=0.22..5325.38 rows=389115loops=1) Filter: ((retire_to >= 0) AND ((retire_date >= $0) OR (retire_date IS NULL))) Total runtime: 11584.09 msec Ahhh... soothing relief... So, question is, would it make sense to automatically do an analyze after/during a SELECT INTO? Would it be very expensive to analyze the data as it's being inserted? I think it's pretty well understood that you want to vacuum/vacuum analyze the entire database regularly, but that obviously wouldn't help temporary tables... maybe it makes the most sense to automatically analyze temporary tables only. For that matter, since temp tables only have one operation performed on them at a time, maybe it makes sense to keep stats for them up-to-date as part of every operation? -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date: