Re: same plan, add 1 condition, 1900x slower - Mailing list pgsql-performance
From | Mitch Skinner |
---|---|
Subject | Re: same plan, add 1 condition, 1900x slower |
Date | |
Msg-id | 1131700635.29496.136.camel@enzian Whole thread Raw |
In response to | Re: same plan, add 1 condition, 1900x slower (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: same plan, add 1 condition, 1900x slower
Re: same plan, add 1 condition, 1900x slower |
List | pgsql-performance |
On Thu, 2005-11-10 at 12:23 -0500, Tom Lane wrote: > Apparently, you are using a platform and/or locale in which strcoll() is > spectacularly, god-awfully slow --- on the order of 10 msec per comparison. The version with the condition is definitely doing more I/O. The version without the condition doesn't read at all. I strace'd an explain analyze for each separately, and this is what I ended up with (the first is with the condition, the second is without): bash-2.05b$ cut '-d(' -f1 subsourcestrace | sort | uniq -c 7127 gettimeofday 75213 _llseek 1 Process 30227 attached - interrupt to quit 1 Process 30227 detached 148671 read 2 recv 4 semop 4 send bash-2.05b$ cut '-d(' -f1 subsourcestrace-nocond | sort | uniq -c 9103 gettimeofday 7 _llseek 1 Process 30227 attached - interrupt to quit 1 Process 30227 detached 2 recv 4 send For the moment, all of the rows in the view I'm selecting from satisfy the condition, so the output of both queries is the same. The relevant rows of the underlying tables are probably pretty contiguous (all of the rows satisfying the condition and the join were inserted at the same time). Could it just be the result of a weird physical distribution of data in the table/index files? For the fast query, the actual number of rows is a lot less than the planner expects. > This is a bit hard to believe but I can't make sense of those numbers > any other way. What is the platform exactly, and what database locale > and encoding are you using? It's RHEL 3 on x86: [root@rehoboam root]# uname -a Linux rehoboam 2.4.21-32.0.1.ELsmp #1 SMP Tue May 17 17:52:23 EDT 2005 i686 i686 i386 GNU/Linux The glibc version is 2.3.2. statgen=# select current_setting('lc_collate'); current_setting ----------------- en_US.UTF-8 Not sure what's relevant, but here's some more info: The machine has 4.5GiB of RAM and a 5-disk Raid 5. It's a dual xeon 3.2ghz. relname | relpages | reltuples -----------------------------+----------+------------- external_id_map | 126883 | 1.55625e+07 external_id_map_primary_key | 64607 | 1.55625e+07 subject | 31 | 1186 subject_pkey | 19 | 1186 I've attached the output of "select name, setting from pg_settings". And, in case my original message isn't handy, the explain analyze output and table/view info is below. Thanks for taking a look, Mitch statgen=> explain analyze select * from subject_source; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=0.00..330.72 rows=1186 width=46) (actual time=0.051..8.890 rows=1186 loops=1) Merge Cond: ("outer".id = "inner".target_id) -> Index Scan using subject_pkey on subject norm (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.441 rows=1186 loops=1) -> Index Scan using external_id_map_primary_key on external_id_map eim (cost=0.00..2485226.70 rows=15562513 width=26) (actual time=0.016..2.532 rows=2175 loops=1) Total runtime: 9.592 ms (5 rows) statgen=> explain analyze select * from subject_source where source='SCH'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Merge Join (cost=0.00..1147.33 rows=1 width=46) (actual time=0.054..20258.161 rows=1186 loops=1) Merge Cond: ("outer".id = "inner".target_id) -> Index Scan using subject_pkey on subject norm (cost=0.00..63.36 rows=1186 width=28) (actual time=0.022..1.478 rows=1186 loops=1) -> Index Scan using external_id_map_primary_key on external_id_map eim (cost=0.00..2524132.99 rows=2335 width=26) (actual time=0.022..20252.326 rows=1186 loops=1) Filter: (source = 'SCH'::bpchar) Total runtime: 20258.922 ms (6 rows) statgen=> \d subject_source View "public.subject_source" Column | Type | Modifiers -----------+-----------------------+----------- id | bigint | sex | integer | parent1 | bigint | parent2 | bigint | source | character(3) | source_id | character varying(32) | View definition: SELECT norm.id, norm.sex, norm.parent1, norm.parent2, eim.source, eim.source_id FROM subject norm JOIN util.external_id_map eim ON norm.id = eim.target_id; statgen=> \d subject Table "public.subject" Column | Type | Modifiers ---------+---------+----------- id | bigint | not null sex | integer | parent1 | bigint | parent2 | bigint | Indexes: "subject_pkey" PRIMARY KEY, btree (id) Foreign-key constraints: "subject_parent1" FOREIGN KEY (parent1) REFERENCES subject(id) DEFERRABLE INITIALLY DEFERRED "subject_parent2" FOREIGN KEY (parent2) REFERENCES subject(id) DEFERRABLE INITIALLY DEFERRED "subject_id_map" FOREIGN KEY (id) REFERENCES util.external_id_map(target_id) DEFERRABLE INITIALLY DEFERRED statgen=> \d util.external_id_map Table "util.external_id_map" Column | Type | Modifiers -----------+-----------------------+----------- source_id | character varying(32) | not null source | character(3) | not null target_id | bigint | not null Indexes: "external_id_map_primary_key" PRIMARY KEY, btree (target_id) "external_id_map_source_source_id_unique" UNIQUE, btree (source, source_id) "external_id_map_source" btree (source) "external_id_map_source_target_id" btree (source, target_id) Foreign-key constraints: "external_id_map_source" FOREIGN KEY (source) REFERENCES util.source(id)
Attachment
pgsql-performance by date: