Thread: specific query (not all) on Pg8 MUCH slower than Pg7
Hi,

Despite numerous efforts, we're unable to solve a severe performance limitation between Pg 7.3.2 and Pg 8.1.4. The query and 'explain analyze' plans below run in 26.20 msec on Pg 7.3.2 and 2463.968 ms on Pg 8.1.4, and the Pg 7.3.2 box is on older hardware and OS. Multiply this time difference by >82K, and a 10 minute procedure (which includes this query) now runs in 10 *hours*... not good...

In general, however, we're pleased with the performance of this very same Pg 8.1.4 server as compared to the Pg 7.3.2 server (loading/dumping and other queries are much faster).

QUERY:

SELECT dx.db_id, dx.accession, f.uniquename, f.name,
       cvt.name AS ntype, fd.is_current
FROM feature f, feature_dbxref fd, dbxref dx, cvterm cvt
WHERE fd.dbxref_id = dx.dbxref_id
  AND fd.feature_id = f.feature_id
  AND f.type_id = cvt.cvterm_id
  AND accession LIKE 'AY851043%'
  AND cvt.name NOT IN ('gene', 'protein', 'natural_transposable_element',
      'chromosome_structure_variation', 'chromosome_arm', 'repeat_region');

explain analyze output on Pg7.3.2:
-----------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..23.45 rows=1 width=120) (actual time=25.59..25.59 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..17.49 rows=1 width=82) (actual time=25.58..25.58 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..11.93 rows=1 width=30) (actual time=25.58..25.58 rows=0 loops=1)
               ->  Index Scan using dbxref_idx2 on dbxref dx  (cost=0.00..5.83 rows=1 width=21) (actual time=25.58..25.58 rows=0 loops=1)
                     Index Cond: ((accession >= 'AY851043'::character varying) AND (accession < 'AY851044'::character varying))
                     Filter: (accession ~~ 'AY851043%'::text)
               ->  Index Scan using feature_dbxref_idx2 on feature_dbxref fd  (cost=0.00..6.05 rows=5 width=9) (never executed)
                     Index Cond: (fd.dbxref_id = "outer".dbxref_id)
         ->  Index Scan using feature_pkey on feature f  (cost=0.00..5.54 rows=1 width=52) (never executed)
               Index Cond: ("outer".feature_id = f.feature_id)
   ->  Index Scan using cvterm_pkey on cvterm cvt  (cost=0.00..5.94 rows=1 width=38) (never executed)
         Index Cond: ("outer".type_id = cvt.cvterm_id)
         Filter: ((name <> 'gene'::character varying) AND (name <> 'protein'::character varying) AND (name <> 'natural_transposable_element'::character varying) AND (name <> 'chromosome_structure_variation'::character varying) AND (name <> 'chromosome_arm'::character varying) AND (name <> 'repeat_region'::character varying))
 Total runtime: 26.20 msec
(14 rows)
========
explain analyze output on Pg8.1.4:
-----------------------------------------------------------------
 Nested Loop  (cost=0.00..47939.87 rows=1 width=108) (actual time=2463.654..2463.654 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..47933.92 rows=1 width=73) (actual time=2463.651..2463.651 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..47929.86 rows=1 width=22) (actual time=2463.649..2463.649 rows=0 loops=1)
               ->  Seq Scan on dbxref dx  (cost=0.00..47923.91 rows=1 width=21) (actual time=2463.646..2463.646 rows=0 loops=1)
                     Filter: ((accession)::text ~~ 'AY851043%'::text)
               ->  Index Scan using feature_dbxref_idx2 on feature_dbxref fd  (cost=0.00..5.90 rows=4 width=9) (never executed)
                     Index Cond: (fd.dbxref_id = "outer".dbxref_id)
         ->  Index Scan using feature_pkey on feature f  (cost=0.00..4.05 rows=1 width=59) (never executed)
               Index Cond: ("outer".feature_id = f.feature_id)
   ->  Index Scan using cvterm_pkey on cvterm cvt  (cost=0.00..5.94 rows=1 width=43) (never executed)
         Index Cond: ("outer".type_id = cvt.cvterm_id)
         Filter: (((name)::text <> 'gene'::text) AND ((name)::text <> 'protein'::text) AND ((name)::text <> 'natural_transposable_element'::text) AND ((name)::text <> 'chromosome_structure_variation'::text) AND ((name)::text <> 'chromosome_arm'::text) AND ((name)::text <> 'repeat_region'::text))
 Total runtime: 2463.968 ms
(13 rows)
=======
I tried tuning configs, including shutting off enable_seqscan and forcing use of the index (set shared_buffers high with random_page_cost set low).
A colleague who gets 1697 ms on Pg 8.1.4 with this query provided his postgresql.conf -- it didn't help...

We use standard dump/load commands between these servers:

  pg_dump -O fb_2007_01_05 | compress > fb_2007_01_05.Z
  uncompress -c fb_2007_01_05.Z | psql fb_2007_01_05

Hardware/OS specs:
- Pg7.3.2: SunFire 280R, 900 MHz SPARC processor, 3 GB total RAM, 10Krpm SCSI internal disks, Solaris 2.8
- Pg8.1.4: v240, dual UltraSPARC IIIi 1500 MHz processors, 8 GB total RAM, Solaris 2.10 (tried both Sun-supplied postgres binaries and postgres compiled from source)

Thanks for your help,
Susan Russo
Susan Russo <russo@morgan.harvard.edu> writes:
> Despite numerous efforts, we're unable to solve a severe performance
> limitation between Pg 7.3.2 and Pg 8.1.4. The query and 'explain analyze'
> plan below runs in 26.20 msec on Pg 7.3.2, and 2463.968 ms on Pg 8.1.4.

You're not getting the indexscan optimization of the LIKE clause, which is most likely due to having initdb'd the 8.1 installation in something other than C locale. You can either redo the initdb in C locale (which might be a good move to fix other inconsistencies from the 7.3 behavior you're used to) or create a varchar_pattern_ops index on the column(s) you're using LIKE with.

			regards, tom lane
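[Editorial sketch] Tom's point can be illustrated outside the database: the planner rewrites a prefix LIKE into the index-friendly range the 7.3 plan shows (accession >= 'AY851043' AND accession < 'AY851044') only when the index sorts strings bytewise, which C locale or a pattern_ops opclass guarantees. A minimal sketch, using Python's code-point-wise string comparison as a stand-in for C-locale ordering (the sample accessions are invented):

```python
# Under bytewise ordering, every string with prefix p sorts in the
# half-open range [p, successor), where successor is p with its last
# byte incremented -- exactly the range the 7.3 planner derived.
prefix = "AY851043"
successor = prefix[:-1] + chr(ord(prefix[-1]) + 1)  # "AY851044"

samples = ["AY851043", "AY851043.1", "AY8510439Z", "AY851044", "AX000001"]
matches = [s for s in samples if s.startswith(prefix)]          # LIKE 'AY851043%'
in_range = [s for s in samples if prefix <= s < successor]      # derived range scan

assert matches == in_range  # holds for bytewise (C-locale) comparison
print(in_range)  # → ['AY851043', 'AY851043.1', 'AY8510439Z']
```

Under a language-aware collation this equivalence can fail, which is why the 8.1 planner refuses the rewrite and falls back to the seq scan.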
On Tue, May 08, 2007 at 10:18:34AM -0400, Susan Russo wrote:
> explain analyze output on Pg7.3.2:
>
> -> Index Scan using dbxref_idx2 on dbxref dx (cost=0.00..5.83 rows=1 width=21) (actual time=25.58..25.58 rows=0 loops=1)
>      Index Cond: ((accession >= 'AY851043'::character varying) AND (accession < 'AY851044'::character varying))
>      Filter: (accession ~~ 'AY851043%'::text)
>
> explain analyze output on Pg8.1.4:
>
> -> Seq Scan on dbxref dx (cost=0.00..47923.91 rows=1 width=21) (actual time=2463.646..2463.646 rows=0 loops=1)
>      Filter: ((accession)::text ~~ 'AY851043%'::text)

This is almost all of your cost. Did you perchance initdb the 8.1.4 cluster in a non-C locale? You could always try

  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);

which would create an index that might be more useful for your LIKE query, even in a non-C locale.

/* Steinar */
-- 
Homepage: http://www.sesse.net/
--- Susan Russo <russo@morgan.harvard.edu> wrote:
> and accession like 'AY851043%'

I don't know if you've tried refactoring your query, but you could try:

  AND accession BETWEEN 'AY8510430' AND 'AY8510439'

where the two endpoints are the lowest and highest expected values.

Regards,
Richard Broersma Jr.
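[Editorial sketch] Richard's BETWEEN rewrite hands the planner a plain range predicate of the same shape the 7.3 planner derived from LIKE, so an ordinary btree index on accession can serve it even in a non-C locale. A quick illustration of the range check itself (sample accessions invented; the endpoints must bracket every suffix you expect after the prefix):

```python
# The BETWEEN rewrite as a plain inclusive range check over sample values.
low, high = "AY8510430", "AY8510439"  # endpoints from Richard's suggestion
samples = ["AY8510430", "AY8510435", "AY8510439", "AY851042X", "AY851044"]
print([s for s in samples if low <= s <= high])
# → ['AY8510430', 'AY8510435', 'AY8510439']
```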
On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You're not getting the indexscan optimization of the LIKE clause, which
> is most likely due to having initdb'd the 8.1 installation in something
> other than C locale. You can either redo the initdb in C locale (which
> might be a good move to fix other inconsistencies from the 7.3 behavior
> you're used to) or create a varchar_pattern_ops index on the column(s)
> you're using LIKE with.

Given the performance implications of setting the wrong locale, and the high probability of doing so accidentally (I run my shells with LANG=en_US.UTF-8, so all my databases have inherited this locale), why is there no support for changing the database locale after the fact?

  # alter database test set lc_collate = 'C';
  ERROR:  parameter "lc_collate" cannot be changed

Alexander.
In response to "Alexander Staubo" <alex@purefiction.net>:
> On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > You're not getting the indexscan optimization of the LIKE clause, which
> > is most likely due to having initdb'd the 8.1 installation in something
> > other than C locale. You can either redo the initdb in C locale (which
> > might be a good move to fix other inconsistencies from the 7.3 behavior
> > you're used to) or create a varchar_pattern_ops index on the column(s)
> > you're using LIKE with.
>
> Given the performance implications of setting the wrong locale, and
> the high probability of doing so accidentally (I run my shells with
> LANG=en_US.UTF-8, so all my databases have inherited this locale), why
> is there no support for changing the database locale after the fact?
>
> # alter database test set lc_collate = 'C';
> ERROR:  parameter "lc_collate" cannot be changed

How would that command handle UTF data that could not be converted to C?

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/
wmoran@collaborativefusion.com
Phone: 412-422-3463x4023
"Alexander Staubo" <alex@purefiction.net> writes:
> why is there no support for changing the database locale after the fact?

It'd corrupt all your indexes (or all the ones on textual columns, anyway). There are some TODO entries related to this, but don't hold your breath waiting for a fix ...

			regards, tom lane
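[Editorial sketch] The corruption Tom describes follows from btree order being baked in at build time under one collation: the same strings sort differently under different rules, so an index built under one ordering is simply mis-sorted under another. A small demonstration (the en_US.UTF-8 locale name is an assumption and may not be installed everywhere, hence the fallback):

```python
import locale

words = ["cote", "côte", "coter", "côté"]

# C-locale / bytewise order compares raw code points, so accented strings
# sort after all unaccented ones at the first differing position.
print(sorted(words))  # → ['cote', 'coter', 'côte', 'côté']

# A language-aware collation interleaves accented and unaccented forms,
# producing a different order from the bytewise one above.
try:
    locale.setlocale(locale.LC_COLLATE, "en_US.UTF-8")
    print(sorted(words, key=locale.strxfrm))
except locale.Error:
    pass  # locale not installed; the point is only that the orders differ
```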
Hi,

> You could always try
>
>   CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);

WOW! we're now at a runtime of 0.367 ms on Pg8.

Next step is to initdb w/C locale (tonight). (Thanks Tom et al.!)

Thanks again - will report back soon.
Susan
Susan Russo wrote:
> Hi,
>
> > You could always try
> >
> >   CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);
>
> WOW! we're now at a runtime of 0.367 ms on Pg8.
>
> Next step is to initdb w/C locale (tonight). (Thanks Tom et al.!)

That's an alternative to the pattern_ops index; it won't help you obtain a plan faster than this one.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Susan Russo wrote:
>> Next step is to initdb w/C locale (tonight). (Thanks Tom et al.!)
> That's an alternative to the pattern_ops index; it won't help you obtain
> a plan faster than this one.

No, but since their old DB was evidently running in C locale, this seems like a prudent thing to do to avoid other surprising changes in behavior.

			regards, tom lane