Thread: specific query (not all) on Pg8 MUCH slower than Pg7

specific query (not all) on Pg8 MUCH slower than Pg7

From
Susan Russo
Date:
Hi,

Despite numerous efforts, we're unable to solve a severe performance limitation between Pg 7.3.2
and Pg 8.1.4.

The query and 'explain analyze' plan below, runs in
    26.20 msec on Pg 7.3.2, and
    2463.968 ms on Pg 8.1.4,

and the Pg7.3.2 is on older hardware and OS.

Multiply this time difference by >82K, and a 10 minute procedure (which includes
this query), now runs in 10 *hours*.....not good....

In general, however, we're pleased with performance of this very same Pg8.1.4 server
as compared to the Pg7.3.2 server (loading/dumping, and other queries are much faster).

QUERY:

SELECT dx.db_id, dx.accession, f.uniquename, f.name, cvt.name as ntype,
        fd.is_current
from feature f, feature_dbxref fd, dbxref dx, cvterm cvt
where fd.dbxref_id = dx.dbxref_id
        and fd.feature_id = f.feature_id
        and f.type_id = cvt.cvterm_id
        and accession like 'AY851043%'
        and cvt.name not in
('gene','protein','natural_transposable_element','chromosome_structure_variation','chromosome_arm','repeat_region')
;


explain analyze output on Pg7.3.2:

-----------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..23.45 rows=1 width=120) (actual time=25.59..25.59 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..17.49 rows=1 width=82) (actual time=25.58..25.58 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..11.93 rows=1 width=30) (actual time=25.58..25.58 rows=0 loops=1)
               ->  Index Scan using dbxref_idx2 on dbxref dx  (cost=0.00..5.83 rows=1 width=21) (actual
time=25.58..25.58rows=0 loops=1) 
                     Index Cond: ((accession >= 'AY851043'::character varying) AND (accession < 'AY851044'::character
varying))
                     Filter: (accession ~~ 'AY851043%'::text)
               ->  Index Scan using feature_dbxref_idx2 on feature_dbxref fd  (cost=0.00..6.05 rows=5 width=9) (never
executed)
                     Index Cond: (fd.dbxref_id = "outer".dbxref_id)
         ->  Index Scan using feature_pkey on feature f  (cost=0.00..5.54 rows=1 width=52) (never executed)
               Index Cond: ("outer".feature_id = f.feature_id)
   ->  Index Scan using cvterm_pkey on cvterm cvt  (cost=0.00..5.94 rows=1 width=38) (never executed)
         Index Cond: ("outer".type_id = cvt.cvterm_id)
         Filter: ((name <> 'gene'::character varying) AND (name <> 'protein'::character varying) AND (name <>
'natural_transposable_element'::charactervarying) AND (name <> 'chromosome_structure_variation'::character varying) AND
(name<> 'chromosome_arm'::character varying) AND (name <> 'repeat_region'::character varying)) 
 Total runtime: 26.20 msec
(14 rows)

========


explain analyze output on Pg8.1.4:

-----------------------------------------------------------------
 Nested Loop  (cost=0.00..47939.87 rows=1 width=108) (actual time=2463.654..2463.654 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..47933.92 rows=1 width=73) (actual time=2463.651..2463.651 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..47929.86 rows=1 width=22) (actual time=2463.649..2463.649 rows=0 loops=1)
               ->  Seq Scan on dbxref dx  (cost=0.00..47923.91 rows=1 width=21) (actual time=2463.646..2463.646 rows=0
loops=1)
                     Filter: ((accession)::text ~~ 'AY851043%'::text)
               ->  Index Scan using feature_dbxref_idx2 on feature_dbxref fd  (cost=0.00..5.90 rows=4 width=9) (never
executed)
                     Index Cond: (fd.dbxref_id = "outer".dbxref_id)
         ->  Index Scan using feature_pkey on feature f  (cost=0.00..4.05 rows=1 width=59) (never executed)
               Index Cond: ("outer".feature_id = f.feature_id)
   ->  Index Scan using cvterm_pkey on cvterm cvt  (cost=0.00..5.94 rows=1 width=43) (never executed)
         Index Cond: ("outer".type_id = cvt.cvterm_id)
         Filter: (((name)::text <> 'gene'::text) AND ((name)::text <> 'protein'::text) AND ((name)::text <>
'natural_transposable_element'::text)AND ((name)::text <> 'chromosome_structure_variation'::text) AND ((name)::text <>
'chromosome_arm'::text)AND ((name)::text <> 'repeat_region'::text)) 
 Total runtime: 2463.968 ms
(13 rows)


=======

I tried tuning configs, including shutting off enable seqscan, forcing use of index (set shared_buffers high
with random_page_cost set low).  A colleague who gets 1697ms on Pg8.1.4 with this query provided his
postgresql.conf -- didn't help....

We use standard dump/load commands between these servers:
        pg_dump -O fb_2007_01_05 | compress > fb_2007_01_05.Z
        uncompress -c fb_2007_01_05.Z  |  psql fb_2007_01_05

Hardware/OS specs:
    - Pg7.3.2:  SunFire 280R, 900mHz SPARC processor, 3gb total RAM, 10Krpm SCSI internal disks, Solaris 2.8
    - Pg8.1.4:  v240 - dual Ultra-SPARC IIIi 1500MHz SPARC processor, 8GB total RAM, Solaris 2.10
      (used both Sun-supplied postgres binaries, and compiled postgres from source)


Thanks for your help,
Susan Russo


Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Tom Lane
Date:
Susan Russo <russo@morgan.harvard.edu> writes:
> Despite numerous efforts, we're unable to solve a severe performance limitation between Pg 7.3.2
> and Pg 8.1.4.

> The query and 'explain analyze' plan below, runs in
>     26.20 msec on Pg 7.3.2, and
>     2463.968 ms on Pg 8.1.4,

You're not getting the indexscan optimization of the LIKE clause, which
is most likely due to having initdb'd the 8.1 installation in something
other than C locale.  You can either redo the initdb in C locale (which
might be a good move to fix other inconsistencies from the 7.3 behavior
you're used to) or create a varchar_pattern_ops index on the column(s)
you're using LIKE with.

            regards, tom lane

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
"Steinar H. Gunderson"
Date:
On Tue, May 08, 2007 at 10:18:34AM -0400, Susan Russo wrote:
> explain analyze output on Pg7.3.2:
>
>                ->  Index Scan using dbxref_idx2 on dbxref dx  (cost=0.00..5.83 rows=1 width=21) (actual
time=25.58..25.58rows=0 loops=1) 
>                      Index Cond: ((accession >= 'AY851043'::character varying) AND (accession < 'AY851044'::character
varying))
>                      Filter: (accession ~~ 'AY851043%'::text)
>
> explain analyze output on Pg8.1.4:
>
>                ->  Seq Scan on dbxref dx  (cost=0.00..47923.91 rows=1 width=21) (actual time=2463.646..2463.646
rows=0loops=1) 
>                      Filter: ((accession)::text ~~ 'AY851043%'::text)

This is almost all of your cost. Did you perchance initdb the 8.1.4 cluster
in a non-C locale? You could always try

  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);

which would create an index that might be more useful for your LIKE query,
even in a non-C locale.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Richard Broersma Jr
Date:
--- Susan Russo <russo@morgan.harvard.edu> wrote:
>         and accession like 'AY851043%'

I don't know if you've tried refactoring your query, but you could try:

   AND accession BETWEEN 'AY8510430' AND 'AY8510439'  -- where the last digit is
                                  ^               ^   -- lowest AND highest expected value

Regards,
Richard Broersma Jr.

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
"Alexander Staubo"
Date:
On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You're not getting the indexscan optimization of the LIKE clause, which
> is most likely due to having initdb'd the 8.1 installation in something
> other than C locale.  You can either redo the initdb in C locale (which
> might be a good move to fix other inconsistencies from the 7.3 behavior
> you're used to) or create a varchar_pattern_ops index on the column(s)
> you're using LIKE with.

Given the performance implications of setting the wrong locale, and
the high probability of accidentally doing this (I run my shells with
LANG=en_US.UTF-8, so all my databases have inherited this locale), why
is there no support for changing the database locale after the fact?

# alter database test set lc_collate = 'C';
ERROR:  parameter "lc_collate" cannot be changed

Alexander.

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Bill Moran
Date:
In response to "Alexander Staubo" <alex@purefiction.net>:

> On 5/8/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > You're not getting the indexscan optimization of the LIKE clause, which
> > is most likely due to having initdb'd the 8.1 installation in something
> > other than C locale.  You can either redo the initdb in C locale (which
> > might be a good move to fix other inconsistencies from the 7.3 behavior
> > you're used to) or create a varchar_pattern_ops index on the column(s)
> > you're using LIKE with.
>
> Given the performance implications of setting the wrong locale, and
> the high probability of accidentally doing this (I run my shells with
> LANG=en_US.UTF-8, so all my databases have inherited this locale), why
> is there no support for changing the database locale after the fact?
>
> # alter database test set lc_collate = 'C';
> ERROR:  parameter "lc_collate" cannot be changed

How would that command handle UTF data that could not be converted to C?

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Tom Lane
Date:
"Alexander Staubo" <alex@purefiction.net> writes:
> why is there no support for changing the database locale after the fact?

It'd corrupt all your indexes (or all the ones on textual columns anyway).

There are some TODO entries related to this, but don't hold your breath
waiting for a fix ...

            regards, tom lane

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Susan Russo
Date:
Hi,

>You could always try
>
>  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);

WOW!  we're now at runtime 0.367ms on Pg8

Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).

Thanks again - will report back soon.

Susan

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Alvaro Herrera
Date:
Susan Russo wrote:
> Hi,
>
> >You could always try
> >
> >  CREATE INDEX test_index ON dbxref (accession varchar_pattern_ops);
>
> WOW!  we're now at runtime 0.367ms on Pg8
>
> Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).

That's alternative to the pattern_ops index; it won't help you obtain a
plan faster than this one.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: specific query (not all) on Pg8 MUCH slower than Pg7

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Susan Russo wrote:
>> Next step is to initdb w/C Locale (tonight) (Thanks Tom et al.!).

> That's alternative to the pattern_ops index; it won't help you obtain a
> plan faster than this one.

No, but since their old DB was evidently running in C locale, this seems
like a prudent thing to do to avoid other surprising changes in behavior.

            regards, tom lane