Re: Increasing pattern index query speed - Mailing list pgsql-performance

From Andrus
Subject Re: Increasing pattern index query speed
Date
Msg-id CF57B986CDE345A9990BD921B56990CF@andrusnotebook
Whole thread Raw
In response to Re: Increasing pattern index query speed  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
Scott,

>You may also end up setting higher statistics targets on some other columns
to fix other issues.  You may want to set the value in the configuration
file higher than the default 10 -- I'd recommend starting with 40 and
re-analyzing the tables.

Thank you.

I set rid.toode statistics target to 1000. analyze rid now takes 40 seconds.
More queries run less than 10 seconds after this change.

I set default_statistic target to 40 and ran ANALYZE.

However there are still queries which take more than 10 seconds.

set search_path to firma2,public;
ALTER TABLE dok ALTER COLUMN kuupaev SET STATISTICS 1000;
analyze dok; -- 86 seconds

explain analyze select sum(1)
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   WHERE dok.kuupaev>='2008-10-01'
 AND
 ( (
      dok.doktyyp IN
('V','G','Y','K','I','T','D','N','H','M','E','B','A','R','C','F','J','Q')
          AND CASE WHEN NOT dok.objrealt OR dok.doktyyp='I' THEN dok.yksus
ELSE rid.kuluobjekt END LIKE 'RIISIPERE%'
      )
     OR
     ( dok.doktyyp IN ('O','S','I','U','D','P')
          AND CASE WHEN dok.objrealt THEN rid.kuluobjekt ELSE dok.sihtyksus
END LIKE 'RIISIPERE%'
      )
   )
ALTER TABLE dok ALTER COLUMN kuupaev SET STATISTICS -1;
analyze dok; -- 3 seconds

"Aggregate  (cost=302381.83..302381.84 rows=1 width=0) (actual
time=32795.966..32795.970 rows=1 loops=1)"
"  ->  Merge Join  (cost=302298.07..302379.45 rows=951 width=0) (actual
time=31478.319..32614.691 rows=47646 loops=1)"
"        Merge Cond: ("outer".toode = "inner".toode)"
"        ->  Sort  (cost=300522.54..300524.92 rows=954 width=24) (actual
time=31254.424..31429.436 rows=47701 loops=1)"
"              Sort Key: rid.toode"
"              ->  Hash Join  (cost=73766.03..300475.32 rows=954 width=24)
(actual time=920.122..30418.627 rows=47701 loops=1)"
"                    Hash Cond: ("outer".dokumnr = "inner".dokumnr)"
"                    Join Filter: (((("inner".doktyyp = 'V'::bpchar) OR
("inner".doktyyp = 'G'::bpchar) OR ("inner".doktyyp = 'Y'::bpchar) OR
("inner".doktyyp = 'K'::bpchar) OR ("inner".doktyyp = 'I'::bpchar) OR
("inner".doktyyp = 'T'::bpchar) OR ("inner".doktyyp = 'D'::bpchar) OR
("inner".doktyyp = 'N'::bpchar) OR ("inner".doktyyp = 'H'::bpchar) OR
("inner".doktyyp = 'M'::bpchar) OR ("inner".doktyyp = 'E'::bpchar) OR
("inner".doktyyp = 'B'::bpchar) OR ("inner".doktyyp = 'A'::bpchar) OR
("inner".doktyyp = 'R'::bpchar) OR ("inner".doktyyp = 'C'::bpchar) OR
("inner".doktyyp = 'F'::bpchar) OR ("inner".doktyyp = 'J'::bpchar) OR
("inner".doktyyp = 'Q'::bpchar)) AND (CASE WHEN ((NOT
("inner".objrealt)::boolean) OR ("inner".doktyyp = 'I'::bpchar)) THEN
"inner".yksus ELSE "outer".kuluobjekt END ~~ 'RIISIPERE%'::text)) OR
((("inner".doktyyp = 'O'::bpchar) OR ("inner".doktyyp = 'S'::bpchar) OR
("inner".doktyyp = 'I'::bpchar) OR ("inner".doktyyp = 'U'::bpchar) OR
("inner".doktyyp = 'D'::bpchar) OR ("inner".doktyyp = 'P'::bpchar)) AND
(CASE WHEN ("inner".objrealt)::boolean THEN "outer".kuluobjekt ELSE
"inner".sihtyksus END ~~ 'RIISIPERE%'::text)))"
"                    ->  Seq Scan on rid  (cost=0.00..129635.37 rows=3305337
width=42) (actual time=0.040..14458.666 rows=3293574 loops=1)"
"                    ->  Hash  (cost=73590.93..73590.93 rows=70042 width=38)
(actual time=916.812..916.812 rows=72439 loops=1)"
"                          ->  Bitmap Heap Scan on dok
(cost=414.75..73590.93 rows=70042 width=38) (actual time=28.704..589.116
rows=72439 loops=1)"
"                                Recheck Cond: (kuupaev >=
'2008-10-01'::date)"
"                                Filter: ((doktyyp = 'V'::bpchar) OR
(doktyyp = 'G'::bpchar) OR (doktyyp = 'Y'::bpchar) OR (doktyyp =
'K'::bpchar) OR (doktyyp = 'I'::bpchar) OR (doktyyp = 'T'::bpchar) OR
(doktyyp = 'D'::bpchar) OR (doktyyp = 'N'::bpchar) OR (doktyyp =
'H'::bpchar) OR (doktyyp = 'M'::bpchar) OR (doktyyp = 'E'::bpchar) OR
(doktyyp = 'B'::bpchar) OR (doktyyp = 'A'::bpchar) OR (doktyyp =
'R'::bpchar) OR (doktyyp = 'C'::bpchar) OR (doktyyp = 'F'::bpchar) OR
(doktyyp = 'J'::bpchar) OR (doktyyp = 'Q'::bpchar) OR (doktyyp =
'O'::bpchar) OR (doktyyp = 'S'::bpchar) OR (doktyyp = 'I'::bpchar) OR
(doktyyp = 'U'::bpchar) OR (doktyyp = 'D'::bpchar) OR (doktyyp =
'P'::bpchar))"
"                                ->  Bitmap Index Scan on dok_kuupaev_idx
(cost=0.00..414.75 rows=72500 width=0) (actual time=20.049..20.049
rows=72664 loops=1)"
"                                      Index Cond: (kuupaev >=
'2008-10-01'::date)"
"        ->  Sort  (cost=1775.54..1809.10 rows=13423 width=24) (actual
time=223.235..457.888 rows=59876 loops=1)"
"              Sort Key: toode.toode"
"              ->  Seq Scan on toode  (cost=0.00..855.23 rows=13423
width=24) (actual time=0.046..63.783 rows=13427 loops=1)"
"Total runtime: 32807.767 ms"

How to speed this up ?

'RIISIPERE%'  is shop group code. Using this condition can limit scan to 6
times less documentes since there are 6 shops.
Mayber is it possible to create indexes or other way to force index search
for condition

      dok.doktyyp IN
('V','G','Y','K','I','T','D','N','H','M','E','B','A','R','C','F','J','Q')
          AND CASE WHEN NOT dok.objrealt OR dok.doktyyp='I' THEN dok.yksus
ELSE rid.kuluobjekt END LIKE 'RIISIPERE%'
      )
     OR
     ( dok.doktyyp IN ('O','S','I','U','D','P')
          AND CASE WHEN dok.objrealt THEN rid.kuluobjekt ELSE dok.sihtyksus
END LIKE 'RIISIPERE%'
      )

or is it possible to re-write this condition so that it uses existing
pattern indexes

CREATE INDEX dok_sihtyksus_pattern_idx  ON firma2.dok  (sihtyksus
bpchar_pattern_ops);
CREATE INDEX dok_yksus_pattern_idx ON firma2.dok  (yksus
bpchar_pattern_ops);

without changing tables structureˇ?

Andrus.


pgsql-performance by date:

Previous
From: Mario Weilguni
Date:
Subject: Re: Deteriorating performance when loading large objects
Next
From: "Andrus"
Date:
Subject: Increasing GROUP BY CHAR columns speed