Thread: Avoid huge perfomance loss on string concatenation

Avoid huge perfomance loss on string concatenation

From
"Andrus"
Date:
Using string concatenation in where clause causes huge perfomance loss:

explain analyze select
       rid.toode
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

"Nested Loop Left Join  (cost=68.75..5064.86 rows=1 width=24) (actual
time=8.081..26995.552 rows=567 loops=1)"
"  Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik =
artliik.liik))"
"  ->  Nested Loop  (cost=68.75..5062.19 rows=1 width=43) (actual
time=8.045..26965.731 rows=567 loops=1)"
"        ->  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1
width=43) (actual time=0.023..0.026 rows=1 loops=1)"
"              Index Cond: ('NAH S'::bpchar = toode)"
"        ->  Nested Loop  (cost=68.75..5053.91 rows=1 width=24) (actual
time=8.016..26964.698 rows=567 loops=1)"
"              ->  Index Scan using dok_kuupaev_idx on dok
(cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543
loops=1)"
"                    Index Cond: ((kuupaev >= '2007-11-01'::date) AND
(kuupaev <= '2007-12-04'::date))"
"                    Filter: ((((kuupaev)::text || (kellaaeg)::text) >=
'2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) <=
'2007-12-0423 59'::text))"
"              ->  Bitmap Heap Scan on rid  (cost=68.75..72.76 rows=1
width=28) (actual time=7.577..7.577 rows=0 loops=3543)"
"                    Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND
(rid.toode = 'NAH S'::bpchar))"
"                    ->  BitmapAnd  (cost=68.75..68.75 rows=1 width=0)
(actual time=7.574..7.574 rows=0 loops=3543)"
"                          ->  Bitmap Index Scan on rid_dokumnr_idx
(cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14
loops=3543)"
"                                Index Cond: (dok.dokumnr = rid.dokumnr)"
"                          ->  Bitmap Index Scan on rid_toode_idx
(cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144
loops=3543)"
"                                Index Cond: (toode = 'NAH S'::bpchar)"
"  ->  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual
time=0.007..0.020 rows=27 loops=567)"
"Total runtime: 26996.399 ms"

takes 26 seconds !

If I remove last line it takes only 0 seconds:

SET SEARCH_PATH TO FIRMA1,public;
explain analyze select
       rid.toode
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

"Hash Left Join  (cost=4313.85..7702.10 rows=24 width=24) (actual
time=10.138..48.884 rows=567 loops=1)"
"  Hash Cond: ((toode.grupp = artliik.grupp) AND (toode.liik =
artliik.liik))"
"  ->  Nested Loop  (cost=4311.17..7699.14 rows=24 width=43) (actual
time=10.049..47.877 rows=567 loops=1)"
"        ->  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1
width=43) (actual time=0.043..0.046 rows=1 loops=1)"
"              Index Cond: ('NAH S'::bpchar = toode)"
"        ->  Hash Join  (cost=4311.17..7690.63 rows=24 width=24) (actual
time=9.998..47.341 rows=567 loops=1)"
"              Hash Cond: (rid.dokumnr = dok.dokumnr)"
"              ->  Index Scan using rid_toode_idx on rid
(cost=0.00..3372.45 rows=1354 width=28) (actual time=0.089..24.265
rows=21144 loops=1)"
"                    Index Cond: (toode = 'NAH S'::bpchar)"
"              ->  Hash  (cost=4286.20..4286.20 rows=1998 width=4) (actual
time=9.871..9.871 rows=3543 loops=1)"
"                    ->  Index Scan using dok_kuupaev_idx on dok
(cost=0.00..4286.20 rows=1998 width=4) (actual time=0.057..6.779 rows=3543
loops=1)"
"                          Index Cond: ((kuupaev >= '2007-11-01'::date) AND
(kuupaev <= '2007-12-04'::date))"
"  ->  Hash  (cost=2.27..2.27 rows=27 width=19) (actual time=0.060..0.060
rows=27 loops=1)"
"        ->  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual
time=0.009..0.027 rows=27 loops=1)"
"Total runtime: 49.409 ms"


How to rewrite the query

select
       rid.toode,
      artliik.*
   FROM dok JOIN rid USING (dokumnr)
   JOIN toode USING (toode)
   LEFT JOIN artliik using(grupp,liik)
   WHERE rid.toode='NAH S'
AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

so it runs fast ?

Andrus.


"PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)"



Re: Avoid huge perfomance loss on string concatenation

From
Bill Moran
Date:
"Andrus" <kobruleht2@hot.ee> wrote:
>
> Using string concatenation in where clause causes huge perfomance loss:
>
> explain analyze select
>        rid.toode
>    FROM dok JOIN rid USING (dokumnr)
>    JOIN toode USING (toode)
>    LEFT JOIN artliik using(grupp,liik)
>    WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
> and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

You provide zero information on the table layout, and the explain output
has been horribly mangled by your MUA.

I would suspect the problem is that there's no index that can be used
for that final comparison.  Do you have an index along the lines of
CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

Overall, the fact that you're concatenating two text fields to generate a
date field tends to suggest that your database schema has some fairly
major design problems, but I can only speculate at this point.

If neither of those help, I expect you'll need to provide more information.

> "Nested Loop Left Join  (cost=68.75..5064.86 rows=1 width=24) (actual
> time=8.081..26995.552 rows=567 loops=1)"
> "  Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik =
> artliik.liik))"
> "  ->  Nested Loop  (cost=68.75..5062.19 rows=1 width=43) (actual
> time=8.045..26965.731 rows=567 loops=1)"
> "        ->  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1
> width=43) (actual time=0.023..0.026 rows=1 loops=1)"
> "              Index Cond: ('NAH S'::bpchar = toode)"
> "        ->  Nested Loop  (cost=68.75..5053.91 rows=1 width=24) (actual
> time=8.016..26964.698 rows=567 loops=1)"
> "              ->  Index Scan using dok_kuupaev_idx on dok
> (cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543
> loops=1)"
> "                    Index Cond: ((kuupaev >= '2007-11-01'::date) AND
> (kuupaev <= '2007-12-04'::date))"
> "                    Filter: ((((kuupaev)::text || (kellaaeg)::text) >=
> '2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) <=
> '2007-12-0423 59'::text))"
> "              ->  Bitmap Heap Scan on rid  (cost=68.75..72.76 rows=1
> width=28) (actual time=7.577..7.577 rows=0 loops=3543)"
> "                    Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND
> (rid.toode = 'NAH S'::bpchar))"
> "                    ->  BitmapAnd  (cost=68.75..68.75 rows=1 width=0)
> (actual time=7.574..7.574 rows=0 loops=3543)"
> "                          ->  Bitmap Index Scan on rid_dokumnr_idx
> (cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14
> loops=3543)"
> "                                Index Cond: (dok.dokumnr = rid.dokumnr)"
> "                          ->  Bitmap Index Scan on rid_toode_idx
> (cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144
> loops=3543)"
> "                                Index Cond: (toode = 'NAH S'::bpchar)"
> "  ->  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual
> time=0.007..0.020 rows=27 loops=567)"
> "Total runtime: 26996.399 ms"
>
> takes 26 seconds !
>
> If I remove last line it takes only 0 seconds:
>
> SET SEARCH_PATH TO FIRMA1,public;
> explain analyze select
>        rid.toode
>    FROM dok JOIN rid USING (dokumnr)
>    JOIN toode USING (toode)
>    LEFT JOIN artliik using(grupp,liik)
>    WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
>
> "Hash Left Join  (cost=4313.85..7702.10 rows=24 width=24) (actual
> time=10.138..48.884 rows=567 loops=1)"
> "  Hash Cond: ((toode.grupp = artliik.grupp) AND (toode.liik =
> artliik.liik))"
> "  ->  Nested Loop  (cost=4311.17..7699.14 rows=24 width=43) (actual
> time=10.049..47.877 rows=567 loops=1)"
> "        ->  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1
> width=43) (actual time=0.043..0.046 rows=1 loops=1)"
> "              Index Cond: ('NAH S'::bpchar = toode)"
> "        ->  Hash Join  (cost=4311.17..7690.63 rows=24 width=24) (actual
> time=9.998..47.341 rows=567 loops=1)"
> "              Hash Cond: (rid.dokumnr = dok.dokumnr)"
> "              ->  Index Scan using rid_toode_idx on rid
> (cost=0.00..3372.45 rows=1354 width=28) (actual time=0.089..24.265
> rows=21144 loops=1)"
> "                    Index Cond: (toode = 'NAH S'::bpchar)"
> "              ->  Hash  (cost=4286.20..4286.20 rows=1998 width=4) (actual
> time=9.871..9.871 rows=3543 loops=1)"
> "                    ->  Index Scan using dok_kuupaev_idx on dok
> (cost=0.00..4286.20 rows=1998 width=4) (actual time=0.057..6.779 rows=3543
> loops=1)"
> "                          Index Cond: ((kuupaev >= '2007-11-01'::date) AND
> (kuupaev <= '2007-12-04'::date))"
> "  ->  Hash  (cost=2.27..2.27 rows=27 width=19) (actual time=0.060..0.060
> rows=27 loops=1)"
> "        ->  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual
> time=0.009..0.027 rows=27 loops=1)"
> "Total runtime: 49.409 ms"
>
>
> How to rewrite the query
>
> select
>        rid.toode,
>       artliik.*
>    FROM dok JOIN rid USING (dokumnr)
>    JOIN toode USING (toode)
>    LEFT JOIN artliik using(grupp,liik)
>    WHERE rid.toode='NAH S'
> AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
> and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'
>
> so it runs fast ?
>
> Andrus.
>
>
> "PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
> (mingw-special)"
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings


--
Bill Moran
http://www.potentialtech.com

Re: Avoid huge perfomance loss on string concatenation

From
"Merlin Moncure"
Date:
On Dec 4, 2007 8:02 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> "Andrus" <kobruleht2@hot.ee> wrote:
> > Using string concatenation in where clause causes huge perfomance loss:
> >
> > explain analyze select
> >        rid.toode
> >    FROM dok JOIN rid USING (dokumnr)
> >    JOIN toode USING (toode)
> >    LEFT JOIN artliik using(grupp,liik)
> >    WHERE rid.toode='NAH S'
> > AND dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
> > and dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

can you please give us the types of  dok.kuupaev and dok.kellaaeg?  I
think a simple fix is possible here.


> You provide zero information on the table layout, and the explain output
> has been horribly mangled by your MUA.
>
> I would suspect the problem is that there's no index that can be used
> for that final comparison.  Do you have an index along the lines of
> CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?
>
> Overall, the fact that you're concatenating two text fields to generate a
> date field tends to suggest that your database schema has some fairly
> major design problems, but I can only speculate at this point.

just small correction here...expressions like that in the create index
need an extra set of parens (but I agree with your sentiment):
CREATE INDEX dokindex ON dok ((kuupaeve||kellaaeg))

merlin

Re: Avoid huge perfomance loss on string concatenation

From
Bill Moran
Date:
In response to "Andrus" <eetasoft@online.ee>:

> Thank you very much for quick reply.

Keep the mailing list included in this discussion.

> > can you please give us the types of  dok.kuupaev and dok.kellaaeg?  I
> > think a simple fix is possible here.
>
> dok.kuupaev type is DATE
>
> dok.kellaaeg type is     character(5) NOT NULL DEFAULT ''
> and is used to represent dokument time in format   hh mm
>
> Database encoding is UTF-8 , cluster locale is estonian, OS is Windows 2003
> server.
>
> >> You provide zero information on the table layout
>
> dok table full definition is below.
> What other information do you need ?

That's pretty much it.

> >>, and the explain output
> >> has been horribly mangled by your MUA.
>
> I used copy and paste from pgAdmin.
> I checked my message and it seems that explain output is OK, havent found
> any truncation.
> So I do'nt understand this.

I had to cut/paste the data into a text editor and reformat it before I
could work with it.  The arbitrary line-wrap is painful:

"Nested Loop Left Join  (cost=68.75..5064.86 rows=1 width=24) (actual
time=8.081..26995.552 rows=567 loops=1)"
"  Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik =
artliik.liik))"
"  ->  Nested Loop  (cost=68.75..5062.19 rows=1 width=43) (actual
time=8.045..26965.731 rows=567 loops=1)"
"        ->  Index Scan using toode_pkey on toode  (cost=0.00..8.27 rows=1
width=43) (actual time=0.023..0.026 rows=1 loops=1)"
"              Index Cond: ('NAH S'::bpchar = toode)"
"        ->  Nested Loop  (cost=68.75..5053.91 rows=1 width=24) (actual
time=8.016..26964.698 rows=567 loops=1)"
"              ->  Index Scan using dok_kuupaev_idx on dok
(cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543
loops=1)"
"                    Index Cond: ((kuupaev >= '2007-11-01'::date) AND
(kuupaev <= '2007-12-04'::date))"
"                    Filter: ((((kuupaev)::text || (kellaaeg)::text) >=
'2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) <=
'2007-12-0423 59'::text))"
"              ->  Bitmap Heap Scan on rid  (cost=68.75..72.76 rows=1
width=28) (actual time=7.577..7.577 rows=0 loops=3543)"
"                    Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND
(rid.toode = 'NAH S'::bpchar))"
"                    ->  BitmapAnd  (cost=68.75..68.75 rows=1 width=0)
(actual time=7.574..7.574 rows=0 loops=3543)"
"                          ->  Bitmap Index Scan on rid_dokumnr_idx
(cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14
loops=3543)"
"                                Index Cond: (dok.dokumnr = rid.dokumnr)"
"                          ->  Bitmap Index Scan on rid_toode_idx
(cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144
loops=3543)"
"                                Index Cond: (toode = 'NAH S'::bpchar)"
"  ->  Seq Scan on artliik  (cost=0.00..2.27 rows=27 width=19) (actual
time=0.007..0.020 rows=27 loops=567)"
"Total runtime: 26996.399 ms"

> >> I would suspect the problem is that there's no index that can be used
> >> for that final comparison.
>
> Postgres must use index on kuupaev in both queries.

I'm not sure what that comment is supposed to mean.

PG is using the index for the condition
dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

but there is no index that matches the expression
dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

If you look at your explain output, you'll see that step is taking a
lot of time, and it's inside a nested loop, which means it's run
repeatedly.

> This index filters out most rows.

Have you run a VACUUM ANALYZE on the tables involved with this query
recently?  It's possible that PG has outdated statistics and is
running a poor plan as a result.

> >> Do you have an index along the lines of
> >> CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?
>
> I do'nt have this index.
> dok.kuupaev||dok.kellaaeg conditon should applied after index search is
> performed.

The query planner doesn't seem to think so.

> It filters out only a small number of rows additionally to the plain kuupaev
> filter.
> So adding index on dok.kuupaev||dok.kellaaeg  is not reasonable IMHO.
>
> Please confirm that most reasonable way to fix this to add this index, I
> will add this.

Just add the index and rerun to see if it helps.  If it doesn't, then
drop the index.  I have absolutely no way to investigate this for you.

> >> Overall, the fact that you're concatenating two text fields to generate a
> >> date field tends to suggest that your database schema has some fairly
> >> major design problems, but I can only speculate at this point.
>
> This schema is migrated from dbms where there was no datetime support.
> char(5) field is used to express time  in form   hh mm
> This schema is deployed in a large number of servers.
> Its change would be very expensive. change requires huge amout of  work time
> to re-write applications, create database conversion scripts, re-write
> pl/sql triggers, test and fix new bugs causes by change.

I understand.  However, that doesn't change the fact that such a change
will improve performance and accuracy of the data.

For example, in your query, you have a text string meant to represent
a date: '2007-12-0423 59'

This is not a valid date/time, but PostgreSQL has no way to know that
because it's just a text string.  As a result, you're query is liable
to give you outright incorrect results.

>
> Andrus.
>
>
> CREATE TABLE firma1.dok
> (
>   doktyyp character(1) NOT NULL,
>   dokumnr integer NOT NULL DEFAULT nextval('dok_dokumnr_seq'::regclass),
>   kuupaev date NOT NULL,
>   oper character(3),
>   klient character(12),
>   laonr numeric(2),
>   raha character(3),
>   tasudok character(25),
>   knr character(10),
>   tasukuup date,
>   yksus character(10),
>   sihtyksus character(10),
>   pais2obj character(10),
>   saaja character(12),
>   krdokumnr integer,
>   eimuuda ebool,
>   kasutaja character(10),
>   username character(10),
>   kellaaeg character(5) NOT NULL DEFAULT ''::bpchar,
>   arvekonto character(10),
>   maksetin character(5),
>   exchrate numeric(11,6),
>   ratefound date,
>   kurss numeric(10,5),
>   tekst1 text,
>   viitenr character(20),
>   objrealt ebool,
>   arvenumber character(25),
>   pais3obj character(10),
>   pais4obj character(10),
>   pais5obj character(10),
>   pais6obj character(10),
>   pais7obj character(10),
>   pais8obj character(10),
>   pais9obj character(10),
>   masin character(5),
>   tegmasin character(5),
>   guid character(36) NOT NULL,
>   doksumma numeric(12,2),
>   kinnitatud ebool,
>   tasumata numeric(12,2),
>   sularaha numeric(12,2),
>   kaardimaks numeric(12,2),
>   kalkliik character(1),
>   kalktoode character(20),
>   inventuur ebool,
>   algus date,
>   lopp date,
>   taidetud ebool,
>   kaal numeric(7,3),
>   "timestamp" character(14) NOT NULL DEFAULT to_char(now(),
> 'YYYYMMDDHH24MISS'::text),
>   vmnr integer,
>   tellimus character(25),
>   volitaisik character(36),
>   liikmesrii character(2),
>   tehingulii character(2),
>   tarneklaus character(10),
>   statprots character(2),
>   CONSTRAINT dok_pkey PRIMARY KEY (dokumnr),
>   CONSTRAINT dok_arvekonto_fkey FOREIGN KEY (arvekonto)
>       REFERENCES firma1.konto (kontonr) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_kalktoode_fkey FOREIGN KEY (kalktoode)
>       REFERENCES firma1.toode (toode) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_kasutaja_fkey FOREIGN KEY (kasutaja)
>       REFERENCES kasutaja (kasutaja) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_klient_fkey FOREIGN KEY (klient)
>       REFERENCES firma1.klient (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_knr_fkey FOREIGN KEY (knr)
>       REFERENCES firma1.konto (kontonr) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_krdokumnr_fkey FOREIGN KEY (krdokumnr)
>       REFERENCES firma1.dok (dokumnr) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_liikmesrii_fkey FOREIGN KEY (liikmesrii)
>       REFERENCES riik (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_maksetin_fkey FOREIGN KEY (maksetin)
>       REFERENCES maksetin (maksetin) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_oper_fkey FOREIGN KEY (oper)
>       REFERENCES alamdok (oper) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais2obj_fkey FOREIGN KEY (pais2obj)
>       REFERENCES firma1.yksus2 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais3obj_fkey FOREIGN KEY (pais3obj)
>       REFERENCES firma1.yksus3 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais4obj_fkey FOREIGN KEY (pais4obj)
>       REFERENCES firma1.yksus4 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais5obj_fkey FOREIGN KEY (pais5obj)
>       REFERENCES firma1.yksus5 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais6obj_fkey FOREIGN KEY (pais6obj)
>       REFERENCES firma1.yksus6 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais7obj_fkey FOREIGN KEY (pais7obj)
>       REFERENCES firma1.yksus7 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais8obj_fkey FOREIGN KEY (pais8obj)
>       REFERENCES firma1.yksus8 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_pais9obj_fkey FOREIGN KEY (pais9obj)
>       REFERENCES firma1.yksus9 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_raha_fkey FOREIGN KEY (raha)
>       REFERENCES raha (raha) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_saaja_fkey FOREIGN KEY (saaja)
>       REFERENCES firma1.klient (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_sihtyksus_fkey FOREIGN KEY (sihtyksus)
>       REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_statprots_fkey FOREIGN KEY (statprots)
>       REFERENCES transpor (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_tarneklaus_fkey FOREIGN KEY (tarneklaus)
>       REFERENCES tarnekla (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_tehingulii_fkey FOREIGN KEY (tehingulii)
>       REFERENCES tehingul (kood) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_username_fkey FOREIGN KEY (username)
>       REFERENCES kasutaja (kasutaja) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_vmnr_fkey FOREIGN KEY (vmnr)
>       REFERENCES firma1.vmaks (vmnr) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_volitaisik_fkey FOREIGN KEY (volitaisik)
>       REFERENCES firma1.kaardika (guid) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_yksus_fkey FOREIGN KEY (yksus)
>       REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
>       ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
>   CONSTRAINT dok_check CHECK (krdokumnr IS NULL OR (doktyyp = ANY
> (ARRAY['G'::bpchar, 'O'::bpchar]))),
>   CONSTRAINT dok_dokumnr_check CHECK (dokumnr > 0),
>   CONSTRAINT dok_guid_check CHECK (guid <> ''::bpchar)
> )
> WITHOUT OIDS;
>
> ALTER TABLE firma1.dok OWNER TO eeva_owner;
>
> CREATE INDEX dok_klient_idx
>   ON firma1.dok
>   USING btree
>   (klient);
>
> CREATE INDEX dok_krdokumnr_idx
>   ON firma1.dok
>   USING btree
>   (krdokumnr);
>
> CREATE INDEX dok_kuupaev_idx
>   ON firma1.dok
>   USING btree
>   (kuupaev);
>
> CREATE INDEX dok_tasudok_idx
>   ON firma1.dok
>   USING btree
>   (tasudok);
>
> CREATE UNIQUE INDEX dok_tasudok_unique_idx
>   ON firma1.dok
>   USING btree
>   (doktyyp, tasudok)
>   WHERE doktyyp = ANY (ARRAY['T'::bpchar, 'U'::bpchar]);
>
> CREATE INDEX dok_tasumata_idx
>   ON firma1.dok
>   USING btree
>   (tasumata);
>
> CREATE INDEX dok_tellimus_idx
>   ON firma1.dok
>   USING btree
>   (tellimus);
>
> CREATE TRIGGER dok_btrig
>   BEFORE INSERT
>   ON firma1.dok
>   FOR EACH ROW
>   EXECUTE PROCEDURE firma1.dok_seq_trig();
>
>
>
>


--
Bill Moran
http://www.potentialtech.com

Re: Avoid huge perfomance loss on string concatenation

From
"Andrus"
Date:
> just small correction here...expressions like that in the create index
> need an extra set of parens (but I agree with your sentiment):
> CREATE INDEX dokindex ON dok ((kuupaeve||kellaaeg))

I tried

CREATE INDEX dok_kuupaev_kellaaeg_idx ON dok ((kuupaev||kellaaeg));

but got error

ERROR: functions in index expression must be marked IMMUTABLE
SQL state: 42P17

How to create such index ?

Andrus.



Re: Avoid huge perfomance loss on string concatenation

From
"Andrus"
Date:
> I'm not sure what that comment is supposed to mean.
>
> PG is using the index for the condition
> dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
>
> but there is no index that matches the expression
> dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'
>
> If you look at your explain output, you'll see that step is taking a
> lot of time, and it's inside a nested loop, which means it's run
> repeatedly.

Postgres must use index to filter out rows matching to the  condition

dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

this returns 121 rows.

Additional condition

 dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

should be evaluated only for these 121 rows.

> Have you run a VACUUM ANALYZE on the tables involved with this query
> recently?  It's possible that PG has outdated statistics and is
> running a poor plan as a result.

I have the follwing command in end of postgresql.conf file:

stats_start_collector = on
stats_row_level = on
autovacuum = on

So autovacuum should be running.

> Just add the index and rerun to see if it helps.  If it doesn't, then
> drop the index.  I have absolutely no way to investigate this for you.

I tried

CREATE INDEX dok_kuupaev_kellaaeg_idx ON dok ((kuupaev||kellaaeg));

but got error

ERROR: functions in index expression must be marked IMMUTABLE
SQL state: 42P17

How to create such index ?

> For example, in your query, you have a text string meant to represent
> a date: '2007-12-0423 59'
>
> This is not a valid date/time, but PostgreSQL has no way to know that
> because it's just a text string.  As a result, you're query is liable
> to give you outright incorrect results.

My fields are kuupaev date  and  kellaaeg char(5)
kellaaeg is in format hh mm

I compare this always with kuupaev||kellaaeg

'2007-12-0423 59' is my valid datetime for to be used for this conversion.

I can probably convert kuupaev||kellaaeg to a datetime and use datetime
comparison instead of this. Will this increase perfomance ?

Andrus.



Re: Avoid huge perfomance loss on string concatenation

From
"Andrus"
Date:
Thank you very much for quick reply.

> can you please give us the types of  dok.kuupaev and dok.kellaaeg?  I
> think a simple fix is possible here.

dok.kuupaev type is DATE

dok.kellaaeg type is     character(5) NOT NULL DEFAULT ''
and is used to represent dokument time in format   hh mm

Database encoding is UTF-8 , cluster locale is estonian, OS is Windows 2003
server.

>> You provide zero information on the table layout

dok table full definition is below.
What other information do you need ?

>>, and the explain output
>> has been horribly mangled by your MUA.

I used copy and paste from pgAdmin.
I checked my message and it seems that explain output is OK, havent found
any truncation.
So I do'nt understand this.

>> I would suspect the problem is that there's no index that can be used
>> for that final comparison.

Postgres must use index on kuupaev in both queries.
This index filters out most rows.

>> Do you have an index along the lines of
>> CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

I do'nt have this index.
dok.kuupaev||dok.kellaaeg conditon should applied after index search is
performed.
It filters out only a small number of rows additionally to the plain kuupaev
filter.
So adding index on dok.kuupaev||dok.kellaaeg  is not reasonable IMHO.

Please confirm that most reasonable way to fix this to add this index, I
will add this.

>> Overall, the fact that you're concatenating two text fields to generate a
>> date field tends to suggest that your database schema has some fairly
>> major design problems, but I can only speculate at this point.

This schema is migrated from dbms where there was no datetime support.
char(5) field is used to express time  in form   hh mm
This schema is deployed in a large number of servers.
Its change would be very expensive. change requires huge amout of  work time
to re-write applications, create database conversion scripts, re-write
pl/sql triggers, test and fix new bugs causes by change.

Andrus.


CREATE TABLE firma1.dok
(
  doktyyp character(1) NOT NULL,
  dokumnr integer NOT NULL DEFAULT nextval('dok_dokumnr_seq'::regclass),
  kuupaev date NOT NULL,
  oper character(3),
  klient character(12),
  laonr numeric(2),
  raha character(3),
  tasudok character(25),
  knr character(10),
  tasukuup date,
  yksus character(10),
  sihtyksus character(10),
  pais2obj character(10),
  saaja character(12),
  krdokumnr integer,
  eimuuda ebool,
  kasutaja character(10),
  username character(10),
  kellaaeg character(5) NOT NULL DEFAULT ''::bpchar,
  arvekonto character(10),
  maksetin character(5),
  exchrate numeric(11,6),
  ratefound date,
  kurss numeric(10,5),
  tekst1 text,
  viitenr character(20),
  objrealt ebool,
  arvenumber character(25),
  pais3obj character(10),
  pais4obj character(10),
  pais5obj character(10),
  pais6obj character(10),
  pais7obj character(10),
  pais8obj character(10),
  pais9obj character(10),
  masin character(5),
  tegmasin character(5),
  guid character(36) NOT NULL,
  doksumma numeric(12,2),
  kinnitatud ebool,
  tasumata numeric(12,2),
  sularaha numeric(12,2),
  kaardimaks numeric(12,2),
  kalkliik character(1),
  kalktoode character(20),
  inventuur ebool,
  algus date,
  lopp date,
  taidetud ebool,
  kaal numeric(7,3),
  "timestamp" character(14) NOT NULL DEFAULT to_char(now(),
'YYYYMMDDHH24MISS'::text),
  vmnr integer,
  tellimus character(25),
  volitaisik character(36),
  liikmesrii character(2),
  tehingulii character(2),
  tarneklaus character(10),
  statprots character(2),
  CONSTRAINT dok_pkey PRIMARY KEY (dokumnr),
  CONSTRAINT dok_arvekonto_fkey FOREIGN KEY (arvekonto)
      REFERENCES firma1.konto (kontonr) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_kalktoode_fkey FOREIGN KEY (kalktoode)
      REFERENCES firma1.toode (toode) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_kasutaja_fkey FOREIGN KEY (kasutaja)
      REFERENCES kasutaja (kasutaja) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_klient_fkey FOREIGN KEY (klient)
      REFERENCES firma1.klient (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_knr_fkey FOREIGN KEY (knr)
      REFERENCES firma1.konto (kontonr) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_krdokumnr_fkey FOREIGN KEY (krdokumnr)
      REFERENCES firma1.dok (dokumnr) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_liikmesrii_fkey FOREIGN KEY (liikmesrii)
      REFERENCES riik (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_maksetin_fkey FOREIGN KEY (maksetin)
      REFERENCES maksetin (maksetin) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_oper_fkey FOREIGN KEY (oper)
      REFERENCES alamdok (oper) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais2obj_fkey FOREIGN KEY (pais2obj)
      REFERENCES firma1.yksus2 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais3obj_fkey FOREIGN KEY (pais3obj)
      REFERENCES firma1.yksus3 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais4obj_fkey FOREIGN KEY (pais4obj)
      REFERENCES firma1.yksus4 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais5obj_fkey FOREIGN KEY (pais5obj)
      REFERENCES firma1.yksus5 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais6obj_fkey FOREIGN KEY (pais6obj)
      REFERENCES firma1.yksus6 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais7obj_fkey FOREIGN KEY (pais7obj)
      REFERENCES firma1.yksus7 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais8obj_fkey FOREIGN KEY (pais8obj)
      REFERENCES firma1.yksus8 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_pais9obj_fkey FOREIGN KEY (pais9obj)
      REFERENCES firma1.yksus9 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_raha_fkey FOREIGN KEY (raha)
      REFERENCES raha (raha) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_saaja_fkey FOREIGN KEY (saaja)
      REFERENCES firma1.klient (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_sihtyksus_fkey FOREIGN KEY (sihtyksus)
      REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_statprots_fkey FOREIGN KEY (statprots)
      REFERENCES transpor (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_tarneklaus_fkey FOREIGN KEY (tarneklaus)
      REFERENCES tarnekla (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_tehingulii_fkey FOREIGN KEY (tehingulii)
      REFERENCES tehingul (kood) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_username_fkey FOREIGN KEY (username)
      REFERENCES kasutaja (kasutaja) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_vmnr_fkey FOREIGN KEY (vmnr)
      REFERENCES firma1.vmaks (vmnr) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_volitaisik_fkey FOREIGN KEY (volitaisik)
      REFERENCES firma1.kaardika (guid) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_yksus_fkey FOREIGN KEY (yksus)
      REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT dok_check CHECK (krdokumnr IS NULL OR (doktyyp = ANY
(ARRAY['G'::bpchar, 'O'::bpchar]))),
  CONSTRAINT dok_dokumnr_check CHECK (dokumnr > 0),
  CONSTRAINT dok_guid_check CHECK (guid <> ''::bpchar)
)
WITHOUT OIDS;

ALTER TABLE firma1.dok OWNER TO eeva_owner;

CREATE INDEX dok_klient_idx
  ON firma1.dok
  USING btree
  (klient);

CREATE INDEX dok_krdokumnr_idx
  ON firma1.dok
  USING btree
  (krdokumnr);

CREATE INDEX dok_kuupaev_idx
  ON firma1.dok
  USING btree
  (kuupaev);

CREATE INDEX dok_tasudok_idx
  ON firma1.dok
  USING btree
  (tasudok);

CREATE UNIQUE INDEX dok_tasudok_unique_idx
  ON firma1.dok
  USING btree
  (doktyyp, tasudok)
  WHERE doktyyp = ANY (ARRAY['T'::bpchar, 'U'::bpchar]);

CREATE INDEX dok_tasumata_idx
  ON firma1.dok
  USING btree
  (tasumata);

CREATE INDEX dok_tellimus_idx
  ON firma1.dok
  USING btree
  (tellimus);

CREATE TRIGGER dok_btrig
  BEFORE INSERT
  ON firma1.dok
  FOR EACH ROW
  EXECUTE PROCEDURE firma1.dok_seq_trig();




Re: Avoid huge perfomance loss on string concatenation

From
Martijn van Oosterhout
Date:
On Wed, Dec 05, 2007 at 11:32:59AM +0200, Andrus wrote:
> I do'nt have this index.
> dok.kuupaev||dok.kellaaeg conditon should applied after index search is
> performed.
> It filters out only a small number of rows additionally to the plain kuupaev
> filter.
> So adding index on dok.kuupaev||dok.kellaaeg  is not reasonable IMHO.

Your problem is that it is doing that, except postgres is assuming that
it does filter more rows. In actual fact the extra condition is making
no difference whatsoever.

So postgres assumes the result will only give 10 rows and so a nested
loop is better.

As for the index problem, I don't know but it will probably work better.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution inevitable.
>  -- John F Kennedy

Attachment

Re: Avoid huge perfomance loss on string concatenation

From
"Andrus"
Date:
Thank you.
I ran manually analyze command.
After that query runs fast.
I have enabled autovacuum and statitics collection in config file.
Every day a lot of rows are added to dok table.

However it seems that statitics is not collected (autovacuum is not running)

Any idea autovacuum is not running ?

Andrus.