Re: bytea encode performance issues - Mailing list pgsql-general

From Tomasz Ostrowski
Subject Re: bytea encode performance issues
Date
Msg-id 4896C86C.6030804@batory.org.pl
Whole thread Raw
In response to bytea encode performance issues  (Sim Zacks <sim@compulab.co.il>)
List pgsql-general
On 2008-08-03 12:12, Sim Zacks wrote:

> SELECT m.message_idnr,k.messageblk
> FROM dbmail_messageblks k
> JOIN dbmail_physmessage p ON k.physmessage_id = p.id
> JOIN dbmail_messages m ON p.id = m.physmessage_id
> WHERE
>  mailbox_idnr = 8
>  AND status IN (0,1 )
>  AND k.is_header = '0'
> GROUP BY m.message_idnr,k.messageblk
> HAVING ENCODE(k.messageblk::bytea,'escape') LIKE '%John%'

What is this encode() for? I think it is not needed and kills
performance, as it needs to copy every message body in memory, possibly
several times.

Why not just "HAVING k.messageblk LIKE '%John%'"?


Try this:

=> \timing

=> create temporary table test as
select
decode(
  repeat(
    'lorem ipsum dolor sit amet '
    ||s::text||E'\n'
    ,1000
  ),
  'escape'
) as a
from generate_series(1,10000) as s;
SELECT
Time: 10063.807 ms

=> select count(*) from test where a like '%John%';
 count
-------
     0
(1 row)

Time: 1280.973 ms

=> select count(*) from test where encode(a,'escape') like '%John%';
 count
-------
     0
(1 row)

Time: 5690.097 ms


Without encode search is 5 times faster. And for bigger bytea a
difference is even worse.


Even better:

=> select count(*) from test where position('John' in a) != 0;
select count(*) from test where position('John' in a) != 0;
 count
-------
     0
(1 row)

Time: 1098.768 ms

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

pgsql-general by date:

Previous
From: Markus Lehmann
Date:
Subject: postgres-r patch: autoconf/make problem
Next
From: "Markus Wollny"
Date:
Subject: Efficient data structures and UI for product matrix