Re: bytea encode performance issues - Mailing list pgsql-general

From Tom Lane
Subject Re: bytea encode performance issues
Date
Msg-id 3975.1217857958@sss.pgh.pa.us
Whole thread Raw
In response to Re: bytea encode performance issues  (Sim Zacks <sim@compulab.co.il>)
Responses Re: bytea encode performance issues
List pgsql-general
Sim Zacks <sim@compulab.co.il> writes:
> Tom Lane wrote:
>> Could we see EXPLAIN ANALYZE, not EXPLAIN?  Without actual facts
>> to work from, any suggestions would be mere guesswork.

> "              ->  Seq Scan on dbmail_messageblks k
> (cost=0.00..39193.21 rows=259 width=764) (actual time=30.662..486585.126
> rows=2107 loops=1)"
> "                    Filter: ((is_header = 0::smallint) AND
> (encode(messageblk, 'escape'::text) ~~ '%Yossi%'::text))"

okay, the time really is being spent in the seqscan ...

>> Also, what can you tell us about the sizes of the messageblk
>> strings (max and avg would be interesting)?
>>
> select max(length(messageblk)),avg(length(messageblk)) from
> dbmail_messageblks
> MAX        AVG
> 532259;    48115.630147120314

... but given that, I wonder whether the cost isn't from fetching
the toasted messageblk data, and nothing directly to do with either
the encode() call or the ~~ test.  It would be interesting to compare
the results of

explain analyze select encode(messageblk, 'escape') ~~ '%Yossi%'
from dbmail_messageblks where is_header = 0;

explain analyze select encode(messageblk, 'escape')
from dbmail_messageblks where is_header = 0;

explain analyze select messageblk = 'X'
from dbmail_messageblks where is_header = 0;

explain analyze select length(messageblk)
from dbmail_messageblks where is_header = 0;

(length is chosen with malice aforethought: unlike the other cases,
it doesn't require detoasting a toasted input)

            regards, tom lane

pgsql-general by date:

Previous
From: rihad
Date:
Subject: Re: index speed and failed expectations?
Next
From: Tom Lane
Date:
Subject: Re: index speed and failed expectations?