Re: Performance problems testing with Spamassassin 3.1.0 - Mailing list pgsql-performance

From Matthew Schumacher
Subject Re: Performance problems testing with Spamassassin 3.1.0
Date
Msg-id 42EE5720.2020306@aptalaska.net
In response to Re: Performance problems testing with Spamassassin 3.1.0  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
> Michael Parker <parkerm@pobox.com> writes:
>
>>sub bytea_esc {
>>  my ($str) = @_;
>>  my $buf = "";
>>  foreach my $char (split(//,$str)) {
>>    if (ord($char) == 0) { $buf .= "\\\\000"; }
>>    elsif (ord($char) == 39) { $buf .= "\\\\047"; }
>>    elsif (ord($char) == 92) { $buf .= "\\\\134"; }
>>    else { $buf .= $char; }
>>  }
>>  return $buf;
>>}
>
>
> Oh, I see the problem: you forgot to convert " to a backslash sequence.
>
> It would probably also be wise to convert anything >= 128 to a backslash
> sequence, so as to avoid any possible problems with multibyte character
> encodings.  You wouldn't see this issue in a SQL_ASCII database, but I
> suspect it would rise up to bite you with other encoding settings.
>
>             regards, tom lane

Here is some code that applies Tom's suggestions:

38c39,41
<     if (ord($char) == 0) { $buf .= "\\\\000"; }
---
>     if (ord($char) >= 128) { $buf .= "\\\\" . sprintf ("%lo", ord($char)); }
>     elsif (ord($char) == 0) { $buf .= "\\\\000"; }
>     elsif (ord($char) == 34) { $buf .= "\\\\042"; }

But this raises the question: why not escape everything?
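
For reference, here is a rough sketch of what the whole sub might look like with both suggestions folded in, next to an escape-everything variant for comparison. It assumes $str is a byte string (no characters above 255) and that the result goes into a quoted SQL literal that processes backslashes, as the original sub does. The name bytea_esc_all is made up for this illustration and is not part of SpamAssassin.

```perl
use strict;
use warnings;

# Sketch only: escapes NUL, double quote, single quote, backslash and
# every byte >= 128 as a doubled-backslash octal sequence.
# Assumes $str holds bytes, so each ord() value fits in three octal digits.
sub bytea_esc {
  my ($str) = @_;
  my $buf = "";
  foreach my $char (split(//, $str)) {
    my $ord = ord($char);
    if ($ord == 0 || $ord == 34 || $ord == 39 || $ord == 92 || $ord >= 128) {
      $buf .= sprintf("\\\\%03o", $ord);
    }
    else { $buf .= $char; }
  }
  return $buf;
}

# Hypothetical "escape everything" variant: every byte becomes an
# octal sequence, with no special cases to get wrong.
sub bytea_esc_all {
  my ($str) = @_;
  return join("", map { sprintf("\\\\%03o", $_) } unpack("C*", $str));
}
```

The obvious cost of the second version is size: every input byte turns into five characters (two backslashes plus three octal digits), so the query text grows about fivefold.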

schu
