Re: bad COPY performance with NOTIFY in a trigger - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: bad COPY performance with NOTIFY in a trigger
Date
Msg-id CAHyXU0yu-hVx1PDbhyBtAzx89NjVuD9vn0WaM6WqGX=5+=Xqwg@mail.gmail.com
Whole thread Raw
In response to Re: bad COPY performance with NOTIFY in a trigger  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On Mon, Feb 8, 2016 at 8:35 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Sat, Feb 6, 2016 at 6:03 AM, Filip Rembiałkowski
> <filip.rembialkowski@gmail.com> wrote:
>> Thanks for the feedback.
>>
>> This patch is my first and obvious approach.
>>
>> @Merlin, I'm not sure if I get your idea:
>> - keep previous behaviour as obligatory? (which is: automatic
>> de-duplicating of incoming messages by channel+payload),
>> - instead of trivial search (sorting by browsing) use some kind of
>> faster lookups?
>>
>> I'm not sure if this statement in async.c is carved in stone:
>>
>> * Duplicate notifications from the same transaction are sent out as one
>> * notification only. This is done to save work when for example a trigger
>> * on a 2 million row table fires a notification for each row that has been
>> * changed. If the application needs to receive every single notification
>> * that has been sent, it can easily add some unique string into the extra
>> * payload parameter.
>>
>> 1) "work-saving" is disputable in some cases
>>
>> 2) an idea to "add some unique string" is OK logical-wise but it's not
>> OK performance-wise.
>>
>> Current search code is a sequential search:
>> https://github.com/filiprem/postgres/blob/master/src/backend/commands/async.c#L2139
>>
>> I'm not that smart to devise an algorithm for faster lookups -
>> probably you guys can give some advice.
>>
>> Again, my rationale is... This feature can burn a lot of CPU for
>> nothing. I was hoping to use NOTIFY/LISTEN as superfast notification
>> mechanism. Superfast regardless on whether you insert 100, 10k or 1m
>> rows.
>
> Sure, I get it -- you want to have fast notification events -- this is
> a good thing to want to have.  However, a GUC is probably not the best
> way to do that in this particular case.  It's way to fringey and the
> bar for behavior controlling GUC is incredibly high (short version:
> most modern introductions were to manage security issues).  I'm far
> from the last word on this thoug, but it's better to get this all
> sorted out now.
>
> Anyways, it should be possible to micro-optimize that path.  Perhaps
> using a hash table?  I'm not sure.
>
> Another possible way to work things out here is to expose your switch
> in the syntax of the command itself, or perhaps via the pg_notify
> function to avoid syntax issues.

whoops, I just noticed this thread moved to -hackers -- so please respond there.

merlin


pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: gin performance issue.
Next
From: Dan Langille
Date:
Subject: Running lots of inserts from selects on 9.4.5