Thread: LISTEN / NOTIFY performance in 8.3
Hi, I'm noticing a strange increase in the amount of time it takes to issue a NOTIFY statement. I have an existing app that provides a producer / consumer type of queue and that uses the LISTEN / NOTIFY mechanism to signal the consumers of new items arriving in the queue. The consumers then process these items and issue a notify to signal that they have been processed. In the past issuing these notifications happened very quickly, now on 8.3 I'm seeing all of them taking over 300ms and many of them taking 1500ms or more! The notifications are happening outside of any transactions (which is itself a probable area for performance improvement, I realize) but I'm wondering what might have changed between 8.1 (the version I was using in the past) and 8.3? TIA, Joel
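A minimal sketch of the signalling pattern being described, for readers who haven't used it (the queue table and its column are invented for illustration; the to_consumer / to_producer channel names are the ones the test script uses later in the thread, and on 8.3 a NOTIFY carries only a condition name, no payload):

-- Producer: enqueue an item, then wake any listening consumers.
INSERT INTO queue_items (payload) VALUES ('new work');  -- hypothetical table
NOTIFY to_consumer;

-- Each consumer: register interest once, then wait for notifications.
LISTEN to_consumer;
-- ...fetch and process the item, then tell the producer it's done...
NOTIFY to_producer;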
Joel Stevenson <joelstevenson@mac.com> writes:
> I have an existing app that provides a producer / consumer type of
> queue and that uses the LISTEN / NOTIFY mechanism to signal the
> consumers of new items arriving in the queue. The consumers then
> process these items and issue a notify to signal that they have been
> processed. In the past issuing these notifications happened very
> quickly, now on 8.3 I'm seeing all of them taking over 300ms and many
> of them taking 1500ms or more!

That's strange, I would not have thought that listen/notify behavior
would change at all.  How are you measuring this delay exactly?
Can you put together a self-contained test case?

			regards, tom lane
At 11:58 PM -0500 2/23/08, Tom Lane wrote:
>Joel Stevenson <joelstevenson@mac.com> writes:
>>> That's strange, I would not have thought that listen/notify behavior
>>> would change at all.  How are you measuring this delay exactly?
>>> Can you put together a self-contained test case?
>
>> Attached is a perl script that sort of simulates what's going on.
>
>Thanks for the script.  It's not showing any particular problems here,
>though.  With log_min_duration_statement = 10, the only statements that
>(slightly) exceed 10ms are the select count(*) from generate_series(1,
>15000) ones.
>
>> Also of note, the iowait percentages on this quad core linux box jump
>> to 30-40% while this test script is running, even though there's no
>> table activity involved and the producer consumers pause for up to a
>> second between iterations.
>
>This sounds a bit like pg_listener has gotten bloated.  Try a "VACUUM
>VERBOSE pg_listener" (as superuser) and see what it says.

At the moment (server is inactive):

pcdb=# VACUUM VERBOSE pg_listener;
INFO:  vacuuming "pg_catalog.pg_listener"
INFO:  "pg_listener": removed 1 row versions in 1 pages
INFO:  "pg_listener": found 1 removable, 21 nonremovable row versions in 28 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2319 unused item pointers.
28 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Running the test script and then the above command:

pcdb=# VACUUM VERBOSE pg_listener;
INFO:  vacuuming "pg_catalog.pg_listener"
INFO:  "pg_listener": removed 693 row versions in 12 pages
INFO:  "pg_listener": found 693 removable, 21 nonremovable row versions in 28 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2308 unused item pointers.
28 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Numerous notifications took 1000ms or so to complete in the test
script execution between those two vacuum runs.

>If that is the problem then the next question is why it got so much more
>bloated than you were used to --- something wrong with vacuuming
>procedures, perhaps?

I have autovacuum on and using default settings.  I have an explicit
vacuum routine that runs nightly over the whole DB.

-Joel
Joel Stevenson <joelstevenson@mac.com> writes:
>> This sounds a bit like pg_listener has gotten bloated.  Try a "VACUUM
>> VERBOSE pg_listener" (as superuser) and see what it says.

> At the moment (server is inactive):

> pcdb=# VACUUM VERBOSE pg_listener;
> INFO:  vacuuming "pg_catalog.pg_listener"
> INFO:  "pg_listener": removed 1 row versions in 1 pages
> INFO:  "pg_listener": found 1 removable, 21 nonremovable row versions
> in 28 pages

OK, that destroys the table-bloat theory.  Just to make sure, I
pre-populated pg_listener with enough dead rows to make 28 pages,
but I still don't see any slow notifies or noticeable load in vmstat.

That server is not quite "inactive", though.  What are the 21 remaining
pg_listener entries for?  Is it possible that those jobs are having
some impact on the ones run by the test script?

Also, it might be worth enabling log_lock_waits to see if the slow
notifies are due to having to wait on some lock or other.

			regards, tom lane
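For anyone wanting to reproduce this, a sketch of the logging settings involved (both parameters exist in 8.3 and can be set per session by a superuser, or in postgresql.conf followed by a reload; the 300 ms threshold is just an example value, not from the thread):

SET log_lock_waits = on;               -- report lock waits longer than deadlock_timeout
SET log_min_duration_statement = 300;  -- log any statement slower than 300 ms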
At 1:57 PM -0500 2/24/08, Tom Lane wrote:
>Joel Stevenson <joelstevenson@mac.com> writes:
>>> This sounds a bit like pg_listener has gotten bloated.  Try a "VACUUM
>>> VERBOSE pg_listener" (as superuser) and see what it says.
>
>> At the moment (server is inactive):
>
>> pcdb=# VACUUM VERBOSE pg_listener;
>> INFO:  vacuuming "pg_catalog.pg_listener"
>> INFO:  "pg_listener": removed 1 row versions in 1 pages
>> INFO:  "pg_listener": found 1 removable, 21 nonremovable row versions
>> in 28 pages
>
>OK, that destroys the table-bloat theory.  Just to make sure, I
>pre-populated pg_listener with enough dead rows to make 28 pages,
>but I still don't see any slow notifies or noticeable load in vmstat.
>
>That server is not quite "inactive", though.  What are the 21 remaining
>pg_listener entries for?  Is it possible that those jobs are having
>some impact on the ones run by the test script?
>
>Also, it might be worth enabling log_lock_waits to see if the slow
>notifies are due to having to wait on some lock or other.

The other listeners are the application's consumers and producer.  At
the time of the testing they were not active but were alive.

For isolation I've just shut down all listener / notifier processes
that were using the box, vacuumed pg_listener, and run the test script
again.  There were several LISTEN or NOTIFY statements that took longer
than expected to complete (default test script settings of 5 consumers
and a loop of 100):

2008-02-24 23:00:48 PST 7541 LOG:  duration: 514.697 ms  statement: LISTEN to_consumer
2008-02-24 23:00:48 PST 7544 LOG:  duration: 508.790 ms  statement: LISTEN to_consumer
2008-02-24 23:00:48 PST 7543 LOG:  duration: 511.061 ms  statement: LISTEN to_consumer
2008-02-24 23:00:48 PST 7545 LOG:  duration: 506.390 ms  statement: LISTEN to_producer
2008-02-24 23:00:57 PST 7544 LOG:  duration: 400.595 ms  statement: NOTIFY to_producer
2008-02-24 23:00:57 PST 7538 LOG:  duration: 369.018 ms  statement: NOTIFY to_producer
2008-02-24 23:01:03 PST 7544 LOG:  duration: 410.588 ms  statement: NOTIFY to_producer
2008-02-24 23:01:03 PST 7541 LOG:  duration: 300.774 ms  statement: NOTIFY to_producer
2008-02-24 23:01:32 PST 7545 LOG:  duration: 325.380 ms  statement: NOTIFY to_consumer
2008-02-24 23:01:42 PST 7538 LOG:  duration: 349.640 ms  statement: NOTIFY to_producer
2008-02-24 23:01:43 PST 7543 LOG:  duration: 529.700 ms  statement: NOTIFY to_producer

-Joel
>Also, it might be worth enabling log_lock_waits to see if the slow
>notifies are due to having to wait on some lock or other.

Turning on log_lock_waits shows that there is a lot of waiting for
locks on the pg_listener table ala:

process 22791 still waiting for ExclusiveLock on relation 2614 of database 16387 after 992.397 ms
...
process 22791 acquired ExclusiveLock on relation 2614 of database 16387 after 1433.152 ms

deadlock_timeout is left at the default 1s setting.  Though these are
being generated during application activity - running the simulation
script does produce 300ms - 600ms NOTIFY statements but doesn't (at
the moment) trigger a lock_wait log entry.
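One way to confirm, while it is happening, that the waiters are queued on pg_listener (a sketch against the 8.3 catalogs, where pg_stat_activity still exposes procpid and current_query; relation 2614 in the log above is pg_listener's OID):

SELECT l.pid, l.mode, l.granted, a.current_query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
 WHERE l.relation = 'pg_catalog.pg_listener'::regclass;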
Joel Stevenson <joelstevenson@mac.com> writes:
>> Also, it might be worth enabling log_lock_waits to see if the slow
>> notifies are due to having to wait on some lock or other.

> Turning on log_lock_waits shows that there is a lot of waiting for
> locks on the pg_listener table ala:

Interesting.  The LISTEN/NOTIFY mechanism itself takes ExclusiveLock
on pg_listener, but never for very long at a time (assuming pg_listener
doesn't get horribly bloated, which we know isn't happening for you).

Another thought that comes to mind is that maybe the delays you see
come from these lock acquisitions getting blocked behind autovacuums of
pg_listener.  I did not see that while trying to replicate your problem,
but maybe the issue requires more update load on pg_listener than the
test script can create by itself, or maybe some nondefault autovacuum
setting is needed --- what are you using?

			regards, tom lane
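A quick way to answer that question (a sketch; pg_settings is available in 8.3 and readable by any user):

SELECT name, setting
  FROM pg_settings
 WHERE name LIKE 'autovacuum%';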
At 1:13 PM -0500 2/25/08, Tom Lane wrote:
>Joel Stevenson <joelstevenson@mac.com> writes:
>>> Also, it might be worth enabling log_lock_waits to see if the slow
>>> notifies are due to having to wait on some lock or other.
>
>> Turning on log_lock_waits shows that there is a lot of waiting for
>> locks on the pg_listener table ala:
>
>Interesting.  The LISTEN/NOTIFY mechanism itself takes ExclusiveLock
>on pg_listener, but never for very long at a time (assuming pg_listener
>doesn't get horribly bloated, which we know isn't happening for you).
>
>Another thought that comes to mind is that maybe the delays you see
>come from these lock acquisitions getting blocked behind autovacuums of
>pg_listener.  I did not see that while trying to replicate your problem,
>but maybe the issue requires more update load on pg_listener than the
>test script can create by itself, or maybe some nondefault autovacuum
>setting is needed --- what are you using?

Default autovacuum settings.

I turned on all autovacuum logging and cranked up the test script and
have it fork 25 consumers each running 25 iterations.  At that level
on my machine I can get the lock waiting to exceed the 1s
deadlock_timeout right away but the autovacuum activity on
pg_listener is entirely absent until the end when the forked
consumers are mostly done and disconnected.
Joel Stevenson <joelstevenson@mac.com> writes:
> I turned on all autovacuum logging and cranked up the test script and
> have it fork 25 consumers each running 25 iterations.  At that level
> on my machine I can get the lock waiting to exceed the 1s
> deadlock_timeout right away but the autovacuum activity on
> pg_listener is entirely absent until the end when the forked
> consumers are mostly done and disconnected.

Hmph.  At 25/100 I can get a few complaints about NOTIFY taking more
than 20ms, but it seems to be due to blocking behind autovacuum, as in
this example:

2008-02-25 14:53:41.812 EST 13773 LOG:  automatic vacuum of table "joels.pg_catalog.pg_listener": index scans: 0
	pages: 0 removed, 78 remain
	tuples: 5560 removed, 25 remain
	system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
2008-02-25 14:53:41.850 EST 13773 LOG:  automatic analyze of table "joels.pg_catalog.pg_listener" system usage: CPU 0.00s/0.00u sec elapsed 0.03 sec
2008-02-25 14:53:41.851 EST 13728 LOG:  duration: 29.270 ms  statement: NOTIFY to_producer
2008-02-25 14:53:41.852 EST 13758 LOG:  duration: 22.835 ms  statement: NOTIFY to_producer

It's weird that the behavior is robust for you but I can't make it
happen at all.  Would you show the output of pg_config, as well as
all your nondefault postgresql.conf settings?

			regards, tom lane
At 2:57 PM -0500 2/25/08, Tom Lane wrote:
>It's weird that the behavior is robust for you but I can't make it
>happen at all.  Would you show the output of pg_config, as well as
>all your nondefault postgresql.conf settings?

pg_config:

BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR =
MANDIR = /usr/local/pgsql/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = 'CFLAGS=-O2 -pipe' '--with-openssl' '--enable-thread-safety' '--with-includes=/usr/kerberos/include' '--with-perl'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/kerberos/include
CFLAGS = -O2 -pipe -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -fno-strict-aliasing
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/usr/local/pgsql/lib'
LDFLAGS_SL =
LIBS = -lpgport -lssl -lcrypto -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 8.3.0

Non-default postgresql.conf settings:

max_connections = 80
ssl = on
shared_buffers = 1GB
work_mem = 100MB
maintenance_work_mem = 100MB
max_fsm_pages = 204800
vacuum_cost_delay = 100
wal_buffers = 124kB
wal_writer_delay = 200ms
commit_delay = 100
checkpoint_segments = 6
effective_cache_size = 6GB
At 11:58 PM -0500 2/23/08, Tom Lane wrote:
> > Attached is a perl script that sort of simulates what's going on.
>
>Thanks for the script.  It's not showing any particular problems here,
>though.  With log_min_duration_statement = 10, the only statements that
>(slightly) exceed 10ms are the select count(*) from generate_series(1,
>15000) ones.

I tried the test script on another machine (similar but not identical
to the original machine) running 8.3, and although the notify
performance was *much* better than the original I still see
notifications taking longer than the select count(*) from
generate_series(1, 15000) queries, and also longer than some simple
updates to other tables that are also happening on the server.

duration: 10.030 ms  statement: select count(*) from generate_series(1, 15000)
duration: 224.833 ms  statement: NOTIFY to_producer

Perhaps this shouldn't be made much of, as I'm sure there are many ways
that this could quite naturally happen.

I've been thinking of LISTEN / NOTIFY as one of the least expensive and
therefore speedy ways to get the word out to participating processes
that something has changed (versus using a manually set up signals
table that interested parties update and select from).  Now that I see
a little bit more of what goes on under the hood of this function I see
that it's still basically table-driven and I'll adjust my expectations
accordingly, but I'm still puzzled by the hugely slow notifications
showing up on the original server running the producer / consumer
setup.

With ps I can see some postgres backends with a 'notify interrupt
waiting' command line during the tests - could it be an issue with
signal handling on the original machine - something entirely outside
of PG's control?

Thx,
-Joel
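For contrast, the manually maintained signals table alluded to above might look something like this (purely illustrative names; interested parties poll it instead of blocking on a notification):

-- Writer: record that something changed.
UPDATE signals SET version = version + 1 WHERE name = 'queue';

-- Readers: poll and compare against the last version they saw.
SELECT version FROM signals WHERE name = 'queue';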
Joel Stevenson <joelstevenson@mac.com> writes:
> Now that I see a little bit more of what goes on under the hood of
> this function I see that it's still basically table-driven and I'll
> adjust my expectations accordingly,

Yeah, there's been discussion of replacing the implementation with some
all-in-memory queue kind of setup, but no one's got round to that yet.

> With ps I can see some postgres backends with a 'notify interrupt
> waiting' command line during the tests - could it be an issue with
> signal handling on the original machine - something entirely outside
> of PG's control?

No, that's not unexpected if you have the same notify being delivered
to multiple processes that had been idle.  They'll all get wakened and
try to read pg_listener to see what happened, but since this is a
read-modify-write type of operation it uses an exclusive lock, so only
one can clear its pg_listener entry at a time.  The 'waiting' ones you
are seeing are stacked up behind whichever one has the lock at the
moment.  They shouldn't be waiting for long.

I'm still baffled by why we aren't seeing comparable performance for
the same test case.  What I'm testing on is couple-year-old desktop kit
(dual 2.8GHz Xeon, consumer-grade disk drive) --- I had assumed your
server would be at least as fast as that, but maybe not?

			regards, tom lane
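Roughly what each awakened listener has to do, written out as SQL purely for illustration (the real work happens in C inside the backend, under an ExclusiveLock on pg_listener, which is why the 'waiting' backends queue up behind one another):

-- Find out which of my registrations were signalled...
SELECT relname FROM pg_listener
 WHERE listenerpid = pg_backend_pid() AND notification <> 0;

-- ...and reset them once the client has been told.
UPDATE pg_listener SET notification = 0
 WHERE listenerpid = pg_backend_pid() AND notification <> 0;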
At 12:43 PM -0500 2/26/08, Tom Lane wrote:
>I'm still baffled by why we aren't seeing comparable performance for the
>same test case.  What I'm testing on is couple-year-old desktop kit
>(dual 2.8GHz Xeon, consumer-grade disk drive) --- I had assumed your
>server would be at least as fast as that, but maybe not?

It's a quad-core Xeon 3.0GHz machine, 7200rpm SATA discs in a software
RAID.  It's not a particularly high performance machine in terms of
disc IO - but again the comparative speed of most select-update-commit
queries to plain notifies on the server seems off.

What's really baffling is that there are plenty of other OLTP queries
going in multiple backends simultaneously that don't fall over my
300ms query log threshold, and yet NOTIFY and LISTEN consistently do.
What's more, it looks like it's only happening for registered
listener relnames.

This is while the server processes are alive but inactive:

joels=# \timing
Timing is on.
joels=# select * from pg_listener;
    relname     | listenerpid | notification
----------------+-------------+--------------
 alert_inbound  |       15013 |            0
 alert_inbound  |       13371 |            0
 alert_inbound  |       26856 |            0
 alert_inbound  |       12016 |            0
 alert_inbound  |       26911 |            0
 alert_inbound  |       11956 |            0
 alert_process  |       13365 |            0
 alert_inbound  |       26855 |            0
 alert_inbound  |       12248 |            0
 alert_inbound  |       13367 |            0
 alert_inbound  |       12304 |            0
 alert_inbound  |       32633 |            0
 alert_inbound  |       30979 |            0
 alert_inbound  |       29290 |            0
 alert_inbound  |       30394 |            0
 alert_inbound  |       14490 |            0
 alert_inbound  |       14491 |            0
 alert_inbound  |       14492 |            0
(18 rows)

Time: 0.402 ms
joels=# notify foo;
NOTIFY
Time: 0.244 ms
joels=# notify foo2;
NOTIFY
Time: 0.211 ms
joels=# notify alert_process;
NOTIFY
Time: 34.585 ms
joels=# notify alert_process;
NOTIFY
Time: 45.554 ms
joels=# notify alert_inbound;
NOTIFY
Time: 40.868 ms
joels=# notify alert_inbound;
NOTIFY
Time: 176.309 ms
joels=# notify alert_inbound;
NOTIFY
Time: 36.669 ms
joels=# notify alert_inbound;
NOTIFY
Time: 369.761 ms
joels=# notify alert_inbound;
NOTIFY
Time: 34.449 ms
joels=# notify alert_inbound;
NOTIFY
Time: 121.990 ms
joels=# notify foo3;
NOTIFY
Time: 0.250 ms
joels=# notify foo2;
NOTIFY
Time: 0.175 ms

There are no autovacuum log entries prior to or immediately after the
369ms notify command.

-Joel
Tom Lane wrote:
> read-modify-write type of operation it uses an exclusive lock, so only
> one can clear its pg_listener entry at a time.  The 'waiting' ones you
> are seeing are stacked up behind whichever one has the lock at the
> moment.  They shouldn't be waiting for long.
>
I certainly hadn't expected that to be the implementation technique -
isn't it simply that we need a single flag per worker process and can
set/test-and-clear with atomic operations and then a signal to wake
them up?

Anyway - how hard would it be to install triggers on commit and
rollback?  Then we could write our own mechanisms.

James
Joel Stevenson <joelstevenson@mac.com> writes:
> What's really baffling is that there are plenty of other OLTP queries
> going in multiple backends simultaneously that don't fall over my
> 300ms query log threshold, and yet NOTIFY and LISTEN consistently do.
> What's more, it looks like it's only happening for registered
> listener relnames.

Hmm, that says that it's not a matter of locking on pg_listener, but of
actually applying the row update(s) and/or signaling the recipient(s).
If you're not seeing performance issues for ordinary table-update
operations it's hard to see why pg_listener updates would be any worse,
so that seems to point the finger at the signaling.  Which is just a
matter of a kill(2) and shouldn't be that expensive.

It might be interesting to try strace'ing the whole PG process tree
while these notifies are going on, and seeing if you can identify any
specific kernel calls that seem to take a long time.

			regards, tom lane
James Mansion <james@mansionfamily.plus.com> writes:
> I certainly hadn't expected that to be the implementation technique -
> isn't it simply that we need a single flag per worker process and can
> set/test-and-clear with atomic operations and then a signal to wake
> them up?

Hardly --- how's that going to pass a notify name?  Also, a lot of
people want some payload data in a notify, not just a condition name;
any reimplementation that doesn't address that desire probably won't
get accepted.

There's lots of threads in the -hackers archives about reimplementing
listen/notify in a saner fashion.  Personally I lean towards using
something much like the sinval queue.

			regards, tom lane
Tom Lane wrote:
> Hardly --- how's that going to pass a notify name?  Also, a lot of
> people want some payload data in a notify, not just a condition name;
> any reimplementation that doesn't address that desire probably won't
> get accepted.
>
Ah - forgot about the name.  At least there need be just one instance
of a name record queued per worker, if I'm reading the documentation
right - it suggests that notifications can be folded, with the proviso
that if the process generates a notification and at least one other
process generates a notification then it will get at least (but
possibly only) two events.  Not sure why the PID is there rather than
a couple of flag bits.

You'll always have the danger of overflowing a shm area and needing to
spill: is the signal and then lookup in storage materially quicker
than using the master process to route messages via pipes?  As you
say, you have a lock contention issue, and often the total signal data
volume outstanding for a single back end will be less than will fit in
a kernel's pipe buffer.  The sending processes can track what signals
they've generated in the current transaction, so the master (or signal
distributor) needn't get bombarded with signals from lots of rows
within one transaction.
At 6:01 PM -0500 2/26/08, Tom Lane wrote:
>Hmm, that says that it's not a matter of locking on pg_listener,
>but of actually applying the row update(s) and/or signaling the
>recipient(s).  If you're not seeing performance issues for ordinary
>table-update operations it's hard to see why pg_listener updates would
>be any worse, so that seems to point the finger at the signaling.
>Which is just a matter of a kill(2) and shouldn't be that expensive.
>
>It might be interesting to try strace'ing the whole PG process tree
>while these notifies are going on, and seeing if you can identify
>any specific kernel calls that seem to take a long time.

I ran PG via strace and then ran the test script at 25 consumers
looping 25 times each.  There were no other connections to the
database for this strace run and test.

Digging through the strace file is a bit mind-numbing, but here are
some signs that semop and send are the culprits.

Three misc examples coming from near LISTENs or NOTIFYs:
- - - - - -
7495  18:10:40.251855 <... semop resumed> ) = 0 <1.006149>
7495  18:10:41.325442 <... semop resumed> ) = -1 EINTR (Interrupted system call) <0.299135>
7495  18:10:41.998219 <... semop resumed> ) = 0 <0.603566>

A chunk of log following the action on fd 7 (slow recv on ERESTARTSYS)
and then the slow semop that follows:
- - - - - - - -
7495  18:10:42.576401 send(7, "C\0\0\0\vNOTIFY\0Z\0\0\0\5I", 18, 0 <unfinished ...>
7495  18:10:42.576503 <... send resumed> ) = 18 <0.000070>
7495  18:10:42.576620 recv(7,  <unfinished ...>
7495  18:10:42.873796 <... recv resumed> 0x8331d40, 8192, 0) = ? ERESTARTSYS (To be restarted) <0.297158>
7495  18:10:42.873911 --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
7495  18:10:42.874079 gettimeofday( <unfinished ...>
7495  18:10:42.874198 <... gettimeofday resumed> {1204078242, 874097}, NULL) = 0 <0.000101>
7495  18:10:42.874324 setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={1, 0}},  <unfinished ...>
7495  18:10:42.874470 <... setitimer resumed> NULL) = 0 <0.000121>
7495  18:10:42.874604 semop(50495522, 0xbfff9764, 1 <unfinished ...>
7495  18:10:43.678431 <... semop resumed> ) = 0 <0.803809>

A little further on:
- - - - - - - -
7495  18:10:44.905320 <... semop resumed> ) = -1 EINTR (Interrupted system call) <0.998192>

I'm not sure exactly what that means in terms of next steps.  I'll dig
more through the strace file and see if I can find anything else, but
those look to be definite bottlenecks for some reason.

At 2:24 PM -0800 2/26/08, Maurice Aubrey wrote:
>What's the OS/Dist?

Red Hat Enterprise Linux ES release 3 (Taroon Update 8)

Thx,
-Joel