Re: On-demand running query plans using auto_explain and signals - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: On-demand running query plans using auto_explain and signals
Date
Msg-id CAFj8pRAB4otjEtv2eEkEyX1YCmDg+5JCEH0GeabKosjopN52Cw@mail.gmail.com
In response to Re: On-demand running query plans using auto_explain and signals  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: On-demand running query plans using auto_explain and signals  ("Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de>)
Re: On-demand running query plans using auto_explain and signals  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


2015-09-17 16:46 GMT+02:00 Robert Haas <robertmhaas@gmail.com>:
On Thu, Sep 17, 2015 at 10:28 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Please, can you explain what is wrong with using shm_mq? It works pretty
> well now.
>
> There can also be a counter-argument against not using shm_mq: it already
> handles data exchange between processes, and this patch would introduce a
> new pattern for the same thing.

Sure, I can explain that.

First, when you communicate through a shm_mq, the writer has to wait
when the queue is full, and the reader has to wait when the queue is
empty.  If the message is short enough to fit in the queue, then you
can send it and be done without blocking.  But if it is long, then you
will have each process repeatedly waiting for the other one until the
whole message has been sent.  That is going to make sending the
message potentially a lot slower than writing it all in one go,
especially if the system is under heavy load.
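In code, that round trip looks roughly like this - a minimal sketch against
the shm_mq API as it stands around 9.4/9.5 (the helper name is invented, and
shm_mq_detach() takes the handle rather than the queue in newer branches):

#include "postgres.h"
#include "storage/dsm.h"
#include "storage/shm_mq.h"

/* Sketch only: send one (possibly large) message over an shm_mq. */
static void
send_plan_text(dsm_segment *seg, shm_mq *mq, const char *text, Size len)
{
    shm_mq_handle *mqh = shm_mq_attach(mq, seg, NULL);
    shm_mq_result  res;

    /*
     * With nowait = false, shm_mq_send() returns only once the whole
     * message has been copied into the ring buffer.  If len exceeds the
     * queue size, the call waits on the process latch every time the ring
     * fills up and resumes when the reader has drained some bytes, which
     * is exactly the back-and-forth described above.
     */
    res = shm_mq_send(mqh, len, text, false);
    if (res != SHM_MQ_SUCCESS)
        elog(WARNING, "reader detached before the message was sent");

    shm_mq_detach(mq);
}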


Is there some risk if we allocate a DSM segment that is too large? And what is the maximum size of a DSM segment? When we use shm_mq, we don't need to worry about such limits.
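What I mean is the one-shot alternative, roughly like this hypothetical
sketch (the function name and the handle handoff are invented; dsm_create()'s
flags argument only exists in newer branches):

#include "postgres.h"
#include "storage/dsm.h"

/* Sketch only: put the whole serialized plan into a single DSM segment. */
static dsm_handle
publish_plan_text(const char *plan, Size len)
{
    dsm_segment *seg = dsm_create(len + 1, 0);  /* sized to the payload */
    char        *ptr = dsm_segment_address(seg);

    memcpy(ptr, plan, len);
    ptr[len] = '\0';

    /*
     * The handle would then be passed to the other backend through some
     * small fixed-size shmem slot; that backend maps the segment with
     * dsm_attach(handle) and reads the text without any waiting.
     */
    return dsm_segment_handle(seg);
}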
 

Also, if there are any bugs in the way the shm_mq is being used,
they're likely to be quite rare and hard to find, because the vast
majority of messages will probably be short enough to be sent in a
single chunk, so whatever bugs may exist when the processes play
ping-pong are unlikely to occur in practice except in unusual cases
where the message being returned is very long.

This is true for any functionality based on shm_mq, such as parallel seq scan.
 

Second, using a shm_mq manipulates the state of the process latch.  I
don't think you can make the assumption that it's safe to reset the
process latch at any and every place where we check for interrupts.
For example, suppose the process is already using a shm_mq and the
CHECK_FOR_INTERRUPTS() call inside that code then discovers that
somebody has activated this mechanism and you now go try to send and
receive from a new shm_mq.  But even if that and every other
CHECK_FOR_INTERRUPTS() in the code can tolerate a process latch reset
today, it's a new coding rule that could easily trip people up in the
future.
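To make the coding rule concrete, the canonical latch loop (per the comments
in latch.h) is roughly the following; work_is_available() is a made-up
placeholder, and the WaitLatch() signature has grown extra arguments in later
releases:

for (;;)
{
    ResetLatch(MyLatch);

    if (work_is_available())        /* placeholder condition */
        break;

    /*
     * If a CHECK_FOR_INTERRUPTS() reached from inside this loop (or from
     * inside shm_mq code) started its own shm_mq conversation and reset
     * the process latch, a SetLatch() from the process we are actually
     * waiting for could be absorbed by that nested reset, and the
     * WaitLatch() below would then sleep through the wakeup.
     */
    CHECK_FOR_INTERRUPTS();

    WaitLatch(MyLatch, WL_LATCH_SET, 0);
}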


That is a valid point, and probably the most important one. But if we introduce our own mechanism, we will have to play with the process latch too (although we could use LWLocks instead).
 
Using a shm_mq is appropriate when the amount of data that needs to be
transmitted might be very large - too large to just allocate a buffer
for the whole thing - or when the amount of data can't be predicted
before memory is allocated.  But there is obviously no rule that a
shm_mq should be used any time we have "data exchange between
processes"; we have lots of shared-memory based IPC already and have
for many years, and shm_mq is newer than the vast majority of that
code.
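For comparison, the pre-shm_mq style of IPC referred to here typically looks
like this hypothetical sketch: a fixed-size slot in the main shared-memory
area under a lock, which is simple but forces a hard size limit to be chosen
up front (all names are invented):

#include "postgres.h"
#include "storage/shmem.h"
#include "storage/spin.h"

#define PLAN_SLOT_SIZE 8192         /* hard limit, decided in advance */

typedef struct PlanSlot
{
    slock_t     mutex;              /* protects the fields below */
    int         target_pid;        /* backend asked for its plan */
    Size        len;               /* bytes of plan text stored */
    char        data[PLAN_SLOT_SIZE];
} PlanSlot;

static PlanSlot *planSlot;

/* Called from shared-memory initialization. */
static void
plan_slot_shmem_init(void)
{
    bool        found;

    planSlot = ShmemInitStruct("plan slot", sizeof(PlanSlot), &found);
    if (!found)
    {
        memset(planSlot, 0, sizeof(PlanSlot));
        SpinLockInit(&planSlot->mutex);
    }
}

The attraction of shm_mq is precisely that it removes both the fixed limit
and the hand-rolled locking; the cost is the streaming behaviour discussed
above.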

I am a little bit disappointed - I had hoped that shm_mq could be used as a generic interprocess communication mechanism that takes care of all the corner cases of working with shared memory. I understand that shm_mq is new and not yet widely used, but that risk is better than reinventing the wheel again and again.

Regards

Pavel


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
