Re: On-demand running query plans using auto_explain and signals - Mailing list pgsql-hackers

From Shulgin, Oleksandr
Subject Re: On-demand running query plans using auto_explain and signals
Date
Msg-id CACACo5Rmob7aGP0y9zn8ZUcRWuNnV7swk2ugY9pPcEVziP_3yQ@mail.gmail.com
In response to Re: On-demand running query plans using auto_explain and signals  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: On-demand running query plans using auto_explain and signals  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
On Fri, Sep 4, 2015 at 6:11 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Sorry, but I still don't see how the slots help this issue - could you please elaborate?

With slots (or something similar) there is no globally locked resource. If I have time this weekend, I'll try to write a prototype.

But you will still lock on the slots list to find an unused one.  How is that substantially different from what I'm doing?

>> Other smaller issues:
>>
>> * probably sending line by line is useless - shm_mq_send can pass bigger data when nowait = false

I'm not sending it like that because of the message size - I just find it more convenient. If you think it can be problematic, it's easy to do this as before, by splitting lines on the receiving side.

Yes, the shm queue sends data immediately, so slicing on the sender side generates more interprocess communication.

Well, we are talking about hundreds to thousands of bytes per plan in total.  And if my reading of the shm_mq implementation is correct, when a message fits into the shared memory buffer the receiver gets a direct pointer into shared memory, with no extra allocation or copy to process-local memory.  So this can actually be a win.

>> * pg_usleep(1000L); - it is related to single point resource

But not a highly concurrent one.

I believe it is not necessary: the waiting (sleeping) can happen deeper, in the read from the queue, and the code will be cleaner.

The only way I expect this line to be reached is when a concurrent pg_cmdstatus() call is in progress: the receiving backend has set the target_pid, created the queue, released the lock, and now waits to read something from shm_mq.  So another backend that is also trying to use this communication channel obtains the lwlock, checks whether the channel is in use, finds that it is, and has to check again.  Rechecking in a tight loop would put a load on the CPU, so there's a small sleep.
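
To make the retry concrete, here is a rough, self-contained sketch of the check-and-sleep loop. It is not the patch code: a pthread mutex stands in for the lwlock, usleep() for pg_usleep(), and the names CmdStatusChannel, claim_channel, release_channel and max_attempts are made up for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for the shared state guarded by the lwlock. */
typedef struct CmdStatusChannel
{
    pthread_mutex_t lock;       /* stands in for the lwlock */
    int             target_pid; /* 0 means the channel is free */
} CmdStatusChannel;

/*
 * Try to claim the channel for target_pid.  While another caller holds it,
 * release the lock, sleep for 1 ms and check again.  Returns true once the
 * channel is claimed, false if it was still busy after max_attempts checks.
 */
static bool
claim_channel(CmdStatusChannel *ch, int target_pid, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++)
    {
        bool claimed = false;

        pthread_mutex_lock(&ch->lock);
        if (ch->target_pid == 0)
        {
            ch->target_pid = target_pid;
            claimed = true;
        }
        pthread_mutex_unlock(&ch->lock);

        if (claimed)
            return true;

        usleep(1000);           /* analogous to pg_usleep(1000L) */
    }
    return false;
}

/* Mark the channel as free again once the reply has been read. */
static void
release_channel(CmdStatusChannel *ch)
{
    pthread_mutex_lock(&ch->lock);
    ch->target_pid = 0;
    pthread_mutex_unlock(&ch->lock);
}
```

The lock is held only for the check itself, never across the sleep, so the current owner can always get in to release the channel.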

The real problem could be if the process that was signaled to connect to the message queue never handles the interrupt, and we keep waiting forever in shm_mq_receive().  We could add a timeout parameter or just let the user cancel the call: send a cancellation request, use pg_cancel_backend() or set statement_timeout before running this.

--
Alex
