Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries - Mailing list pgsql-hackers

From Remi Colinet
Subject Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries
Date
Msg-id CADdR5nzim+-oWw94G=ThsLRyyWtm1GobBu32ZyyauqMC-wyWLA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers

2017-05-16 8:17 GMT+02:00 Michael Paquier <michael.paquier@gmail.com>:
On Sat, May 13, 2017 at 10:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 10, 2017 at 10:05 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Regarding the patch, this is way to close to the progress facility
>> already in place. So why don't you extend it for the executor?
>
> I don't think that's a good idea.  The existing progress facility
> advertises a fixed number of counters with a command-type-specific
> interpretation, but for *query* progress reporting, we really need a
> richer model.  A query plan can contain an unbounded number of
> executor nodes and Remi's idea is, quite reasonably, to report
> something about each one.

I see what you're coming at. As st_progress_param is upper-bounded,
what should be actually used as report structure is a tree made of
nodes with one planner node attached to each node of the report tree.
Then the reporting facility should need to make persistent the data
reported.
 which is in this case a set of planner nodes, being able to also make
persistent the changes on the tree reported to the user. Then when a
backend reports its execution status, what it does is just updating a
portion of the tree in shared memory. Lock contention may be a problem
though, perhaps things could be made atomic?

In order to report progress of a SQL query, the progress request handler walks the execution tree of the query. No new node is ever created. The execution tree has been slightly modified to collect the number of rows/blocks returned during the executor run. This collection of the progress request state is only done when it is requested by a user, and it is dump at this moment in shared memory.

As long as nobody asks for a progress report, there is no overhead unless a few counters updated during executor run.

Regards
 

> From a design point of view, I think a patch like this has to clear a
> number of hurdles.  It needs to:
>
> 1. Decide what data to expose.  The sample output looks reasonable in
> this case, although the amount of code churn looks really high.
>
> 2. Find a way to advertise the data that it exposes.  This patch looks
> like it allocates a half-MB of shared memory per backend and gives up
> if that's not enough.

Perhaps DSM? It is not user-friendly to fail sporadically...

> 3. Find a way to synchronize the access to the data that it exposes.
> This patch ignores that problem completely, so multiple progress
> report commands running at the same time will behave unpredictably.

Yeah..
--
Michael

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Pulling up more complicated subqueries
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] COPY FROM STDIN behaviour on end-of-file