Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries - Mailing list pgsql-hackers

From Remi Colinet
Subject Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries
Date
Msg-id CADdR5nxZ8kxMpmcnAXnKYUe8Mdx6J-Hxo08sPLBqQdfYk=18rQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers


2017-05-11 4:05 GMT+02:00 Michael Paquier <michael.paquier@gmail.com>:
On Thu, May 11, 2017 at 1:40 AM, Remi Colinet <remi.colinet@gmail.com> wrote:
> This is version 2 of the new command name PROGRESS which I wrote in order to
> monitor long running SQL queries in a Postgres backend process.

It would be a good idea to add this patch to the next commit fest if
you are expecting some feedback:
https://commitfest.postgresql.org/14/
But please don't expect immediate feedback, the primary focus these
days is to stabilize the upcoming release.

Yes, I understand. I will submit the patch to the commit fest once I will have had a few feedbacks.
I already received some interesting suggestions such as using a SQL function instead of a whole new command.

The purpose for the moment is to gather feedback and advise about such feature.

> New command justification
> ====================
>
> [root@rco pg]# git diff --stat master..
> [...]
>  58 files changed, 5972 insertions(+), 2619 deletions(-)

For your first patch, you may want something less... Challenging.
Please note as well that it would be nice if you review other patches
to get more familiar with the community process, it is expected that
for each patch submitted, an equivalent amount of review is done.
That's hard to measure but for a patch as large as that you will need
to pick up at least one patch equivalent in difficulty.

Acked


Regarding the patch, this is way to close to the progress facility
already in place. So why don't you extend it for the executor? We can
likely come up with something that can help, though I see the part
where the plan run by a backend is shared among all processes as a
major bottleneck in performance. You have at least three concepts
different here:
- Store plans run in shared memory to allow access to any other processes.
- Allow a process to look at the plan run by another one with a SQL interface.
- Track the progress of a running plan, via pg_stat_activity.
All in all, I think that a new command is not adapted, and that you
had better split each concept before implementing something
over-engineered like the patch attached.

I initially had a look at the progress facility but this seemed to me very far from the current goal.
Btw, I will reconsider it. Things may have changed.

The plan and its planstate are not permanently shared with other backends.
This would be useless and very costly.

As long as the PROGRESS command is not used, the patch has a very low impact on the executor.
A few counters and flags are added/updated by the patch in the executor. This is all what is needed to track progress of execution.

For instance in src/backend/executor/execProcnode.c, I've added:

ExecInitNode()
+ result->plan_rows = 0;

ExecProcNode()
+ node->plan_rows++;

ExecEndNode()
+ node->plan_rows = 0;

This is enough to compute the number of rows fetched by the plan at this step of the plan.
A few counters have been added to track sorts, scans, hashes, tuple stores.

The disk space used by nodes is also traced. This shows disk resource needed to complete a SQL query.
This was not initially a purpose of the patch, but I find this very handy.

Some debugging code will also be removed making the change in the executor very small.
The code change in the executor is less than 50 lines and it is about 150 lines in the tuple sort/store code mainly to add functions to fetch sort and store status.

The execution tree is dumped in shared memory only when requested by a monitoring backend sending a signal to the monitored backend.
The monitoring backend first sends a signal which is noted by the backend running the long SQL query.
Once the long SQL query can deal with the request (interrupt enabled), it dumps its execution tree in shared memory and then follows up its execution.

So the dump of the execution tree is done only when it is requested.
This does not seem to incur any performance hit. Or I missed something.

Once the concept is mature, I would split the patch in small consistent parts to allow a smooth and clean merge.

Regards
Remi
  
--
Michael

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [HACKERS] [POC] hash partitioning
Next
From: Remi Colinet
Date:
Subject: Re: [HACKERS] [PATCH v2] Progress command to monitor progression oflong running SQL queries