Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries
Date
Msg-id CAA4eK1JhHbMPh+m5R9jyKU+pQEENBzGn6pnUVKjAfPeoqopZNg@mail.gmail.com
In response to Re: [HACKERS] [PATCH v2] Progress command to monitor progression of long running SQL queries  (Remi Colinet <remi.colinet@gmail.com>)
List pgsql-hackers
On Wed, May 17, 2017 at 9:43 PM, Remi Colinet <remi.colinet@gmail.com> wrote:
>
> 2017-05-13 14:38 GMT+02:00 Amit Kapila <amit.kapila16@gmail.com>:
>>
>> On Wed, May 10, 2017 at 10:10 PM, Remi Colinet <remi.colinet@gmail.com>
>> wrote:
>> >
>> > Parallel queries can also be monitored. The same mechanism is used to
>> > monitor child workers, with a slight difference: the main worker
>> > requests the child progression directly in order to dump the whole
>> > progress tree in shared memory.
>> >
>>
>> What if there is any error in the worker (like "out of memory") while
>> gathering the statistics?  It seems both for workers as well as for
>> the main backend it will just error out.  I am not sure if it is a
>> good idea to error out the backend or parallel worker as it will just
>> end the query execution.
>
>
> The handling of a progress report starts with the creation of a
> MemoryContext attached to CurrentMemoryContext. Then a small amount of
> memory (a few KB) is allocated. That said, the handling of the progress
> report could indeed exhaust memory and fail the backend request. But in
> such a situation, the backend could also have failed even without any
> progress request.
>
>>
>> Also, even if it is okay, there doesn't seem
>> to be a way by which a parallel worker can communicate the error back
>> to master backend, rather it will just exit silently which is not
>> right.
>
>
> If a child worker fails, it will not respond to the main backend request.
> The main backend will resume its execution after a 5-second timeout (a GUC
> parameter may be added for this). In that case, the report would be only
> partially filled.
>
> If the main backend fails, the requesting backend will get a response such
> as:
>
> test=# PROGRESS 14611;
>  PLAN PROGRESS
> ----------------
>  <backend timeout>
> (1 row)
>
> test=#
>
> and the child workers will write their responses to shared memory. These
> responses will not be collected by the main backend, which has failed.
>

If a worker errors out for any reason, we should end the main
query as well; otherwise, it can produce wrong results.  See the error
handling of workers in HandleParallelMessage.
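
To illustrate the point, here is a minimal sketch in Python (not
PostgreSQL code) of a leader that re-raises a worker's error rather than
carrying on with a partial result. The message tags "E" and "D", and the
names WorkerError, drain_worker_messages and run_query, are all
hypothetical and chosen only for this example.

```python
import queue


class WorkerError(Exception):
    """Raised in the leader when a worker reports an error."""


def drain_worker_messages(mq):
    """Consume pending worker messages; re-raise a worker error in the leader."""
    rows = []
    while True:
        try:
            msgtype, payload = mq.get_nowait()
        except queue.Empty:
            return rows
        if msgtype == "E":
            # Hypothetical tag: error report from a worker.
            # The leader must not continue with a partial result.
            raise WorkerError(payload)
        elif msgtype == "D":
            # Hypothetical tag: a data row produced by a worker.
            rows.append(payload)


def run_query(mq):
    """Return ("ok", rows), or ("aborted", reason) if any worker failed."""
    try:
        return ("ok", drain_worker_messages(mq))
    except WorkerError as err:
        return ("aborted", str(err))
```

With this shape, a single "E" message ends the whole query, even if the
worker had already sent some rows. Timing out and returning whatever was
collected would instead silently drop that worker's remaining output.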


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


