Re: Logging parallel worker draught - Mailing list pgsql-hackers

From	Benoit Lobréau
Subject	Re: Logging parallel worker draught
Date	October 12, 2023 13:01:46
Msg-id	11e34b80-b0a6-e2e4-1606-1f5077379a34@dalibo.com
In response to	Re: Logging parallel worker draught ("Imseih (AWS), Sami" <simseih@amazon.com>)
Responses	Re: Logging parallel worker draught
List	pgsql-hackers

On 10/11/23 17:26, Imseih (AWS), Sami wrote:

Thank you for resurrecting this thread.

>> Well, if you read Benoit's earlier proposal at [1] you'll see that he
>> does propose to have some cumulative stats; this LOG line he proposes
>> here is not a substitute for stats, but rather a complement.  I don't
>> see any reason to reject this patch even if we do get stats.

I believe both cumulative statistics and logs are needed. Logs excel at
pinpointing specific queries at precise times, while statistics provide
a broader overview of the situation. Additionally, I often encounter
situations where clients don't have pg_stat_statements installed and
can't restart their production instance promptly to enable it.

> Regarding the current patch, the latest version removes the separate GUC,
> but the user should be able to control this behavior.

I created this patch in response to Amit Kapila's proposal to keep the 
discussion ongoing. However, I still favor the initial version with the 
GUCs.

> Query text is logged when  log_min_error_statement > default level of "error".
> 
> This could be especially problematic when there is a query running more than 1 Parallel
> Gather node that is in draught. In those cases each node will end up
> generating a log with the statement text. So, a single query execution could end up
> having multiple log lines with the statement text.
> ...
> I wonder if it will be better to accumulate the total # of workers planned and # of workers launched and
> logging this information at the end of execution?

log_temp_files exhibits similar behavior when a query involves multiple 
on-disk sorts. I'm uncertain whether this is something we should or need 
to address. I'll explore whether the error message can be made more 
informative.

[local]:5437 postgres@postgres=# SET work_mem to '125kB';
[local]:5437 postgres@postgres=# SET log_temp_files TO 0;
[local]:5437 postgres@postgres=# SET client_min_messages TO log;
[local]:5437 postgres@postgres=# WITH a AS ( SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 ) , b AS (SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 ) SELECT * FROM a,b;
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 122880 => First sort
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.19", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.23", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.22", size 122880 => Second sort
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.21", size 140000
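
To illustrate what a single accumulated line might carry instead of one line per file, here is a rough Python sketch that totals the temporary-file LOG lines from the session above. It is only an illustration and not part of the patch: the function name summarize_temp_files is hypothetical, and it groups by the backend PID embedded in the file name (pgsql_tmpPID.N), which is not the same as grouping per query.

```python
import re
from collections import defaultdict

# Matches log_temp_files output such as:
#   LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 122880
# The PID is taken from the file name; \s+ tolerates wrapped lines.
TEMP_FILE_RE = re.compile(
    r'temporary file: path "[^"]*pgsql_tmp(?P<pid>\d+)\.\d+",\s+size\s+(?P<size>\d+)'
)


def summarize_temp_files(log_text):
    """Return {backend_pid: (file_count, total_bytes)} for the given log text.

    Hypothetical helper: it aggregates per backend PID, not per statement.
    """
    totals = defaultdict(lambda: [0, 0])
    for match in TEMP_FILE_RE.finditer(log_text):
        entry = totals[int(match.group("pid"))]
        entry[0] += 1                        # one more temporary file
        entry[1] += int(match.group("size"))  # accumulate its size in bytes
    return {pid: (count, size) for pid, (count, size) in totals.items()}


# The five LOG lines from the session above, unwrapped.
SAMPLE = """\
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 122880
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.19", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.23", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.22", size 122880
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.21", size 140000
"""

if __name__ == "__main__":
    for pid, (count, size) in summarize_temp_files(SAMPLE).items():
        print(f"backend {pid}: {count} temporary files, {size} bytes total")
```

For the sample above this reports 5 files and 665760 bytes for backend 138850. Attributing files to individual statements would need more context than these lines contain, for example a log_line_prefix that identifies the session and statement.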

-- 
Benoit Lobréau
Consultant
http://dalibo.com
