Thread: Logging parallel worker draught
Hi hackers,

Following my previous mail about adding stats on parallelism [1], this patch introduces the log_parallel_worker_draught parameter, which controls whether a log message is produced when a backend attempts to spawn a parallel worker but fails due to insufficient worker slots. The shortage can stem from insufficient settings for max_worker_processes, max_parallel_workers or max_parallel_maintenance_workers. It could also be caused by another pool (max_logical_replication_workers) or an extension using bgworker slots.

This new parameter can help database administrators and developers diagnose performance issues related to parallelism and optimize the configuration of the system accordingly.

Here is a test script:

```
psql << _EOF_

SET log_parallel_worker_draught TO on;

-- Index creation
DROP TABLE IF EXISTS test_pql;
CREATE TABLE test_pql(i int, j int);
INSERT INTO test_pql SELECT x,x FROM generate_series(1,1000000) as F(x);

SET max_parallel_workers TO 0;
CREATE INDEX ON test_pql(i);
REINDEX TABLE test_pql;
RESET max_parallel_workers;

-- VACUUM
CREATE INDEX ON test_pql(j);
CREATE INDEX ON test_pql(i,j);
ALTER TABLE test_pql SET (autovacuum_enabled = off);
DELETE FROM test_pql WHERE i BETWEEN 1000 AND 2000;

SET max_parallel_workers TO 1;
VACUUM (PARALLEL 2, VERBOSE) test_pql;
RESET max_parallel_workers;

-- SELECT
SET min_parallel_table_scan_size TO 0;
SET parallel_setup_cost TO 0;
SET max_parallel_workers TO 1;
EXPLAIN (ANALYZE) SELECT i, avg(j) FROM test_pql GROUP BY i;
_EOF_
```

Which produces the following logs:

```
LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 1
STATEMENT:  CREATE INDEX ON test_pql(i);

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 1
STATEMENT:  REINDEX TABLE test_pql;

LOG:  Parallel Worker draught during statement execution: workers spawned 1, requested 2
CONTEXT:  while scanning relation "public.test_pql"
STATEMENT:  VACUUM (PARALLEL 2, VERBOSE) test_pql;

LOG:  Parallel Worker draught during statement execution: workers spawned 1, requested 2
STATEMENT:  EXPLAIN (ANALYZE) SELECT i, avg(j) FROM test_pql GROUP BY i;
```

[1] https://www.postgresql.org/message-id/d657df20-c4bf-63f6-e74c-cb85a81d0383@dalibo.com

-- 
Benoit Lobréau
Consultant
http://dalibo.com
Attachment
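For readers who want to see which of the pools mentioned in the message above are in play on a given instance, a query along these lines can help. This is a minimal sketch against the standard pg_settings view; which GUCs matter for a particular workload is an assumption, and extensions may reserve bgworker slots that this query does not show.

```sql
-- Inspect the worker pools that can cause the shortage described above.
SELECT name, setting
FROM pg_settings
WHERE name IN ('max_worker_processes',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'max_parallel_maintenance_workers',
               'max_logical_replication_workers');
```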
On Fri, Apr 21, 2023 at 6:34 PM Benoit Lobréau <benoit.lobreau@dalibo.com> wrote: > > Following my previous mail about adding stats on parallelism[1], this > patch introduces the log_parallel_worker_draught parameter, which > controls whether a log message is produced when a backend attempts to > spawn a parallel worker but fails due to insufficient worker slots. > I don't think introducing a GUC for this is a good idea. We can directly output this message in the server log either at LOG or DEBUG1 level. -- With Regards, Amit Kapila.
On 4/22/23 13:06, Amit Kapila wrote: > I don't think introducing a GUC for this is a good idea. We can > directly output this message in the server log either at LOG or DEBUG1 > level. Hi, Sorry for the delayed answer, I was away from my computer for a few days. I don't mind removing the GUC, but I would like to keep it at the LOG level. When I do audits, most clients are at that level, and setting it to DEBUG1 would add too much log volume for them in the long run. I'll post the corresponding patch asap. -- Benoit Lobréau Consultant http://dalibo.com
On Sat, Apr 22, 2023 at 7:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > I don't think introducing a GUC for this is a good idea. We can > directly output this message in the server log either at LOG or DEBUG1 > level. Why not? It seems like something some people might want to log and others not. Running the whole server at DEBUG1 to get this information doesn't seem like a suitable answer. What I was wondering was whether we would be better off putting this into the statistics collector, vs. doing it via logging. Both approaches seem to have pros and cons. -- Robert Haas EDB: http://www.enterprisedb.com
On 5/1/23 18:33, Robert Haas wrote: > Why not? It seems like something some people might want to log and > others not. Running the whole server at DEBUG1 to get this information > doesn't seem like a suitable answer. Since the statement is also logged, it could spam the log with huge queries, which might also be a reason to stop logging this kind of problem until the issue is fixed. I attached the patch without the GUC anyway, just in case. > What I was wondering was whether we would be better off putting this > into the statistics collector, vs. doing it via logging. Both > approaches seem to have pros and cons. We tried to explore different options with my colleagues in another thread [1]. I feel like both solutions are worthwhile and would be helpful. I plan to take a look at the pg_stat_statements patch [2] next. Since it's my first patch, I elected to choose the easiest solution to implement first. I also proposed this because I think logging can help pinpoint a lot of problems at a minimal cost, since it is easy to set up and use for everyone, everywhere. [1] https://www.postgresql.org/message-id/d657df20-c4bf-63f6-e74c-cb85a81d0383@dalibo.com [2] https://www.postgresql.org/message-id/flat/6acbe570-068e-bd8e-95d5-00c737b865e8%40gmail.com -- Benoit Lobréau Consultant http://dalibo.com
Attachment
On Mon, May 1, 2023 at 10:03 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Apr 22, 2023 at 7:06 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > I don't think introducing a GUC for this is a good idea. We can > > directly output this message in the server log either at LOG or DEBUG1 > > level. > > Why not? It seems like something some people might want to log and > others not. Running the whole server at DEBUG1 to get this information > doesn't seem like a suitable answer. > We can output this at the LOG level to avoid running the server at DEBUG1 level. There are a few other cases where we are not able to spawn the worker or process and those are logged at the LOG level. For example, "could not fork autovacuum launcher process .." or "too many background workers". So, not sure, if this should get a separate treatment. If we fear this can happen frequently enough that it can spam the LOG then a GUC may be worthwhile. > What I was wondering was whether we would be better off putting this > into the statistics collector, vs. doing it via logging. Both > approaches seem to have pros and cons. > I think it could be easier for users to process the information if it is available via some view, so there is a benefit in putting this into the stats subsystem. -- With Regards, Amit Kapila.
On Tue, May 2, 2023 at 6:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > We can output this at the LOG level to avoid running the server at > DEBUG1 level. There are a few other cases where we are not able to > spawn the worker or process and those are logged at the LOG level. For > example, "could not fork autovacuum launcher process .." or "too many > background workers". So, not sure, if this should get a separate > treatment. If we fear this can happen frequently enough that it can > spam the LOG then a GUC may be worthwhile. I think we should definitely be afraid of that. I am in favor of a separate GUC. > > What I was wondering was whether we would be better off putting this > > into the statistics collector, vs. doing it via logging. Both > > approaches seem to have pros and cons. > > I think it could be easier for users to process the information if it > is available via some view, so there is a benefit in putting this into > the stats subsystem. Unless we do this instead. -- Robert Haas EDB: http://www.enterprisedb.com
Hi,

This thread has been quiet for a while, but I'd like to share some thoughts.

+1 to the idea of improving visibility into parallel worker saturation. But overall, we should improve parallel processing visibility, so DBAs can detect trends in parallel usage (is the workload doing more parallel work, or less?) and have enough data to either tune the workload or change parallel GUCs.

>> We can output this at the LOG level to avoid running the server at
>> DEBUG1 level. There are a few other cases where we are not able to
>> spawn the worker or process and those are logged at the LOG level. For
>> example, "could not fork autovacuum launcher process .." or "too many
>> background workers". So, not sure, if this should get a separate
>> treatment. If we fear this can happen frequently enough that it can
>> spam the LOG then a GUC may be worthwhile.

> I think we should definitely be afraid of that. I am in favor of a separate GUC.

Currently explain ( analyze ) will give you the "Workers Planned" and "Workers launched". Logging this via auto_explain is possible, so I am not sure we need additional GUCs or debug levels for this info.

 -> Gather (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
    Workers Planned: 2
    Workers Launched: 2

>> What I was wondering was whether we would be better off putting this
>> into the statistics collector, vs. doing it via logging. Both
>> approaches seem to have pros and cons.
>>
>> I think it could be easier for users to process the information if it
>> is available via some view, so there is a benefit in putting this into
>> the stats subsystem.

> Unless we do this instead.

Adding cumulative stats is a much better idea. Three new columns can be added to pg_stat_database: workers_planned, workers_launched, and parallel_operations (there could be more than one operation per query, if for example there are multiple parallel Gather operations in a plan). With these columns, monitoring tools can trend whether more or less parallel work is happening over time (by looking at parallel_operations) and whether the workload is suffering from parallel saturation: workers_launched/workers_planned < 1 means there is a lack of available worker processes.

Also, we could add this information on a per-query level as well in pg_stat_statements, but this can be taken up in a separate discussion.

Regards,

-- 
Sami Imseih
Amazon Web Services (AWS)
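To make the pg_stat_database idea concrete, a monitoring query over the proposed columns could look like the sketch below. The workers_planned, workers_launched and parallel_operations columns are hypothetical; they come from the proposal above and do not exist in any released version.

```sql
-- Sketch only: the three columns are proposed above, they do not exist today.
SELECT datname,
       parallel_operations,
       workers_planned,
       workers_launched,
       round(workers_launched::numeric / NULLIF(workers_planned, 0), 2)
           AS launch_ratio  -- < 1 means some planned workers were never started
FROM pg_stat_database
WHERE workers_planned > 0;
```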
On 2023-Oct-09, Imseih (AWS), Sami wrote:

>> I think we should definitely be afraid of that. I am in favor of a
>> separate GUC.

I agree.

> Currently explain ( analyze ) will give you the "Workers Planned"
> and "Workers launched". Logging this via auto_explain is possible, so I am
> not sure we need additional GUCs or debug levels for this info.
>
> -> Gather (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
>    Workers Planned: 2
>    Workers Launched: 2

I don't think autoexplain is a good substitute for the originally proposed log line. The possibility for log bloat is enormous. Some explain plans are gigantic, and I doubt people can afford that kind of log traffic just in case these numbers don't match.

> Adding cumulative stats is a much better idea.

Well, if you read Benoit's earlier proposal at [1] you'll see that he does propose to have some cumulative stats; this LOG line he proposes here is not a substitute for stats, but rather a complement. I don't see any reason to reject this patch even if we do get stats.

Also, we do have a patch on stats, by Sotolongo and Bonne here [2]. I think there was some drift on the scope, so eventually they gave up (I don't blame them). If you have concrete ideas on what direction that effort should take, I think that thread would welcome that. I have not reviewed it myself, and I'm not sure when I'll have time for that.

[1] https://postgr.es/m/d657df20-c4bf-63f6-e74c-cb85a81d0383@dalibo.com
[2] https://postgr.es/m/6acbe570-068e-bd8e-95d5-00c737b865e8@gmail.com

-- 
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"I'm impressed how quickly you are fixing this obscure issue. I came from
MS SQL and it would be hard for me to put into words how much of a better job
you all are doing on [PostgreSQL]."
    Steve Midgley, http://archives.postgresql.org/pgsql-sql/2008-08/msg00000.php
>> Currently explain ( analyze ) will give you the "Workers Planned"
>> and "Workers launched". Logging this via auto_explain is possible, so I am
>> not sure we need additional GUCs or debug levels for this info.
>>
>> -> Gather (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
>>    Workers Planned: 2
>>    Workers Launched: 2

> I don't think autoexplain is a good substitute for the originally
> proposed log line. The possibility for log bloat is enormous. Some
> explain plans are gigantic, and I doubt people can afford that kind of
> log traffic just in case these numbers don't match.

Correct, that is a downside of auto_explain in general. The logging traffic can be controlled by auto_explain.log_min_duration/auto_explain.sample_rate/etc., of course.

> Well, if you read Benoit's earlier proposal at [1] you'll see that he
> does propose to have some cumulative stats; this LOG line he proposes
> here is not a substitute for stats, but rather a complement. I don't
> see any reason to reject this patch even if we do get stats.
> Also, we do have a patch on stats, by Sotolongo and Bonne here [2].

Thanks. I will review the threads in depth and see if the ideas can be combined into a comprehensive proposal.

Regarding the current patch, the latest version removes the separate GUC, but the user should be able to control this behavior.

Query text is logged when log_min_error_statement > default level of "error". This could be especially problematic when there is a query running more than one parallel Gather node that is in draught. In those cases each node will end up generating a log entry with the statement text, so a single query execution could end up having multiple log lines with the statement text, e.g.:

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c;

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c;

LOG:  Parallel Worker draught during statement execution: workers spawned 0, requested 2
STATEMENT:  select (select count(*) from large) as a, (select count(*) from large) as b, (select count(*) from large) as c;

I wonder if it would be better to accumulate the total number of workers planned and workers launched and log this information at the end of execution?

Regards,

Sami
On 10/11/23 17:26, Imseih (AWS), Sami wrote:

Thank you for resurrecting this thread.

>> Well, if you read Benoit's earlier proposal at [1] you'll see that he
>> does propose to have some cumulative stats; this LOG line he proposes
>> here is not a substitute for stats, but rather a complement. I don't
>> see any reason to reject this patch even if we do get stats.

I believe both cumulative statistics and logs are needed. Logs excel in pinpointing specific queries at precise times, while statistics provide a broader overview of the situation. Additionally, I often encounter situations where clients lack pg_stat_statements and can't restart their production promptly.

> Regarding the current patch, the latest version removes the separate GUC,
> but the user should be able to control this behavior.

I created this patch in response to Amit Kapila's proposal to keep the discussion ongoing. However, I still favor the initial version with the GUCs.

> Query text is logged when log_min_error_statement > default level of "error".
>
> This could be especially problematic when there is a query running more than
> one parallel Gather node that is in draught. In those cases each node will end
> up generating a log entry with the statement text, so a single query execution
> could end up having multiple log lines with the statement text.
> ...
> I wonder if it would be better to accumulate the total number of workers
> planned and workers launched and log this information at the end of execution?

log_temp_files exhibits similar behavior when a query involves multiple on-disk sorts. I'm uncertain whether this is something we should or need to address. I'll explore whether the error message can be made more informative.

[local]:5437 postgres@postgres=# SET work_mem to '125kB';
[local]:5437 postgres@postgres=# SET log_temp_files TO 0;
[local]:5437 postgres@postgres=# SET client_min_messages TO log;
[local]:5437 postgres@postgres=# WITH a AS ( SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 ),
                                      b AS ( SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 )
                                 SELECT * FROM a,b;
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 122880  => First sort
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.19", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.23", size 140000
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.22", size 122880  => Second sort
LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.21", size 140000

-- 
Benoit Lobréau
Consultant
http://dalibo.com
> I believe both cumulative statistics and logs are needed. Logs excel in
> pinpointing specific queries at precise times, while statistics provide
> a broader overview of the situation. Additionally, I often encounter
> situations where clients lack pg_stat_statements and can't restart their
> production promptly.

I agree that logging will be very useful here. Cumulative stats/pg_stat_statements can be handled in a separate discussion.

> log_temp_files exhibits similar behavior when a query involves multiple
> on-disk sorts. I'm uncertain whether this is something we should or need
> to address. I'll explore whether the error message can be made more
> informative.
>
> [local]:5437 postgres@postgres=# SET work_mem to '125kB';
> [local]:5437 postgres@postgres=# SET log_temp_files TO 0;
> [local]:5437 postgres@postgres=# SET client_min_messages TO log;
> [local]:5437 postgres@postgres=# WITH a AS ( SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 ),
>                                       b AS ( SELECT x FROM generate_series(1,10000) AS F(x) ORDER BY 1 )
>                                  SELECT * FROM a,b;
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.20", size 122880  => First sort
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.19", size 140000
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.23", size 140000
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.22", size 122880  => Second sort
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp138850.21", size 140000

That is true. Users should also control whether they want this logging overhead or not. The best answer is a new GUC that is OFF by default.

I am also not sure if we want to log draught only. I think it's important not only to see which operations are in parallel draught, but also to log operations that are using 100% of the planned workers. This will help the DBA tune queries that are eating up the parallel workers.

Regards,

Sami
Hi,

I see the thread went a bit quiet, but I think it'd be very useful (and desirable) to have this information in the log. So let me share my thoughts about the patch / how it should work.

The patch is pretty straightforward, I don't have any comments about the code as is. Obviously, some of the following comments might require changes to the patch, etc.

I see there was discussion about logging vs. adding this to the pgstats system. I agree with Alvaro it should not be one or the other - these are complementary approaches, used by different tools. The logging is needed for tools like pgbadger, for example. I do see a lot of value in adding this to the statistics collector, and to things like pg_stat_statements, but that's being discussed in a separate thread, so I'll comment on that there.

As for whether to have a GUC for this, I certainly think we should have one. We have GUCs for logging stuff that we generally expect to happen, like lock waits, temp files, etc. True, there are similar cases that we just log every time, like when we can't fork a process ("could not fork autovacuum launcher process"), but I'd say those are unexpected in general / seem more like an error in the environment, while we may exhaust parallel workers pretty easily / often, especially on busy systems.

There are a couple of things I'm not quite sure about:

1) name of the GUC

I find "log_parallel_worker_draught" rather unclear :-( Maybe it's just me and everyone else immediately understands what this does / what will happen after it's set to "on", but I find it rather non-intuitive.

2) logging just the failures provides an incomplete view

As a DBA trying to evaluate if I need to bump up the number of workers, I'd be interested in what fraction of parallel workers fails to start. If it's 1%, that's probably a short spike and I don't need to do anything. If it's 50%, well, that might have unpredictable impact on user queries, and thus something I may need to look into. But if we only log the failures, that's not possible. I may be able to approximate this somehow by correlating it with the query/transaction rate, or something, but ideally I'd like to know how many parallel workers we planned to use, and how many actually started.

Obviously, logging this for every Gather [Merge] node, even when all the workers start, may be a lot of data. Perhaps the GUC should not be on/off, but something like

log_parallel_workers = {none | failures | all}

where "failures" only logs when at least one worker fails to start, and "all" logs everything. AFAIK Sami made the same observation/argument in his last message [1].

3) node-level or query-level?

There's a brief discussion about whether this should be a node-level or query-level thing in [2]:

> I wonder if it would be better to accumulate the total number of workers
> planned and workers launched and log this information at the end of execution?

And there's a reference to log_temp_files, but it's not clear to me if that's meant to be an argument for doing it the same way (node-level). I think we should not do node-level logging just because that's what log_temp_files=on does. I personally think log_temp_files was implemented like this mostly because it was simple. There's no value in having information about every individual temp file, because we don't even know which node produced it.
It'd be perfectly sufficient to log some summary statistics (number of files, total space, ...), either for the query as a whole, or perhaps per node (because that's what matters for work_mem). So I don't think we should mimic this just because log_temp_files does this. I don't know how difficult it would be to track/collect this information for the whole query. regards [1] https://www.postgresql.org/message-id/D04977E3-9F54-452C-A4C4-CDA67F392BD1%40amazon.com [2] https://www.postgresql.org/message-id/11e34b80-b0a6-e2e4-1606-1f5077379a34%40dalibo.com -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
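For illustration, the three-valued setting proposed above could be declared as an enum GUC roughly like this. This is a sketch only: the symbol and option names are assumptions, not taken from any posted patch, while struct config_enum_entry is the existing declaration mechanism in utils/guc.h.

```c
#include "utils/guc.h"

/* Hypothetical values for the proposed log_parallel_workers setting. */
typedef enum
{
    LOG_PARALLEL_WORKERS_NONE,      /* log nothing */
    LOG_PARALLEL_WORKERS_FAILURES,  /* log only when some workers fail to start */
    LOG_PARALLEL_WORKERS_ALL        /* log every parallel worker launch */
} LogParallelWorkersOption;

static const struct config_enum_entry log_parallel_workers_options[] = {
    {"none", LOG_PARALLEL_WORKERS_NONE, false},
    {"failures", LOG_PARALLEL_WORKERS_FAILURES, false},
    {"all", LOG_PARALLEL_WORKERS_ALL, false},
    {NULL, 0, false}
};
```

The "failures" level would match the behavior of the original patch, while "all" provides the denominator needed to compute the fraction of failed launches discussed above.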
On Mon, Feb 26, 2024 at 6:13 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > 1) name of the GUC > > I find the "log_parallel_worker_draught" to be rather unclear :-( Maybe > it's just me and everyone else just immediately understands what this > does / what will happen after it's set to "on", but I find it rather > non-intuitive. > Also, I don't understand how the word "draught" (aka "draft") makes sense here -- I assume the intended word was "drought" (???). ========== Kind Regards, Peter Smith. Fujitsu Australia.
On 2/25/24 20:13, Tomas Vondra wrote:
> 1) name of the GUC
...
> 2) logging just the failures provides an incomplete view
> log_parallel_workers = {none | failures | all}
> where "failures" only logs when at least one worker fails to start, and
> "all" logs everything.
>
> AFAIK Sami made the same observation/argument in his last message [1].

I like the name and different levels you propose. I was initially thinking that the overall picture would be better summarized (and easier to implement) with pg_stat_statements. But with the granularity you propose, the choice is in the hands of the DBA, which is great and provides more options when we don't have full control of the installation.

> 3) node-level or query-level?
...
> There's no value in having information about every individual temp file,
> because we don't even know which node produced it. It'd be perfectly
> sufficient to log some summary statistics (number of files, total space,
> ...), either for the query as a whole, or perhaps per node (because
> that's what matters for work_mem). So I don't think we should mimic this
> just because log_temp_files does this.

I must admit that, given my (poor) technical level, I went for the easiest way.

If we go for the three levels you proposed, "node-level" makes even less sense (and has great "potential" for spam).

I can see one downside to the "query-level" approach: it might be more difficult to give information in cases where the query doesn't end normally. It's sometimes useful to have hints about what was going wrong before a query was cancelled or crashed, which log_temp_files kinda does.

> I don't know how difficult it would be to track/collect this information
> for the whole query.

I am a bit worried this will be a little too much for me, given the time and the knowledge gap I have (both in C and PostgreSQL internals). I'll try to look anyway.

-- 
Benoit Lobréau
Consultant
http://dalibo.com
On 2/25/24 23:32, Peter Smith wrote: > Also, I don't understand how the word "draught" (aka "draft") makes > sense here -- I assume the intended word was "drought" (???). Yes, that was the intent, sorry about that. English is not my native language and I was convinced the spelling was correct. -- Benoit Lobréau Consultant http://dalibo.com
On 2/27/24 10:55, Benoit Lobréau wrote:
> On 2/25/24 20:13, Tomas Vondra wrote:
>> 1) name of the GUC
> ...
>> 2) logging just the failures provides an incomplete view
>> log_parallel_workers = {none | failures | all}
>> where "failures" only logs when at least one worker fails to start, and
>> "all" logs everything.
>>
>> AFAIK Sami made the same observation/argument in his last message [1].
>
> I like the name and different levels you propose. I was initially
> thinking that the overall picture would be better summarized (and easier
> to implement) with pg_stat_statements. But with the granularity you
> propose, the choice is in the hands of the DBA, which is great and
> provides more options when we don't have full control of the installation.
>

Good that you like the idea with multiple levels. I agree pg_stat_statements may be an easier way to get a summary, but it has limitations too (e.g. no built-in ability to show how the metrics evolve over time, which is easier to restore from logs). So I think those approaches are complementary.

>> 3) node-level or query-level?
> ...
>> There's no value in having information about every individual temp file,
>> ...
>> just because log_temp_files does this.
>
> I must admit that, given my (poor) technical level, I went for the
> easiest way.
>

That's understandable, I'd do the same thing.

> If we go for the three levels you proposed, "node-level" makes even less
> sense (and has great "potential" for spam).
>

Perhaps.

> I can see one downside to the "query-level" approach: it might be more
> difficult to give information in cases where the query doesn't end
> normally. It's sometimes useful to have hints about what was going wrong
> before a query was cancelled or crashed, which log_temp_files kinda does.
>

That is certainly true, but it's not a new thing, I believe. IIRC we may not report statistics until the end of the transaction, so no progress updates, and I'm not sure what happens if the query doesn't end correctly (e.g. the backend dies, ...). Similarly for temporary files, we don't report those until the temporary file gets closed, so I'm not sure you'd get a lot of info about the progress either.

I admit I haven't tried this and maybe I remember the details wrong. Might be useful to experiment with this first a little bit ...

>> I don't know how difficult it would be to track/collect this information
>> for the whole query.
>
> I am a bit worried this will be a little too much for me, given the time and
> the knowledge gap I have (both in C and PostgreSQL internals). I'll try
> to look anyway.
>

I certainly understand that, and I realize it may feel daunting and even discouraging. What I can promise is that I'm willing to help, either by suggesting ways to approach the problems, doing reviews, and so on. Would that be sufficient for you to continue working on this patch?

Some random thoughts/ideas about how to approach this:

- For parallel workers I think this might be as simple as adding some counters into QueryDesc and updating those during query execution (instead of just logging stuff directly). I'm not sure if it should be added to "Instrumentation" or separately.

- I was thinking maybe we could just walk the execution plan and collect the counts that way.
But I'm not sure that'd work if the Gather node happens to be executed repeatedly (in a loop). Also, I'm not sure we'd want to walk all plans.

- While log_temp_files is clearly out of scope for this patch, it might be useful to think about it and how it should behave. We've already used log_temp_files to illustrate some issues with logging the info right away, so maybe there's something to learn here ...

- For temporary files I think it'd be more complicated, because we can create temporary files from many different places, not just in the executor, so we can't simply update QueryDesc etc. Also, the places that log info about temporary files (using ReportTemporaryFileUsage) only really know about individual temporary files, so if a Sort or HashJoin creates 1000 files, we'll get one LOG entry for each of those temp files. But they're really a single "combined" file. So maybe we should introduce some sort of "context" to group those files, and only accumulate/log the size for the group as a whole? Maybe it'd even allow printing some info about what the temporary file is for (e.g. tuplestore / tuplesort / ...).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
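As a rough sketch of the QueryDesc-counter idea mentioned above: the accumulator struct and the two helpers below are hypothetical, and this assumes ParallelContext exposes the nworkers_to_launch and nworkers_launched fields it has in recent releases.

```c
#include "postgres.h"
#include "access/parallel.h"

/* Hypothetical per-query accumulator; it could hang off QueryDesc (or the
 * executor state) instead of each Gather node logging immediately. */
typedef struct ParallelWorkerUsage
{
    int     workers_to_launch;  /* total workers requested by all nodes */
    int     workers_launched;   /* total workers actually started */
} ParallelWorkerUsage;

/* Sketch: call right after LaunchParallelWorkers() for a Gather node. */
static void
AccumulateParallelWorkerUsage(ParallelWorkerUsage *usage, ParallelContext *pcxt)
{
    usage->workers_to_launch += pcxt->nworkers_to_launch;
    usage->workers_launched += pcxt->nworkers_launched;
}

/* Sketch: at the end of execution, emit a single summary line if needed. */
static void
ReportParallelWorkerUsage(ParallelWorkerUsage *usage)
{
    if (usage->workers_launched < usage->workers_to_launch)
        ereport(LOG,
                (errmsg("parallel workers launched: %d, requested: %d",
                        usage->workers_launched, usage->workers_to_launch)));
}
```

This shape would also make the "failures vs. all" levels easy to implement, since the decision to log happens once per query rather than once per node.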
On 2024-02-27 Tu 05:03, Benoit Lobréau wrote: > > > On 2/25/24 23:32, Peter Smith wrote: >> Also, I don't understand how the word "draught" (aka "draft") makes >> sense here -- I assume the intended word was "drought" (???). > > yes, that was the intent, sorry about that. English is not my native > langage and I was convinced the spelling was correct. Both are English words spelled correctly, but with very different meanings. (Drought is definitely the one you want here.) This reminds me of the Errata section of Sellars and Yeatman's classic "history" work "1066 And All That": "For 'pheasant' read 'peasant' throughout." cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 2/27/24 15:09, Tomas Vondra wrote:
> That is certainly true, but it's not a new thing, I believe. IIRC we may
> not report statistics until the end of the transaction, so no progress
> updates, and I'm not sure what happens if the query doesn't end correctly
> (e.g. the backend dies, ...). Similarly for temporary files, we don't report
> those until the temporary file gets closed, so I'm not sure you'd get a
> lot of info about the progress either.
>
> I admit I haven't tried this and maybe I remember the details wrong.
> Might be useful to experiment with this first a little bit ...

Ah, yes. Temp file usage is reported on temp file closure, or on deletion for shared file sets.

> I certainly understand that, and I realize it may feel daunting and even
> discouraging. What I can promise is that I'm willing to help, either by
> suggesting ways to approach the problems, doing reviews, and so on.
> Would that be sufficient for you to continue working on this patch?

Yes, thanks for the proposal, I'll work on it and report here. I am otherwise booked on company projects so I cannot promise quick progress.

> Some random thoughts/ideas about how to approach this:
>
> - For parallel workers I think this might be as simple as adding some
> counters into QueryDesc and updating those during query execution (instead of
> just logging stuff directly). I'm not sure if it should be added to
> "Instrumentation" or separately.

I will start here to see how it works.

-- 
Benoit Lobréau
Consultant
http://dalibo.com
> On 29 Feb 2024, at 11:24, Benoit Lobréau <benoit.lobreau@dalibo.com> wrote: > > Yes, thanks for the proposal, I'll work on it and report here. Hi Benoit! This is a kind reminder that this thread is waiting for your response. CF entry [0] is in "Waiting on Author", I'll move it to July CF. Thanks! Best regards, Andrey Borodin. [0] https://commitfest.postgresql.org/47/4291/
On 4/8/24 10:05, Andrey M. Borodin wrote: > Hi Benoit! > > This is a kind reminder that this thread is waiting for your response. > CF entry [0] is in "Waiting on Author", I'll move it to July CF. Hi, thanks for the reminder. The past month has been hectic for me. It should calm down by next week, at which point I'll have time to go back to this. Sorry for the delay. -- Benoit Lobréau Consultant http://dalibo.com
I found out in [1] that I am not correctly tracking the workers for vacuum commands. I trap workers used by parallel_vacuum_cleanup_all_indexes but not parallel_vacuum_bulkdel_all_indexes. Back to the drawing board. [1] https://www.postgresql.org/message-id/flat/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com -- Benoit Lobréau Consultant http://dalibo.com
On 10/14/24 22:42, Alena Rybakina wrote: > Attached is the correct version of the patch. Thank you, it's a lot cleaner that way. I'll add this asap. -- Benoit Lobréau Consultant http://dalibo.com
Hi,
On Sat, Oct 19, 2024 at 6:46 AM Benoit Lobréau <benoit.lobreau@dalibo.com> wrote:
> On 10/15/24 09:52, Benoit Lobréau wrote:
>> Thank you, it's a lot cleaner that way.
>> I'll add this asap.
>
> This is an updated version with the suggested changes.
AFAICT, Benoit answered all questions and requests. Is there anything we can do to make this move forward?
Thanks.
--
Guillaume.