Thread: Re: Parallel workers stats in pg_stat_database

Re: Parallel workers stats in pg_stat_database

From
Bertrand Drouvot
Date:
Hi,

On Tue, Sep 03, 2024 at 02:34:06PM +0200, Benoit Lobréau wrote:
> I noticed that the tests are still not stable. I tried using tenk2
> but fail to have stable plans. I'd love to have pointers on that front.

What about moving the tests to places where it's "guaranteed" to get 
parallel workers involved? For example, a "parallel_maint_workers" only test
could be done in vacuum_parallel.sql.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Parallel workers stats in pg_stat_database

From
Benoit Lobréau
Date:
On 9/4/24 08:46, Bertrand Drouvot wrote:
> What about moving the tests to places where it's "guaranteed" to get
> parallel workers involved? For example, a "parallel_maint_workers" only test
> could be done in vacuum_parallel.sql.

Thank you! I was too focused on the stats part and missed the obvious.
It's indeed better with this file.

... Which led me to discover that the place I chose to gather my stats
(parallel_vacuum_end) is wrong: it only captures workers allocated for
parallel_vacuum_cleanup_all_indexes() and not
parallel_vacuum_bulkdel_all_indexes().

Back to the drawing board...

-- 
Benoit Lobréau
Consultant
http://dalibo.com



Re: Parallel workers stats in pg_stat_database

From
Benoit Lobréau
Date:
Hi,

Thanks for your input! I will fix the doc as proposed and do the split
as soon as I have time.

On 10/1/24 09:27, Michael Paquier wrote:
> I'm less
> a fan of the addition for utilities because these are less common
> operations.

My thought process was that in order to size max_parallel_workers we need
information on both the maintenance parallel workers and the "query"
parallel workers.

> Actually, could we do better than what's proposed here?  How about
> presenting an aggregate of this data in pg_stat_statements for each
> query instead?

I think both features are useful.

My colleagues and I had a discussion about what could be done to improve
parallelism observability in PostgreSQL [0]. We thought about several
places to do it for several use cases.

Guillaume Lelarge worked on pg_stat_statements [1] and
pg_stat_user_[tables|indexes] [2]. I proposed a patch for the logs [3].

As a consultant, I frequently work on installations without
pg_stat_statements, and I cannot install it on the client's production
systems within the timeframe of my engagement.

pg_stat_database is available everywhere and can easily be sampled by 
collectors/supervision services (like check_pgactivity).
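
As a rough illustration of what such sampling could look like on the
collector side, here is a minimal sketch in Python. It assumes the view
exposes two cumulative counters per database, one for workers planned and
one for workers actually launched; the field names below are hypothetical
and are not taken from the patch. The collector takes two samples and
looks at the difference between them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """One sample of cumulative counters for a database.

    Both field names are hypothetical; they stand in for whatever
    columns the view ends up exposing.
    """
    workers_to_launch: int  # workers that were planned
    workers_launched: int   # workers that actually started


def launch_ratio(before: Snapshot, after: Snapshot) -> Optional[float]:
    """Fraction of planned workers that were launched between two samples.

    Returns None when no workers were planned in the interval, or when a
    counter went backwards (for example after a statistics reset).
    """
    planned = after.workers_to_launch - before.workers_to_launch
    launched = after.workers_launched - before.workers_launched
    if planned <= 0 or launched < 0:
        return None
    return launched / planned
```

A ratio that stays well below 1.0 over time would suggest that
max_parallel_workers is too small for the workload.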

Lastly, the numbers would be more precise and easier to make sense of, since
pg_stat_statements has a limited size (entries can be evicted, so its
totals may be incomplete).

[0] 
https://www.postgresql.org/message-id/flat/d657df20-c4bf-63f6-e74c-cb85a81d0383@dalibo.com
[1] 
https://www.postgresql.org/message-id/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X%2B_dZqKw%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/flat/CAECtzeXXuMkw-RVGTWvHGOJsmFdsRY%2BjK0ndQa80sw46y2uvVQ%40mail.gmail.com
[3] 
https://www.postgresql.org/message-id/8123423a-f041-4f4c-a771-bfd96ab235b0%40dalibo.com

-- 
Benoit Lobréau
Consultant
http://dalibo.com



Re: Parallel workers stats in pg_stat_database

From
Benoit Lobréau
Date:
On 10/7/24 10:19, Guillaume Lelarge wrote:
> I've done the split, but I didn't go any further than that.

Thank you Guillaume. I have done the rest of the reformatting
suggested by Michael, but I decided to check whether I have similar code
in my logging patch and refactor accordingly, if needed, before posting
the result here.

I hope to finish it this week.

-- 
Benoit Lobréau
Consultant
http://dalibo.com