Home > mailing lists

Re: Query planning read a large amount of buffers for partitioned tables - Mailing list pgsql-performance

From	bruno vieira da silva
Subject	Re: Query planning read a large amount of buffers for partitioned tables
Date	February 18, 2025 18:48:26
Msg-id	CAB+Nuk_rqmZjtVJmTAbmyCeM8-eqB97VkQcTtkYTUksHj0pwjA@mail.gmail.com Whole thread
In response to	Re: Query planning read a large amount of buffers for partitioned tables (David Rowley <dgrowleyml@gmail.com>)
List	pgsql-performance

Tree view

Hello, Your explanation aligns with the idea I had that having more shared_buffers and connection pooling are very important in the context of the partitioned tables.

Thanks. Regards.

On Tue, Feb 18, 2025 at 7:16 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Tue, 18 Feb 2025 at 09:18, bruno vieira da silva
<brunogiovs@gmail.com> wrote:
>
> Hello, I did a more comprehensive test with a different number of partitions and I found this:
>
> Summary buffers usage for the first call vs second call on the same session.
>
> Query 200, 100, 50, and 10 partitions:
> 200 Partitions: 12,828 (100MB)
> 100 Partitions: 9,329 (72MB)
> 50 Partitions: 3,305 (25MB)
> 10 Partitions: 875 (7MB)
>
> Same query on the same session:
> 200 Partitions: 205 (1.6MB)
> 100 Partitions: 5 (40KB)
> 50 Partitions: 5 (40KB)
> 10 Partitions: 5 (40KB)
>
> I did test on PG 17.3 no relevant changes.
>
> Question is, does it make sense?

I didn't analyze this in great detail, but nothing looks too
surprising to me. I get roughly the same numbers on the latest git
master branch as you've shown above.

A PostgreSQL backend will cache various metadata about relations the
first time they're accessed in a backend. Building those caches
requires accessing the system catalogue tables. I expect the majority
of the buffer accesses are for those tables. If you're curious about
what's being accessed and have a fresh test instance handy, you could
use strace to see which buffers are being read. You'll need to ensure
the shared buffers are not caching anything. Restarting PostgreSQL
should clear those out sufficiently. You can translate the filenodes
back into relation names by using a query such as: select relname from
pg_class where pg_relation_filenode(oid)=1259;

If this is causing you problems then maybe a connection pooler would
help you. With one of those, the backend will live longer than just 1
query. You could also perhaps revisit your partition count to see if
the number you've chosen gives you the best performance. It's very
common for people to over-partition and not properly consider the
overheads of partitioning.

David

Bruno Vieira da Silva

pgsql-performance by date:

From: David Rowley
Date: 18 February 2025, 15:16:09
Subject: Re: Query planning read a large amount of buffers for partitioned tables

From: Lincoln Swaine-Moore
Date: 22 February 2025, 02:46:15
Subject: Unfortunate Nested Loop + Missing Autovacuum

Re: Query planning read a large amount of buffers for partitioned tables - Mailing list pgsql-performance

Previous

Next