Re: Memory consumed by paths during partitionwise join planning - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Memory consumed by paths during partitionwise join planning
Date
Msg-id CAExHW5u-X+7xj2udyXSxqPhqyLDz99o_Ev6YoPViZX5tT0Uquw@mail.gmail.com
Whole thread Raw
In response to Re: Memory consumed by paths during partitionwise join planning  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Memory consumed by paths during partitionwise join planning
List pgsql-hackers
On Wed, Nov 29, 2023 at 1:10 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 28 Jul 2023 at 02:06, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> > Table 1: Join between unpartitioned tables
> > Number of tables | without patch  | with patch | % reduction |
> > being joined     |                |            |             |
> > --------------------------------------------------------------
> >                2 |      29.0 KiB  |   28.9 KiB |       0.66% |
> >                3 |      79.1 KiB  |   76.7 KiB |       3.03% |
> >                4 |     208.6 KiB  |  198.2 KiB |       4.97% |
> >                5 |     561.6 KiB  |  526.3 KiB |       6.28% |
>
> I have mostly just random thoughts and some questions...
>
> Have you done any analysis of the node types that are consuming all
> that memory? Are Path and subclasses of Path really that dominant in
> this? The memory savings aren't that impressive and it makes me
> wonder if this is worth the trouble.

This thread and patch targets saving memory consumed by Path nodes
i.e. Path and its subclasses. Breakdown of memory consumed by various
nodes can be found in [1] for partitioned table queries.

>
> What's the goal of the memory reduction work?  If it's for
> performance, does this increase performance?  Tracking refcounts of
> Paths isn't free.

This memory reduction work is part of work to reduce planner's memory
consumption while planning queries involving partitioned tables with a
large number (thousands) of partitions [1]. The second table (copies
below) in my first email in this thread gives the memory saved by
freeing unused path nodes when planning queries involving partitioned
tables with a thousand partitions each.

Table 2: join between partitioned tables with partitionwise join
enabled (enable_partitionwise_join = true).
Number of tables | without patch  | with patch | % reduction |
being joined     |                |            |             |
------------------------------
----------------------------------
               2 |      40.3 MiB  |   40.0 MiB |       0.70% |
               3 |     146.9 MiB  |  143.1 MiB |       2.55% |
               4 |     445.4 MiB  |  430.4 MiB |       3.38% |
               5 |    1563.3 MiB  | 1517.6 MiB |       2.92% |

%wise the memory savings are not great but because of sheer amount of
memory used, the savings are in MBs. Also I don't think I managed to
free all the unused paths for the reasons listed in my original mail.
But I was doubtful whether the work is worth it. So I halted my work.
I see you think that it's not worth it. So one vote against it.


>
> Why do your refcounts start at 1?  This seems weird:
>
> + /* Free the path if none references it. */
> + if (path->ref_count == 1)

Refcounts start from 0. This code is from free_unused_paths_from_rel()
which frees paths in the rel->pathlist. At this stage a path is
referenced from rel->pathlist, hence the count will >= 1 and we should
be freeing paths with refcount > 1 i.e. referenced elsewhere.

>
> Does ref_count really need to be an int?  There's a 16-bit hole you
> could use between param_info and parallel_aware.  I doubt you'd need
> 32 bits of space for this.

16 bit space allows the refcount to go upto 65535 (or twice of that).
This is somewhere between 18C9 and 19C9. If a (the cheapest) path in a
given relation in 19 relation query is referenced in all the joins
that that relation is part of, this refcount will be exhausted. If we
aren't dealing with queries involving those many relations already, we
will soon. Maybe we should add code to protect from overflows.

>
> I agree that it would be nice to get rid of the "if (!IsA(old_path,
> IndexPath))" hack in add_path() and it seems like this idea could
> allow that. However, I think if we were to have something like this
> then you'd want all the logic to be neatly contained inside pathnode.c
> without adding any burden in other areas to track or check Path
> refcounts.

The code is in following files
 src/backend/optimizer/path/allpaths.c:
The definitions of free_unused_paths_from_rel() and
free_unused_paths() can be moved to pathnode.c

 src/backend/optimizer/util/pathnode.c
 src/include/nodes/pathnodes.h
 src/include/optimizer/pathnode.h
Those three together I consider as pathnode.c. But let me know if you
feel otherwise.

src/backend/optimizer/path/joinpath.c
this is the only place where other than pathnode.c where we use
link_path(). That's required to stop add_path from freeing a
materialized path that is tried multiple times.We can't add_path that
path to rel since it will get kicked out by the path that it's
materialising for. But I think it's still path creation code so should
be ok. Let me know if you feel otherwise.

There may be other places but I can find those out and consider them
case by case basis.

>
> I imagine the patch would be neater if you added some kind of Path
> traversal function akin to expression_tree_walker() that allows you to
> visit each subpath recursively and run a callback function on each.

Yes. Agreed.

I am fine to work on this further if the community thinks it is
acceptable. It seems from your point of view the worth of this work is
as follows
a. memory savings - not worth it
b. getting rid of !IsA(old_path, IndexPath) - worth it
c. problem of same path being added to multiple relations - not worth
it since you have other solution

Can't judge whether that means it's acceptable or not.

[1] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: "Ryo Matsumura (Fujitsu)"
Date:
Subject: doc: improve document of ECPG host variable
Next
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby