Memory consumed by paths during partitionwise join planning - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Memory consumed by paths during partitionwise join planning
Date
Msg-id CAExHW5tUcVsBkq9qT=L5vYz4e-cwQNw=KAGJrtSyzOp3F=XacA@mail.gmail.com
Whole thread Raw
Responses Re: Memory consumed by paths during partitionwise join planning
List pgsql-hackers
Hi All,

If add_path() and add_partial_path() do not find a new path to be
superior to any existing paths, they free the new path. They free an
existing path if it is found to be inferior to the new path. But not
all paths surviving in a RelOptInfo are used to create paths for
relations which use it as input. Further add_path and add_partial_path
do not free the whole path tree but just the topmost pathnode.  The
subpath nodes are not freed because they may be referenced by other
paths. The subpaths continue to occupy memory even if they are not
used anywhere. As we build the relation tree upwards (base, join,
upper relations) more and more such paths are accumulated and continue
to consume memory till the statement ends. With partitionwise join
involving partitioned tables with thousands of partitions this memory
consumption increases proportional to the number of partitions.

Attached WIP patches address this issue by adding infrastructure to
track references to paths and free unreferenced paths once they can be
deemed as useless. A new member Path::ref_count in a pathnode tracks
how many other objects, like pathlists in RelOptInfo and other paths,
reference that pathnode. As the path nodes are referenced they are
"linked" using link_path() to referencing objects. When the path nodes
are no longer referenced they are "unlinked" using unlink_path() from
the referencing objects. Path nodes are freed using free_path(). The
function unlinks the sub path nodes so that they can be freed when
their reference count drops to 0. The paths whose references reach 0
during unlinking are freed automatically using free_path(). Patch 0002
adds this infrastructure. 0003 and 0004 use these functions in example
cases.

With these patches the memory consumption numbers look like below.
Experiment
----------
Memory consumption is measured using a method similar to the one
described in [1]. The changes to EXPLAIN ANALYZE to report planning
memory are in 0001. Memory consumed when planning a self-join query is
measured. The queries involve partitioned and unpartitioned tables
respectively. The partitioned table has 1000 partitions in it. The
table definitions and helper function can be found in setup.sql and
the queries can be found in queries.sql. This is the simplest setup.
Higher savings may be seen with more complex setups involving indexes,
SQL operators and clauses.

Table 1: Join between unpartitioned tables
Number of tables | without patch  | with patch | % reduction |
being joined     |                |            |             |
--------------------------------------------------------------
               2 |      29.0 KiB  |   28.9 KiB |       0.66% |
               3 |      79.1 KiB  |   76.7 KiB |       3.03% |
               4 |     208.6 KiB  |  198.2 KiB |       4.97% |
               5 |     561.6 KiB  |  526.3 KiB |       6.28% |

Table 2: join between partitioned tables with partitionwise join
enabled (enable_partitionwise_join = true).
Number of tables | without patch  | with patch | % reduction |
being joined     |                |            |             |
----------------------------------------------------------------
               2 |      40.3 MiB  |   40.0 MiB |       0.70% |
               3 |     146.9 MiB  |  143.1 MiB |       2.55% |
               4 |     445.4 MiB  |  430.4 MiB |       3.38% |
               5 |    1563.3 MiB  | 1517.6 MiB |       2.92% |

The patch is not complete because of following issues:

a. 0003 and 0004 patches do not cover all the path nodes. I have
covered only those which I encountered in the queries I ran. If we
accept this proposal, I will work on covering all the path nodes.

b. It does not cover the entire RelOptInfo tree that the planner
creates. The paths in a lower level RelOptInfo can be deemed as
useless only when path creation for all the immediate upper
RelOptInfos is complete. Thus we need access to both upper and lower
level RelOptInfos at the same time. The RelOptInfos created while
planning join are available in standard_join_search(). Thus it's
possible to free unused paths listed in all the RelOptInfos except the
topmost RelOptInfo in this function as done in the patch. But the
topmost RelOptInfo acts as an input to upper RelOptInfo in
grouping_planner() where we don't have access to the RelOptInfos at
lower levels in the join tree. Hence we can't free the unused paths
from topmost RelOptInfo and cascade that effect down the RelOptInfo
tree. This is the reason why we don't see memory reduction in case of
2-way join. This is also the reason why the numbers above are lower
than the actual memory that can be saved. If we decide to move ahead
with this approach, I will work on this too.

c. The patch does not call link_path and unlink_path in all the
required places. We will need some work to identify such places, to
build infrastructure and methods to identify such places in future.

Another approach to fixing this would be to use separate memory
context for creating path nodes and then deleting the entire memory
context at the end of planning once the plan is created. We will need
to reset path lists as well as cheapest_path members in RelOptInfos as
well. This will possibly free up more memory and might be faster. But
I have not tried it. The approach taken in the patch has an advantage
over this one i.e. the paths can be freed at any stage in the planning
using free_unused_paths_from_rel() implemented in the patch. Thus we
can monitor the memory consumption and trigger garbage collection when
it crosses a certain threashold. Or we may implement both the
approaches to clean every bit of paths at the end of planning while
garbage collecting pathnodes when memory consumption goes beyond
threashold. The reference mechanism may have other usages as well.

Suggestions/comments welcome.

References
1. https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph+Pvo5dNpdrVCsBgXEzDQ@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat

Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Next
From: Ashutosh Bapat
Date:
Subject: Memory consumed by child SpecialJoinInfo in partitionwise join planning