Thread: Excessive memory usage in multi-statement queries w/ partitioning

Excessive memory usage in multi-statement queries w/ partitioning

From
Andreas Seltenreich
Date:
Hi,

a customer reported excessive memory usage and out-of-memory ERRORs
after introducing native partitioning in one of their databases.  We
could narrow it down to the overhead introduced by the partitioning when
issuing multiple statements in a single query.  I could reduce the
problem to the following recipe:

--8<---------------cut here---------------start------------->8---
#!/bin/bash

# create 100 partitions
psql -c 'create table t(c int primary key) partition by range(c)'
for i in {1..100}; do
    psql -e -c "create table t$i partition of t for values
         from ($(((i-1)*100))) to ($((i*100))) "
done

# artificially limit per-process memory by setting a resource limit for
# the postmaster to 256MB

prlimit -d$((256*1024*1024)) -p $POSTMASTER_PID
--8<---------------cut here---------------end--------------->8---

Now, updates to a partition are fine with 4000 update statements:

,----
| $ psql -c "$(yes update t2 set c=c where c=6 \; | head -n 4000)"
| UPDATE 0
`----

…but when doing it on the parent relation, even 100 statements are
enough to exceed the limit:

,----
| $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
| FEHLER:  Speicher aufgebraucht
| DETAIL:  Failed on request of size 200 in memory context "MessageContext".
`----

The memory context dump shows plausible values except for the MessageContext:

TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
  [...]
  MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
  [...]

Maybe some tactically placed pfrees or avoiding putting redundant stuff
into MessageContext can relax the situation?

regards,
Andreas



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
David Rowley
Date:
On Thu, 23 May 2019 at 17:55, Andreas Seltenreich
<andreas.seltenreich@credativ.de> wrote:
> a customer reported excessive memory usage and out-of-memory ERRORs
> after introducing native partitioning in one of their databases.  We
> could narrow it down to the overhead introduced by the partitioning when
> issuing multiple statements in a single query.

"multiple statements in a single query", did you mean to write session
or maybe transaction there?

Which version?

I tried your test case with REL_11_STABLE and I see nowhere near as
much memory used in MessageContext.

After repeating the query twice, I see:

MessageContext: 8388608 total in 11 blocks; 3776960 free (1 chunks);
4611648 used
Grand total: 8388608 bytes in 11 blocks; 3776960 free (1 chunks); 4611648 used
MessageContext: 8388608 total in 11 blocks; 3776960 free (1 chunks);
4611648 used
Grand total: 8388608 bytes in 11 blocks; 3776960 free (1 chunks); 4611648 used

which is quite a long way off the 252MB you're getting.

Perhaps I'm not testing with the same version as you are.


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Excessive memory usage in multi-statement queries w/partitioning

From
Julian Schauder
Date:
Hey David,


> "multiple statements in a single query", did you mean to write
> session
> or maybe transaction there?

Maybe the wording isn't perfect.  The queries must be sent as a single
batch.  Try the exact bash script Andreas used for updating the parent.

> Which version?

Tested up to and including 11.2; initially found on 11.1.  Memory
consumption scales roughly linearly with the number of existing partitions
and the number of ';'-delimited queries per batch.


regards
-- 
Julian Schauder




Re: Excessive memory usage in multi-statement queries w/ partitioning

From
David Rowley
Date:
On Thu, 23 May 2019 at 21:19, Julian Schauder
<julian.schauder@credativ.de> wrote:
> > "multiple statements in a single query", did you mean to write
> > session
> > or maybe transaction there?
>
> Maybe the wording isn't perfect.  The queries must be sent as a single
> batch.  Try the exact bash script Andreas used for updating the parent.

Thanks for explaining.

> > Which version?
>
> Tested up to and including 11.2; initially found on 11.1.  Memory
> consumption scales roughly linearly with the number of existing partitions
> and the number of ';'-delimited queries per batch.

Yeah, unfortunately, if the batch contains 100 of those statements
then the planner is going to eat 100 times the memory since it stores
all 100 plans at once.

Since you're pruning all but one partition, the situation should be
much better for you once you can upgrade to v12.  Unfortunately, that's
still about five months away.

For now, the best you can do is either reduce the number of partitions,
reduce the number of statements in the batch... or install more memory.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
Hi,

On 2019/05/23 4:15, Andreas Seltenreich wrote:
> …but when doing it on the parent relation, even 100 statements are
> enough to exceed the limit:
> 
> ,----
> | $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
> | FEHLER:  Speicher aufgebraucht
> | DETAIL:  Failed on request of size 200 in memory context "MessageContext".
> `----
> 
> The memory context dump shows plausible values except for the MessageContext:
> 
> TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
>   [...]
>   MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
>   [...]

As David Rowley said, planning that query hundreds of times under a single
MessageContext is not something that will end well on 11.3, because even a
single instance takes up tons of memory that's only released when
MessageContext is reset.

> Maybe some tactically placed pfrees or avoiding putting redundant stuff
> into MessageContext can relax the situation?

I too have had similar thoughts on the matter.  If the planner had built
all its subsidiary data structures in its own private context (or tree of
contexts) which is reset once a plan for a given query is built and passed
on, then there wouldn't be an issue of all of that subsidiary memory
leaking into MessageContext.  However, the problem may really be that
we're subjecting the planner to use cases that it wasn't perhaps designed
to perform equally well under -- running it many times while handling the
same message.  It is worsened by the fact that the query in question is
something that ought to have been documented as not well supported by the
planner; David has posted a documentation patch for that [1].  PG 12 has
alleviated the situation to a large degree, so you won't see the OOM
occurring for this query, but not for all queries unfortunately.

With that said, we may want to look into the planner sometimes hoarding
memory, especially when planning complex queries involving partitions.
AFAIK, one of the reasons for partition-wise join, aggregate to be turned
off by default is that its planning consumes a lot of CPU and memory,
partly because of the fact that planner doesn't actively release the
memory of its subsidiary structures, or maybe because of inferior ways in
which partitions and partitioning properties are represented in the
planner.  Though if there's evidence that it's the latter, maybe we should
fix that before pondering any sophisticated planner memory management.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAKJS1f-2rx%2BE9mG3xrCVHupefMjAp1%2BtpczQa9SEOZWyU7fjEA%40mail.gmail.com




Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Joe Conway
Date:
On 5/24/19 1:47 AM, Amit Langote wrote:
> On 2019/05/23 4:15, Andreas Seltenreich wrote:
>> …but when doing it on the parent relation, even 100 statements are
>> enough to exceed the limit:
>>
>> ,----
>> | $ psql -c "$(yes update t set c=c where c=6 \; | head -n 100)"
>> | FEHLER:  Speicher aufgebraucht
>> | DETAIL:  Failed on request of size 200 in memory context "MessageContext".
>> `----
>>
>> The memory context dump shows plausible values except for the MessageContext:
>>
>> TopMemoryContext: 124336 total in 8 blocks; 18456 free (11 chunks); 105880 used
>>   [...]
>>   MessageContext: 264241152 total in 42 blocks; 264 free (0 chunks); 264240888 used
>>   [...]
>
> As David Rowley said, planning that query hundreds of times under a single
> MessageContext is not something that will end well on 11.3, because even a
> single instance takes up tons of memory that's only released when
> MessageContext is reset.
>
>> Maybe some tactically placed pfrees or avoiding putting redundant stuff
>> into MessageContext can relax the situation?
>
> I too have had similar thoughts on the matter.  If the planner had built
> all its subsidiary data structures in its own private context (or tree of
> contexts) which is reset once a plan for a given query is built and passed
> on, then there wouldn't be an issue of all of that subsidiary memory
> leaking into MessageContext.  However, the problem may really be that
> we're subjecting the planner to use cases that it wasn't perhaps designed
> to perform equally well under -- running it many times while handling the
> same message.  It is worsened by the fact that the query in question is
> something that ought to have been documented as not well supported by the
> planner; David has posted a documentation patch for that [1].  PG 12 has
> alleviated the situation to a large degree, so you won't see the OOM
> occurring for this query, but not for all queries unfortunately.


I admittedly haven't followed this thread too closely, but if having 100
partitions causes out of memory on pg11, that sounds like a massive
regression to me.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Excessive memory usage in multi-statement queries w/ partitioning

From
David Rowley
Date:
On Sat, 25 May 2019 at 00:18, Joe Conway <mail@joeconway.com> wrote:
> I admittedly haven't followed this thread too closely, but if having 100
> partitions causes out of memory on pg11, that sounds like a massive
> regression to me.

For it to have regressed it would have had to once have been better,
but where was that mentioned?  The only thing I saw was
non-partitioned tables compared to partitioned tables, but you can't
really say it's a regression if you're comparing apples to oranges.

I think the only regression here is in the docs: commit bebc46931a1
removed the warning about too many partitions in a partitioned table at
the end of ddl.sgml.  As Amit mentions, we'd like to put something back
about that.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Joe Conway
Date:
On 5/24/19 9:33 AM, David Rowley wrote:
> On Sat, 25 May 2019 at 00:18, Joe Conway <mail@joeconway.com> wrote:
>> I admittedly haven't followed this thread too closely, but if having 100
>> partitions causes out of memory on pg11, that sounds like a massive
>> regression to me.
>
> For it to have regressed it would have had to once have been better,
> but where was that mentioned?  The only thing I saw was
> non-partitioned tables compared to partitioned tables, but you can't
> really say it's a regression if you're comparing apples to oranges.


I have very successfully used multiple hundreds and even low thousands
of partitions without running out of memory under the older inheritance
based "partitioning", and declarative partitioning is supposed to be
(and we have advertised it to be) better, not worse, isn't it?

At least from my point of view if 100 partitions is unusable due to
memory leaks it is a regression. Perhaps not *technically* a regression
assuming it behaves this way in pg10 also, but I bet lots of users would
see it that way.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> On 5/24/19 9:33 AM, David Rowley wrote:
>> For it to have regressed it would have had to once have been better,
>> but where was that mentioned?  The only thing I saw was
>> non-partitioned tables compared to partitioned tables, but you can't
>> really say it's a regression if you're comparing apples to oranges.

> I have very successfully used multiple hundreds and even low thousands
> of partitions without running out of memory under the older inheritance
> based "partitioning", and declarative partitioning is supposed to be
> (and we have advertised it to be) better, not worse, isn't it?

Have you done the exact thing described in the test case?  I think
that's going to be quite unpleasantly memory-intensive in any version.

The real issue here is that we have designed around the assumption
that MessageContext will be used to parse and plan one single statement
before being reset.  The described usage breaks that assumption.
No matter how memory-efficient any one statement is or isn't,
if you throw enough of them at the backend without giving it a chance
to reset MessageContext, it won't end well.

So my thought, if we want to do something about this, is not "find
some things we can pfree at the end of planning" but "find a way
to use a separate context for each statement in the query string".
Maybe multi-query strings could be handled by setting up a child
context of MessageContext (after we've done the raw parsing there
and determined that indeed there are multiple queries), running
parse analysis and planning in that context, and resetting that
context after each query.

            regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Joe Conway
Date:
On 5/24/19 10:28 AM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> On 5/24/19 9:33 AM, David Rowley wrote:
>>> For it to have regressed it would have had to once have been better,
>>> but where was that mentioned?  The only thing I saw was
>>> non-partitioned tables compared to partitioned tables, but you can't
>>> really say it's a regression if you're comparing apples to oranges.
>
>> I have very successfully used multiple hundreds and even low thousands
>> of partitions without running out of memory under the older inheritance
>> based "partitioning", and declarative partitioning is supposed to be
>> (and we have advertised it to be) better, not worse, isn't it?
>
> Have you done the exact thing described in the test case?  I think
> that's going to be quite unpleasantly memory-intensive in any version.


Ok, fair point. Will test and report back.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Attachment

Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
On 2019/05/24 23:28, Tom Lane wrote:
> So my thought, if we want to do something about this, is not "find
> some things we can pfree at the end of planning" but "find a way
> to use a separate context for each statement in the query string".
> Maybe multi-query strings could be handled by setting up a child
> context of MessageContext (after we've done the raw parsing there
> and determined that indeed there are multiple queries), running
> parse analysis and planning in that context, and resetting that
> context after each query.

Maybe like the attached?  I'm not sure if we need to likewise be concerned
about execute_sql_string() being handed multi-query strings.

Thanks,
Amit

Attachment

Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
Hi,

On 2019/05/24 21:18, Joe Conway wrote:
> On 5/24/19 1:47 AM, Amit Langote wrote:
>> I too have had similar thoughts on the matter.  If the planner had built
>> all its subsidiary data structures in its own private context (or tree of
>> contexts) which is reset once a plan for a given query is built and passed
>> on, then there wouldn't be an issue of all of that subsidiary memory
>> leaking into MessageContext.  However, the problem may really be that
>> we're subjecting the planner to use cases that it wasn't perhaps designed
>> to perform equally well under -- running it many times while handling the
>> same message.  It is worsened by the fact that the query in question is
>> something that ought to have been documented as not well supported by the
>> planner; David has posted a documentation patch for that [1].  PG 12 has
>> alleviated the situation to a large degree, so you won't see the OOM
>> occurring for this query, but not for all queries unfortunately.
> 
> I admittedly haven't followed this thread too closely, but if having 100
> partitions causes out of memory on pg11, that sounds like a massive
> regression to me.

You won't run out of memory if you are running just one query per message,
but that's not the case in this discussion.  With multi-query submissions
like this one, the memory taken up by parsing and planning *all* of the
queries adds up in a single MessageContext, which can lead to OOM if there
are enough queries to push MessageContext past the limit.  The only point
I was trying to make is that reaching OOM this way is easier with
partitioning, because of the age-old behavior that planning UPDATE/DELETE
queries on inherited tables (and hence partitioned tables) needs tons of
memory, and that memory grows with the number of child tables / partitions.

We fixed things in PG 12, at least for partitioning, so that as long as a
query needs to affect only a small number of the partitions present, its
planning will use a fixed amount of CPU and memory, and increasing the
number of partitions won't lead to explosive growth in the memory used.
As you can tell, however, that effort had nothing to do with improving
the situation for multi-query submissions.

Thanks,
Amit




Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Tom Lane
Date:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2019/05/24 23:28, Tom Lane wrote:
>> So my thought, if we want to do something about this, is not "find
>> some things we can pfree at the end of planning" but "find a way
>> to use a separate context for each statement in the query string".

> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> about exec_sql_string() being handed multi-query strings.

Please add this to the upcoming CF so we don't forget about it.
(I don't think there's anything very new about this behavior, so
I don't feel that we should consider it an open item for v12 ---
anyone think differently?)

            regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
On 2019/05/27 21:56, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2019/05/24 23:28, Tom Lane wrote:
>>> So my thought, if we want to do something about this, is not "find
>>> some things we can pfree at the end of planning" but "find a way
>>> to use a separate context for each statement in the query string".
> 
>> Maybe like the attached?  I'm not sure if we need to likewise be concerned
>> about exec_sql_string() being handed multi-query strings.
> 
> Please add this to the upcoming CF so we don't forget about it.

Done; added to Performance for lack of a better topic for this.

https://commitfest.postgresql.org/23/2131/

> (I don't think there's anything very new about this behavior, so
> I don't feel that we should consider it an open item for v12 ---
> anyone think differently?)

Agree that there's nothing new about the behavior itself.  What may be new
though is people getting increasingly bitten by it if they query tables
containing large numbers of partitions most of which need to be scanned
[1].  That is, provided they have use cases where a single client request
contains hundreds of such queries to begin with.

Thanks,
Amit


[1] AFAICT, that's the only class of queries where the planner needs to
keep a lot of stuff around, the memory cost of which increases with the
number of partitions.  I had thought that planning complex queries
involving tons of indexes, joins, etc. would also hoard tons of memory,
but apparently not, because the planner seems fairly good at cleaning up
after itself as it does the work.




Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Julien Rouhaud
Date:
On Tue, May 28, 2019 at 6:57 AM Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2019/05/27 21:56, Tom Lane wrote:
> > Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> >> On 2019/05/24 23:28, Tom Lane wrote:
> >>> So my thought, if we want to do something about this, is not "find
> >>> some things we can pfree at the end of planning" but "find a way
> >>> to use a separate context for each statement in the query string".
> >
> >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> >> about exec_sql_string() being handed multi-query strings.

The whole extension SQL script is passed to execute_sql_string(), so I
think it's a good thing to have a similar workaround there.

About the patch:

-        * Switch to appropriate context for constructing querytrees (again,
-        * these must outlive the execution context).
+        * Switch to appropriate context for constructing querytrees.
+        * Memory allocated during this construction is released before
+        * the generated plan is executed.

The comment should mention query and plan trees, everything else seems ok to me.



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
Hi Julien,

Thanks for taking a look at this.

On Thu, Jul 4, 2019 at 6:52 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> On Tue, May 28, 2019 at 6:57 AM Amit Langote wrote:
> > >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> > >> about exec_sql_string() being handed multi-query strings.
>
> the whole extension sql script is passed to execute_sql_string(), so I
> think that it's a good thing to have similar workaround there.

That makes sense, although a memory usage explosion is perhaps much less
likely in execute_sql_string(), because the scripts passed to it mostly
contain utility statements and rarely anything whose planning will
explode in memory usage.

Anyway, I've added similar handling in execute_sql_string() for consistency.

Now I wonder if we'll need to consider another path which calls
pg_plan_queries() on a possibly multi-statement query --
BuildCachedPlan()...

> About the patch:
>
>  -        * Switch to appropriate context for constructing querytrees (again,
> -        * these must outlive the execution context).
> +        * Switch to appropriate context for constructing querytrees.
> +        * Memory allocated during this construction is released before
> +        * the generated plan is executed.
>
> The comment should mention query and plan trees, everything else seems ok to me.

Okay, fixed.

Attached updated patch.  Thanks again.

Regards,
Amit

Attachment

Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Julien Rouhaud
Date:
Hi Amit,

On Mon, Jul 8, 2019 at 10:52 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Thu, Jul 4, 2019 at 6:52 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > On Tue, May 28, 2019 at 6:57 AM Amit Langote wrote:
> > > >> Maybe like the attached?  I'm not sure if we need to likewise be concerned
> > > >> about exec_sql_string() being handed multi-query strings.
> >
> > the whole extension sql script is passed to execute_sql_string(), so I
> > think that it's a good thing to have similar workaround there.
>
> That makes sense, although a memory usage explosion is perhaps much less
> likely in execute_sql_string(), because the scripts passed to it mostly
> contain utility statements and rarely anything whose planning will
> explode in memory usage.
>
> Anyway, I've added similar handling in execute_sql_string() for consistency.

Thanks!

> Now I wonder if we'll need to consider another path which calls
> pg_plan_queries() on a possibly multi-statement query --
> BuildCachedPlan()...

I also thought about this when reviewing the patch, but AFAICS you
can't provide a multi-statement query to BuildCachedPlan() using
prepared statements and I'm not sure that SPI is worth the trouble.
I'll mark this patch as ready for committer.

>
> > About the patch:
> >
> >  -        * Switch to appropriate context for constructing querytrees (again,
> > -        * these must outlive the execution context).
> > +        * Switch to appropriate context for constructing querytrees.
> > +        * Memory allocated during this construction is released before
> > +        * the generated plan is executed.
> >
> > The comment should mention query and plan trees, everything else seems ok to me.
>
> Okay, fixed.
>
> Attached updated patch.  Thanks again.
>
> Regards,
> Amit



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> [ parse-plan-memcxt_v2.patch ]

I got around to looking at this finally.  I'm not at all happy with
the fact that it's added a plantree copy step to the only execution
path through exec_simple_query.  That's a very significant overhead,
in many use-cases, to solve something that nobody had complained
about for a couple of decades before now.  I don't see the need for
any added copy step anyway.  The only reason you're doing it AFAICS
is so you can release the per-statement context a bit earlier, which
is a completely unnecessary optimization.  Just wait to release it
till the bottom of the loop.

Also, creating/deleting the sub-context is in itself an added overhead
that accomplishes exactly nothing in the typical case where there's
not multiple statements.  I thought the idea was to do that only if
there was more than one raw parsetree (or, maybe better, do it for
all but the last parsetree).

To show that this isn't an empty concern, I did a quick pgbench
test.  Using a single-client select-only test ("pgbench -S -T 60"
in an -s 10 database), I got these numbers in three trials with HEAD:

tps = 9593.818478 (excluding connections establishing)
tps = 9570.189163 (excluding connections establishing)
tps = 9596.579038 (excluding connections establishing)

and these numbers after applying the patch:

tps = 9411.918165 (excluding connections establishing)
tps = 9389.279079 (excluding connections establishing)
tps = 9409.350175 (excluding connections establishing)

That's about a 2% dropoff.  Now it's possible that that can be
explained away as random variations from a slightly different layout
of critical loops vs cacheline boundaries ... but I don't believe it.

            regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
On Tue, Jul 9, 2019 at 6:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Langote <amitlangote09@gmail.com> writes:
> > [ parse-plan-memcxt_v2.patch ]
>
> I got around to looking at this finally.

Thanks for the review.

> I'm not at all happy with
> the fact that it's added a plantree copy step to the only execution
> path through exec_simple_query.  That's a very significant overhead,
> in many use-cases, to solve something that nobody had complained
> about for a couple of decades before now.  I don't see the need for
> any added copy step anyway.  The only reason you're doing it AFAICS
> is so you can release the per-statement context a bit earlier, which
> is a completely unnecessary optimization.  Just wait to release it
> till the bottom of the loop.

Ah, that makes sense.  I've removed the copying of the plan tree and also
moved the temporary context deletion to the bottom of the loop.

> Also, creating/deleting the sub-context is in itself an added overhead
> that accomplishes exactly nothing in the typical case where there's
> not multiple statements.  I thought the idea was to do that only if
> there was more than one raw parsetree (or, maybe better, do it for
> all but the last parsetree).

That makes sense too.  I've made the creation/deletion of the child
context conditional on whether there is more than one query to plan.

> To show that this isn't an empty concern, I did a quick pgbench
> test.  Using a single-client select-only test ("pgbench -S -T 60"
> in an -s 10 database), I got these numbers in three trials with HEAD:
>
> tps = 9593.818478 (excluding connections establishing)
> tps = 9570.189163 (excluding connections establishing)
> tps = 9596.579038 (excluding connections establishing)
>
> and these numbers after applying the patch:
>
> tps = 9411.918165 (excluding connections establishing)
> tps = 9389.279079 (excluding connections establishing)
> tps = 9409.350175 (excluding connections establishing)
>
> That's about a 2% dropoff.

With the updated patch, here are the numbers on my machine (HEAD vs patch)

HEAD:

tps = 3586.233815 (excluding connections establishing)
tps = 3569.252542 (excluding connections establishing)
tps = 3559.027733 (excluding connections establishing)

Patched:

tps = 3586.988057 (excluding connections establishing)
tps = 3585.169589 (excluding connections establishing)
tps = 3526.437968 (excluding connections establishing)

A bit noisy but not much degradation.

Attached updated patch.  Thanks again.

Regards,
Amit

Attachment

Re: Excessive memory usage in multi-statement queries w/partitioning

From
Kyotaro Horiguchi
Date:
Hi,

At Wed, 10 Jul 2019 16:35:18 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
<CA+HiwqFCO4c8tdQmXcDNzyaD43A81caapYLJ6CEh8H3P0tPL4A@mail.gmail.com>
> On Tue, Jul 9, 2019 at 6:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Langote <amitlangote09@gmail.com> writes:
> > > [ parse-plan-memcxt_v2.patch ]
> >
> > I got around to looking at this finally.
> 
> Thanks for the review.
> 
> > I'm not at all happy with
> > the fact that it's added a plantree copy step to the only execution
> > path through exec_simple_query.  That's a very significant overhead,
> > in many use-cases, to solve something that nobody had complained
> > about for a couple of decades before now.  I don't see the need for
> > any added copy step anyway.  The only reason you're doing it AFAICS
> > is so you can release the per-statement context a bit earlier, which
> > is a completely unnecessary optimization.  Just wait to release it
> > till the bottom of the loop.
> 
> Ah, that makes sense.  I've removed the copying of plan tree and also
> moved the temporary context deletion to the bottom of the loop.

-     * Switch to appropriate context for constructing querytrees (again,
-     * these must outlive the execution context).
+     * Switch to appropriate context for constructing query and plan trees
+     * (again, these must outlive the execution context).  Normally, it's
+     * MessageContext, but if there are more queries to plan, we use a
+     * temporary child context that will be reset after executing this
+     * query.  We avoid that overhead of setting up a separate context
+     * for the common case of having just a single query.

This might be stupid, but I feel uneasy about "normally it must live in
MessageContext, but not necessarily if there are succeeding queries".
*I* need more explanation of why it is safe to use that short-lived
context.

> > Also, creating/deleting the sub-context is in itself an added overhead
> > that accomplishes exactly nothing in the typical case where there's
> > not multiple statements.  I thought the idea was to do that only if
> > there was more than one raw parsetree (or, maybe better, do it for
> > all but the last parsetree).
> 
> That makes sense too.  I've made it (creation/deletion of the child
> context) conditional on whether there is more than one query to
> plan.
>
> > To show that this isn't an empty concern, I did a quick pgbench
> > test.  Using a single-client select-only test ("pgbench -S -T 60"
> > in an -s 10 database), I got these numbers in three trials with HEAD:
> >
> > tps = 9593.818478 (excluding connections establishing)
> > tps = 9570.189163 (excluding connections establishing)
> > tps = 9596.579038 (excluding connections establishing)
> >
> > and these numbers after applying the patch:
> >
> > tps = 9411.918165 (excluding connections establishing)
> > tps = 9389.279079 (excluding connections establishing)
> > tps = 9409.350175 (excluding connections establishing)
> >
> > That's about a 2% dropoff.
> 
> With the updated patch, here are the numbers on my machine (HEAD vs patch)
> 
> HEAD:
> 
> tps = 3586.233815 (excluding connections establishing)
> tps = 3569.252542 (excluding connections establishing)
> tps = 3559.027733 (excluding connections establishing)
> 
> Patched:
> 
> tps = 3586.988057 (excluding connections establishing)
> tps = 3585.169589 (excluding connections establishing)
> tps = 3526.437968 (excluding connections establishing)
> 
> A bit noisy but not much degradation.
> 
> Attached updated patch.  Thanks again.
> 
> Regards,
> Amit

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Tom Lane
Date:
Amit Langote <amitlangote09@gmail.com> writes:
> Attached updated patch.  Thanks again.

Pushed with a bit of further cleanup --- most notably, the way
you had execute_sql_string(), it was still leaking any cruft
ProcessUtility might generate.  We can fix that by running
ProcessUtility in the per-statement context too.

I also dropped the optimization for a single/last statement in
execute_sql_string(), and simplified it to just always create
and delete a child context.  This was based on a couple of
thoughts.  The norm in this code path is that there's multiple
statements, probably lots of them, so that the percentage savings
from getting rid of one context creation is likely negligible.
Also, unlike exec_simple_query, we *don't* know that the outer
context is due to be cleared right afterwards.  Since
execute_sql_string() can run multiple times in one extension
command, in principle we could get bloat from not cleaning up
after the last command of each string.   Admittedly, it's not
likely that you'd have so many strings involved that that
amounts to a lot, but between following upgrade-script chains
and cascaded module loads, there could be more than a couple.
So it seems like the argument for saving a context creation is
much weaker here than in exec_simple_query.

I tried to improve the comments too.  I noticed that the bit about
"(again, these must outlive the execution context)" seemed to be
a dangling reference --- whatever previous comment it was referring
to is not to be found anymore.  So I made that self-contained.

            regards, tom lane



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
On Thu, Jul 11, 2019 at 3:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Langote <amitlangote09@gmail.com> writes:
> > Attached updated patch.  Thanks again.
>
> Pushed with a bit of further cleanup

Thanks a lot.

> --- most notably, the way
> you had execute_sql_string(), it was still leaking any cruft
> ProcessUtility might generate.  We can fix that by running
> ProcessUtility in the per-statement context too.

Ah, I was thinking only about planning.

> I also dropped the optimization for a single/last statement in
> execute_sql_string(), and simplified it to just always create
> and delete a child context.  This was based on a couple of
> thoughts.  The norm in this code path is that there's multiple
> statements, probably lots of them, so that the percentage savings
> from getting rid of one context creation is likely negligible.
> Also, unlike exec_simple_query, we *don't* know that the outer
> context is due to be cleared right afterwards.  Since
> execute_sql_string() can run multiple times in one extension
> command, in principle we could get bloat from not cleaning up
> after the last command of each string.   Admittedly, it's not
> likely that you'd have so many strings involved that that
> amounts to a lot, but between following upgrade-script chains
> and cascaded module loads, there could be more than a couple.
> So it seems like the argument for saving a context creation is
> much weaker here than in exec_simple_query.

Agreed.

> I tried to improve the comments too.  I noticed that the bit about
> "(again, these must outlive the execution context)" seemed to be
> a dangling reference --- whatever previous comment it was referring
> to is not to be found anymore.  So I made that self-contained.

Thanks.

Regards,
Amit



Re: Excessive memory usage in multi-statement queries w/ partitioning

From
Amit Langote
Date:
Horiguchi-san,

Thanks for the comment.  My reply is a bit late now, but....

On Wed, Jul 10, 2019 at 5:39 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> At Wed, 10 Jul 2019 16:35:18 +0900, Amit Langote <amitlangote09@gmail.com> wrote:
> -     * Switch to appropriate context for constructing querytrees (again,
> -     * these must outlive the execution context).
> +     * Switch to appropriate context for constructing query and plan trees
> +     * (again, these must outlive the execution context).  Normally, it's
> +     * MessageContext, but if there are more queries to plan, we use a
> +     * temporary child context that will be reset after executing this
> +     * query.  We avoid that overhead of setting up a separate context
> +     * for the common case of having just a single query.
>
> Might be a stupid question, but I feel uneasy about "usually it must
> live in MessageContext, but not necessarily if there is a succeeding
> query"... *I* need more explanation of why it is safe to use that
> short-lived context.

So the problem we're trying to solve with this is that the memory
consumed when parsing/planning individual statements piles up in a
single context (currently, MessageContext), which can lead to severe
bloat, especially if the planning of individual statements consumes a
huge amount of memory.  The solution is to use a sub-context of
MessageContext for each statement that's reset when its execution is
finished.  I think it's safe to use a shorter-lived context for each
statement because the memory of a given statement should not need to
be referenced after its execution finishes.  Do you see any problem
with that assumption?

Thanks,
Amit