Thread: Discarding DISCARD ALL

Discarding DISCARD ALL

From
Simon Riggs
Date:
Currently, session poolers operating in transaction mode need to send
a "server_reset_query" which is mostly DISCARD ALL.

It seems strange to me that we put this work onto the pooler, forcing
poolers to repeatedly issue the same command, at some cost in
performance. Measuring the overhead with pgbench might miss the points
that poolers are frequently configured on different network hosts and
that monitoring tools used in production will record the DISCARD
statement. YMMV, but the overhead is measurably non-zero.

Proposal is to have a simple new parameter:
  transaction_cleanup = off (default) | on
A setting of "on" will issue the equivalent of a DISCARD ALL as soon
as the transaction has been ended by a COMMIT, ROLLBACK or PREPARE.

Poolers such as pgbouncer would then be able to connect transaction
mode pools by setting transaction_cleanup=on at time of connection,
avoiding any need to issue a server_reset_query, removing the DISCARD
ALL command from the normal execution path, while still achieving the
same thing.

This has an additional side benefit: if we know we will clean up at
the end of the transaction, then all temp tables become effectively ON
COMMIT DROP and we are able to allow temp tables in prepared
transactions. There are likely other side benefits from this
knowledge, allowing us to further tune the PostgreSQL server to the
common use case of transaction session poolers. I think it should be
possible to avoid looking for holdable portals if we are dropping them
all anyway.

Patch attached, passes make check with new tests added.

Comments welcome.

-- 
Simon Riggs                http://www.EnterpriseDB.com/

Attachment

Re: Discarding DISCARD ALL

From
Vladimir Sitnikov
Date:
Simon>It seems strange to me that we put this work onto the pooler, forcing
Simon>poolers to repeatedly issue the same command

What if poolers learn to manage connections and prepared statements better?
Then poolers won't have to reset the session every time, and everyone wins.

Simon>This has an additional side benefit: if we know we will clean up at
Simon>the end of the transaction, then all temp tables become effectively ON
Simon>COMMIT DROP

The ability to use the temporary tables sounds cool, however, server-prepared statements
allow improve performance significantly (both at frontend and backend), so I would not treat
"server_reset_query=discard all" as a best practice.
That is why transaction_cleanup=on (discard everything at transaction finish) seems to be a workaround
rather than a best practice for the future.

Just to clarify: the patch seems to be small, so it looks harmless, however, I would not be that
enthusiastic with tying new features with transaction_cleanup=on.

---

What do you mean by "session cleanup"?
Does that always have to be the same as "discard all"?
What if the user wants "deallocate all" behavior?
Should transaction_cleanup be an enum rather than on/off?

Vladimir

Re: Discarding DISCARD ALL

From
Andreas Karlsson
Date:
On 12/23/20 3:47 PM, Vladimir Sitnikov wrote:
> Simon>It seems strange to me that we put this work onto the pooler, forcing
> Simon>poolers to repeatedly issue the same command
> 
> What if poolers learn to manage connections and prepared statements better?
> Then poolers won't have to reset the session every time, and everyone wins.

While that is be possible to implement since some client libraries 
implement this in their pools (e.g. Sequel for Ruby) this patch would 
help connection poolers which are not aware of prepared statements, for 
example PgBouncer, so it is worthwhile as long as there are connection 
poolers out there which are not aware of prepared statements. And even 
the connection poolers which are aware might want to automatically drop 
temporary tables and reset GUCs. So I do not think that this feature 
would become pointless even if people write a patch for PgBouncer.

> Simon>This has an additional side benefit: if we know we will clean up at
> Simon>the end of the transaction, then all temp tables become effectively ON
> Simon>COMMIT DROP
> 
> The ability to use the temporary tables sounds cool, however, 
> server-prepared statements
> allow improve performance significantly (both at frontend and backend), 
> so I would not treat
> "server_reset_query=discard all" as a best practice.
> That is why transaction_cleanup=on (discard everything at transaction 
> finish) seems to be a workaround
> rather than a best practice for the future.

While I know how to add support for prepared statements to a connection 
pooler it is unclear to me how one would add support for temporary 
tables which are not at least ON COMMIT DELETE ROWS since rows inserted 
on one connection will only be visible at the connection.

Andreas




Re: Discarding DISCARD ALL

From
Simon Riggs
Date:
On Wed, 23 Dec 2020 at 15:19, Andreas Karlsson <andreas@proxel.se> wrote:
>
> On 12/23/20 3:47 PM, Vladimir Sitnikov wrote:
> > Simon>It seems strange to me that we put this work onto the pooler, forcing
> > Simon>poolers to repeatedly issue the same command
> >
> > What if poolers learn to manage connections and prepared statements better?
> > Then poolers won't have to reset the session every time, and everyone wins.
>
> While that is be possible to implement since some client libraries
> implement this in their pools (e.g. Sequel for Ruby) this patch would
> help connection poolers which are not aware of prepared statements, for
> example PgBouncer, so it is worthwhile as long as there are connection
> poolers out there which are not aware of prepared statements. And even
> the connection poolers which are aware might want to automatically drop
> temporary tables and reset GUCs. So I do not think that this feature
> would become pointless even if people write a patch for PgBouncer.

The whole premise of the patch is tighter integration, with the server
providing the facilities that poolers need.

The patch can be enhanced to do whatever else we agree is desirable.

Do we need something like DISCARD ALL EXCEPT PREPARED STATEMENTS;  ??

If there are different requirements for each pooler, what are they?

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: Discarding DISCARD ALL

From
Andreas Karlsson
Date:
On 12/23/20 6:49 PM, Simon Riggs wrote:
> The whole premise of the patch is tighter integration, with the server
> providing the facilities that poolers need.

I am all for that. Ideally I would want builtin connection pooling but 
short term I think the way forward is most likely tighter integration.

> The patch can be enhanced to do whatever else we agree is desirable.
> 
> Do we need something like DISCARD ALL EXCEPT PREPARED STATEMENTS;  ??
> 
> If there are different requirements for each pooler, what are they?

If someone adds prepared statement support to e.g. PgBouncer that might 
be a nice feature to have. Plus maybe something like "SET 
transaction_cleanup = 'except_prepared'". But right now I think the 
pools which support prepared statements do not use DISCARD ALL and 
instead just trust the end user to not run SET or use temporary tables 
which outlive transactions.

Andreas



Re: Discarding DISCARD ALL

From
Peter Eisentraut
Date:
On 2020-12-23 15:33, Simon Riggs wrote:
> Poolers such as pgbouncer would then be able to connect transaction
> mode pools by setting transaction_cleanup=on at time of connection,
> avoiding any need to issue a server_reset_query, removing the DISCARD
> ALL command from the normal execution path, while still achieving the
> same thing.

PgBouncer does not send DISCARD ALL in transaction mode.  There is a 
separate setting to do that, but it's not the default, and it's more of 
a workaround for bad client code.  So I don't know if this feature would 
be of much use for PgBouncer.  Other connection poolers might have other 
opinions.




Re: Discarding DISCARD ALL

From
James Coleman
Date:
I hope to do further review of the patch later this week, but I wanted
to at least comment on this piece:

On Wed, Jan 20, 2021 at 2:48 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 2020-12-23 15:33, Simon Riggs wrote:
> > Poolers such as pgbouncer would then be able to connect transaction
> > mode pools by setting transaction_cleanup=on at time of connection,
> > avoiding any need to issue a server_reset_query, removing the DISCARD
> > ALL command from the normal execution path, while still achieving the
> > same thing.
>
> PgBouncer does not send DISCARD ALL in transaction mode.  There is a
> separate setting to do that, but it's not the default, and it's more of
> a workaround for bad client code.  So I don't know if this feature would
> be of much use for PgBouncer.  Other connection poolers might have other
> opinions.

Yes, to have server_reset_query apply in transaction pooling mode you
have to additionally configure pgbouncer with
server_reset_query_always enabled.

I'd mildly take issue with "a workaround for bad client code". Yes,
clients in transaction pooling mode shouldn't issue (for example) `SET
...`, but there's no way I'm aware of in Postgres to prevent
session-specific items like those GUCs from being set by a given user,
so I view it more like a safeguard than a workaround.

In our setup we have server_reset_query_always=1 as such a safeguard,
because it's too easy for application code to update, for example,
statement_timeout to disastrous results. But we also work to make sure
those don't happen (or get cleaned up if they happen to slip in).

An alternative approach that occurred to me while typing this reply: a
setting in Postgres that would disallow setting session level GUCs
(i.e., enforce `SET LOCAL` transaction level usage instead) would
remove a large chunk of our need to set server_reset_query_always=1
(and more interestingly it'd highlight when broken code gets pushed).
But even with that, I see some value in the proposed setting since
there is additional session state beyond GUCs.

James



Re: Discarding DISCARD ALL

From
Simon Riggs
Date:
On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote:

> An alternative approach that occurred to me while typing this reply: a
> setting in Postgres that would disallow setting session level GUCs
> (i.e., enforce `SET LOCAL` transaction level usage instead) would
> remove a large chunk of our need to set server_reset_query_always=1
> (and more interestingly it'd highlight when broken code gets pushed).
> But even with that, I see some value in the proposed setting since
> there is additional session state beyond GUCs.

With transaction_cleanup=on we could force all SETs to be SET LOCAL.

The point is that if we declare ahead of time that the transaction
will be reset then we can act differently and more easily for various
circumstances, for SETs, for Temp tables and others.

-- 
Simon Riggs                http://www.EnterpriseDB.com/



Re: Discarding DISCARD ALL

From
James Coleman
Date:
On Wed, Jan 20, 2021 at 9:58 AM Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote:
>
> > An alternative approach that occurred to me while typing this reply: a
> > setting in Postgres that would disallow setting session level GUCs
> > (i.e., enforce `SET LOCAL` transaction level usage instead) would
> > remove a large chunk of our need to set server_reset_query_always=1
> > (and more interestingly it'd highlight when broken code gets pushed).
> > But even with that, I see some value in the proposed setting since
> > there is additional session state beyond GUCs.
>
> With transaction_cleanup=on we could force all SETs to be SET LOCAL.
>
> The point is that if we declare ahead of time that the transaction
> will be reset then we can act differently and more easily for various
> circumstances, for SETs, for Temp tables and others.

Right, I agree it's independently useful. My "alternative" is a subset
of that functionality and doesn't cover as many cases.

James



Re: Discarding DISCARD ALL

From
Simon Riggs
Date:
On Wed, Jan 20, 2021 at 3:53 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Wed, Jan 20, 2021 at 9:58 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote:
> >
> > > An alternative approach that occurred to me while typing this reply: a
> > > setting in Postgres that would disallow setting session level GUCs
> > > (i.e., enforce `SET LOCAL` transaction level usage instead) would
> > > remove a large chunk of our need to set server_reset_query_always=1
> > > (and more interestingly it'd highlight when broken code gets pushed).
> > > But even with that, I see some value in the proposed setting since
> > > there is additional session state beyond GUCs.
> >
> > With transaction_cleanup=on we could force all SETs to be SET LOCAL.
> >
> > The point is that if we declare ahead of time that the transaction
> > will be reset then we can act differently and more easily for various
> > circumstances, for SETs, for Temp tables and others.
>
> Right, I agree it's independently useful. My "alternative" is a subset
> of that functionality and doesn't cover as many cases.

So if we go for that option, would we call it?

session_state = 'session' (default) | 'local_set'

If you use 'local' then you find that all state is transaction only
* SET defaults to meaning SET LOCAL
* SET SESSION returns ERROR

-- 
Simon Riggs                http://www.EnterpriseDB.com/