Thread: Discarding DISCARD ALL
Currently, session poolers operating in transaction mode need to send a "server_reset_query" which is mostly DISCARD ALL. It seems strange to me that we put this work onto the pooler, forcing poolers to repeatedly issue the same command, at some cost in performance. Measuring the overhead with pgbench might miss the points that poolers are frequently configured on different network hosts and that monitoring tools used in production will record the DISCARD statement. YMMV, but the overhead is measurably non-zero. Proposal is to have a simple new parameter: transaction_cleanup = off (default) | on A setting of "on" will issue the equivalent of a DISCARD ALL as soon as the transaction has been ended by a COMMIT, ROLLBACK or PREPARE. Poolers such as pgbouncer would then be able to connect transaction mode pools by setting transaction_cleanup=on at time of connection, avoiding any need to issue a server_reset_query, removing the DISCARD ALL command from the normal execution path, while still achieving the same thing. This has an additional side benefit: if we know we will clean up at the end of the transaction, then all temp tables become effectively ON COMMIT DROP and we are able to allow temp tables in prepared transactions. There are likely other side benefits from this knowledge, allowing us to further tune the PostgreSQL server to the common use case of transaction session poolers. I think it should be possible to avoid looking for holdable portals if we are dropping them all anyway. Patch attached, passes make check with new tests added. Comments welcome. -- Simon Riggs http://www.EnterpriseDB.com/
Attachment
Simon>It seems strange to me that we put this work onto the pooler, forcing
Simon>poolers to repeatedly issue the same commandWhat if poolers learn to manage connections and prepared statements better?
Then poolers won't have to reset the session every time, and everyone wins.
Simon>the end of the transaction, then all temp tables become effectively ON
Simon>COMMIT DROP
The ability to use the temporary tables sounds cool, however, server-prepared statements
allow improve performance significantly (both at frontend and backend), so I would not treat
"server_reset_query=discard all" as a best practice.
That is why transaction_cleanup=on (discard everything at transaction finish) seems to be a workaround
rather than a best practice for the future.
Just to clarify: the patch seems to be small, so it looks harmless, however, I would not be that
enthusiastic with tying new features with transaction_cleanup=on.
---
What do you mean by "session cleanup"?
Does that always have to be the same as "discard all"?
What if the user wants "deallocate all" behavior?
Should transaction_cleanup be an enum rather than on/off?
Vladimir
On 12/23/20 3:47 PM, Vladimir Sitnikov wrote: > Simon>It seems strange to me that we put this work onto the pooler, forcing > Simon>poolers to repeatedly issue the same command > > What if poolers learn to manage connections and prepared statements better? > Then poolers won't have to reset the session every time, and everyone wins. While that is be possible to implement since some client libraries implement this in their pools (e.g. Sequel for Ruby) this patch would help connection poolers which are not aware of prepared statements, for example PgBouncer, so it is worthwhile as long as there are connection poolers out there which are not aware of prepared statements. And even the connection poolers which are aware might want to automatically drop temporary tables and reset GUCs. So I do not think that this feature would become pointless even if people write a patch for PgBouncer. > Simon>This has an additional side benefit: if we know we will clean up at > Simon>the end of the transaction, then all temp tables become effectively ON > Simon>COMMIT DROP > > The ability to use the temporary tables sounds cool, however, > server-prepared statements > allow improve performance significantly (both at frontend and backend), > so I would not treat > "server_reset_query=discard all" as a best practice. > That is why transaction_cleanup=on (discard everything at transaction > finish) seems to be a workaround > rather than a best practice for the future. While I know how to add support for prepared statements to a connection pooler it is unclear to me how one would add support for temporary tables which are not at least ON COMMIT DELETE ROWS since rows inserted on one connection will only be visible at the connection. Andreas
On Wed, 23 Dec 2020 at 15:19, Andreas Karlsson <andreas@proxel.se> wrote: > > On 12/23/20 3:47 PM, Vladimir Sitnikov wrote: > > Simon>It seems strange to me that we put this work onto the pooler, forcing > > Simon>poolers to repeatedly issue the same command > > > > What if poolers learn to manage connections and prepared statements better? > > Then poolers won't have to reset the session every time, and everyone wins. > > While that is be possible to implement since some client libraries > implement this in their pools (e.g. Sequel for Ruby) this patch would > help connection poolers which are not aware of prepared statements, for > example PgBouncer, so it is worthwhile as long as there are connection > poolers out there which are not aware of prepared statements. And even > the connection poolers which are aware might want to automatically drop > temporary tables and reset GUCs. So I do not think that this feature > would become pointless even if people write a patch for PgBouncer. The whole premise of the patch is tighter integration, with the server providing the facilities that poolers need. The patch can be enhanced to do whatever else we agree is desirable. Do we need something like DISCARD ALL EXCEPT PREPARED STATEMENTS; ?? If there are different requirements for each pooler, what are they? -- Simon Riggs http://www.EnterpriseDB.com/
On 12/23/20 6:49 PM, Simon Riggs wrote: > The whole premise of the patch is tighter integration, with the server > providing the facilities that poolers need. I am all for that. Ideally I would want builtin connection pooling but short term I think the way forward is most likely tighter integration. > The patch can be enhanced to do whatever else we agree is desirable. > > Do we need something like DISCARD ALL EXCEPT PREPARED STATEMENTS; ?? > > If there are different requirements for each pooler, what are they? If someone adds prepared statement support to e.g. PgBouncer that might be a nice feature to have. Plus maybe something like "SET transaction_cleanup = 'except_prepared'". But right now I think the pools which support prepared statements do not use DISCARD ALL and instead just trust the end user to not run SET or use temporary tables which outlive transactions. Andreas
On 2020-12-23 15:33, Simon Riggs wrote: > Poolers such as pgbouncer would then be able to connect transaction > mode pools by setting transaction_cleanup=on at time of connection, > avoiding any need to issue a server_reset_query, removing the DISCARD > ALL command from the normal execution path, while still achieving the > same thing. PgBouncer does not send DISCARD ALL in transaction mode. There is a separate setting to do that, but it's not the default, and it's more of a workaround for bad client code. So I don't know if this feature would be of much use for PgBouncer. Other connection poolers might have other opinions.
I hope to do further review of the patch later this week, but I wanted to at least comment on this piece: On Wed, Jan 20, 2021 at 2:48 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 2020-12-23 15:33, Simon Riggs wrote: > > Poolers such as pgbouncer would then be able to connect transaction > > mode pools by setting transaction_cleanup=on at time of connection, > > avoiding any need to issue a server_reset_query, removing the DISCARD > > ALL command from the normal execution path, while still achieving the > > same thing. > > PgBouncer does not send DISCARD ALL in transaction mode. There is a > separate setting to do that, but it's not the default, and it's more of > a workaround for bad client code. So I don't know if this feature would > be of much use for PgBouncer. Other connection poolers might have other > opinions. Yes, to have server_reset_query apply in transaction pooling mode you have to additionally configure pgbouncer with server_reset_query_always enabled. I'd mildly take issue with "a workaround for bad client code". Yes, clients in transaction pooling mode shouldn't issue (for example) `SET ...`, but there's no way I'm aware of in Postgres to prevent session-specific items like those GUCs from being set by a given user, so I view it more like a safeguard than a workaround. In our setup we have server_reset_query_always=1 as such a safeguard, because it's too easy for application code to update, for example, statement_timeout to disastrous results. But we also work to make sure those don't happen (or get cleaned up if they happen to slip in). An alternative approach that occurred to me while typing this reply: a setting in Postgres that would disallow setting session level GUCs (i.e., enforce `SET LOCAL` transaction level usage instead) would remove a large chunk of our need to set server_reset_query_always=1 (and more interestingly it'd highlight when broken code gets pushed). But even with that, I see some value in the proposed setting since there is additional session state beyond GUCs. James
On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote: > An alternative approach that occurred to me while typing this reply: a > setting in Postgres that would disallow setting session level GUCs > (i.e., enforce `SET LOCAL` transaction level usage instead) would > remove a large chunk of our need to set server_reset_query_always=1 > (and more interestingly it'd highlight when broken code gets pushed). > But even with that, I see some value in the proposed setting since > there is additional session state beyond GUCs. With transaction_cleanup=on we could force all SETs to be SET LOCAL. The point is that if we declare ahead of time that the transaction will be reset then we can act differently and more easily for various circumstances, for SETs, for Temp tables and others. -- Simon Riggs http://www.EnterpriseDB.com/
On Wed, Jan 20, 2021 at 9:58 AM Simon Riggs <simon@2ndquadrant.com> wrote: > > On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote: > > > An alternative approach that occurred to me while typing this reply: a > > setting in Postgres that would disallow setting session level GUCs > > (i.e., enforce `SET LOCAL` transaction level usage instead) would > > remove a large chunk of our need to set server_reset_query_always=1 > > (and more interestingly it'd highlight when broken code gets pushed). > > But even with that, I see some value in the proposed setting since > > there is additional session state beyond GUCs. > > With transaction_cleanup=on we could force all SETs to be SET LOCAL. > > The point is that if we declare ahead of time that the transaction > will be reset then we can act differently and more easily for various > circumstances, for SETs, for Temp tables and others. Right, I agree it's independently useful. My "alternative" is a subset of that functionality and doesn't cover as many cases. James
On Wed, Jan 20, 2021 at 3:53 PM James Coleman <jtc331@gmail.com> wrote: > > On Wed, Jan 20, 2021 at 9:58 AM Simon Riggs <simon@2ndquadrant.com> wrote: > > > > On Wed, 20 Jan 2021 at 14:21, James Coleman <jtc331@gmail.com> wrote: > > > > > An alternative approach that occurred to me while typing this reply: a > > > setting in Postgres that would disallow setting session level GUCs > > > (i.e., enforce `SET LOCAL` transaction level usage instead) would > > > remove a large chunk of our need to set server_reset_query_always=1 > > > (and more interestingly it'd highlight when broken code gets pushed). > > > But even with that, I see some value in the proposed setting since > > > there is additional session state beyond GUCs. > > > > With transaction_cleanup=on we could force all SETs to be SET LOCAL. > > > > The point is that if we declare ahead of time that the transaction > > will be reset then we can act differently and more easily for various > > circumstances, for SETs, for Temp tables and others. > > Right, I agree it's independently useful. My "alternative" is a subset > of that functionality and doesn't cover as many cases. So if we go for that option, would we call it? session_state = 'session' (default) | 'local_set' If you use 'local' then you find that all state is transaction only * SET defaults to meaning SET LOCAL * SET SESSION returns ERROR -- Simon Riggs http://www.EnterpriseDB.com/