Thread: Do we need to handle orphaned prepared transactions in the server?
Hello Everyone,
I have been thinking about the orphaned prepared transaction problem in PostgreSQL and pondering on ways for handling it.
A prepared transaction can be left unfinished (neither committed nor rollbacked) if the client has disappeared. It can happen for various reasons including a client crash, or a server crash leading to client's connection getting terminated and never returning back. Another way a prepared transaction can be left unfinished is if a backup is restored that carried the preparation steps, but not the steps closing the transaction.
Needless to mention that this does hamper maintenance work including vacuuming of dead tuples.
First and foremost is to define what an orphaned transaction is. At this stage, I believe any prepared transaction that has been there for more than X time may be considered as an orphan. X may be defined as an integer in seconds (a GUC perhaps). May be there are better ways to define this. Please feel free to chime in.
This leads to a question whether at server level, we need to be better at managing these orphaned prepared transactions. There are obviously other ways of identifying such transactions by simply querying the pg_prepared_xacts and checking transaction start date, which begs the question if there is a real need here to make a change in the server to either terminate these transactions (perhaps something similar to idle_in_transaction_session_timeout) or notify an administrator (less preferred as I believe notifications should be handled by some external tools, not by server).
I see 3 potential solutions for solving this:
(1) Only check for any prepared transactions when server is starting or restarting (may be after a crash)
(2) Have a background process that is checking on an idle timeout of prepared transactions
(3) Do not make any change in the server and let the administrator handle this by a client or an external tool
Option (1) IMHO seems to be the least suitable one as I'd expect that when a server is being started (or restarted) perhaps after a crash, it is done manually and user can see the server startup logs. So it is very likely that user will notice any prepared transactions that were created when the server was previously running and take any necessary actions.
Option (3) is let user manage it on their own, however they wish. This is the simplest and the easiest way as we don't need to do anything here.
Option (2) is probably the best solution IMHO. Though, it does require changes in the server which might not be an undertaking we wish to not pursue for this problem.
So in case we wish to move forward with Option (2), this will require a change in the server. One potential place is in autovacuum by adding a similar change as it was done for idle_in_transaction_session_timeout, but rather than terminating the connection in this case, we simply abort/roll back the transaction. We could have a similar GUC for a prepared transaction timeout. Though in this case, to be able to do that, we obviously need a backend process that can monitor the timer which will add overhead to any existing background process like the autovacuum, or creation of a new background process (which is not such a good idea IMHO) which will add even more overhead.
At this stage, I'm not sure of the scale of changes this will require, however, I wanted to get an understanding and consensus on whether (a) this is something we should work on, and (b) whether an approach to implementing a timeout makes sense.
Please feel free to share your thoughts here.
Regards.
--
I have been thinking about the orphaned prepared transaction problem in PostgreSQL and pondering on ways for handling it.
A prepared transaction can be left unfinished (neither committed nor rollbacked) if the client has disappeared. It can happen for various reasons including a client crash, or a server crash leading to client's connection getting terminated and never returning back. Another way a prepared transaction can be left unfinished is if a backup is restored that carried the preparation steps, but not the steps closing the transaction.
Needless to mention that this does hamper maintenance work including vacuuming of dead tuples.
First and foremost is to define what an orphaned transaction is. At this stage, I believe any prepared transaction that has been there for more than X time may be considered as an orphan. X may be defined as an integer in seconds (a GUC perhaps). May be there are better ways to define this. Please feel free to chime in.
This leads to a question whether at server level, we need to be better at managing these orphaned prepared transactions. There are obviously other ways of identifying such transactions by simply querying the pg_prepared_xacts and checking transaction start date, which begs the question if there is a real need here to make a change in the server to either terminate these transactions (perhaps something similar to idle_in_transaction_session_timeout) or notify an administrator (less preferred as I believe notifications should be handled by some external tools, not by server).
I see 3 potential solutions for solving this:
(1) Only check for any prepared transactions when server is starting or restarting (may be after a crash)
(2) Have a background process that is checking on an idle timeout of prepared transactions
(3) Do not make any change in the server and let the administrator handle this by a client or an external tool
Option (1) IMHO seems to be the least suitable one as I'd expect that when a server is being started (or restarted) perhaps after a crash, it is done manually and user can see the server startup logs. So it is very likely that user will notice any prepared transactions that were created when the server was previously running and take any necessary actions.
Option (3) is let user manage it on their own, however they wish. This is the simplest and the easiest way as we don't need to do anything here.
Option (2) is probably the best solution IMHO. Though, it does require changes in the server which might not be an undertaking we wish to not pursue for this problem.
So in case we wish to move forward with Option (2), this will require a change in the server. One potential place is in autovacuum by adding a similar change as it was done for idle_in_transaction_session_timeout, but rather than terminating the connection in this case, we simply abort/roll back the transaction. We could have a similar GUC for a prepared transaction timeout. Though in this case, to be able to do that, we obviously need a backend process that can monitor the timer which will add overhead to any existing background process like the autovacuum, or creation of a new background process (which is not such a good idea IMHO) which will add even more overhead.
At this stage, I'm not sure of the scale of changes this will require, however, I wanted to get an understanding and consensus on whether (a) this is something we should work on, and (b) whether an approach to implementing a timeout makes sense.
Please feel free to share your thoughts here.
Regards.
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
> First and foremost is to define what an orphaned transaction is. At > this stage, I believe any prepared transaction that has been there > for more than X time may be considered as an orphan. X may be defined > as an integer in seconds (a GUC perhaps). May be there are better > ways to define this. Please feel free to chime in. What about specifying a timeout when starting the prepared transaction? I can imagine situations where a timeout of hours might be needed/anticipated (e.g. really slow external systems) and situations where the developer knows that the other systems are never slower than a few seconds. Something like: prepare transaction 42 timeout interval '2 days'; or prepare transaction 42 timeout interval '30 second'; Or maybe even with a fixed timestamp instead of an interval? prepare transaction 42 timeout timestamp '2020-01-30 14:00:00'; Thomas
On Wed, 22 Jan 2020 at 09:02, Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > > At this stage, I'm not sure of the scale of changes this will require, however, I wanted to get an understanding and consensuson whether (a) this is something we should work on, and (b) whether an approach to implementing a timeout makessense. > > Please feel free to share your thoughts here. The intended use case of two phase transactions is ensuring atomic durability of transactions across multiple database systems. This necessarily means that there needs to be a failure tolerant agent that ensures there is consensus about the status of the transaction and then executes that consensus across all systems. In other words, there needs to be a transaction manager for prepared statements to actually fulfil their purpose. Therefore I think that unilaterally timing out prepared statements is just shifting the consequences of a broken client from availability to durability. But if durability was never a concern, why is the client even using prepared statements? Citing the documentation: > PREPARE TRANSACTION is not intended for use in applications or interactive sessions. Its purpose is to allow an externaltransaction manager to perform atomic global transactions across multiple databases or other transactional resources.Unless you're writing a transaction manager, you probably shouldn't be using PREPARE TRANSACTION. Regards, Ants Aasma
On Wed, 22 Jan 2020 at 16:45, Ants Aasma <ants@cybertec.at> wrote: > The intended use case of two phase transactions is ensuring atomic > durability of transactions across multiple database systems. Exactly. I was trying to find a good way to say this. It doesn't make much sense to embed a 2PC resolver in Pg unless it's an XA coordinator or similar. And generally it doesn't make sense for the distributed transaction coordinator to reside alongside one of the datasources being managed anyway, especially where failover and HA are in the picture. I *can* see it being useful, albeit rather heavyweight, to implement an XA coordinator on top of PostgreSQL. Mostly for HA and replication reasons. But generally you'd use postgres instances for the HA coordinator and the DB(s) in which 2PC txns are being managed. While you could run them in the same instance it'd probably mostly be for toy-scale PoC/demo/testing use. So I don't really see the point of doing anything with 2PC xacts within Pg proper. It's the job of the app that prepares the 2PC xacts, and if that app is unable to resolve them for some reason there's no generally-correct action to take without administrator action.
Craig Ringer <craig@2ndquadrant.com> writes: > So I don't really see the point of doing anything with 2PC xacts > within Pg proper. It's the job of the app that prepares the 2PC xacts, > and if that app is unable to resolve them for some reason there's no > generally-correct action to take without administrator action. Right. It's the XA transaction manager's job not to forget uncommitted transactions. Reasoning as though no TM exists is not only not very relevant, but it might lead you to put in features that actually make the TM's job harder. In particular, a timeout (or any other mechanism that leads PG to abort or commit a prepared transaction of its own accord) does that. Or another way to put it: the fundamental premise of a prepared transaction is that it will be possible to commit it on-demand with extremely low chance of failure. Designing in a reason why we'd fail to be able to do that would be an anti-feature. regards, tom lane
Tom Lane schrieb am 22.01.2020 um 16:05: > Craig Ringer <craig@2ndquadrant.com> writes: >> So I don't really see the point of doing anything with 2PC xacts >> within Pg proper. It's the job of the app that prepares the 2PC xacts, >> and if that app is unable to resolve them for some reason there's no >> generally-correct action to take without administrator action. > > Right. It's the XA transaction manager's job not to forget uncommitted > transactions. Reasoning as though no TM exists is not only not very > relevant, but it might lead you to put in features that actually > make the TM's job harder. In particular, a timeout (or any other > mechanism that leads PG to abort or commit a prepared transaction > of its own accord) does that. > > Or another way to put it: the fundamental premise of a prepared > transaction is that it will be possible to commit it on-demand with > extremely low chance of failure. Designing in a reason why we'd > fail to be able to do that would be an anti-feature. That's a fair point, but the reality is that not all XA transaction managers do a good job with that. Having somthing on the database side that can handle that in exceptional cases would be very welcome. (In Oracle you can't sometimes even run DML on tables where you have orphaned XA transactions - which is extremely annoying, because by default only the DBA can clean that up) Thomas
Thomas Kellerer <shammat@gmx.net> writes: > Tom Lane schrieb am 22.01.2020 um 16:05: >> Right. It's the XA transaction manager's job not to forget uncommitted >> transactions. Reasoning as though no TM exists is not only not very >> relevant, but it might lead you to put in features that actually >> make the TM's job harder. In particular, a timeout (or any other >> mechanism that leads PG to abort or commit a prepared transaction >> of its own accord) does that. > That's a fair point, but the reality is that not all XA transaction managers > do a good job with that. If you've got a crappy XA manager, you should get a better one, not ask us to put in features that make PG unsafe to use with well-designed XA managers. regards, tom lane
On Wed, Jan 22, 2020 at 10:22:21AM -0500, Tom Lane wrote: > Thomas Kellerer <shammat@gmx.net> writes: > > Tom Lane schrieb am 22.01.2020 um 16:05: > >> Right. It's the XA transaction manager's job not to forget uncommitted > >> transactions. Reasoning as though no TM exists is not only not very > >> relevant, but it might lead you to put in features that actually > >> make the TM's job harder. In particular, a timeout (or any other > >> mechanism that leads PG to abort or commit a prepared transaction > >> of its own accord) does that. > > > That's a fair point, but the reality is that not all XA transaction managers > > do a good job with that. > > If you've got a crappy XA manager, you should get a better one, not > ask us to put in features that make PG unsafe to use with well-designed > XA managers. I think the big question is whether we want to make active prepared transactions more visible to administrators, either during server start or idle duration. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > I think the big question is whether we want to make active prepared > transactions more visible to administrators, either during server start > or idle duration. There's already the pg_prepared_xacts view ... regards, tom lane
On Wed, 22 Jan 2020 at 23:22, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thomas Kellerer <shammat@gmx.net> writes: > > Tom Lane schrieb am 22.01.2020 um 16:05: > >> Right. It's the XA transaction manager's job not to forget uncommitted > >> transactions. Reasoning as though no TM exists is not only not very > >> relevant, but it might lead you to put in features that actually > >> make the TM's job harder. In particular, a timeout (or any other > >> mechanism that leads PG to abort or commit a prepared transaction > >> of its own accord) does that. > > > That's a fair point, but the reality is that not all XA transaction managers > > do a good job with that. > > If you've got a crappy XA manager, you should get a better one, not > ask us to put in features that make PG unsafe to use with well-designed > XA managers. Agreed. Or use some bespoke script that does the cleanup that you think is appropriate for your particular environment and set of bugs. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Thu, 23 Jan 2020 at 01:20, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bruce Momjian <bruce@momjian.us> writes: > > I think the big question is whether we want to make active prepared > > transactions more visible to administrators, either during server start > > or idle duration. > > There's already the pg_prepared_xacts view ... I think Bruce has a point here. We shouldn't go around "resolving" prepared xacts, but the visibility of them is a problem for users. I've seen that myself quite enough times, even now that they cannot be used by default. Our monitoring and admin views are not keeping up with Pg's complexity. Resource retention is one area where that's becoming a usability and admin challenge. If a user has growing bloat (and have managed to figure that out, since we don't make it easy to do that either) or unexpected WAL retention they may find it hard to quickly work out why. We could definitely improve on that by exposing a view that integrates everything that holds down xmin and catalog_xmin. It'd show * the datfrozenxid and datminmxid for the oldest database * if that database is the current database, the relation(s) with the oldest relfrozenxid and relminmxd * ... and the catalog relation(s) with the oldest relfrozenxid and relminmxid if greater * the absolute xid and xid-age positions of entries in pg_replication_slots * pg_stat_replication connections (joined to pg_stat_replication if connected) with their feedback xmin * pg_stat_activity backend_xid and backend_xmin for the backend(s) with oldest values; this may be different sets of backends * pg_prepared_xacts entries by oldest xid ... probably sorted by xid age. It'd be good to expose some internal state too, which would usually correspond to the oldest values found in the above, but is useful for cross-checking: * RecentGlobalXmin and RecentGlobalDataXmin to show the xmin and catalog_xmin actually used * procArray->replication_slot_xmin and procArray->replication_slot_catalog_xmin I'm not sure whether WAL retention (lsn tracking) should be in the same view or a different one, but I lean toward different. I already have another TODO kicking around for me to write a view that generates a blocking locks graph, since pg_locks is really more of a building block than a directly useful view for admins to understand the system's state. And if that's not enough I also want to write a decent bloat-checking view to include in the system views, since IMO lock-blocking, bloat, and resource retention are real monitoring pain points right now. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Thu, Jan 23, 2020 at 12:56:41PM +0800, Craig Ringer wrote: > We could definitely improve on that by exposing a view that integrates > everything that holds down xmin and catalog_xmin. It'd show > > * the datfrozenxid and datminmxid for the oldest database > * if that database is the current database, the relation(s) with the > oldest relfrozenxid and relminmxd > * ... and the catalog relation(s) with the oldest relfrozenxid and > relminmxid if greater > * the absolute xid and xid-age positions of entries in pg_replication_slots > * pg_stat_replication connections (joined to pg_stat_replication if > connected) with their feedback xmin > * pg_stat_activity backend_xid and backend_xmin for the backend(s) > with oldest values; this may be different sets of backends > * pg_prepared_xacts entries by oldest xid It seems to me that what you are describing here is a set of properties good for a monitoring tool that we don't necessarily need to maintain in core. There are already tools able to do that in ways I think are better than what we could ever design, like check_pgactivity and such. And there are years of experience behind that from the maintainers of such scripts and/or extensions. The argument about Postgres getting more and more complex is true as the code grows, but I am not really convinced that we need to make it grow more with more layers that we think are good, because we may finish by piling up stuff which are not actually that good in the long term. I'd rather just focus in the core code on the basics with views that map directly to what we have in memory and/or disk. -- Michael
Attachment
On Thu, 23 Jan 2020 at 15:04, Michael Paquier <michael@paquier.xyz> wrote: > It seems to me that what you are describing here is a set of > properties good for a monitoring tool that we don't necessarily need > to maintain in core. There are already tools able to do that in ways > I think are better than what we could ever design, like > check_pgactivity and such. I really have to disagree here. Relying on external tools gives users who already have to piece together a lot of fragments even more moving parts to keep track of. It introduces more places where new server releases may not be supported in a timely manner by various tools users rely on. More places where users may get wrong or incomplete information from outdated or incorrect tools. I cite the monstrosity that "check_postgres.pl" has become as a specific example of why pushing our complexity onto external tools is not always the right answer. We already have a number of views that prettify information to help administrators operate the server. You could argue that pg_stat_activity and pg_stat_replication are unnecessary for example; users should use external tools to query pg_stat_get_activity(), pg_stat_get_wal_senders(), pg_authid and pg_database directly to get the information they need. Similarly, we could do away with pg_stat_user_indexes and the like, as they're just convenience views over lower level information exposed by the server. But can you really imagine using postgres day to day without pg_stat_activity? It is my firm opinion that visibility into locking behaviour and lock waits is of a similar level of importance. So is giving users some way to get insight into table and index bloat on our MVCC database. With the enormous uptake of various forms of replication and HA it's also important that users also be able to see what's affecting resource retention - holding down vacuum, retaining WAL, etc. The server knows more than any tools. Views in the server can also be maintained along with the server to address changes in how it manages things like resource retention, so external tools get a more consistent insight into server behaviour. > I'd rather just focus in the core code on the basics with views > that map directly to what we have in memory and/or disk. Per above, I just can't agree with this. PostgreSQL is a system with end users who need to interact with it, most of whom will not know how its innards work. If we're going to position it even more as a component in some larger stack such that it's not expected to really be used standalone, then we should make some effort to guide users toward the other components they will need *in our own documentation* and ensure they're tested and maintained. Proposals to do that with HA and failover tooling, backup tooling etc have never got off the ground. I think we do users a great disservice there personally. I don't expect any proposal to bless specific monitoring tools to be any more successful. More importantly, I fail to see why every monitoring tool should reinvent the same information collection queries and views, each with their own unique bugs and quirks, when we can provide information users need directly from the server. In any case I guess it's all hot air unless I pony up a patch to show how I think it should work. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree that the server core isn't really the ideal place to handle the orphan prepared transactions.
Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failing of an external software, and we did a similar change through introduction of idle in transaction timeout behavior. That said, implementing something similar for this feature is too much of an overhead both in terms of code complexity and resources utilisation (if the feature is implemented).
I'm currently working on other options to tackle this problem.
On Tue, 28 Jan 2020 at 9:04 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On Thu, 23 Jan 2020 at 15:04, Michael Paquier <michael@paquier.xyz> wrote:
> It seems to me that what you are describing here is a set of
> properties good for a monitoring tool that we don't necessarily need
> to maintain in core. There are already tools able to do that in ways
> I think are better than what we could ever design, like
> check_pgactivity and such.
I really have to disagree here.
Relying on external tools gives users who already have to piece
together a lot of fragments even more moving parts to keep track of.
It introduces more places where new server releases may not be
supported in a timely manner by various tools users rely on. More
places where users may get wrong or incomplete information from
outdated or incorrect tools. I cite the monstrosity that
"check_postgres.pl" has become as a specific example of why pushing
our complexity onto external tools is not always the right answer.
We already have a number of views that prettify information to help
administrators operate the server. You could argue that
pg_stat_activity and pg_stat_replication are unnecessary for example;
users should use external tools to query pg_stat_get_activity(),
pg_stat_get_wal_senders(), pg_authid and pg_database directly to get
the information they need. Similarly, we could do away with
pg_stat_user_indexes and the like, as they're just convenience views
over lower level information exposed by the server.
But can you really imagine using postgres day to day without pg_stat_activity?
It is my firm opinion that visibility into locking behaviour and lock
waits is of a similar level of importance. So is giving users some way
to get insight into table and index bloat on our MVCC database. With
the enormous uptake of various forms of replication and HA it's also
important that users also be able to see what's affecting resource
retention - holding down vacuum, retaining WAL, etc.
The server knows more than any tools. Views in the server can also be
maintained along with the server to address changes in how it manages
things like resource retention, so external tools get a more
consistent insight into server behaviour.
> I'd rather just focus in the core code on the basics with views
> that map directly to what we have in memory and/or disk.
Per above, I just can't agree with this. PostgreSQL is a system with
end users who need to interact with it, most of whom will not know how
its innards work. If we're going to position it even more as a
component in some larger stack such that it's not expected to really
be used standalone, then we should make some effort to guide users
toward the other components they will need *in our own documentation*
and ensure they're tested and maintained.
Proposals to do that with HA and failover tooling, backup tooling etc
have never got off the ground. I think we do users a great disservice
there personally. I don't expect any proposal to bless specific
monitoring tools to be any more successful.
More importantly, I fail to see why every monitoring tool should
reinvent the same information collection queries and views, each with
their own unique bugs and quirks, when we can provide information
users need directly from the server.
In any case I guess it's all hot air unless I pony up a patch to show
how I think it should work.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > > So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree thatthe server core isn't really the ideal place to handle the orphan prepared transactions. > > Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failingof an external software, and we did a similar change through introduction of idle in transaction timeout behavior. The difference, IMO, is that idle-in-transaction aborts don't affect anything we've promised to be durable. Once you PREPARE TRANSACTION the DB has made a promise that that txn is durable. We don't have any consistent feedback channel to back to applications and say "Hey, if you're not going to finish this up we need to get rid of it soon, ok?". If a distributed transaction manager gets consensus for commit and goes to COMMIT PREPARED a previously prepared txn only to find that it has vanished, that's a major problem, and one that may bring the entire DTM to a halt until the admin can intervene. This isn't like idle-in-transaction aborts. It's closer to something like uncommitting a previously committed transaction. I do think it'd make sense to ensure that the documentation clearly highlights the impact of abandoned prepared xacts on server resource retention and performance, preferably with pointers to appropriate views. I haven't reviewed the docs to see how clear that is already. I can also see an argument for a periodic log message (maybe from vacuum?) warning when old prepared xacts hold xmin down. Including one sent to the client application when an explicit VACUUM is executed. (In fact, it'd make sense to generalise that for all xmin-retention). But I'm really not a fan of aborting such txns. If you operate with some kind of broken global transaction manager that can forget or abandon prepared xacts, then fix it, or adopt site-local periodic cleanup tasks that understand your site's needs. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Thu, Jan 30, 2020 at 8:28 AM Craig Ringer <craig@2ndquadrant.com> wrote:
On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
>
> So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree that the server core isn't really the ideal place to handle the orphan prepared transactions.
>
> Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failing of an external software, and we did a similar change through introduction of idle in transaction timeout behavior.
The difference, IMO, is that idle-in-transaction aborts don't affect
anything we've promised to be durable.
Once you PREPARE TRANSACTION the DB has made a promise that that txn
is durable. We don't have any consistent feedback channel to back to
applications and say "Hey, if you're not going to finish this up we
need to get rid of it soon, ok?". If a distributed transaction manager
gets consensus for commit and goes to COMMIT PREPARED a previously
prepared txn only to find that it has vanished, that's a major
problem, and one that may bring the entire DTM to a halt until the
admin can intervene.
This isn't like idle-in-transaction aborts. It's closer to something
like uncommitting a previously committed transaction.
I do think it'd make sense to ensure that the documentation clearly
highlights the impact of abandoned prepared xacts on server resource
retention and performance, preferably with pointers to appropriate
views. I haven't reviewed the docs to see how clear that is already.
Having seen the documentation, IMHO the document does contain enough
information for users to understand what issues can be caused by these
orphaned prepared transactions.
I can also see an argument for a periodic log message (maybe from
vacuum?) warning when old prepared xacts hold xmin down. Including one
sent to the client application when an explicit VACUUM is executed.
(In fact, it'd make sense to generalise that for all xmin-retention).
I think that opens up the debate on what we really mean by "old" and
whether that requires a syntax change when creating a prepared
transactions as Thomas Kellerer suggested earlier?
I agree that vacuum should periodically throw warnings for any prepared
xacts that are holding xmin down.
Generalising it for all xmin-retention is a fair idea IMHO, though that
does increase the overall scope here. A vacuum process should (ideally)
periodically throw out warnings for anything that is preventing it (including
orphaned prepared transactions) from doing its routine work so that
somebody can take necessary actions.
But I'm really not a fan of aborting such txns. If you operate with
some kind of broken global transaction manager that can forget or
abandon prepared xacts, then fix it, or adopt site-local periodic
cleanup tasks that understand your site's needs.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
All,
If the GUCs are enabled (set to a value higher than -1), an
autovacuum worker running in the background checks if the
timeout has expired. If so, it checks if there are any orphaned
prepared transactions (i.e. their age has exceeded
max_age_prepared_xacts). If it finds any, it throws a warning for
every such transaction. It also emits the total number of orphaned
prepared transactions if one or more are found.
When a vacuum command is issued from within a client, say psql,
in that case, we skip the vacuum timeout check and simply scan
for any orphaned prepared transactions. Warnings are emitted to
the client and log file if any are found.
- About the New GUCs:
= max_age_prepared_xacts:
Sets maximum age after which a prepared transaction is considered an
orphan. It applies when "prepared transactions" are enabled. The
age for a transaction is calculated from the time it was created to
the current time. If this value is specified without units, it is taken
as milliseconds. The default value is -1 which allows prepared
transactions to live forever.
= prepared_xacts_vacuum_warn_timeout:
Sets timeout after which vacuum starts throwing warnings for every
prepared transactions that has exceeded maximum age defined by
"max_age_prepared_xacts". If this value is specified without units,
it is taken as milliseconds. The default value of -1 will disable
this warning mechanism. Setting a too value could potentially fill
up log with orphaned prepared transaction warnings, so this
parameter must be set to a value that is reasonably large to not
fill up log file, but small enough to notify of long running and
potential orphaned prepared transactions. There is no additional
timer or worker introduced with this change. Whenever a vacuum
worker runs, it first checks for any orphaned prepared transactions.
So at best, this GUC serves as a guideline for a vacuum worker
if a warning should be thrown to log file or a client issuing
vacuum command.
- What this Patch Does Not Cover:
The warning is not thrown when user either runs vacuumdb or passes
individual relations to be vacuum. Specifically in case of vacuumdb,
it breaks down a vacuum command to an attribute-wise vacuum command.
So the vacuum command is indirectly run many times. Considering that
we want to emit warnings for every manual vacuum command, this simply
floods the terminal and log with orphaned prepared transactions
warnings. We could potentially handle that, but the overhead of
that seemed too much to me (and I've not invested any time looking
to fix that either). Hence, warnings are not thrown when user runs
vacuumdb and relation specific vacuum.
Attached is version 1 of POC patch for notifying of orphaned
prepared transactions via warnings emitted to a client
application and/or log file. It applies to PostgreSQL branch
"master" on top of "e2e02191" commit.
I've tried to keep the patch as less invasive as I could with
minimal impact on vacuum processes, so the performance impact
and the changes are minimal in that area of PostgreSQL core.
- What's in this Patch:
This patch throws warnings when an autovacuum worker encounters
an orphaned prepared transaction. It also throws warnings to a
client when a vacuum command is issued. This patch also
introduces two new GUCs:
(1) max_age_prepared_xacts
- The age after creation of a prepared transaction after which it
will be considered an orphan.
(2) prepared_xacts_vacuum_warn_timeout
- The timeout period for an autovacuum (essentially any of its
worker) to check for orphaned prepared transactions and throw
warnings if any are found.
- What This Patch Does:
prepared transactions via warnings emitted to a client
application and/or log file. It applies to PostgreSQL branch
"master" on top of "e2e02191" commit.
I've tried to keep the patch as less invasive as I could with
minimal impact on vacuum processes, so the performance impact
and the changes are minimal in that area of PostgreSQL core.
- What's in this Patch:
This patch throws warnings when an autovacuum worker encounters
an orphaned prepared transaction. It also throws warnings to a
client when a vacuum command is issued. This patch also
introduces two new GUCs:
(1) max_age_prepared_xacts
- The age after creation of a prepared transaction after which it
will be considered an orphan.
(2) prepared_xacts_vacuum_warn_timeout
- The timeout period for an autovacuum (essentially any of its
worker) to check for orphaned prepared transactions and throw
warnings if any are found.
- What This Patch Does:
If the GUCs are enabled (set to a value higher than -1), an
autovacuum worker running in the background checks if the
timeout has expired. If so, it checks if there are any orphaned
prepared transactions (i.e. their age has exceeded
max_age_prepared_xacts). If it finds any, it throws a warning for
every such transaction. It also emits the total number of orphaned
prepared transactions if one or more are found.
When a vacuum command is issued from within a client, say psql,
in that case, we skip the vacuum timeout check and simply scan
for any orphaned prepared transactions. Warnings are emitted to
the client and log file if any are found.
- About the New GUCs:
= max_age_prepared_xacts:
Sets maximum age after which a prepared transaction is considered an
orphan. It applies when "prepared transactions" are enabled. The
age for a transaction is calculated from the time it was created to
the current time. If this value is specified without units, it is taken
as milliseconds. The default value is -1 which allows prepared
transactions to live forever.
= prepared_xacts_vacuum_warn_timeout:
Sets timeout after which vacuum starts throwing warnings for every
prepared transactions that has exceeded maximum age defined by
"max_age_prepared_xacts". If this value is specified without units,
it is taken as milliseconds. The default value of -1 will disable
this warning mechanism. Setting a too value could potentially fill
up log with orphaned prepared transaction warnings, so this
parameter must be set to a value that is reasonably large to not
fill up log file, but small enough to notify of long running and
potential orphaned prepared transactions. There is no additional
timer or worker introduced with this change. Whenever a vacuum
worker runs, it first checks for any orphaned prepared transactions.
So at best, this GUC serves as a guideline for a vacuum worker
if a warning should be thrown to log file or a client issuing
vacuum command.
- What this Patch Does Not Cover:
The warning is not thrown when user either runs vacuumdb or passes
individual relations to be vacuum. Specifically in case of vacuumdb,
it breaks down a vacuum command to an attribute-wise vacuum command.
So the vacuum command is indirectly run many times. Considering that
we want to emit warnings for every manual vacuum command, this simply
floods the terminal and log with orphaned prepared transactions
warnings. We could potentially handle that, but the overhead of
that seemed too much to me (and I've not invested any time looking
to fix that either). Hence, warnings are not thrown when user runs
vacuumdb and relation specific vacuum.
On Fri, Jan 31, 2020 at 7:02 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
On Thu, Jan 30, 2020 at 8:28 AM Craig Ringer <craig@2ndquadrant.com> wrote:On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
>
> So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree that the server core isn't really the ideal place to handle the orphan prepared transactions.
>
> Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failing of an external software, and we did a similar change through introduction of idle in transaction timeout behavior.
The difference, IMO, is that idle-in-transaction aborts don't affect
anything we've promised to be durable.
Once you PREPARE TRANSACTION the DB has made a promise that that txn
is durable. We don't have any consistent feedback channel to back to
applications and say "Hey, if you're not going to finish this up we
need to get rid of it soon, ok?". If a distributed transaction manager
gets consensus for commit and goes to COMMIT PREPARED a previously
prepared txn only to find that it has vanished, that's a major
problem, and one that may bring the entire DTM to a halt until the
admin can intervene.
This isn't like idle-in-transaction aborts. It's closer to something
like uncommitting a previously committed transaction.
I do think it'd make sense to ensure that the documentation clearly
highlights the impact of abandoned prepared xacts on server resource
retention and performance, preferably with pointers to appropriate
views. I haven't reviewed the docs to see how clear that is already.Having seen the documentation, IMHO the document does contain enoughinformation for users to understand what issues can be caused by theseorphaned prepared transactions.
I can also see an argument for a periodic log message (maybe from
vacuum?) warning when old prepared xacts hold xmin down. Including one
sent to the client application when an explicit VACUUM is executed.
(In fact, it'd make sense to generalise that for all xmin-retention).I think that opens up the debate on what we really mean by "old" andwhether that requires a syntax change when creating a preparedtransactions as Thomas Kellerer suggested earlier?I agree that vacuum should periodically throw warnings for any preparedxacts that are holding xmin down.Generalising it for all xmin-retention is a fair idea IMHO, though thatdoes increase the overall scope here. A vacuum process should (ideally)periodically throw out warnings for anything that is preventing it (includingorphaned prepared transactions) from doing its routine work so thatsomebody can take necessary actions.
But I'm really not a fan of aborting such txns. If you operate with
some kind of broken global transaction manager that can forget or
abandon prepared xacts, then fix it, or adopt site-local periodic
cleanup tasks that understand your site's needs.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise--Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.caSKYPE: engineeredvirus
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
Attachment
Here is the v2 of the same patch after rebasing it and running it through pgindent. There are no other code changes.
On Wed, Feb 19, 2020 at 8:04 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
All,Attached is version 1 of POC patch for notifying of orphaned
prepared transactions via warnings emitted to a client
application and/or log file. It applies to PostgreSQL branch
"master" on top of "e2e02191" commit.
I've tried to keep the patch as less invasive as I could with
minimal impact on vacuum processes, so the performance impact
and the changes are minimal in that area of PostgreSQL core.
- What's in this Patch:
This patch throws warnings when an autovacuum worker encounters
an orphaned prepared transaction. It also throws warnings to a
client when a vacuum command is issued. This patch also
introduces two new GUCs:
(1) max_age_prepared_xacts
- The age after creation of a prepared transaction after which it
will be considered an orphan.
(2) prepared_xacts_vacuum_warn_timeout
- The timeout period for an autovacuum (essentially any of its
worker) to check for orphaned prepared transactions and throw
warnings if any are found.
- What This Patch Does:
If the GUCs are enabled (set to a value higher than -1), an
autovacuum worker running in the background checks if the
timeout has expired. If so, it checks if there are any orphaned
prepared transactions (i.e. their age has exceeded
max_age_prepared_xacts). If it finds any, it throws a warning for
every such transaction. It also emits the total number of orphaned
prepared transactions if one or more are found.
When a vacuum command is issued from within a client, say psql,
in that case, we skip the vacuum timeout check and simply scan
for any orphaned prepared transactions. Warnings are emitted to
the client and log file if any are found.
- About the New GUCs:
= max_age_prepared_xacts:
Sets maximum age after which a prepared transaction is considered an
orphan. It applies when "prepared transactions" are enabled. The
age for a transaction is calculated from the time it was created to
the current time. If this value is specified without units, it is taken
as milliseconds. The default value is -1 which allows prepared
transactions to live forever.
= prepared_xacts_vacuum_warn_timeout:
Sets timeout after which vacuum starts throwing warnings for every
prepared transactions that has exceeded maximum age defined by
"max_age_prepared_xacts". If this value is specified without units,
it is taken as milliseconds. The default value of -1 will disable
this warning mechanism. Setting a too value could potentially fill
up log with orphaned prepared transaction warnings, so this
parameter must be set to a value that is reasonably large to not
fill up log file, but small enough to notify of long running and
potential orphaned prepared transactions. There is no additional
timer or worker introduced with this change. Whenever a vacuum
worker runs, it first checks for any orphaned prepared transactions.
So at best, this GUC serves as a guideline for a vacuum worker
if a warning should be thrown to log file or a client issuing
vacuum command.
- What this Patch Does Not Cover:
The warning is not thrown when user either runs vacuumdb or passes
individual relations to be vacuum. Specifically in case of vacuumdb,
it breaks down a vacuum command to an attribute-wise vacuum command.
So the vacuum command is indirectly run many times. Considering that
we want to emit warnings for every manual vacuum command, this simply
floods the terminal and log with orphaned prepared transactions
warnings. We could potentially handle that, but the overhead of
that seemed too much to me (and I've not invested any time looking
to fix that either). Hence, warnings are not thrown when user runs
vacuumdb and relation specific vacuum.On Fri, Jan 31, 2020 at 7:02 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:On Thu, Jan 30, 2020 at 8:28 AM Craig Ringer <craig@2ndquadrant.com> wrote:On Thu, 30 Jan 2020 at 02:04, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
>
> So having seen the feedback on this thread, and I tend to agree with most of what has been said here, I also agree that the server core isn't really the ideal place to handle the orphan prepared transactions.
>
> Ideally, these must be handled by a transaction manager, however, I do believe that we cannot let database suffer for failing of an external software, and we did a similar change through introduction of idle in transaction timeout behavior.
The difference, IMO, is that idle-in-transaction aborts don't affect
anything we've promised to be durable.
Once you PREPARE TRANSACTION the DB has made a promise that that txn
is durable. We don't have any consistent feedback channel to back to
applications and say "Hey, if you're not going to finish this up we
need to get rid of it soon, ok?". If a distributed transaction manager
gets consensus for commit and goes to COMMIT PREPARED a previously
prepared txn only to find that it has vanished, that's a major
problem, and one that may bring the entire DTM to a halt until the
admin can intervene.
This isn't like idle-in-transaction aborts. It's closer to something
like uncommitting a previously committed transaction.
I do think it'd make sense to ensure that the documentation clearly
highlights the impact of abandoned prepared xacts on server resource
retention and performance, preferably with pointers to appropriate
views. I haven't reviewed the docs to see how clear that is already.Having seen the documentation, IMHO the document does contain enoughinformation for users to understand what issues can be caused by theseorphaned prepared transactions.
I can also see an argument for a periodic log message (maybe from
vacuum?) warning when old prepared xacts hold xmin down. Including one
sent to the client application when an explicit VACUUM is executed.
(In fact, it'd make sense to generalise that for all xmin-retention).I think that opens up the debate on what we really mean by "old" andwhether that requires a syntax change when creating a preparedtransactions as Thomas Kellerer suggested earlier?I agree that vacuum should periodically throw warnings for any preparedxacts that are holding xmin down.Generalising it for all xmin-retention is a fair idea IMHO, though thatdoes increase the overall scope here. A vacuum process should (ideally)periodically throw out warnings for anything that is preventing it (includingorphaned prepared transactions) from doing its routine work so thatsomebody can take necessary actions.
But I'm really not a fan of aborting such txns. If you operate with
some kind of broken global transaction manager that can forget or
abandon prepared xacts, then fix it, or adopt site-local periodic
cleanup tasks that understand your site's needs.
--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise--Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.caSKYPE: engineeredvirus--Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.caSKYPE: engineeredvirus
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
Attachment
On Mon, Mar 2, 2020 at 05:42:11PM +0500, Hamid Akhtar wrote: > Here is the v2 of the same patch after rebasing it and running it through > pgindent. There are no other code changes. The paragraph about max_age_prepared_xacts doesn't define what is the effect of treating a transaction as orphaned. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Mon, 2 Mar 2020 at 21:42, Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > > Here is the v2 of the same patch after rebasing it and running it through pgindent. There are no other code changes. > Thank you for working on this. I think what this patch tries to achieve would be helpful to inform orphaned prepared transactions that can be cause of inefficient vacuum to users. As far as I read the patch, the setting this feature using newly added parameters seems to be complicated to me. IIUC, even if a prepared transactions is enough older than max_age_prepared_xacts, we don't warn if it doesn't elapsed prepared_xacts_vacuum_warn_timeout since when the "first" prepared transaction is created. And the first prepared transaction means that the first entry for TwoPhaseStateData->prepXacts. Therefore, if there is always more than one prepared transaction, we don't warn for at least prepared_xacts_vacuum_warn_timeout seconds even if the first added prepared transaction is already removed. So I'm not sure how we can think the setting value of prepared_xacts_vacuum_warn_timeout. Regarding the warning message, I wonder if the current message is too detailed. If we want to inform that there is orphaned prepared transactions to users, it seems to me that it's enough to report the existence (and possibly the number of orphaned prepared transactions), rather than individual details. Given that the above things, we can simply think this feature; we can have only max_age_prepared_xacts, and autovacuum checks the minimum of prepared_at of prepared transactions, and compare it to max_age_prepared_xacts. We can warn if (CurrentTimestamp - min(prepared_at)) > max_age_prepared_xacts. In addition, if we also want to control this behavior by the age of xid, we can have another GUC parameter for comparing the age(min(xid of prepared transactions)) to that value. Finally, regarding the name of parameters, when we mention the age of transaction it means the age of xid of the transaction, not the time. Please refer to other GUC parameter having "age" in its name such as autovacuum_freeze_max_age, vacuum_freeze_min_age. The patch adds max_age_prepared_xacts but I think it should be renamed. For example, prepared_xact_warn_min_duration is for time and prepared_xact_warn_max_age for age. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you so much Bruce and Sawada.
Bruce:
I'll update the documentation with more details for max_age_prepared_xacts
Sawada:
You have raised 4 very valid points. Here are my thoughts.
(1)
I think your concern is that if we can reduce the need of 2 GUCs to one.
The purpose of having 2 different GUCs was serving two different purposes.
- max_age_prepared_xacts; this defines the maximum age of a prepared transaction after which it may be considered an orphan.
- prepared_xacts_vacuum_warn_timeout; since we are throwing warnings in the log, we need a way of controlling the behaviour to prevent from flooding the log file with our messages. This timeout defines that. May be there is a better way of handling this, but may be not.
(2)
Your point is that when there are more than one prepared transactions (and even if the first one is removed), timeout prepared_xacts_vacuum_warn_timeout isn't always accurate.
Yes, I agree. However, for us to hit the exact timeout for each prepared transaction, we need setup timers and callback functions. That's too much of an overhead IMHO. The alternative design that I took (the current design) is based on the assumption that we don't need a precise timeout for these transactions or for vacuum to report these issues to log. So, a decent enough way of setting a timeout should be good enough considering that it doesn't add any real overhead to the vacuum process. This does mean that an orphaned prepared transaction may be notified after prepared_xacts_vacuum_warn_timeout * 2. This, IMHO should be acceptable behavior.
(3)
Message is too detailed.
Yes, I agree. I'll review this an update the patch.
(4)
GUCs should be renamed.
Yes, I agree. The names you have suggested make more sense. I'll send an updated version of the patch with the new names and other suggested changes.
Bruce:
I'll update the documentation with more details for max_age_prepared_xacts
Sawada:
You have raised 4 very valid points. Here are my thoughts.
(1)
I think your concern is that if we can reduce the need of 2 GUCs to one.
The purpose of having 2 different GUCs was serving two different purposes.
- max_age_prepared_xacts; this defines the maximum age of a prepared transaction after which it may be considered an orphan.
- prepared_xacts_vacuum_warn_timeout; since we are throwing warnings in the log, we need a way of controlling the behaviour to prevent from flooding the log file with our messages. This timeout defines that. May be there is a better way of handling this, but may be not.
(2)
Your point is that when there are more than one prepared transactions (and even if the first one is removed), timeout prepared_xacts_vacuum_warn_timeout isn't always accurate.
Yes, I agree. However, for us to hit the exact timeout for each prepared transaction, we need setup timers and callback functions. That's too much of an overhead IMHO. The alternative design that I took (the current design) is based on the assumption that we don't need a precise timeout for these transactions or for vacuum to report these issues to log. So, a decent enough way of setting a timeout should be good enough considering that it doesn't add any real overhead to the vacuum process. This does mean that an orphaned prepared transaction may be notified after prepared_xacts_vacuum_warn_timeout * 2. This, IMHO should be acceptable behavior.
(3)
Message is too detailed.
Yes, I agree. I'll review this an update the patch.
(4)
GUCs should be renamed.
Yes, I agree. The names you have suggested make more sense. I'll send an updated version of the patch with the new names and other suggested changes.
On Wed, Mar 11, 2020 at 10:43 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote:
On Mon, 2 Mar 2020 at 21:42, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
>
> Here is the v2 of the same patch after rebasing it and running it through pgindent. There are no other code changes.
>
Thank you for working on this. I think what this patch tries to
achieve would be helpful to inform orphaned prepared transactions that
can be cause of inefficient vacuum to users.
As far as I read the patch, the setting this feature using newly added
parameters seems to be complicated to me. IIUC, even if a prepared
transactions is enough older than max_age_prepared_xacts, we don't
warn if it doesn't elapsed prepared_xacts_vacuum_warn_timeout since
when the "first" prepared transaction is created. And the first
prepared transaction means that the first entry for
TwoPhaseStateData->prepXacts. Therefore, if there is always more than
one prepared transaction, we don't warn for at least
prepared_xacts_vacuum_warn_timeout seconds even if the first added
prepared transaction is already removed. So I'm not sure how we can
think the setting value of prepared_xacts_vacuum_warn_timeout.
Regarding the warning message, I wonder if the current message is too
detailed. If we want to inform that there is orphaned prepared
transactions to users, it seems to me that it's enough to report the
existence (and possibly the number of orphaned prepared transactions),
rather than individual details.
Given that the above things, we can simply think this feature; we can
have only max_age_prepared_xacts, and autovacuum checks the minimum of
prepared_at of prepared transactions, and compare it to
max_age_prepared_xacts. We can warn if (CurrentTimestamp -
min(prepared_at)) > max_age_prepared_xacts. In addition, if we also
want to control this behavior by the age of xid, we can have another
GUC parameter for comparing the age(min(xid of prepared transactions))
to that value.
Finally, regarding the name of parameters, when we mention the age of
transaction it means the age of xid of the transaction, not the time.
Please refer to other GUC parameter having "age" in its name such as
autovacuum_freeze_max_age, vacuum_freeze_min_age. The patch adds
max_age_prepared_xacts but I think it should be renamed. For example,
prepared_xact_warn_min_duration is for time and
prepared_xact_warn_max_age for age.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
On Wed, Feb 19, 2020 at 10:05 AM Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > Attached is version 1 of POC patch for notifying of orphaned > prepared transactions via warnings emitted to a client > application and/or log file. It applies to PostgreSQL branch > "master" on top of "e2e02191" commit. I think this is a bad idea and that we should reject the patch. It's true that forgotten prepared transactions are a problem, but it's also true that you can monitor for that yourself using the pg_prepared_xacts view. If you do, you will have a lot more flexibility than this patch gives you, or than any similar patch ever can give you. Generally, people don't pay attention to warnings in logs, so they're just clutter. Moreover, there are tons of different things for which you should probably monitor (wraparound perils, slow checkpoints, bloated tables, etc.) and so the real solution is to run some monitoring software. So even if you do pay attention to your logs, and even if the GUCs this provides you sufficient flexibility for your needs in this one area, you still need to run some monitoring software. At which point, you don't also need this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, 15 Apr 2020 at 03:19, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Feb 19, 2020 at 10:05 AM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
> Attached is version 1 of POC patch for notifying of orphaned
> prepared transactions via warnings emitted to a client
> application and/or log file. It applies to PostgreSQL branch
> "master" on top of "e2e02191" commit.
I think this is a bad idea and that we should reject the patch. It's
true that forgotten prepared transactions are a problem, but it's also
true that you can monitor for that yourself using the
pg_prepared_xacts view. If you do, you will have a lot more
flexibility than this patch gives you, or than any similar patch ever
can give you.
I agree. It's going to cause nothing but problems.
I am generally a fan of improvements that make PostgreSQL easier to use, easier to monitor and understand, harder to break accidentally, etc. But not when those improvements come at the price of correct behaviour for standard, externally specified interfaces.
Nothing allows us to just throw away prepared xacts. Doing so violates the very definition of what a prepared xact is. It's like saying "hey, this table is bloated, lets discard all rows with xmin < foo because we figure the user probably doesn't care about them; though they're visible to some still-running xacts, but those xacts haven't accessed the table.". Um. No. We can't do that.
If you want this, write an extension that does it as a background worker. You can munge the prepared xacts state in any manner you please from there.
I advocated for visibility / monitoring improvements upthread that might help mitigate the operational issues. Because I do agree that there's a problem with users having to watch the logs or query obscure state to understand what the system is doing and why bloat is being created by abandoned prepared xacts.
Just discarding the prepared xacts is not the answer though.
On Thu, 16 Apr 2020 at 13:23, Craig Ringer <craig@2ndquadrant.com> wrote:
Just discarding the prepared xacts is not the answer though.
We wouldn't need to track a fine-grained snapshot with an in-progress list (or inverted in-progress list like historic snapshots) for these. We'd just remember the needed xid range in [xmin,xmax] form. And we could even do the same for live backends' PGXACT - it might not be worth the price there, but if you have workloads that have batch xacts + high churn rate xacts it'd be pretty appealing.
It wouldn't help with xid wraparound concerns, but it could help a lot with bloat caused by old snapshots for some very common workloads.
This patch actually does not discard any prepared transactions and only throws a warning for each orphaned one. So, there is no behaviour change except for getting some warnings in the log or emitting some warning to a client executing a vacuum command.
I hear all the criticism which I don't disagree with. Obviously, scripts and other solutions could provide a lot more flexibility.
Also, I believe most of us agree that vacuum needs to be smarter.
src/backend/commands/vacuum.c does throw warnings for upcoming wraparound issues with one warning in particular mentioning prepared transactions and stale replication slots. So, throwing warnings is not unprecedented. There are 3 warnings in this file which I believe can also be handled by external tools. I'm not debating the merit of these warnings, nor am I trying to justify the addition of new warnings based on these.
My real question is whether vacuum should be preemptively complaining about prepared transactions or stale replication slots rather than waiting for transaction id to exceed the safe limit. I presume by the time safe limit is exceeded, vacuum's work would already have been significantly impacted.
AFAICT, my patch actually doesn't break anything and doesn't add any significant overhead to the vacuum process. It does supplement the current warnings though which might be useful.
I hear all the criticism which I don't disagree with. Obviously, scripts and other solutions could provide a lot more flexibility.
Also, I believe most of us agree that vacuum needs to be smarter.
src/backend/commands/vacuum.c does throw warnings for upcoming wraparound issues with one warning in particular mentioning prepared transactions and stale replication slots. So, throwing warnings is not unprecedented. There are 3 warnings in this file which I believe can also be handled by external tools. I'm not debating the merit of these warnings, nor am I trying to justify the addition of new warnings based on these.
My real question is whether vacuum should be preemptively complaining about prepared transactions or stale replication slots rather than waiting for transaction id to exceed the safe limit. I presume by the time safe limit is exceeded, vacuum's work would already have been significantly impacted.
AFAICT, my patch actually doesn't break anything and doesn't add any significant overhead to the vacuum process. It does supplement the current warnings though which might be useful.
On Thu, Apr 16, 2020 at 10:32 AM Craig Ringer <craig@2ndquadrant.com> wrote:
On Thu, 16 Apr 2020 at 13:23, Craig Ringer <craig@2ndquadrant.com> wrote:Just discarding the prepared xacts is not the answer though.... however, I have wondered a few times about making vacuum smarter about cases where the xmin is held down by prepared xacts or by replication slots. If we could record the oldest *and newest* xid needed by such resource retention markers we could potentially teach vacuum to remove intermediate dead rows. For high-churn workloads like like workqueue applications that could be a really big win.We wouldn't need to track a fine-grained snapshot with an in-progress list (or inverted in-progress list like historic snapshots) for these. We'd just remember the needed xid range in [xmin,xmax] form. And we could even do the same for live backends' PGXACT - it might not be worth the price there, but if you have workloads that have batch xacts + high churn rate xacts it'd be pretty appealing.It wouldn't help with xid wraparound concerns, but it could help a lot with bloat caused by old snapshots for some very common workloads.
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
On Thu, Apr 16, 2020 at 4:43 AM Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > My real question is whether vacuum should be preemptively complaining about prepared transactions or stale replicationslots rather than waiting for transaction id to exceed the safe limit. I presume by the time safe limit is exceeded,vacuum's work would already have been significantly impacted. Yeah, for my part, I agree that letting things go until the point where VACUUM starts to complain is usually bad. Generally, you want to know a lot sooner. That being said, I think the solution to that is to run a monitoring tool, not to overload the autovacuum worker with additional duties. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Apr 16, 2020 at 5:20 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Apr 16, 2020 at 4:43 AM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
> My real question is whether vacuum should be preemptively complaining about prepared transactions or stale replication slots rather than waiting for transaction id to exceed the safe limit. I presume by the time safe limit is exceeded, vacuum's work would already have been significantly impacted.
Yeah, for my part, I agree that letting things go until the point
where VACUUM starts to complain is usually bad. Generally, you want to
know a lot sooner. That being said, I think the solution to that is to
run a monitoring tool, not to overload the autovacuum worker with
additional duties.
So is the concern performance overhead rather than the need for such a feature?
Any server running with prepared transactions enabled, more likely than not, requires a monitoring tool for tracking orphaned prepared transactions. For such environments, surely the overhead created by such a feature implemented in the server will create a lower overhead than their monitoring tool.
Any server running with prepared transactions enabled, more likely than not, requires a monitoring tool for tracking orphaned prepared transactions. For such environments, surely the overhead created by such a feature implemented in the server will create a lower overhead than their monitoring tool.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
On Thu, Apr 16, 2020 at 2:17 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote: > So is the concern performance overhead rather than the need for such a feature? No, I don't think this would have any significant overhead. My concern is that I think it's the wrong way to solve the problem. If you need to check for prepared transactions that got missed, the right way to do that is to use a monitoring tool that runs an appropriate query against the server on a regular basis and alerts based on the output. Such a tool can be used for many things, of which this is just one, and the queries can be customized to the needs of a particular environment, whereas this feature is much less flexible in that way because it is hard-coded into the server. To put that another way, any problem you can solve with this feature, you can also solve without this feature. And you can solve it any released branch, without waiting for a release that would hypothetically contain this patch, and you can solve it in a more flexible way than this patch allows, because you can tailor the query any way you like. The right place for a feature like this in something like check_postgres.pl, not the server. It looks like they added it in 2009: https://bucardo.org/pipermail/check_postgres/2009-April/000349.html -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Apr 16, 2020 at 2:17 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote: >> So is the concern performance overhead rather than the need for such a feature? > No, I don't think this would have any significant overhead. My concern > is that I think it's the wrong way to solve the problem. FWIW, I agree with Robert that this patch is a bad idea. His recommendation is to use an external monitoring tool, which is not a self-contained solution, but this isn't either: you'd need to add an external log-scraping tool to spot the warnings. Even if I liked the core idea, loading the functionality onto VACUUM seems like a fairly horrid design choice. It's quite unrelated to what that command does. In the autovac code path, it's going to lead to multiple autovac workers all complaining simultaneously about the same problem. But having manual vacuums complain about issues unrelated to the task at hand is also a seriously poor bit of UX design. Moreover, that won't do all that much to surface problems, since most(?) installations never run manual vacuums; or if they do, the "manual" runs are really done by a cron job or the like, which is not going to notice the warnings. So you still need a log-scraping tool. If we were going to go down the path of periodically logging warnings about old prepared transactions, some single-instance background task like the checkpointer would be a better place to do the work in. But I'm not really recommending that, because I agree with Robert that we just plain don't want this functionality. regards, tom lane
On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: > Even if I liked the core idea, loading the functionality onto VACUUM seems > like a fairly horrid design choice. It's quite unrelated to what that > command does. In the autovac code path, it's going to lead to multiple > autovac workers all complaining simultaneously about the same problem. > But having manual vacuums complain about issues unrelated to the task at > hand is also a seriously poor bit of UX design. Moreover, that won't do > all that much to surface problems, since most(?) installations never run > manual vacuums; or if they do, the "manual" runs are really done by a cron > job or the like, which is not going to notice the warnings. So you still > need a log-scraping tool. +1. > If we were going to go down the path of periodically logging warnings > about old prepared transactions, some single-instance background task > like the checkpointer would be a better place to do the work in. But > I'm not really recommending that, because I agree with Robert that > we just plain don't want this functionality. I am not sure that the checkpointer is a good place to do that either, joining back with your argument in the first paragraph of this email related to vacuum. One potential approach would be a contrib module that works as a background worker? However, I would think that finding a minimum set of requirements that we think are generic enough for most users would be something hard to draft a list of. If we had a small, minimal contrib/ module in core that people could easily extend for their own needs and that we would intentionally keep as minimal, in the same spirit as say passwordcheck, perhaps.. -- Michael
Attachment
Thank you everyone for the detailed feedback.
On Fri, Apr 17, 2020 at 5:40 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote:
> Even if I liked the core idea, loading the functionality onto VACUUM seems
> like a fairly horrid design choice. It's quite unrelated to what that
> command does. In the autovac code path, it's going to lead to multiple
> autovac workers all complaining simultaneously about the same problem.
> But having manual vacuums complain about issues unrelated to the task at
> hand is also a seriously poor bit of UX design. Moreover, that won't do
> all that much to surface problems, since most(?) installations never run
> manual vacuums; or if they do, the "manual" runs are really done by a cron
> job or the like, which is not going to notice the warnings. So you still
> need a log-scraping tool.
+1.
> If we were going to go down the path of periodically logging warnings
> about old prepared transactions, some single-instance background task
> like the checkpointer would be a better place to do the work in. But
> I'm not really recommending that, because I agree with Robert that
> we just plain don't want this functionality.
I am not sure that the checkpointer is a good place to do that either,
joining back with your argument in the first paragraph of this email
related to vacuum. One potential approach would be a contrib module
that works as a background worker? However, I would think that
finding a minimum set of requirements that we think are generic enough
for most users would be something hard to draft a list of. If we had
a small, minimal contrib/ module in core that people could easily
extend for their own needs and that we would intentionally keep as
minimal, in the same spirit as say passwordcheck, perhaps..
--
Michael
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid.akhtar@highgo.ca
SKYPE: engineeredvirus
On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: > If we were going to go down the path of periodically logging warnings > about old prepared transactions, some single-instance background task > like the checkpointer would be a better place to do the work in. But > I'm not really recommending that, because I agree with Robert that > we just plain don't want this functionality. I thought we would just emit a warning at boot time. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Mon, Apr 20, 2020 at 10:35:15PM -0400, Bruce Momjian wrote: > On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: >> If we were going to go down the path of periodically logging warnings >> about old prepared transactions, some single-instance background task >> like the checkpointer would be a better place to do the work in. But >> I'm not really recommending that, because I agree with Robert that >> we just plain don't want this functionality. > > I thought we would just emit a warning at boot time. That's more tricky than boot time (did you mean postmaster context?), especially if you are starting a cluster from a base backup as you have no guarantee that the 2PC information is consistent by just looking at what's on disk (some of the 2PC files may still be in WAL records to-be-replayed), so a natural candidate to gather the information wanted here would be RecoverPreparedTransactions() for a primary, and StandbyRecoverPreparedTransactions() for a standby. -- Michael
Attachment
On Tue, Apr 21, 2020 at 01:52:46PM +0900, Michael Paquier wrote: > On Mon, Apr 20, 2020 at 10:35:15PM -0400, Bruce Momjian wrote: > > On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: > >> If we were going to go down the path of periodically logging warnings > >> about old prepared transactions, some single-instance background task > >> like the checkpointer would be a better place to do the work in. But > >> I'm not really recommending that, because I agree with Robert that > >> we just plain don't want this functionality. > > > > I thought we would just emit a warning at boot time. > > That's more tricky than boot time (did you mean postmaster context?), > especially if you are starting a cluster from a base backup as you > have no guarantee that the 2PC information is consistent by just > looking at what's on disk (some of the 2PC files may still be in WAL > records to-be-replayed), so a natural candidate to gather the > information wanted here would be RecoverPreparedTransactions() for a > primary, and StandbyRecoverPreparedTransactions() for a standby. Sorry, I meant something in the Postgres logs at postmaster start. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > On Tue, Apr 21, 2020 at 01:52:46PM +0900, Michael Paquier wrote: >> On Mon, Apr 20, 2020 at 10:35:15PM -0400, Bruce Momjian wrote: >>> On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: >>>> If we were going to go down the path of periodically logging warnings >>>> about old prepared transactions, some single-instance background task >>>> like the checkpointer would be a better place to do the work in. But >>>> I'm not really recommending that, because I agree with Robert that >>>> we just plain don't want this functionality. > Sorry, I meant something in the Postgres logs at postmaster start. That seems strictly worse than periodic logging as far as the probability that somebody will notice the log entry goes. In any case it would only help people when they restart their postmaster, which ought to be pretty infrequent in a production situation. regards, tom lane
On Tue, Apr 21, 2020 at 04:03:53PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, Apr 21, 2020 at 01:52:46PM +0900, Michael Paquier wrote: > >> On Mon, Apr 20, 2020 at 10:35:15PM -0400, Bruce Momjian wrote: > >>> On Thu, Apr 16, 2020 at 03:11:51PM -0400, Tom Lane wrote: > >>>> If we were going to go down the path of periodically logging warnings > >>>> about old prepared transactions, some single-instance background task > >>>> like the checkpointer would be a better place to do the work in. But > >>>> I'm not really recommending that, because I agree with Robert that > >>>> we just plain don't want this functionality. > > > Sorry, I meant something in the Postgres logs at postmaster start. > > That seems strictly worse than periodic logging as far as the probability > that somebody will notice the log entry goes. In any case it would only > help people when they restart their postmaster, which ought to be pretty > infrequent in a production situation. I thought if something was wrong, they might look at the server logs after a restart, or they might have a higher probability of having orphaned prepared transactions after a restart. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Tue, Apr 21, 2020 at 6:10 PM Bruce Momjian <bruce@momjian.us> wrote: > I thought if something was wrong, they might look at the server logs > after a restart, or they might have a higher probability of having > orphaned prepared transactions after a restart. Maybe slightly, but having a monitoring tool like check_postgres.pl for this sort of thing still seems like a vastly better solution. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Apr 22, 2020 at 01:05:17PM -0400, Robert Haas wrote: > On Tue, Apr 21, 2020 at 6:10 PM Bruce Momjian <bruce@momjian.us> wrote: > > I thought if something was wrong, they might look at the server logs > > after a restart, or they might have a higher probability of having > > orphaned prepared transactions after a restart. > > Maybe slightly, but having a monitoring tool like check_postgres.pl > for this sort of thing still seems like a vastly better solution. It is --- I was just thinking we should have a minimal native warning. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +