Thread: XID wraparound and busy databases
I was talking to someone at LinuxWorld and they mentioned they often have activity of 6k SELECTs per second, and that they were needing to autovacuum every few days because of xid wraparound. I did some calculations and found that: > 6000 * 60 * 60 * 24 = 518400000 or 500 million xids per day, confirming they would need the autovacuum to run every few days. Is enlarging the xid field something we should consider for 8.4? Is the autovacuum I/O less than the extra I/O needed for an expanded xid field on every row? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
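[Editor's sketch] Bruce's arithmetic checks out; a quick back-of-the-envelope script, using the fact that PostgreSQL XIDs are 32-bit counters compared circularly, so roughly 2^31 ids can be consumed before a wraparound-avoiding vacuum becomes mandatory:

```python
# Verify the numbers in the message above.
XACTS_PER_SEC = 6_000

xids_per_day = XACTS_PER_SEC * 60 * 60 * 24   # 518400000, ~500 million/day
days_to_wraparound = 2**31 / xids_per_day     # ~4.1 days until a
                                              # wraparound vacuum is needed
print(xids_per_day, round(days_to_wraparound, 1))
```

which matches the "every few days" observation.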
Bruce Momjian <bruce@momjian.us> writes: > Is enlarging the xid field something we should consider for 8.4? No. We just got the tuple header down to 24 bytes, we are not going to give that back and then some. If you are processing 6K transactions per second, you can afford to vacuum every couple days... and probably need to vacuum much more often than that anyway, to avoid table bloat. Possibly your respondent should think about trying to do more than one thing per transaction? regards, tom lane
Bruce Momjian wrote: > Is enlarging the xid field something we should consider for 8.4? Is the > autovacuum I/O less than the extra I/O needed for an expanded xid field > on every row? I doubt that's going to happen... Maybe we can do something to reduce the xid consumption? For example, reuse xids for read-only queries. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Is enlarging the xid field something we should consider for 8.4? > > No. We just got the tuple header down to 24 bytes, we are not going > to give that back and then some. > > If you are processing 6K transactions per second, you can afford to > vacuum every couple days... and probably need to vacuum much more often > than that anyway, to avoid table bloat. > > Possibly your respondent should think about trying to do more than one > thing per transaction? OK, yea, I think that makes sense. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Heikki Linnakangas <heikki@enterprisedb.com> writes: > Maybe we can do something to reduce the xid consumption? For example, > reuse xids for read-only queries. Hmm, that's an idea. More simply, just keep the current transaction open (resetting everything but the XID) if we have made no changes by the time we're told to commit or rollback ... which is something we track already, so as not to waste cycles on useless commit XLOG records. You'd want some upper limit on transaction lifetime, so as to avoid the "long lived transactions hurt VACUUM" problem, but even reusing a single xact for a few seconds would pretty much eliminate this issue, I bet. It's hard to see how anyone could be doing 6K xacts/sec unless most are read-only. regards, tom lane
On Wed, Aug 15, 2007 at 12:49:52PM -0400, Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > > Maybe we can do something to reduce the xid consumption? For example, > > reuse xids for read-only queries. > > Hmm, that's an idea. > > More simply, just keep the current transaction open (resetting > everything but the XID) if we have made no changes by the time we're > told to commit or rollback ... which is something we track already, > so as not to waste cycles on useless commit XLOG records. > > You'd want some upper limit on transaction lifetime, so as to avoid the > "long lived transactions hurt VACUUM" problem, but even reusing a single > xact for a few seconds would pretty much eliminate this issue, I bet. > It's hard to see how anyone could be doing 6K xacts/sec unless most are > read-only. Aren't there potential issues with keeping the same XID if a transaction in the middle has modified data? IE: SELECT * FROM a DELETE FROM a ... SELECT * FROM a I'm thinking that in any system that's doing an enormous transaction rate, most will be read-only, which means there'd still be a lot of benefit to simply using one transaction until the next write transaction came along. Something else to think about... any app that's doing that kind of transaction rate is likely going to have a large number of backends, so it would be even better if one XID could be shared across backends. -- Decibel!, aka Jim Nasby decibel@decibel.org EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
Decibel! <decibel@decibel.org> writes: > Aren't there potential issues with keeping the same XID if a transaction > in the middle has modified data? I don't see any, as long as you take a new snapshot. > Something else to think about... any app that's doing that kind of > transaction rate is likely going to have a large number of backends, so > it would be even better if one XID could be shared across backends. Not sane in the least. What happens if two of them start to modify data, and then one commits and one rolls back? In any case, if we can cut the xact rate to one every few seconds per backend, the problem goes away. Actually ... an idea that might be worth investigating is to do something similar to what we already did for subtransactions: don't assign an XID at all until the transaction makes a data change. I think that the main reason for assigning an XID earlier is just that the ProcArray routines ignore backends that are not currently showing an active XID when they figure global XMIN. But a backend could probably advertise an XMIN, indicating the age of the oldest active snapshot it's got, without advertising an active XID. (This could also tie into the idea we discussed awhile back of tracking minimum XMIN better by maintaining a central list of open snapshots within each backend. Perhaps that should be totally decoupled from whether we are advertising an active XID ...) regards, tom lane
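[Editor's sketch] Tom's lazy-assignment idea can be modeled as below. This is an illustrative Python toy, not PostgreSQL's actual ProcArray code; all names here are hypothetical:

```python
from itertools import count

_next_xid = count(1000)  # stand-in for the global XID allocator

class Backend:
    """A backend that advertises an xmin for its oldest snapshot
    without holding an XID of its own; the XID is allocated lazily,
    on the first data modification (mirroring what was already done
    for subtransactions)."""

    def __init__(self):
        self.xid = None    # no XID until we actually write
        self.xmin = None   # advertised so VACUUM respects our snapshots

    def take_snapshot(self, current_next_xid):
        # A read-only snapshot needs an advertised xmin, not an XID.
        if self.xmin is None:
            self.xmin = current_next_xid

    def modify_data(self):
        # Only here does the backend consume an XID.
        if self.xid is None:
            self.xid = next(_next_xid)
        return self.xid
```

A purely read-only backend never calls `modify_data` and so never consumes an XID, which is the point: the 6K/sec mostly-SELECT workload would stop burning through the XID space.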
On Wednesday 15 August 2007 09:49:52 Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > > Maybe we can do something to reduce the xid consumption? For example, > > reuse xids for read-only queries. > > Hmm, that's an idea. > > More simply, just keep the current transaction open (resetting > everything but the XID) if we have made no changes by the time we're > told to commit or rollback ... which is something we track already, > so as not to waste cycles on useless commit XLOG records. Jan and I were discussing something like this as it would relate to a subscribe process for Slony. Jan, care to summarize your thoughts on this? > > You'd want some upper limit on transaction lifetime, so as to avoid the > "long lived transactions hurt VACUUM" problem, but even reusing a single > xact for a few seconds would pretty much eliminate this issue, I bet. > It's hard to see how anyone could be doing 6K xacts/sec unless most are > read-only. > > regards, tom lane
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Is enlarging the xid field something we should consider for 8.4? > > No. We just got the tuple header down to 24 bytes, we are not going > to give that back and then some. > > If you are processing 6K transactions per second, you can afford to > vacuum every couple days... and probably need to vacuum much more often > than that anyway, to avoid table bloat. Speaking of vacuum, hopefully we'll get some sort of dead space map in 8.4. If we keep track of frozen pages there, vacuuming to avoid xid wraparound will be much cheaper. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote: > It's hard to see how anyone could be doing 6K xacts/sec unless most are > read-only. > > regards, tom lane > In a recent stress test with our PostgreSQL-based cluster between two machines, 3 million transactions were performed with "pgbench -c 150 -t 20000 -s 200" in about _ten hours_. The primary machine (a desktop-level development machine) used a real disk; the secondary used tmpfs as PGDATA. Say whatever you want about my disk lying about flush, its 75MB/sec transfer rate is real. So 5 million "real" transactions in 24 hours is not unrealistic. -- ---------------------------------- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/
Zoltan Boszormenyi wrote: > Tom Lane wrote: >> It's hard to see how anyone could be doing 6K xacts/sec unless most are >> read-only. > > In a recent stress test with our PostgreSQL-based cluster between two > machines, > 3 million transactions were performed with "pgbench -c 150 -t 20000 -s 200" > in about _ten hours_. The primary machine (desktop-level machine for > development) > used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want > about my disk lying about flush, its 75MB/sec transfer > rate is real. > So 5 million "real" transactions in 24 hours is not unrealistic. 6k xacts / s is five *hundred* million transactions, not five million... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Zoltan Boszormenyi wrote: > >> Tom Lane wrote: >> >>> It's hard to see how anyone could be doing 6K xacts/sec unless most are >>> read-only. >>> >> In a recent stress test with our PostgreSQL-based cluster between two >> machines, >> 3 million transactions were performed with "pgbench -c 150 -t 20000 -s 200" >> in about _ten hours_. The primary machine (desktop-level machine for >> development) >> used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want >> about my disk lying about flush, its 75MB/sec transfer >> rate is real. >> So 5 million "real" transactions in 24 hours is not unrealistic. >> > > 6k xacts / s is five *hundred* million transactions, not five million... > Blush. :-) You're right. However a single machine with ramdisk is able to do that. -- ---------------------------------- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/
Zoltan Boszormenyi wrote: > >> used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want > >> about my disk lying about flush, its 75MB/sec transfer rate transfer > >> rate is real. > >> So 5 million "real" transaction in 24 hours is not unrealistic. > >> > > > > 6k xacts / s is five *hundred* million transactions, not five million... > > > > Blush. :-) You're right. However a single machine with ramdisk is able > to do that. A battery-backed hard disk controller card can do that too. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Wednesday 15 August 2007 13:54, Tom Lane wrote: > Decibel! <decibel@decibel.org> writes: > > Aren't there potential issues with keeping the same XID if a transaction > > in the middle has modified data? > > I don't see any, as long as you take a new snapshot. > I'm a little confused; wouldn't the transaction that waits 30 minutes before modifying data need to get an XID that jives with the system when its transaction started, not when it began manipulating data? Would it really be safe to take a new snapshot at that time? ISTM concurrent writers might have caused potential issues by that point. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes: > I'm a little confused; wouldn't the transaction that waits 30 minutes before > modifying data need to get an XID that jives with the system when its > transaction started, not when it began manipulating data? Why? > Would it really be safe to take a new snapshot at that time? You wouldn't take a new snapshot. The thought that occurs to me is that there's no reason that a transaction has to have an XID for itself before it takes a snapshot. We always special-case our own XID anyway. regards, tom lane
>>> On Wed, Aug 15, 2007 at 5:06 PM, in message <7968.1187215570@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Treat <xzilla@users.sourceforge.net> writes: >> I'm a little confused, wouldnt the transaction that waits 30 minutes before >> modifying data need to get an XID that jives with the system when it's >> transaction started, not when it began manipulating data? > > Why? > >> Would it really be safe to take a new snapshot at that time, > > You wouldn't take a new snapshot. The thought that occurs to me is that > there's no reason that a transaction has to have an XID for itself > before it takes a snapshot. We always special-case our own XID anyway. I'm having trouble picturing how that would work with a transaction using the SERIALIZABLE transaction isolation level, or would this just be done at the READ COMMITTED level? -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You wouldn't take a new snapshot. The thought that occurs to me is that >> there's no reason that a transaction has to have an XID for itself >> before it takes a snapshot. We always special-case our own XID anyway. > I'm having trouble picturing how that would work with a transaction using > the SERIALIZABLE transaction isolation level, Why? You take a snapshot, you use it. If you later need to allocate an XID for yourself, you do that --- your own XID is surely uncommitted in any case, so this doesn't affect the validity of the snapshot. The bottom line here is that we need own-XID-if-any to be >= snapshot xmin, but there's no obvious reason why it has to be < snapshot xmax. This is, in fact, *not* the case for subtransaction XIDs, and I see no fundamental reason why it need be true for the top transaction XID. regards, tom lane
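[Editor's sketch] The special-casing Tom describes can be made concrete with a toy visibility test. This is a deliberate simplification (the real HeapTupleSatisfiesMVCC also consults commit status, subtransactions, and command ids); it only illustrates why own-XID >= snapshot xmax is harmless:

```python
def xid_visible(xid, xmin, xmax, in_progress, my_xid):
    """Simplified MVCC visibility check (illustrative only)."""
    if xid == my_xid:
        return True            # own XID is special-cased first, so it may
                               # legally be >= xmax (assigned post-snapshot)
    if xid >= xmax:
        return False           # started after the snapshot was taken
    if xid < xmin:
        return True            # (simplified: assume it committed)
    return xid not in in_progress  # in range: visible unless still running

# Own XID assigned after the snapshot: outside [xmin, xmax) yet visible.
assert xid_visible(210, xmin=100, xmax=200, in_progress=set(), my_xid=210)
```

Since the own-XID branch fires before any range check, assigning the XID lazily (after the snapshot, hence above its xmax) changes nothing about what the snapshot sees.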
On Wed, 2007-08-15 at 12:49 -0400, Tom Lane wrote: > Heikki Linnakangas <heikki@enterprisedb.com> writes: > > Maybe we can do something to reduce the xid consumption? For example, > > reuse xids for read-only queries. > > Hmm, that's an idea. That would be a very useful idea for additional scalability and it does sound like a good idea. We kicked around the idea of reusing snapshots previously also, as a way of reducing access to the ProcArray. That idea had some problems, but if your workload was all read-only and we reused xids then everybody would have an identical snapshot, so reusing it would make lots of sense. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: >> Is enlarging the xid field something we should consider for 8.4? > > No. We just got the tuple header down to 24 bytes, we are not going > to give that back and then some. > > If you are processing 6K transactions per second, you can afford to > vacuum every couple days... and probably need to vacuum much more often > than that anyway, to avoid table bloat. > > Possibly your respondent should think about trying to do more than one > thing per transaction? I'm wondering how many of those 6k xacts/second are actually modifying data. If a large percentage of those are readonly queries, then the need for vacuuming could be reduced if postgres assigned an xid only if that xid really hits the disk. Otherwise (for purely select-type queries) it could use some special xid value. This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. greetings, Florian Pflug
Added to TODO: * Reduce XID consumption of read-only queries http://archives.postgresql.org/pgsql-hackers/2007-08/msg00516.php --------------------------------------------------------------------------- Florian G. Pflug wrote: > Tom Lane wrote: > > Bruce Momjian <bruce@momjian.us> writes: > >> Is enlarging the xid field something we should consider for 8.4? > > > > No. We just got the tuple header down to 24 bytes, we are not going > > to give that back and then some. > > > > If you are processing 6K transactions per second, you can afford to > > vacuum every couple days... and probably need to vacuum much more often > > than that anyway, to avoid table bloat. > > > > Possibly your respondent should think about trying to do more than one > > thing per transaction? > > I'm wondering how many of those 6k xacts/second are actually modifying > data. If a large percentage of those are readonly queries, then the need > for vacuuming could be reduced if postgres assigned an xid only if that > xid really hits the disk. Otherwise (for purely select-type queries) it > could use some special xid value. > > This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. > > greetings, Florian Pflug -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Thu, Aug 16, 2007 at 04:13:55PM +0200, Florian G. Pflug wrote: > Tom Lane wrote: > >Bruce Momjian <bruce@momjian.us> writes: > >>Is enlarging the xid field something we should consider for 8.4? > > > >No. We just got the tuple header down to 24 bytes, we are not going > >to give that back and then some. > > > >If you are processing 6K transactions per second, you can afford to > >vacuum every couple days... and probably need to vacuum much more often > >than that anyway, to avoid table bloat. > > > >Possibly your respondent should think about trying to do more than one > >thing per transaction? > > I'm wondering how many of those 6k xacts/second are actually modifying > data. If a large percentage of those are readonly queries, then the need > for vacuuming could be reduced if postgres assigned an xid only if that > xid really hits the disk. Otherwise (for purely select-type queries) it > could use some special xid value. > > This is what I'm doing in my Readonly-Queries-On-PITR-Slave patch. Bruce talked to me and others at Hi5. We're the ones with this high workload, and yes, the traffic is predominantly SELECTs. Our application issues parallel SELECT queries to 32 partitioned tables in 32 separate threads. For our purposes this is much, much faster than a UNION view or table inheritance. These tables are partitioned on one key, but we also need to search on other indexed keys, thus we use what we call a 'broadcast query'. We'd be happy with better, faster vacuums or limits on xids (or both!) -- Paul Lindner ||||| | | | | | | | | | lindner@inuus.com
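[Editor's sketch] The fan-out Paul describes might look roughly like this; `run_one` is a hypothetical stand-in for executing the per-partition SELECT on its own connection, and the table-naming scheme in the comment is illustrative, not Hi5's actual schema:

```python
from concurrent.futures import ThreadPoolExecutor

N_PARTITIONS = 32

def broadcast_query(run_one, key):
    """Issue the same search against all partition tables in parallel
    and concatenate the per-partition results. run_one(partition, key)
    would run something like
    'SELECT ... FROM t_<partition> WHERE other_key = %s'
    on its own connection."""
    with ThreadPoolExecutor(max_workers=N_PARTITIONS) as pool:
        futures = [pool.submit(run_one, p, key) for p in range(N_PARTITIONS)]
        rows = []
        for f in futures:
            rows.extend(f.result())
        return rows
```

Each thread holds its own backend connection, which is exactly why per-backend XID consumption multiplies under this pattern and why lazy XID assignment for read-only work would help so much here.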