Thread: How to avoid transaction ID wrap
OK, here's my problem, I have a nature study where we have about 10 video cameras taking 15 frames per second. For each frame we make a few transactions on a PostgreSQL database. We want to keep about a year's worth of data at any specific time. We have triggers that fire if something interesting is found on insert. We want this thing to run for a long time.

From the numbers, you can see the PostgreSQL database is VERY loaded. Running VACUUM may not always be possible without losing data. The numbers I have amount to 466,560,000 transactions per month, lasting a maximum of about 9 months until XID wrap.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number of transactions, COPY, etc. so I'm not dead in the water, but I would be interested in any observations you may have.
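For reference, a sketch of the arithmetic behind those figures, assuming the stated 466,560,000 transactions per month: the ~9-month estimate corresponds to the full 32-bit XID space, while the practical limit is about 2^31 transactions between freezes, which is where the roughly 4.5-month figure given downthread comes from.

    -- full 32-bit XID space at this rate: about 9.2 months
    SELECT round(4294967296 / 466560000.0, 1);
    -- practical wraparound limit of ~2^31 XIDs: about 4.6 months
    SELECT round(2147483648 / 466560000.0, 1);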
On Tue, 6 Jun 2006, Mark Woodward wrote:
> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.

I would suggest doing all the inserts of one frame in one transaction. Maybe even multiple frames in one transaction. That should bring down the number of transactions significantly.

> We want to keep about a year's worth of data at any specific time.
> We have triggers that fire if something interesting is found on insert.
> We want this thing to run for a long time.
> From the numbers, you can see the PostgreSQL database is VERY loaded.
> Running VACUUM may not always be possible without losing data.

Why not?

> The numbers I have amount to 466,560,000 transactions per month, lasting a
> maximum of about 9 months until XID wrap.

If you can get that maximum up above one year (which was how long you want to keep the data), you won't need to freeze the records to avoid ID wraparound.

- Heikki
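For illustration, a sketch of that batching (the table and column names here are hypothetical): each statement issued outside an explicit transaction block is its own transaction and consumes its own XID, so wrapping all of a frame's inserts in one BEGIN/COMMIT cuts XID consumption to one per frame (or per batch of frames).

    BEGIN;
    INSERT INTO frames (camera_id, captured_at, frame_no)
        VALUES (3, now(), 123456);
    INSERT INTO detections (camera_id, captured_at, kind)
        VALUES (3, now(), 'motion');
    INSERT INTO frame_stats (camera_id, captured_at, brightness)
        VALUES (3, now(), 0.42);
    COMMIT;  -- one transaction, one XID, instead of three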
On Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.
> We want to keep about a year's worth of data at any specific time.

partition by month, then you have better chances of removing old data without causing overload/data loss;

> We have triggers that fire if something interesting is found on insert.
> We want this thing to run for a long time.
> From the numbers, you can see the PostgreSQL database is VERY loaded.
> Running VACUUM may not always be possible without losing data.

why ? just run it with very friendly delay settings.

> The numbers I have amount to 466,560,000 transactions per month, lasting a
> maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.

> I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
> of transactions, COPY, etc. so I'm not dead in the water, but I would be
> interested in any observations you may have.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
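For illustration, a sketch of the month-partitioning idea using the inheritance-based partitioning of that era; all table and column names are hypothetical. Retiring a month then becomes a DROP or TRUNCATE of one child table rather than a huge DELETE plus the vacuum work to reclaim it:

    CREATE TABLE frames (
        camera_id   int       NOT NULL,
        captured_at timestamp NOT NULL,
        frame_no    bigint    NOT NULL
    );

    -- one child table per month, constrained to that month's range
    CREATE TABLE frames_2006_06 (
        CHECK (captured_at >= '2006-06-01' AND captured_at < '2006-07-01')
    ) INHERITS (frames);

    CREATE TABLE frames_2006_07 (
        CHECK (captured_at >= '2006-07-01' AND captured_at < '2006-08-01')
    ) INHERITS (frames);

    -- the application inserts into the current month's child directly;
    -- with constraint_exclusion enabled, queries on the parent that filter
    -- on captured_at skip the irrelevant children.  Retiring the oldest
    -- month is then just:
    --   DROP TABLE frames_2006_05;   -- or: TRUNCATE frames_2006_05;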
Mark Woodward wrote:
> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations on bigger transactions, the I/O savings could be enough to buy you the ability to vacuum once in a while. Or consider buffering somehow -- save the data elsewhere, and have some sort of daemon to put it into the database. This would allow you to cope with the I/O increase during vacuum.

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:
> On Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
>> OK, here's my problem, I have a nature study where we have about 10 video
>> cameras taking 15 frames per second.
>> For each frame we make a few transactions on a PostgreSQL database.
>> We want to keep about a year's worth of data at any specific time.
>
> partition by month, then you have better chances of removing old data
> without causing overload/data loss;

It's certainly worth something to be able to TRUNCATE an elderly partition; that cleans things out very nicely...

>> We have triggers that fire if something interesting is found on insert.
>> We want this thing to run for a long time.
>> From the numbers, you can see the PostgreSQL database is VERY loaded.
>> Running VACUUM may not always be possible without losing data.
>
> why ? just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to make vacuum run on the order of 3x as long, which means that if you have a very large table that takes 12h to VACUUM, "vacuum delay" will increase that to 36h, which means you'll have a transaction open for 36h.

That'll be very evil, to be sure...

>> The numbers I have amount to 466,560,000 transactions per month, lasting a
>> maximum of about 9 months until XID wrap.
>
> actually 4.5 months as you will start having problems at 2G xacts.

Right.

>> I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
>> of transactions, COPY, etc. so I'm not dead in the water, but I would be
>> interested in any observations you may have.

Grouping work together to diminish the number of transactions is almost always something of a win...

-- 
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxdatabases.info/info/rdbms.html
Roses are red, Violets are blue,
I'm schizophrenic... And I am too.
On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:
> Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:
>> On Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
>>> OK, here's my problem, I have a nature study where we have about 10 video
>>> cameras taking 15 frames per second.
>>> For each frame we make a few transactions on a PostgreSQL database.
>>> We want to keep about a year's worth of data at any specific time.
>>
>> partition by month, then you have better chances of removing old data
>> without causing overload/data loss;
>
> It's certainly worth something to be able to TRUNCATE an elderly
> partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent table while truncate runs on the child (happens with constraint exclusion as well).

If your transactions are short then it won't be an issue. If you have mixed length transactions (many short with the occasional long select) then it becomes tricky since those short transactions will be blocked.

>>> We have triggers that fire if something interesting is found on insert.
>>> We want this thing to run for a long time.
>>> From the numbers, you can see the PostgreSQL database is VERY loaded.
>>> Running VACUUM may not always be possible without losing data.
>>
>> why ? just run it with very friendly delay settings.
>
> "Friendly delay settings" can have adverse effects; it is likely to
> make vacuum run on the order of 3x as long, which means that if you
> have a very large table that takes 12h to VACUUM, "vacuum delay" will
> increase that to 36h, which means you'll have a transaction open for
> 36h.

Sounds like this is almost strictly inserts and selects though. If there is limited garbage collection (updates, deletes, rollbacks of inserts) required then it isn't all that bad.

--
Rod Taylor wrote:
> With one potential snafu -- it blocks new SELECTs against the parent
> table while truncate runs on the child (happens with constraint
> exclusion as well).
>
> If your transactions are short then it won't be an issue. If you have
> mixed length transactions (many short with the occasional long select)
> then it becomes tricky since those short transactions will be blocked.

One idea is to try to acquire the lock before issuing the TRUNCATE itself. If the LOCK TABLE times out, you know you should wait for a long-running transaction ...

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
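For illustration, a sketch of that probe-then-truncate approach (the partition name is hypothetical): taking the lock under a short statement_timeout means that if a long-running transaction holds the table, the attempt fails quickly instead of leaving a pending ACCESS EXCLUSIVE request that every new SELECT would queue behind.

    BEGIN;
    SET LOCAL statement_timeout = 2000;   -- milliseconds; give up if the lock isn't granted quickly
    LOCK TABLE frames_2006_05 IN ACCESS EXCLUSIVE MODE;
    SET LOCAL statement_timeout = 0;      -- don't let the timeout cancel the TRUNCATE itself
    TRUNCATE TABLE frames_2006_05;
    COMMIT;
    -- if the LOCK TABLE is cancelled by the timeout, ROLLBACK and retry later,
    -- once the long-running transaction holding the table has finished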
> Mark Woodward wrote:
>> OK, here's my problem, I have a nature study where we have about 10 video
>> cameras taking 15 frames per second.
>> For each frame we make a few transactions on a PostgreSQL database.
>
> Maybe if you grouped multiple operations on bigger transactions, the I/O
> savings could be enough to buy you the ability to vacuum once in a
> while. Or consider buffering somehow -- save the data elsewhere, and
> have some sort of daemon to put it into the database. This would allow
> you to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily hide the problem for a predictable amount of time. My hope was that someone would have a real "long term" workaround.
I've once proposed a patch for 64bit transaction ID, but this causes some overhead to each tuple (XMIN and XMAX). Pgbench with 64bit transaction ID has to pay about a couple of percent of performance. If 64bit transaction ID is a reasonable fix, I've already posted this patch. Anyone can apply this to later versions.

Mark Woodward wrote:
>> Mark Woodward wrote:
>>> OK, here's my problem, I have a nature study where we have about 10 video
>>> cameras taking 15 frames per second.
>>> For each frame we make a few transactions on a PostgreSQL database.
>>
>> Maybe if you grouped multiple operations on bigger transactions, the I/O
>> savings could be enough to buy you the ability to vacuum once in a
>> while. Or consider buffering somehow -- save the data elsewhere, and
>> have some sort of daemon to put it into the database. This would allow
>> you to cope with the I/O increase during vacuum.
>
> The problem is sufficiently large that any minor modification can easily
> hide the problem for a predictable amount of time. My hope was that someone
> would have a real "long term" workaround.

-- 
Koichi Suzuki
Koichi Suzuki wrote:
> I've once proposed a patch for 64bit transaction ID, but this causes
> some overhead to each tuple (XMIN and XMAX). Pgbench with 64bit
> transaction ID has to pay about a couple of percent of performance.
> If 64bit transaction ID is a reasonable fix, I've already posted this
> patch. Anyone can apply this to later versions.

Did you check performance on 32-bit or 64-bit systems and 64-bit binary version of PGSQL? I think that today it is not a problem to have a 64-bit architecture, and a 64-bit ID should increase the scalability of Postgres.

Zdenek
Koichi Suzuki wrote:
> I've once proposed a patch for 64bit transaction ID, but this causes
> some overhead to each tuple (XMIN and XMAX). Pgbench with 64bit
> transaction ID has to pay about a couple of percent of performance. If
> 64bit transaction ID is a reasonable fix, I've already posted this
> patch. Anyone can apply this to later versions.

Be careful, the pg_multixact stuff assumes that a MultiXactId is the same size as TransactionId, so you have to change that too. I don't recall offhand if it was defined in a way that would make it "just work" automatically. (You'd also have to be careful about it not overrunning the SLRU files when it's close to the end of 'em.)

-- 
Alvaro Herrera                         http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Wed, Jun 07, 2006 at 01:48:50PM +0200, Zdenek Kotala wrote:
> Koichi Suzuki wrote:
>> I've once proposed a patch for 64bit transaction ID, but this causes
>> some overhead to each tuple (XMIN and XMAX). Pgbench with 64bit
>> transaction ID has to pay about a couple of percent of performance.
>> If 64bit transaction ID is a reasonable fix, I've already posted this
>> patch. Anyone can apply this to later versions.
>
> Did you check performance on 32-bit or 64-bit systems and 64-bit binary
> version of PGSQL? I think that today it is not a problem to have a 64-bit
> architecture, and a 64-bit ID should increase the scalability of Postgres.

I doubt performance is the issue directly. Increasing the size of the transaction counter would increase the size of narrow tables by maybe 30%. That's 30% more disk space and 30% more memory usage in some places. Maybe at some point it'll be worth it, but right now I don't think those commodities are cheap enough to use like this for fairly marginal benefits. Besides, memory bandwidth hasn't grown anywhere near as fast as memory capacity, so it's always a good idea to use as little memory as possible.

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> Koichi Suzuki wrote:
>> I've once proposed a patch for 64bit transaction ID, but this causes
>> some overhead to each tuple (XMIN and XMAX).

> Did you check performance on 32-bit or 64-bit systems and 64-bit binary
> version of PGSQL? I think that today it is not a problem to have a 64-bit
> architecture, and a 64-bit ID should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was rejected, not so much the arithmetic.

			regards, tom lane
On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> Koichi Suzuki wrote:
>>> I've once proposed a patch for 64bit transaction ID, but this causes
>>> some overhead to each tuple (XMIN and XMAX).
>
>> Did you check performance on 32-bit or 64-bit systems and 64-bit binary
>> version of PGSQL? I think that today it is not a problem to have a 64-bit
>> architecture, and a 64-bit ID should increase the scalability of Postgres.
>
> The percentage increase in I/O demand is the main reason the patch was
> rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark can't vacuum frequently enough to handle rollover...

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On Tue, 2006-06-06 at 13:53, Christopher Browne wrote:
>>> We have triggers that fire if something interesting is found on insert.
>>> We want this thing to run for a long time.
>>> From the numbers, you can see the PostgreSQL database is VERY loaded.
>>> Running VACUUM may not always be possible without losing data.
>>
>> why ? just run it with very friendly delay settings.
>
> "Friendly delay settings" can have adverse effects; it is likely to
> make vacuum run on the order of 3x as long, which means that if you
> have a very large table that takes 12h to VACUUM, "vacuum delay" will
> increase that to 36h, which means you'll have a transaction open for
> 36h.
>
> That'll be very evil, to be sure...

Not always. I know that it is evil in the slony1 context, but often it *is* possible to design your system in a way where a superlong transaction is almost unnoticeable.

Long transactions are evil in case they cause some fast-changing table to grow its storage size by several orders of magnitude, but if that is not the case then they just run there in the background with no ill effects, especially do-nothing transactions like vacuum.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
On Thu, Jun 08, 2006 at 01:30:42AM +0300, Hannu Krosing wrote:
> On Tue, 2006-06-06 at 13:53, Christopher Browne wrote:
>>>> We have triggers that fire if something interesting is found on insert.
>>>> We want this thing to run for a long time.
>>>> From the numbers, you can see the PostgreSQL database is VERY loaded.
>>>> Running VACUUM may not always be possible without losing data.
>>>
>>> why ? just run it with very friendly delay settings.
>>
>> "Friendly delay settings" can have adverse effects; it is likely to
>> make vacuum run on the order of 3x as long, which means that if you
>> have a very large table that takes 12h to VACUUM, "vacuum delay" will
>> increase that to 36h, which means you'll have a transaction open for
>> 36h.
>>
>> That'll be very evil, to be sure...
>
> Not always. I know that it is evil in the slony1 context, but often it *is*
> possible to design your system in a way where a superlong transaction is
> almost unnoticeable.
>
> Long transactions are evil in case they cause some fast-changing table
> to grow its storage size by several orders of magnitude, but if that is not
> the case then they just run there in the background with no ill effects,
> especially do-nothing transactions like vacuum.

Plus, if the only issue here is in fact the long-running transaction for vacuum, there are other ways to address that which would be a lot less intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2 vacuum will start a new transaction every time it fills up maintenance_work_mem, so just setting that low could solve the problem (at the expense of a heck of a lot of extra IO).

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
> On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
>> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>>> Koichi Suzuki wrote:
>>>> I've once proposed a patch for 64bit transaction ID, but this causes
>>>> some overhead to each tuple (XMIN and XMAX).
>>
>>> Did you check performance on 32-bit or 64-bit systems and 64-bit binary
>>> version of PGSQL? I think that today it is not a problem to have a 64-bit
>>> architecture, and a 64-bit ID should increase the scalability of Postgres.
>>
>> The percentage increase in I/O demand is the main reason the patch was
>> rejected, not so much the arithmetic.
>
> Before considering 64 bit XIDs, it'd be very helpful to know why Mark
> can't vacuum frequently enough to handle rollover...

The system is under heavy load, and while there are tricks that can be done, vacuum is a process which puts extra load on the system while it is running. It is a sliding scale; as always, you may get the system to the point where it can vacuum AND perform as needed, but the database is growing constantly. Eventually you will get to the point where you can't run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't "smooth," between checkpoints and vacuum, it is near impossible to make a product that performs consistently under high load.

Now don't flame me, I really do love PostgreSQL, it is just that I bump up against these issues from time to time and it would be nice if there were some way to work around them.
On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
> I guess what I am saying is that PostgreSQL isn't "smooth," between
> checkpoints and vacuum, it is near impossible to make a product that
> performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to find a case where I couldn't smooth out the IO load so that it wasn't an issue.

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
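For illustration, the kind of knobs being referred to, settable per session before a manual VACUUM; the values and the partition name are placeholders, not recommendations:

    SET vacuum_cost_delay = 10;      -- sleep 10 ms whenever the page-cost budget is used up
    SET vacuum_cost_limit = 200;     -- the page-cost budget accumulated between sleeps
    VACUUM ANALYZE frames_2006_06;   -- hypothetical partition currently receiving inserts
    -- a non-zero vacuum_cost_delay throttles vacuum's I/O so it competes less
    -- with the insert stream, at the price of a longer-running vacuum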
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> Koichi Suzuki wrote:
>>> I've once proposed a patch for 64bit transaction ID, but this causes
>>> some overhead to each tuple (XMIN and XMAX).
>
>> Did you check performance on 32-bit or 64-bit systems and 64-bit binary
>> version of PGSQL? I think that today it is not a problem to have a 64-bit
>> architecture, and a 64-bit ID should increase the scalability of Postgres.

I checked the performance on a 64-bit system with a 64-bit binary.

> The percentage increase in I/O demand is the main reason the patch was
> rejected, not so much the arithmetic.

That's right. I've also observed the I/O demand increase. I remember we had to pay a three to five percent performance decrease in pgbench. So I don't think we should apply this patch without further justification. I'm looking for other reasons for a larger transaction ID.

-- 
Koichi Suzuki
On Wed, 2006-06-07 at 17:45, Jim C. Nasby wrote:
> Plus, if the only issue here is in fact the long-running transaction for
> vacuum, there are other ways to address that which would be a lot less
> intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
> vacuum will start a new transaction every time it fills up
> maintenance_work_mem, so just setting that low could solve the problem
> (at the expense of a heck of a lot of extra IO).

If the aim is to *only* avoid transaction wraparound, then maybe we could introduce VACUUM FREEZE ONLY, which never removes any old tuples, but instead just marks them by setting xmin=xmax for them, in addition to its freezing of live-and-visible-to-all tuples.

This would avoid touching indexes at all and may well be what is desired for tables with only very few updates/deletes.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
> On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
>> I guess what I am saying is that PostgreSQL isn't "smooth," between
>> checkpoints and vacuum, it is near impossible to make a product that
>> performs consistently under high load.
>
> Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
> find a case where I couldn't smooth out the IO load so that it wasn't an
> issue.

In several projects that I have been involved with, PostgreSQL had most of the important features to be used, but in one project, checkpoints caused us to time out under load. In this current project I am researching, I know that vacuum may be an issue. The load is brutally constant.
Mark Woodward wrote:
>> On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
>>> I guess what I am saying is that PostgreSQL isn't "smooth," between
>>> checkpoints and vacuum, it is near impossible to make a product that
>>> performs consistently under high load.
>>
>> Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
>> find a case where I couldn't smooth out the IO load so that it wasn't an
>> issue.
>
> In several projects that I have been involved with, PostgreSQL had most of
> the important features to be used, but in one project, checkpoints caused
> us to time out under load. In this current project I am researching, I
> know that vacuum may be an issue. The load is brutally constant.

I was recently involved in a project where we had to decrease checkpoint_timeout. The problem was that the database was performing so many transactions that if we waited for 5 minutes, a checkpoint would take entirely too long. We ended up doing checkpoints every two minutes, which, with the increase in checkpoint_segments and adjustment of bgwriter settings, would level out the load.

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
"Joshua D. Drake" <jd@commandprompt.com> writes: > I was recently involved in a project where we had to decrease the > checkpoint_timeout . The problem was, that the database was performing > so many transactions that if we waiting for 5 minutes, checkpoint would > take entirely too long. Seems like the correct fix for that is to make the bgwriter more aggressive. Narrowing the checkpoint spacing is a pretty horrid answer because of the resulting increase in full-page-image WAL traffic. regards, tom lane
Hannu Krosing <hannu@skype.net> writes:
> If the aim is to *only* avoid transaction wraparound, then maybe we
> could introduce VACUUM FREEZE ONLY, which never removes any old tuples,
> but instead just marks them by setting xmin=xmax for them, in addition
> to its freezing of live-and-visible-to-all tuples.

> This would avoid touching indexes at all and may well be what is desired
> for tables with only very few updates/deletes.

Seems like useless complexity. If there are so few dead tuples that you can afford to not reclaim them, then there are so few that reclaiming them isn't really going to cost much either ...

			regards, tom lane
Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> I was recently involved in a project where we had to decrease
>> checkpoint_timeout. The problem was that the database was performing
>> so many transactions that if we waited for 5 minutes, a checkpoint would
>> take entirely too long.
>
> Seems like the correct fix for that is to make the bgwriter more
> aggressive. Narrowing the checkpoint spacing is a pretty horrid answer
> because of the resulting increase in full-page-image WAL traffic.

Well, we did that as well. Here are the basic symptoms:

During normal processing, which involved about 250 connections, everything was fine. A checkpoint would start and connections would start piling up, sometimes breaking 1000.

We narrowed that down to users having to wait longer for query execution, so instead of just reusing connections, new connections had to be initiated because the existing connections were busy.

We tried many different parameters, and bgwriter did significantly help, but the only "solution" was to make checkpoints happen at a much more aggressive time frame.

Modifying bgwriter settings and the checkpoint spacing actually increased our velocity by about 70% by the time we were done. Bgwriter was definitely the largest chunk of that, although other parameters combined outweighed it (effective_cache, shared_buffers, etc...).

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
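For illustration, the kind of postgresql.conf knobs involved, using 8.1-era parameter names; the values are placeholders showing the direction of the tuning (a more aggressive bgwriter, and checkpoint spacing adjusted one way or the other), not the settings used in that project:

    # illustrative postgresql.conf excerpt -- placeholder values
    checkpoint_segments   = 64     # WAL segments between checkpoints
    checkpoint_timeout    = 300    # seconds; the project above went as low as 120
    bgwriter_delay        = 200    # ms between bgwriter rounds
    bgwriter_lru_percent  = 5.0    # share of recently-used buffers scanned per round
    bgwriter_lru_maxpages = 100    # cap on dirty pages written per LRU round
    bgwriter_all_percent  = 2.0    # share of the whole buffer pool scanned per round
    bgwriter_all_maxpages = 200    # cap for the all-buffers scan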
On Thu, Jun 08, 2006 at 09:13:33AM -0700, Joshua D. Drake wrote:
> Tom Lane wrote:
>> "Joshua D. Drake" <jd@commandprompt.com> writes:
>>> I was recently involved in a project where we had to decrease
>>> checkpoint_timeout. The problem was that the database was performing
>>> so many transactions that if we waited for 5 minutes, a checkpoint would
>>> take entirely too long.
>>
>> Seems like the correct fix for that is to make the bgwriter more
>> aggressive. Narrowing the checkpoint spacing is a pretty horrid answer
>> because of the resulting increase in full-page-image WAL traffic.
>
> Well, we did that as well. Here are the basic symptoms:
>
> During normal processing, which involved about 250 connections,
> everything was fine. A checkpoint would start and connections would
> start piling up, sometimes breaking 1000.
>
> We narrowed that down to users having to wait longer for query execution,
> so instead of just reusing connections, new connections had to be
> initiated because the existing connections were busy.
>
> We tried many different parameters, and bgwriter did significantly help,
> but the only "solution" was to make checkpoints happen at a much more
> aggressive time frame.
>
> Modifying bgwriter settings and the checkpoint spacing actually increased
> our velocity by about 70% by the time we were done. Bgwriter was definitely
> the largest chunk of that, although other parameters combined outweighed
> it (effective_cache, shared_buffers, etc...).

Did you try increasing the checkpoint interval, in the hopes that it would allow the bgwriter enough extra time to get everything pushed out?

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On Thu, 2006-06-08 at 12:09, Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
>> If the aim is to *only* avoid transaction wraparound, then maybe we
>> could introduce VACUUM FREEZE ONLY, which never removes any old tuples,
>> but instead just marks them by setting xmin=xmax for them, in addition
>> to its freezing of live-and-visible-to-all tuples.
>
>> This would avoid touching indexes at all and may well be what is desired
>> for tables with only very few updates/deletes.
>
> Seems like useless complexity. If there are so few dead tuples that you
> can afford to not reclaim them, then there are so few that reclaiming
> them isn't really going to cost much either ...

It will cost one full scan per index, which can be quite a lot of disk read traffic if the indexes are not otherwise in use, say when most access is local to some hotspot.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
On Tuesday 2006-06-06 20:11, Mark Woodward wrote:
>> Mark Woodward wrote:
>>> OK, here's my problem, I have a nature study where we have about 10 video
>>> cameras taking 15 frames per second.
>>> For each frame we make a few transactions on a PostgreSQL database.
>>
>> Maybe if you grouped multiple operations on bigger transactions, the I/O
>> savings could be enough to buy you the ability to vacuum once in a
>> while. Or consider buffering somehow -- save the data elsewhere, and
>> have some sort of daemon to put it into the database. This would allow
>> you to cope with the I/O increase during vacuum.
>
> The problem is sufficiently large that any minor modification can easily
> hide the problem for a predictable amount of time. My hope was that someone
> would have a real "long term" workaround.

I'm not certain that I understand the original problem correctly, so I am going to restate it.

VACUUM needs to be run for two reasons:
1) To recover the transaction counter.
2) To recover records marked for deletion.

VACUUM needs to be run over the entire database. If the data in the database is N, then VACUUM is O(N). Roughly, VACUUM scales linearly with the size of the database.

In the digital video problem:
Data is stored indefinitely online. (It is not archived.)
(Virtually no records need to be reclaimed from deletion.)
Data comes in at a constant rate (frames per unit time). The database therefore grows at frames/time (D).

It follows that no matter how much tuning is done, given constant hardware, VACUUM grows to consume so many resources that it is no longer possible to process frames/time[m] before frames/time[m+1] arrives.

Ideally, the transaction management system would be proportional to the marginal change in size of the database rather than the gross size of the database. That is VACUUM being O(N) should be replaced (or there should be an optional alternative) that scales with D, O^k(D) where any k > 1 involves a tradeoff with VACUUM.
On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
> Ideally, the transaction management system would be proportional to the
> marginal change in size of the database rather than the gross size of the
> database. That is VACUUM being O(N) should be replaced (or there should be
> an optional alternative) that scales with D, O^k(D) where any k > 1 involves
> a tradeoff with VACUUM.

That's something that's been discussed quite a bit; search the archives for 'dead space map'. Granted, that wasn't targeted so much at the need to VACUUM FREEZE, but there's no reason it wouldn't be possible to make it handle that as well. In the mean time, if you partition the table on date, you won't need to be vacuuming the entire database to handle XID wrap.

BTW, you're also ignoring the fact that technology keeps improving the hardware that's out there. It's possible that hardware advances alone would keep pace with your vacuuming needs.

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
> VACUUM needs to be run for two reasons:
> 1) To recover the transaction counter.
> 2) To recover records marked for deletion.
>
> VACUUM needs to be run over the entire database. If the data in the database
> is N, then VACUUM is O(N). Roughly, VACUUM scales linearly with the size of
> the database.

Well, you only need to vacuum the entire database once every billion transactions. Secondly, you can vacuum table by table. If you know a table will never be modified, you can VACUUM FREEZE it; then it will never need to be vacuumed again, ever (until you make changes, of course).

> Ideally, the transaction management system would be proportional to the
> marginal change in size of the database rather than the gross size of the
> database. That is VACUUM being O(N) should be replaced (or there should be
> an optional alternative) that scales with D, O^k(D) where any k > 1 involves
> a tradeoff with VACUUM.

That's why people suggest partitions. Then you only vacuum the partitions that are new and the old ones never need to be touched...

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
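For illustration, a sketch of that table-by-table approach with hypothetical monthly partitions: only the partition currently receiving inserts needs routine vacuuming, and a closed-out month can be frozen once and left alone. (As Tom notes just below, how much the freeze buys you also depends on wraparound being tracked per table rather than per database.)

    -- routine maintenance: only the partition currently receiving inserts
    VACUUM ANALYZE frames_2006_06;

    -- once a month is closed out and will no longer change, freeze it so
    -- its tuples never need anti-wraparound vacuuming again
    VACUUM FREEZE frames_2006_05;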
Martijn van Oosterhout <kleptog@svana.org> writes:
> That's why people suggest partitions. Then you only vacuum the
> partitions that are new and the old ones never need to be touched...

This will all work a lot better once we track XID wraparound risk on a per-table rather than per-database basis. I hope that will be done in time for 8.2.

			regards, tom lane
Jim C. Nasby wrote:
> On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
>> Ideally, the transaction management system would be proportional to the
>> marginal change in size of the database rather than the gross size of the
>> database. That is VACUUM being O(N) should be replaced (or there should be
>> an optional alternative) that scales with D, O^k(D) where any k > 1 involves
>> a tradeoff with VACUUM.
>
> That's something that's been discussed quite a bit; search the archives
> for 'dead space map'. Granted, that wasn't targeted so much at the need
> to VACUUM FREEZE, but there's no reason it wouldn't be possible to make
> it handle that as well. In the mean time, if you partition the table on
> date, you won't need to be vacuuming the entire database to handle XID
> wrap.

FWIW, my patch for vacuum fixes some of these issues. First, because you can freeze a table and will never need to vacuum it again, database-wide vacuums will not be necessary. And secondly, because as soon as a table is frozen (either because you VACUUM FREEZE'd it, or because regular vacuuming froze all tuples on it completely), you don't need to vacuum it again, and indeed (non-full) VACUUM turns into a no-op.

Mix this with partitioned tables.
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> That's why people suggest partitions. Then you only vacuum the
>> partitions that are new and the old ones never need to be touched...
>
> This will all work a lot better once we track XID wraparound risk on a
> per-table rather than per-database basis. I hope that will be done in
> time for 8.2.

FWIW, I posted the patch for the non-transactional pg_class stuff in -patches a while back, so it's pending review ;-) I'll repost it (today I expect, or tomorrow at the latest) with some minor corrections, along with the corresponding relminxid patch. I intend to commit both during next week (or this week, for those whose weeks start on Sundays), barring objections.