Thread: How to avoid transaction ID wrap

How to avoid transaction ID wrap

From
"Mark Woodward"
Date:
OK, here's my problem, I have a nature study where we have about 10 video
cameras taking 15 frames per second.
For each frame we make a few transactions on a PostgreSQL database.
We want to keep about a year's worth of data at any specific time.
We have triggers that fire if something interesting is found on insert.
We want this thing to run for a long time.
From the numbers, you can see the PostgreSQL database is VERY loaded.
Running VACUUM may not always be possible without losing data.
The numbers I have amount to 466,560,000 transactions per month, lasting a
maximum of about 9 months until XID wrap.

I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
of transactions, COPY, etc. so I'm not dead in the water, but I would be
interested in any observations you may have.


Re: How to avoid transaction ID wrap

From
Heikki Linnakangas
Date:
On Tue, 6 Jun 2006, Mark Woodward wrote:

> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.

I would suggest doing all the inserts of one frame in one transaction. 
Maybe even multiple frames in one transaction. That should bring down the 
number of transactions significantly.
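As a sketch of the suggestion (table and column names here are hypothetical), batching turns one transaction per statement into one per frame, or one per group of frames:

```sql
-- Batch all of a frame's inserts into one transaction: the whole batch
-- then consumes a single XID instead of one XID per statement.
BEGIN;
INSERT INTO frame (camera_id, captured_at, blob_ref)
    VALUES (3, '2006-06-06 08:42:00', 'cam3-frame-000123');
INSERT INTO frame_event (camera_id, captured_at, note)
    VALUES (3, '2006-06-06 08:42:00', 'motion detected');
-- ... remaining rows for this frame (or for several frames) ...
COMMIT;
```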

> We want to keep about a year's worth of data at any specific time.
> We have triggers that fire if something interesting is found on insert.
> We want this thing to run for a long time.
> From the numbers, you can see the PostgreSQL database is VERY loaded.

> Running VACUUM may not always be possible without losing data.

Why not?

> The numbers I have amount to 466,560,000 transactions per month, lasting a
> maximum of about 9 months until XID wrap.

If you can get that maximum up above one year (which was how long you want 
to keep the data), you won't need to freeze the records to 
avoid ID wraparound.

- Heikki


Re: How to avoid transaction ID wrap

From
Hannu Krosing
Date:
On a fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.
> We want to keep about a year's worth of data at any specific time.

partition by month, then you have better chances of removing old data
without causing overload/data loss;
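A sketch of what month-based partitioning could look like with the inheritance mechanism available in this era (names and dates are hypothetical; constraint_exclusion must be enabled for the planner to skip irrelevant months):

```sql
-- Parent table plus one child table per month.
CREATE TABLE frame (camera_id int, captured_at timestamptz, blob_ref text);

CREATE TABLE frame_2006_06 (
    CHECK (captured_at >= '2006-06-01' AND captured_at < '2006-07-01')
) INHERITS (frame);

-- Retiring a month that has aged out touches no indexes and leaves no
-- dead tuples for vacuum to chase:
-- TRUNCATE frame_2006_06;   -- or DROP TABLE frame_2006_06;
```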

> We have triggers that fire if something interesting is found on insert.
> We want this thing to run for a long time.
> From the numbers, you can see the PostgreSQL database is VERY loaded.
> Running VACUUM may not always be possible without losing data.

why ? just run it with very friendly delay settings.
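The "friendly delay settings" are the cost-based vacuum delay parameters; roughly (values illustrative, table name hypothetical):

```sql
-- Make vacuum sleep periodically so it yields I/O to the insert stream.
SET vacuum_cost_delay = 20;    -- sleep 20 ms each time the cost limit is reached
SET vacuum_cost_limit = 200;   -- accumulated page cost that triggers the sleep
VACUUM frame;
```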

> The numbers I have amount to 466,560,000 transactions per month, lasting a
> maximum of about 9 months until XID wrap.

actually 4.5 months as you will start having problems at 2G xacts.
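The arithmetic, for what it's worth: trouble starts when the oldest unfrozen XID falls about 2^31 transactions behind, not 2^32:

```sql
SELECT 2147483648.0 / 466560000;   -- 2^31 / (xacts per month): ~4.6 months
SELECT 4294967296.0 / 466560000;   -- full 2^32 space: ~9.2 months
```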

> I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
> of transactions, COPY, etc. so I'm not dead in the water, but I would be
> interested in any observations you may have.


-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: How to avoid transaction ID wrap

From
Alvaro Herrera
Date:
Mark Woodward wrote:
> OK, here's my problem, I have a nature study where we have about 10 video
> cameras taking 15 frames per second.
> For each frame we make a few transactions on a PostgreSQL database.

Maybe if you grouped multiple operations into bigger transactions, the I/O
savings could be enough to buy you the ability to vacuum once in a
while.  Or consider buffering somehow -- save the data elsewhere, and
have some sort of daemon to put it into the database.  This would allow
you to cope with the I/O increase during vacuum.
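One way to realize the buffering idea (file path and table name are hypothetical): a daemon appends rows to a spool file and periodically bulk-loads it, so an entire batch costs a single transaction no matter how many rows it carries:

```sql
-- Load many buffered rows in one transaction with COPY.
BEGIN;
COPY frame (camera_id, captured_at, blob_ref)
    FROM '/var/spool/frames/batch-0001.csv' WITH CSV;
COMMIT;
```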

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: How to avoid transaction ID wrap

From
Christopher Browne
Date:
Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:
> On a fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
>> OK, here's my problem, I have a nature study where we have about 10 video
>> cameras taking 15 frames per second.
>> For each frame we make a few transactions on a PostgreSQL database.
>> We want to keep about a year's worth of data at any specific time.
>
> partition by month, then you have better chances of removing old data
> without causing overload/data loss;

It's certainly worth something to be able to TRUNCATE an elderly
partition; that cleans things out very nicely...

>> We have triggers that fire if something interesting is found on insert.
>> We want this thing to run for a long time.
>> From the numbers, you can see the PostgreSQL database is VERY loaded.
>> Running VACUUM may not always be possible without losing data.
>
> why ? just run it with very friendly delay settings.

"Friendly delay settings" can have adverse effects; it is likely to
make vacuum run on the order of 3x as long, which means that if you
have a very large table that takes 12h to VACUUM, "vacuum delay" will
increase that to 36h, which means you'll have a transaction open for
36h.

That'll be very evil, to be sure...

>> The numbers I have amount to 466,560,000 transactions per month, lasting a
>> maximum of about 9 months until XID wrap.
>
> actually 4.5 months as you will start having problems at 2G xacts.

Right.

>> I am thinking about a few workarounds, BEGIN/COMMIT to reduce the number
>> of transactions, COPY, etc. so I'm not dead in the water, but I would be
>> interested in any observations you may have.

Grouping work together to diminish numbers of transactions is almost
always something of a win...
-- 
select 'cbbrowne' || '@' || 'gmail.com';
http://linuxdatabases.info/info/rdbms.html
Roses are red, Violets are blue,
I'm schizophrenic... And I am too.


Re: How to avoid transaction ID wrap

From
Rod Taylor
Date:
On Tue, 2006-06-06 at 13:53 -0400, Christopher Browne wrote:
> Clinging to sanity, hannu@skype.net (Hannu Krosing) mumbled into her beard:
> > On a fine day, Tue, 2006-06-06 at 08:42, Mark Woodward wrote:
> >> OK, here's my problem, I have a nature study where we have about 10 video
> >> cameras taking 15 frames per second.
> >> For each frame we make a few transactions on a PostgreSQL database.
> >> We want to keep about a year's worth of data at any specific time.
> >
> > partition by month, then you have better chances of removing old data
> > without causing overload/data loss;
>
> It's certainly worth something to be able to TRUNCATE an elderly
> partition; that cleans things out very nicely...

With one potential snafu -- it blocks new SELECTs against the parent
table while truncate runs on the child (happens with constraint
exclusion as well).

If your transactions are short then it won't be an issue. If you have
mixed-length transactions (many short with the occasional long select)
then it becomes tricky since those short transactions will be blocked.

> >> We have triggers that fire if something interesting is found on insert.
> >> We want this thing to run for a long time.
> >> From the numbers, you can see the PostgreSQL database is VERY loaded.
> >> Running VACUUM may not always be possible without losing data.
> >
> > why ? just run it with very friendly delay settings.
>
> "Friendly delay settings" can have adverse effects; it is likely to
> make vacuum run on the order of 3x as long, which means that if you
> have a very large table that takes 12h to VACUUM, "vacuum delay" will
> increase that to 36h, which means you'll have a transaction open for
> 36h.

Sounds like this is almost strictly inserts and selects though. If there
is limited garbage collection (updates, deletes, rollbacks of inserts)
required then it isn't all that bad.

--



Re: How to avoid transaction ID wrap

From
Alvaro Herrera
Date:
Rod Taylor wrote:

> With one potential snafu -- it blocks new SELECTs against the parent
> table while truncate runs on the child (happens with constraint
> exclusion as well).
> 
> If your transactions are short then it won't be an issue. If you have
> mixed-length transactions (many short with the occasional long select)
> then it becomes tricky since those short transactions will be blocked.

One idea is to try to acquire the lock before issuing the TRUNCATE
itself.  If the LOCK TABLE times out, you know you should wait for a
long-running transaction ...
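A sketch of the lock-probe idea (partition name is hypothetical): set a timeout so the ACCESS EXCLUSIVE lock attempt fails fast instead of queueing everyone else behind it:

```sql
SET statement_timeout = 5000;   -- give up after 5 seconds
BEGIN;
LOCK TABLE frame_2006_06 IN ACCESS EXCLUSIVE MODE;
TRUNCATE frame_2006_06;
COMMIT;
SET statement_timeout = 0;
```

If the LOCK times out, the transaction aborts with an error and you simply retry once the long-running transaction has finished.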


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: How to avoid transaction ID wrap

From
"Mark Woodward"
Date:
> Mark Woodward wrote:
>> OK, here's my problem, I have a nature study where we have about 10
>> video
>> cameras taking 15 frames per second.
>> For each frame we make a few transactions on a PostgreSQL database.
>
> Maybe if you grouped multiple operations into bigger transactions, the I/O
> savings could be enough to buy you the ability to vacuum once in a
> while.  Or consider buffering somehow -- save the data elsewhere, and
> have some sort of daemon to put it into the database.  This would allow
> you to cope with the I/O increase during vacuum.

The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.


Re: How to avoid transaction ID wrap

From
Koichi Suzuki
Date:
I once proposed a patch for a 64bit transaction ID, but this causes 
some overhead for each tuple (XMIN and XMAX).   Pgbench with a 64bit 
transaction ID pays a penalty of about a couple of percent in performance.
If a 64bit transaction ID is a reasonable fix, I've already posted this 
patch; anyone can apply it to later versions.

Mark Woodward wrote:
>> Mark Woodward wrote:
>>> OK, here's my problem, I have a nature study where we have about 10
>>> video
>>> cameras taking 15 frames per second.
>>> For each frame we make a few transactions on a PostgreSQL database.
>> Maybe if you grouped multiple operations into bigger transactions, the I/O
>> savings could be enough to buy you the ability to vacuum once in a
>> while.  Or consider buffering somehow -- save the data elsewhere, and
>> have some sort of daemon to put it into the database.  This would allow
>> you to cope with the I/O increase during vacuum.
> 
> The problem is sufficiently large that any minor modification can easily
> hide the problem for a predictable amount of time. My hope was that someone
> would have a real "long term" workaround.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 


-- 
Koichi Suzuki


Re: How to avoid transaction ID wrap

From
Zdenek Kotala
Date:
Koichi Suzuki wrote:
> I once proposed a patch for a 64bit transaction ID, but this causes 
> some overhead for each tuple (XMIN and XMAX).   Pgbench with a 64bit 
> transaction ID pays a penalty of about a couple of percent in performance.
> If a 64bit transaction ID is a reasonable fix, I've already posted this 
> patch; anyone can apply it to later versions.
Did you check performance on 32-bit or 64-bit systems and a 64-bit binary 
version of PGSQL? I think that today it is not a problem to have a 64-bit 
architecture, and a 64-bit ID should increase the scalability of Postgres.
      Zdenek


Re: How to avoid transaction ID wrap

From
Alvaro Herrera
Date:
Koichi Suzuki wrote:
> I once proposed a patch for a 64bit transaction ID, but this causes 
> some overhead for each tuple (XMIN and XMAX).   Pgbench with a 64bit 
> transaction ID pays a penalty of about a couple of percent in performance.
> If a 64bit transaction ID is a reasonable fix, I've already posted this 
> patch; anyone can apply it to later versions.

Be careful, the pg_multixact stuff assumes that a MultiXactId is the
same size as TransactionId, so you have to change that too.  I don't
recall offhand if it was defined in a way that would make it "just work"
automatically.  (You'd also have to be careful about it not overrunning
the SLRU files when it's close to the end of them.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: How to avoid transaction ID wrap

From
Martijn van Oosterhout
Date:
On Wed, Jun 07, 2006 at 01:48:50PM +0200, Zdenek Kotala wrote:
> Koichi Suzuki wrote:
> >I once proposed a patch for a 64bit transaction ID, but this causes
> >some overhead for each tuple (XMIN and XMAX).   Pgbench with a 64bit
> >transaction ID pays a penalty of about a couple of percent in performance.
> >If a 64bit transaction ID is a reasonable fix, I've already posted this
> >patch; anyone can apply it to later versions.
> Did you check performance on 32-bit or 64-bit systems and a 64-bit binary
> version of PGSQL? I think that today it is not a problem to have a 64-bit
> architecture, and a 64-bit ID should increase the scalability of Postgres.

I doubt performance is the issue directly. Increasing the size of the
transaction counter would increase the size of narrow tables by maybe
30%. That's 30% more disk space and 30% more memory usage in some
places. Maybe at some point it'll be worth it, but right now I don't
think those commodities are cheap enough to use like this for fairly
marginal benefits.

Besides, memory bandwidth hasn't grown anywhere near as fast as memory
space, so it's always a good idea to use as little memory as possible.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: How to avoid transaction ID wrap

From
Tom Lane
Date:
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> Koichi Suzuki wrote:
>> I once proposed a patch for a 64bit transaction ID, but this causes 
>> some overhead for each tuple (XMIN and XMAX).

> Did you check performance on 32-bit or 64-bit systems and a 64-bit binary 
> version of PGSQL? I think that today it is not a problem to have a 64-bit 
> architecture, and a 64-bit ID should increase the scalability of Postgres.

The percentage increase in I/O demand is the main reason the patch was
rejected, not so much the arithmetic.
        regards, tom lane


Re: How to avoid transaction ID wrap

From
"Jim C. Nasby"
Date:
On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> > Koichi Suzuki wrote:
> >> I once proposed a patch for a 64bit transaction ID, but this causes 
> >> some overhead for each tuple (XMIN and XMAX).
> 
> > Did you check performance on 32-bit or 64-bit systems and a 64-bit binary 
> > version of PGSQL? I think that today it is not a problem to have a 64-bit 
> > architecture, and a 64-bit ID should increase the scalability of Postgres.
> 
> The percentage increase in I/O demand is the main reason the patch was
> rejected, not so much the arithmetic.

Before considering 64 bit XIDs, it'd be very helpful to know why Mark
can't vacuum frequently enough to handle rollover...
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How to avoid transaction ID wrap

From
Hannu Krosing
Date:
On a fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:
> >> We have triggers that fire if something interesting is found on insert.
> >> We want this thing to run for a long time.
> >> From the numbers, you can see the PostgreSQL database is VERY loaded.
> >> Running VACUUM may not always be possible without losing data.
> >
> > why ? just run it with very friendly delay settings.
> 
> "Friendly delay settings" can have adverse effects; it is likely to
> make vacuum run on the order of 3x as long, which means that if you
> have a very large table that takes 12h to VACUUM, "vacuum delay" will
> increase that to 36h, which means you'll have a transaction open for
> 36h.
> 
> That'll be very evil, to be sure...

Not always. I know that it is evil in a slony1 context, but often it *is*
possible to design your system in a way where a superlong transaction is
almost unnoticeable. 

Long transactions are evil in case they cause some fast-changing table
to grow its storage size by several orders of magnitude, but if that is not
the case then they just run there in the background with no ill effects,
especially do-nothing transactions like vacuum.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: How to avoid transaction ID wrap

From
"Jim C. Nasby"
Date:
On Thu, Jun 08, 2006 at 01:30:42AM +0300, Hannu Krosing wrote:
> On a fine day, Tue, 2006-06-06 at 13:53, Christopher Browne wrote:
> > >> We have triggers that fire if something interesting is found on insert.
> > >> We want this thing to run for a long time.
> > >> From the numbers, you can see the PostgreSQL database is VERY loaded.
> > >> Running VACUUM may not always be possible without losing data.
> > >
> > > why ? just run it with very friendly delay settings.
> > 
> > "Friendly delay settings" can have adverse effects; it is likely to
> > make vacuum run on the order of 3x as long, which means that if you
> > have a very large table that takes 12h to VACUUM, "vacuum delay" will
> > increase that to 36h, which means you'll have a transaction open for
> > 36h.
> > 
> > That'll be very evil, to be sure...
> 
> Not always. I know that it is evil in a slony1 context, but often it *is*
> possible to design your system in a way where a superlong transaction is
> almost unnoticeable. 
> 
> Long transactions are evil in case they cause some fast-changing table
> to grow its storage size by several orders of magnitude, but if that is not
> the case then they just run there in the background with no ill effects,
> especially do-nothing transactions like vacuum.

Plus, if the only issue here is in fact the long-running transaction for
vacuum, there's other ways to address that which would be a lot less
intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
vacuum will start a new transaction every time it fills up
maintenance_work_mem, so just setting that low could solve the problem
(at the expense of a heck of a lot of extra IO).
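If that behavior holds, the knob in question is just (value illustrative, specified in kilobytes in this era; table name hypothetical):

```sql
SET maintenance_work_mem = 16384;  -- 16 MB: vacuum must do an index-cleaning
                                   -- pass each time this fills with dead TIDs
VACUUM frame;
```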
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How to avoid transaction ID wrap

From
"Mark Woodward"
Date:
> On Wed, Jun 07, 2006 at 11:47:45AM -0400, Tom Lane wrote:
>> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> > Koichi Suzuki wrote:
>> >> I once proposed a patch for a 64bit transaction ID, but this causes
>> >> some overhead for each tuple (XMIN and XMAX).
>>
>> > Did you check performance on 32-bit or 64-bit systems and a 64-bit
>> > binary version of PGSQL? I think that today it is not a problem to have
>> > a 64-bit architecture, and a 64-bit ID should increase the scalability
>> > of Postgres.
>>
>> The percentage increase in I/O demand is the main reason the patch was
>> rejected, not so much the arithmetic.
>
> Before considering 64 bit XIDs, it'd be very helpful to know why Mark
> can't vacuum frequently enough to handle rollover...

The system is under heavy load, and while there are tricks that can be
done, vacuum is a process which adds extra load to the system while it is
running. It is a sliding scale, as always; you may get the system to the
point where it can vacuum AND perform as needed, but the database is
growing constantly. Eventually you will get to the point where you can't
run vacuum *and* keep up with the data stream.

I guess what I am saying is that PostgreSQL isn't "smooth"; between
checkpoints and vacuum, it is nearly impossible to make a product that
performs consistently under high load.

Now don't flame me, I really do love PostgreSQL, it is just that I bump up
against these issues from time to time and it would be nice if there were
some way to work around them.




Re: How to avoid transaction ID wrap

From
"Jim C. Nasby"
Date:
On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
> I guess what I am saying is that PostgreSQL isn't "smooth"; between
> checkpoints and vacuum, it is nearly impossible to make a product that
> performs consistently under high load.

Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
find a case where I couldn't smooth out the IO load so that it wasn't an
issue.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How to avoid transaction ID wrap

From
Koichi Suzuki
Date:
Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
>> Koichi Suzuki wrote:
>>> I once proposed a patch for a 64bit transaction ID, but this causes 
>>> some overhead for each tuple (XMIN and XMAX).
> 
>> Did you check performance on 32-bit or 64-bit systems and 64-bit binary 
>> version of PGSQL? I think that today is not problem to have 64-bit 
>> architecture and 64-bit ID should increase scalability of Postgres.

I checked the performance on 64-bit system and 64bit binary.

> 
> The percentage increase in I/O demand is the main reason the patch was
> rejected, not so much the arithmetic.

That's right.  I've also observed the I/O demand increase.   I remember we
had to pay a three to five percent performance decrease in pgbench.  So I
don't think we should apply this patch without further justification.
I'm looking for other reasons for a larger transaction ID.


> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 


-- 
Koichi Suzuki


Re: How to avoid transaction ID wrap

From
Hannu Krosing
Date:
On a fine day, Wed, 2006-06-07 at 17:45, Jim C. Nasby wrote:

> Plus, if the only issue here is in fact the long-running transaction for
> vacuum, there's other ways to address that which would be a lot less
> intrusive than doing something like going to 64 bit XIDs. IIRC, in 8.2
> vacuum will start a new transaction every time it fills up
> maintenance_work_mem, so just setting that low could solve the problem
> (at the expense of a heck of a lot of extra IO).

If the aim is to *only* avoid transaction wraparound, then maybe we
could introduce VACUUM FREEZE ONLY; which never removes any old tuples,
but instead just marks them by setting xmin=xmax for them, in addition
to its freezing of live-and-visible-to-all tuples.

This would avoid touching indexes at all and may well be what is desired
for tables with only very few updates/deletes.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: How to avoid transaction ID wrap

From
"Mark Woodward"
Date:
> On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
>> I guess what I am saying is that PostgreSQL isn't "smooth"; between
>> checkpoints and vacuum, it is nearly impossible to make a product that
>> performs consistently under high load.
>
> Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
> find a case where I couldn't smooth out the IO load so that it wasn't an
> issue.

In several projects that I have been involved with, PostgreSQL had most of
the important features we needed, but in one project, checkpoints caused
us to time out under load. In this current project I am researching, I
know that vacuum may be an issue. The load is brutally constant.



Re: How to avoid transaction ID wrap

From
"Joshua D. Drake"
Date:
Mark Woodward wrote:
>> On Wed, Jun 07, 2006 at 07:07:55PM -0400, Mark Woodward wrote:
>>> I guess what I am saying is that PostgreSQL isn't "smooth"; between
>>> checkpoints and vacuum, it is nearly impossible to make a product that
>>> performs consistently under high load.
>> Have you tuned the bgwriter and all the vacuum_cost stuff? I've yet to
>> find a case where I couldn't smooth out the IO load so that it wasn't an
>> issue.
> 
> In several projects that I have been involved with, PostgreSQL had most of
> the important features we needed, but in one project, checkpoints caused
> us to time out under load. In this current project I am researching, I
> know that vacuum may be an issue. The load is brutally constant.

I was recently involved in a project where we had to decrease 
checkpoint_timeout. The problem was that the database was performing 
so many transactions that if we waited 5 minutes, a checkpoint would 
take entirely too long.

We ended up doing checkpoints every two minutes, which, with the increase 
in checkpoint_segments and adjustment of bgwriter settings, leveled 
out the load.
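In postgresql.conf terms, the tuning described might look something like this (values illustrative, not the project's actual settings):

```
# checkpoint every two minutes instead of the default five
checkpoint_timeout = 120          # seconds
checkpoint_segments = 64          # plenty of WAL headroom between checkpoints
# make the bgwriter flush dirty buffers ahead of the checkpoint
bgwriter_delay = 50               # milliseconds between bgwriter rounds
bgwriter_all_percent = 5          # share of the buffer pool scanned per round
```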

Sincerely,

Joshua D. Drake



> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
> 


-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
 




Re: How to avoid transaction ID wrap

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> I was recently involved in a project where we had to decrease 
> checkpoint_timeout. The problem was that the database was performing 
> so many transactions that if we waited 5 minutes, a checkpoint would 
> take entirely too long.

Seems like the correct fix for that is to make the bgwriter more
aggressive.  Narrowing the checkpoint spacing is a pretty horrid answer
because of the resulting increase in full-page-image WAL traffic.
        regards, tom lane


Re: How to avoid transaction ID wrap

From
Tom Lane
Date:
Hannu Krosing <hannu@skype.net> writes:
> If the aim is to *only* avoid transaction wraparound, then maybe we
> could introduce VACUUM FREEZE ONLY; which never removes any old tuples,
> but instead just marks them by setting xmin=xmax for them, in addition
> to its freezing of live-and-visible-to-all tuples.

> This would avoid touching indexes at all and may well be what is desired
> for tables with only very few updates/deletes.

Seems like useless complexity.  If there are so few dead tuples that you
can afford to not reclaim them, then there are so few that reclaiming
them isn't really going to cost much either ...
        regards, tom lane


Re: How to avoid transaction ID wrap

From
"Joshua D. Drake"
Date:
Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> I was recently involved in a project where we had to decrease 
>> checkpoint_timeout. The problem was that the database was performing 
>> so many transactions that if we waited 5 minutes, a checkpoint would 
>> take entirely too long.
> 
> Seems like the correct fix for that is to make the bgwriter more
> aggressive.  Narrowing the checkpoint spacing is a pretty horrid answer
> because of the resulting increase in full-page-image WAL traffic.

Well we did that as well. Here are the basic symptoms:

During normal processing which contained about 250 connections 
everything was fine. A checkpoint would start and connections would 
start piling up, sometimes breaking 1000.

We narrowed that down to users having to wait longer for query execution 
so instead of just reusing connections new connections had to be 
initiated because the existing connections were busy.

We tried many different parameters, and bgwriter did significantly help 
but the only "solution" was to make checkpoints happen at a much more 
aggressive time frame.

Modifying bgwriter and checkpoint settings actually increased our 
velocity by about 70% by the time we were done. Bgwriter was definitely 
the largest chunk of that, although other parameters combined outweighed 
it (effective_cache, shared_buffers etc...).

Sincerely,

Joshua D. Drake


> 
>             regards, tom lane
> 
> 


-- 
   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
 




Re: How to avoid transaction ID wrap

From
"Jim C. Nasby"
Date:
On Thu, Jun 08, 2006 at 09:13:33AM -0700, Joshua D. Drake wrote:
> Tom Lane wrote:
> >"Joshua D. Drake" <jd@commandprompt.com> writes:
> >>I was recently involved in a project where we had to decrease 
> >>checkpoint_timeout. The problem was that the database was performing 
> >>so many transactions that if we waited 5 minutes, a checkpoint would 
> >>take entirely too long.
> >
> >Seems like the correct fix for that is to make the bgwriter more
> >aggressive.  Narrowing the checkpoint spacing is a pretty horrid answer
> >because of the resulting increase in full-page-image WAL traffic.
> 
> Well we did that as well. Here are the basic symptoms:
> 
> During normal processing which contained about 250 connections 
> everything was fine. A checkpoint would start and connections would 
> start piling up, sometimes breaking 1000.
> 
> We narrowed that down to users having to wait longer for query execution 
> so instead of just reusing connections new connections had to be 
> initiated because the existing connections were busy.
> 
> We tried many different parameters, and bgwriter did significantly help 
> but the only "solution" was to make checkpoints happen at a much more 
> aggressive time frame.
> 
> Modifying bgwriter and checkpoint settings actually increased our 
> velocity by about 70% by the time we were done. Bgwriter was definitely 
> the largest chunk of that, although other parameters combined outweighed 
> it (effective_cache, shared_buffers etc...).

Did you try increasing the checkpoint interval, in the hopes that it
would allow the bgwriter enough extra time to get everything pushed
out?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How to avoid transaction ID wrap

From
Hannu Krosing
Date:
On a fine day, Thu, 2006-06-08 at 12:09, Tom Lane wrote:
> Hannu Krosing <hannu@skype.net> writes:
> > If the aim is to *only* avoid transaction wraparound, then maybe we
> > could introduce VACUUM FREEZE ONLY; which never removes any old tuples,
> > but instead just marks them by setting xmin=xmax for them, in addition
> > to its freezing of live-and-visible-to-all tuples.
> 
> > This would avoid touching indexes at all and may well be what is desired
> > for tables with only very few updates/deletes.
> 
> Seems like useless complexity.  If there are so few dead tuples that you
> can afford to not reclaim them, then there are so few that reclaiming
> them isn't really going to cost much either ...

It will cost one full scan per index, which can be quite a lot of disk
read traffic if the indexes are not otherwise used, say when most access
is local to some hotspot.

>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: How to avoid transaction ID wrap

From
Trent Shipley
Date:
On Tuesday 2006-06-06 20:11, Mark Woodward wrote:
> > Mark Woodward wrote:
> >> OK, here's my problem, I have a nature study where we have about 10
> >> video
> >> cameras taking 15 frames per second.
> >> For each frame we make a few transactions on a PostgreSQL database.
> >
> > Maybe if you grouped multiple operations on bigger transactions, the I/O
> > savings could be enough to buy you the ability to vacuum once in a
> > while.  Or consider buffering somehow -- save the data elsewhere, and
> > have some sort of daemon to put it into the database.  This would allow
> > to cope with the I/O increase during vacuum.
>
The problem is sufficiently large that any minor modification can easily
hide the problem for a predictable amount of time. My hope was that someone
would have a real "long term" workaround.

I'm not certain that I understand the original problem correctly so I am going 
to restate it.

VACUUM needs to be run for two reasons.
1) To recover the transaction counter.
2) To recover records marked for deletion.

VACUUM needs to be run over the entire database.  If the data in the database 
is N, then VACUUM is O(N).  Roughly, VACUUM scales linearly with the size of 
the database.

In the digital video problem:

Data is stored indefinitely online.  (It is not archived.)
(Virtually no records need to be recovered from deletion.)
Data comes in at a constant rate (frames/time).
The database therefore grows at a constant rate, D = frames/time.

It follows that no matter how much tuning is done, given constant hardware, 
VACUUM grows to consume so many resources that it is no longer possible to 
process frames/time[m] before frames/time[m+1] arrives.

Ideally, the transaction management system would be proportional to the 
marginal change in size of the database rather than the gross size of the 
database.  That is, O(N) VACUUM should be replaced (or there should be an 
optional alternative) by something that scales with D, say O(D^k), where 
any k > 1 involves a tradeoff with VACUUM.
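
As a sanity check, the wraparound horizon implied by the figures at the top
of the thread works out like this (the 2^32 divisor matches the "about 9
months" quoted there, though PostgreSQL actually forces anti-wraparound
vacuuming after roughly 2^31 transactions, i.e. about half this horizon):

```python
# Back-of-the-envelope check of the wraparound horizon claimed upthread.
TXNS_PER_MONTH = 466_560_000  # stated load, i.e. a sustained 180 txn/s

months_to_wrap = (2 ** 32) / TXNS_PER_MONTH
print(round(months_to_wrap, 1))  # -> 9.2
```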




Re: How to avoid transaction ID wrap

From
"Jim C. Nasby"
Date:
On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
> Ideally, the transaction management system would be proportional to the 
> marginal change in size of the database rather than the gross size of the 
> database.  That is, O(N) VACUUM should be replaced (or there should be an 
> optional alternative) by something that scales with D, say O(D^k), where 
> any k > 1 involves a tradeoff with VACUUM.

That's something that's been discussed quite a bit; search the archives
for 'dead space map'. Granted, that wasn't targeted so much at the need
to VACUUM FREEZE, but there's no reason it wouldn't be possible to make
it handle that as well. In the mean time, if you partition the table on
date, you won't need to be vacuuming the entire database to handle XID
wrap.
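
In the 8.1 era that means inheritance-based partitioning with constraint
exclusion; a sketch (table and column names invented for illustration):

```sql
-- Parent table; children hold the actual rows, one per month.
CREATE TABLE frames (
    captured_at timestamptz NOT NULL,
    camera_id   integer NOT NULL,
    frame       bytea
);

CREATE TABLE frames_2006_06 (
    CHECK (captured_at >= '2006-06-01' AND captured_at < '2006-07-01')
) INHERITS (frames);

-- With constraint_exclusion on, queries skip partitions whose CHECK
-- constraint rules them out; once a month's partition stops receiving
-- writes, it can be frozen and then left alone.
SET constraint_exclusion = on;
VACUUM FREEZE frames_2006_06;
```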

BTW, you're also ignoring the fact that technology keeps improving the
hardware that's out there. It's possible that hardware advances alone
would keep pace with your vacuuming needs.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: How to avoid transaction ID wrap

From
Martijn van Oosterhout
Date:
On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
> VACUUM needs to be run for two reasons.
> 1) To recover the transaction counter.
> 2) To recover records marked for deletion.
>
> VACUUM needs to be run over the entire database.  If the data in the database
> is N, then VACUUM is O(N).  Roughly, VACUUM scales linearly with the size of
> the database.

Well, you only need to vacuum the entire database once every billion
transactions.

Secondly, you can vacuum table by table. If you know a table will never
be modified, you can VACUUM FREEZE it, and then it will never need to be
vacuumed again, ever (until you make changes, of course).
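
Concretely (the table name is invented; `age()` reports how far a frozen
XID trails the current transaction counter):

```sql
-- Pre-8.2, wraparound risk is tracked per database:
SELECT datname, age(datfrozenxid) FROM pg_database;

-- Freeze all tuples in a table that will no longer change; until it is
-- written to again, it contributes no further wraparound pressure.
VACUUM FREEZE archived_frames;
```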

> Ideally, the transaction management system would be proportional to the
> marginal change in size of the database rather than the gross size of the
> database.  That is, O(N) VACUUM should be replaced (or there should be an
> optional alternative) by something that scales with D, say O(D^k), where
> any k > 1 involves a tradeoff with VACUUM.

That's why people suggest partitions. Then you only vacuum the
partitions that are new and the old ones never need to be touched...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: How to avoid transaction ID wrap

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> That's why people suggest partitions. Then you only vacuum the
> partitions that are new and the old ones never need to be touched...

This will all work a lot better once we track XID wraparound risk on a
per-table rather than per-database basis.  I hope that will be done in
time for 8.2.
        regards, tom lane


Re: How to avoid transaction ID wrap

From
Alvaro Herrera
Date:
Jim C. Nasby wrote:
> On Fri, Jun 09, 2006 at 06:20:21PM -0700, Trent Shipley wrote:
> > Ideally, the transaction management system would be proportional to the 
> > marginal change in size of the database rather than the gross size of the 
> > database.  That is, O(N) VACUUM should be replaced (or there should be an 
> > optional alternative) by something that scales with D, say O(D^k), where 
> > any k > 1 involves a tradeoff with VACUUM.
> 
> That's something that's been discussed quite a bit; search the archives
> for 'dead space map'. Granted, that wasn't targeted so much at the need
> to VACUUM FREEZE, but there's no reason it wouldn't be possible to make
> it handle that as well. In the mean time, if you partition the table on
> date, you won't need to be vacuuming the entire database to handle XID
> wrap.

FWIW my patch for vacuum fixes some of these issues.  First, because you
can freeze a table and never need to vacuum it again; database-wide
vacuums will not be necessary.  And second, because as soon as a table
is frozen (either because you VACUUM FREEZE'd it, or because regular
vacuuming froze all tuples on it completely), you don't need to
vacuum it again and indeed (non-full) VACUUM turns into a no-op.

Mix this with partitioned tables.


Re: How to avoid transaction ID wrap

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > That's why people suggest partitions. Then you only vacuum the
> > partitions that are new and the old ones never need to be touched...
> 
> This will all work a lot better once we track XID wraparound risk on a
> per-table rather than per-database basis.  I hope that will be done in
> time for 8.2.

FWIW I posted the patch for non-transactional pg_class stuff in -patches
awhile back, so it's pending review ;-)  I'll repost it (today I expect,
or tomorrow at the latest) with some minor corrections, along with the
corresponding relminxid patch.  I intend to commit both during next week
(or this week, for those whose weeks start on Sundays), barring
objections.