Thread: RE: Transaction ID wraparound: problem and proposed sol ution

RE: Transaction ID wraparound: problem and proposed sol ution

From
"Mikheev, Vadim"
Date:
> This comparison will work as long as the range of interesting XIDs
> never exceeds WRAPLIMIT/2.  Essentially, we envision the actual value
> of XID as being the low-order bits of a logical XID that always
> increases, and we assume that no extant XID is more than WRAPLIMIT/2
> transactions old, so we needn't keep track of the high-order bits.

So, we'll have to abort some long running transaction.
And before after-wrap XIDs will be close to aborted xid you'd better
ensure that vacuum *successfully* run over all tables in database
(and shared tables) aborted transaction could touch.

> This scheme allows us to survive XID wraparound at the cost of slight
> additional complexity in ordered comparisons of XIDs (which is not a
> really performance-critical task AFAIK), and at the cost that the
> original insertion XIDs of all but recent tuples will be lost by
> VACUUM.  The system doesn't particularly care about that, but old XIDs
> do sometimes come in handy for debugging purposes.  A possible

I wouldn't care about this.

> compromise is to overwrite only XIDs that are older than, say,
> WRAPLIMIT/4 instead of doing so as soon as possible.  This would mean
> the required VACUUM frequency is every WRAPLIMIT/4 xacts instead of
> every WRAPLIMIT/2 xacts.
> 
> We have a straightforward tradeoff between the maximum size of pg_log
> (WRAPLIMIT/4 bytes) and the required frequency of VACUUM (at least

Required frequency of *successful* vacuum over *all* tables.
We would have to remember something in pg_class/pg_database
and somehow force vacuum over "too-long-unvacuumed-tables"
*automatically*.

> every WRAPLIMIT/2 or WRAPLIMIT/4 transactions).  This could be made
> configurable in config.h for those who're intent on customization,
> but I'd be inclined to set the default value at WRAPLIMIT = 1G.
> 
> Comments?  Vadim, is any of this about to be superseded by WAL?
> If not, I'd like to fix it for 7.1.

If undo would be implemented then we could delete pg_log between
postmaster startups - startup counter is remembered in pages, so
seeing old startup id in a page we would know that there are only
long ago committed xactions (ie only visible changes) there
and avoid xid comparison. But ... there will be no undo in 7.1.
And I foresee problems with WAL based BAR implementation if we'll
follow proposed solution: redo restores original xmin/xmax - how
to "freeze" xids while restoring DB?

(Sorry, I have to run away now... and have to think more about issue).

Vadim


Re: Transaction ID wraparound: problem and proposed solution

From
Tom Lane
Date:
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> So, we'll have to abort some long running transaction.

Well, yes, some transaction that continues running while ~ 500 million
other transactions come and go might give us trouble.  I wasn't really
planning to worry about that case ;-)

> Required frequency of *successful* vacuum over *all* tables.
> We would have to remember something in pg_class/pg_database
> and somehow force vacuum over "too-long-unvacuumed-tables"
> *automatically*.

I don't think this is a problem now; in practice you couldn't possibly
go for half a billion transactions without vacuuming, I'd think.

If your plans to eliminate regular vacuuming become reality, then this
scheme might become less reliable, but at present I think there's plenty
of safety margin.

> If undo would be implemented then we could delete pg_log between
> postmaster startups - startup counter is remembered in pages, so
> seeing old startup id in a page we would know that there are only
> long ago committed xactions (ie only visible changes) there
> and avoid xid comparison. But ... there will be no undo in 7.1.
> And I foresee problems with WAL based BAR implementation if we'll
> follow proposed solution: redo restores original xmin/xmax - how
> to "freeze" xids while restoring DB?

So, we might eventually have a better answer from WAL, but not for 7.1.

I think my idea is reasonably non-invasive and could be removed without
much trouble once WAL offers a better way.  I'd really like to have some
answer for 7.1, though.  The sort of numbers John Scott was quoting to
me for Verizon's paging network throughput make it clear that we aren't
going to survive at that level with a limit of 4G transactions per
database reload.  Having to vacuum everything on at least a
1G-transaction cycle is salable, dump/initdb/reload is not ...
        regards, tom lane


Re: Transaction ID wraparound: problem and proposed solution

From
"Vadim Mikheev"
Date:
> > So, we'll have to abort some long running transaction.
> 
> Well, yes, some transaction that continues running while ~ 500 million
> other transactions come and go might give us trouble.  I wasn't really
> planning to worry about that case ;-)

Agreed, I just don't like to rely on assumptions -:)

> > Required frequency of *successful* vacuum over *all* tables.
> > We would have to remember something in pg_class/pg_database
> > and somehow force vacuum over "too-long-unvacuumed-tables"
> > *automatically*.
> 
> I don't think this is a problem now; in practice you couldn't possibly
> go for half a billion transactions without vacuuming, I'd think.

Why not?
And once again - assumptions are not good for transaction area.

> If your plans to eliminate regular vacuuming become reality, then this
> scheme might become less reliable, but at present I think there's plenty
> of safety margin.
>
> > If undo would be implemented then we could delete pg_log between
> > postmaster startups - startup counter is remembered in pages, so
> > seeing old startup id in a page we would know that there are only
> > long ago committed xactions (ie only visible changes) there
> > and avoid xid comparison. But ... there will be no undo in 7.1.
> > And I foresee problems with WAL based BAR implementation if we'll
> > follow proposed solution: redo restores original xmin/xmax - how
> > to "freeze" xids while restoring DB?
> 
> So, we might eventually have a better answer from WAL, but not for 7.1.
> I think my idea is reasonably non-invasive and could be removed without
> much trouble once WAL offers a better way.  I'd really like to have some
> answer for 7.1, though.  The sort of numbers John Scott was quoting to
> me for Verizon's paging network throughput make it clear that we aren't
> going to survive at that level with a limit of 4G transactions per
> database reload.  Having to vacuum everything on at least a
> 1G-transaction cycle is salable, dump/initdb/reload is not ...

Understandable. And probably we can get BAR too but require full
backup every WRAPLIMIT/2 (or better /4) transactions.

Vadim




Re: Transaction ID wraparound: problem and proposed solution

From
Jan Wieck
Date:
Tom Lane wrote:
> "Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> > Required frequency of *successful* vacuum over *all* tables.
> > We would have to remember something in pg_class/pg_database
> > and somehow force vacuum over "too-long-unvacuumed-tables"
> > *automatically*.
>
> I don't think this is a problem now; in practice you couldn't possibly
> go for half a billion transactions without vacuuming, I'd think.
   ISTM you forgot that the XID counter (and usage) is global.
   You  need  to  have  *any*  table  of  *any*  database in the   instance vacuumed before you are sure. Some
low-traffic DB's   might not get vacuumed for years (for example template1).
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #