Thread: Data loss, vacuum, transaction wrap-around

Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
I want to see if there is a consensus of opinion out there.

We've all known that data loss "could" happen if vacuum is not run and you
perform more than 2 billion transactions.  These days, with faster and
bigger computers and disks, it is more likely that this problem can be hit
in months -- not years.  (At a sustained 1,000 transactions per second, the
roughly 2-billion-XID window is burned through in under a month.)

To me, the WORST thing a program can do is lose data. (Certainly this is
bad for a database.) I don't think there is any real excuse for this.
While the 2-billion-transaction problem was always there, it seemed so
remote that I never obsessed about it. Now that it seems like a real
problem that more than one user has hit, I am worried.

In fact, I think it is so bad, that I think we need to back-port a fix to
previous versions and issue a notice of some kind.

Here are my suggestions:

(1) As Tom has already said, at some point start issuing warnings in the
log that vacuum needs to be run (a sketch of the kind of check this implies
follows the list).

(2) At some point, stop accepting transactions on anything but template1,
issuing an error saying that vacuum needs to be run.

(3) Either with psql on template1, with the standalone "postgres" backend,
or with some "vacuumall" program, provide a way to open the database in
single-user mode or via template1 and vacuum it.

(4) This should remain even after autovacuum is in place. If for some
reason autovacuum is installed but not running, we still need to protect
the data from a stupid admin. (Last time I looked, autovacuum used
various stats, and that may be something an admin disables.)

(5) Vacuum could check for a wrap-around condition in the database cluster
and take it upon itself to run more broadly even if it was directed only
towards a table.
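
As a concrete sketch of the check suggestion (1) implies: the age()
function and pg_database.datfrozenxid already exist in the 7.4/8.0
catalogs, so the test could be approximated from plain SQL. The
one-billion threshold below is illustrative only, not the server's actual
trip point:

    -- Crude cluster-wide "idiot light": true once the database that is
    -- furthest behind has burned through about half of the ~2-billion-XID
    -- safe range.  The threshold is illustrative, not the real limit.
    SELECT max(age(datfrozenxid)) > 1000000000 AS vacuum_needed_soon
      FROM pg_database;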

We've been saying that MySQL is OK if you don't care about your data; I
would hate it if people started using this issue against PostgreSQL.




Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
pgsql@mohawksoft.com writes:
> In fact, I think it is so bad, that I think we need to back-port a fix to
> previous versions and issue a notice of some kind.

They already do issue notices --- see VACUUM.

A real fix (eg the forcible stop we were talking about earlier) will not
be reasonable to back-port.
        regards, tom lane


Re: Data loss, vacuum, transaction wrap-around

From
"Matthew T. O'Connor"
Date:
Tom Lane wrote:

>pgsql@mohawksoft.com writes:
>
>>In fact, I think it is so bad, that I think we need to back-port a fix to
>>previous versions and issue a notice of some kind.
>
>They already do issue notices --- see VACUUM.
>
>A real fix (eg the forcible stop we were talking about earlier) will not
>be reasonable to back-port.
>

I hope this question isn't too stupid.... 

Is it possible to create a "vacuum wraparound" or "vacuum xidreset" 
command which would do the work required to fix the wraparound problem, 
without being as expensive as a normal vacuum of an entire database?


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
More suggestions:

(1) At startup, the postmaster checks the current XID; if it is close to a
problem, force a vacuum.

(2) At "sig term" shutdown, can the postmaster start a vacuum?

(3) When the XID count goes past the "trip wire" can it spontaneously
issue a vacuum?


NOTE:
Suggestions 1 and 2 are for 8.0 and prior. 3 is for later than 8.0.1



Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> I hope this question isn't too stupid.... 

> Is it possible to create a "vacuum wraparound" or "vacuum xidreset" 
> command which would do the work required to fix the wraparound problem, 
> without being as expensive as a normal vacuum of an entire database?

I don't think it'd be worth the trouble.  You could skip index cleanup
if you didn't actually delete any tuples, but you'd still have to do all
of the scanning work.  The cases where people think they don't need to
do vacuum (because the table didn't have any deleted tuples) wouldn't
get any cheaper at all.
        regards, tom lane
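
For context: the closest thing that already exists is a database-wide
VACUUM FREEZE, which aggressively freezes tuples and so pushes the
wrap-around horizon back as far as possible -- but, per Tom's point, it
still has to do all of the scanning work:

    -- Database-wide vacuum with aggressive tuple freezing (existing
    -- syntax in the 7.4/8.0 era); advances the wrap-around horizon
    -- maximally, yet still scans every table in the current database.
    VACUUM FREEZE;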


Re: Data loss, vacuum, transaction wrap-around

From
"Joshua D. Drake"
Date:
Tom Lane wrote:
> pgsql@mohawksoft.com writes:
>
>>In fact, I think it is so bad, that I think we need to back-port a fix to
>>previous versions and issue a notice of some kind.
>
>
> They already do issue notices --- see VACUUM.
>
> A real fix (eg the forcible stop we were talking about earlier) will not
> be reasonable to back-port.

Would at least an automated warning mechanism be a reasonable backport?

Sincerely,

Joshua D. Drake


>
>             regards, tom lane


--
Command Prompt, Inc., your source for PostgreSQL replication,
professional support, programming, managed services, shared
and dedicated hosting. Home of the Open Source Projects plPHP,
plPerlNG, pgManage,  and pgPHPtoolkit.
Contact us now at: +1-503-667-4564 - http://www.commandprompt.com



Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> Tom Lane wrote:
>> A real fix (eg the forcible stop we were talking about earlier) will not
>> be reasonable to back-port.

> Would at least an automated warning mechanism be a reasonable backport?

No, because the hard part of the problem actually is detecting that the
condition exists in a reasonably cheap way.  The check in VACUUM is
really extremely expensive, which is why we don't make it except after
completing a database-wide vacuum.  Once we have an XID limit value
sitting in shared memory then the code to use it (generate warnings
and/or error out) is simple; it's initializing that value during
postmaster start that I consider too complicated/risky to stick into
existing versions.

There is another issue here, which is that I have no faith that the
people who actually need this are going to be clueful enough to update
to 7.4.8 or 7.3.10 or whatever they'd need...
        regards, tom lane


Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
pgsql@mohawksoft.com writes:
> More suggestions:
> (1) At startup, the postmaster checks the current XID; if it is close to
> a problem, force a vacuum.

Useless to a system that's run 24x7; also presumes the existence of a
complete solution anyway (since getting the postmaster to find that out
is the hard part).

> (2) At "sig term" shutdown, can the postmaster start a vacuum?

Certainly not.  We have to assume that SIGTERM means we are under a
short-term sentence of death from init.  And if it's a manual stop
it doesn't sound much better: the sort of DBA that needs this "feature"
is likely to decide he should kill -9 the postmaster because it's taking
too long to shut down.

> (3) When the XID count goes past the "trip wire" can it spontaneously
> issue a vacuum?

Only in the database you're connected to, which very likely isn't where
the problem is.  Moreover, having N backends all decide they need to do
this at once doesn't sound like a winner.  Furthermore, this still
presumes the existence of the hard part of the solution, which is
knowing where the trip point is.
        regards, tom lane


Re: Data loss, vacuum, transaction wrap-around

From
"Joshua D. Drake"
Date:
> There is another issue here, which is that I have no faith that the
> people who actually need this are going to be clueful enough to update
> to 7.4.8 or 7.3.10 or whatever they'd need...

Well I can't argue with that one ;)

>
>             regards, tom lane


--
Command Prompt, Inc., your source for PostgreSQL replication,
professional support, programming, managed services, shared
and dedicated hosting. Home of the Open Source Projects plPHP,
plPerlNG, pgManage,  and pgPHPtoolkit.
Contact us now at: +1-503-667-4564 - http://www.commandprompt.com



Re: Data loss, vacuum, transaction wrap-around

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> > (3) When the XID count goes past the "trip wire" can it spontaneously
> > issue a vacuum?
> 
> Only in the database you're connected to, which very likely isn't where
> the problem is.  Moreover, having N backends all decide they need to do
> this at once doesn't sound like a winner.  Furthermore, this still
> presumes the existence of the hard part of the solution, which is
> knowing where the trip point is.

Alright, I have a suggestion. If the database kept an "oldest xid" for each
table then there wouldn't be any expensive work to scan the table looking
for the oldest xid. The only time "oldest xid" needs to be updated is when
vacuum is run, which is precisely when it would be known.

There could be a per-database "oldest xid" that any vacuum on any table
updates (by skimming all the "oldest xid"s for the current database). If
that's stored in the shared pg_database table then it's accessible regardless
of what database you connect to, no?

Then on every connection and every n-thousandth transaction you just have to
check the "oldest xid" for all the databases, and make sure the difference
between the oldest one and the current xid is reasonable.

-- 
greg



Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> There could be a per-database "oldest xid" that any vacuum on any table
> updates (by skimming all the "oldest xid"s for the current database). If
> that's stored in the shared pg_database table then it's accessible regardless
> of what database you connect to, no?

You mean like pg_database.datfrozenxid?

The problem is not that we don't have the data.  The problem is getting
the data to where it's needed, which is GetNewTransactionId().  That
routine cannot be responsible for finding it out for itself, because we
haven't yet started a transaction at the time where we need to know
if it's safe to consume an XID.
        regards, tom lane
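
Since pg_database is a shared catalog, the per-database value Tom names
here can indeed be read from any connection in the cluster; a minimal
monitoring query along the lines the docs suggest:

    -- Run from any database (e.g. template1): how far each database's
    -- frozen-XID horizon lags behind the current transaction counter.
    SELECT datname, age(datfrozenxid) AS xid_age
      FROM pg_database
     ORDER BY age(datfrozenxid) DESC;  -- most at-risk database first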


Re: Data loss, vacuum, transaction wrap-around

From
Russell Smith
Date:
On Sat, 19 Feb 2005 04:10 am, Tom Lane wrote:
> pgsql@mohawksoft.com writes:
> > In fact, I think it is so bad, that I think we need to back-port a fix to
> > previous versions and issue a notice of some kind.
> 
> They already do issue notices --- see VACUUM.
> 
> A real fix (eg the forcible stop we were talking about earlier) will not
> be reasonable to back-port.
> 
Not to be rude, but if backporting is not an option, why do we not just
focus on getting autovacuum into 8.1, rather than thinking about how a
patch to warn users would work?

Regards

Russell Smith


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> On Sat, 19 Feb 2005 04:10 am, Tom Lane wrote:
>> pgsql@mohawksoft.com writes:
>> > In fact, I think it is so bad, that I think we need to back-port a fix
>> > to previous versions and issue a notice of some kind.
>>
>> They already do issue notices --- see VACUUM.
>>
>> A real fix (eg the forcible stop we were talking about earlier) will not
>> be reasonable to back-port.
>>
> Not to be rude, but if backporting is not an option, why do we not just
> focus on getting autovacuum into 8.1, rather than thinking about how a
> patch to warn users would work?

Unless I'm mistaken, even "autovacuum" may not be enough. AFAIK,
autovacuum depends on the statistics daemon, and some admins may turn that
off for performance. And even with it on, is it really so unlikely that
autovacuum simply fails to run?
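
For reference, with the 8.0-era contrib pg_autovacuum daemon that
dependency is real: it needs the stats collector with row-level stats
enabled, so a postgresql.conf trimmed for performance silently starves it:

    # postgresql.conf settings contrib/pg_autovacuum relies on (8.0 era).
    # Turning these off -- a common performance tweak -- leaves the daemon
    # blind to table activity, so it never triggers a vacuum.
    stats_start_collector = true
    stats_row_level = true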

I think there should be a 100% no data loss fail safe. Anything less is a
cop-out. I can't see one successful argument that starts with data loss
and ends with maintenance.



Re: Data loss, vacuum, transaction wrap-around

From
Andrew Dunstan
Date:

Russell Smith wrote:

>On Sat, 19 Feb 2005 04:10 am, Tom Lane wrote:
>
>>pgsql@mohawksoft.com writes:
>>
>>>In fact, I think it is so bad, that I think we need to back-port a fix to
>>>previous versions and issue a notice of some kind.
>>
>>They already do issue notices --- see VACUUM.
>>
>>A real fix (eg the forcible stop we were talking about earlier) will not
>>be reasonable to back-port.
>>
>Not to be rude, but if backporting is not an option, why do we not just
>focus on getting autovacuum into 8.1, rather than thinking about how a
>patch to warn users would work?
>

What if autovacuum is turned off for some reason? Or fails? A more 
graceful failure along the lines suggested would be a good thing, ISTM.

I agree with Tom about not backpatching, though. The situation seems 
analogous to a car owner who neglects the clear instructions in the 
manual to perform regular oil changes and then finds to his great 
surprise that the car stops running. It's hardly the manufacturer's fault.

cheers

andrew




Re: Data loss, vacuum, transaction wrap-around

From
lsunley@mb.sympatico.ca
Date:
In <4216A911.9020704@dunslane.net>, on 02/18/05   at 09:48 PM, Andrew Dunstan <andrew@dunslane.net> said:



>Russell Smith wrote:
>
>>Not to be rude, but if backporting is not an option, why do we not just
>>focus on getting autovacuum into 8.1, rather than thinking about how a
>>patch to warn users would work?

>What if autovacuum is turned off for some reason? Or fails? A more 
>graceful failure along the lines suggested would be a good thing, ISTM.

>I agree with Tom about not backpatching, though. The situation seems 
>analogous to a car owner who neglects the clear instructions in the 
>manual to perform regular oil changes and then finds to his great 
>surprise that the car stops running. It's hardly the manufacturer's
>fault.



I never did like car analogies...

At least a car comes with an IDIOT light or maybe even an oil pressure
gauge.

Something like that (the idiot light) is missing from PostgreSQL. The oil
pressure gauge would be good to have too -- kind of like a gauge that lets
you know when you are about to run out of fuel.

Of course the best thing is the auto-fill gas tank.

-- 
-----------------------------------------------------------
lsunley@mb.sympatico.ca
-----------------------------------------------------------



Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
pgsql@mohawksoft.com writes:
> I think there should be a 100% no data loss fail safe.

Possibly we need to recalibrate our expectations here.  The current
situation is that PostgreSQL will not lose data if:
    1. Your disk drive doesn't screw up (eg, lie about write complete,
       or just plain die on you).
    2. Your kernel and filesystem don't screw up.
    3. You follow the instructions about routine vacuuming.
    4. You don't hit any bugs that we don't know about.

I agree that it's a nice idea to be able to eliminate assumption #3 from
our list of gotchas, but the big picture is that it's hard to believe
that doing this will make for a quantum jump in the overall level of
reliability.  I think I listed the risks in roughly the right order of
severity ...

I'm willing to fix this for 8.1 (and am already in process of drafting a
patch), especially since it ties into some other known problems such as
the pg_pwd/pg_group files not being properly reconstructed after PITR
recovery.  But I think that a "Chinese fire drill" is not called for,
and backpatching a significant but poorly tested change falls into that
category IMHO.
        regards, tom lane


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> pgsql@mohawksoft.com writes:
>> I think there should be a 100% no data loss fail safe.

OK, maybe I was overly broad in my statement, but I assumed a context that
I guess you missed. Don't you think that in normal operations, i.e. with
no hardware or OS failure, we should see any data loss as unacceptable?

If a bug causes data loss, it is a big deal, right?
>
> Possibly we need to recalibrate our expectations here.  The current
> situation is that PostgreSQL will not lose data if:
>
>     1. Your disk drive doesn't screw up (eg, lie about write complete,
>        or just plain die on you).
>     2. Your kernel and filesystem don't screw up.
>     3. You follow the instructions about routine vacuuming.
>     4. You don't hit any bugs that we don't know about.
>

See, here is where I strongly disagree. Items (1) and (2) are completely
out of our control, and no one would blame PostgreSQL.

Item (4) is an issue with all software; every now and then people hit bugs,
and the bug is reported and assumed to get fixed.

Item (3) is just nasty: RTFM or else, sucka! I think it is a very
user-hostile stance.


> I agree that it's a nice idea to be able to eliminate assumption #3 from
> our list of gotchas, but the big picture is that it's hard to believe
> that doing this will make for a quantum jump in the overall level of
> reliability.  I think I listed the risks in roughly the right order of
> severity ...

Sometimes the edge conditions of a problem are not so obscure. I think (3)
is a huge issue; imagine I'm in this meeting:

DBA: We can't use PostgreSQL; if we forget to do normal maintenance we'll
lose all our data.

ME: Well, there is an amount of truth in that, but we just won't forget.

DBA: Sorry, I don't trust it.

CTO: Mark, I think Joe has some serious issues that need to be resolved
before we can move on this.

Boom!! Lost.

>
> I'm willing to fix this for 8.1 (and am already in process of drafting a
> patch), especially since it ties into some other known problems such as
> the pg_pwd/pg_group files not being properly reconstructed after PITR
> recovery.  But I think that a "Chinese fire drill" is not called for,
> and backpatching a significant but poorly tested change falls into that
> category IMHO.
>
>             regards, tom lane



Re: Data loss, vacuum, transaction wrap-around

From
Jürgen Cappel
Date:
[ Shrugs ] and looks at other database systems ...

CA put Ingres into open source last year.

Very reliable system with a replicator worth looking at.

Just a thought.








Re: Data loss, vacuum, transaction wrap-around

From
Jaime Casanova
Date:
On Fri, 18 Feb 2005 22:35:31 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> pgsql@mohawksoft.com writes:
> > I think there should be a 100% no data loss fail safe.
> 
> Possibly we need to recalibrate our expectations here.  The current
> situation is that PostgreSQL will not lose data if:
> 
>        1. Your disk drive doesn't screw up (eg, lie about write complete,
>           or just plain die on you).
>        2. Your kernel and filesystem don't screw up.
>        3. You follow the instructions about routine vacuuming.
>        4. You don't hit any bugs that we don't know about.
> 
I'm not an expert but a happy user. My opinion is:
1) There is nothing to be done about #1 and #2.
2) #4 is not a big problem because of the speed with which developers fix
bugs once they are found.

3) All databases have some type of maintenance routine; in Informix, for
example, we have "update statistics" (and there are others for Oracle).
Of course those exist for performance reasons, but vacuum serves that
purpose too, and additionally handles the XID wraparound.
So, having a maintenance routine in PostgreSQL is not bad. *Bad* is
having a DBA(1) with no clue about the tool he is using. Tools that do
too much are an incentive to hire *no clue* people.

(1) DBA: DataBase Administrator or DataBase Annihilator???

regards,
Jaime Casanova


Re: Data loss, vacuum, transaction wrap-around

From
lsunley@mb.sympatico.ca
Date:
In <c2d9e70e050219112379204df4@mail.gmail.com>, on 02/19/05 at 02:23 PM,
Jaime Casanova <systemguards@gmail.com> said:

>So, having a maintenance routine in PostgreSQL is not bad. *Bad* is
>having a DBA(1) with no clue about the tool he is using. Tools that do
>too much are an incentive to hire *no clue* people.

>(1) DBA: DataBase Administrator or DataBase Annihilator???

Bad-mouthing the people who use your software is a good way to make sure
no one uses the software.

The catastrophic failure of the database because a maintenance function is
not performed is a problem with the software, not with the people using
it.

-- 
-----------------------------------------------------------
lsunley@mb.sympatico.ca
-----------------------------------------------------------



Re: Data loss, vacuum, transaction wrap-around

From
Bruno Wolff III
Date:
On Sat, Feb 19, 2005 at 13:35:25 -0500, lsunley@mb.sympatico.ca wrote:
> 
> The catastrophic failure of the database because a maintenance function is
> not performed is a problem with the software, not with the people using
> it.

There doesn't seem to be disagreement that something should be done going
forward.

The disagreement seems to be about what effort should be made to back-port
fixes to previous versions.


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> So, having a maintenance routine in PostgreSQL is not bad. *Bad* is
> having a DBA(1) with no clue about the tool he is using. Tools that do
> too much are an incentive to hire *no clue* people.
>
> (1) DBA: DataBase Administrator or DataBase Annihilator???

PostgreSQL is such an awesome project. The only thing it seems to suffer
from is a disregard for its users.


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> On Sat, Feb 19, 2005 at 13:35:25 -0500,
>   lsunley@mb.sympatico.ca wrote:
>>
>> The catastrophic failure of the database because a maintenance function
>> is not performed is a problem with the software, not with the people
>> using it.
>
> There doesn't seem to be disagreement that something should be done going
> forward.
>
> The disagreement seems to be about what effort should be made to back-port
> fixes to previous versions.


Now, let's imagine PostgreSQL is being developed by a large company. QA
announces it has found a bug that will cause all the users' data to
disappear if they don't run a maintenance program correctly. Vacuuming one
or two tables is not enough; you have to vacuum all tables in all
databases.

This bug would get marked as a critical error and a full scale effort
would be made to contact previous users to upgrade or check their
procedures.




Re: Data loss, vacuum, transaction wrap-around

From
Andrew Dunstan
Date:

pgsql@mohawksoft.com wrote:

>PostgreSQL is such an awesome project. The only thing it seems to suffer
>from is a disregard for its users.
>
Mark,

This is completely untrue and very offensive. Here's a tip I've often 
found useful even though I have also often ignored it (and later 
regretted doing so): before you hit the Send button, take a few deep 
breaths and count to 10.

There is no news in the problem you're complaining of. It's completely 
known and documented. You've stated before that you've been using 
PostgreSQL for years - why is this suddenly so urgent that we have to 
drop everything and backpatch old releases? Please move along, there's 
nothing to see here, these are not the bugs you've been looking for.

cheers

andrew


Re: Data loss, vacuum, transaction wrap-around

From
Mark Kirkwood
Date:
Andrew Dunstan wrote:

> There is no news in the problem you're complaining of. It's completely 
> known and documented. You've stated before that you've been using 
> PostgreSQL for years - why is this suddenly so urgent that we have to 
> drop everything and backpatch old releases? Please move along, there's 
> nothing to see here, these are not the bugs you've been looking for.

To be fair to Mark, there does seem to be an increasing number of 
reports of this issue. In spite of the in-the-works fix for 8.1, it 
would be a pity to see customers losing data from xid wrap-around.

However the quandary is this: even if we did back patches for every
version, said customers probably wouldn't know they needed to apply them
-- hmmm, not much help there. We might be better off merely announcing
the need to run vacuumdb (vacuumdb -a, which vacuums every database in
the cluster) on www.postgresql.org!

regards

Mark (the other one)



Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> [ Shrugs ] and looks at other database systems ...
>
> CA put Ingres into open source last year.
>
> Very reliable system with a replicator worth looking at.
>
> Just a thought.

The discussion on hackers is about how to make PostgreSQL better. There are
many different perspectives; differences are argued and consensus reached,
and a better PostgreSQL emerges.

Going to another database would be counterproductive to the process.



Re: Data loss, vacuum, transaction wrap-around

From
Tom Lane
Date:
Mark Kirkwood <markir@coretech.co.nz> writes:
> To be fair to Mark, there does seem to be an increasing number of 
> reports of this issue. In spite of the in-the-works fix for 8.1, it 
> would be a pity to see customers losing data from xid wrap-around.

The question is whether we are willing to back-patch a fairly large
amount of not-very-well-tested code into 8.0.  See
http://archives.postgresql.org/pgsql-patches/2005-02/msg00123.php
http://archives.postgresql.org/pgsql-committers/2005-02/msg00127.php
http://archives.postgresql.org/pgsql-committers/2005-02/msg00131.php

I personally don't think it's worth the risk.  The code works well
enough to commit to development tip, but it's fundamentally alpha
quality code.
        regards, tom lane


Re: Data loss, vacuum, transaction wrap-around

From
Bruno Wolff III
Date:
On Sat, Feb 19, 2005 at 18:04:42 -0500,
> 
> Now, let's imagine PostgreSQL is being developed by a large company. QA
> announces it has found a bug that will cause all the users' data to
> disappear if they don't run a maintenance program correctly. Vacuuming one
> or two tables is not enough; you have to vacuum all tables in all
> databases.

Except that Postgres isn't a large company, and the work of back-patching
and testing old versions would be done instead of more important work.

> This bug would get marked as a critical error and a full scale effort
> would be made to contact previous users to upgrade or check their
> procedures.

I don't think all commercial companies would do that. I doubt that even
most of them would.


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> On Sat, Feb 19, 2005 at 18:04:42 -0500,
>>
>> Now, let's imagine PostgreSQL is being developed by a large company. QA
>> announces it has found a bug that will cause all the users' data to
>> disappear if they don't run a maintenance program correctly. Vacuuming
>> one or two tables is not enough; you have to vacuum all tables in all
>> databases.
>
> Except that Postgres isn't a large company, and the work of back-patching
> and testing old versions would be done instead of more important work.

PostgreSQL is an open source project that plays with the big guys. Look at
the Linux kernel. Imagine their file system guys thinking this way. Linux
would still be Linus' hobby.

>
>> This bug would get marked as a critical error and a full scale effort
>> would be made to contact previous users to upgrade or check their
>> procedures.
>
> I don't think all commercial companies would do that. I doubt that even
> most of them would.

Database companies? You bet they would.


Re: Data loss, vacuum, transaction wrap-around

From
Robert Treat
Date:
On Sunday 20 February 2005 00:30, Tom Lane wrote:
> Mark Kirkwood <markir@coretech.co.nz> writes:
> > To be fair to Mark, there does seem to be an increasing number of
> > reports of this issue. In spite of the in-the-works fix for 8.1, it
> > would be a pity to see customers losing data from xid wrap-around.
>
> The question is whether we are willing to back-patch a fairly large
> amount of not-very-well-tested code into 8.0.  See
> http://archives.postgresql.org/pgsql-patches/2005-02/msg00123.php
> http://archives.postgresql.org/pgsql-committers/2005-02/msg00127.php
> http://archives.postgresql.org/pgsql-committers/2005-02/msg00131.php
>
> I personally don't think it's worth the risk.  The code works well
> enough to commit to development tip, but it's fundamentally alpha
> quality code.
>

I would lean away from putting it in 8.0; however, aren't we planning an
8.0.x release that will have beta and/or RC testing for the ARC-related
changes?  If so I might be open to putting it in that release (though the
bits requiring initdb are a killer).

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: Data loss, vacuum, transaction wrap-around

From
Stephan Szabo
Date:
On Sun, 20 Feb 2005 pgsql@mohawksoft.com wrote:

> > On Sat, Feb 19, 2005 at 18:04:42 -0500,
> >>
> >> Now, let's imagine PostgreSQL is being developed by a large company. QA
> >> announces it has found a bug that will cause all the users' data to
> >> disappear if they don't run a maintenance program correctly. Vacuuming
> >> one or two tables is not enough; you have to vacuum all tables in all
> >> databases.
> >
> > Except that Postgres isn't a large company, and the work of back-patching
> > and testing old versions would be done instead of more important work.
>
> PostgreSQL is an open source project that plays with the big guys. Look at
> the Linux kernel. Imagine their file system guys thinking this way. Linux
> would still be Linus' hobby.

So, you are certain that every Linux file system bug has been patched all
the way back to say kernel version 1.0 then?  Do you have any evidence of
this claim?

> >> This bug would get marked as a critical error and a full scale effort
> >> would be made to contact previous users to upgrade or check their
> >> procedures.
> >
> > I don't think all commercial companies would do that. I doubt that even
> > most of them would.
>
> Database companies? You bet they would.

Do you have any evidence or are you merely spouting an opinion as fact?


Re: Data loss, vacuum, transaction wrap-around

From
pgsql@mohawksoft.com
Date:
> On Sun, 20 Feb 2005 pgsql@mohawksoft.com wrote:
>
>> PostgreSQL is an open source project that plays with the big guys. Look
>> at the Linux kernel. Imagine their file system guys thinking this way.
>> Linux would still be Linus' hobby.
>
> So, you are certain that every Linux file system bug has been patched all
> the way back to say kernel version 1.0 then?  Do you have any evidence of
> this claim?

No one is suggesting going back to version 1.0, but critical data-loss bugs
that are present and relevant in prior versions still in use do get fixed.

>
>> >> This bug would get marked as a critical error and a full scale effort
>> >> would be made to contact previous users to upgrade or check their
>> >> procedures.
>> >
>> > I don't think all commercial companies would do that. I doubt that
>> even
>> > most of them would.
>>
>> Database companies? You bet they would.
>
> Do you have any evidence or are you merely spouting an opinion as fact?
>
With Oracle and DB2, yes I have some personal experience.


Re: Data loss, vacuum, transaction wrap-around

From
Stephan Szabo
Date:
On Sun, 20 Feb 2005 pgsql@mohawksoft.com wrote:

> >> PostgreSQL is an open source project that plays with the big guys. Look
> >> at the Linux kernel. Imagine their file system guys thinking this way.
> >> Linux would still be Linus' hobby.
> >
> > So, you are certain that every Linux file system bug has been patched all
> > the way back to say kernel version 1.0 then?  Do you have any evidence of
> > this claim?
>
> No one is suggesting going back to version 1.0, but critical data-loss bugs
> that are present and relevant in prior versions still in use do get fixed.

I still doubt your claim about patching, and you haven't given any
evidence, but let's just assume it's true, because otherwise even trying to
hold a discussion is fruitless.

I also dispute the claim, built on that back-patching claim, that Linux
would be Linus' hobby if the file system guys thought this way.  Given that
"stable" Linux branches often aren't, if there weren't aggregators who
provide upgrades that are at least supposedly tested and reasonably stable,
and who sometimes do their own back-patching, Linux wouldn't have the sort
of success it does, because people would have to do a lot more choosing
between getting bug fix X and doing huge amounts of tests to make sure
nothing else is broken.  Thus, I believe you are greatly overstating the
effect your first claim has on your second, to the point of making an
invalid argument.

Personally, I'd in general wish that 8.0 got a fix for this because that
way we could (after sufficient testing) push an 8.0 version that we
considered stable to suggest people move to.  However, I don't have a
whole lot of time to do such a patch nor to do "sufficient" testing, and
I'm not arrogant enough to believe I can order around volunteers and
companies I'm not a customer of.

> >> >> This bug would get marked as a critical error and a full scale effort
> >> >> would be made to contact previous users to upgrade or check their
> >> >> procedures.
> >> >
> >> > I don't think all commercial companies would do that. I doubt that
> >> even
> >> > most of them would.
> >>
> >> Database companies? You bet they would.
> >
> > Do you have any evidence or are you merely spouting an opinion as fact?
> >
> With Oracle and DB2, yes I have some personal experience.

My last company's experience with Oracle support still leaves me
questioning that claim.  They basically got "don't do that then or move to
the newest major revision" when they had a construct which caused the
server to stop responding.  It's not the same conditions (although I
believe the DBA did reload from backup because no one could guarantee that
there couldn't possibly have been data loss), but it's certainly not
indicative of the sort of "full scale efforts" you're describing.


Re: Data loss, vacuum, transaction wrap-around

From
Mark Kirkwood
Date:
Tom Lane wrote:

> The question is whether we are willing to back-patch a fairly large
> amount of not-very-well-tested code into 8.0.  See
> http://archives.postgresql.org/pgsql-patches/2005-02/msg00123.php
> http://archives.postgresql.org/pgsql-committers/2005-02/msg00127.php
> http://archives.postgresql.org/pgsql-committers/2005-02/msg00131.php
> 
> I personally don't think it's worth the risk.  The code works well
> enough to commit to development tip, but it's fundamentally alpha
> quality code.

I think this makes the most sense. If we are going to do an extended
testing period for 8.0-without-ARC, then bundling it in there might be
worth considering.

regards

Mark


Re: Data loss, vacuum, transaction wrap-around

From
Jeff
Date:
On Feb 20, 2005, at 11:02 AM, Stephan Szabo wrote:

> My last company's experience with Oracle support still leaves me
> questioning that claim.  They basically got "don't do that then or 
> move to
> the newest major revision" when they had a construct which caused the
> server to stop responding.

For the record, that is the kind of support I've gotten from Informix in
the past, even for issues where the db would reliably return invalid
results.

What is great is that I have one Informix db that I cannot take a backup
of without causing several DAYS of downtime.  Their solution: do an
in-place upgrade and hope it works.  Yes, they actually said, "And
hopefully it will work. You'll need to take a backup to be safe."  "But I
can't take a backup. That is the point of this call."  "Oh, well...
umm.. it SHOULD work!"

You get the idea.

as for the xid wraparound... issuing a NOTICE / ERROR in a new version 
will be good, but backpatching won't be needed.  As others have said, 
the people who really need this are not smart enough to upgrade / watch 
for patches / RTFM

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: Data loss, vacuum, transaction wrap-around

From
Bruce Momjian
Date:
Jeff wrote:
> 
> On Feb 20, 2005, at 11:02 AM, Stephan Szabo wrote:
> 
> > My last company's experience with Oracle support still leaves me
> > questioning that claim.  They basically got "don't do that then or 
> > move to
> > the newest major revision" when they had a construct which caused the
> > server to stop responding.
> 
> For the record, that is the kind of support I've gotten from Informix in
> the past, even for issues where the db would reliably return invalid
> results.
> 
> What is great is that I have one Informix db that I cannot take a backup
> of without causing several DAYS of downtime.  Their solution: do an
> in-place upgrade and hope it works.  Yes, they actually said, "And
> hopefully it will work. You'll need to take a backup to be safe."  "But I
> can't take a backup. That is the point of this call."  "Oh, well...
> umm.. it SHOULD work!"

"Uh, did we mention we are sure it would work in DB2."  :-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Data loss, vacuum, transaction wrap-around

From
Andrew Sullivan
Date:
On Sat, Feb 19, 2005 at 04:49:03PM -0500, pgsql@mohawksoft.com wrote:
> PostgreSQL is such an awesome project. The only thing it seems to suffer
> from is a disregard for its users.

Gee.  And all this time I thought that "free support from the guy who
wrote the code and gave it to you" was better regard for the users
than "cryptic support by someone who is reading a script and who's
afraid of the legal department."  Silly me.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
When my information changes, I alter my conclusions.  What do you do, sir?
    --attr. John Maynard Keynes