Thread: Multiple table entries?

Multiple table entries?

From: Jeff Ross

Hi,

I recently upgraded to 8.4 and everything went great.  All databases are
working as they are supposed to, no problems seen.

Today, however, I did a \d on a database and was surprised to see sets
of 5 identical table entries for each one that is supposed to be there.
Here's one of the databases in question (used to store apache web logs).
Several of my other databases (but not all) show the same symptom of 5
entries for each real table.

_postgresql@heinlein:/backup/postgresql $ psql pglogd
psql (8.4.0)
Type "help" for help.

pglogd=# \d+
                          List of relations
 Schema |     Name     | Type  |    Owner    |  Size   | Description
--------+--------------+-------+-------------+---------+-------------
 public | full_entries | table | _postgresql | 528 kB  |
 public | full_entries | table | _postgresql | 528 kB  |
 public | full_entries | table | _postgresql | 528 kB  |
 public | full_entries | table | _postgresql | 528 kB  |
 public | full_entries | table | _postgresql | 528 kB  |
 public | full_temp    | table | jross       | 1904 kB |
 public | full_temp    | table | jross       | 1904 kB |
 public | full_temp    | table | jross       | 1904 kB |
 public | full_temp    | table | jross       | 1904 kB |
 public | full_temp    | table | jross       | 1904 kB |
 public | log_entries  | table | _postgresql | 1648 kB |
 public | log_entries  | table | _postgresql | 1648 kB |
 public | log_entries  | table | _postgresql | 1648 kB |
 public | log_entries  | table | _postgresql | 1648 kB |
 public | log_entries  | table | _postgresql | 1648 kB |
 public | page_hits    | table | _postgresql | 5296 kB |
 public | page_hits    | table | _postgresql | 5296 kB |
 public | page_hits    | table | _postgresql | 5296 kB |
 public | page_hits    | table | _postgresql | 5296 kB |
 public | page_hits    | table | _postgresql | 5296 kB |
 public | total_hits   | table | _postgresql | 16 kB   |
 public | total_hits   | table | _postgresql | 16 kB   |
 public | total_hits   | table | _postgresql | 16 kB   |
 public | total_hits   | table | _postgresql | 16 kB   |
 public | total_hits   | table | _postgresql | 16 kB   |
(25 rows)

\l correctly lists the individual databases, and when I took a look at
yesterday's dump file I only see one create table statement, not 5.

I browsed through the system catalogs but haven't found anything yet
that can shine some light on this.

System is running OpenBSD -current.

Thanks,

Jeff Ross



Re: Multiple table entries?

From: Greg Stark

On Sat, Aug 22, 2009 at 9:31 PM, Jeff Ross<jross@wykids.org> wrote:
> Hi,
>
> I recently upgraded to 8.4 and everything went great.  All databases are
> working as they are supposed to, no problems seen.
>
> Today, however, I did a \d on a database and was surprised to see sets of 5
> identical table entries for each one that is supposed to be there.

Ugh.

What does

select xmin,xmax,ctid,oid,* from pg_class

return?
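
[Aside, not from the original thread: a more compact way to ask the same
question is

select relnamespace, relname, count(*)
from pg_class
group by relnamespace, relname
having count(*) > 1;

If this returns nothing, the extra \d lines are not coming from pg_class
itself but from something it is joined against, which is where the
thread heads next.]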

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Greg Stark

On Sat, Aug 22, 2009 at 9:31 PM, Jeff Ross<jross@wykids.org> wrote:
> I browsed through the system catalogs but haven't found anything yet that
> can shine some light on this.

Actually, I wonder if this isn't more likely to show the problem -- it
would explain why *all* your tables are showing up with duplicates
rather than just one.

select xmin,xmax,ctid,oid,* from pg_namespace


--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sat, Aug 22, 2009 at 9:31 PM, Jeff Ross<jross@wykids.org> wrote:
>
>> Hi,
>>
>> I recently upgraded to 8.4 and everything went great.  All databases are
>> working as they are supposed to, no problems seen.
>>
>> Today, however, I did a \d on a database and was surprised to see sets of 5
>> identical table entries for each one that is supposed to be there.
>>
>
> Ugh.
>
> What does
>
> select xmin,xmax,ctid,oid,* from pg_class
>
> return?
>
>
216 rows worth, but I only see one entry per table, not 5.

For easier reading, I put the output at

    http://openvistas.net/pg_class_query.html

Thanks!

Jeff


Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sat, Aug 22, 2009 at 9:31 PM, Jeff Ross<jross@wykids.org> wrote:
>
>> I browsed through the system catalogs but haven't found anything yet that
>> can shine some light on this.
>>
>
> Actually, I wonder if this isn't more likely to show the problem -- it
> would explain why *all* your tables are showing up with duplicates
> rather than just one.
>
> select xmin,xmax,ctid,oid,* from pg_namespace
>
>
>
http://openvistas.net/pg_namespace_query.html

Jeff

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 4:06 AM, Jeff Ross<jross@wykids.org> wrote:
> Greg Stark wrote:

>> Actually, I wonder if this isn't more likely to show the problem -- it
>> would explain why *all* your tables are showing up with duplicates
>> rather than just one.
>>
>> select xmin,xmax,ctid,oid,* from pg_namespace
>
> http://openvistas.net/pg_namespace_query.html

Yeah, that's a problem. Would you be able to load the pageinspect
contrib module and run a query?

select (h).* from (select
heap_page_items(get_raw_page('pg_namespace',0)) as h from p) as x;
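
[Aside, not from the original thread: on 8.4 a contrib module such as
pageinspect is loaded by running the SQL script shipped with it against
the target database, e.g. from psql (the path below is only a guess and
depends on how contrib was installed):

\i /usr/local/share/postgresql/contrib/pageinspect.sql
]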



--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 4:40 AM, Greg Stark<gsstark@mit.edu> wrote:
> On Sun, Aug 23, 2009 at 4:06 AM, Jeff Ross<jross@wykids.org> wrote:
>> Greg Stark wrote:
>
> Yeah, that's a problem. Would you be able to load the pageinspect
> contrib module and run a query?
>
> select (h).* from (select
> heap_page_items(get_raw_page('pg_namespace',0)) as h from p) as x;


Also, do you have the WAL log files going back to the database
creation? If so *please* preserve them! You're the second person to
report similar symptoms so it's starting to look like there may be a
serious bug here. And the nature of the problem is such that I think
it may require someone sifting through the xlog WAL files to see what
sequence of events happened.

The WAL files are in a subdirectory of the database root called
pg_xlog. If you still have one named 000000010000000000000000 then you
have all of them. Please, if possible, copy them to a backup
directory.

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sun, Aug 23, 2009 at 4:06 AM, Jeff Ross<jross@wykids.org> wrote:
>
>> Greg Stark wrote:
>>
>
>
>>> Actually, I wonder if this isn't more likely to show the problem -- it
>>> would explain why *all* your tables are showing up with duplicates
>>> rather than just one.
>>>
>>> select xmin,xmax,ctid,oid,* from pg_namespace
>>>
>> http://openvistas.net/pg_namespace_query.html
>>
>
> Yeah, that's a problem. Would you be able to load the pageinspect
> contrib module and run a query?
>
> select (h).* from (select
> heap_page_items(get_raw_page('pg_namespace',0)) as h from p) as x;
>
>
>
>

I had to modify your query slightly to make it run--hope I got what you
are after!

select (h).* from (select
heap_page_items(get_raw_page('pg_namespace',0)) as h) as x;

http://www.openvistas.net/pageinspect.html

Thanks,

Jeff


Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 2:18 PM, Jeff Ross<jross@wykids.org> wrote:
> I had to modify your query slightly to make it run--hope I got what you are
> after!
>
> select (h).* from (select
> heap_page_items(get_raw_page('pg_namespace',0)) as h) as x;
>
> http://www.openvistas.net/pageinspect.html

Incidentally, may as well ask the usual questions:

Did you upgrade this via pg_dump/pg_restore or via pg_migrator?
Are you running with fsync=off?
Are you running with synchronous_commit=off?
Have you had any system crashes?
Have you had any Postgres crashes?

Of the above only fsync=off combined with a system crash ought to be
unsafe, but it could be good to know which other events have occurred
to know which code paths have been exercised.
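
[Aside, not from the original thread: the fsync and synchronous_commit
settings can be checked from psql with

show fsync;
show synchronous_commit;
]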

Do you still have pg_clog/0000 (that's kept longer than the xlog so I
would expect you would)? How large is it? Could you send that file? It
has a record of which transactions have committed or aborted, no
private data.

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sun, Aug 23, 2009 at 2:18 PM, Jeff Ross<jross@wykids.org> wrote:
>
>> I had to modify your query slightly to make it run--hope I got what you are
>> after!
>>
>> select (h).* from (select
>> heap_page_items(get_raw_page('pg_namespace',0)) as h) as x;
>>
>> http://www.openvistas.net/pageinspect.html
>>
>
> Incidentally, may as well ask the usual questions:
>
> Did you upgrade this via pg_dump/pg_restore or via pg_migrator?
>

pg_dump
> Are you running with fsync=off?
>
No.
> Are you running with synchronous_commit=off?
>
No.

> Have you had any system crashes?
>
No.

> Have you had any Postgres crashes?
>

No.

> Of the above only fsync=off combined with a system crash ought to be
> unsafe, but it could be good to know which other events have occurred
> to know which code paths have been exercised.
>
> Do you still have pg_clog/0000 (that's kept longer than the xlog so I
> would expect you would), how large is it? could you send that file? It
> has a record of which transactions have committed or aborted, no
> private data.
>
>
pg_clog is 32K.  I've put it at http://www.openvistas.net/pg_clog

Jeff

Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sun, Aug 23, 2009 at 5:02 PM, Greg Stark<gsstark@mit.edu> wrote:
>
>> On Sun, Aug 23, 2009 at 2:18 PM, Jeff Ross<jross@wykids.org> wrote:
>> Incidentally, may as well ask the usual questions:
>>
>
> And just for reference, what does pg_controldata print?
>
>
_postgresql@heinlein:/var/postgresql $
pg_controldata
pg_control version number:            843
Catalog version number:               200904091
Database system identifier:           5368117850282091454
Database cluster state:               in production
pg_control last modified:             Sun Aug 23 10:37:39 2009
Latest checkpoint location:           0/4203B43C
Prior checkpoint location:            0/4202F978
Latest checkpoint's REDO location:    0/4203B43C
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/116884
Latest checkpoint's NextOID:          866525
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Sun Aug 23 10:37:38 2009
Minimum recovery ending location:     0/0
Maximum data alignment:               4
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        2000
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by reference


Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 5:37 PM, Jeff Ross<jross@wykids.org> wrote:
>
> pg_clog is 32K.  I've put it at http://www.openvistas.net/pg_clog

Sorry, I'm getting a 404


--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 6:23 PM, Greg Stark<gsstark@mit.edu> wrote:
> On Sun, Aug 23, 2009 at 5:37 PM, Jeff Ross<jross@wykids.org> wrote:
>>
>> pg_clog is 32K.  I've put it at http://www.openvistas.net/pg_clog
>
> Sorry, I'm getting a 404

For what it's worth this is what the heap dump shows. I don't
understand why you're seeing these tuples show up if they're all
properly marked as updated and XMAX_COMMITTED.

The last tuple is marked strangely I think. I don't think it's
supposed to have XMAX_INVALID if xmax is 0 but I could be wrong. Also,
I don't understand why it's marked as UPDATED if ctid and xmax aren't
set.

Could you also send
select (h).* from (select  page_header(get_raw_page('pg_namespace',0))
as h) as x;

I'm wondering if the page allvisible flag is set. The visibility map
is one of the few 8.4 patches which impact transaction visibility.


 lp  lp_off  flags  lp_len  t_xmin  t_xmax  field3  t_ctid  t_info2  t_info
  3    7616      1     141       2     897       0   (0,6)    16387    9482
      HOT_UPDATED,                  UPDATED, XMAX_COMMITTED, XMIN_COMMITTED

  6    7340      1     129     897     898       0   (0,7)   -16381    9482
      HOT_UPDATED, HEAP_ONLY_TUPLE, UPDATED, XMAX_COMMITTED, XMIN_COMMITTED

  7    7220      1     117     898     899       0   (0,9)   -16381    9482
      HOT_UPDATED, HEAP_ONLY_TUPLE, UPDATED, XMAX_COMMITTED, XMIN_COMMITTED

  9    7088      1     129     899     900       0  (0,10)   -16381    9482
      HOT_UPDATED, HEAP_ONLY_TUPLE, UPDATED, XMAX_COMMITTED, XMIN_COMMITTED

 10    6944      1     141     900       0       0  (0,10)   -32765   10506
      UPDATED, XMAX_INVALID, XMIN_COMMITTED




--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Tom Lane

Greg Stark <gsstark@mit.edu> writes:
> The last tuple is marked strangely I think. I don't think it's
> supposed to have XMAX_INVALID if xmax is 0 but I could be wrong. Also,
> I don't understand why it's marked as UPDATED if ctid and xmax aren't
> set.

No, that all looks right to me.  UPDATE sets HEAP_UPDATED on the newer
version of the row, not the older one.  What looks interesting to me is
that the last update isn't marked HEAP_ONLY_TUPLE, ie, it's not in the
same HOT chain.  Why is that I wonder ...

> I'm wondering if the page allvisible flag is set. The visibility map
> is one of the few 8.4 patches which impact transaction visibility.

But we don't use that while examining individual tuples, do we?

            regards, tom lane

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 7:00 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> The last tuple is marked strangely I think. I don't think it's
>> supposed to have XMAX_INVALID if xmax is 0 but I could be wrong. Also,
>> I don't understand why it's marked as UPDATED if ctid and xmax aren't
>> set.
>
> No, that all looks right to me.  UPDATE sets HEAP_UPDATED on the newer
> version of the row, not the older one.  What looks interesting to me is
> that the last update isn't marked HEAP_ONLY_TUPLE, ie, it's not in the
> same HOT chain.  Why is that I wonder ...
>
>> I'm wondering if the page allvisible flag is set. The visibility map
>> is one of the few 8.4 patches which impact transaction visibility.
>
> But we don't use that while examining individual tuples, do we?

We don't use the visibility map itself but we *do* use the page
header's all visible bit. On a sequential scan we skip the visibility
check for tuples on the page if the page header bit is set.
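
[Aside, not from the original thread: since only sequential scans trust
the page-level bit, forcing an index scan is one way to cross-check the
diagnosis:

set enable_seqscan = off;
select nspname from pg_namespace order by nspname;
reset enable_seqscan;

The index scan does per-tuple visibility checks, so it should list each
schema only once even while \d shows duplicates.]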


--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Tom Lane

Greg Stark <gsstark@mit.edu> writes:
> On Sun, Aug 23, 2009 at 7:00 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> But we don't use that while examining individual tuples, do we?

> We don't use the visibility map itself but we *do* use the page
> header's all visible bit. On a sequential scan we skip the visibility
> check for tuples on the page if the page header bit is set.

Urgh.  I bet that's where the problem is then.  Some path is failing
to clear that bit, or maybe there's a race condition that allows it
to become set incorrectly (ie right after somebody else adds a tuple).

            regards, tom lane

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 7:34 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Greg Stark <gsstark@mit.edu> writes:
>> On Sun, Aug 23, 2009 at 7:00 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>>> But we don't use that while examining individual tuples, do we?
>
>> We don't use the visibility map itself but we *do* use the page
>> header's all visible bit. On a sequential scan we skip the visibility
>> check for tuples on the page if the page header bit is set.
>
> Urgh.  I bet that's where the problem is then.  Some path is failing
> to clear that bit, or maybe there's a race condition that allows it
> to become set incorrectly (ie right after somebody else adds a tuple).

Well it only gets set in vacuum when we have the vacuum cleanout lock
so I can't see it getting set incorrectly unless there's a simple
logic error in vacuum. I just scanned it and it looks right to me.

I can imagine it failing to be unset. Or being unset but allowing a
race condition window before it's unset when it's wrong. That doesn't
explain why the HEAP_ONLY_TUPLE flag would be missing though.

Also, vacuum logs a warning if the flag is found to be set and
shouldn't be. I suppose that could easily be missed though if it's
autovacuum that ran it. Also, the flag might fail to be cleared but
then be cleared later by another operation. I wonder how much damage
it can do while it's wrong.

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 5:02 PM, Greg Stark<gsstark@mit.edu> wrote:
> On Sun, Aug 23, 2009 at 2:18 PM, Jeff Ross<jross@wykids.org> wrote:
> Incidentally, may as well ask the usual questions:

And just for reference, what does pg_controldata print?

--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Tom Lane

Greg Stark <gsstark@mit.edu> writes:
> On Sun, Aug 23, 2009 at 7:34 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> Urgh.  I bet that's where the problem is then.  Some path is failing
>> to clear that bit, or maybe there's a race condition that allows it
>> to become set incorrectly (ie right after somebody else adds a tuple).

> Well it only gets set in vacuum when we have the vacuum cleanout lock
> so I can't see it getting set incorrectly unless there's a simple
> logic error in vacuum. I just scanned it and it looks right to me.

heap_update is broken.  Details left as an exercise for the reader
(hint: Heikki has blatantly ignored the WAL-logging rules in transam/README)

            regards, tom lane

Re: Multiple table entries?

From: Jeff Ross

Greg Stark wrote:
> On Sun, Aug 23, 2009 at 5:37 PM, Jeff Ross<jross@wykids.org> wrote:
>
>> pg_clog is 32K.  I've put it at http://www.openvistas.net/pg_clog
>>
>
> Sorry, I'm getting a 404
>
>
>
Oops--fixed now.

Jeff


Re: Multiple table entries?

From: Jeff Ross

Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
>
>> On Sun, Aug 23, 2009 at 7:34 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>>
>>> Urgh.  I bet that's where the problem is then.  Some path is failing
>>> to clear that bit, or maybe there's a race condition that allows it
>>> to become set incorrectly (ie right after somebody else adds a tuple).
>>>
>
>
>> Well it only gets set in vacuum when we have the vacuum cleanout lock
>> so I can't see it getting set incorrectly unless there's a simple
>> logic error in vacuum. I just scanned it and it looks right to me.
>>
>
> heap_update is broken.  Details left as an exercise for the reader
> (hint: Heikki has blatantly ignored the WAL-logging rules in transam/README)
>
>             regards, tom lane
>
>
Well, as the reader that started this all ;-) should I be worried?  I
don't see any obvious problem with the way 8.4.0 is working.  Nearly
every web page on my server is stored in postgres, and we're still
churning out pages, no problem.

Should I do a pg_dump and reinstall?  Roll back to 8.3.7?  Or just
relax, don't worry and have a sparkling adult beverage?

Thanks,

Jeff (proceeding with relaxing, not worrying, and so on)



Re: Multiple table entries?

From: Greg Stark

On Sun, Aug 23, 2009 at 9:06 PM, Jeff Ross<jross@wykids.org> wrote:
> pglogd=# select (h).* from (select
>  page_header(get_raw_page('pg_namespace',0))
> pglogd(# as h) as x;
>     lsn    | tli | flags | lower | upper | special | pagesize | version | prune_xid
> -----------+-----+-------+-------+-------+---------+----------+---------+-----------
>  0/1A15904 |   1 |     5 |    64 |  6944 |    8192 |     8192 |       4 |       897

Well, this confirms that the bug involves the all-visible flag, as the 5
is PD_ALL_VISIBLE + PD_HAS_FREE_LINES.

It's been overtaken by events as Tom has spotted a problem already.
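
[Aside, not from the original thread: per src/include/storage/bufpage.h
in 8.4, PD_HAS_FREE_LINES is 0x0001 and PD_ALL_VISIBLE is 0x0004, so the
two bits can be pulled out of the page_header output directly:

select flags,
       (flags & 1) <> 0 as pd_has_free_lines,
       (flags & 4) <> 0 as pd_all_visible
from page_header(get_raw_page('pg_namespace', 0));
]
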
--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Multiple table entries?

From: Tom Lane

Jeff Ross <jross@wykids.org> writes:
> Tom Lane wrote:
>> heap_update is broken.  Details left as an exercise for the reader

> Well, as the reader that started this all ;-) should I be worried?
> Should I do a pg_dump and reinstall?  Roll back to 8.3.7?  Or just
> relax, don't worry and have a sparkling adult beverage?

Well, it's a pretty bad bug but as far as I can see a simple "VACUUM
table" command should fix it up --- would you confirm?

            regards, tom lane

Re: Multiple table entries?

From: Jeff Ross

Tom Lane wrote:
> Jeff Ross <jross@wykids.org> writes:
>
>> Tom Lane wrote:
>>
>>> heap_update is broken.  Details left as an exercise for the reader
>>>
>
>
>> Well, as the reader that started this all ;-) should I be worried?
>> Should I do a pg_dump and reinstall?  Roll back to 8.3.7?  Or just
>> relax, don't worry and have a sparkling adult beverage?
>>
>
> Well, it's a pretty bad bug but as far as I can see a simple "VACUUM
> table" command should fix it up --- would you confirm?
>
>             regards, tom lane
>
>
Hah!  It did indeed clear it up!

pglogd=# \d
              List of relations
 Schema |     Name     | Type  |    Owner
--------+--------------+-------+-------------
 public | full_entries | table | _postgresql
 public | full_temp    | table | jross
 public | log_entries  | table | _postgresql
 public | page_hits    | table | _postgresql
 public | total_hits   | table | _postgresql
(5 rows)


Thanks, Tom and Greg.  Looking forward to 8.4.1 :-)

Jeff


Re: Multiple table entries?

From: Tom Lane

Jeff Ross <jross@wykids.org> writes:
> Tom Lane wrote:
>> Well, it's a pretty bad bug but as far as I can see a simple "VACUUM
>> table" command should fix it up --- would you confirm?

> Hah!  It did indeed clear it up!

[ thinks... ]  Actually, that only proves that the PD_ALL_VISIBLE fixup
logic in vacuumlazy.c does what it's supposed to do; it doesn't in
itself show anything about how the flag got to be wrong.  The path that
I'm seeing is enabled by the bogus coding that messes with the flag
after having done XLogInsert --- that's forbidden by our coding rules,
and the reason is that if XLogInsert chooses to log the entire page then
the replay routine will assume that all the required changes got applied
already.  It's fairly easy to demonstrate the problem:

create table foo (f1 int, f2 int);
insert into foo values (1,2);
insert into foo values (3,4);
select ctid,xmin,xmax,* from foo;
vacuum foo;
vacuum foo;                  -- leaves the page marked PD_ALL_VISIBLE
checkpoint;                  -- so the update below logs a full-page image
update foo set f2 = f2 + 1;  -- heap_update clears the flag only after XLogInsert
select ctid,xmin,xmax,* from foo;
-- now cause a crash, eg kill -9 on this backend
-- (replay restores the logged page image, which still has PD_ALL_VISIBLE set)

After recovery, there'll be two visible copies of the two rows.
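
[Aside, not from the original thread: that is, once the server has gone
through crash recovery, re-running

select ctid,xmin,xmax,* from foo;

shows four visible rows (the old and the new version of each row)
instead of two.]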

However, this theory requires that you had a backend crash, and
you averred upthread that you hadn't ...

            regards, tom lane

Re: Multiple table entries?

From: Greg Stark

On Mon, Aug 24, 2009 at 2:03 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Jeff Ross <jross@wykids.org> writes:
>> Tom Lane wrote:
>>> heap_update is broken.  Details left as an exercise for the reader
>
>> Well, as the reader that started this all ;-) should I be worried?
>> Should I do a pg_dump and reinstall?  Roll back to 8.3.7?  Or just
>> relax, don't worry and have a sparkling adult beverage?
>
> Well, it's a pretty bad bug but as far as I can see a simple "VACUUM
> table" command should fix it up --- would you confirm?

At the very least taking regular pg_dumps is probably wise. That's
probably wise even if there aren't Postgres bugs, since it's the most
flexible kind of backup for dealing with application bugs.

The answer to whether you should roll back until 8.4.1 comes out will
depend on how valuable your data is and how critical the downtime to
repair any corruption would be, versus the time you'll spend rolling it
back. That's a complicated calculus which will be different for every
user.

The bug found should only affect recovery though. So unless you have a
standby slave database or have postgres or system crashes it shouldn't
be relevant.


--
greg
http://mit.edu/~gsstark/resume.pdf