Re: Maintenance question / DB size anomaly... - Mailing list pgsql-performance

From Kurt Overberg
Subject Re: Maintenance question / DB size anomaly...
Date
Msg-id 4D1C51F2-79EA-4468-BF2D-1D5591BD8A3D@hotdogrecords.com
Whole thread Raw
In response to Re: Maintenance question / DB size anomaly...  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Maintenance question / DB size anomaly...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Jun 19, 2007, at 7:26 PM, Tom Lane wrote:

> Kurt Overberg <kurt@hotdogrecords.com> writes:
>> That's the thing thats kinda blowing my mind here, when I look at
>> that table:
>
>> db1=# select count(*) from _my_cluster.sl_log_1 ;
>> count
>> -------
>>    6788
>> (1 row)
>
> Well, that's real interesting.  AFAICS there are only two
> possibilities:
>
> 1. VACUUM sees the other 300k tuples as INSERT_IN_PROGRESS; a look at
> the code shows that these are counted the same as plain live tuples,
> but they'd not be visible to other transactions.  I wonder if you
> could
> have any really old open transactions that might have inserted all
> those
> tuples?
>

Unlikely- the database has been stopped and restarted, which I think
closes
out transactions?  Or could that cause the problems?


> 2. The other 300k tuples are committed good, but they are not seen as
> valid by a normal MVCC-aware transaction, probably because of
> transaction wraparound.  This would require the sl_log_1 table to have
> escaped vacuuming for more than 2 billion transactions, which seems a
> bit improbable but maybe not impossible.  (You did say you were
> running
> PG 8.0.x, right?  That's the last version without any strong defenses
> against transaction wraparound...)

Yep, this 8.0.4.  It has been running for over a year, fairly heavy
updates, so
I would guess its possible.

> The way to get some facts, instead of speculating, would be to get
> hold
> of the appropriate version of pg_filedump from
> http://sources.redhat.com/rhdb/ and dump out sl_log_1 with it
> (probably the -i option would be sufficient), then take a close look
> at the tuples that aren't visible to other transactions.  (You could
> do "select ctid from sl_log_1" to determine which ones are visible.)
>

Okay, I've grabbed pg_filedump and got it running on the appropriate
server.
I really have No Idea how to read its output though.  Where does the
ctid from sl_log_1
appear in the following listing?


Block    0 ********************************************************
<Header> -----
Block Offset: 0x00000000         Offsets: Lower      20 (0x0014)
Block: Size 8192  Version    2            Upper    8176 (0x1ff0)
LSN:  logid    949 recoff 0xae63b06c      Special  8176 (0x1ff0)
Items:    0                   Free Space: 8156
Length (including item array): 24

BTree Meta Data:  Magic (0x00053162)   Version (2)
                    Root:     Block (1174413)  Level (3)
                    FastRoot: Block (4622)  Level (1)

<Data> ------
Empty block - no items listed

<Special Section> -----
BTree Index Section:
   Flags: 0x0008 (META)
   Blocks: Previous (0)  Next (0)  Level (0)


.../this was taken from the first page file (955960160.0 I guess you
could
call it).  Does this look interesting to you, Tom?

FWIW- this IS on my master DB.  I've been slowly preparing an upgrade
to 8.2, I guess
I'd better get that inta gear, hmmm?  :-(

/kurt



>             regards, tom lane


pgsql-performance by date:

Previous
From: Mike Benoit
Date:
Subject: Re: Volunteer to build a configuration tool
Next
From: Tom Lane
Date:
Subject: Re: Maintenance question / DB size anomaly...