Thread: vacuum does not reclaim rows

vacuum does not reclaim rows

From
Tatsuo Ishii
Date:
Hi,

Does anybody know why vacuum full does not relcaim deleted rows if a
open transaction which started before the deletion happens is running
even on a different database?

Here is an example:

T1: psql db1
T2: psql db2
T3: psql db2

T2: create table t1(i int); insert into t1 values(1);
T1: begin;
T2: delete from t1;
T3: vacuum full t1;

here vacuum will not reclaim deleted rows.

While tracking this, I found that GetOldestXmin(false) checks the
database id correctly:
 if (allDbs || proc->databaseId == MyDatabaseId)

but after that it checks proc->xmin, where xmin may not be running on
the same database. I wonder if this is correct or not. Maybe we should
make sure that xmin is running on the same database if GetOldestXmin()
is called with its arg being set false? This is PostgreSQL 7.3.3.
--
Tatsuo Ishii


Re: vacuum does not reclaim rows

From
Kurt Roeckx
Date:
On Sun, Jul 06, 2003 at 12:00:39AM +0900, Tatsuo Ishii wrote:
> Hi,
> 
> Does anybody know why vacuum full does not relcaim deleted rows if a
> open transaction which started before the deletion happens is running
> even on a different database?

On what do you base that it doesn't reclaim those rows?


Kurt



Re: vacuum does not reclaim rows

From
Tatsuo Ishii
Date:
> > Does anybody know why vacuum full does not relcaim deleted rows if a
> > open transaction which started before the deletion happens is running
> > even on a different database?
> 
> On what do you base that it doesn't reclaim those rows?

test=# vacuum full verbose t1;
INFO:  --Relation public.t1--
INFO:  Pages 1: Changed 1, reaped 0, Empty 0, New 0; Tup 1: Vac 0, Keep/VTL 1/0, UnUsed 0, MinLen 32, MaxLen 32;
Re-using:Free/Avail. Space 8136/8136; EndEmpty/Avail. Pages 0/1.       CPU 0.00s/0.00u sec elapsed 0.00 sec.
 
INFO:  Rel t1: Pages: 1 --> 1; Tuple(s) moved: 0.       CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Here you can see "Pages: 1 --> 1; Tuple(s) moved: 0", it indicates
vacuum actually skips the table and does not reclaim those rows.

Here is a case when vacuum does reclaim.

test=# vacuum full verbose t1;
INFO:  --Relation public.t1--
INFO:  Pages 1: Changed 0, reaped 1, Empty 0, New 0; Tup 0: Vac 1, Keep/VTL 0/0, UnUsed 0, MinLen 0, MaxLen 0;
Re-using:Free/Avail. Space 8168/0; EndEmpty/Avail. Pages 1/0.       CPU 0.00s/0.00u sec elapsed 0.00 sec.
 
INFO:  Rel t1: Pages: 1 --> 0.
VACUUM
--
Tatsuo Ishii


Re: vacuum does not reclaim rows

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> but after that it checks proc->xmin, where xmin may not be running on
> the same database. I wonder if this is correct or not. Maybe we should
> make sure that xmin is running on the same database

How would you know?  (At the time you are looking, it's quite possible
the other guy's xmin doesn't exist anymore.)  In any case you can't just
arbitrarily ignore the other guy's xmin, since it's a proxy for
subsequent transaction IDs as well, and those might be in any database.

It might be possible to do something by having each proc store both
a "local" and a "global" xmin computed as of its current xid start,
but I haven't really thought through the details.  In any case, that
would be extra bookkeeping needed during every transaction start,
so I'd want to see proof of a generally-useful improvement in return.

On the whole I'm against changing this logic ... I think the odds
of breaking something are high, and the odds of making a useful
improvement low ...
        regards, tom lane


Re: vacuum does not reclaim rows

From
Andrew Sullivan
Date:
On Sat, Jul 05, 2003 at 09:51:16PM -0400, Tom Lane wrote:

> but I haven't really thought through the details.  In any case, that
> would be extra bookkeeping needed during every transaction start,
> so I'd want to see proof of a generally-useful improvement in return.

For what it's worth, we have redesigned around this very problem,
because we had a table, vacuumed every 5 minutes, which was always >
50% dead tuples.  Of course, we _were_ able to redesign around it,
but I'm not sure whether we just moved the problem to a new location.
We'll see in the upcoming weeks.  (Our testing says no, but I always
figure there's _some_ strange client case I never thought of.)

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8                                        +1 416 646 3304
x110



Re: vacuum does not reclaim rows

From
Bruce Momjian
Date:
I think the big issue is that people think that if they have no one in a
specific database, that  VACUUM FULL will completely remove unused
space, while this is not true if there are other backends connected to
other databases.

This might be a stupid question, but why does one backend have to care
about the global xmin at all?  Isn't the local xmin the only important
value?

---------------------------------------------------------------------------

Tom Lane wrote:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > but after that it checks proc->xmin, where xmin may not be running on
> > the same database. I wonder if this is correct or not. Maybe we should
> > make sure that xmin is running on the same database
> 
> How would you know?  (At the time you are looking, it's quite possible
> the other guy's xmin doesn't exist anymore.)  In any case you can't just
> arbitrarily ignore the other guy's xmin, since it's a proxy for
> subsequent transaction IDs as well, and those might be in any database.
> 
> It might be possible to do something by having each proc store both
> a "local" and a "global" xmin computed as of its current xid start,
> but I haven't really thought through the details.  In any case, that
> would be extra bookkeeping needed during every transaction start,
> so I'd want to see proof of a generally-useful improvement in return.
> 
> On the whole I'm against changing this logic ... I think the odds
> of breaking something are high, and the odds of making a useful
> improvement low ...
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: vacuum does not reclaim rows

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> This might be a stupid question, but why does one backend have to care
> about the global xmin at all?  Isn't the local xmin the only important
> value?

Not when it's looking at shared tables.

Do you want to try to make the tqual code aware of whether it's looking
at a shared or nonshared table?  I don't honestly think it's worth it...
        regards, tom lane


Re: vacuum does not reclaim rows

From
Bruce Momjian
Date:
Oh, shared tables --- I forgot about those.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > This might be a stupid question, but why does one backend have to care
> > about the global xmin at all?  Isn't the local xmin the only important
> > value?
> 
> Not when it's looking at shared tables.
> 
> Do you want to try to make the tqual code aware of whether it's looking
> at a shared or nonshared table?  I don't honestly think it's worth it...
> 
>             regards, tom lane
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073