Thread: No long-lived transaction, still can't delete tuples

No long-lived transaction, still can't delete tuples

From
Jeffrey Baker
Date:
Hi again informative friends.  I have a new minor problem with my
database running postgresql 7.2.  I vacuum all the tables hourly, to
keep the peak number of tuples low.  I do this so that my toast
tables don't get beyond the amount of free space that the toast
system can track (whereupon it grows out of control).  This usually
works fine.

Today I have noticed that I can't vacuum a table, because all the of
the deleted tuples are marked as "Keep" -- they are still visible to
some transaction.  I do have a system that keeps a connection open
to postgres all the time, so I suspected that.  But, this systems
main loop looks like this:

while(1) {
    $dbh->commit();

    ...


So I believe the transaction is not long-lived.  Also, I checked
with tethereal to make absolutely certain that the commit was
happening:

->    0  5163 6f6d 6d69 7400                       Qcommit.
<-    0  4343 4f4d 4d49 5400 5a                    CCOMMIT.Z
->    0  5162 6567 696e 00                         Qbegin.
<-    0  4342 4547 494e 005a                       CBEGIN.Z

This makes me completely sure that my transactions are NOT staying
open.  However, I still can't vacuum the silly table:

rupert=# vacuum verbose resp_body;
NOTICE:  --Relation resp_body--
NOTICE:  Pages 322: Changed 0, Empty 0; Tup 11287: Vac 0, Keep 8013,
UnUsed 0.
        Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
NOTICE:  --Relation pg_toast_129372505--
NOTICE:  Pages 9070: Changed 0, Empty 0; Tup 37772: Vac 0, Keep
27768, UnUsed 0.
        Total CPU 0.00s/0.03u sec elapsed 0.02 sec.
NOTICE:  Analyzing resp_body
VACUUM

I shut down this long-running system, then I was able to vacuum the
table:

rupert=# vacuum verbose resp_body;
NOTICE:  --Relation resp_body--
NOTICE:  Index resp_body_resp_idx: Pages 34; Tuples 3286: Deleted
8013.
        CPU 0.01s/0.06u sec elapsed 3.77 sec.
NOTICE:  Removed 8013 tuples in 301 pages.
        CPU 0.01s/0.09u sec elapsed 0.34 sec.
NOTICE:  Pages 322: Changed 4, Empty 0; Tup 3286: Vac 8013, Keep 0,
UnUsed 0.
        Total CPU 0.02s/0.16u sec elapsed 4.11 sec.
NOTICE:  --Relation pg_toast_129372505--
NOTICE:  Index pg_toast_129372505_idx: Pages 146; Tuples 10054:
Deleted 27768.
        CPU 0.02s/0.23u sec elapsed 0.26 sec.
NOTICE:  Removed 27768 tuples in 6716 pages.
        CPU 0.39s/1.64u sec elapsed 4.15 sec.
NOTICE:  Pages 9082: Changed 14, Empty 0; Tup 10054: Vac 27768, Keep
0, UnUsed 0.
        Total CPU 0.41s/1.91u sec elapsed 4.44 sec.
NOTICE:  Analyzing resp_body
VACUUM

I really need some help here.  My understanding is that you cannot
vacuum any tuple which was deleted in a transaction number greater
than the oldest open transaction number.  This is a simple and
elegant system, and it seems like it would always work.  But as I
said above I am totally certain that I have only one connection
open continuously and in that connection I am constantly committing
and starting a new transaction.

Please help me reconcile this problem.  Perhaps there is a way I can
query the transaction numbers and live connections at runtime?

-jwb

Re: No long-lived transaction, still can't delete tuples

From
Tom Lane
Date:
Jeffrey Baker <jwbaker@acm.org> writes:
> So I believe the transaction is not long-lived.  Also, I checked
> with tethereal to make absolutely certain that the commit was
> happening:

> ->    0  5163 6f6d 6d69 7400                       Qcommit.
> <-    0  4343 4f4d 4d49 5400 5a                    CCOMMIT.Z
> ->    0  5162 6567 696e 00                         Qbegin.
> <-    0  4342 4547 494e 005a                       CBEGIN.Z

Isn't that BEGIN opening a new transaction?

Some front-end libraries have a bad habit of issuing a BEGIN instantly
after a commit, rather than waiting for the next command to be issued.
That means your app goes to sleep with an open transaction.

            regards, tom lane

Re: No long-lived transaction, still can't delete tuples

From
Jeffrey Baker
Date:
On Wed, Apr 24, 2002 at 06:21:31PM -0400, Tom Lane wrote:
> Jeffrey Baker <jwbaker@acm.org> writes:
> > So I believe the transaction is not long-lived.  Also, I checked
> > with tethereal to make absolutely certain that the commit was
> > happening:
>
> > ->    0  5163 6f6d 6d69 7400                       Qcommit.
> > <-    0  4343 4f4d 4d49 5400 5a                    CCOMMIT.Z
> > ->    0  5162 6567 696e 00                         Qbegin.
> > <-    0  4342 4547 494e 005a                       CBEGIN.Z
>
> Isn't that BEGIN opening a new transaction?

Sure is.  Very next thing.  That's how the Perl DBI operates.

> Some front-end libraries have a bad habit of issuing a BEGIN instantly
> after a commit, rather than waiting for the next command to be issued.
> That means your app goes to sleep with an open transaction.

Right, but it's the *next* transaction it goes to sleep in, correct?
So the time sequence looks like this:

   proc 1 | proc 2

00 connect
01 begin
02 work
03 commit
04 begin
05 work
06 commit
07 begin    connect
08 work     begin
09 commit   delete everything
10 begin    commit
11 work     vacuum <= this should get rid of everything < t = 07
12 commit   disconnect
13 begin
14 work
15 commit

Unless I totally misunderstand.

(BTW in the intervening hours I have updated to 7.2.1 to see if that
changes anything.)

Regards,
Jeffrey Baker

Re: No long-lived transaction, still can't delete tuples

From
Tom Lane
Date:
Jeffrey Baker <jwbaker@acm.org> writes:
> 04 begin
> 05 work
> 06 commit
> 07 begin    connect
> 08 work     begin
> 09 commit   delete everything
> 10 begin    commit
> 11 work     vacuum <= this should get rid of everything < t = 07
> 12 commit   disconnect

Actually that vacuum should get rid of everything older than t=10.
What I'm concerned about is this scenario:

client A            client B        you

begin;
work...
commit;
begin;
take a lunch break...

                begin;
                work...
                commit;


                            connect
                            vacuum;

Until client A returns from his lunch break, you'll not be able to
vacuum the trash that client B generated, even though B has committed
his changes.

            regards, tom lane

Re: No long-lived transaction, still can't delete tuples

From
Lincoln Yeoh
Date:
At 06:59 PM 4/24/02 -0400, Tom Lane wrote:

>Actually that vacuum should get rid of everything older than t=10.
>What I'm concerned about is this scenario:
>
>Until client A returns from his lunch break, you'll not be able to
>vacuum the trash that client B generated, even though B has committed
>his changes.

Does just a BEGIN without anything else hold up vacuum? It doesn't seem to
for 7.1.3. Whereas a BEGIN followed by a select from a table holds up
vacuum once vacuum reaches the relevant table. BEGIN followed by select
(1), causes vacuum to stop with:
ERROR:  Parent tuple was not found

Once begin is rolled back, then vacuum works but you get:
NOTICE:  RegisterSharedInvalid: SI buffer overflow
NOTICE:  InvalidateSharedInvalid: cache state reset

Trying select (1) again gives:
NOTICE:  InvalidateSharedInvalid: cache state reset
  ?column?
----------
         1
(1 row)

Has this behaviour changed for 7.2?

If so the scenario is worse. It affects lots of people who use Perl DBI.

Typical perl webapp scenario:
initialize.
  make DB connection.

Serve requests:
while (wait for http request)
  rollback (implicit begin to get timestamps right)
  Serve up page
  commit/rollback (implicit begin)

Simple perl daemon app scenario:
make DB connection:
while (things ok and things to do)
  rollback (implicit begin)
  do stuff
  commit/rollback (implicit begin)
  sleep.

If just a plain BEGIN holds up vacuum, I won't be able to vacuum without
shutting down ALL my perl apps.

I understand why BEGIN might hold up vacuums, but 7.1 didn't (maybe
erroneously), so people who are migrating should be warned.

How can this problem be fixed? Perl DBI changed? Transactions don't begin
until first statement after BEGIN?

Regards,
Link.


Re: No long-lived transaction, still can't delete tuples

From
Tom Lane
Date:
Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> At 06:59 PM 4/24/02 -0400, Tom Lane wrote:
>> Until client A returns from his lunch break, you'll not be able to
>> vacuum the trash that client B generated, even though B has committed
>> his changes.

> Does just a BEGIN without anything else hold up vacuum? It doesn't seem to
> for 7.1.3. Whereas a BEGIN followed by a select from a table holds up
> vacuum once vacuum reaches the relevant table.

You're confusing obtaining a lock with determining xmin for tuple
removal purposes.

> BEGIN followed by select
> (1), causes vacuum to stop with:
> ERROR:  Parent tuple was not found

Oh?  If you have a repeatable example of that, I'd like to see it.

            regards, tom lane

Re: No long-lived transaction, still can't delete tuples

From
Lincoln Yeoh
Date:
At 09:56 AM 4/25/02 -0400, Tom Lane wrote:
>Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> > At 06:59 PM 4/24/02 -0400, Tom Lane wrote:
> >> Until client A returns from his lunch break, you'll not be able to
> >> vacuum the trash that client B generated, even though B has committed
> >> his changes.
>
> > Does just a BEGIN without anything else hold up vacuum? It doesn't seem to
> > for 7.1.3. Whereas a BEGIN followed by a select from a table holds up
> > vacuum once vacuum reaches the relevant table.
>
>You're confusing obtaining a lock with determining xmin for tuple
>removal purposes.

OK. I get it now.

> > BEGIN followed by select
> > (1), causes vacuum to stop with:
> > ERROR:  Parent tuple was not found
>
>Oh?  If you have a repeatable example of that, I'd like to see it.

Hmm, only happens in one database where I have (for long periods) two
processes continuously inserting/updating data in two separate tables.
Maybe that database is a bit screwed up. I tried creating another database
and doing manual inserts into a table and I can't reproduce it.

well maybe vacuum verbose might help:
VACUUM verbose;
NOTICE:  Skipping "pg_type" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_attribute" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_class" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_group" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_database" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_attrdef" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_trigger" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_inherits" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_operator" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_opclass" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_am" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_amop" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_amproc" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_language" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_largeobject" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_aggregate" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_ipl" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_inheritproc" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_description" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_listener" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_shadow" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_proc" --- only table owner can VACUUM it
NOTICE:  --Relation wordlist--
NOTICE:  Pages 42: Changed 0, reaped 0, Empty 0, New 0; Tup 6313: Vac 0,
Keep/VTL 0/0, Crash
  0, UnUsed 0, MinLen 41, MaxLen 57; Re-using: Free/Avail. Space 0/0;
EndEmpty/Avail. Pages 0
/0. CPU 0.00s/0.00u sec.
NOTICE:  Index wordlist_id_key: Pages 28; Tuples 6313. CPU 0.00s/0.01u sec.
NOTICE:  --Relation pg_toast_29612--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0,
UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0;
EndEmpty/Avail. Pages 0/0. CP
U 0.00s/0.00u sec.
NOTICE:  Index pg_toast_29612_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE:  --Relation arch_ranks_arch4--
NOTICE:  Pages 137: Changed 2, reaped 135, Empty 0, New 0; Tup 6892: Vac 0,
Keep/VTL 31/31,
Crash 0, UnUsed 4238, MinLen 88, MaxLen 1138; Re-using: Free/Avail. Space
10412/2676; EndEmp
ty/Avail. Pages 0/24. CPU 0.00s/0.00u sec.
NOTICE:  Rel arch_ranks_arch4: Pages: 137 --> 137; Tuple(s) moved: 0. CPU
0.01s/0.07u sec.
NOTICE:  --Relation pg_toast_102844--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0,
UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0;
EndEmpty/Avail. Pages 0/0. CP
U 0.00s/0.00u sec.
NOTICE:  Index pg_toast_102844_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE:  --Relation arch_ranks_arch6--
NOTICE:  Pages 319: Changed 2, reaped 318, Empty 0, New 0; Tup 10529: Vac
0, Keep/VTL 30/30,
  Crash 0, UnUsed 8628, MinLen 88, MaxLen 1143; Re-using: Free/Avail. Space
13548/3488; EndEm
pty/Avail. Pages 0/24. CPU 0.01s/0.01u sec.
ERROR:  Parent tuple was not found

Another try:
VACUUM verbose;
NOTICE:  Skipping "pg_type" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_attribute" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_class" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_group" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_database" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_attrdef" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_trigger" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_inherits" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_index" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_operator" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_opclass" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_am" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_amop" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_amproc" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_language" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_largeobject" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_aggregate" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_ipl" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_inheritproc" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_description" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_listener" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_shadow" --- only table owner can VACUUM it
NOTICE:  Skipping "pg_proc" --- only table owner can VACUUM it
NOTICE:  --Relation wordlist--
NOTICE:  Pages 42: Changed 0, reaped 0, Empty 0, New 0; Tup 6313: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 41, MaxLen 57; Re-using:
Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.01s/0.00u sec.
NOTICE:  Index wordlist_id_key: Pages 28; Tuples 6313. CPU 0.00s/0.01u sec.
NOTICE:  --Relation pg_toast_29612--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail.
Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  Index pg_toast_29612_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE:  --Relation arch_ranks_arch4--
NOTICE:  Pages 137: Changed 2, reaped 135, Empty 0, New 0; Tup 6867: Vac 0,
Keep/VTL 6/6, Crash 0, UnUsed 4266, MinLen 88, MaxLen 1138; Re-using:
Free/Avail. Space 13412/3076; EndEmpty/Avail. Pages 0/26. CPU 0.00s/0.00u sec.
ERROR:  Parent tuple was not found

Another try, narrowing vacuum to one table:
session #1: rollback ;begin; select (1);

session #2: VACUUM verbose arch_ranks_arch4;
NOTICE:  --Relation arch_ranks_arch4--
NOTICE:  Pages 137: Changed 2, reaped 135, Empty 0, New 0; Tup 6867: Vac 0,
Keep/VTL 6/6, Crash 0, UnUsed 4266, MinLen 88, MaxLen 1138; Re-using:
Free/Avail. Space 13640/3136; EndEmpty/Avail. Pages 0/26. CPU 0.00s/0.01u sec.
ERROR:  Parent tuple was not found

session #1: rollback;
session #2: VACUUM verbose arch_ranks_arch4;
NOTICE:  --Relation arch_ranks_arch4--
NOTICE:  Pages 138: Changed 1, reaped 135, Empty 0, New 0; Tup 6861: Vac
101, Keep/VTL 0/0, Crash 0, UnUsed 4212, MinLen 88, MaxLen 1138; Re-using:
Free/Avail. Space 19516/15948; EndEmpty/Avail. Pages 0/32. CPU 0.01s/0.00u sec.
NOTICE:  Rel arch_ranks_arch4: Pages: 138 --> 136; Tuple(s) moved: 105. CPU
0.00s/0.02u sec.
NOTICE:  --Relation pg_toast_102844--
NOTICE:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail.
Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE:  Index pg_toast_102844_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
VACUUM

\d arch_ranks_arch4
              Table "arch_ranks_arch4"
   Attribute  |           Type           | Modifier
-------------+--------------------------+----------
  id          | integer                  |
  updated     | timestamp with time zone |
  valid       | integer                  |
  name        | text                     |
  specialty   | text                     |
  status      | text                     |
  ranking     | integer                  |
  power       | integer                  |
  land        | integer                  |
  forts       | integer                  |
  description | text                     |

Regards,
Link.