Thread: BUG #10155: BUG? Can't remove newly generated tuples after repeatable read transaction start.

The following bug has been logged on the website:

Bug reference:      10155
Logged by:          digoal.zhou
Email address:      digoal@126.com
PostgreSQL version: 9.3.3
Operating system:   CentOS 6.4 x64
Description:

SESSION A :
digoal=# begin isolation level repeatable read;
BEGIN
digoal=# \dt
                 List of relations
 Schema |         Name          | Type  |  Owner
--------+-----------------------+-------+----------
 public | stat_pg_stat_database | table | postgres
 public | tbl_cost_align        | table | postgres
 public | test                  | table | postgres
 public | ttt                   | table | postgres
 public | tttt                  | table | postgres
 public | ttttt                 | table | postgres
(6 rows)

SESSION B :
digoal=# create table t as select * from pg_class;
SELECT 306

SESSION A :
Table t does not appear here; A can't see t's metadata in pg_class,
pg_attribute, and so on.
digoal=# \dt
                 List of relations
 Schema |         Name          | Type  |  Owner
--------+-----------------------+-------+----------
 public | stat_pg_stat_database | table | postgres
 public | tbl_cost_align        | table | postgres
 public | test                  | table | postgres
 public | ttt                   | table | postgres
 public | tttt                  | table | postgres
 public | ttttt                 | table | postgres
(6 rows)

SESSION B :
but B can't reclaim the dead rows in table t.
Why?
I thought PostgreSQL only failed to reclaim tuples that already existed before
the repeatable read transaction started; here t's tuples were created after
session A's transaction began, so why can't they be reclaimed?


digoal=# delete from t;
DELETE 306
digoal=# vacuum freeze verbose t;
INFO:  vacuuming "public.t"
INFO:  "t": found 0 removable, 306 nonremovable row versions in 2 out of 2
pages
DETAIL:  306 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_33578"
INFO:  index "pg_toast_33578_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_33578": found 0 removable, 0 nonremovable row versions in 0
out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM


When session A's transaction ends, session B can reclaim these dead tuples.
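For example (a sketch of the expected continuation; the VACUUM output below
is abbreviated and assumed, not captured from the original run):

SESSION A :
digoal=# commit;
COMMIT

SESSION B :
digoal=# vacuum freeze verbose t;
INFO:  vacuuming "public.t"
INFO:  "t": found 306 removable, 0 nonremovable row versions in 2 out of 2 pages
VACUUM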
On 04/28/2014 11:37 AM, digoal@126.com wrote:
> SESSION B :
> but B can't reclaim the dead rows in table t.
> Why?
> I thought PostgreSQL only failed to reclaim tuples that already existed before
> the repeatable read transaction started; here t's tuples were created after
> session A's transaction began, so why can't they be reclaimed?

I think what you're arguing is that the system should be smarter and be
able to reclaim the dead tuples. Because session A began before the
table was even created, and there are no other backends that would need
to see them either, they could indeed be safely vacuumed. The system
just isn't smart enough to distinguish the case.

The short answer is that such an optimization just doesn't exist in
PostgreSQL. It's certainly not a bug.

The long answer is that actually, even though the table was created
after the transaction in session A began, session A *can* access the
table. Schema changes don't follow the normal MVCC rules. If you do
"SELECT * FROM t" in session A, it will work. However, the rows still
won't be visible, to sessin A, because they were inserted after the
snapshot was taken, so they could still be vacuumed if the system
tracked the snapshots more carefully and was able to deduce that. But
the fact that a new table was created is not relevant.
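
To illustrate (a sketch of what one would expect under these rules, not
output from the original report):

SESSION A :
digoal=# select count(*) from t;
 count
-------
     0
(1 row)

The query succeeds because locating the table bypasses MVCC, but it counts
zero rows because all of t's rows postdate session A's snapshot.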

- Heikki
Hi,
   This is really bad for pg_dump when a database is very big,
   because pg_dump also uses repeatable read isolation: while pg_dump is
   backing up the database, the database will bloat.
   Can we optimize it?


--
Public welfare is a lifelong commitment. I'm Digoal, Just Do It.




On 04/29/2014 02:18 AM, 德哥 wrote:
> Hi,
>     This is really bad for pg_dump when a database is very big,
>     because pg_dump also uses repeatable read isolation: while pg_dump is
>     backing up the database, the database will bloat.

Yep. One approach is to take a filesystem-level backup (i.e. with
pg_start/stop_backup() or pg_basebackup), start a second server from
that backup, and run pg_dump against that.
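
A minimal sketch of the low-level variant, with a backup label and a copy
step chosen here purely for illustration:

digoal=# select pg_start_backup('nightly', true);
-- copy the data directory (plus the WAL needed for recovery) at the
-- filesystem level
digoal=# select pg_stop_backup();

Then start a second postmaster on the copied data directory, on another
port, and point pg_dump at that server; the long-lived snapshot then bloats
the copy instead of the production database.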

>     Can we optimize it?

Not easily. I'd love to do something about it, but it's going to be a
long project.
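
In the meantime, you can at least watch for long-open transactions that
hold back cleanup; a sketch (the one-hour threshold is an arbitrary choice):

digoal=# select pid, xact_start, state, query
digoal-#   from pg_stat_activity
digoal-#  where xact_start < now() - interval '1 hour'
digoal-#  order by xact_start;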

- Heikki