Thread: truncate a table instead of vacuum full when count(*) is 0

truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:

Hello to all,

I have a table that is used as a spool for various events. Some processes
write data into it, and another process reads the resulting rows, does some
work, and deletes the rows that were just processed.

As you can see, with hundreds of thousands of events a day, this table needs
to be vacuumed regularly to avoid taking too much space (data and
indexes).

Note that processing rows is in fact quite fast, so at any time a
count(*) on this table rarely returns more than 10-20 rows.


For the indexes, a good way to bring them to a size corresponding to the
actual count(*) is to run 'reindex'.

But for the data (dead rows), even running a vacuum analyze every day is
not enough, and it doesn't truncate the empty pages at the end, so the data
size remains on the order of 200-300 MB when only a few live rows
are there.

I see in the 8.3 list of coming changes that the FSM will try to re-use
pages in a better way, to help truncate empty pages. Is this correct?

Running a vacuum full is a solution for now, but it locks the table for
too long (10 minutes or so), which is not acceptable in this case, since
events should be processed in less than 10 seconds.

So, I would like to truncate the table when the number of rows reaches 0
(just after the table was processed, and just before some new rows are
added).

Is there an easy way to do this under psql? For example: lock the table, do
a count(*), and if the result is 0 rows, truncate the table, then unlock it
(a kind of atomic 'truncate table if count(*) == 0').
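
Something along these lines is what I have in mind (a rough sketch with a
hypothetical table name; the count check has to happen client-side, since
plain SQL has no conditional):

BEGIN;
LOCK TABLE event_spool IN ACCESS EXCLUSIVE MODE;
SELECT count(*) FROM event_spool;  -- the client inspects the result
-- if the count was 0:
TRUNCATE TABLE event_spool;
COMMIT;  -- otherwise, ROLLBACK;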

Would this work, and what would be the steps?

Thanks

Nicolas

Re: truncate a table instead of vacuum full when count(*) is 0

From
Guillaume Cottenceau
Date:
Pomarede Nicolas <npomarede 'at' corp.free.fr> writes:

> Hello to all,
>
> I have a table that is used as a spool for various events. Some
> processes write data into it, and another process reads the resulting
> rows, does some work, and deletes the rows that were just processed.
>
> As you can see, with hundreds of thousands of events a day, this table
> needs to be vacuumed regularly to avoid taking too much space
> (data and index).
>
> Note that processing rows is quite fast in fact, so at any time a
> count(*) on this table rarely exceeds 10-20 rows.
>
>
> For the indexes, a good way to bring them to a size corresponding to
> the actual count(*) is to run 'reindex'.
>
> But for the data (dead rows), even running a vacuum analyze every day
> is not enough, and doesn't truncate some empty pages at the end, so
> the data size remains in the order of 200-300 MB, when only a few
> effective rows are there.

As far as I know, you probably need to increase your
max_fsm_pages, because your pg is probably not able to properly
track unused pages between subsequent VACUUMs.

http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-FSM

Have you investigated this? It seems that you already know about
the FSM stuff, according to your question about FSM and 8.3.

You can also run VACUUM ANALYZE more frequently (after all, it
doesn't lock the table).

--
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36

Re: truncate a table instead of vacuum full when count(*) is 0

From
Heikki Linnakangas
Date:
Pomarede Nicolas wrote:
> But for the data (dead rows), even running a vacuum analyze every day is
> not enough, and doesn't truncate some empty pages at the end, so the
> data size remains in the order of 200-300 MB, when only a few effective
> rows are there.

For a table like that you should run VACUUM much more often than once a
day. Turn on autovacuum, or set up a cron script etc. to run it every 15
minutes or so.
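
For the record, on 8.1 turning autovacuum on is a postgresql.conf change
along these lines (a sketch; the stats collector settings are prerequisites
for autovacuum on that version):

# postgresql.conf (8.1.x) -- sketch
stats_start_collector = on   # autovacuum needs the stats collector
stats_row_level = on         # and row-level statistics
autovacuum = on
autovacuum_naptime = 60      # seconds between autovacuum checks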

> Running a vacuum full is a solution for now, but it locks the table for
> too long (10 minutes or so), which is not acceptable in that case, since
> events should be processed in less than 10 seconds.
>
> So, I would like to truncate the table when the number of rows reaches 0
> (just after the table was processed, and just before some new rows are
> added).
>
> Is there an easy way to do this under psql ? For example, lock the
> table, do a count(*), if result is 0 row then truncate the table, unlock
> the table (a kind of atomic 'truncate table if count(*) == 0').
>
> Would this work and what would be the steps ?

It should work, just like you describe it, with the caveat that TRUNCATE
will remove any old row versions that might still be visible to an older
transaction running in serializable mode. It sounds like it's not a
problem in your scenario, but it's hard to say for sure without seeing
the application. Running vacuum more often is probably a simpler and
better solution, anyway.

Which version of PostgreSQL is this?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:
On Tue, 8 May 2007, ismo.tuononen@solenovo.fi wrote:

>
>
> On Tue, 8 May 2007, Pomarede Nicolas wrote:
>
>> As you can see, with hundreds of thousands of events a day, this table will need
>> to be vacuumed regularly to avoid taking too much space (data and index).
>>
>> Note that processing rows is quite fast in fact, so at any time a count(*) on
>> this table rarely exceeds 10-20 rows.
>>
>> For the indexes, a good way to bring them to a size corresponding to the
>> actual count(*) is to run 'reindex'.
>
> Why do you have indexes on a table where there are only 10-20 rows?
>
> Are those indexes there to prevent duplicate rows?

I need these indexes to sort the rows to be processed in chronological
order. I'm also using an index on 'oid' to delete a row after it has been
processed (I could use a unique sequence too, but I think it would be the
same).

Also, I sometimes have peaks that insert lots of data in a short time, so
an index on the event's date is useful.
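
For illustration, the layout is roughly the following (a sketch; the table
and index names here are hypothetical):

-- an index on the event date, so rows can be fetched in chronological order
CREATE INDEX event_spool_date_idx ON event_spool (event_date);
-- an index on the system column oid, used to delete each row once processed
CREATE INDEX event_spool_oid_idx ON event_spool (oid);

-- processing fetches rows in chronological order...
SELECT oid, * FROM event_spool ORDER BY event_date;
-- ...and deletes each row by its oid once it has been handled
DELETE FROM event_spool WHERE oid = 123456;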

And as the number of live rows compared to the number of dead rows is only
1%, doing a count(*), for example, takes many seconds even if the result is
10 rows (because pg will sequentially scan all the data pages of the
table). Without an index on the date, I would need a sequential scan to
fetch the rows to process, and this would be slower due to the high number
of dead rows.

>
> I have some tables just to store unprocessed data, and because there are
> only a few rows and I always process all of them, there is no need for
> indexes. There is just a column named id, and when I insert a row I take
> nextval('id_seq'):
>
> insert into some_tmp_table (id, message, ...) values (nextval('id_seq'),
> 'do something', ...);
>
> I know that deleting is slower than with indexes, but it's still fast
> enough, because all rows are in memory.
>
> And that id column is just for deletes; it's unique and I can always
> delete using only it.
>
> Ismo

Nicolas

Re: truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:
On Tue, 8 May 2007, Guillaume Cottenceau wrote:

> Pomarede Nicolas <npomarede 'at' corp.free.fr> writes:
>
>> Hello to all,
>>
>> I have a table that is used as a spool for various events. Some
>> processes write data into it, and another process reads the resulting
>> rows, does some work, and deletes the rows that were just processed.
>>
>> As you can see, with hundreds of thousands of events a day, this table
>> needs to be vacuumed regularly to avoid taking too much space
>> (data and index).
>>
>> Note that processing rows is quite fast in fact, so at any time a
>> count(*) on this table rarely exceeds 10-20 rows.
>>
>>
>> For the indexes, a good way to bring them to a size corresponding to
>> the actual count(*) is to run 'reindex'.
>>
>> But for the data (dead rows), even running a vacuum analyze every day
>> is not enough, and doesn't truncate some empty pages at the end, so
>> the data size remains in the order of 200-300 MB, when only a few
>> effective rows are there.
>
> As far as I know, you probably need to increase your
> max_fsm_pages, because your pg is probably not able to properly
> track unused pages between subsequent VACUUMs.
>
> http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-FSM
>
> Have you investigated this? It seems that you already know about
> the FSM stuff, according to your question about FSM and 8.3.
>
> You can also run VACUUM ANALYZE more frequently (after all, it
> doesn't lock the table).

Thanks, but max_fsm_pages is already set to a large enough value (I run a
vacuum analyze on the whole database every day and set max_fsm_pages
according to the last lines of the vacuum output, so all pages are tracked
in the FSM).
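
For reference, the checks look something like this (a sketch; the figures
to compare against come from the tail of a database-wide VACUUM VERBOSE):

-- current FSM settings; raising them means editing postgresql.conf and
-- restarting the server
SHOW max_fsm_pages;
SHOW max_fsm_relations;
-- the last lines of "VACUUM VERBOSE;" on the whole database report how many
-- page slots are needed; max_fsm_pages should comfortably exceed that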


Nicolas


Re: truncate a table instead of vacuum full when count(*) is 0

From
Gregory Stark
Date:
"Pomarede Nicolas" <npomarede@corp.free.fr> writes:

> But for the data (dead rows), even running a vacuum analyze every day is not
> enough, and doesn't truncate some empty pages at the end, so the data size
> remains in the order of 200-300 MB, when only a few effective rows are there.

Try running vacuum more frequently. Once per day isn't very frequent for
vacuum; every 30 or 60 minutes isn't uncommon. For your situation you might
even consider running it continuously in a loop.

> I see in the 8.3 list of coming changes that the FSM will try to re-use pages
> in a better way to help truncating empty pages. Is this correct ?

There are several people working on improvements to vacuum, but it's not
clear right now exactly what we'll end up with. I think most of the directly
vacuum-related changes wouldn't actually help you either.

The one that would help you is named "HOT". If you're interested in
experimenting with an experimental patch, you could consider taking CVS,
applying HOT, and seeing how it affects you. Or, if you see an announcement
that it's been committed, taking a beta and experimenting with it before the
8.3 release could be interesting. Experiments with real-world databases can
be very helpful for developers, since it's hard to construct truly realistic
benchmarks.

> So, I would like to truncate the table when the number of rows reaches 0 (just
> after the table was processed, and just before some new rows are added).
>
> Is there an easy way to do this under psql ? For example, lock the table, do a
> count(*), if result is 0 row then truncate the table, unlock the table (a kind
> of atomic 'truncate table if count(*) == 0').
>
> Would this work and what would be the steps ?

It would work, but you may end up keeping the lock for longer than you're
happy with. Another option to consider would be to use CLUSTER instead of
VACUUM FULL, though the 8.2 CLUSTER wasn't entirely MVCC-safe, and I think
in your situation that might actually be a problem: it would cause
transactions that started before the CLUSTER (but didn't access the table
before it) not to see any records after the CLUSTER.
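
A sketch with the pre-8.3 syntax, reusing the hypothetical names from
earlier in the thread:

-- pre-8.3 syntax is "CLUSTER indexname ON tablename"; it rewrites the table
-- in index order and returns the freed space to the OS, but holds an
-- exclusive lock on the table while it runs
CLUSTER event_spool_date_idx ON event_spool;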

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> But for the data (dead rows), even running a vacuum analyze every day is
>> not enough, and doesn't truncate some empty pages at the end, so the data
>> size remains in the order of 200-300 MB, when only a few effective rows are
>> there.
>
> For a table like that you should run VACUUM much more often than once a day.
> Turn on autovacuum, or set up a cron script etc. to run it every 15 minutes
> or so.

Yes, I already do this on another spool table; I run a vacuum after
processing it, but I wondered if there was another way to keep the disk
size low for this table.

As for autovacuum, the analyze/vacuum threshold values are not suited to my
situation: I have some big tables that I prefer to keep frequently vacuumed
to prevent them from growing on disk, but the number of inserts/updates on
them is not high enough, so autovacuum would not run often enough for them.
Instead of configuring autovacuum on a per-table basis, I prefer running a
vacuum on the whole database every day.
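
For reference, per-table thresholds on 8.1 mean inserting a row into the
pg_autovacuum system catalog; a sketch, assuming the 8.1 catalog layout and
the hypothetical spool table from earlier:

-- have autovacuum consider the table after ~50 dead rows regardless of its
-- size; -1 in the cost columns means "use the server defaults"
INSERT INTO pg_autovacuum
  (vacrelid, enabled,
   vac_base_thresh, vac_scale_factor,
   anl_base_thresh, anl_scale_factor,
   vac_cost_delay, vac_cost_limit)
VALUES ('event_spool'::regclass, true, 50, 0.0, 50, 0.0, -1, -1);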



>
>> Running a vacuum full is a solution for now, but it locks the table for too
>> long (10 minutes or so), which is not acceptable in that case, since events
>> should be processed in less than 10 seconds.
>>
>> So, I would like to truncate the table when the number of rows reaches 0
>> (just after the table was processed, and just before some new rows are
>> added).
>>
>> Is there an easy way to do this under psql ? For example, lock the table,
>> do a count(*), if result is 0 row then truncate the table, unlock the table
>> (a kind of atomic 'truncate table if count(*) == 0').
>>
>> Would this work and what would be the steps ?
>
> It should work, just like you describe it, with the caveat that TRUNCATE will
> remove any old row versions that might still be visible to an older
> transaction running in serializable mode. It sounds like it's not a problem
> in your scenario, but it's hard to say for sure without seeing the
> application. Running vacuum more often is probably a simpler and better
> solution, anyway.
>
> Which version of PostgreSQL is this?

Shouldn't locking the table prevent this? I mean, if I try to get an
exclusive lock on the table, shouldn't I get one only when there's no older
transaction, in which case I can truncate the table safely, knowing that no
one is accessing it thanks to the lock?

The pg version is 8.1.2 (not the latest, I know, but migrating this
database is quite complicated since it needs to be up 24 hours a day).

thanks

Nicolas


Re: truncate a table instead of vacuum full when count(*) is 0

From
ismo.tuononen@solenovo.fi
Date:

On Tue, 8 May 2007, Pomarede Nicolas wrote:

> As you can see, with hundreds of thousands of events a day, this table will need
> to be vacuumed regularly to avoid taking too much space (data and index).
>
> Note that processing rows is quite fast in fact, so at any time a count(*) on
> this table rarely exceeds 10-20 rows.
>
> For the indexes, a good way to bring them to a size corresponding to the
> actual count(*) is to run 'reindex'.

Why do you have indexes on a table where there are only 10-20 rows?

Are those indexes there to prevent duplicate rows?

I have some tables just to store unprocessed data, and because there are
only a few rows and I always process all of them, there is no need for
indexes. There is just a column named id, and when I insert a row I take
nextval('id_seq'):

insert into some_tmp_table (id, message, ...) values (nextval('id_seq'),
'do something', ...);

I know that deleting is slower than with indexes, but it's still fast
enough, because all the rows are in memory.

And that id column is just for deletes; it's unique and I can always
delete using only it.

Ismo

Re: truncate a table instead of vacuum full when count(*) is 0

From
Heikki Linnakangas
Date:
Pomarede Nicolas wrote:
> On Tue, 8 May 2007, Heikki Linnakangas wrote:
>> Pomarede Nicolas wrote:
>>> But for the data (dead rows), even running a vacuum analyze every day
>>> is not enough, and doesn't truncate some empty pages at the end, so
>>> the data size remains in the order of 200-300 MB, when only a few
>>> effective rows are there.
>>
>> For a table like that you should run VACUUM much more often than once
>> a day. Turn on autovacuum, or set up a cron script etc. to run it
>> every 15 minutes or so.
>
> Yes, I already do this on another spool table ; I run a vacuum after
> processing it, but I wondered if there was another way to keep the disk
> size low for this table.

How much concurrent activity is there in the database? Running a vacuum
right after processing the table would not remove the deleted tuples if
there's another transaction running at the same time. Running the vacuum a
few minutes later might help with that. You should run VACUUM VERBOSE to see
how many non-removable dead tuples there are.

>>> Is there an easy way to do this under psql ? For example, lock the
>>> table, do a count(*), if result is 0 row then truncate the table,
>>> unlock the table (a kind of atomic 'truncate table if count(*) == 0').
>>>
>>> Would this work and what would be the steps ?
>>
>> It should work, just like you describe it, with the caveat that
>> TRUNCATE will remove any old row versions that might still be visible
>> to an older transaction running in serializable mode. It sounds like
>> it's not a problem in your scenario, but it's hard to say for sure
>> without seeing the application. Running vacuum more often is probably
>> a simpler and better solution, anyway.
>
> Shouldn't locking the table prevent this ? I mean, if I try to get an
> exclusive lock on the table, shouldn't I get one only when there's no
> older transaction, and in that case I can truncate the table safely,
> knowing that no one is accessing it due to the lock ?

Serializable transactions that started before the transaction that takes
the lock would need to see the old row versions:

Xact 1: BEGIN ISOLATION LEVEL SERIALIZABLE;
Xact 1: SELECT 1; -- To take a snapshot, perform any query
Xact 2: DELETE FROM foo;
Xact 2: COMMIT;
Xact 3: BEGIN;
Xact 3: LOCK TABLE foo;
Xact 3: SELECT COUNT(*) FROM foo; -- sees the delete by xact 2, returns 0
Xact 3: TRUNCATE foo;
Xact 3: COMMIT;
Xact 1: SELECT COUNT(*) FROM foo; -- Returns 0, but because the
transaction is in serializable mode, it should've still seen the rows
deleted by xact 2.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: truncate a table instead of vacuum full when count(*) is 0

From
Gregory Stark
Date:
"Pomarede Nicolas" <npomarede@corp.free.fr> writes:

> Yes, I already do this on another spool table ; I run a vacuum after processing
> it, but I wondered if there was another way to keep the disk size low for this
> table.

"after processing it" might be too soon if there are still transactions around
that are a few minutes old and predate you committing after processing it.

But any table that receives as many deletes or updates as these tables do will
need to be vacuumed on the order of minutes, not days.

>> It should work, just like you describe it, with the caveat that TRUNCATE will
>> remove any old row versions that might still be visible to an older
>> transaction running in serializable mode.
>
> Shouldn't locking the table prevent this ? I mean, if I try to get an exclusive
> lock on the table, shouldn't I get one only when there's no older transaction,
> and in that case I can truncate the table safely, knowing that no one is
> accessing it due to the lock ?

It would arise if a transaction starts before you take the lock but hasn't
looked at the table yet. Then the LOCK TABLE succeeds, you truncate the
table and commit, and then the old transaction gets around to looking at
the table.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> On Tue, 8 May 2007, Heikki Linnakangas wrote:
>>> Pomarede Nicolas wrote:
>>>> But for the data (dead rows), even running a vacuum analyze every day is
>>>> not enough, and doesn't truncate some empty pages at the end, so the data
>>>> size remains in the order of 200-300 MB, when only a few effective rows
>>>> are there.
>>>
>>> For a table like that you should run VACUUM much more often than once a
>>> day. Turn on autovacuum, or set up a cron script etc. to run it every 15
>>> minutes or so.
>>
>> Yes, I already do this on another spool table ; I run a vacuum after
>> processing it, but I wondered if there was another way to keep the disk
>> size low for this table.
>
> How much concurrent activity is there in the database? Running a vacuum right
> after processing it would not remove the deleted tuples if there's another
> transaction running at the same time. Running the vacuum a few minutes later
> might help with that. You should run VACUUM VERBOSE to see how many
> non-removable dead tuples there is.
>

There's not too much simultaneous activity on the database; most of the
time a transaction shouldn't exceed one minute (worst case). Except that, as
I need to run a vacuum analyze on the whole database every day, the vacuum
now takes 8 hours (I changed the vacuum cost settings to make it a little
slower rather than take too much I/O and make the database unusable; with
the default vacuum values it takes 3-4 hours of high I/O usage, the total
database being 20 GB).

So, at this time, the complete vacuum is running, and vacuuming only the
spool table reports that all dead rows are currently not removable (which
is expected).

I will run it again later, when the complete vacuum is over, to see how
the pages are affected.


Nicolas



Re: truncate a table instead of vacuum full when count(*) is 0

From
Heikki Linnakangas
Date:
Pomarede Nicolas wrote:
> There's not too much simultaneous transaction on the database, most of
> the time it shouldn't exceed one minute (worst case). Except, as I need
> to run a vacuum analyze on the whole database every day, it now takes 8
> hours to do the vacuum (I changed vacuum values to be a little slower
> instead of taking too much i/o and making the base unusable, because
> with default vacuum values it takes 3-4 hours of high i/o usage (total
> base is 20 GB) ).
>
> So, at this time, the complete vacuum is running, and vacuuming only the
> spool table gives all dead rows are currently not removable (which is
> normal).

Oh, I see. I know you don't want to upgrade, but that was changed in
8.2. Vacuum now ignores concurrent vacuums in the oldest xid
calculation, so the long-running vacuum won't stop the vacuum on the
spool table from removing dead rows.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: truncate a table instead of vacuum full when count(*) is 0

From
Guillaume Cottenceau
Date:
Heikki Linnakangas <heikki 'at' enterprisedb.com> writes:

> Pomarede Nicolas wrote:
> > But for the data (dead rows), even running a vacuum analyze every
> > day is not enough, and doesn't truncate some empty pages at the end,
> > so the data size remains in the order of 200-300 MB, when only a few
> > effective rows are there.
>
> For a table like that you should run VACUUM much more often than once
> a day. Turn on autovacuum, or set up a cron script etc. to run it
> every 15 minutes or so.

Heikki, is there a theoretical need for frequent VACUUM when max_fsm_pages
is large enough to hold references to all the dead rows?

VACUUM documentation says: "tuples that are deleted or obsoleted
by an update are not physically removed from their table; they
remain present until a VACUUM is done".

Free Space Map documentation says: "the shared free space map
tracks the locations of unused space in the database. An
undersized free space map may cause the database to consume
increasing amounts of disk space over time, because free space
that is not in the map cannot be re-used".

I am not sure of the relationship between these two statements. Are the
deleted/obsoleted tuples stored in the FSM, so that the occupied space is
actually reused before a VACUUM is performed, or is something else
happening? Maybe the FSM only stores references to disk pages containing
nothing but dead rows, and that's the difference I've been missing?

--
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36

Re: truncate a table instead of vacuum full when count(*) is 0

From
Heikki Linnakangas
Date:
Guillaume Cottenceau wrote:
> Heikki, is there a theoretical need for frequent VACUUM when
> max_fsm_pages is large enough to hold references to all the dead rows?

Not really, if you don't mind that your table with 10 rows takes hundreds
of megabytes on disk. If max_fsm_pages is large enough, the table size will
reach a steady state and won't grow further. Depending on your scenario,
that might be totally acceptable.

> VACUUM documentation says: "tuples that are deleted or obsoleted
> by an update are not physically removed from their table; they
> remain present until a VACUUM is done".
>
> Free Space Map documentation says: "the shared free space map
> tracks the locations of unused space in the database. An
> undersized free space map may cause the database to consume
> increasing amounts of disk space over time, because free space
> that is not in the map cannot be re-used".
>
> I am not sure of the relationship between these two statements.
> Are these deleted/obsoleted tuples stored in the FSM and actually
> the occupied space is reused before a VACUUM is performed, or is
> something else happening? Maybe the FSM is only storing a
> reference to disk pages containing only dead rows, and that's the
> difference I've been missing?

FSM stores information on how much free space there is on each page.
Deleted but not yet vacuumed tuples don't count as free space. If a page
is full of dead tuples, it's not usable for inserting new tuples, and
it's not recorded in the FSM.

When vacuum runs, it physically removes tuples from the table and frees
the space occupied by them. At the end it updates the FSM.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: truncate a table instead of vacuum full when count(*) is 0

From
Pomarede Nicolas
Date:
On Tue, 8 May 2007, Heikki Linnakangas wrote:

> Pomarede Nicolas wrote:
>> There's not too much simultaneous transaction on the database, most of the
>> time it shouldn't exceed one minute (worst case). Except, as I need to run
>> a vacuum analyze on the whole database every day, it now takes 8 hours to
>> do the vacuum (I changed vacuum values to be a little slower instead of
>> taking too much i/o and making the base unusable, because with default
>> vacuum values it takes 3-4 hours of high i/o usage (total base is 20 GB) ).
>>
>> So, at this time, the complete vacuum is running, and vacuuming only the
>> spool table gives all dead rows are currently not removable (which is
>> normal).
>
> Oh, I see. I know you don't want to upgrade, but that was changed in 8.2.
> Vacuum now ignores concurrent vacuums in the oldest xid calculation, so the
> long-running vacuum won't stop the vacuum on the spool table from removing
> dead rows.

Well, this concurrent vacuum behaviour is very interesting; I hadn't
noticed it in 8.2, but it would really help here to vacuum this spool table
frequently and have dead rows removed while the 'big' vacuum is running.
It seems I will have to consider migrating to 8.2 then :)


Anyway, now that my vacuum is over, I can vacuum the spool table and see
the results:

before: 6422 pages for the data and 1700 pages for the indexes.

after vacuum analyze: 6422 data pages / 1700 index pages
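
(Page counts like these can be read from pg_class, e.g.:

SELECT relname, relpages
FROM pg_class
WHERE relname IN ('mysql_spool', 'pk_mysql_spool');

relpages is measured in 8 kB blocks and is refreshed by VACUUM and ANALYZE.)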


Here's the log for the vacuum:

fbxtv=# vacuum analyze verbose mysql_spool ;
INFO:  vacuuming "public.mysql_spool"
INFO:  index "pk_mysql_spool" now contains 21 row versions in 1700 pages
DETAIL:  7759 index row versions were removed.
1696 index pages have been deleted, 1667 are currently reusable.
CPU 0.01s/0.00u sec elapsed 1.78 sec.
INFO:  "mysql_spool": removed 7759 row versions in 1521 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 4.88 sec.
INFO:  "mysql_spool": found 7759 removable, 21 nonremovable row versions
in 6422 pages
DETAIL:  20 dead row versions cannot be removed yet.
There were 261028 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.01u sec elapsed 25.90 sec.
INFO:  vacuuming "pg_toast.pg_toast_386146338"
INFO:  index "pg_toast_386146338_index" now contains 0 row versions in 1
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  "pg_toast_386146338": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  analyzing "public.mysql_spool"
INFO:  "mysql_spool": scanned 3000 of 6422 pages, containing 0 live rows
and 14 dead rows; 0 rows in sample, 0 estimated total rows
VACUUM


So far, so good: nearly all rows are marked as dead and removable. But
then, if I do 'select ctid, * from mysql_spool', I can see ctid values in
the range 5934, 5935, 6062, ...

Isn't it possible for postgres to start using pages 0, 1, 2, ... after the
vacuum, which would mean that after a few minutes all the high-numbered
pages would be completely free and could be truncated when the next vacuum
is run?

Actually, if I run another vacuum, some more dead rows are added to the
list of removable rows, but I can never reach the point where the data is
stored in the low-numbered pages (in my case a few pages would be enough)
and all the other pages get truncated at the end.
Well, at least the number of pages doesn't increase past 6422 in this case,
but I'd like to reclaim the space sometimes.

Is this one of the features planned for 8.3: re-using low-numbered pages in
priority after a vacuum, so as to help subsequent vacuums truncate the end
of the table once the data is located at the beginning?


Thanks to all for your very interesting answers.

Nicolas