Thread: Alternative for vacuuming queue-like tables

Alternative for vacuuming queue-like tables

From: Csaba Nagy
Hi all,

Short background: Postgres does not handle queue-type tables very well
in an environment where the queue tables are small but heavily
inserted into, updated and deleted from, while other activity in the
system causes long-running transactions. The reason is that the queue
table cannot be vacuumed properly due to the long-running transactions,
and so it suffers bloat.
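
For concreteness, a minimal sketch of the kind of table meant here (all
names are invented for illustration):

    CREATE TABLE queue_table (
        id         bigserial,
        payload    text NOT NULL,
        created_at timestamptz NOT NULL DEFAULT now(),
        CONSTRAINT pk_queue_table PRIMARY KEY (id)
    );

    -- typical traffic: many short autocommit statements, so dead row
    -- versions pile up even though only a handful of rows are live
    INSERT INTO queue_table (payload) VALUES ('job 1');
    UPDATE queue_table SET payload = 'job 1, retried' WHERE id = 1;
    DELETE FROM queue_table WHERE id = 1;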

This was quite a big problem for our Postgres systems, until I
discovered that there is an alternative to vacuuming, provided the
transactions on the queue table are always short. Besides vacuuming
the queue table aggressively (which works fine most of the time, when
there is no long-running transaction around), I placed a cron job to
cluster the queue table on its PK index. That seems to work well; the
table is kept small even in the presence of long-running transactions.
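
Concretely, the cron job boils down to running something like the
following via psql (a sketch; the names match the illustration above,
and the CLUSTER syntax is the 8.1-era form with the index named first):

    VACUUM ANALYZE queue_table;             -- the aggressive routine vacuuming
    CLUSTER pk_queue_table ON queue_table;  -- the periodic rebuild on the PK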

The only time this does NOT work is when some long-running transaction
actually accesses the queue table itself. An example is a full DB backup.

I know this made one of my systems drop its load significantly.

HTH,
Csaba.



Re: Alternative for vacuuming queue-like tables

From: kmh496
On 2006-04-28 (Fri) at 14:40 +0200, Csaba Nagy wrote:
> I placed a cron job to
> cluster the queue table on its PK index.
what does that mean?
--
my site myowndictionary (http://www.myowndictionary.com) was made to
help students of many languages learn them faster.


Re: Alternative for vacuuming queue-like tables

From: Csaba Nagy
On Fri, 2006-04-28 at 15:20, kmh496 wrote:
> On 2006-04-28 (Fri) at 14:40 +0200, Csaba Nagy wrote:
> > I placed a cron job to
> > cluster the queue table on its PK index.
> what does that mean?

It means: execute

CLUSTER pk_queue_table ON queue_table;

See http://www.postgresql.org/docs/8.1/static/sql-cluster.html for more
details on CLUSTER.

This will actually rebuild the table, I guess, once it can lock it
exclusively. Because the queue transactions are very short-lived, the
exclusive lock is acquired quickly, and the clustering operation itself
is also fast, since the table has only a few entries.
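
(A quick way to observe the effect, using the database-size functions
built in as of 8.1; same illustrative names as before:)

    SELECT pg_relation_size('queue_table');  -- bloated: many dead tuples
    CLUSTER pk_queue_table ON queue_table;
    SELECT pg_relation_size('queue_table');  -- back down to a few pages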

I'm not sure how this operation can work in the presence of other
long-running transactions which have not yet touched the queue table, but
it actually does work, I can confirm that. Is it maybe violating MVCC?

Cheers,
Csaba.



Re: Alternative for vacuuming queue-like tables

From: Vivek Khera
On Apr 28, 2006, at 9:32 AM, Csaba Nagy wrote:

> I'm not sure how this operation can work in the presence of other
> long-running transactions which have not yet touched the queue table, but
> it actually does work, I can confirm that. Is it maybe violating MVCC?

It sounds like it does potentially violate MVCC, but I haven't
thought hard about it. It also has the advantage that it rebuilds
your indexes during the clustering.

And a convenient feature is that you just define those CLUSTER
statements once; then, whenever you need it, just run "CLUSTER;" all
by itself, and all tables previously defined as clustered will be
re-clustered on the specified index.
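
(That is, once a table has been clustered, its clustering index is
remembered in the system catalogs, so, with the same illustrative
names:)

    CLUSTER pk_queue_table ON queue_table;  -- once: defines the clustering
    CLUSTER;                                -- thereafter: re-clusters every
                                            -- previously clustered table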


Re: Alternative for vacuuming queue-like tables

From: Tom Lane
Csaba Nagy <nagy@ecircle-ag.com> writes:
> I'm not sure how this operation can work in the presence of other
> long-running transactions which have not yet touched the queue table, but
> it actually does work, I can confirm that. Is it maybe violating MVCC?

Yes :-(.  I think you can get away with it if all your transactions that
use the queue table run in READ COMMITTED (not serializable) mode, and
if they work like

    BEGIN;
    LOCK queue_table IN some-suitable-mode;
    process queue table;
    COMMIT;

Grabbing the lock will ensure that CLUSTER finishes before any snapshot
is taken.
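
(One plausible concretization of the sketch above, with illustrative
names; the lock mode is just an example, since CLUSTER takes ACCESS
EXCLUSIVE, which conflicts with every table lock mode:)

    BEGIN;
    LOCK TABLE queue_table IN ROW EXCLUSIVE MODE;
    -- in READ COMMITTED each statement gets a fresh snapshot, so work
    -- done after the LOCK is granted sees the post-CLUSTER table
    DELETE FROM queue_table
     WHERE id = (SELECT min(id) FROM queue_table);
    COMMIT;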

            regards, tom lane

Re: Alternative for vacuuming queue-like tables

From: Csaba Nagy
> > actually does work, I can confirm that. Is it maybe violating MVCC?
>
> Yes :-(.  I think you can get away with it if all your transactions that
[snip]

Well, I actually don't want to get away with it this time :-)

This table is only processed by the queue manager, and that uses very
short transactions, basically only atomic inserts/updates/deletes. No
long-running transaction will ever touch that table.

So this fits my queue cleanup needs perfectly.

Cheers,
Csaba.



Re: Alternative for vacuuming queue-like tables

From: Chris Browne
nagy@ecircle-ag.com (Csaba Nagy) writes:
> On Fri, 2006-04-28 at 15:20, kmh496 wrote:
>> On 2006-04-28 (Fri) at 14:40 +0200, Csaba Nagy wrote:
>> > I placed a cron job to
>> > cluster the queue table on its PK index.
>> what does that mean?
>
> It means: execute
>
> CLUSTER pk_queue_table ON queue_table;
>
> See http://www.postgresql.org/docs/8.1/static/sql-cluster.html for more
> details on CLUSTER.
>
> This will actually rebuild the table, I guess, once it can lock it
> exclusively. Because the queue transactions are very short-lived, the
> exclusive lock is acquired quickly, and the clustering operation itself
> is also fast, since the table has only a few entries.
>
> I'm not sure how this operation can work in the presence of other
> long-running transactions which have not yet touched the queue table, but
> it actually does work, I can confirm that. Is it maybe violating MVCC?

There is, I believe, a problem there; there is a scenario where data
can get "dropped out from under" those old connections.

This has been added to the TODO...

http://www.postgresql.org/docs/faqs.TODO.html

* Make CLUSTER preserve recently-dead tuples per MVCC requirements
--
(format nil "~S@~S" "cbbrowne" "ntlug.org")
http://cbbrowne.com/info/sgml.html
Should vegetarians eat animal crackers?

Re: Alternative for vacuuming queue-like tables

From: Csaba Nagy
> There is, I believe, a problem there; there is a scenario where data
> can get "dropped out from under" those old connections.
>
> This has been added to the TODO...
>
> http://www.postgresql.org/docs/faqs.TODO.html
>
> * Make CLUSTER preserve recently-dead tuples per MVCC requirements

OK, I can see this being a problem in the general case.

However, for my queue table the current behavior is a life-saver. Would
it be possible to still provide a variant of the rebuild which is
documented not to be MVCC-compliant? Something like: I don't care
whether old transactions which have not yet touched this table can
still see its old data; I just want to rebuild it.

I actually don't care about the clustering in this case, only about
rebuilding the table once I can get an exclusive lock on it. I guess
this would be possible with the "switch to a copy" method, except that
the new table is a different entity (as mentioned in another post) and
it will not preserve the dependencies of the original table.
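
(For reference, the "switch to a copy" method amounts to something like
this sketch; as noted, the copy is a new entity, so indexes, defaults,
foreign keys, triggers and views pointing at the original are lost:)

    BEGIN;
    LOCK TABLE queue_table IN ACCESS EXCLUSIVE MODE;
    CREATE TABLE queue_table_copy AS
        SELECT * FROM queue_table;           -- copies only the live rows
    DROP TABLE queue_table;
    ALTER TABLE queue_table_copy RENAME TO queue_table;
    COMMIT;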

I guess what I'm asking for is a kind of "REBUILD TABLE" which is not
MVCC-compliant by definition, but which would be useful in the
queue-table case described above.

Cheers,
Csaba.



Re: Alternative for vacuuming queue-like tables

From: Robert Treat
On Friday 28 April 2006 12:20, Csaba Nagy wrote:
> > There is, I believe, a problem there; there is a scenario where data
> > can get "dropped out from under" those old connections.
> >
> > This has been added to the TODO...
> >
> > http://www.postgresql.org/docs/faqs.TODO.html
> >
> > * Make CLUSTER preserve recently-dead tuples per MVCC requirements
>
> OK, I can see this being a problem in the general case.
>
> However, for my queue table the current behavior is a life-saver. Would
> it be possible to still provide a variant of the rebuild which is
> documented not to be MVCC-compliant? Something like: I don't care
> whether old transactions which have not yet touched this table can
> still see its old data; I just want to rebuild it.
>
> I actually don't care about the clustering in this case, only about
> rebuilding the table once I can get an exclusive lock on it. I guess
> this would be possible with the "switch to a copy" method, except that
> the new table is a different entity (as mentioned in another post) and
> it will not preserve the dependencies of the original table.
>
> I guess what I'm asking for is a kind of "REBUILD TABLE" which is not
> MVCC-compliant by definition, but which would be useful in the
> queue-table case described above.
>

vacuum full?

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: Alternative for vacuuming queue-like tables

From: "Florian G. Pflug"
Csaba Nagy wrote:
>> There is, I believe, a problem there; there is a scenario where data
>> can get "dropped out from under" those old connections.
>>
>> This has been added to the TODO...
>>
>> http://www.postgresql.org/docs/faqs.TODO.html
>>
>> * Make CLUSTER preserve recently-dead tuples per MVCC requirements
>
> OK, I can see this being a problem in the general case.
>
> However, for my queue table the current behavior is a life-saver. Would
> it be possible to still provide a variant of the rebuild which is
> documented not to be MVCC-compliant? Something like: I don't care
> whether old transactions which have not yet touched this table can
> still see its old data; I just want to rebuild it.
>
> I actually don't care about the clustering in this case, only about
> rebuilding the table once I can get an exclusive lock on it. I guess
> this would be possible with the "switch to a copy" method, except that
> the new table is a different entity (as mentioned in another post) and
> it will not preserve the dependencies of the original table.
>
> I guess what I'm asking for is a kind of "REBUILD TABLE" which is not
> MVCC-compliant by definition, but which would be useful in the
> queue-table case described above.
The general problem seems to be that a transaction has no way to promise
never to touch a specific table. Maybe some kind of "negative lock"
would help here: you'd do "exclude table foo from transaction" at the
start of your transaction, which would cause Postgres to raise an error
if you did then try to access that table. Vacuum could then ignore your
transaction when deciding which tuples it can safely remove from
table foo.

This would be a nice feature IMHO - would it be possible to do something
like that, or am I overlooking something?

greetings, Florian Pflug

Re: Alternative for vacuuming queue-like tables

From: Tom Lane
"Florian G. Pflug" <fgp@phlo.org> writes:
> The general problem seems to be that a transaction has no way to promise
> never to touch a specific table. Maybe some kind of "negative lock"
> would help here - you'd do "exclude table foo from transaction" at the
> start of your transaction, which would cause postgres to raise an error
> if you indeed tried to access that table. Vacuum could then ignore your
> transaction when deciding which tuples it can safely remove from the
> table foo.

Unfortunately that really wouldn't help VACUUM at all.  The nasty
problem for VACUUM is that what it has to figure out is not the oldest
transaction that it thinks is running, but the oldest transaction that
anyone else thinks is running.  So what it looks through PGPROC for is
not the oldest XID, but the oldest XMIN.  And even if it excluded procs
that had promised not to touch the target table, it would find that
their XIDs had been factored into other processes' XMINs, resulting
in no improvement.

As a comparison point, VACUUM already includes code to ignore backends
in other databases (if it's vacuuming a non-shared table), but it turns
out that that code is almost entirely useless :-(, because those other
backends still get factored into the XMINs computed by backends that are
in the same database as VACUUM.  We've speculated about fixing this by
having each backend compute and advertise both "global" and "database
local" XMINs, but the extra cycles that'd need to be spent in *every*
snapshot computation seem like a pretty nasty penalty.  And the approach
certainly does not scale to anything like per-table exclusions.

            regards, tom lane

Re: Alternative for vacuuming queue-like tables

From: Csaba Nagy
> > I guess what I'm asking for is a kind of "REBUILD TABLE" which is not
> > MVCC-compliant by definition, but which would be useful in the
> > queue-table case described above.
> >
>
> vacuum full?

Nope, it won't work: it will still keep all the dead tuples
potentially visible to old transactions, even if those transactions
have not yet touched the vacuumed table. In my case that means, for
some tables, sometimes 1000x bloat, given that the table always has
few active rows and high insert/update/delete rates...

CLUSTER locks the table exclusively, which means it will wait until
all transactions which actually touched the table are finished, and then
rebuild the table, discarding dead rows, without caring about old
transactions that could still see those dead rows (at least this is how
I think it works, based on my observations). Perfect for my purpose, but
not MVCC...
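
(Side by side, the difference as I understand it from this thread, with
the same illustrative names as before:)

    VACUUM FULL queue_table;                -- compacts, but must keep tuples
                                            -- still visible to old snapshots
    CLUSTER pk_queue_table ON queue_table;  -- rebuilds from live rows only,
                                            -- ignoring old snapshots (not
                                            -- MVCC-safe, hence the TODO item)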

Cheers,
Csaba.



Re: Alternative for vacuuming queue-like tables

From: "Jim C. Nasby"
On Sat, Apr 29, 2006 at 06:39:21PM -0400, Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
> > The general problem seems to be that a transaction has no way to promise
> > never to touch a specific table. Maybe some kind of "negative lock"
> > would help here - you'd do "exclude table foo from transaction" at the
> > start of your transaction, which would cause postgres to raise an error
> > if you indeed tried to access that table. Vacuum could then ignore your
> > transaction when deciding which tuples it can safely remove from the
> > table foo.
>
> Unfortunately that really wouldn't help VACUUM at all.  The nasty
> problem for VACUUM is that what it has to figure out is not the oldest
> transaction that it thinks is running, but the oldest transaction that
> anyone else thinks is running.  So what it looks through PGPROC for is
> not the oldest XID, but the oldest XMIN.  And even if it excluded procs
> that had promised not to touch the target table, it would find that
> their XIDs had been factored into other processes' XMINs, resulting
> in no improvement.
>
> As a comparison point, VACUUM already includes code to ignore backends
> in other databases (if it's vacuuming a non-shared table), but it turns
> out that that code is almost entirely useless :-(, because those other
> backends still get factored into the XMINs computed by backends that are
> in the same database as VACUUM.  We've speculated about fixing this by
> having each backend compute and advertise both "global" and "database
> local" XMINs, but the extra cycles that'd need to be spent in *every*
> snapshot computation seem like a pretty nasty penalty.  And the approach
> certainly does not scale to anything like per-table exclusions.

I'd actually been thinking about this recently, and had come up with the
following half-baked ideas:

Allow a transaction to specify exactly what tables it will be touching,
perhaps as an extension to BEGIN. Should any action that transaction
takes attempt to access a table not specified, throw an error.

A possible variant on that would be to automatically determine at
transaction start all the tables that would be accessed by that
transaction.

Once that list is available, vacuum should be able to use it to ignore
any transactions that have promised not to touch whatever table it's
vacuuming. While this doesn't help with transactions that don't make any
promises, for short-running transactions we don't really care. As long
as all long-running transactions state their intentions it should allow
for useful vacuuming of queue tables and the like.

Given that we don't care about what tables a short-running transaction
will access, we could delay the (probably expensive) determination of
what tables a backend will access until the transaction is over a
specific age. At that point the list of tables could be built and
(theoretically) the transaction's XMIN adjusted accordingly.

Unfortunately this won't help with pg_dump right now. But an extension
would be to allow passing a table order into pg_dump so that it dumps
queue tables first. Once a table is dumped pg_dump shouldn't need to
access it again, so it's theoretically possible to make the "I promise
not to access these tables" list dynamic during the life of a
transaction; as pg_dump finishes with tables it could then promise not
to touch them again.

Or maybe a better idea would just be to come up with some other means
for people to do things like queue tables and session tables...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Alternative for vacuuming queue-like tables

From: Tom Lane
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> I'd actually been thinking about this recently, and had come up with the
> following half-baked ideas:

> Allow a transaction to specify exactly what tables it will be touching,
> perhaps as an extension to BEGIN. Should any action that transaction
> takes attempt to access a table not specified, throw an error.

> A possible variant on that would be to automatically determine at
> transaction start all the tables that would be accessed by that
> transaction.

> Once that list is available, vacuum should be able to use it to ignore
> any transactions that have promised not to touch whatever table it's
> vacuuming.

No, you missed my point entirely.  The above would help not at all,
unless the restrictions were somehow propagated through XMIN
calculations, which seems impracticable.  (Every backend calculate a
separate XMIN with respect to every table that's being mentioned by any
other backend?  I don't think so...)

            regards, tom lane

Re: Alternative for vacuuming queue-like tables

From: "Florian G. Pflug"
Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
>
>>I'd actually been thinking about this recently, and had come up with the
>>following half-baked ideas:
>
>>Allow a transaction to specify exactly what tables it will be touching,
>>perhaps as an extension to BEGIN. Should any action that transaction
>>takes attempt to access a table not specified, throw an error.
>
>>A possible variant on that would be to automatically determine at
>>transaction start all the tables that would be accessed by that
>>transaction.
>
>>Once that list is available, vacuum should be able to use it to ignore
>>any transactions that have promised not to touch whatever table it's
>>vacuuming.
>
> No, you missed my point entirely.  The above would help not at all,
> unless the restrictions were somehow propagated through XMIN
> calculations, which seems impracticable.  (Every backend calculate a
> separate XMIN with respect to every table that's being mentioned by any
> other backend?  I don't think so...)

So basically the problem is that calculating a cluster-wide XMIN is fine
from a performance point of view, but too coarse from a vacuum point of
view. OTOH, calculating a table-specific XMIN is fine-grained enough for
vacuum to clean queue-like tables efficiently, but bad from a performance
point of view.

What about doing the XMIN calculations and the XMIN exporting per
tablespace instead? That would be more generic than the local/global
split, because all global objects are in the pg_global tablespace, I
believe. OTOH most people probably don't have more than a handful of
tablespaces, which seems a manageable amount...

greetings, Florian Pflug

Re: Alternative for vacuuming queue-like tables

From: "Jim C. Nasby"
On Thu, May 04, 2006 at 03:30:34PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > I'd actually been thinking about this recently, and had come up with the
> > following half-baked ideas:
>
> > Allow a transaction to specify exactly what tables it will be touching,
> > perhaps as an extension to BEGIN. Should any action that transaction
> > takes attempt to access a table not specified, throw an error.
>
> > A possible variant on that would be to automatically determine at
> > transaction start all the tables that would be accessed by that
> > transaction.
>
> > Once that list is available, vacuum should be able to use it to ignore
> > any transactions that have promised not to touch whatever table it's
> > vacuuming.
>
> No, you missed my point entirely.  The above would help not at all,
> unless the restrictions were somehow propagated through XMIN
> calculations, which seems impracticable.  (Every backend calculate a
> separate XMIN with respect to every table that's being mentioned by any
> other backend?  I don't think so...)

I mentioned it further down the post, as well as the idea that we really
don't care about short-lived transactions, so theoretically we could
compute this information only for transactions that have been running
for more than some period of time. Presumably the overhead of
calculating a separate XMIN for each table wouldn't be that great for a
transaction that's already been running for 15 seconds...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461