Re: Alternative for vacuuming queue-like tables - Mailing list pgsql-general

From Jim C. Nasby
Subject Re: Alternative for vacuuming queue-like tables
Date
Msg-id 20060504190730.GT97354@pervasive.com
Whole thread Raw
In response to Re: Alternative for vacuuming queue-like tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Alternative for vacuuming queue-like tables
List pgsql-general
On Sat, Apr 29, 2006 at 06:39:21PM -0400, Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
> > The general problem seems to be that a transaction has no way to promise
> > never to touch a specific table. Maybe some kind of "negative lock"
> > would help here - you'd do "exclude table foo from transaction" at the
> > start of your transaction, which would cause postgres to raise an error
> > if you indeed tried to access that table. Vacuum could then ignore your
> > transaction when deciding which tuples it can safely remove from the
> > table foo.
>
> Unfortunately that really wouldn't help VACUUM at all.  The nasty
> problem for VACUUM is that what it has to figure out is not the oldest
> transaction that it thinks is running, but the oldest transaction that
> anyone else thinks is running.  So what it looks through PGPROC for is
> not the oldest XID, but the oldest XMIN.  And even if it excluded procs
> that had promised not to touch the target table, it would find that
> their XIDs had been factored into other processes' XMINs, resulting
> in no improvement.
>
> As a comparison point, VACUUM already includes code to ignore backends
> in other databases (if it's vacuuming a non-shared table), but it turns
> out that that code is almost entirely useless :-(, because those other
> backends still get factored into the XMINs computed by backends that are
> in the same database as VACUUM.  We've speculated about fixing this by
> having each backend compute and advertise both "global" and "database
> local" XMINs, but the extra cycles that'd need to be spent in *every*
> snapshot computation seem like a pretty nasty penalty.  And the approach
> certainly does not scale to anything like per-table exclusions.

I'd actually been thinking about this recently, and had come up with the
following half-baked ideas:

Allow a transaction to specify exactly what tables it will be touching,
perhaps as an extension to BEGIN. Should any action that transaction
takes attempt to access a table not specified, throw an error.

A possible variant on that would be to automatically determine at
transaction start all the tables that would be accessed by that
transaction.

Once that list is available, vacuum should be able to use it to ignore
any transactions that have promised not to touch whatever table it's
vacuuming. While this doesn't help with transactions that don't make any
promises, for short-running transactions we don't really care. As long
as all long-running transactions state their intentions it should allow
for useful vacuuming of queue tables and the like.

Given that we don't care about what tables a short-running transaction
will access, we could delay the (probably expensive) determination of
what tables a backend will access until the transaction is over a
specific age. At that point the list of tables could be built and
(theoretically) the transactions XMIN adjusted accordingly.

Unfortunately this won't help with pg_dump right now. But an extension
would be to allow passing a table order into pg_dump so that it dumps
queue tables first. Once a table is dumped pg_dump shouldn't need to
access it again, so it's theoretically possible to make the "I promise
not to access these tables" list dynamic during the life of a
transaction; as pg_dump finishes with tables it could then promise not
to touch them again.

Or maybe a better idea would just be to come up with some other means
for people to do things like queue tables and session tables...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

pgsql-general by date:

Previous
From: "Sebastian Wagner"
Date:
Subject: Re: Googles sommer of Code 2005 - Postgres Project Proposal - where to discuss? who is responsible? who can mentor?
Next
From: "Jim C. Nasby"
Date:
Subject: Re: dump Functions