Thread: Problems Vacuum'ing

Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
Hi,

[Just so y'all know: This has been discussed extensively in#PostgreSQL and I tried asking the question in both -admin
and-general,first.  Also did some web searching.]
 

Environment:
   PostgreSQL 7.4.2   Locally built with GCC 3.3.1   Solaris 8 (Sparc)

I have a relatively simple database created with...

create table ethers (   hostname   varchar(64) unique not null,   mac        macaddr not null,   created    timestamp
(0)not null default current_timestamp,   changed    timestamp (0),   last_seen  timestamp (0) not null default
current_timestamp
);

create table host_mac_hist (   hostname   varchar(64) not null,   mac        macaddr not null,   created
timestamp(0)not null default current_timestamp,   last_seen  timestamp(0) not null
 
);

I'm populating the data from bunches of existing flat files in such a
manner that the "ethers" table, in particular, is getting updated
literally thousands of times.  It got slow, so I stopped the updating
and went to vacuum.  (Using psql as the user/owner of the db.)

The problem is that attempts to vacuum these tables resulted in "NNN
dead row versions cannot be removed yet."  Went through a lot of
analysis (e.g.: "Any hanging txns?") and trying different things with
folks on the #PostgreSQL IRC channel, all to no avail.

There is a WebObjects application that is the only other thing
accessing pgsql.  It is not accessing the same database, much-less
those tables.  (This was confirmed by enabling connection logging and
checking the log.)  Yet the only way I can successfully vacuum these
tables is to shut-down WebObjects *or* if I vacuum before there are
"too many" dead rows.  (Or so I thought!  Additional info later...)

Output of one attempt...

$ vacuumdb -U sysagent -t ethers --verbose --analyze sysagent
Password: 
INFO:  vacuuming "public.ethers"
INFO:  index "ethers_hostname_key" now contains 114002 row versions in 2389 pages
DETAIL:  1865 index pages have been deleted, 1865 are currently reusable.
CPU 0.18s/0.09u sec elapsed 0.41 sec.
INFO:  "ethers": found 0 removable, 114002 nonremovable row versions in 1114 pages
DETAIL:  113590 dead row versions cannot be removed yet.
There were 2184 unused item pointers.
0 pages are entirely empty.
CPU 0.20s/0.18u sec elapsed 0.54 sec.
INFO:  analyzing "public.ethers"
INFO:  "ethers": 1114 pages, 412 rows sampled, 412 estimated total rows
VACUUM

And...

$ vacuumdb -U sysagent -t ethers --verbose --analyze --full sysagent
Password: 
INFO:  vacuuming "public.ethers"
INFO:  "ethers": found 0 removable, 114002 nonremovable row versions in 1114 pages
DETAIL:  113590 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 88 bytes long.
There were 2184 unused item pointers.
Total free space (including removable row versions) is 169880 bytes.
0 pages are or will become empty, including 0 at the end of the table.
816 pages containing 162192 free bytes are potential move destinations.
CPU 0.06s/2.03u sec elapsed 2.11 sec.
INFO:  index "ethers_hostname_key" now contains 114002 row versions in 2389 pages
DETAIL:  0 index row versions were removed.
1865 index pages have been deleted, 1865 are currently reusable.
CPU 0.22s/0.45u sec elapsed 0.73 sec.
INFO:  "ethers": moved 1745 row versions, truncated 1114 to 1114 pages
DETAIL:  CPU 0.39s/0.80u sec elapsed 2.79 sec.
INFO:  index "ethers_hostname_key" now contains 115740 row versions in 2389 pages
DETAIL:  7 index row versions were removed.
1856 index pages have been deleted, 1856 are currently reusable.
CPU 0.30s/0.15u sec elapsed 0.53 sec.
INFO:  analyzing "public.ethers"
INFO:  "ethers": 1114 pages, 412 rows sampled, 412 estimated total rows
VACUUM

I can understand how a non-full vacuum might fail if I have
insufficient FSM.  But "full" should get around that, should it not?

Besides: I did a new test today.  I added to my Perl code a bit that
would vacuum every 10 files read-in.  This would amount to about 5000
dead rows and, IIRC, less than 300 pages.  Much less than the default
FSM.  So even a non-full vacuum should be succeeding, no?  This new
script would succeed in getting everything vacuumed-up for a while and
then, at some point (failed to notice when): The "dead row versions
cannot be removed yet" came back and steadily incremented each time
vacuum was run.  If I were going to guess, I'd *guess* maybe this
started happening about the time somebody queried the WebObjects
application, thus causing it to connect, but I've no way of knowing
after-the-fact.  (Sorry for the vagueness here.)

Any idea of what might be going on here?

TIA,
Jim
-- 
Jim Seymour                  | PGP Public Key available at:
jseymour@LinxNet.com         | http://www.uk.pgp.net/pgpnet/pks-commands.html
http://jimsun.LinxNet.com    |


Re: Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
I had written:
> 
[snip]
> 
> The problem is that attempts to vacuum these tables resulted in "NNN
> dead row versions cannot be removed yet."  Went through a lot of
> analysis (e.g.: "Any hanging txns?") and trying different things with
> folks on the #PostgreSQL IRC channel, all to no avail.
[snip]

Okay, the mystery is *partially* solved.  In IRC, when this was
brought up again this morning due to my post to -hackers, two things
happened:
   I did a vacuumdb on one of the tables and, much to my surprise,   what wouldn't vacuum before I left work last nite
*did*work this   morning.  And...
 
   JanniCash discovered this in utils/time/tqual.c (I believe):
   * OldestXmin is a cutoff XID (obtained from GetOldestXmin()).  Tuples   * deleted by XIDs >= OldestXmin are deemed
"recentlydead"; they might   * still be visible to some open transaction, so we can't remove them,   * even if we see
thatthe deleting transaction has committed.
 

So the next thing I did was run a bunch of updates, quit the script,
then ran a while loop from the (Unix) command-line, trying to vacuum
the one table every 30 seconds.  To summarize...
   Fri Apr  2 08:54:54 EST 2004   INFO:  "ethers": found 0 removable, 1834 nonremovable row           versions in 93
pages  DETAIL:  1466 dead row versions cannot be removed yet.   ...   Fri Apr  2 08:58:56 EST 2004   INFO:  "ethers":
found1466 removable, 368 nonremovable row    versions in 93 pages   DETAIL:  0 dead row versions cannot be removed
yet.

Which is all well-and-good (tho, my ignorance readily conceded, four
minutes seems a mite... long), *except*: If I shut-down the
WebObjects application which, again, never accesses the db in
question, much-less any of its tables, this "time-out" doesn't seem
to apply.  (I tried it.)

Any explanation for this behaviour?

Thanks,
Jim
-- 
Jim Seymour                  | PGP Public Key available at:
jseymour@LinxNet.com         | http://www.uk.pgp.net/pgpnet/pks-commands.html
http://jimsun.LinxNet.com    |


Re: Problems Vacuum'ing

From
Alvaro Herrera
Date:
On Fri, Apr 02, 2004 at 12:02:22PM -0500, Jim Seymour wrote:

[...]

> Which is all well-and-good (tho, my ignorance readily conceded, four
> minutes seems a mite... long), *except*: If I shut-down the
> WebObjects application which, again, never accesses the db in
> question, much-less any of its tables, this "time-out" doesn't seem
> to apply.  (I tried it.)

Ok, so the WebObjects app keeps an idle open transaction?  (i.e. it
issues a BEGIN as soon as the previous transaction is finished.) I'm not
sure I read the code correctly -- ISTM it would only matter when you try
to vacuum a shared table, which this is not ...

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)


Re: Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
> 
> On Fri, Apr 02, 2004 at 12:02:22PM -0500, Jim Seymour wrote:
> 
> [...]
> 
> > Which is all well-and-good (tho, my ignorance readily conceded, four
> > minutes seems a mite... long), *except*: If I shut-down the
> > WebObjects application which, again, never accesses the db in
> > question, much-less any of its tables, this "time-out" doesn't seem
> > to apply.  (I tried it.)
> 
> Ok, so the WebObjects app keeps an idle open transaction?  (i.e. it
> issues a BEGIN as soon as the previous transaction is finished.) I'm not
> sure I read the code correctly --

I really couldn't say.  I don't know what the WebObjects app is
doing.  I know, or *believe* I know, it's only doing queries.  (It
may be doing temp tables internally, or some-such.)  Its interface
to pgsql is via the JDBC that comes with pgsql.  I don't know what
the Java code it generated looks like.

>                                   ISTM it would only matter when you try
> to vacuum a shared table, which this is not ...

That's what I would've thought.

Thanks for the follow-up.  I was beginning to wonder if anybody'd
noticed ;).

Jim



Re: Problems Vacuum'ing

From
Alvaro Herrera
Date:
On Fri, Apr 02, 2004 at 02:51:30PM -0500, Jim Seymour wrote:
> > 
> > On Fri, Apr 02, 2004 at 12:02:22PM -0500, Jim Seymour wrote:

> > Ok, so the WebObjects app keeps an idle open transaction?  (i.e. it
> > issues a BEGIN as soon as the previous transaction is finished.) I'm not
> > sure I read the code correctly --
> 
> I really couldn't say.  I don't know what the WebObjects app is
> doing.  I know, or *believe* I know, it's only doing queries.  (It
> may be doing temp tables internally, or some-such.)  Its interface
> to pgsql is via the JDBC that comes with pgsql.  I don't know what
> the Java code it generated looks like.

Turn on query logging and see if the BEGIN is issued right after the
COMMIT/ROLLBACK, or whether it waits and issues it right before
SELECT/CREATE TEMP TABLE.

It doesn't matter if it's only doing queries; if it does them inside a
transaction, it would be enough to keep VACUUM from working "properly."

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"


Re: Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> wrote:
> 
[snip]
> 
> Turn on query logging and see if the BEGIN is issued right after the
> COMMIT/ROLLBACK, or whether it waits and issues it right before
> SELECT/CREATE TEMP TABLE.
> 
> It doesn't matter if it's only doing queries; if it does them inside a
> transaction, it would be enough to keep VACUUM from working "properly."

Will the following do as well?

(Thanks to Jochem for the pointer for how to determine open
transactions and the pg_stat_activity hint.)

Logged into work.  WebObects application is running.  Database I'm
working with partly populated from earlier work.

postgres=# select * from pg_locks where transaction is not null;relation | database | transaction |  pid  |     mode
 | granted 
 
----------+----------+-------------+-------+---------------+---------         |          |     1245358 | 18020 |
ExclusiveLock| t         |          |     1245364 |   267 | ExclusiveLock | t
 
(2 rows)

postgres=# select * from pg_stat_activity;datid | datname  | procpid | usesysid |  usename   | current_query |
query_start
 
-------+----------+---------+----------+------------+---------------+-------------17142 | postgres |     267 |        1
|postgres   |               | 17144 | qantel   |   18020 |      103 | webobjects |               | 
 
(2 rows)


sysagent=> delete from ethers;
DELETE 368
sysagent=> delete from host_mac_hist;
DELETE 169
sysagent=> vacuum full analyze verbose ethers;
INFO:  vacuuming "public.ethers"
INFO:  "ethers": found 0 removable, 368 nonremovable row versions in 4 pages
DETAIL:  368 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 88 bytes long.
There were 55 unused item pointers.
Total free space (including removable row versions) is 3724 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 3628 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "ethers_hostname_key" now contains 368 row versions in 275 pages
DETAIL:  0 index row versions were removed.
223 index pages have been deleted, 223 are currently reusable.
CPU 0.01s/0.03u sec elapsed 0.23 sec.
INFO:  "ethers": moved 0 row versions, truncated 4 to 4 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.ethers"
INFO:  "ethers": 4 pages, 0 rows sampled, 0 estimated total rows
VACUUM
sysagent=> vacuum full analyze verbose host_mac_hist;
INFO:  vacuuming "public.host_mac_hist"
INFO:  "host_mac_hist": found 0 removable, 169 nonremovable row versions in 2 pages
DETAIL:  169 dead row versions cannot be removed yet.
Nonremovable row versions range from 64 to 80 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 3556 bytes.
0 pages are or will become empty, including 0 at the end of the table.
1 pages containing 3532 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "host_mac_hist": moved 0 row versions, truncated 2 to 2 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "public.host_mac_hist"
INFO:  "host_mac_hist": 2 pages, 0 rows sampled, 0 estimated total rows
VACUUM

Shut down WebObjects.  Clear both tables.  Do full vacuum.
Re-populate tables.  Then...

postgres=# select * from pg_locks where transaction is not null;relation | database | transaction | pid  |     mode
| granted 
 
----------+----------+-------------+------+---------------+---------         |          |     1245558 | 3110 |
ExclusiveLock| t
 
(1 row)

postgres=# select * from pg_stat_activity;datid | datname  | procpid | usesysid | usename  | current_query |
query_start
 
-------+----------+---------+----------+----------+---------------+-------------17142 | postgres |    3110 |        1 |
postgres|               | 
 
(1 row)

sysagent=> delete from ethers;
DELETE 368
sysagent=> delete from host_mac_hist;
DELETE 169
sysagent=> vacuum full analyze verbose ethers;
INFO:  vacuuming "public.ethers"
INFO:  "ethers": found 10030 removable, 0 nonremovable row versions in 98 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 0 to 0 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 760736 bytes.
98 pages are or will become empty, including 98 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.01s/0.01u sec elapsed 0.01 sec.
INFO:  index "ethers_hostname_key" now contains 0 row versions in 275 pages
DETAIL:  10030 index row versions were removed.
271 index pages have been deleted, 271 are currently reusable.
CPU 0.00s/0.08u sec elapsed 0.08 sec.
INFO:  "ethers": truncated 98 to 0 pages
INFO:  analyzing "public.ethers"
INFO:  "ethers": 0 pages, 0 rows sampled, 0 estimated total rows
VACUUM
sysagent=> vacuum full analyze verbose host_mac_hist;
INFO:  vacuuming "public.host_mac_hist"
INFO:  "host_mac_hist": found 169 removable, 0 nonremovable row versions in 2 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 0 to 0 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 15668 bytes.
2 pages are or will become empty, including 2 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "host_mac_hist": truncated 2 to 0 pages
INFO:  analyzing "public.host_mac_hist"
INFO:  "host_mac_hist": 0 pages, 0 rows sampled, 0 estimated total rows
VACUUM

Again the difference: With WebObjects running, deleting rows and
trying to vacuum immediately, even full, fails.  Shut-down WebObjects
and I can.

Next test? ;)

Jim


Re: Problems Vacuum'ing

From
Tom Lane
Date:
jseymour@LinxNet.com (Jim Seymour) writes:
> Again the difference: With WebObjects running, deleting rows and
> trying to vacuum immediately, even full, fails.  Shut-down WebObjects
> and I can.

WebObjects is evidently holding an open transaction.  Ergo, anything
deleted after the start of that transaction isn't vacuumable.  You need
to do something about the client-side logic that is holding an open
transaction without doing anything ...
        regards, tom lane


Re: Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> jseymour@LinxNet.com (Jim Seymour) writes:
> > Again the difference: With WebObjects running, deleting rows and
> > trying to vacuum immediately, even full, fails.  Shut-down WebObjects
> > and I can.
> 
> WebObjects is evidently holding an open transaction.  Ergo, anything
> deleted after the start of that transaction isn't vacuumable.  You need
> to do something about the client-side logic that is holding an open
> transaction without doing anything ...

It certainly isn't holding open a transaction in the database I'm
working with.  It's unclear to me it's holding any transaction open,
anywhere.  This is all that showed up:

postgres=# select * from pg_locks where transaction is not null;relation | database | transaction |  pid  |     mode
 | granted 
 
----------+----------+-------------+-------+---------------+---------         |          |     1245358 | 18020 |
ExclusiveLock| t         |          |     1245364 |   267 | ExclusiveLock | t
 
(2 rows)

postgres=# select * from pg_stat_activity;datid | datname  | procpid | usesysid |  usename   | current_query |
query_start
 
-------+----------+---------+----------+------------+---------------+-------------17142 | postgres |     267 |        1
|postgres   |               | 17144 | qantel   |   18020 |      103 | webobjects |               | 
 
(2 rows)

I don't know what those are, but they list no database or relation.  I
get this just be running psql (this time at home):

jseymour=> select * from pg_locks where transaction is not null;relation | database | transaction | pid |     mode
|granted 
 
----------+----------+-------------+-----+---------------+---------         |          |        8938 | 307 |
ExclusiveLock| t
 
(1 row)

jseymour=> select * from pg_stat_activity;datid | datname  | procpid | usesysid | usename  | current_query |
query_start
 
-------+----------+---------+----------+----------+---------------+-------------17144 | jseymour |     307 |      101 |
jseymour|               | 
 
(1 row)

Without having touched a thing.

Jim


Re: Problems Vacuum'ing

From
Tom Lane
Date:
jseymour@LinxNet.com (Jim Seymour) writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> WebObjects is evidently holding an open transaction.

> It certainly isn't holding open a transaction in the database I'm
> working with.

Which database the transaction is in isn't real relevant... the logic is
done globally so that it will be correct when vacuuming shared tables.

> It's unclear to me it's holding any transaction open,
> anywhere.

Sure it is, assuming that PID 18020 is the session we're talking about.

> postgres=# select * from pg_locks where transaction is not null;
>  relation | database | transaction |  pid  |     mode      | granted 
> ----------+----------+-------------+-------+---------------+---------
>           |          |     1245358 | 18020 | ExclusiveLock | t
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This process has an open transaction number 1245358.  That's what an
exclusive lock on a transaction means.

>  17142 | postgres |     267 |        1 | postgres   |               | 
>  17144 | qantel   |   18020 |      103 | webobjects |               | 

These entries didn't make a lot of sense to me since the other examples
you mentioned did not seem to be getting executed in the 'postgres'
database --- but I assume PID 18020 is the one you are referring to as
webobjects.
        regards, tom lane


Re: Problems Vacuum'ing

From
Alvaro Herrera
Date:
On Fri, Apr 02, 2004 at 07:35:20PM -0500, Tom Lane wrote:
> jseymour@LinxNet.com (Jim Seymour) writes:
> > Again the difference: With WebObjects running, deleting rows and
> > trying to vacuum immediately, even full, fails.  Shut-down WebObjects
> > and I can.
> 
> WebObjects is evidently holding an open transaction.  Ergo, anything
> deleted after the start of that transaction isn't vacuumable.  You need
> to do something about the client-side logic that is holding an open
> transaction without doing anything ...

But, if I read the code correctly, the oldest xmin vacuum cares about
for a non-shared relation should be local to the database, shouldn't it?
If this is so, why does it matter that he has open transaction on a
different database?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
www.google.com: interfaz de línea de comando para la web.


Re: Problems Vacuum'ing

From
Stephan Szabo
Date:
On Fri, 2 Apr 2004, Alvaro Herrera wrote:

> On Fri, Apr 02, 2004 at 07:35:20PM -0500, Tom Lane wrote:
> > jseymour@LinxNet.com (Jim Seymour) writes:
> > > Again the difference: With WebObjects running, deleting rows and
> > > trying to vacuum immediately, even full, fails.  Shut-down WebObjects
> > > and I can.
> >
> > WebObjects is evidently holding an open transaction.  Ergo, anything
> > deleted after the start of that transaction isn't vacuumable.  You need
> > to do something about the client-side logic that is holding an open
> > transaction without doing anything ...
>
> But, if I read the code correctly, the oldest xmin vacuum cares about
> for a non-shared relation should be local to the database, shouldn't it?

AFAICS it's the oldest transaction at the start of any of the transactions
in this database, not the oldest transaction of any transaction in this
database.


Re: Problems Vacuum'ing

From
jseymour@LinxNet.com (Jim Seymour)
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> jseymour@LinxNet.com (Jim Seymour) writes:
> > Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> WebObjects is evidently holding an open transaction.
> 
> > It certainly isn't holding open a transaction in the database I'm
> > working with.
> 
> Which database the transaction is in isn't real relevant... the logic is
> done globally so that it will be correct when vacuuming shared tables.

It had occurred to me, early on, that if anything had an open
transaction, that would perhaps cause what I was seeing.  So I
killed-off WebObjects.  Ran my tests.  Psql'd as yet another user,
to another database, and did something like
   begin;   insert into foo (bar) values ('Hello');

And then ran my tests.  Vacuum'ing worked completely.

> 
> > It's unclear to me it's holding any transaction open,
> > anywhere.
> 
> Sure it is, assuming that PID 18020 is the session we're talking about.
> 
> > postgres=# select * from pg_locks where transaction is not null;
> >  relation | database | transaction |  pid  |     mode      | granted 
> > ----------+----------+-------------+-------+---------------+---------
> >           |          |     1245358 | 18020 | ExclusiveLock | t
>                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

But I see entries like that if I just *start* *up* psql, without
doing anything:

Script started on Fri 02 Apr 2004 09:42:58 PM EST
$ psql
Password: 
Welcome to psql 7.4.2, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms      \h for help with SQL commands      \? for help on internal slash commands
    \g or terminate with semicolon to execute query      \q to quit
 

jseymour=> select * from pg_locks where transaction is not null;relation | database | transaction | pid  |     mode
| granted 
 
----------+----------+-------------+------+---------------+---------         |          |        8941 | 1480 |
ExclusiveLock| t
 
(1 row)

jseymour=> select * from pg_stat_activity;datid | datname  | procpid | usesysid | usename  | current_query |
query_start
 
-------+----------+---------+----------+----------+---------------+-------------17144 | jseymour |    1480 |      101 |
jseymour|               | 
 
(1 row)

jseymour=> \q
$ exit

script done on Fri 02 Apr 2004 09:43:27 PM EST

What does that entry for pid 1480, transaction 8941 mean?

> 
> This process has an open transaction number 1245358.  That's what an
> exclusive lock on a transaction means.
> 
> >  17142 | postgres |     267 |        1 | postgres   |               | 
> >  17144 | qantel   |   18020 |      103 | webobjects |               | 
> 
> These entries didn't make a lot of sense to me since the other examples
> you mentioned did not seem to be getting executed in the 'postgres'
> database --- but I assume PID 18020 is the one you are referring to as
> webobjects.

I ran the pg_locks and pg_stat_activity selects as user postgres.  The
postgres db has nothing to do with either the WebObjects application
nor the script that's been populating the db I've been experimenting
with.

The point there was to show that the WebObjects application had nothing
open other than whatever it is seems to be there when anything connects
to a database (?) with psql (?).

Regards,
Jim


Re: Problems Vacuum'ing

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> But, if I read the code correctly, the oldest xmin vacuum cares about
> for a non-shared relation should be local to the database, shouldn't it?

It's the oldest xmin of any transaction that's local to your database,
but those xmin values themselves were computed globally --- so what
matters is the oldest transaction that was running when any local
transaction started.  In this case I expect it's the VACUUM's own
transaction that's seeing the other guy as determining its xmin.

We could fix this by making every transaction compute, and advertise in
the PGPROC array, both local and global xmin values.  In previous
iterations of this discussion we concluded that the extra cycles (which
would be spent in *every* transaction start) could not be justified by
making VACUUM better able to reclaim space in the face of misbehaving
clients.  That conclusion might be wrong, but it's not instantly obvious
that it is...
        regards, tom lane


Re: Problems Vacuum'ing

From
Tom Lane
Date:
jseymour@LinxNet.com (Jim Seymour) writes:
> But I see entries like that if I just *start* *up* psql, without
> doing anything:

Sure.  You are "doing something" when you execute "select from
pg_locks" ... that command executes inside a transaction, just
like any other Postgres operation.  The problem you're facing
is that WebObjects is creating a transaction that persists for
a long period of time.
        regards, tom lane


Re: Problems Vacuum'ing

From
Jochem van Dieten
Date:
Tom Lane wrote:
> It's the oldest xmin of any transaction that's local to your database,
> but those xmin values themselves were computed globally --- so what
> matters is the oldest transaction that was running when any local
> transaction started.  In this case I expect it's the VACUUM's own
> transaction that's seeing the other guy as determining its xmin.
> 
> We could fix this by making every transaction compute, and advertise in
> the PGPROC array, both local and global xmin values.  In previous
> iterations of this discussion we concluded that the extra cycles (which
> would be spent in *every* transaction start) could not be justified by
> making VACUUM better able to reclaim space in the face of misbehaving
> clients.

I don't suppose it is possible to find out to which database a 
transaction was local after it was committed?


> That conclusion might be wrong, but it's not instantly obvious
> that it is...

Would it be possible to find out how long a transaction has been 
open already? It is quite simple to find the oldest uncommitted 
transaction using the pg_locks table, but from there we don't 
know yet how old it is. If it were possible to determine when it 
started the vacuum verbose output could perhaps include something 
like :
DETAIL:  113590 dead row versions cannot be removed yet.
Transaction 1234567 is has been in progress for 01:45:21,
only dead row versions committed before that are removable.
Nonremovable row versions range from 64 to 88 bytes long.

Jochem

PS Sorry about messing up the threading, I read the archives.

-- 
I don't get it
immigrants don't work
and steal our jobs    - Loesje