Home > mailing lists

Re: [HACKERS] postgres inode q's - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [HACKERS] postgres inode q's
Date	October 23, 1999 00:16:56
Msg-id	26794.940641329@sss.pgh.pa.us Whole thread Raw
In response to	Re: [HACKERS] postgres inode q's (Bryan Ingram <bingram@cpsgroup.com>)
List	pgsql-hackers

Tree view

Bryan Ingram <bingram@cpsgroup.com> writes:
> What happens if the INSERT/DELETE is done without a transaction
> (BEGIN/COMMIT)? Is an XID still generated for that particular tuple,
> or is the tuple instantly commited with no XID stamped into the
> beginning/ending fields?

There is always a transaction.  Postgres effectively generates an
implict BEGIN and END around any query that's not inside an explicit
transaction block.  This is why failing statements don't cause trouble;
their transactions get aborted.

> When I updated the "new" database with the new records I used the  DELETE then
> INSERT trick to avoid having to write logic to first see if there was an
> existing record and then to update only the changing fields.  Since I actually
> deleted, then inserted,  I'm guessing that the XID would change so that when I
> moved the database over to the other server, ALL of the XIDs would be
> different, not just the newly added rows.  In which case, I would expect
> VACUUM to wipe everything.  Instead, it only wiped the new rows, which tells
> me that even though I DELETED/INSERTED all existing rows, that somehow the
> XID's still sync with the XID's on the other server.

Yeah, but the old tuples are *still there*.  They are marked as having
been deleted by transaction XID so-and-so.  When you moved the files,
those transaction numbers are no longer thought to be committed, so
the old tuples come back to life (just as the new tuples are no longer
considered valid, because their inserting transaction is not known to
be committed).

There is a potential hole in this theory, which relates to a point Jan
didn't make in his otherwise excellent discussion.  A tuple normally
doesn't stay marked with its creating or deleting XID number for all
that long, because we don't really want to pay the overhead of
consulting pg_log for every single tuple.  So, as soon as any backend
checks a tuple and sees that its inserting transaction did commit,
it rewrites the tuple with a new state "INSERT KNOWN COMMITTED" (which
is represented by inserting XID = 0 or some such).  After that, no one
has to check pg_log anymore for that tuple; it's good.  Similarly, the
deleting XID only stays on the tuple until someone verifies that the
deleting transaction committed; after that the tuple is marked KNOWN
DEAD, and it'll stay dead no matter what's in pg_log.  VACUUM is really
only special in that it reclaims space occupied by known-dead tuples;
when it checks/updates the state of a tuple, it's not doing anything
that's not done by a plain SELECT.

So, AFAICT, you could only have seen the problem for tuples that were
not scanned by any SELECT or UPDATE operation subsequent to having been
inserted/deleted and committed.  If you did all the deletes/inserts
inside a transaction, committed, and then immediately copied the files,
then for sure you'd have gotten burnt.  If you did any sort of SELECT
from the table after committing the changes, I'd have expected the tuple
states to get frozen --- at least for the tuples that SELECT visited,
which might not have been all of them if the SELECT was able to use an
index.
        regards, tom lane

pgsql-hackers by date:

From: "D'Arcy" "J.M." Cain
Date: 22 October 1999, 23:57:56
Subject: Re: [HACKERS] RFC: Industrial-strength logging (long message)

From: Bruce Momjian
Date: 23 October 1999, 00:25:56
Subject: Re: [HACKERS] New psql startup banner

Re: [HACKERS] postgres inode q's - Mailing list pgsql-hackers

Previous

Next