Re: order of nested loop - Mailing list pgsql-general
From | Jim C. Nasby |
---|---|
Subject | Re: order of nested loop |
Date | |
Msg-id | 20030617235337.X66185@flake.decibel.org Whole thread Raw |
In response to | Re: order of nested loop (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: order of nested loop
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-general |
On Tue, Jun 17, 2003 at 07:29:16PM -0400, Tom Lane wrote: > "Jim C. Nasby" <jim@nasby.net> writes: > > On a related note; since temporary tables are only visible to a single > > connection, do they have full MVCC info in them, or can it be bypassed? > > You can't really bypass it. You still need the transaction number so > you can tell if a tuple was created by a committed, aborted, or the > current transaction, and you still need the command number for the same > reasons as usual. You do need that info, but I think a local-only temp. table means you don't have to use XID (and maybe CID, I don't really know how that's used) to store the info. This is because only a single transaction can reference the table. As part of committing an update, you can either set a simple flag in the old tuples, or you might even be able to just delete them (though nested transactions would probably break that). You would have to keep a list of what tuples are old and new, but that's probably more efficient than 16/20 bytes per tuple. OTOH, I've also been wondering if MVCC info could be stored more efficiently across the board. One idea is to keep a list of valid MVCC states, and assign int (or maybe even smaller) ID's to every entry in that list. Each tuple would only have that entry number then. This would probably not be good for tables that have a lot of single-row transactions (though there's another advantage that might offset the overhead), but for tables that only see fairly large transactions it could be a tremendous gain. The other advantage to normalizing the MVCC info is it should then be reasonable to store it in indexes as well as base tuples. This could dramatically speed up index work, since you don't have to read the base tuple to see if it's still valid. It would also allow things like index covering, and count(*) to do only an index scan. Another alternative to an MVCC ID would be to store the MVCC info outside the mainline storage with a reference back to the base tuple, and use some form of RLE. IE: xmin, cmin, xmax, cmax, xvac, min_tid, max_tid where min_tid and max_tid define the range of tuples that MVCC info applies to. I'm sure there's plenty of other ways MVCC info could be stored without using 16/20 bytes per tuple. -- Jim C. Nasby (aka Decibel!) jim@nasby.net Member: Triangle Fraternity, Sports Car Club of America Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
pgsql-general by date: