Re: order of nested loop - Mailing list pgsql-general

From Tom Lane
Subject Re: order of nested loop
Msg-id 19338.1055914922@sss.pgh.pa.us
In response to Re: order of nested loop  ("Jim C. Nasby" <jim@nasby.net>)
Responses Re: order of nested loop
List pgsql-general

"Jim C. Nasby" <jim@nasby.net> writes:
> ... I'm sure there's plenty of other ways MVCC info could be
> stored without using 16/20 bytes per tuple.

I didn't really see a single workable idea there.  Keep in mind that
storage space is only one consideration (and not a really big one given
modern disk-drive sizes).  Ask yourself about atomicity, failure
recovery, and update costs.  RLE encoding of tuple states?  Get real ---
how many rows could get wiped out by a one-bit lossage?  How extensive
are the on-disk changes needed to encode a one-tuple change in state,
and how do you recover if the machine crashes when only some of those
changes are down to disk?  In my opinion PG's on-disk structures are
barely reliable enough now; we don't want to introduce compression
schemes with the potential for large cross-tuple failure modes.
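
To make the one-bit question concrete, here is a minimal sketch in Python.
It is a toy model only: the [count, value] run layout and all names in it
are assumptions made for illustration, not PostgreSQL's on-disk format and
not anything proposed in the thread.  It counts how many rows end up with
a wrong or missing state after a single flipped bit, first with one state
per tuple and then with run-length-encoded states.

```python
# Toy model only: the state layout below is hypothetical, not PostgreSQL's format.

def rle_encode(states):
    """Run-length encode a list of ints as [count, value] pairs."""
    runs = []
    for s in states:
        if runs and runs[-1][1] == s:
            runs[-1][0] += 1
        else:
            runs.append([1, s])
    return runs


def rle_decode(runs):
    """Expand [count, value] pairs back into a flat list."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


def damaged_rows(original, recovered):
    """Count positions whose state differs or is missing after recovery."""
    longest = max(len(original), len(recovered))
    pad = [None] * longest
    a = (original + pad)[:longest]
    b = (recovered + pad)[:longest]
    return sum(1 for x, y in zip(a, b) if x != y)


# 1000 tuples in state 1 followed by 1000 tuples in state 0.
states = [1] * 1000 + [0] * 1000

# Per-tuple storage: one flipped bit touches exactly one tuple's state.
per_tuple = list(states)
per_tuple[10] ^= 1
per_tuple_damage = damaged_rows(states, per_tuple)

# RLE storage: one flipped bit in a run's value changes the whole run.
value_flipped = rle_encode(states)
value_flipped[0][1] ^= 1
rle_value_damage = damaged_rows(states, rle_decode(value_flipped))

# RLE storage: one flipped bit in a run's count shifts every later tuple.
count_flipped = rle_encode(states)
count_flipped[0][0] ^= 512
rle_count_damage = damaged_rows(states, rle_decode(count_flipped))

print(per_tuple_damage, rle_value_damage, rle_count_damage)
```

In this toy model the per-tuple layout loses one row, while the same
single-bit error in a run's value or count damages on the order of a
thousand rows.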

Storing commit state in index entries has been repeatedly proposed
and repeatedly rejected, too.  It converts an atomic operation
(update one word in one page) into a non-atomic, multi-page operation,
which creates lots of performance and reliability problems.  And the
point of an index is to be smaller than the main table --- the more
stuff you cram into an index tuple header, the less the advantage
of having the index.
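
The atomicity point can be sketched the same way.  The Python below is a
simplified model under stated assumptions: each single page write is taken
to be atomic, the page names are made up, and no real PostgreSQL structure
or recovery mechanism is modeled.  It counts the crash points at which the
copies of a tuple's commit state disagree with each other.

```python
# Simplified model: assumes a single page write is atomic; page names are hypothetical.

def write_until_crash(disk, writes, crash_after):
    """Apply page writes in order; only the first `crash_after` reach disk."""
    for page, value in writes[:crash_after]:
        disk[page] = value
    return disk


def inconsistent_crash_points(pages):
    """Count crash points that leave copies of the commit state disagreeing."""
    writes = [(page, "committed") for page in pages]
    bad = 0
    for crash_after in range(len(writes) + 1):
        disk = {page: "in progress" for page in pages}
        write_until_crash(disk, writes, crash_after)
        if len(set(disk.values())) > 1:
            bad += 1
    return bad


# Commit state kept in one place: every crash point is all-or-nothing.
heap_only = inconsistent_crash_points(["heap page"])

# Commit state copied into two index entries: some crash points are torn.
heap_plus_indexes = inconsistent_crash_points(
    ["heap page", "index page A", "index page B"]
)

print(heap_only, heap_plus_indexes)
```

With one copy there is no crash point that leaves a mixed state; with
three copies there are two, and each extra copy adds another.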

Criticism in the form of a patch with experimental evidence is welcome,
but I'm not really interested in debating what-if proposals, especially
not ones that are already discussed in the archives.

            regards, tom lane
