On-disk Tuple Size - Mailing list pgsql-hackers

From Curt Sampson
Subject On-disk Tuple Size
Msg-id Pine.NEB.4.43.0204201608060.467-100000@angelic.cynic.net
Responses Re: On-disk Tuple Size  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
[I've moved this discussion about changing the line pointer from four
bytes to two from -general to -hackers, since it's fairly technical.
The entire message Tom is responding to is appended to this one.]

On Sat, 20 Apr 2002, Tom Lane wrote:

> Curt Sampson <cjs@cynic.net> writes:
> > ... Then we could declare that all tuples must be aligned on a
> > four-byte boundary, use the top 14 bits of a 16-bit line pointer as the
> > address, and the bottom two bits for the LP_USED and LP_DELETED flags.
> > This would slightly simplify the code for determining the flags, and
> > incidentally boost the maximum page size to 64K.
>
> Hmm.  Maybe, but the net effect would only be to reduce the minimum row
> overhead from 36 to 34 bytes.  Not sure it's worth worrying about.
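
To make the proposal concrete, here is a minimal sketch of the packed
16-bit line pointer. The names (PackedLinePointer, plp_make and so on)
are hypothetical and are not PostgreSQL's actual ItemIdData definitions.
It assumes tuples start on four-byte boundaries, so the low two bits of
an offset are always zero and can hold the flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit line pointer: top 14 bits address, low 2 bits flags. */
typedef uint16_t PackedLinePointer;

#define PLP_USED     0x0001u   /* slot refers to a tuple */
#define PLP_DELETED  0x0002u   /* tuple is dead */
#define PLP_FLAGMASK 0x0003u
#define PLP_ADDRMASK 0xFFFCu

/* Pack a four-byte-aligned page offset and the flag bits into 16 bits. */
static PackedLinePointer plp_make(uint32_t offset, unsigned flags)
{
    /* The offset must be aligned and must fit below 64K. */
    assert((offset & 3u) == 0 && offset <= PLP_ADDRMASK);
    return (PackedLinePointer) (offset | (flags & PLP_FLAGMASK));
}

/* Recover the byte offset: mask off the flag bits, no shift needed. */
static uint32_t plp_offset(PackedLinePointer lp)
{
    return lp & PLP_ADDRMASK;
}

/* Recover the flag bits. */
static unsigned plp_flags(PackedLinePointer lp)
{
    return lp & PLP_FLAGMASK;
}
```

With this layout, plp_make(8188, PLP_USED) gives a pointer whose
plp_offset() is 8188 and whose plp_flags() is PLP_USED. The largest
addressable offset is 65532, which is where the 64K page limit comes from.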

Well, unless the implementation is hideously complex, I'd say that
every byte is worth worrying about, given the amount of overhead that's
currently there. 36 to 34 bytes could give something approaching a 5%
performance increase for tables with short rows. (Actually, do we prefer
the tables/rows or relations/tuples terminology here? I guess I kinda
tend to use the latter for physical stuff.)

If we could drop the OID from the tuple when it's not being used,
that would be another four bytes, bringing the performance increase
up towards 15% on tables with short rows.
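
The arithmetic behind those percentages can be sketched as below. This
rests on two assumptions of mine: that the gain is roughly the fraction
of per-row bytes saved, and that a "short row" carries about four bytes
of user data. The function names are made up for illustration, and page
headers and alignment padding are ignored.

```c
#include <assert.h>

/* Fraction of each row's on-disk bytes saved by trimming `saved` bytes
 * of overhead, for a row carrying `data` bytes of user data. */
static double row_savings(int overhead, int data, int saved)
{
    assert(overhead + data > 0 && saved >= 0 && saved <= overhead);
    return (double) saved / (double) (overhead + data);
}

/* Whole rows that fit in `page_bytes`, ignoring the page header and
 * alignment padding (a deliberate simplification). */
static int rows_per_page(int page_bytes, int overhead, int data)
{
    assert(page_bytes >= 0 && overhead + data > 0);
    return page_bytes / (overhead + data);
}
```

Under those assumptions, row_savings(36, 4, 2) is 2/40, or 5%, and
row_savings(36, 4, 6) is 6/40, or 15%. On an assumed 8192-byte page,
rows_per_page() goes from 204 rows at 36 bytes of overhead to 215 at 34
and 240 at 30. Wider rows shrink the gain accordingly.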

Of course I understand that all this is contingent not only on such
changes being acceptable, but someone actually caring enough to
write them.

While we're at it, would someone have the time to explain to me
how the on-disk CommandIds are used? A quick look at the code
indicates that they are used for cursor consistency, among other
things, but it's still a bit mysterious to me.

> > ... I don't see why we would then
> > need the LP_DELETED flag at all.
>
> I believe we do want to distinguish three states: live tuple, dead
> tuple, and empty space.  Otherwise there will be cases where you're
> forced to move data immediately to collapse empty space, when there's
> not a good reason to except that your representation can't cope.

I don't understand this. Why do you need to collapse empty space
immediately? Why not just wait until you can't find an empty fragment
in the page that's big enough, and then do the collapse?
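
A rough sketch of the deferred collapse described above, using a toy
page layout that is not PostgreSQL's. Everything here (Page, Slot,
page_insert and friends) is hypothetical. As simplifications, only the
trailing free region is searched for space, holes left by deleted tuples
are reclaimed only when a collapse runs, and slots are never reused.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 64
#define MAX_SLOTS  16

typedef struct {
    uint16_t off;    /* start of the tuple in data[] */
    uint16_t len;    /* tuple length in bytes */
    int      used;   /* nonzero while the tuple is live */
} Slot;

typedef struct {
    uint8_t  data[PAGE_BYTES];
    Slot     slots[MAX_SLOTS];   /* live slots stay in offset order */
    int      nslots;
    uint16_t tail;               /* first byte of the trailing free region */
} Page;

/* Deleting only clears the flag; no data moves at this point. */
static void page_delete(Page *p, int i)
{
    assert(i >= 0 && i < p->nslots);
    p->slots[i].used = 0;
}

/* Slide live tuples toward the start of data[], closing every hole. */
static void page_compact(Page *p)
{
    uint16_t dst = 0;
    for (int i = 0; i < p->nslots; i++) {
        Slot *s = &p->slots[i];
        if (!s->used)
            continue;
        memmove(p->data + dst, p->data + s->off, s->len);
        s->off = dst;
        dst += s->len;
    }
    p->tail = dst;
}

/* Insert a tuple, collapsing free space only when the tail is too small.
 * Returns the new slot number, or -1 if the tuple cannot fit. */
static int page_insert(Page *p, const void *src, uint16_t len)
{
    if (p->nslots == MAX_SLOTS)
        return -1;
    if (PAGE_BYTES - p->tail < len) {
        page_compact(p);                 /* the deferred collapse */
        if (PAGE_BYTES - p->tail < len)
            return -1;
    }
    Slot *s = &p->slots[p->nslots];
    s->off = p->tail;
    s->len = len;
    s->used = 1;
    memcpy(p->data + s->off, src, len);
    p->tail += len;
    return p->nslots++;
}
```

In this sketch a delete costs nothing beyond clearing a flag, and the
memmove work is paid only by the insert that would otherwise fail.
Whether the real page code can tolerate that is exactly the question.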

Oh, on a final unrelated note, <john@akadine.com>, you're bouncing
mail from my host for reasons not well explained ("550 Access
denied."). I tried postmaster at your site, but that bounces mail
too. If you want to work out the problem, drop me e-mail from an
address I can reply to.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC
 

------- Previous Message --------
