Re: Fixed length data types issue - Mailing list pgsql-hackers

From	mark@mark.mielke.cc
Subject	Re: Fixed length data types issue
Date	September 8, 2006 13:28:45
Msg-id	20060908132821.GA24823@mark.mielke.cc
In response to	Re: Fixed length data types issue (Peter Eisentraut <peter_e@gmx.net>)
Responses	Re: Fixed length data types issue (Martijn van Oosterhout <kleptog@svana.org>)
List	pgsql-hackers

On Fri, Sep 08, 2006 at 08:57:12AM +0200, Peter Eisentraut wrote:
> Gregory Stark wrote:
> > I think we have to find a way to remove the varlena length header
> > entirely for fixed length data types since it's going to be the same
> > for every single record in the table.
> But that won't help in the example you posted upthread, because char(N) 
> is not fixed-length.

It can be fixed-length, or at least have an upper bound. If the column
is marked as containing only ASCII characters, the length is fixed, at
least in theory, and even if it is Unicode, it is not going to need more
than 4 bytes per character. char(2) through char(16) only require 4 bits
to store the length header, leaving 4 bits for encoding information.
bytea(2) through bytea(16), at least in theory, should require none.
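
To make the 4-bit split concrete, here is a minimal sketch of a one-byte
header, assuming the high nibble holds the length minus one (so 1..16 fits)
and the low nibble holds an encoding id. The layout and the names
pack_header and unpack_header are hypothetical and are not taken from the
PostgreSQL source.

```python
# Hypothetical one-byte header: high nibble = length - 1, low nibble = encoding id.
# This layout is an illustration only; PostgreSQL does not define it.

def pack_header(length: int, encoding_id: int) -> int:
    """Pack a length in 1..16 and an encoding id in 0..15 into one byte."""
    if not 1 <= length <= 16:
        raise ValueError("length must fit in 4 bits (1..16)")
    if not 0 <= encoding_id <= 15:
        raise ValueError("encoding id must fit in 4 bits (0..15)")
    return ((length - 1) << 4) | encoding_id


def unpack_header(header: int) -> tuple:
    """Return (length, encoding_id) recovered from a one-byte header."""
    return (header >> 4) + 1, header & 0x0F
```

Storing the length minus one is what lets 16 fit in 4 bits; a zero-length
value would need a different convention.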

For my own uses, I would like bytea(16) to have no length header.
The length is constant, as for a UUID or an MD5 sum. Store the length
at the head of the table, or look up the information from the schema.
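
As a rough sketch of what "no length header" could mean, the width is
recorded once (here a constant standing in for the schema) and each value
is found by offset alone. COLUMN_WIDTH, pack_column and read_value are
hypothetical names for illustration, and the sketch ignores NULLs and
alignment.

```python
# Hypothetical schema entry: the column width is recorded once, not per value.
COLUMN_WIDTH = 16  # bytes, as for a UUID or an MD5 digest


def pack_column(values, width=COLUMN_WIDTH) -> bytes:
    """Concatenate fixed-width values with no per-value length header."""
    for v in values:
        if len(v) != width:
            raise ValueError("every value must be exactly `width` bytes")
    return b"".join(values)


def read_value(buf: bytes, index: int, width=COLUMN_WIDTH) -> bytes:
    """Locate the index-th value using only the width taken from the schema."""
    start = index * width
    return buf[start:start + width]
```

Two 16-byte values occupy exactly 32 bytes here, where a 4-byte length
header on each would make it 40.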

I see the complexity argument. The existing code is too heavy to change
completely. People are talking about compromises such as allowing the
on-disk layout to differ from the in-memory layout. I wonder
whether the change could be small enough not to significantly
increase CPU use while still having a significant effect. I find myself
doubting the CPU-bound numbers. If even 20% of the data is saved, this
means 20% more RAM for caching, 20% fewer pages touched when
scanning, and 20% less RAM read. When people say CPU-bound, are we
sure they do not mean bound by RAM speed? How do they tell the
difference between the two? RAM lookups count as CPU on most
performance counters I've ever used. RAM is also slower than the
CPU, which leaves time for calculations between accesses, assuming
that the loop allows prefetching to be possible and accurate.
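
A back-of-envelope check of the page-count claim, as a sketch only: it
assumes rows are packed whole into 8192-byte pages (PostgreSQL's default
block size) and ignores page and tuple headers. The row widths and the
function name pages_needed are made up for illustration.

```python
import math

PAGE_SIZE = 8192  # bytes; PostgreSQL's default block size


def pages_needed(row_count: int, row_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Pages needed when whole rows are packed into pages (headers ignored)."""
    rows_per_page = page_size // row_bytes
    return math.ceil(row_count / rows_per_page)


before = pages_needed(1_000_000, 40)  # hypothetical 40-byte rows
after = pages_needed(1_000_000, 32)   # the same rows with 20% of the bytes saved
```

Under these assumptions, one million rows drop from 4902 pages to 3907,
about 20% fewer pages for a sequential scan to touch.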

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                          http://mark.mielke.cc/
