Re: Reducing the overhead of NUMERIC data - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Reducing the overhead of NUMERIC data
Date
Msg-id 20051103140926.GC15795@svana.org
Whole thread Raw
In response to Re: Reducing the overhead of NUMERIC data  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Reducing the overhead of NUMERIC data  (mark@mark.mielke.cc)
List pgsql-hackers
On Thu, Nov 03, 2005 at 01:49:46PM +0000, Simon Riggs wrote:
> In other databases, CHAR(12) and NUMERIC(12) are fixed length datatypes.
> In PostgreSQL, they are dynamically varying datatypes.

Please explain how a CHAR(12) can store 12 UTF-8 characters when each
character may be 1 to 4 bytes, unless the CHAR itself is variable
length...

> What actually happens is that in many other systems the datatype is the
> same, but additional metadata is provided for that particular attribute.
> So CHAR(12) is a datatype of CHAR with a metadata item called length
> which is set to 12 for that attribute.

We already have this metadata, it's called atttypmod and it's stored in
pg_attribute. That's where the 12 for CHAR(12) is stored BTW.

> On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of
> that datatype having a 4 byte varlena header. In this example, all of
> those instantiations having the varlena header set to 12, so essentially
> wasting the 4 byte header.

Nope, the verlena header stores the actual length on disk. If you store
"hello" in a char(12) field it takes only 9 bytes (4 for the header, 5
for the data), which is less than 12.

Good ideas, but it all hinges on the fact that CHAR(12) can take a
fixed amount of space, which simply isn't true in a multibyte encoding.

Having a different header for things shorter than 255 bytes has been
discussed before, that's another argument though.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

pgsql-hackers by date:

Previous
From: "Steinar H. Gunderson"
Date:
Subject: Transitive closure of a directed graph
Next
From: Alvaro Herrera
Date:
Subject: Re: Reducing the overhead of NUMERIC data