Thread: About varlena2

About varlena2

From

Qingqing Zhou

Date:

05 December 2005, 20:44:44

The idea is to reduce the size of varlena2.vl_len to int16. This has been
mentioned before, but is there any show-stopper preventing us from doing
that, or is somebody already working on it?

Sorry, just to repeat myself. Char types will benefit from that. Many
applications come from DB2, Oracle or SQL Server:
             Max Char Length
DB2          32672
SQL Server   8000
Oracle       4000

All of the above need only varlena2. To support bigger char types, we could
follow the tradition of "long varchar", etc. Or we could introduce several
new data types like "short varchar" to stay compatible with existing
PostgreSQL applications.
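
As a rough illustration of why an int16 length is enough for the limits
listed above, here is a minimal sketch. The struct layout, the macro names
and fits_in_varlena2() are assumptions made for this example (modeled on the
existing int32-header varlena, with the length counting the header itself);
they are not taken from any patch. The limits are treated as byte counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, modeled on struct varlena with a narrower length. */
typedef struct varlena2
{
    int16_t vl_len;     /* total size in bytes, header included (assumed) */
    char    vl_dat[1];  /* first byte of the payload */
} varlena2;

/* Header size and largest payload an int16 length can describe. */
#define VARHDRSZ2          ((long) sizeof(int16_t))
#define VARLENA2_MAX_DATA  ((long) INT16_MAX - VARHDRSZ2)   /* 32765 */

/* Returns 1 if a value of max_bytes payload bytes fits, 0 otherwise. */
int fits_in_varlena2(long max_bytes)
{
    assert(max_bytes >= 0);
    return max_bytes <= VARLENA2_MAX_DATA;
}
```

Under these assumptions, fits_in_varlena2() returns 1 for 32672, 8000 and
4000, and 0 for anything above 32765.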

Regards,
Qingqing

Re: About varlena2

From

Tom Lane

Date:

05 December 2005, 21:10:17

Qingqing Zhou <zhouqq@cs.toronto.edu> writes:
> The idea is to reduce the size of varlena2.vl_len to int16. This has been
> mentioned before, but is there any show-stopper preventing us from doing
> that, or is somebody already working on it?

> Sorry, just to repeat myself. Char types will benefit from that.

I have considerably less than zero interest in creating variant char
types with an int16 header.  The proposal that was on the table was
to use this for numeric and inet types, where it could be done without
introducing any user-visible semantics changes.
        regards, tom lane

Re: About varlena2

From

ITAGAKI Takahiro

Date:

06 December 2005, 00:01:01

Qingqing Zhou <zhouqq@cs.toronto.edu> wrote:

> The idea is to reduce the size of varlena2.vl_len to int16. This has been
> mentioned before, but is there any show-stopper preventing us from doing
> that, or is somebody already working on it?

Hi, I'm rewriting the patch that I proposed before.
(http://archives.postgresql.org/pgsql-hackers/2005-09/msg00421.php)
This is another way to reduce the size of variable length types,
using variable length headers.

I'm sure that there are pros and cons of this approach.
Pros.
 - Optimized for short variables (length <= 127),
   where the header takes only one byte.
 - It can represent long data.
Cons.
 - More complexity and operations to extract lengths and buffers.
 - Needs more work to support TOAST.

To support TOAST, I am considering the following representations.
It might be good to use only A and B if TOAST is not needed.
  | Representation             | Size  | Mode                |
--+----------------------------+-------+---------------------+
A | 0*******            + data | 1 + n | length <= 127       |
B | 10****** +  1 byte  + data | 2 + n | length <= 16K -1    |
C | 110----- +  4 bytes + data | 5 + n | length <= 4G  -1    |
D | 1110---- +  6 bytes + data | 7 + n | Compressed          |
E | 11110--- + 12 bytes        | 13    | External            |
F | 11111--- + 16 bytes        | 17    | External+Compressed |
('*' bits are used for length, '-' are unused.)
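
To make the table concrete, here is a minimal sketch of encoding and
decoding the length for representations A, B and C, plus classifying the
first byte for all six. The function and enum names are hypothetical, and
the byte order of the extra length bytes (most significant first) is an
assumption, since the table does not specify it. D, E and F are only
classified, because the contents of their extra bytes are not described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per row of the table (names are illustrative only). */
typedef enum
{
    VH_A_SHORT, VH_B_MEDIUM, VH_C_LONG,
    VH_D_COMPRESSED, VH_E_EXTERNAL, VH_F_EXTERNAL_COMPRESSED
} VarHeaderKind;

/* Classify a value by the leading bits of its first byte. */
VarHeaderKind var_header_kind(uint8_t first)
{
    if ((first & 0x80) == 0x00) return VH_A_SHORT;       /* 0******* */
    if ((first & 0xC0) == 0x80) return VH_B_MEDIUM;      /* 10****** */
    if ((first & 0xE0) == 0xC0) return VH_C_LONG;        /* 110----- */
    if ((first & 0xF0) == 0xE0) return VH_D_COMPRESSED;  /* 1110---- */
    if ((first & 0xF8) == 0xF0) return VH_E_EXTERNAL;    /* 11110--- */
    return VH_F_EXTERNAL_COMPRESSED;                     /* 11111--- */
}

/* Header size in bytes, taken from the "Size" column. */
size_t var_header_size(uint8_t first)
{
    static const size_t sizes[] = {1, 2, 5, 7, 13, 17};
    return sizes[var_header_kind(first)];
}

/* Write the shortest A/B/C header for len; returns bytes written (1, 2, 5).
 * buf must have room for at least 5 bytes. */
size_t var_encode_length(uint8_t *buf, uint32_t len)
{
    assert(buf != NULL);
    if (len <= 0x7F)                       /* A: 7 length bits */
    {
        buf[0] = (uint8_t) len;
        return 1;
    }
    if (len <= 0x3FFF)                     /* B: 6 + 8 = 14 length bits */
    {
        buf[0] = (uint8_t) (0x80 | (len >> 8));
        buf[1] = (uint8_t) (len & 0xFF);
        return 2;
    }
    buf[0] = 0xC0;                         /* C: length in the next 4 bytes */
    buf[1] = (uint8_t) (len >> 24);
    buf[2] = (uint8_t) (len >> 16);
    buf[3] = (uint8_t) (len >> 8);
    buf[4] = (uint8_t) len;
    return 5;
}

/* Read an A/B/C header; returns its size, or 0 for D/E/F (not handled). */
size_t var_decode_length(const uint8_t *p, uint32_t *len)
{
    assert(p != NULL && len != NULL);
    switch (var_header_kind(p[0]))
    {
        case VH_A_SHORT:
            *len = p[0];
            return 1;
        case VH_B_MEDIUM:
            *len = ((uint32_t) (p[0] & 0x3F) << 8) | p[1];
            return 2;
        case VH_C_LONG:
            *len = ((uint32_t) p[1] << 24) | ((uint32_t) p[2] << 16) |
                   ((uint32_t) p[3] << 8) | (uint32_t) p[4];
            return 5;
        default:
            return 0;
    }
}
```

The sketch also shows the "Cons" in practice: every length read needs a
branch on the first byte before the data pointer can be computed.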

Comments welcome,
---
ITAGAKI Takahiro
NTT Cyber Space Laboratories