Re: Variable length varlena headers redux - Mailing list pgsql-hackers
From:    Bruce Momjian
Subject: Re: Variable length varlena headers redux
Msg-id:  200702090358.l193w7v02893@momjian.us
List:    pgsql-hackers
Uh, I thought the approach was to create type-specific in/out functions, and
add casting so that every time they were referenced, they would expand to a
varlena structure in memory.

---------------------------------------------------------------------------

Gregory Stark wrote:
>
> I've been looking at this again and had a few conversations about it.
> This may be easier than I had originally thought, but there's one major
> issue that's bugging me. Do you see any way to avoid having every user
> function everywhere use a new macro API instead of VARDATA/VARATT_DATA
> and VARSIZE/VARATT_SIZEP?
>
> The two approaches I see are either
>
> a) To have two sets of macros, one of which (VARATT_DATA and
> VARATT_SIZEP) is for constructing new tuples and behaves exactly as it
> does now, so you always construct a four-byte-header datum. Then in
> heap_form*tuple we check whether a shorter header can be used and
> convert. VARDATA/VARSIZE would be for looking at existing datums and
> would interpret the header bits.
>
> This seems very fragile, since one stray call site using VARATT_DATA to
> find the data in an existing datum would cause random bugs that only
> occur rarely, in certain circumstances. It would even work as long as
> the size is filled in with VARATT_SIZEP first, which it usually is, but
> fail if someone changes the order of the statements.
>
> or
>
> b) Throw away VARATT_DATA and VARATT_SIZEP and make all user functions
> everywhere change over to a new macro API. That seems like a pretty big
> burden. It's safer, but it means every contrib module would have to be
> updated, and so on.
>
> I'm hoping I'm missing something and there's a way to do this without
> breaking the API for every user function.
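For concreteness, approach (a) might look roughly like the sketch below.
The struct layout, the high-bit convention for short headers, and every
definition here are illustrative assumptions, not the actual PostgreSQL
macros:

    #include <stdint.h>
    #include <stddef.h>

    typedef struct varlena
    {
        int32_t vl_len;      /* 4-byte length word; by convention it
                              * counts itself as well as the data */
        char    vl_dat[1];   /* data follows the length word */
    } varlena;

    /* Construction side: callers always build a full 4-byte-header
     * datum, exactly as today; heap_form*tuple would shrink the header
     * afterwards when the datum is small enough. */
    #define VARATT_SIZEP(PTR)  (&((varlena *) (PTR))->vl_len)
    #define VARATT_DATA(PTR)   (((varlena *) (PTR))->vl_dat)

    /* Read side: interpret the header bits of an existing datum.  For
     * this sketch, assume a set high bit on the first byte marks a
     * 1-byte header whose low 7 bits hold the total length. */
    #define VARSIZE(PTR) \
        ((*(uint8_t *) (PTR) & 0x80) \
         ? (size_t) (*(uint8_t *) (PTR) & 0x7F) \
         : (size_t) ((varlena *) (PTR))->vl_len)
    #define VARDATA(PTR) \
        ((*(uint8_t *) (PTR) & 0x80) \
         ? (char *) (PTR) + 1 \
         : ((varlena *) (PTR))->vl_dat)

The fragility Gregory describes is visible right in the sketch: VARATT_DATA
applied to an existing short-header datum silently returns a pointer four
bytes past the start of the datum instead of one.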
> -- Start of included mail
> From: Tom Lane <tgl@sss.pgh.pa.us>
> To: Gregory Stark <stark@enterprisedb.com>
> cc: Gregory Stark <gsstark@mit.edu>, Bruce Momjian <bruce@momjian.us>,
>     Peter Eisentraut <peter_e@gmx.net>, pgsql-hackers@postgresql.org,
>     Martijn van Oosterhout <kleptog@svana.org>
> Subject: Re: [HACKERS] Fixed length data types issue
> Date: Mon, 11 Sep 2006 13:15:43 -0400
> Lines: 64
> Xref: stark.xeocode.com work.enterprisedb:683
>
> Gregory Stark <stark@enterprisedb.com> writes:
> > In any case it seems a bit backwards to me. Wouldn't it be better to
> > preserve bits in the case of short length words where they're precious
> > rather than long ones? If we make 0xxxxxxx the 1-byte case it means ...
>
> Well, I don't find that real persuasive: you're saying that it's
> important to have a 1-byte not 2-byte header for datums between 64 and
> 127 bytes long, which is by definition less than a 2% savings for those
> values. I think it's more important to pick bit patterns that reduce
> the number of cases heap_deform_tuple has to think about while decoding
> the length of a field --- every "if" in that inner loop is expensive.
>
> I realized this morning that if we are going to preserve the rule that
> 4-byte-header and compressed-header cases can be distinguished from the
> data alone, there is no reason to be very worried about whether the
> 2-byte cases can represent the maximal length of an in-line datum.
> If you want to do 16K inline (and your page is big enough for that)
> you can just fall back to the 4-byte-header case. So there's no real
> disadvantage if the 2-byte headers can only go up to 4K or so. This
> gives us some more flexibility in the bitpattern choices.
>
> Another thought that occurred to me is that if we preserve the
> convention that a length word's value includes itself, then for a
> 1-byte header the bit pattern 10000000 is meaningless --- the count
> has to be at least 1. So one trick we could play is to take over
> this value as the signal for "TOAST pointer follows", with the
> assumption that the tuple-decoder code knows a priori how big a
> TOAST pointer is. I am not real enamored of this, because it certainly
> adds one case to the inner heap_deform_tuple loop, and it'll give us
> problems if we ever want more than one kind of TOAST pointer. But
> it's a possibility.
>
> Anyway, a couple of encodings that I'm thinking about now involve
> limiting uncompressed data to 1G (same as now), so that we can play
> with the first 2 bits instead of just 1:
>
> 00xxxxxx  4-byte length word, aligned, uncompressed data (up to 1G)
> 01xxxxxx  4-byte length word, aligned, compressed data (up to 1G)
> 100xxxxx  1-byte length word, unaligned, TOAST pointer
> 1010xxxx  2-byte length word, unaligned, uncompressed data (up to 4K)
> 1011xxxx  2-byte length word, unaligned, compressed data (up to 4K)
> 11xxxxxx  1-byte length word, unaligned, uncompressed data (up to 63b)
>
> or
>
> 00xxxxxx  4-byte length word, aligned, uncompressed data (up to 1G)
> 010xxxxx  2-byte length word, unaligned, uncompressed data (up to 8K)
> 011xxxxx  2-byte length word, unaligned, compressed data (up to 8K)
> 10000000  1-byte length word, unaligned, TOAST pointer
> 1xxxxxxx  1-byte length word, unaligned, uncompressed data (up to 127b)
>           (xxxxxxx not all zero)
>
> This second choice allows longer datums in both the 1-byte and 2-byte
> header formats, but it hardwires the length of a TOAST pointer and
> requires four cases to be distinguished in the inner loop; the first
> choice only requires three cases, because TOAST pointer and 1-byte
> header can be handled by the same rule "length is low 6 bits of byte".
> The second choice also loses the ability to store in-line compressed
> data above 8K, but that's probably an insignificant loss.
>
> There's more than one way to do it ...
>
> 	regards, tom lane
> -- End of included mail.
>
> --
> Gregory Stark
> EnterpriseDB    http://www.enterprisedb.com

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
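To make Tom's three-case claim for the first encoding concrete, here is a
hedged stand-alone decoder sketch. varsize_any is a hypothetical helper,
not PostgreSQL source; the exact placement of the length bits within the
2-byte and 4-byte headers, and big-endian storage of the 4-byte word, are
assumptions the table above does not pin down:

    #include <stdint.h>
    #include <stddef.h>

    /*
     * Return the total size in bytes (header included) of a datum
     * starting at p, under the first encoding proposed above.  Three
     * cases suffice because TOAST pointers (100xxxxx) and 1-byte
     * headers (11xxxxxx) share one rule: length is the low 6 bits.
     */
    static size_t
    varsize_any(const uint8_t *p)
    {
        if ((p[0] & 0x80) == 0)
        {
            /* 00xxxxxx / 01xxxxxx: aligned 4-byte length word.  The
             * second-highest bit is the compression flag; the low 30
             * bits are the length, which by convention counts the
             * length word itself.  Big-endian storage assumed. */
            return ((size_t) (p[0] & 0x3F) << 24)
                 | ((size_t) p[1] << 16)
                 | ((size_t) p[2] << 8)
                 | (size_t) p[3];
        }
        else if ((p[0] & 0xE0) == 0xA0)
        {
            /* 1010xxxx / 1011xxxx: unaligned 2-byte header.  The 0x10
             * bit separates compressed from uncompressed; the other
             * 12 bits give lengths up to 4K. */
            return ((size_t) (p[0] & 0x0F) << 8) | (size_t) p[1];
        }
        else
        {
            /* 100xxxxx (TOAST pointer) and 11xxxxxx (short inline
             * datum): length, header byte included, is the low 6 bits. */
            return (size_t) (p[0] & 0x3F);
        }
    }

Each branch tests only leading bits of the first byte, so the inner loop
needs just three comparisons; the second encoding would add a fourth branch
to separate the 10000000 TOAST-pointer case from the other 1-byte headers.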