Thread: Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
"Gokulakannan Somasundaram"
Date:
I have submitted the first working patch for the trailing-null optimization. It currently does the following:
a) It doesn't store the null bitmap if the only nulls in the heap tuple / index tuple are trailing nulls.
b) In a heap tuple, trailing nulls don't occupy space in the null bitmap.

The general design is as follows:
a) After checking for trailing nulls, I reduce the number-of-attributes field that is stored in each heap tuple.
b) For indexes, I have changed index_form_tuple to store the unaligned total size in the size mask. While navigating through the index tuple, if the offset exceeds the stored unaligned total size, a null is returned.
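
A minimal sketch of step (a), assuming the tuple's null flags are available as a plain boolean array. The helper names are hypothetical and are not taken from the patch; they only illustrate how trimming trailing nulls can make the null bitmap unnecessary.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): number of attributes that
 * remain once trailing NULLs are dropped from the end of the tuple.
 */
static int
sketch_effective_natts(const bool *isnull, int natts)
{
    while (natts > 0 && isnull[natts - 1])
        natts--;
    return natts;
}

/*
 * Hypothetical helper: true if any of the first natts attributes is
 * NULL, i.e. a null bitmap is still needed after trimming.
 */
static bool
sketch_needs_null_bitmap(const bool *isnull, int natts)
{
    for (int i = 0; i < natts; i++)
        if (isnull[i])
            return true;
    return false;
}
```

With null flags {false, false, true, true}, the stored attribute count drops from 4 to 2 and no bitmap is needed. With {false, true, false, true}, the count drops to 3 but the bitmap must stay because of the null in the middle.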

Please review it and provide suggestions.

 
>
> I doubt you have fixed it; I doubt it's *possible* to fix it without
> significant rejiggering of IndexTuple representation.  The problem is
> that IndexTuple lacks a number-of-fields field, so there is no place
> to indicate how many null bitmap bits you have actually stored.

Actually, I have made one change to the structure of IndexTupleData: instead of storing the aligned size in the size mask, I store the unaligned size, i.e. the size before the final MAXALIGN. The interface remains unchanged, because IndexTupleSize applies MAXALIGN before returning the size value. The advantage of storing the unaligned size is that both the aligned and the unaligned size can be derived from it. I have created two more macros to return the unaligned size.
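
A rough sketch of the idea, not the patch's actual macros: the SKETCH_ names are made up for illustration, and the alignment value of 8 is an assumption (it is commonly 4 on 32-bit platforms). Treating an offset at or beyond the stored size as a trailing null is also an assumption about the exact boundary condition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed maximum alignment; platform dependent in practice. */
#define SKETCH_ALIGNOF 8

/* Round len up to the next multiple of SKETCH_ALIGNOF. */
#define SKETCH_MAXALIGN(len) \
    (((size_t) (len) + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1)))

/* The aligned size can always be recomputed from the stored unaligned size. */
static size_t
sketch_aligned_size(size_t unaligned_size)
{
    return SKETCH_MAXALIGN(unaligned_size);
}

/*
 * Hypothetical check: an attribute that would start at or beyond the
 * stored unaligned size was never written, so it is read as NULL.
 */
static bool
sketch_attr_is_trailing_null(size_t attr_offset, size_t unaligned_size)
{
    return attr_offset >= unaligned_size;
}
```

The reverse derivation is not possible: once only the aligned size is stored, the padding added by the final alignment can no longer be told apart from data.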

 
  

> I would suggest forgetting that part and submitting the part that
> has some chance of getting accepted.

Actually, I want to submit the patch that is best in my view.
 


> I suspect there's also an awkward case that *does* need to be handled when you
> insert a tuple which has a null column which you're leaving out of the tuple
> but which appears in an index. You would have to make sure that the index
> tuple has that datum listed as NULL even though it's entirely missing from the
> heap tuple.

Actually, this is taken care of, thanks to your suggestion. When you add a new column, it doesn't appear in the existing heap tuples, but if you create an index on that column afterwards, the case is handled. There is a field in the heap tuple that records the number of attributes in the tuple. If an attribute number greater than this count is requested, it is returned as null. So that problem is taken care of.
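
A simplified sketch of that lookup rule. SketchTuple and sketch_getattr are hypothetical stand-ins, with integer values in place of real datums, and are not the actual heap tuple structures or access routines.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a heap tuple. */
typedef struct SketchTuple
{
    int         natts;      /* number of attributes physically stored */
    const int  *values;     /* stored attribute values */
    const bool *isnull;     /* per-attribute null flags */
} SketchTuple;

/*
 * Fetch attribute attnum (1-based). Any attribute number beyond the
 * stored count is reported as NULL, which covers both columns added
 * later and trimmed trailing nulls.
 */
static int
sketch_getattr(const SketchTuple *tup, int attnum, bool *isnull)
{
    if (attnum > tup->natts)
    {
        *isnull = true;
        return 0;
    }
    *isnull = tup->isnull[attnum - 1];
    return *isnull ? 0 : tup->values[attnum - 1];
}
```

An index build over a column that is missing from the stored tuple would then see a null for it and could record the index entry as NULL.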


--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)
Attachment

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
Andrew Dunstan
Date:

Gokulakannan Somasundaram wrote:
>
>
>
>
>
>     > I would suggest forgetting that part and submitting the part that
>     > has some chance of getting accepted.
>
>
> Actually i want to submit the patch, which is best according to me.
>
>

That's not an attitude that is likely to succeed - you need to take
suggestions from Tom very seriously.

Also, please submit patches as context diffs, as set out in the
Developer FAQ, which you should probably read carefully:
http://www.postgresql.org/docs/faqs.FAQ_DEV.html

cheers

andrew

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 19 Dec 2007 13:46:15 -0500
Andrew Dunstan <andrew@dunslane.net> wrote:

> >     > I would suggest forgetting that part and submitting the part
> >     > that has some chance of getting accepted.
> >
> >
> > Actually i want to submit the patch, which is best according to me.
> >  

You do need to be able to feel that your work is up to a
standard that you find redeemable. However...

> That's not an attitude that is likely to succeed - you need to take 
> suggestions from Tom very seriously.

Andrew is absolutely correct here. If you do not agree with Tom, you
best prove why. Otherwise your patch will likely be ignored on
submission.

Sincerely,

Joshua D. Drake

- -- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFHaWj9ATb/zqfZUUQRAqsNAJ9k6p0z7rQEcqal0JoKw/ZZG8h5kACfaB9y
xQJ4O+h1xe947O1gnTLEbTU=
=WaSW
-----END PGP SIGNATURE-----

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
"Gokulakannan Somasundaram"
Date:
Thanks for the suggestions. I am re-submitting the patch in context diff format.

As far as storage savings are concerned, the patch delivers what it claims. I checked this by creating a table with 10 columns on a 32-bit machine. I inserted 100,000 rows with trailing nulls and observed savings of 400 KB.
I did a similar test for an index and found similar space savings.

I have run the regression tests on both a 32-bit system and a 64-bit system.

Please go through the patch and provide further suggestions.

--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)
Attachment

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
Decibel!
Date:
On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
> I checked it by creating a table with 10 columns on a 32 bit
> machine. i inserted 100,000 rows with trailing nulls and i observed
> savings of 400Kbytes.


That doesn't really tell us anything... how big was the table
originally? Also, testing on 64 bit would be interesting.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)

From
"Gokulakannan Somasundaram"
Date:
Hi,
    I am back from the holidays. I have tried to show that the null bitmap was absent in the table with the trailing nulls.

On Dec 22, 2007 4:43 AM, Decibel! <decibel@decibel.org> wrote:
> On Dec 20, 2007, at 2:36 AM, Gokulakannan Somasundaram wrote:
> > I checked it by creating a table with 10 columns on a 32 bit
> > machine. i inserted 100,000 rows with trailing nulls and i observed
> > savings of 400Kbytes.


> That doesn't really tell us anything...

As I said, the patch removes the null bitmap if the tuple has only trailing nulls. The tuple header without a null bitmap is 23 bytes. Currently, as long as the table has at most 8 columns (with nulls), the heap tuple header will be 24 bytes. But if the table has more than 8 columns, the header will occupy 4 more bytes on a 32-bit system and 8 more bytes on a 64-bit system. This patch attempts to save that extra space if the tuple has only trailing nulls.
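
The arithmetic behind those figures can be sketched as follows. The function names are hypothetical; the 23-byte fixed header is the figure given above, the bitmap uses one bit per attribute, and the alignment of 4 bytes on 32-bit and 8 bytes on 64-bit systems is an assumption that matches the stated savings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Fixed part of the tuple header in bytes, as stated in this message. */
#define SKETCH_FIXED_HEADER 23

/* One bit per attribute, rounded up to whole bytes. */
static size_t
sketch_bitmap_bytes(int natts)
{
    return (size_t) (natts + 7) / 8;
}

/*
 * Header size after alignment. max_align is the platform's maximum
 * alignment (assumed 4 on 32-bit, 8 on 64-bit) and must be a power of 2.
 */
static size_t
sketch_header_size(int natts, bool has_bitmap, size_t max_align)
{
    size_t len = SKETCH_FIXED_HEADER;

    if (has_bitmap)
        len += sketch_bitmap_bytes(natts);
    return (len + max_align - 1) & ~(max_align - 1);
}
```

For a 10-column table the bitmap needs 2 bytes, so the header grows from 23 to 25 bytes and is padded to 28 bytes with 4-byte alignment or 32 bytes with 8-byte alignment. Without the bitmap the header is padded to 24 bytes in both cases, which gives the 4-byte and 8-byte savings per tuple.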
 
> how big was the table
> originally?

I think it was 5.5 MB before and 5.1 MB after applying the patch. But how is this relevant? The patch saves 4 bytes per tuple on a 32-bit system, irrespective of the size of the tuple.
 
> Also, testing on 64 bit would be interesting.

I also ran the regression tests with the patch on a 64-bit system. The saving there was 8 bytes per tuple.

I have attempted to provide an explanation, but I don't know whether I have answered your questions exactly.
Please get back to me if anything is still unclear.



--
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)