Thread: A possible TODO item

A possible TODO item

From
"Gurjeet Singh"
Date:
The comment above TOAST_INDEX_HACK in tuptoaster.h is:

/*
 * This enables de-toasting of index entries.  Needed until VACUUM is
 * smart enough to rebuild indexes from scratch.
 */
#define TOAST_INDEX_HACK

Do we already have a TODO item to remove this hack? If not, I think there
should be, because it is waiting for some other progress to happen.

Best regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: A possible TODO item

From
Tom Lane
Date:
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> The comment above TOAST_INDEX_HACK in tuptoaster.h is:

> /*
>  * This enables de-toasting of index entries.  Needed until VACUUM is
>  * smart enough to rebuild indexes from scratch.
>  */
> #define TOAST_INDEX_HACK

> Do we already have a TODO item to remove this hack? If not, I think there
> should be, because it is waiting for some other progress to happen.

Like what?  If you want to argue that it's important to work on, you'd
better make the case for that.

At first glance you might think that turning it off would Just Work,
because VACUUM should always remove index entries before removing
heap rows, but unfortunately an index might have more copies of a key
than just the one in the directly associated index entry.  (btree,
for example, might have copied the key into a page "high key" and/or
boundary keys in upper tree levels.  Those copies will persist long
after the underlying row is gone.)  And surely we are never going
to make VACUUM force a complete REINDEX as the comment suggests.
So any change in the situation is going to require considerable work
as well as some good ideas we haven't thought of yet.  Plus, it's
unclear that values large enough to require out-of-line storage are
reasonable candidates for indexing in the first place.  So: what's
the argument that is going to persuade everyone that this is really
important?
        regards, tom lane
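
To make the point about surviving key copies concrete, here is a minimal sketch in C. It is a toy model only: `ToyLeafPage` and every `toy_*` function are hypothetical names invented for illustration, and none of this resembles PostgreSQL's actual btree page layout or VACUUM code. It only shows that removing a leaf entry does not touch a separate copy of the same key kept as the page's high key.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model (hypothetical, not PostgreSQL's page format). */
#define TOY_MAX_ITEMS 4
#define TOY_KEY_LEN   32

typedef struct ToyLeafPage
{
    char high_key[TOY_KEY_LEN];             /* separate copy of a key bounding the page */
    char items[TOY_MAX_ITEMS][TOY_KEY_LEN]; /* entries that point at heap rows */
    int  nitems;
} ToyLeafPage;

/* Copy a key into a fixed-size slot, always NUL-terminated. */
void toy_copy_key(char *dst, const char *src)
{
    strncpy(dst, src, TOY_KEY_LEN - 1);
    dst[TOY_KEY_LEN - 1] = '\0';
}

/* Add an index entry; returns false if the page is full. */
bool toy_insert(ToyLeafPage *page, const char *key)
{
    if (page->nitems >= TOY_MAX_ITEMS)
        return false;
    toy_copy_key(page->items[page->nitems], key);
    page->nitems++;
    return true;
}

/* Pretend a page split chose this key as the page's upper bound. */
void toy_set_high_key(ToyLeafPage *page, const char *key)
{
    toy_copy_key(page->high_key, key);
}

/* Pretend VACUUM removes the entry for a dead row: only items[] changes. */
bool toy_vacuum_entry(ToyLeafPage *page, const char *key)
{
    for (int i = 0; i < page->nitems; i++)
    {
        if (strcmp(page->items[i], key) == 0)
        {
            /* Close the gap by shifting later entries down one slot. */
            for (int j = i; j < page->nitems - 1; j++)
                memcpy(page->items[j], page->items[j + 1], TOY_KEY_LEN);
            page->nitems--;
            return true;
        }
    }
    return false;
}

/* Does the high key still hold a copy of this key? */
bool toy_high_key_matches(const ToyLeafPage *page, const char *key)
{
    return strcmp(page->high_key, key) == 0;
}

/* Returns true when the key copy outlives its leaf entry. */
bool toy_demo(void)
{
    ToyLeafPage page = {0};
    bool ok = toy_insert(&page, "some-large-value");
    assert(ok);
    toy_set_high_key(&page, "some-large-value");
    ok = toy_vacuum_entry(&page, "some-large-value");
    assert(ok);
    return page.nitems == 0 && toy_high_key_matches(&page, "some-large-value");
}
```

Calling `toy_demo()` from a test harness returns true in this model: the leaf entry is gone, but the high key still carries the value. If that copy were a pointer to out-of-line data rather than the data itself, it could refer to storage that no longer exists, which is presumably why index entries are de-toasted.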


Re: A possible TODO item

From
"Gurjeet Singh"
Date:
On 12/31/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
>> The comment above TOAST_INDEX_HACK in tuptoaster.h is:
>
>> /*
>>  * This enables de-toasting of index entries.  Needed until VACUUM is
>>  * smart enough to rebuild indexes from scratch.
>>  */
>> #define TOAST_INDEX_HACK
>
>> Do we already have a TODO item to remove this hack? If not, I think there
>> should be, because it is waiting for some other progress to happen.
>
> Like what?  If you want to argue that it's important to work on, you'd
> better make the case for that.

I haven't spent enough time in PGSQL-land to build a case for my viewpoint. It was just an impulse I had after reading the comment.

I thought that if the author of the code (which, now I see, is you) wanted this hack to be removed at some later point, then it had better be mentioned in the TODO list, albeit at low priority, so that we don't lose sight of it.

> At first glance you might think that turning it off would Just Work,

That never crossed my mind. I haven't gotten my hands dirty enough with backend code to think along those lines yet; someday I'd like to be able to.

When searching our archives, I saw a post (http://archives.postgresql.org/pgsql-patches/2006-07/msg00101.php) mentioning this macro in connection with an include-file problem, and noting that eliminating it makes btree_gist fail.

> because VACUUM should always remove index entries before removing
> heap rows, but unfortunately an index might have more copies of a key
> than just the one in the directly associated index entry.  (btree,
> for example, might have copied the key into a page "high key" and/or
> boundary keys in upper tree levels.  Those copies will persist long
> after the underlying row is gone.)  And surely we are never going
> to make VACUUM force a complete REINDEX as the comment suggests.

In that case, can the comment be changed!

> So any change in the situation is going to require considerable work
> as well as some good ideas we haven't thought of yet.  Plus, it's
> unclear that values large enough to require out-of-line storage are
> reasonable candidates for indexing in the first place.  So: what's
> the argument that is going to persuade everyone that this is really
> important?

As I said, none; just that, if it is pending work, it should be in the TODO list (low priority!!?), or the comment should be changed, and... if this macro is indispensable, it should be removed and the code that it surrounds should be merged.

Best regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: A possible TODO item

From
Tom Lane
Date:
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> I thought that if the author of the code (which, now I see, is you)

No, it was Jan IIRC.

>> And surely we are never going
>> to make VACUUM force a complete REINDEX as the comment suggests.

> In that case, can the comment be changed!

Even though it's a poor implementation suggestion, at least it's an
implementation suggestion.  I'm disinclined to remove it when I don't
have a better idea to put in its place ...
        regards, tom lane