Re: Fix brin_form_tuple to properly detoast data - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Fix brin_form_tuple to properly detoast data
Date
Msg-id 20201105171717.GA29029@alvherre.pgsql
In response to Re: Fix brin_form_tuple to properly detoast data  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Fix brin_form_tuple to properly detoast data  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 2020-Nov-04, Tomas Vondra wrote:

> The first test is fairly trivial - it simply builds an index on toasted data
> and then shows how an insert and a select fail. There's a caveat: this
> requires a DELETE + VACUUM, and the VACUUM actually has to clean up the rows.
> So there must be no concurrent transactions that might need the rows, which
> is hard to guarantee in regression tests. So this requires waiting for all
> running transactions to finish - I did that by building an index concurrently.
> It's a bit strange, but it's better than any other solution I could think of
> (timeout or some custom wait for xacts).

There are recent changes in vacuum for temp tables (commit 94bc27b57680?)
that might make this stable enough without needing the CREATE INDEX
CONCURRENTLY there.  At least, I tried it locally a few times and it appears
to work well.  This won't work for older releases though, only master.  This
is patch 0001 attached here.
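
For illustration, a test of that shape looks roughly like this (the table
name, sizes and exact statements are made up for this sketch, not the
committed test):

    CREATE TEMP TABLE brintest_detoast (a text);
    -- keep the wide value uncompressed and out of line, so the range
    -- summary really ends up referencing the TOAST table
    ALTER TABLE brintest_detoast ALTER COLUMN a SET STORAGE EXTERNAL;
    INSERT INTO brintest_detoast
      SELECT string_agg(md5(i::text), '') FROM generate_series(1, 100) i;
    CREATE INDEX ON brintest_detoast USING brin (a);
    -- on a temp table, VACUUM can remove the rows (and their TOAST data)
    -- without waiting for concurrent transactions, so no CIC is needed
    DELETE FROM brintest_detoast;
    VACUUM brintest_detoast;
    -- before the fix, the summary kept a bare TOAST pointer, so both of
    -- these could fail with a "missing chunk number ..." error
    INSERT INTO brintest_detoast VALUES ('foo');
    SET enable_seqscan = off;
    SELECT * FROM brintest_detoast WHERE a = 'foo';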

> The second test is a bit redundant - it merely checks that both CREATE INDEX
> and INSERT INTO fail the same way when the index tuple gets too large.
> Before the fix there were some inconsistencies - the CREATE INDEX succeeded
> because it used TOASTed data. So ultimately this tests the same thing, but
> from a different perspective.

Hmm.  This one shows the page size in the error messages, so it'll fail on
nonstandard builds.  I think we try to stay away from introducing tests with
build-dependent output, so I'd leave this test out.
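
For reference, that second test is roughly of this shape (again only an
illustrative sketch with made-up names and sizes, not the actual test):

    CREATE TABLE brintest_big (a text);
    ALTER TABLE brintest_big ALTER COLUMN a SET STORAGE EXTERNAL;
    -- roughly 12 kB once detoasted, i.e. larger than a default 8 kB page
    INSERT INTO brintest_big
      SELECT string_agg(md5(i::text), '') FROM generate_series(1, 400) i;
    -- with the fix this now fails the same way an INSERT into an indexed
    -- table would, with an "index row size ..." error whose maximum is
    -- derived from BLCKSZ; that BLCKSZ dependency is the problem above
    CREATE INDEX ON brintest_big USING brin (a);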

The code fix looks all right -- I'd just move the new #include lines to their
proper place.  That's patch 0002.

You added this comment:

> +            /*
> +             * Do nothing if value is not of varlena type. We don't need to
> +             * care about NULL values here, thanks to bv_allnulls above.
> +             *
> +             * If value is stored EXTERNAL, must fetch it so we are not
> +             * depending on outside storage.
> +             *
> +             * XXX Is this actually true? Could it be that the summary is
> +             * NULL even for range with non-NULL data? E.g. degenerate bloom
> +             * filter may be thrown away, etc.
> +             */

I think the XXX comment points to a bug that we don't have right now,
since neither minmax nor inclusion can end up with a NULL summary
starting from non-NULL data.  But if the comment about bloom is correct,
then surely it'll become a bug when bloom is added.

I don't think we need the second part of this comment:

> +/*
> + * This enables de-toasting of index entries.  Needed until VACUUM is
> + * smart enough to rebuild indexes from scratch.
> + */

... because, surely, we're never going to have VACUUM rebuild indexes from
scratch.  In fact, I wonder if we need the #define at all.  I propose to
remove all those #ifdef lines from your patch.

The fix looks good to me.  I just added a comment in 0003.


