Home > mailing lists

Re: Ignoring BRIN for HOT udpates seems broken - Mailing list pgsql-hackers

From	Matthias van de Meent
Subject	Re: Ignoring BRIN for HOT udpates seems broken
Date	May 30, 2022 15:22:35
Msg-id	CAEze2Wi9=Bay_=rTf8Z6WPgZ5V0tDOayszQJJO=R_9aaHvr+Tg@mail.gmail.com Whole thread
In response to	Re: Ignoring BRIN for HOT udpates seems broken (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: Ignoring BRIN for HOT udpates seems broken
List	pgsql-hackers

Tree view

On Sat, 28 May 2022 at 22:51, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
>
>
> On 5/28/22 21:24, Matthias van de Meent wrote:
> > On Sat, 28 May 2022 at 16:51, Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> Hi,
> >>
> >> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b
> >> ignoring BRIN indexes for HOT is likely broken. Consider this example:
> >>
> >> ----------------------------------------------------------------------
> >> CREATE TABLE t (a INT) WITH (fillfactor = 10);
> >>
> >> INSERT INTO t SELECT i
> >>   FROM generate_series(0,100000) s(i);
> >>
> >> CREATE INDEX ON t USING BRIN (a);
> >>
> >> UPDATE t SET a = 0 WHERE random() < 0.01;
> >>
> >> SET enable_seqscan = off;
> >> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
> >>
> >> SET enable_seqscan = on;
> >> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
> >> ----------------------------------------------------------------------
> >>
> >> which unfortunately produces this:
> >>
> >>                               QUERY PLAN
> >>     ---------------------------------------------------------------
> >>      Bitmap Heap Scan on t (actual rows=23 loops=1)
> >>        Recheck Cond: (a = 0)
> >>        Rows Removed by Index Recheck: 2793
> >>        Heap Blocks: lossy=128
> >>        ->  Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
> >>              Index Cond: (a = 0)
> >>      Planning Time: 0.049 ms
> >>      Execution Time: 0.424 ms
> >>     (8 rows)
> >>
> >>     SET
> >>                    QUERY PLAN
> >>     -----------------------------------------
> >>      Seq Scan on t (actual rows=995 loops=1)
> >>        Filter: (a = 0)
> >>        Rows Removed by Filter: 99006
> >>      Planning Time: 0.027 ms
> >>      Execution Time: 7.670 ms
> >>     (5 rows)
> >>
> >> That is, the index fails to return some of the rows :-(
> >>
> >> I don't remember the exact reasoning behind the commit, but the commit
> >> message justifies the change like this:
> >>
> >>     There are no index pointers to individual tuples in BRIN, and the
> >>     page range summary will be updated anyway as it relies on visibility
> >>     info.
> >>
> >> AFAICS that's a misunderstanding of how BRIN uses visibility map, or
> >> rather does not use. In particular, bringetbitmap() does not look at the
> >> vm at all, so it'll produce incomplete bitmap.
> >>
> >> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
> >> live issue. I don't quite see if this can be salvaged - I'll think about
> >> this a bit more, but it'll probably end with a revert.
> >
> > The principle of 'amhotblocking' for only blocking HOT updates seems
> > correct, except for the fact that the HOT flag bit is also used as a
> > way to block the propagation of new values to existing indexes.
> >
> > A better abstraction would be "amSummarizes[Block]', in which updates
> > that only modify columns that are only included in summarizing indexes
> > still allow HOT, but still will see an update call to all (relevant?)
> > summarizing indexes. That should still improve performance
> > significantly for the relevant cases.
> >
>
> Yeah, I think that might/should work. We could still create the HOT
> chain, but we'd have to update the BRIN indexes. But that seems like a
> fairly complicated change to be done this late for PG15.

Here's an example patch for that (based on a branch derived from
master @ 5bb2b6ab). A nod to the authors of the pHOT patch, as that is
a related patch and was informative in how this could/should impact AM
APIs -- this is doing things similar (but not exactly the same) to
that by only updating select indexes.

Note that this is an ABI change in some critical places -- I'm not
sure it's OK to commit a fix like this into PG15 unless we really
don't want to revert 5753d4ee320b.

Also of note is that this still updates _all_ summarizing indexes, not
only those involved in the tuple update. Better performance is up to a
different implementation.

The patch includes a new regression test based on your example, which
fails on master but succeeds after applying the patch.

-Matthias

Attachment

v1-0001-Rework-5753d4ee-s-amhotblocking-infrastructure-re.patch

pgsql-hackers by date:

From: Michael Meskes
Date: 30 May 2022, 13:25:16
Subject: Re: SQL/JSON functions vs. ECPG vs. STRING as a reserved word

From: Tom Lane
Date: 30 May 2022, 17:11:04
Subject: Re: ParseTzFile doesn't FreeFile on error

Re: Ignoring BRIN for HOT udpates seems broken - Mailing list pgsql-hackers

Attachment

Previous

Next