Re: Ignoring BRIN for HOT udpates seems broken - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Ignoring BRIN for HOT udpates seems broken
Date
Msg-id CAEze2Wjq6iiFGCiVCdT4AZCMByxVdMmGdhBJdgYMVPH2sdu=XQ@mail.gmail.com
Whole thread Raw
In response to Ignoring BRIN for HOT udpates seems broken  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Ignoring BRIN for HOT udpates seems broken
List pgsql-hackers
On Sat, 28 May 2022 at 16:51, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Hi,
>
> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b
> ignoring BRIN indexes for HOT is likely broken. Consider this example:
>
> ----------------------------------------------------------------------
> CREATE TABLE t (a INT) WITH (fillfactor = 10);
>
> INSERT INTO t SELECT i
>   FROM generate_series(0,100000) s(i);
>
> CREATE INDEX ON t USING BRIN (a);
>
> UPDATE t SET a = 0 WHERE random() < 0.01;
>
> SET enable_seqscan = off;
> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>
> SET enable_seqscan = on;
> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
> ----------------------------------------------------------------------
>
> which unfortunately produces this:
>
>                               QUERY PLAN
>     ---------------------------------------------------------------
>      Bitmap Heap Scan on t (actual rows=23 loops=1)
>        Recheck Cond: (a = 0)
>        Rows Removed by Index Recheck: 2793
>        Heap Blocks: lossy=128
>        ->  Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
>              Index Cond: (a = 0)
>      Planning Time: 0.049 ms
>      Execution Time: 0.424 ms
>     (8 rows)
>
>     SET
>                    QUERY PLAN
>     -----------------------------------------
>      Seq Scan on t (actual rows=995 loops=1)
>        Filter: (a = 0)
>        Rows Removed by Filter: 99006
>      Planning Time: 0.027 ms
>      Execution Time: 7.670 ms
>     (5 rows)
>
> That is, the index fails to return some of the rows :-(
>
> I don't remember the exact reasoning behind the commit, but the commit
> message justifies the change like this:
>
>     There are no index pointers to individual tuples in BRIN, and the
>     page range summary will be updated anyway as it relies on visibility
>     info.
>
> AFAICS that's a misunderstanding of how BRIN uses visibility map, or
> rather does not use. In particular, bringetbitmap() does not look at the
> vm at all, so it'll produce incomplete bitmap.
>
> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
> live issue. I don't quite see if this can be salvaged - I'll think about
> this a bit more, but it'll probably end with a revert.

The principle of 'amhotblocking' for only blocking HOT updates seems
correct, except for the fact that the HOT flag bit is also used as a
way to block the propagation of new values to existing indexes.

A better abstraction would be "amSummarizes[Block]', in which updates
that only modify columns that are only included in summarizing indexes
still allow HOT, but still will see an update call to all (relevant?)
summarizing indexes. That should still improve performance
significantly for the relevant cases.

-Matthias



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [RFC] building postgres with meson
Next
From: Andrey Lepikhov
Date:
Subject: Re: Compare variables of composite type with slightly different column types