Re: Ignoring BRIN for HOT udpates seems broken - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Ignoring BRIN for HOT udpates seems broken
Date
Msg-id 00babc97-7fa9-8b1d-fce2-066de9b8a25d@enterprisedb.com
Whole thread Raw
In response to Re: Ignoring BRIN for HOT udpates seems broken  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Ignoring BRIN for HOT udpates seems broken
Re: Ignoring BRIN for HOT udpates seems broken
List pgsql-hackers

On 5/28/22 21:24, Matthias van de Meent wrote:
> On Sat, 28 May 2022 at 16:51, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> Hi,
>>
>> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b
>> ignoring BRIN indexes for HOT is likely broken. Consider this example:
>>
>> ----------------------------------------------------------------------
>> CREATE TABLE t (a INT) WITH (fillfactor = 10);
>>
>> INSERT INTO t SELECT i
>>   FROM generate_series(0,100000) s(i);
>>
>> CREATE INDEX ON t USING BRIN (a);
>>
>> UPDATE t SET a = 0 WHERE random() < 0.01;
>>
>> SET enable_seqscan = off;
>> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>>
>> SET enable_seqscan = on;
>> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0;
>> ----------------------------------------------------------------------
>>
>> which unfortunately produces this:
>>
>>                               QUERY PLAN
>>     ---------------------------------------------------------------
>>      Bitmap Heap Scan on t (actual rows=23 loops=1)
>>        Recheck Cond: (a = 0)
>>        Rows Removed by Index Recheck: 2793
>>        Heap Blocks: lossy=128
>>        ->  Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1)
>>              Index Cond: (a = 0)
>>      Planning Time: 0.049 ms
>>      Execution Time: 0.424 ms
>>     (8 rows)
>>
>>     SET
>>                    QUERY PLAN
>>     -----------------------------------------
>>      Seq Scan on t (actual rows=995 loops=1)
>>        Filter: (a = 0)
>>        Rows Removed by Filter: 99006
>>      Planning Time: 0.027 ms
>>      Execution Time: 7.670 ms
>>     (5 rows)
>>
>> That is, the index fails to return some of the rows :-(
>>
>> I don't remember the exact reasoning behind the commit, but the commit
>> message justifies the change like this:
>>
>>     There are no index pointers to individual tuples in BRIN, and the
>>     page range summary will be updated anyway as it relies on visibility
>>     info.
>>
>> AFAICS that's a misunderstanding of how BRIN uses visibility map, or
>> rather does not use. In particular, bringetbitmap() does not look at the
>> vm at all, so it'll produce incomplete bitmap.
>>
>> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a
>> live issue. I don't quite see if this can be salvaged - I'll think about
>> this a bit more, but it'll probably end with a revert.
> 
> The principle of 'amhotblocking' for only blocking HOT updates seems
> correct, except for the fact that the HOT flag bit is also used as a
> way to block the propagation of new values to existing indexes.
> 
> A better abstraction would be "amSummarizes[Block]', in which updates
> that only modify columns that are only included in summarizing indexes
> still allow HOT, but still will see an update call to all (relevant?)
> summarizing indexes. That should still improve performance
> significantly for the relevant cases.
> 

Yeah, I think that might/should work. We could still create the HOT
chain, but we'd have to update the BRIN indexes. But that seems like a
fairly complicated change to be done this late for PG15.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: fix stats_fetch_consistency value in postgresql.conf.sample
Next
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply