Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
Date
Msg-id 7d661b5b-8fe9-38fc-eaeb-047e98961d06@enterprisedb.com
Whole thread Raw
In response to Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
List pgsql-hackers

On 6/7/21 2:11 PM, Masahiko Sawada wrote:
> On Mon, Jun 7, 2021 at 4:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Mon, Jun 7, 2021 at 11:15 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>
>>> husky just reported $SUBJECT:
>>>
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-06-05%2013%3A42%3A17
>>>
>>> and I find I can reproduce that locally:
>>>
>>> diff -U3 /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out
/home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out
>>> --- /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out       2021-01-20 11:12:24.854346717
-0500
>>> +++ /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out        2021-06-06 22:12:07.948890104
-0400
>>> @@ -215,7 +215,8 @@
>>>       0 | f           | f
>>>       1 | f           | f
>>>       2 | t           | t
>>> -(3 rows)
>>> +     3 | t           | t
>>> +(4 rows)
>>>
>>>  select * from pg_check_frozen('copyfreeze');
>>>   t_ctid
>>> @@ -235,7 +236,8 @@
>>>       0 | t           | t
>>>       1 | f           | f
>>>       2 | t           | t
>>> -(3 rows)
>>> +     3 | t           | t
>>> +(4 rows)
>>>
>>>  select * from pg_check_frozen('copyfreeze');
>>>   t_ctid
>>>
>>>
>>> The test cases that are failing date back to January (7db0cd2145f),
>>> so I think this is some side-effect of a recent commit, but I have
>>> no idea which one.
>>
>> It seems like the recent revert (8e03eb92e9a) is relevant.
>>
>> After committing 7db0cd2145f we had the same regression test failure
>> in January[1]. Then we fixed that issue by 39b66a91b. But since we
>> recently reverted most of 39b66a91b, the same issue happened again.
>>
> 
> So the cause of this failure seems the same as before. The failed test is,
> 
> begin;
> truncate copyfreeze;
> copy copyfreeze from stdin freeze;
> 1       '1'
> 2       '2'
> 3       '3'
> 4       '4'
> 5       '5'
> \.
> copy copyfreeze from stdin;
> 6       '6'
> \.
> copy copyfreeze from stdin freeze;
> 7       '7'
> 8       '8'
> 9       '9'
> 10      '10'
> 11      '11'
> 12      '12'
> \.
> commit;
> 
> If the target block cache is invalidated before the third COPY, we
> will start to insert the frozen tuple into a new page, resulting in
> adding two blocks in total during the third COPY. I think we still
> need the following part of the reverted code so that we don't leave
> the page partially empty after relcache invalidation:
> 
> --- a/src/backend/access/heap/hio.c
> +++ b/src/backend/access/heap/hio.c
> @@ -407,19 +407,19 @@ RelationGetBufferForTuple(Relation relation, Size len,
>          * target.
>          */
>         targetBlock = GetPageWithFreeSpace(relation, targetFreeSpace);
> -   }
> 
> -   /*
> -    * If the FSM knows nothing of the rel, try the last page before we give
> -    * up and extend.  This avoids one-tuple-per-page syndrome during
> -    * bootstrapping or in a recently-started system.
> -    */
> -   if (targetBlock == InvalidBlockNumber)
> -   {
> -       BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
> +       /*
> +        * If the FSM knows nothing of the rel, try the last page before we
> +        * give up and extend.  This avoids one-tuple-per-page syndrome during
> +        * bootstrapping or in a recently-started system.
> +        */
> +       if (targetBlock == InvalidBlockNumber)
> +       {
> +           BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
> 
> -       if (nblocks > 0)
> -           targetBlock = nblocks - 1;
> +           if (nblocks > 0)
> +               targetBlock = nblocks - 1;
> +       }
>     }
> 
> Attached the patch that brings back the above change.
> 

Thanks for the analysis! I think you're right - this bit should have
been kept. Partial reverts are tricky :-(

I'll get this fixed / pushed later today, after a bit more testing. I'd
swear I ran tests with CCA, but it's possible I skipped contrib.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: speed up verifying UTF-8
Next
From: Amit Kapila
Date:
Subject: Re: Decoding speculative insert with toast leaks memory