Thread: collect_corrupt_items_vacuum.patch

collect_corrupt_items_vacuum.patch

From
Daniel Shelepanov
Date:
Hi hackers!
 
I’ve been working on this [https://www.postgresql.org/message-id/flat/cfcca574-6967-c5ab-7dc3-2c82b6723b99%40mail.ru] bug. Finally, I’ve come up with the patch you can find attached. Basically what is does is rises a PROC_IN_VACUUM flag and resets it afterwards. I know this seems kinda crunchy and I hope you guys will give me some hints on where to continue. This [https://www.postgresql.org/message-id/20220218175119.7hwv7ksamfjwijbx%40alap3.anarazel.de] message contains reproduction script. Thank you very much in advance.
 
Kind regards,
Daniel Shelepanov
Attachment

Re: collect_corrupt_items_vacuum.patch

From
Robert Haas
Date:
On Mon, Apr 4, 2022 at 4:51 AM Daniel Shelepanov <deniel1495@mail.ru> wrote:
> I’ve been working on this [https://www.postgresql.org/message-id/flat/cfcca574-6967-c5ab-7dc3-2c82b6723b99%40mail.ru]
bug.Finally, I’ve come up with the patch you can find attached. Basically what is does is rises a PROC_IN_VACUUM flag
andresets it afterwards. I know this seems kinda crunchy and I hope you guys will give me some hints on where to
continue.This [https://www.postgresql.org/message-id/20220218175119.7hwv7ksamfjwijbx%40alap3.anarazel.de] message
containsreproduction script. Thank you very much in advance. 

I noticed the CommitFest entry for this thread today and decided to
take a look. I think the general issue here can be stated in this way:
suppose a VACUUM computes an all-visible cutoff X, i.e. it thinks all
committed XIDs < X are all-visible. Then, at a later time, pg_visible
computes an all-visible cutoff Y, i.e. it thinks all committed XIDs <
Y are all-visible. If Y < X, pg_check_visible() might falsely report
corruption, because VACUUM might have marked as all-visible some page
containing tuples which pg_check_visibile() thinks aren't really
all-visible.

In reality, the oldest all-visible XID cannot move backward, but
ComputeXidHorizons() lets it move backward, because it's intended for
use by a caller who wants to mark pages all-visible, and it's only
concerned with making sure that the value is old enough to be safe.
And that's a problem for the way that pg_visibility is (mis-)using it.

To say that another way, ComputeXidHorizons() is perfectly fine with
returning a value that is older than the true answer, as long as it
never returns a value that is newer than the new answer. pg_visibility
wants the opposite. Here, a value that is newer than the true value
can't do worse than hide corruption, which is sort of OK, but a value
that's older than the true value can report corruption where none
exists, which is very bad.

I have a feeling, therefore, that this isn't really a complete fix. I
think it might address one way for the horizon reported by
ComputeXidHorizons() to move backward, but not all the ways.

Unfortunately, I am out of time for today to study this... but will
try to find more time on another day.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: collect_corrupt_items_vacuum.patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> In reality, the oldest all-visible XID cannot move backward, but
> ComputeXidHorizons() lets it move backward, because it's intended for
> use by a caller who wants to mark pages all-visible, and it's only
> concerned with making sure that the value is old enough to be safe.

Right.

> And that's a problem for the way that pg_visibility is (mis-)using it.

> To say that another way, ComputeXidHorizons() is perfectly fine with
> returning a value that is older than the true answer, as long as it
> never returns a value that is newer than the new answer. pg_visibility
> wants the opposite. Here, a value that is newer than the true value
> can't do worse than hide corruption, which is sort of OK, but a value
> that's older than the true value can report corruption where none
> exists, which is very bad.

Maybe we need a different function for pg_visibility to call?
If we want ComputeXidHorizons to serve both these purposes, then it
has to always deliver exactly the right answer, which seems like
a definition that will be hard and expensive to achieve.

            regards, tom lane



Re: collect_corrupt_items_vacuum.patch

From
Robert Haas
Date:
On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maybe we need a different function for pg_visibility to call?
> If we want ComputeXidHorizons to serve both these purposes, then it
> has to always deliver exactly the right answer, which seems like
> a definition that will be hard and expensive to achieve.

Yeah, I was thinking along similar lines.

I'm also kind of wondering why these calculations use
latestCompletedXid. Is that something we do solely to reduce locking?
The XIDs of running transactions matter, and their snapshots matter,
and the XIDs that could start running in the future matter, but I
don't know why it matters what the latest completed XID is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: collect_corrupt_items_vacuum.patch

From
Michael Paquier
Date:
On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maybe we need a different function for pg_visibility to call?
> > If we want ComputeXidHorizons to serve both these purposes, then it
> > has to always deliver exactly the right answer, which seems like
> > a definition that will be hard and expensive to achieve.
>
> Yeah, I was thinking along similar lines.
>
> I'm also kind of wondering why these calculations use
> latestCompletedXid. Is that something we do solely to reduce locking?
> The XIDs of running transactions matter, and their snapshots matter,
> and the XIDs that could start running in the future matter, but I
> don't know why it matters what the latest completed XID is.

Daniel, it seems to me that this thread is waiting for some input from
you, based on the remarks of Tom and Robert.  Are you planning to do
so?  This is marked as a bug fix, so I have moved this item to the
next CF for now.
--
Michael

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Nikita Malakhov
Date:
Hi hackers!

Daniel is busy with other tasks. I've found this topic and this problem seems to be actual
or v15 too.
Please correct me if I am wrong. I've checked another discussion related to pg_visibility [1]. 
According to discussion: if using latest completed xid is not right for checking visibility, than
it should be the least running transaction xid? So it must be another function to be used for
these calculations, not the GetOldestNonRemovableTransactionId that uses
the ComputeXidHorizons.


On Wed, Oct 12, 2022 at 8:15 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maybe we need a different function for pg_visibility to call?
> > If we want ComputeXidHorizons to serve both these purposes, then it
> > has to always deliver exactly the right answer, which seems like
> > a definition that will be hard and expensive to achieve.
>
> Yeah, I was thinking along similar lines.
>
> I'm also kind of wondering why these calculations use
> latestCompletedXid. Is that something we do solely to reduce locking?
> The XIDs of running transactions matter, and their snapshots matter,
> and the XIDs that could start running in the future matter, but I
> don't know why it matters what the latest completed XID is.

Daniel, it seems to me that this thread is waiting for some input from
you, based on the remarks of Tom and Robert.  Are you planning to do
so?  This is marked as a bug fix, so I have moved this item to the
next CF for now.
--
Michael


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: collect_corrupt_items_vacuum.patch

From
Nikita Malakhov
Date:
Hi hackers!

Just to bump this thread, because the problem seems to be still actual:

Please correct me if I am wrong. I've checked another discussion related to pg_visibility [1]. 
According to discussion: if using latest completed xid is not right for checking visibility, than
it should be the least running transaction xid? So it must be another function to be used for
these calculations, not the GetOldestNonRemovableTransactionId that uses
the ComputeXidHorizons.


On Wed, Oct 12, 2022 at 8:15 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maybe we need a different function for pg_visibility to call?
> > If we want ComputeXidHorizons to serve both these purposes, then it
> > has to always deliver exactly the right answer, which seems like
> > a definition that will be hard and expensive to achieve.
>
> Yeah, I was thinking along similar lines.
>
> I'm also kind of wondering why these calculations use
> latestCompletedXid. Is that something we do solely to reduce locking?
> The XIDs of running transactions matter, and their snapshots matter,
> and the XIDs that could start running in the future matter, but I
> don't know why it matters what the latest completed XID is.

Daniel, it seems to me that this thread is waiting for some input from
you, based on the remarks of Tom and Robert.  Are you planning to do
so?  This is marked as a bug fix, so I have moved this item to the
next CF for now.
--
Michael

--
Regards,
Nikita Malakhov
Postgres Professional 

Re: collect_corrupt_items_vacuum.patch

From
Daniel Gustafsson
Date:
This patch has been waiting on the author for about a year now, so I will close
it as Returned with Feedback.  Plesae feel free to resubmit to a future CF when
there is renewed interest in working on this.

--
Daniel Gustafsson




Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
Hi!

On Tue, Jul 4, 2023 at 10:21 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> This patch has been waiting on the author for about a year now, so I will close
> it as Returned with Feedback.  Plesae feel free to resubmit to a future CF when
> there is renewed interest in working on this.

I'd like to revive this thread.  While the patch proposed definitely
makes things better.  But as pointed out by Robert and Tom, it didn't
allow to avoid all false reports.  The reason is that the way we
currently calculate the oldest xmin, it could move backwards (see
comments to ComputeXidHorizons()).  The attached patch implements own
function to calculate strict oldest xmin, which should be always
greater or equal to any xid horizon calculated before.  I have to do
the following changes in comparison to what ComputeXidHorizons() do.

1. Ignore processes xmin's, because they take into account connection
to other databases which were ignored before.
2. Ignore KnownAssignedXids, because they are not database-aware.
While primary could compute its horizons database-aware.
3. Ignore walsender xmin, because it could go backward if some
replication connections don't use replication slots.

Surely these would significantly sacrifice accuracy. But we have to do
so in order to avoid reporting false errors.

Any thoughts?

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Alexander Lakhin
Date:
Hi Alexander,

06.11.2023 12:30, Alexander Korotkov wrote:
> Surely these would significantly sacrifice accuracy. But we have to do
> so in order to avoid reporting false errors.
>

I've reduced the dirty reproducer Daniel Shelepanov posted initially
to the following:
numdbs=10
for ((d=1;d<=$numdbs;d++)); do
   createdb db$d
   psql db$d -c "create extension pg_visibility"
done

for ((i=1;i<=300;i++)); do
echo "iteration $i"
for ((d=1;d<=$numdbs;d++)); do
(
echo "
create table vacuum_test as select 42 i;
vacuum (disable_page_skipping) vacuum_test;
select * from pg_check_visible('vacuum_test');
" | psql db$d -a -q >psql-$d.log 2>&1
) &
done
wait

res=0
for ((d=1;d<=$numdbs;d++)); do
grep -q '0 rows' psql-$d.log || { echo "Error condition in psql-$d.log:"; cat psql-$d.log; res=1; break; }
psql db$d -q -c "drop table vacuum_test"
done
[ $res == 0 ] || break;
done

It looks like the v2 patch doesn't fix the original issue. Maybe I miss
something, but with the patch applied, I see the failure immediately,
though without the patch several iterations are needed to get it.

Best regards,
Alexander



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
Hi, Alexander.

On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> It looks like the v2 patch doesn't fix the original issue. Maybe I miss
> something, but with the patch applied, I see the failure immediately,
> though without the patch several iterations are needed to get it.


That's a bug in the patch.  Thank you for cathing it.  It should start
calculation from latestCompletedXid + 1, not InvalidTransactionId.
Please, check the revised patch.

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Alexander Lakhin
Date:
07.11.2023 14:38, Alexander Korotkov wrote:
> Hi, Alexander.
>
> On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>> It looks like the v2 patch doesn't fix the original issue. Maybe I miss
>> something, but with the patch applied, I see the failure immediately,
>> though without the patch several iterations are needed to get it.
>
> That's a bug in the patch.  Thank you for cathing it.  It should start
> calculation from latestCompletedXid + 1, not InvalidTransactionId.
> Please, check the revised patch.

Thanks for looking at this!
Unfortunately, I still see the failure with the v3, but not on a first
iteration:
...
iteration 316
Error condition in psql-8.log:
create table vacuum_test as select 42 i;
vacuum (disable_page_skipping) vacuum_test;
select * from pg_check_visible('vacuum_test');
  t_ctid
--------
  (0,1)
(1 row)

(I've double-checked that the patch is applied and get_strict_xid_horizon()
is called.)

Best regards,
Alexander



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
Hi, Alexander!

On Tue, Nov 7, 2023 at 3:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> 07.11.2023 14:38, Alexander Korotkov wrote:
> > Hi, Alexander.
> >
> > On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> >> It looks like the v2 patch doesn't fix the original issue. Maybe I miss
> >> something, but with the patch applied, I see the failure immediately,
> >> though without the patch several iterations are needed to get it.
> >
> > That's a bug in the patch.  Thank you for cathing it.  It should start
> > calculation from latestCompletedXid + 1, not InvalidTransactionId.
> > Please, check the revised patch.
>
> Thanks for looking at this!
> Unfortunately, I still see the failure with the v3, but not on a first
> iteration:
> ...
> iteration 316
> Error condition in psql-8.log:
> create table vacuum_test as select 42 i;
> vacuum (disable_page_skipping) vacuum_test;
> select * from pg_check_visible('vacuum_test');
>   t_ctid
> --------
>   (0,1)
> (1 row)
>
> (I've double-checked that the patch is applied and get_strict_xid_horizon()
> is called.)

I managed to reproduce this on a Linux VM.  This problem should arise
because in extension I don't have access to ProcArrayStruct.  So, my
code is iterating the whole PGPROC's array.  I reimplemented the new
horizon calculation function in the core with usage of
ProcArrayStruct.  Now it doesn't fall for me.  Please, recheck.

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Alexander Lakhin
Date:
Hi Alexander,

04.12.2023 03:23, Alexander Korotkov wrote:
> I managed to reproduce this on a Linux VM.  This problem should arise
> because in extension I don't have access to ProcArrayStruct.  So, my
> code is iterating the whole PGPROC's array.  I reimplemented the new
> horizon calculation function in the core with usage of
> ProcArrayStruct.  Now it doesn't fall for me.  Please, recheck.

Yes, v4 works for me as well (thousands of iterations passed).
Thank you!

Though the test passes even without manipulations with the PROC_IN_VACUUM
flag in pg_visibility.c (maybe the test is not good enough to show why
those manipulations are needed).
I also couldn't see where VISHORIZON_DATA_STRICT comes into play...

Best regards,
Alexander



Re: collect_corrupt_items_vacuum.patch

From
Dmitry Koval
Date:
Hi!

I agree with Alexander Lakhin about PROC_IN_VACUUM and 
VISHORIZON_DATA_STRICT:
1) probably manipulations with the PROC_IN_VACUUM flag in 
pg_visibility.c were needed for condition [1] and can be removed now;
2) the VISHORIZON_DATA_STRICT macro is probably unnecessary too (since 
we are not going to use it in the GlobalVisHorizonKindForRel() function).

Also It would be nice to remove the get_strict_xid_horizon() function 
from the comment (replace to GetStrictOldestNonRemovableTransactionId()?).

[1] 

https://github.com/postgres/postgres/blob/4d0cf0b05defcee985d5af38cb0db2b9c2f8dbae/src/backend/storage/ipc/procarray.c#L1812

-- 
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
Hi!

On Tue, Dec 5, 2023 at 9:03 PM Dmitry Koval <d.koval@postgrespro.ru> wrote:
> I agree with Alexander Lakhin about PROC_IN_VACUUM and
> VISHORIZON_DATA_STRICT:
> 1) probably manipulations with the PROC_IN_VACUUM flag in
> pg_visibility.c were needed for condition [1] and can be removed now;

Right, PROC_IN_VACUUM is no longer required.  The possible benefit of
it would be to avoid bloat during a possibly long run of
pg_visibility() function.  But the downside are problems with the
snapshot if the invoking query contains something except a single call
of the pg_visibility() function, and complexity.  Removed.

> 2) the VISHORIZON_DATA_STRICT macro is probably unnecessary too (since
> we are not going to use it in the GlobalVisHorizonKindForRel() function).

Makes sense, removed.

> Also It would be nice to remove the get_strict_xid_horizon() function
> from the comment (replace to GetStrictOldestNonRemovableTransactionId()?).

Right, fixed.

The revised patch is attached.  Besides the fixes above, it contains
improvements for comments and the detailed commit message.

Tom, Robert, what do you think about the patch attached?  It required
a new type of xid horizon in core and sacrifices accuracy.  But this
is the only way I can imagine, we can fix the problem in a general
way.

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Dmitry Koval
Date:
Hi!

Thank you, there is one small point left (in the comment): can you 
replace "guarantteed to be to be newer" to "guaranteed to be newer", 
file src/backend/storage/ipc/procarray.c?

-- 
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
Hi, Dmitry!

On Sat, Jan 13, 2024 at 7:33 PM Dmitry Koval <d.koval@postgrespro.ru> wrote:
> Thank you, there is one small point left (in the comment): can you
> replace "guarantteed to be to be newer" to "guaranteed to be newer",
> file src/backend/storage/ipc/procarray.c?

Fixed.  Thank you for catching this.

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
On Sun, Jan 14, 2024 at 4:35 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Sat, Jan 13, 2024 at 7:33 PM Dmitry Koval <d.koval@postgrespro.ru> wrote:
> > Thank you, there is one small point left (in the comment): can you
> > replace "guarantteed to be to be newer" to "guaranteed to be newer",
> > file src/backend/storage/ipc/procarray.c?
>
> Fixed.  Thank you for catching this.

I made the following improvements to the patch.
1. I find a way to implement the path with less changes to the core
code.  The GetRunningTransactionData() function allows to get the
least running xid, all I need is to add database-aware values.
2. I added the TAP test reproducing the original issue.

I'm going to push this if no objections.

------
Regards,
Alexander Korotkov

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Noah Misch
Date:
On Tue, Mar 12, 2024 at 02:10:59PM +0200, Alexander Korotkov wrote:
> I'm going to push this if no objections.

Commit e85662d wrote:
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c

> @@ -2740,6 +2741,8 @@ GetRunningTransactionData(void)
>       */
>      for (index = 0; index < arrayP->numProcs; index++)
>      {
> +        int            pgprocno = arrayP->pgprocnos[index];
> +        PGPROC       *proc = &allProcs[pgprocno];
>          TransactionId xid;
>  
>          /* Fetch xid just once - see GetNewTransactionId */
> @@ -2760,6 +2763,13 @@ GetRunningTransactionData(void)
>          if (TransactionIdPrecedes(xid, oldestRunningXid))
>              oldestRunningXid = xid;
>  
> +        /*
> +         * Also, update the oldest running xid within the current database.
> +         */
> +        if (proc->databaseId == MyDatabaseId &&
> +            TransactionIdPrecedes(xid, oldestRunningXid))
> +            oldestDatabaseRunningXid = xid;

Shouldn't that be s/oldestRunningXid/oldestDatabaseRunningXid/?

While this isn't a hot path, I likely would test TransactionIdPrecedes()
before fetching pgprocno and PGPROC, to reduce wasted cache misses.



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
On Mon, Jul 1, 2024 at 2:18 AM Noah Misch <noah@leadboat.com> wrote:
> On Tue, Mar 12, 2024 at 02:10:59PM +0200, Alexander Korotkov wrote:
> > I'm going to push this if no objections.
>
> Commit e85662d wrote:
> > --- a/src/backend/storage/ipc/procarray.c
> > +++ b/src/backend/storage/ipc/procarray.c
>
> > @@ -2740,6 +2741,8 @@ GetRunningTransactionData(void)
> >        */
> >       for (index = 0; index < arrayP->numProcs; index++)
> >       {
> > +             int                     pgprocno = arrayP->pgprocnos[index];
> > +             PGPROC     *proc = &allProcs[pgprocno];
> >               TransactionId xid;
> >
> >               /* Fetch xid just once - see GetNewTransactionId */
> > @@ -2760,6 +2763,13 @@ GetRunningTransactionData(void)
> >               if (TransactionIdPrecedes(xid, oldestRunningXid))
> >                       oldestRunningXid = xid;
> >
> > +             /*
> > +              * Also, update the oldest running xid within the current database.
> > +              */
> > +             if (proc->databaseId == MyDatabaseId &&
> > +                     TransactionIdPrecedes(xid, oldestRunningXid))
> > +                     oldestDatabaseRunningXid = xid;
>
> Shouldn't that be s/oldestRunningXid/oldestDatabaseRunningXid/?

Thank you for catching this.

> While this isn't a hot path, I likely would test TransactionIdPrecedes()
> before fetching pgprocno and PGPROC, to reduce wasted cache misses.

And thanks for suggestion.

The patchset is attached.  0001 implements
s/oldestRunningXid/oldestDatabaseRunningXid/.  0002 implements cache
misses optimization.

If no objection, I'll backpatch 0001 and apply 0002 to the head.

------
Regards,
Alexander Korotkov
Supabase

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Noah Misch
Date:
On Wed, Jul 03, 2024 at 12:31:48AM +0300, Alexander Korotkov wrote:
> On Mon, Jul 1, 2024 at 2:18 AM Noah Misch <noah@leadboat.com> wrote:
> > Commit e85662d wrote:
> > > --- a/src/backend/storage/ipc/procarray.c
> > > +++ b/src/backend/storage/ipc/procarray.c
> >
> > > @@ -2740,6 +2741,8 @@ GetRunningTransactionData(void)
> > >        */
> > >       for (index = 0; index < arrayP->numProcs; index++)
> > >       {
> > > +             int                     pgprocno = arrayP->pgprocnos[index];
> > > +             PGPROC     *proc = &allProcs[pgprocno];
> > >               TransactionId xid;
> > >
> > >               /* Fetch xid just once - see GetNewTransactionId */
> > > @@ -2760,6 +2763,13 @@ GetRunningTransactionData(void)
> > >               if (TransactionIdPrecedes(xid, oldestRunningXid))
> > >                       oldestRunningXid = xid;
> > >
> > > +             /*
> > > +              * Also, update the oldest running xid within the current database.
> > > +              */
> > > +             if (proc->databaseId == MyDatabaseId &&
> > > +                     TransactionIdPrecedes(xid, oldestRunningXid))
> > > +                     oldestDatabaseRunningXid = xid;
> >
> > Shouldn't that be s/oldestRunningXid/oldestDatabaseRunningXid/?
> 
> Thank you for catching this.
> 
> > While this isn't a hot path, I likely would test TransactionIdPrecedes()
> > before fetching pgprocno and PGPROC, to reduce wasted cache misses.
> 
> And thanks for suggestion.
> 
> The patchset is attached.  0001 implements
> s/oldestRunningXid/oldestDatabaseRunningXid/.  0002 implements cache
> misses optimization.
> 
> If no objection, I'll backpatch 0001 and apply 0002 to the head.

Looks fine.  I'd drop the comment update as saying the obvious, but keeping it
is okay.



Re: collect_corrupt_items_vacuum.patch

From
Heikki Linnakangas
Date:
This causes an assertion failure when executed in a hot standby server:

  select * from pg_check_visible('pg_database');

TRAP: failed Assert("!RecoveryInProgress()"), File: 
"../src/backend/storage/ipc/procarray.c", Line: 2710, PID: 1142572

GetStrictOldestNonRemovableTransactionId does this:

>     if (rel == NULL || rel->rd_rel->relisshared || RecoveryInProgress())
>     {
>         /* Shared relation: take into account all running xids */
>         runningTransactions = GetRunningTransactionData();
>         LWLockRelease(ProcArrayLock);
>         LWLockRelease(XidGenLock);
>         return runningTransactions->oldestRunningXid;
>     }

And GetRunningTransactionData() has this:

>     Assert(!RecoveryInProgress());

So it's easy to see that you will hit that assertion.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
On Tue, Aug 13, 2024 at 9:39 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> This causes an assertion failure when executed in a hot standby server:
>
>   select * from pg_check_visible('pg_database');
>
> TRAP: failed Assert("!RecoveryInProgress()"), File:
> "../src/backend/storage/ipc/procarray.c", Line: 2710, PID: 1142572
>
> GetStrictOldestNonRemovableTransactionId does this:
>
> >       if (rel == NULL || rel->rd_rel->relisshared || RecoveryInProgress())
> >       {
> >               /* Shared relation: take into account all running xids */
> >               runningTransactions = GetRunningTransactionData();
> >               LWLockRelease(ProcArrayLock);
> >               LWLockRelease(XidGenLock);
> >               return runningTransactions->oldestRunningXid;
> >       }
>
> And GetRunningTransactionData() has this:
>
> >       Assert(!RecoveryInProgress());
>
> So it's easy to see that you will hit that assertion.

Oh, thank you!
I'll fix this and add a test for recovery!

------
Regards,
Alexander Korotkov
Supabase



Re: collect_corrupt_items_vacuum.patch

From
Alexander Korotkov
Date:
On Tue, Aug 13, 2024 at 10:15 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
> On Tue, Aug 13, 2024 at 9:39 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> >
> > This causes an assertion failure when executed in a hot standby server:
> >
> >   select * from pg_check_visible('pg_database');
> >
> > TRAP: failed Assert("!RecoveryInProgress()"), File:
> > "../src/backend/storage/ipc/procarray.c", Line: 2710, PID: 1142572
> >
> > GetStrictOldestNonRemovableTransactionId does this:
> >
> > >       if (rel == NULL || rel->rd_rel->relisshared || RecoveryInProgress())
> > >       {
> > >               /* Shared relation: take into account all running xids */
> > >               runningTransactions = GetRunningTransactionData();
> > >               LWLockRelease(ProcArrayLock);
> > >               LWLockRelease(XidGenLock);
> > >               return runningTransactions->oldestRunningXid;
> > >       }
> >
> > And GetRunningTransactionData() has this:
> >
> > >       Assert(!RecoveryInProgress());
> >
> > So it's easy to see that you will hit that assertion.
>
> Oh, thank you!
> I'll fix this and add a test for recovery!

Attached patch fixes the problem and adds the corresponding test.  I
would appreciate if you take a look at it.

------
Regards,
Alexander Korotkov
Supabase

Attachment

Re: collect_corrupt_items_vacuum.patch

From
Heikki Linnakangas
Date:
On 14/08/2024 04:51, Alexander Korotkov wrote:
> On Tue, Aug 13, 2024 at 10:15 PM Alexander Korotkov
> <aekorotkov@gmail.com> wrote:
>> On Tue, Aug 13, 2024 at 9:39 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>>
>>> This causes an assertion failure when executed in a hot standby server:
>>>
>>>    select * from pg_check_visible('pg_database');
>>>
>>> TRAP: failed Assert("!RecoveryInProgress()"), File:
>>> "../src/backend/storage/ipc/procarray.c", Line: 2710, PID: 1142572
>>>
>>> GetStrictOldestNonRemovableTransactionId does this:
>>>
>>>>        if (rel == NULL || rel->rd_rel->relisshared || RecoveryInProgress())
>>>>        {
>>>>                /* Shared relation: take into account all running xids */
>>>>                runningTransactions = GetRunningTransactionData();
>>>>                LWLockRelease(ProcArrayLock);
>>>>                LWLockRelease(XidGenLock);
>>>>                return runningTransactions->oldestRunningXid;
>>>>        }
>>>
>>> And GetRunningTransactionData() has this:
>>>
>>>>        Assert(!RecoveryInProgress());
>>>
>>> So it's easy to see that you will hit that assertion.
>>
>> Oh, thank you!
>> I'll fix this and add a test for recovery!
> 
> Attached patch fixes the problem and adds the corresponding test.  I
> would appreciate if you take a look at it.

The code changes seem fine. I think the "Ignore KnownAssignedXids" 
comment above the function could be made more clear. It's not wrong, but 
I think it doesn't explain the reasoning very well:

* We are now doing no effectively no checking in a standby, because we 
always just use nextXid. It's better than nothing, I suppose it will 
catch very broken cases where an XID is in the future, but that's all.

* We *could* use KnownAssignedXids for shared catalogs, because with 
shared catalogs, the global horizon is used, not a database-aware one.

* Then again, there might be rare corner cases that a transaction has 
crashed in the primary without writing a commit/abort record, and hence 
it looks like it's still running in the standby but has already ended in 
the primary. So I think it's good we ignore KnownAssignedXids for shared 
catalogs anyway.

-- 
Heikki Linnakangas
Neon (https://neon.tech)