Thread: collect_corrupt_items_vacuum.patch
Hi hackers!
I’ve been working on this [https://www.postgresql.org/message-id/flat/cfcca574-6967-c5ab-7dc3-2c82b6723b99%40mail.ru] bug. Finally, I’ve come up with the patch you can find attached. Basically what is does is rises a PROC_IN_VACUUM flag and resets it afterwards. I know this seems kinda crunchy and I hope you guys will give me some hints on where to continue. This [https://www.postgresql.org/message-id/20220218175119.7hwv7ksamfjwijbx%40alap3.anarazel.de] message contains reproduction script. Thank you very much in advance.
Kind regards,
Daniel Shelepanov
Daniel Shelepanov
Attachment
On Mon, Apr 4, 2022 at 4:51 AM Daniel Shelepanov <deniel1495@mail.ru> wrote: > I’ve been working on this [https://www.postgresql.org/message-id/flat/cfcca574-6967-c5ab-7dc3-2c82b6723b99%40mail.ru] bug.Finally, I’ve come up with the patch you can find attached. Basically what is does is rises a PROC_IN_VACUUM flag andresets it afterwards. I know this seems kinda crunchy and I hope you guys will give me some hints on where to continue.This [https://www.postgresql.org/message-id/20220218175119.7hwv7ksamfjwijbx%40alap3.anarazel.de] message containsreproduction script. Thank you very much in advance. I noticed the CommitFest entry for this thread today and decided to take a look. I think the general issue here can be stated in this way: suppose a VACUUM computes an all-visible cutoff X, i.e. it thinks all committed XIDs < X are all-visible. Then, at a later time, pg_visible computes an all-visible cutoff Y, i.e. it thinks all committed XIDs < Y are all-visible. If Y < X, pg_check_visible() might falsely report corruption, because VACUUM might have marked as all-visible some page containing tuples which pg_check_visibile() thinks aren't really all-visible. In reality, the oldest all-visible XID cannot move backward, but ComputeXidHorizons() lets it move backward, because it's intended for use by a caller who wants to mark pages all-visible, and it's only concerned with making sure that the value is old enough to be safe. And that's a problem for the way that pg_visibility is (mis-)using it. To say that another way, ComputeXidHorizons() is perfectly fine with returning a value that is older than the true answer, as long as it never returns a value that is newer than the new answer. pg_visibility wants the opposite. Here, a value that is newer than the true value can't do worse than hide corruption, which is sort of OK, but a value that's older than the true value can report corruption where none exists, which is very bad. I have a feeling, therefore, that this isn't really a complete fix. I think it might address one way for the horizon reported by ComputeXidHorizons() to move backward, but not all the ways. Unfortunately, I am out of time for today to study this... but will try to find more time on another day. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > In reality, the oldest all-visible XID cannot move backward, but > ComputeXidHorizons() lets it move backward, because it's intended for > use by a caller who wants to mark pages all-visible, and it's only > concerned with making sure that the value is old enough to be safe. Right. > And that's a problem for the way that pg_visibility is (mis-)using it. > To say that another way, ComputeXidHorizons() is perfectly fine with > returning a value that is older than the true answer, as long as it > never returns a value that is newer than the new answer. pg_visibility > wants the opposite. Here, a value that is newer than the true value > can't do worse than hide corruption, which is sort of OK, but a value > that's older than the true value can report corruption where none > exists, which is very bad. Maybe we need a different function for pg_visibility to call? If we want ComputeXidHorizons to serve both these purposes, then it has to always deliver exactly the right answer, which seems like a definition that will be hard and expensive to achieve. regards, tom lane
On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Maybe we need a different function for pg_visibility to call? > If we want ComputeXidHorizons to serve both these purposes, then it > has to always deliver exactly the right answer, which seems like > a definition that will be hard and expensive to achieve. Yeah, I was thinking along similar lines. I'm also kind of wondering why these calculations use latestCompletedXid. Is that something we do solely to reduce locking? The XIDs of running transactions matter, and their snapshots matter, and the XIDs that could start running in the future matter, but I don't know why it matters what the latest completed XID is. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote: > On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Maybe we need a different function for pg_visibility to call? > > If we want ComputeXidHorizons to serve both these purposes, then it > > has to always deliver exactly the right answer, which seems like > > a definition that will be hard and expensive to achieve. > > Yeah, I was thinking along similar lines. > > I'm also kind of wondering why these calculations use > latestCompletedXid. Is that something we do solely to reduce locking? > The XIDs of running transactions matter, and their snapshots matter, > and the XIDs that could start running in the future matter, but I > don't know why it matters what the latest completed XID is. Daniel, it seems to me that this thread is waiting for some input from you, based on the remarks of Tom and Robert. Are you planning to do so? This is marked as a bug fix, so I have moved this item to the next CF for now. -- Michael
Attachment
Hi hackers!
Daniel is busy with other tasks. I've found this topic and this problem seems to be actual
or v15 too.
Please correct me if I am wrong. I've checked another discussion related to pg_visibility [1].
According to discussion: if using latest completed xid is not right for checking visibility, than
it should be the least running transaction xid? So it must be another function to be used for
these calculations, not the GetOldestNonRemovableTransactionId that uses
the ComputeXidHorizons.
On Wed, Oct 12, 2022 at 8:15 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maybe we need a different function for pg_visibility to call?
> > If we want ComputeXidHorizons to serve both these purposes, then it
> > has to always deliver exactly the right answer, which seems like
> > a definition that will be hard and expensive to achieve.
>
> Yeah, I was thinking along similar lines.
>
> I'm also kind of wondering why these calculations use
> latestCompletedXid. Is that something we do solely to reduce locking?
> The XIDs of running transactions matter, and their snapshots matter,
> and the XIDs that could start running in the future matter, but I
> don't know why it matters what the latest completed XID is.
Daniel, it seems to me that this thread is waiting for some input from
you, based on the remarks of Tom and Robert. Are you planning to do
so? This is marked as a bug fix, so I have moved this item to the
next CF for now.
--
Michael
Hi hackers!
Just to bump this thread, because the problem seems to be still actual:
Please correct me if I am wrong. I've checked another discussion related to pg_visibility [1].
According to discussion: if using latest completed xid is not right for checking visibility, than
it should be the least running transaction xid? So it must be another function to be used for
these calculations, not the GetOldestNonRemovableTransactionId that uses
the ComputeXidHorizons.
On Wed, Oct 12, 2022 at 8:15 AM Michael Paquier <michael@paquier.xyz> wrote:On Wed, Jul 27, 2022 at 09:47:19PM -0400, Robert Haas wrote:
> On Wed, Jul 27, 2022 at 5:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Maybe we need a different function for pg_visibility to call?
> > If we want ComputeXidHorizons to serve both these purposes, then it
> > has to always deliver exactly the right answer, which seems like
> > a definition that will be hard and expensive to achieve.
>
> Yeah, I was thinking along similar lines.
>
> I'm also kind of wondering why these calculations use
> latestCompletedXid. Is that something we do solely to reduce locking?
> The XIDs of running transactions matter, and their snapshots matter,
> and the XIDs that could start running in the future matter, but I
> don't know why it matters what the latest completed XID is.
Daniel, it seems to me that this thread is waiting for some input from
you, based on the remarks of Tom and Robert. Are you planning to do
so? This is marked as a bug fix, so I have moved this item to the
next CF for now.
--
Michael
This patch has been waiting on the author for about a year now, so I will close it as Returned with Feedback. Plesae feel free to resubmit to a future CF when there is renewed interest in working on this. -- Daniel Gustafsson
Hi! On Tue, Jul 4, 2023 at 10:21 AM Daniel Gustafsson <daniel@yesql.se> wrote: > This patch has been waiting on the author for about a year now, so I will close > it as Returned with Feedback. Plesae feel free to resubmit to a future CF when > there is renewed interest in working on this. I'd like to revive this thread. While the patch proposed definitely makes things better. But as pointed out by Robert and Tom, it didn't allow to avoid all false reports. The reason is that the way we currently calculate the oldest xmin, it could move backwards (see comments to ComputeXidHorizons()). The attached patch implements own function to calculate strict oldest xmin, which should be always greater or equal to any xid horizon calculated before. I have to do the following changes in comparison to what ComputeXidHorizons() do. 1. Ignore processes xmin's, because they take into account connection to other databases which were ignored before. 2. Ignore KnownAssignedXids, because they are not database-aware. While primary could compute its horizons database-aware. 3. Ignore walsender xmin, because it could go backward if some replication connections don't use replication slots. Surely these would significantly sacrifice accuracy. But we have to do so in order to avoid reporting false errors. Any thoughts? ------ Regards, Alexander Korotkov
Attachment
Hi Alexander, 06.11.2023 12:30, Alexander Korotkov wrote: > Surely these would significantly sacrifice accuracy. But we have to do > so in order to avoid reporting false errors. > I've reduced the dirty reproducer Daniel Shelepanov posted initially to the following: numdbs=10 for ((d=1;d<=$numdbs;d++)); do createdb db$d psql db$d -c "create extension pg_visibility" done for ((i=1;i<=300;i++)); do echo "iteration $i" for ((d=1;d<=$numdbs;d++)); do ( echo " create table vacuum_test as select 42 i; vacuum (disable_page_skipping) vacuum_test; select * from pg_check_visible('vacuum_test'); " | psql db$d -a -q >psql-$d.log 2>&1 ) & done wait res=0 for ((d=1;d<=$numdbs;d++)); do grep -q '0 rows' psql-$d.log || { echo "Error condition in psql-$d.log:"; cat psql-$d.log; res=1; break; } psql db$d -q -c "drop table vacuum_test" done [ $res == 0 ] || break; done It looks like the v2 patch doesn't fix the original issue. Maybe I miss something, but with the patch applied, I see the failure immediately, though without the patch several iterations are needed to get it. Best regards, Alexander
Hi, Alexander. On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > It looks like the v2 patch doesn't fix the original issue. Maybe I miss > something, but with the patch applied, I see the failure immediately, > though without the patch several iterations are needed to get it. That's a bug in the patch. Thank you for cathing it. It should start calculation from latestCompletedXid + 1, not InvalidTransactionId. Please, check the revised patch. ------ Regards, Alexander Korotkov
Attachment
07.11.2023 14:38, Alexander Korotkov wrote: > Hi, Alexander. > > On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: >> It looks like the v2 patch doesn't fix the original issue. Maybe I miss >> something, but with the patch applied, I see the failure immediately, >> though without the patch several iterations are needed to get it. > > That's a bug in the patch. Thank you for cathing it. It should start > calculation from latestCompletedXid + 1, not InvalidTransactionId. > Please, check the revised patch. Thanks for looking at this! Unfortunately, I still see the failure with the v3, but not on a first iteration: ... iteration 316 Error condition in psql-8.log: create table vacuum_test as select 42 i; vacuum (disable_page_skipping) vacuum_test; select * from pg_check_visible('vacuum_test'); t_ctid -------- (0,1) (1 row) (I've double-checked that the patch is applied and get_strict_xid_horizon() is called.) Best regards, Alexander
Hi, Alexander! On Tue, Nov 7, 2023 at 3:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > > 07.11.2023 14:38, Alexander Korotkov wrote: > > Hi, Alexander. > > > > On Tue, Nov 7, 2023 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > >> It looks like the v2 patch doesn't fix the original issue. Maybe I miss > >> something, but with the patch applied, I see the failure immediately, > >> though without the patch several iterations are needed to get it. > > > > That's a bug in the patch. Thank you for cathing it. It should start > > calculation from latestCompletedXid + 1, not InvalidTransactionId. > > Please, check the revised patch. > > Thanks for looking at this! > Unfortunately, I still see the failure with the v3, but not on a first > iteration: > ... > iteration 316 > Error condition in psql-8.log: > create table vacuum_test as select 42 i; > vacuum (disable_page_skipping) vacuum_test; > select * from pg_check_visible('vacuum_test'); > t_ctid > -------- > (0,1) > (1 row) > > (I've double-checked that the patch is applied and get_strict_xid_horizon() > is called.) I managed to reproduce this on a Linux VM. This problem should arise because in extension I don't have access to ProcArrayStruct. So, my code is iterating the whole PGPROC's array. I reimplemented the new horizon calculation function in the core with usage of ProcArrayStruct. Now it doesn't fall for me. Please, recheck. ------ Regards, Alexander Korotkov
Attachment
Hi Alexander, 04.12.2023 03:23, Alexander Korotkov wrote: > I managed to reproduce this on a Linux VM. This problem should arise > because in extension I don't have access to ProcArrayStruct. So, my > code is iterating the whole PGPROC's array. I reimplemented the new > horizon calculation function in the core with usage of > ProcArrayStruct. Now it doesn't fall for me. Please, recheck. Yes, v4 works for me as well (thousands of iterations passed). Thank you! Though the test passes even without manipulations with the PROC_IN_VACUUM flag in pg_visibility.c (maybe the test is not good enough to show why those manipulations are needed). I also couldn't see where VISHORIZON_DATA_STRICT comes into play... Best regards, Alexander
Hi! I agree with Alexander Lakhin about PROC_IN_VACUUM and VISHORIZON_DATA_STRICT: 1) probably manipulations with the PROC_IN_VACUUM flag in pg_visibility.c were needed for condition [1] and can be removed now; 2) the VISHORIZON_DATA_STRICT macro is probably unnecessary too (since we are not going to use it in the GlobalVisHorizonKindForRel() function). Also It would be nice to remove the get_strict_xid_horizon() function from the comment (replace to GetStrictOldestNonRemovableTransactionId()?). [1] https://github.com/postgres/postgres/blob/4d0cf0b05defcee985d5af38cb0db2b9c2f8dbae/src/backend/storage/ipc/procarray.c#L1812 -- With best regards, Dmitry Koval Postgres Professional: http://postgrespro.com
Hi! On Tue, Dec 5, 2023 at 9:03 PM Dmitry Koval <d.koval@postgrespro.ru> wrote: > I agree with Alexander Lakhin about PROC_IN_VACUUM and > VISHORIZON_DATA_STRICT: > 1) probably manipulations with the PROC_IN_VACUUM flag in > pg_visibility.c were needed for condition [1] and can be removed now; Right, PROC_IN_VACUUM is no longer required. The possible benefit of it would be to avoid bloat during a possibly long run of pg_visibility() function. But the downside are problems with the snapshot if the invoking query contains something except a single call of the pg_visibility() function, and complexity. Removed. > 2) the VISHORIZON_DATA_STRICT macro is probably unnecessary too (since > we are not going to use it in the GlobalVisHorizonKindForRel() function). Makes sense, removed. > Also It would be nice to remove the get_strict_xid_horizon() function > from the comment (replace to GetStrictOldestNonRemovableTransactionId()?). Right, fixed. The revised patch is attached. Besides the fixes above, it contains improvements for comments and the detailed commit message. Tom, Robert, what do you think about the patch attached? It required a new type of xid horizon in core and sacrifices accuracy. But this is the only way I can imagine, we can fix the problem in a general way. ------ Regards, Alexander Korotkov
Attachment
Hi! Thank you, there is one small point left (in the comment): can you replace "guarantteed to be to be newer" to "guaranteed to be newer", file src/backend/storage/ipc/procarray.c? -- With best regards, Dmitry Koval Postgres Professional: http://postgrespro.com
Hi, Dmitry! On Sat, Jan 13, 2024 at 7:33 PM Dmitry Koval <d.koval@postgrespro.ru> wrote: > Thank you, there is one small point left (in the comment): can you > replace "guarantteed to be to be newer" to "guaranteed to be newer", > file src/backend/storage/ipc/procarray.c? Fixed. Thank you for catching this. ------ Regards, Alexander Korotkov
Attachment
On Sun, Jan 14, 2024 at 4:35 AM Alexander Korotkov <aekorotkov@gmail.com> wrote: > On Sat, Jan 13, 2024 at 7:33 PM Dmitry Koval <d.koval@postgrespro.ru> wrote: > > Thank you, there is one small point left (in the comment): can you > > replace "guarantteed to be to be newer" to "guaranteed to be newer", > > file src/backend/storage/ipc/procarray.c? > > Fixed. Thank you for catching this. I made the following improvements to the patch. 1. I find a way to implement the path with less changes to the core code. The GetRunningTransactionData() function allows to get the least running xid, all I need is to add database-aware values. 2. I added the TAP test reproducing the original issue. I'm going to push this if no objections. ------ Regards, Alexander Korotkov