Thread: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17485 Logged by: Peter Slavov Email address: pet.slavov@gmail.com PostgreSQL version: 14.3 Operating system: Ubuntu 20.04 Description: I a noticed a problem in our production database when rebuilding the primary key index. The database has a lot of INSERTS and UPDATES when this is happening. I was able to reproduce this on PostgreSQL 14.1/2/3 locally on docker instance and on AWS EC2. Here is how you can reproduce: - Load table `test2` from this file containing the table structure and small testing dataset: https://drive.google.com/file/d/1vCqKOda3IIrHmWDzNNJaDQfCb5eQA_D2/view?usp=sharing - Start inserting and updating in this table using pgbench: SHELL: pgbench -r -T 1000 -f test.sql -c 50 -j 50 receiptbank test.sql file: https://drive.google.com/file/d/1mdduGqu1XcDg01iwSysF7szrIlCKUJQT/view?usp=sharing - Reindex the PRIMARY KEY (it is possible to have to repeat this 2-3 times): SQL: REINDEX INDEX CONCURRENTLY test2_pkey - Check index with `amcheck` extension for errors: SQL: select bt_index_check(index => c.oid, heapallindexed => true) from pg_class c where oid = 'test2_pkey'::regclass; SQL result for me: ERROR: heap tuple (13134,18) from table "test2" lacks matching index tuple within index "test2_pkey" - Check for missing records in the index: SQL: WITH missing_check AS ( SELECT t1.*, t1.ctid, EXISTS(SELECT 1 FROM test2 t2 WHERE t2.id = t1.id) AS flag FROM test2 t1) SELECT id, ctid FROM missing_check WHERE flag = false; Result for me was several records found ONLY with the sequential scan... This problem is possibly related to BUG #17478 Thanks for the help, Peter Slavov
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> 18 мая 2022 г., в 15:42, PG Bug reporting form <noreply@postgresql.org> написал(а): > I was able to reproduce this on PostgreSQL 14.1/2/3 locally on > docker instance and on AWS EC2. Uhm, that's very...interesting. I'll look closely next week. Though I didn't have a chance to reproduce yet. We have fixed similar problem in 14.1. And now we have very similar TAP test to you reproduction [0]. How do you think, what'sthe key difference between TAP test and your repro? Thanks! Best regards, Andrey Borodin. [0] https://github.com/postgres/postgres/blob/REL_14_STABLE/contrib/amcheck/t/002_cic.pl
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Thu, May 19, 2022 at 09:22:44AM +0500, Andrey Borodin wrote: > Uhm, that's very...interesting. I'll look closely next week. Though I didn't have a chance to reproduce yet. > We have fixed similar problem in 14.1. And now we have very similar > TAP test to you reproduction [0]. How do you think, what's the key > difference between TAP test and your repro? Interesting, indeed. Another question I have: is this limited to v14 or are you able to see it in older versions? REINDEX CONCURRENTLY has been introduced in v12. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Петър Славов
Date:
На чт, 19.05.2022 г. в 7:32 ч. Michael Paquier <michael@paquier.xyz> написа:
On Thu, May 19, 2022 at 09:22:44AM +0500, Andrey Borodin wrote:
> Uhm, that's very...interesting. I'll look closely next week. Though I didn't have a chance to reproduce yet.
> We have fixed similar problem in 14.1. And now we have very similar
> TAP test to you reproduction [0]. How do you think, what's the key
> difference between TAP test and your repro?
Interesting, indeed. Another question I have: is this limited to v14
or are you able to see it in older versions? REINDEX CONCURRENTLY has
been introduced in v12.
--
Michael
Yes I have made the same test on PostgreSQL 13.7, but the reindex works as expected there (no issues).
I haven't tested on older versions.
Peter
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Петър Славов
Date:
Hi Andrey,
This test looks similar to me but I cannot say what is the difference. My tests are done under heavier load maybe.
Also I couldn't reproduce this with a table with low num number of columns (integrer and text). I am not sure if this is relevant...
Also I couldn't reproduce this with a table with low num number of columns (integrer and text). I am not sure if this is relevant...
Peter
На чт, 19.05.2022 г. в 7:22 ч. Andrey Borodin <x4mmm@yandex-team.ru> написа:
> 18 мая 2022 г., в 15:42, PG Bug reporting form <noreply@postgresql.org> написал(а):
> I was able to reproduce this on PostgreSQL 14.1/2/3 locally on
> docker instance and on AWS EC2.
Uhm, that's very...interesting. I'll look closely next week. Though I didn't have a chance to reproduce yet.
We have fixed similar problem in 14.1. And now we have very similar TAP test to you reproduction [0]. How do you think, what's the key difference between TAP test and your repro?
Thanks! Best regards, Andrey Borodin.
[0] https://github.com/postgres/postgres/blob/REL_14_STABLE/contrib/amcheck/t/002_cic.pl
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Thu, May 19, 2022 at 08:24:27AM +0300, Петър Славов wrote: > Yes I have made the same test on PostgreSQL 13.7, but the reindex works as > expected there (no issues). > I haven't tested on older versions. Okay, thanks. Something that has changed in this area is the addition of c98763b, where spurious waits are avoided in some of the phaese of REINDEX CONCURRENTLY. I am wondering if this is related. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-19, Michael Paquier wrote: > Okay, thanks. Something that has changed in this area is the > addition of c98763b, where spurious waits are avoided in some of the > phaese of REINDEX CONCURRENTLY. I am wondering if this is related. Hmm, yes, it's definitely possible that it is related. I'll have a look soon. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Once again, thank you and all of the developers for your hard work on PostgreSQL. This is by far the most pleasant management experience of any database I've worked on." (Dan Harris) http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Thu, May 19, 2022 at 02:03:56PM +0200, Alvaro Herrera wrote: > Hmm, yes, it's definitely possible that it is related. > > I'll have a look soon. It took me some time to write a script to bisect that, but I have been able to establish a correlation with d9d0762 that causes VACUUM to ignore transactions doing some concurrent reindex operations. I would not be surprised to see that this is also related to some of the reports we have seen lately with reindex operations. There was one with logical replication and missing records from a primary key, I recall. For the stable branches of 14 and 15, I would tend to play it safe and revert d9d0762. I have to admit that f9900df and c98763b stress me a bit, and that we have not have anticipated all the ramifications of this set of changes. Thoughts? -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-23, Michael Paquier wrote: > It took me some time to write a script to bisect that, but I have been > able to establish a correlation with d9d0762 that causes VACUUM to > ignore transactions doing some concurrent reindex operations. I would > not be surprised to see that this is also related to some of the > reports we have seen lately with reindex operations. There was one > with logical replication and missing records from a primary key, I > recall. > > For the stable branches of 14 and 15, I would tend to play it safe and > revert d9d0762. I have to admit that f9900df and c98763b stress me a > bit, and that we have not have anticipated all the ramifications of > this set of changes. Thoughts? Wow, thanks for researching that over the weekend. I think if this is a big enough deal (and I think it may be) then IMO we should revert as you suggest, make an out-of-schedule release, and then I can take some time to investigate in more depth and see if the feature can be salvaged. OTOH if we think an out-of-schedule release is not warranted, then reverting right now is not useful; we can make a decision about that closer to the next minor release, once we've had time to see if the bug can be fixed in some other way that doesn't break the whole feature. Opinions? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Saca el libro que tu religión considere como el indicado para encontrar la oración que traiga paz a tu alma. Luego rebootea el computador y ve si funciona" (Carlos Duclós)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> 23 мая 2022 г., в 13:07, Alvaro Herrera <alvherre@alvh.no-ip.org> написал(а): > > Opinions? I think revert+release is not really a good idea until we understand how this commit breaks things. Chances are that it only affects frequency of the reproduction. When we will understand what is root cause of the bug - it won't take much time to fix things. Best regards, Andrey Borodin.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> 23 мая 2022 г., в 10:40, Michael Paquier <michael@paquier.xyz> написал(а): > > It took me some time to write a script to bisect that, but I have been > able to establish a correlation with d9d0762 that causes VACUUM to > ignore transactions doing some concurrent reindex operations. I've transformed Peter's test into TAP test that runs ~20 seconds and reliably reproduces problem on my laptop. And I observe that commenting out condition in following code fixes the test. //if (!(statusFlags & PROC_IN_SAFE_IC)) h->data_oldest_nonremovable = TransactionIdOlder(h->data_oldest_nonremovable, xmin); Best regards, Andrey Borodin.
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> 23 мая 2022 г., в 15:49, Andrey Borodin <x4mmm@yandex-team.ru> написал(а): > > I've transformed Peter's test into TAP test that runs ~20 seconds and reliably reproduces problem on my laptop. I found out one interesting thing: unindexed tuple (that comes from amcheck scan) does not exist in heap page at the momentof check fail. I've added ReadBuffer() in case of bloom_lacks_element() and ItemIdHasStorage() is false. I understand that this description is a too vague, so I attached a patch for amcheck relaxing bt_index_check() so the testwould pass. Best regards, Andrey Borodin.
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Mon, May 23, 2022 at 2:02 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > I think revert+release is not really a good idea until we understand how this commit breaks things. > Chances are that it only affects frequency of the reproduction. +1 -- it's been in a stable release for months now, and we will probably know the exact nature of the problem in just a few more days. There is no reason to decide that the feature needs to be reverted before anything else. Or if there is I would like to hear it. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Mon, May 23, 2022 at 6:06 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > I found out one interesting thing: unindexed tuple (that comes from amcheck scan) does not exist in heap page at the momentof check fail. That could just be a "downstream problem" from HOT chain corruption. Maybe you'd get a clearer/earlier failure if you also applied this patch, on an assertion-enabled build: https://postgr.es/m/CAH2-Wzk2LeWPwz1wcKNz7Fux4Ogn+PC81H+q7Q7no-5XT0dx3w@mail.gmail.com -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Mon, May 23, 2022 at 10:07:44AM +0200, Alvaro Herrera wrote: > I think if this is a big enough deal (and I think it may be) then IMO we > should revert as you suggest, make an out-of-schedule release, and then > I can take some time to investigate in more depth and see if the feature > can be salvaged. > > OTOH if we think an out-of-schedule release is not warranted, then > reverting right now is not useful; we can make a decision about that > closer to the next minor release, once we've had time to see if the bug > can be fixed in some other way that doesn't break the whole feature. The annoying part is that this can cause silent corruptions for indexes created with REINDEX and CIC, so most users won't know about the failure until they see that their application is broken. And we are just talking about a btree index here, other index AMs may be similarly impacted. So that's rather bad IMHO :/ It seems to me that the problem is around the wait phase after the validation, where the computation of limitXmin coming from the snapshot used for the validation ignores now the impact of VACUUM, hence impacting the timing when the index can be safely used. It also looks like it is possible to build an isolation test, where we use a transaction with a snapshot older than the REINDEX to force it to wait in the first WaitForOlderSnapshots() call. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Mon, May 23, 2022 at 03:49:02PM +0500, Andrey Borodin wrote: > I've transformed Peter's test into TAP test that runs ~20 seconds > and reliably reproduces problem on my laptop. Thanks for the TAP test. That's nice. It actually passes here, reliably. > And I observe that commenting out condition in following code fixes the test. > //if (!(statusFlags & PROC_IN_SAFE_IC)) > h->data_oldest_nonremovable = > TransactionIdOlder(h->data_oldest_nonremovable, xmin); Well, by doing so, I think that you are just making the CIC/REINDEX wait again until the index is safe to use, but we want to skip this wait as of the optimization done in d9d0762. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Mon, May 23, 2022 at 6:20 PM Michael Paquier <michael@paquier.xyz> wrote: > > And I observe that commenting out condition in following code fixes the test. > > //if (!(statusFlags & PROC_IN_SAFE_IC)) > > h->data_oldest_nonremovable = > > TransactionIdOlder(h->data_oldest_nonremovable, xmin); > > Well, by doing so, I think that you are just making the CIC/REINDEX > wait again until the index is safe to use, but we want to skip this > wait as of the optimization done in d9d0762. Uh...isn't that exactly the point that Andrey made himself, in posting the snippet? You seem to be addressing this PROC_IN_SAFE_IC snippet as if Andrey formally proposed it as a bugfix, which seems like an odd interpretation to me. It seems pretty clear to me that Andrey was just making an observation, in case it helped with debugging. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-23, Peter Geoghegan wrote: > You seem to be addressing this PROC_IN_SAFE_IC snippet as if Andrey > formally proposed it as a bugfix, which seems like an odd > interpretation to me. It seems pretty clear to me that Andrey was just > making an observation, in case it helped with debugging. Right. I approached it yesterday by running the test case with each set_indexsafe_procflags() callsite commented out, see which one breaks things. Didn't reach any conclusion because I ran into thermal problems with my laptop, which got me angry and couldn't make any further progress. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "I'm always right, but sometimes I'm more right than other times." (Linus Torvalds)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 24 May 2022, at 14:02, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > I approached it yesterday by running the test case with each > set_indexsafe_procflags() callsite commented out, see which one breaks > things On my machine commenting out set_indexsafe_procflags() before "Phase 3 of concurrent index build" fixes the tests. > On 23 May 2022, at 23:18, Peter Geoghegan <pg@bowt.ie> wrote: > > Maybe you'd get a clearer/earlier failure if you also applied this > patch, on an assertion-enabled build I've tried this approach, but nothing actually seem to change... BTW I used a three-way-merge rebase, there is a slight conflictin comments. But, luckily, comments don't run. > On 24 May 2022, at 06:20, Michael Paquier <michael@paquier.xyz> wrote: > > Thanks for the TAP test. That's nice. It actually passes here, > reliably. IDK. Maybe if you increase --time of pgbench you will observe the problem... > On 24 May 2022, at 07:19, Peter Geoghegan <pg@bowt.ie> wrote: > > Andrey was just > making an observation, in case it helped with debugging. Yes, I'm not proposing to commit anything so far. All my tests, snippets, diffs here are only debug stuff. Thank you! Best regards, Andrey Borodin.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 11:02:12 +0200, Alvaro Herrera wrote: > On 2022-May-23, Peter Geoghegan wrote: > > > You seem to be addressing this PROC_IN_SAFE_IC snippet as if Andrey > > formally proposed it as a bugfix, which seems like an odd > > interpretation to me. It seems pretty clear to me that Andrey was just > > making an observation, in case it helped with debugging. > > Right. > > I approached it yesterday by running the test case with each > set_indexsafe_procflags() callsite commented out, see which one breaks > things. Didn't reach any conclusion because I ran into thermal problems > with my laptop, which got me angry and couldn't make any further > progress. Do we have any idea what really causes the corruption? One thing that'd be worth excluding is the use of parallel index builds. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Tue, May 24, 2022 at 9:37 AM Andres Freund <andres@anarazel.de> wrote: > Do we have any idea what really causes the corruption? I don't think so. Andrey's tap test fails for me on 14 as expected, and does so reliably -- so there is a fairly good reproducer for this. I don't have time to debug this right now (need to work on my pgCon talk), but it would probably be straightforward to get an RR recording of the failure. > One thing that'd be worth excluding is the use of parallel index builds. I can rule out a problem with parallel index builds -- disabling them in the tap test doesn't alter the outcome. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 09:37:02 -0700, Andres Freund wrote: > On 2022-05-24 11:02:12 +0200, Alvaro Herrera wrote: > > I approached it yesterday by running the test case with each > > set_indexsafe_procflags() callsite commented out, see which one breaks > > things. Didn't reach any conclusion because I ran into thermal problems > > with my laptop, which got me angry and couldn't make any further > > progress. > > Do we have any idea what really causes the corruption? Reproed the problem with the pgbench script, against an existing cluster. One note: With fsync=on, it's much harder to reproduce. One instance: pgbench: error: client 0 script 2 aborted in command 3 query 0: ERROR: heap tuple (37,56) from table "tbl" lacks matchingindex tuple within index "idx" HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error. the last record for that offset is (with waldump enhanced to print offsets): rmgr: Heap2 len (rec/tot): 60/ 60, tx: 0, lsn: 0/8E56D218, prev 0/8E56D1F0, desc: PRUNE latestRemovedXid1878329 nredirected 1 ndead 0 nunused 1, unused: [178], redirected: [56->53], blkref #0: rel 1663/5/27905blk 37 the tuple originally was inserted: rmgr: Heap len (rec/tot): 95/ 95, tx: 1802927, lsn: 0/89776380, prev 0/89776358, desc: INSERT off 56 flags0x04, blkref #0: rel 1663/5/27905 blk 37 rmgr: Heap len (rec/tot): 48/ 48, tx: 1802927, lsn: 0/89776460, prev 0/89776420, desc: HEAP_CONFIRM off56, blkref #0: rel 1663/5/27905 blk 37 and updated a "while" back: rmgr: Heap len (rec/tot): 54/ 54, tx: 1804510, lsn: 0/898B5E68, prev 0/898B5E40, desc: LOCK off 56: xid1804510: flags 0x00 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/5/27905 blk 37 rmgr: Heap len (rec/tot): 73/ 73, tx: 1804510, lsn: 0/898B5EA0, prev 0/898B5E68, desc: HOT_UPDATE off 56xmax 1804510 flags 0x60 ; new off 80 xmax 1804510, blkref #0: rel 1663/5/27905 blk 37 since which there have been numerous redirections to other tuples: rmgr: Heap2 len (rec/tot): 212/ 212, tx: 0, lsn: 0/89AC6B40, prev 0/89AC6B00, desc: PRUNE latestRemovedXid1806705 nredirected 16 ndead 0 nunused 47, unused: [47, 61, 12, 14, 38, 52, 53, 60, 75, 16, 17, 31, 50, 33,59, 76, 49, 51, 54, 63, 89, 95, 18, 19, 29, 32, 48, 62, 81, 85, 88, 106, 28, 39, 40, 44, 26, 27, 45, 58, 73, 91, 34, 41,80, 86, 96], redirected: [1->74, 3->93, 4->37, 7->94, 8->72, 9->87, 10->105, 15->108, 21->71, 23->98, 30->79, 35->102,55->92, 56->107, 67->103, 84->104], blkref #0: rel 1663/5/27905 blk 37 rmgr: Heap2 len (rec/tot): 220/ 220, tx: 0, lsn: 0/89DC6240, prev 0/89DC6218, desc: PRUNE latestRemovedXid1810120 nredirected 27 ndead 0 nunused 29, unused: [74, 88, 93, 19, 94, 53, 59, 105, 16, 29, 118, 108, 71,26, 38, 98, 79, 52, 102, 31, 54, 92, 107, 50, 113, 103, 104, 60, 85], redirected: [1->122, 3->62, 5->49, 7->91, 10->121,15->28, 21->51, 23->119, 24->58, 30->73, 33->81, 34->86, 35->89, 36->14, 42->76, 45->112, 48->75, 55->80, 56->116,67->17, 68->115, 70->61, 78->117, 83->27, 84->18, 97->120, 99->114], blkref #0: rel 1663/5/27905 blk 37 .... rmgr: Heap2 len (rec/tot): 60/ 60, tx: 0, lsn: 0/8E56D218, prev 0/8E56D1F0, desc: PRUNE latestRemovedXid1878329 nredirected 1 ndead 0 nunused 1, unused: [178], redirected: [56->53], blkref #0: rel 1663/5/27905blk 37 the target of the redirection that's pruned away in 0/8E56D218 is more recent: rmgr: Heap len (rec/tot): 54/ 54, tx: 1878329, lsn: 0/8E556070, prev 0/8E556020, desc: LOCK off 178: xid1878329: flags 0x00 LOCK_ONLY EXCL_LOCK , blkref #0: rel 1663/5/27905 blk 37 rmgr: Heap len (rec/tot): 74/ 74, tx: 1878329, lsn: 0/8E5560A8, prev 0/8E556070, desc: HOT_UPDATE off 178xmax 1878329 flags 0x60 ; new off 53 xmax 1878329, blkref #0: rel 1663/5/27905 blk 37 I suspect the problem might be related to pruning done during the validation scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids its own snapshot might need. Which will wreak havoc during the validation scan. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 11:15:42 -0700, Andres Freund wrote: > I suspect the problem might be related to pruning done during the validation > scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids > its own snapshot might need. Which will wreak havoc during the validation > scan. Looking at it more, I don't see when PROC_IN_SAFE_IC is safe as defined, tbh. Afaict it basically completely breaks snapshots - which we rely on to work for both the initial build scan and then for the validation scan. It seems safe to make CIC to ignore other CICs when deciding whether to wait for those transactions. But it seems utterly unsafe to ignore CICs when when determining horizons. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: > > With fsync=on, it's much harder to reproduce. That exaplains why it's easier to reproduce on MacOS: it seem it ignores fsync. > On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: > > I suspect the problem might be related to pruning done during the validation > scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids > its own snapshot might need. Which will wreak havoc during the validation > scan. I observe that removing PROC_IN_SAFE_IC for index_validate() fixes tests. But why it's not a problem for index_build() scan? And I do not understand why it's a problem that tuple is pruned during the scan... How does this "wreak havoc" happen? Best regards, Andrey Borodin.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 10:38:14 -0700, Peter Geoghegan wrote: > On Tue, May 24, 2022 at 9:37 AM Andres Freund <andres@anarazel.de> wrote: > > Do we have any idea what really causes the corruption? > > I don't think so. I think I found it: https://postgr.es/m/20220524183705.cmgbqq32z63qynhe%40alap3.anarazel.de afaict PROC_IN_SAFE_IC is completely broken right now. Any concurrent prune can remove prune rows that are visible to the snapshot held by the PROC_IN_SAFE_IC backend. Which basically makes them "fair weather snapshots" - they work only as long as there is no concurrent activity. Similar behavior is fine for VACUUM - it doesn't use a snapshot / need a consistent view of the table. But not for CIC - otherwise it could just use SnapshotAny or such. I don't really see a realistic alternative other than reverting at this point. I think this needs to be rethought fairly fundamentally. > Andrey's tap test fails for me on 14 as expected, and does so reliably > -- so there is a fairly good reproducer for this. > > I don't have time to debug this right now (...), but it would probably be > straightforward to get an RR recording of the failure. I tried that, but it didn't repro under rr within 15min or so. > (need to work on my pgCon talk) Good luck :) > > One thing that'd be worth excluding is the use of parallel index builds. > > I can rule out a problem with parallel index builds -- disabling them > in the tap test doesn't alter the outcome. Good. Just to clarify: I was suspicious of PROC_IN_SAFE_IC being set incoherently in parallel workers or such, not of parallel index builds "in general". Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 23:38:07 +0500, Andrey Borodin wrote: > > > > On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: > > > > With fsync=on, it's much harder to reproduce. > That exaplains why it's easier to reproduce on MacOS: it seem it ignores fsync. Yea, one needs wal_sync_method=fsync_writethrough or such :/ > > On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: > > > > I suspect the problem might be related to pruning done during the validation > > scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids > > its own snapshot might need. Which will wreak havoc during the validation > > scan. > > I observe that removing PROC_IN_SAFE_IC for index_validate() fixes tests. > But why it's not a problem for index_build() scan? I now suspect it's a problem for both, just more visible for index_validate(). > And I do not understand why it's a problem that tuple is pruned during the scan... How does this "wreak havoc" happen? Basically snapshots don't work anymore. If PROC_IN_SAFE_IC is set, that backend is ignored for the horizon computation for snapshots / on-access HOT pruning. Which means that rows that are visible to the snapshot can be pruned away. One might think that could be safe, after all the row is invisible to all other backends. The problem is that the validation scan won't see *newer* rows either, since they're not visible to the snapshot either. And if the new row version is a HOT tuple, it won't have made an index entry on its own. Boom, corruption. Basically: 1) S1 builds index in phase 2 2) S2 inserts tuple t1 (not in the index built in 1), since it's inserted after that) 3) S2 hot updates tuple t1->t2 4) S1 sets PROC_IN_SAFE_IC, builds snapshot, starts validation scan (phase 3) 5) S2 hot updates tuple t2->t3 6) Either S1 or S2 performs hot pruning, redirecting t1 to t3, this is only possible because PROC_IN_SAFE_IC caused S2's ->xmin to be ignored 7) S2 checks t1->t3, finds that t3 is too new for the snapshot, doesn't create an index entry 8) corruption Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Greg Stark
Date:
On Tue, 24 May 2022 at 15:02, Andres Freund <andres@anarazel.de> wrote: > > Basically: > > 1) S1 builds index in phase 2 > 2) S2 inserts tuple t1 (not in the index built in 1), since it's inserted after that) > 3) S2 hot updates tuple t1->t2 Not that it matters but is this step even necessary? > 4) S1 sets PROC_IN_SAFE_IC, builds snapshot, starts validation scan (phase 3) > 5) S2 hot updates tuple t2->t3 That seems like the key observation. But I wonder if it's even the only flow where this could be an issue. What happens if t2 is deleted, can it get pruned away completely? > 6) Either S1 or S2 performs hot pruning, redirecting t1 to t3, this is only > possible because PROC_IN_SAFE_IC caused S2's ->xmin to be ignored Or presumably any other transaction. But ... does the update to t2->t3 not automatically trigger pruning anyways? > 7) S2 checks t1->t3, finds that t3 is too new for the snapshot, doesn't create > an index entry Just to be clear, it would normally have created an index entry (for the whole HOT chain) because t2 is in the recheck snapshot and therefore the whole HOT chain wasn't in the initial snapshot. I'm a little confused here. > 8) corruption Aside from amcheck I wonder if we can come up with any way for users to tell whether their index is affected or at risk. Like, is there a way to tell from catalog entries if an index was created with CIC? -- greg
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-24 17:11:12 -0400, Greg Stark wrote: > On Tue, 24 May 2022 at 15:02, Andres Freund <andres@anarazel.de> wrote: > > > > Basically: > > > > 1) S1 builds index in phase 2 > > 2) S2 inserts tuple t1 (not in the index built in 1), since it's inserted > after that) > > 3) S2 hot updates tuple t1->t2 > > Not that it matters but is this step even necessary? I think it is, but there might be other recipes reproducing the problem. > > 4) S1 sets PROC_IN_SAFE_IC, builds snapshot, starts validation scan (phase 3) > > 5) S2 hot updates tuple t2->t3 > > That seems like the key observation. But I wonder if it's even the > only flow where this could be an issue. What happens if t2 is deleted, > can it get pruned away completely? Yes it could, but afaics that'd be fine, because then there's no missing index entry. And the index should only be marked valid once all older snapshots have ended. > > 6) Either S1 or S2 performs hot pruning, redirecting t1 to t3, this is only > > possible because PROC_IN_SAFE_IC caused S2's ->xmin to be ignored > > Or presumably any other transaction. Right. > But ... does the update to t2->t3 not automatically trigger pruning anyways? We don't prune during updates right now (but do when fetching the row to update) - I think that's bad, but it's how it is. When you say "automatically" - do you mean that it'd happen unconditionally, independent of the horizon? It shouldn't... > > 7) S2 checks t1->t3, finds that t3 is too new for the snapshot, doesn't create > > an index entry > > Just to be clear, it would normally have created an index entry (for > the whole HOT chain) because t2 is in the recheck snapshot and > therefore the whole HOT chain wasn't in the initial snapshot. I'm a > little confused here. Hm? Why / where would we have done that? It's a HOT update, so the UPDATE doesn't create an index entry. And the validate scan won't see the HOT chain because t2 has been pruned away and t3 is too new. What "recheck snapshot" are you referring to? The one passed to validate_index()? > > 8) corruption > > Aside from amcheck I wonder if we can come up with any way for users > to tell whether their index is affected or at risk. Like, is there a > way to tell from catalog entries if an index was created with CIC? Not reliably, afaik. indcheckxmin won't ever be set for a CIC index IIRC, but it's not reliably set for a non-CIC index. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Tue, May 24, 2022 at 03:24:33PM -0700, Andres Freund wrote: > On 2022-05-24 17:11:12 -0400, Greg Stark wrote: >> Aside from amcheck I wonder if we can come up with any way for users >> to tell whether their index is affected or at risk. Like, is there a >> way to tell from catalog entries if an index was created with CIC? > > Not reliably, afaik. indcheckxmin won't ever be set for a CIC index IIRC, but > it's not reliably set for a non-CIC index. When it comes to REINDEX, we recreate entirely a new relation for the concurrent flavor, dropping the old one. Hence its OID changes, while all the data from the old relation gets copied over. That's not much and you cannot use that for a CIC, still.. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 25 May 2022, at 00:01, Andres Freund <andres@anarazel.de> wrote: > >>> On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: >>> >>> I suspect the problem might be related to pruning done during the validation >>> scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids >>> its own snapshot might need. Which will wreak havoc during the validation >>> scan. >> >> I observe that removing PROC_IN_SAFE_IC for index_validate() fixes tests. >> But why it's not a problem for index_build() scan? > > I now suspect it's a problem for both, just more visible for index_validate(). No, I understood now. index_build() in CIC can be safely replaced by index_build_empty(). index_build() is only called because many AMs can make index more efficient than index_build_empty()+inserts. BTW for other indexes that are built just by inserts we can safely optimize CIC by calling index_build_empty() instead ofindex_build() in Phase 1. >> And I do not understand why it's a problem that tuple is pruned during the scan... How does this "wreak havoc" happen? > > Basically snapshots don't work anymore I got it! (almost) Thank you! Now I see why tests in 002_cic.pl did not catch that: there were no HOT updates. So, can we leave the feature enabled for all the process besides index validation? We don't seem to need a valid snapshotin CIC anywhere else. Or it's safer to revert whole commit? Thanks! Best regards, Andrey Borodin.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-25 09:00:47 +0500, Andrey Borodin wrote: > > On 25 May 2022, at 00:01, Andres Freund <andres@anarazel.de> wrote: > > > >>> On 24 May 2022, at 23:15, Andres Freund <andres@anarazel.de> wrote: > >>> > >>> I suspect the problem might be related to pruning done during the validation > >>> scan. Once PROC_IN_SAFE_IC is set, the backend itself will not preserve tids > >>> its own snapshot might need. Which will wreak havoc during the validation > >>> scan. > >> > >> I observe that removing PROC_IN_SAFE_IC for index_validate() fixes tests. > >> But why it's not a problem for index_build() scan? > > > > I now suspect it's a problem for both, just more visible for index_validate(). > No, I understood now. index_build() in CIC can be safely replaced by > index_build_empty(). I'm not convinced that that is true. I think at least for stuff like indexes used in constraints (unique or exclusion) we need to do something like the current approach. But more importantly, there's a difference between not inserting any index entries, and inserting entries based on a essentially corrupted snapshot. Using a too old snapshot (which is what we have here) could lead to all sorts of oddities when following HOT chains etc. Which might end up inserting *too much* into the index, I think. > So, can we leave the feature enabled for all the process besides index > validation? We don't seem to need a valid snapshot in CIC anywhere else. Or > it's safer to revert whole commit? I doubt it's safe in the other places either. We can't just do a table scan, which expects to be passed a valid snapshot, with a broken snapshot, and expect no problems. At the very least I expect there to be a potential of spurious warnings / errors. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-24, Andres Freund wrote: > One might think that could be safe, after all the row is invisible to all > other backends. The problem is that the validation scan won't see *newer* rows > either, since they're not visible to the snapshot either. And if the new row > version is a HOT tuple, it won't have made an index entry on its own. Boom, > corruption. Whoa. My oversight here -- I failed to picture a HOT prune concurrently with CIC or RIC. > Basically snapshots don't work anymore. If PROC_IN_SAFE_IC is set, > that backend is ignored for the horizon computation for snapshots / > on-access HOT pruning. Which means that rows that are visible to the > snapshot can be pruned away. I wondered if we could have different tuple horizons for HOT pruning than for vacuum, but looking at ComputeXidHorizons() and users of that, it looks complicated to adapt. Another possibility (than reverting the commit altogether) might be to disable HOT pruning while a process is operating on that relation with PROC_IN_SAFE_IC. So CIC/RIC processes are still ignored for VACUUM, while not creating corrupted indexes. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Postgres is bloatware by design: it was built to house PhD theses." (Joey Hellerstein, SIGMOD annual conference 2002)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Robert Haas
Date:
On Wed, May 25, 2022 at 7:45 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Basically snapshots don't work anymore. If PROC_IN_SAFE_IC is set, > > that backend is ignored for the horizon computation for snapshots / > > on-access HOT pruning. Which means that rows that are visible to the > > snapshot can be pruned away. > > I wondered if we could have different tuple horizons for HOT pruning > than for vacuum, but looking at ComputeXidHorizons() and users of that, > it looks complicated to adapt. > > Another possibility (than reverting the commit altogether) might be to > disable HOT pruning while a process is operating on that relation with > PROC_IN_SAFE_IC. So CIC/RIC processes are still ignored for VACUUM, > while not creating corrupted indexes. I'm not sure that would be a win, because HOT pruning is great as long as the tuples you're pruning are old enough. Also, it seems like it would require complex new infrastructure that I think we should be reluctant to invent in back branches. It seems to me that we should just revert. As far as I can see, and for sure I might be missing something, this is a classic case of an idea that seemed good at the time but turns out not to work. When we look at a recently HOT-updated tuple, we need to know whether the original insertion happened before or after the index build. We can't figure that out if we prune away the tuples that store the xmin values that we need in order to figure that out. So it turns out we need everyone to respect that snapshot after all. Bummer. -- Robert Haas EDB: http://www.enterprisedb.com
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-25, Robert Haas wrote: > On Wed, May 25, 2022 at 7:45 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Another possibility (than reverting the commit altogether) might be to > > disable HOT pruning while a process is operating on that relation with > > PROC_IN_SAFE_IC. So CIC/RIC processes are still ignored for VACUUM, > > while not creating corrupted indexes. > > I'm not sure that would be a win, because HOT pruning is great as long > as the tuples you're pruning are old enough. Well, the point is that VACUUM could still prune dead tuples that are newer than the CIC in all *other* tables. VACUUM cannot run in the table being processed by CIC/RIC (because of locks), so there's no change; it's only HOT-pruning in the table being processed that would change. > Also, it seems like it would require complex new infrastructure that I > think we should be reluctant to invent in back branches. This is definitely true. And I think this would be expensive, because we'd have to check in every heap_page_prune call. > It seems to me that we should just revert. Deciding to revert makes me sad, because this feature is extremely valuable for users. However, I understand the danger and I don't disagree with the rationale so I can't really object. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "¿Cómo puedes confiar en algo que pagas y que no ves, y no confiar en algo que te dan y te lo muestran?" (Germán Poo)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-25 18:43:22 +0200, Alvaro Herrera wrote: > On 2022-May-25, Robert Haas wrote: > > Also, it seems like it would require complex new infrastructure that I > > think we should be reluctant to invent in back branches. > > This is definitely true. And I think this would be expensive, because > we'd have to check in every heap_page_prune call. I think the cost could be addressed, along the lines of the mechanism I put in as part of the snapshot scalability work. I.e. don't compute an accurate horizon when not needed for pruning, only do so when within a certain range of xids. But it seems still way too invasive for the back branches. Quite obviously we need a lot more testing for this etc. I'm also doubtful it's the right approach. The problem here comes from needing a snapshot for the entire duration of the validation scan. ISTM that we should work on not needing that snapshot, rather than trying to reduce the consequences of holding that snapshot. I think it might be possible to avoid it. Random sketch: We could prevent HOT updates during CIC for rows inserted during the first scan. If we did that we IIRC could rely on the xids of the last row version to determine whether an index insertion is needed during the validation scan. > > It seems to me that we should just revert. > > Deciding to revert makes me sad, because this feature is extremely > valuable for users. However, I understand the danger and I don't > disagree with the rationale so I can't really object. I sadly don't see how we could develop a reliable reimplementation of this feature without delaying (or destabilizing) the release... Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Robert Haas
Date:
On Wed, May 25, 2022 at 12:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > It seems to me that we should just revert. > > Deciding to revert makes me sad, because this feature is extremely > valuable for users. However, I understand the danger and I don't > disagree with the rationale so I can't really object. Right, I mean I'm not saying I *like* reverting, and I'm not disputing it's a good feature. Just that, if we have to choose between this feature and not having index corruption, we better not have index corruption.... and I'm not seeing any way that we can just tweak this and make it work. -- Robert Haas EDB: http://www.enterprisedb.com
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Wed, May 25, 2022 at 10:18 AM Robert Haas <robertmhaas@gmail.com> wrote: > Right, I mean I'm not saying I *like* reverting, and I'm not disputing > it's a good feature. Just that, if we have to choose between this > feature and not having index corruption, we better not have index > corruption.... and I'm not seeing any way that we can just tweak this > and make it work. Are we any closer to deciding on a timeline, in light of recent discussion? I'm now convinced that an out-of-schedule release is probably the way to go. My thinking is: are we really going to make users wait for August 11th for a fix? -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Bruce Momjian
Date:
On Wed, May 25, 2022 at 10:32:43AM -0700, Peter Geoghegan wrote: > On Wed, May 25, 2022 at 10:18 AM Robert Haas <robertmhaas@gmail.com> wrote: > > Right, I mean I'm not saying I *like* reverting, and I'm not disputing > > it's a good feature. Just that, if we have to choose between this > > feature and not having index corruption, we better not have index > > corruption.... and I'm not seeing any way that we can just tweak this > > and make it work. > > Are we any closer to deciding on a timeline, in light of recent discussion? > > I'm now convinced that an out-of-schedule release is probably the way > to go. My thinking is: are we really going to make users wait for > August 11th for a fix? No, we are unlikely to wait until August. What do we currently know? * Caused by CREATE/REINDEX CONCURRENTLY * No way to detect which indexes were created concurrently * amcheck will be required to find corrupt indexes (options?) * Heap is fine, only index is corrupt * Bug since PG 14 * Not known why reports are only appearing now * Feature will likely need to be removed rather than fixed -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Wed, May 25, 2022 at 11:17:33PM -0400, Bruce Momjian wrote: > * Caused by CREATE/REINDEX CONCURRENTLY Yes. > * No way to detect which indexes were created concurrently > * amcheck will be required to find corrupt indexes (options?) amcheck would help for btree, not the other AMs. I am not sure but there could be other symptoms that may not be detected by amcheck? > * Heap is fine, only index is corrupt Yes. > * Bug since PG 14 14.0, yes. > * Not known why reports are only appearing now > * Feature will likely need to be removed rather than fixed Trends go in this direction. It does not seem like there is a complete conclusion on this point, either. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 26 May 2022, at 08:39, Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, May 25, 2022 at 11:17:33PM -0400, Bruce Momjian wrote: >> * Caused by CREATE/REINDEX CONCURRENTLY > > Yes. > >> * No way to detect which indexes were created concurrently >> * amcheck will be required to find corrupt indexes (options?) > > amcheck would help for btree, not the other AMs. I am not sure but > there could be other symptoms that may not be detected by amcheck? Noah invented a neat way to do check in pre-heapallindexed versions [0]. I think we can make GiST and some others AMs bechecked this way. BTW, patches for GiST and GIN verification are circulating in pgsql-hackers. Would anyone be interested in reviewing? Ifso - I'll add heapallindexed to GiST verification. Last GIN version by Heikki was in a good condition, though without heapallindexed AFAIK. [1] [2] Best regards, Andrey Borodin. [0] https://github.com/postgres/postgres/blob/REL_11_STABLE/src/bin/pgbench/t/022_cic.pl#L28-L40 [1] https://www.postgresql.org/message-id/flat/a5bc2aad-464b-91bd-061d-28af0f9b634c%40iki.fi#222ef3f6cbdd6f86172b38ff7663e6a3 [2] https://www.postgresql.org/message-id/CAF3eApa07-BajjG8%2BRYx-Dr_cq28ZA0GsZmUQrGu5b2ayRhB5A%40mail.gmail.com
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Петър Славов
Date:
On Thu, May 26, 2022 at 6:17 AM Bruce Momjian <bruce@momjian.us> wrote:
What do we currently know?
* Caused by CREATE/REINDEX CONCURRENTLY
* No way to detect which indexes were created concurrently
* amcheck will be required to find corrupt indexes (options?)
Actually in the initial bug report I posted an SQL that is testing sequential scan VS index scan to find the missing records in the index.
This is how we initially found the problem.
I am not sure how these missing records are related to the `amcheck` problems because the `ctid`s does not match exactly.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Indecision is a decision. Inaction is an action. Mark Batterson
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Bruce Momjian
Date:
On Thu, May 26, 2022 at 12:39:20PM +0900, Michael Paquier wrote: > On Wed, May 25, 2022 at 11:17:33PM -0400, Bruce Momjian wrote: > > * Caused by CREATE/REINDEX CONCURRENTLY > > Yes. > > > * No way to detect which indexes were created concurrently > > * amcheck will be required to find corrupt indexes (options?) > > amcheck would help for btree, not the other AMs. I am not sure but > there could be other symptoms that may not be detected by amcheck? > > > * Heap is fine, only index is corrupt > > Yes. > > > * Bug since PG 14 > > 14.0, yes. > > > * Not known why reports are only appearing now > > * Feature will likely need to be removed rather than fixed > > Trends go in this direction. It does not seem like there is a > complete conclusion on this point, either. Uh, if people don't know if they have used CREATE/REINDEX CONCURRENTLY in PG 14.0+, they are going to need to run amcheck on all btree indexes and reindex all non-btree indexes? That is going to be a painful message to deliver, and hear. :-( -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Christophe Pettus
Date:
> On May 26, 2022, at 07:40, Bruce Momjian <bruce@momjian.us> wrote: > > Uh, if people don't know if they have used CREATE/REINDEX CONCURRENTLY > in PG 14.0+, they are going to need to run amcheck on all btree indexes > and reindex all non-btree indexes? That is going to be a painful > message to deliver, and hear. :-( Unless we are 100% sure that amcheck will never return a false negative in this situation, we may have to recommend reindexingeverything. Ouch.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Bruce Momjian
Date:
On Thu, May 26, 2022 at 07:41:28AM -0700, Christophe Pettus wrote: > > > > On May 26, 2022, at 07:40, Bruce Momjian <bruce@momjian.us> wrote: > > > > Uh, if people don't know if they have used CREATE/REINDEX CONCURRENTLY > > in PG 14.0+, they are going to need to run amcheck on all btree indexes > > and reindex all non-btree indexes? That is going to be a painful > > message to deliver, and hear. :-( > > Unless we are 100% sure that amcheck will never return a false negative in this situation, we may have to recommend reindexingeverything. Ouch. I assume we are then looking at reindexdb, which has a parallel option. I assume we can also recommend --concurrently on a patched PG 14.4, which might not be terrible since it would not cause downtime. I guess the command would be: $ reindexdb --all --concurrently --jobs ## Without reindexdb's features, this would be much more painful. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Thu, May 26, 2022 at 10:48:27AM -0400, Bruce Momjian wrote: > I assume we are then looking at reindexdb, which has a parallel option. > I assume we can also recommend --concurrently on a patched PG 14.4, > which might not be terrible since it would not cause downtime. I guess > the command would be: > > $ reindexdb --all --concurrently --jobs ## > > Without reindexdb's features, this would be much more painful. Worth noting that CONCURRENTLY is not supported on system catalogs so these are safe from the start, and that reindexdb skips the parallel processing of catalogs when using --jobs and --concurrently. So that would be non-intrusive. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-25 18:43:22 +0200, Alvaro Herrera wrote: > On 2022-May-25, Robert Haas wrote: > > It seems to me that we should just revert. > > Deciding to revert makes me sad, because this feature is extremely > valuable for users. However, I understand the danger and I don't > disagree with the rationale so I can't really object. Since we haven't come up with a better plan, let's do this? Seems like we should add a test for HOT vs CONCURRENTLY? Seems easy enough to screw up some way in the future... Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 28 May 2022, at 04:13, Andres Freund <andres@anarazel.de> wrote: > > Seems like we should add a test for HOT vs CONCURRENTLY? This test fails on my machine with 35K transactions. And fails in ~50% of cases with 30K transactions. And this seems like a lot of work, ~10 CPU seconds. How can we reduce the time to reproduce? Best regards, Andrey Borodin.
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-28 11:26:55 +0500, Andrey Borodin wrote: > > On 28 May 2022, at 04:13, Andres Freund <andres@anarazel.de> wrote: > > > > Seems like we should add a test for HOT vs CONCURRENTLY? > > This test fails on my machine with 35K transactions. And fails in ~50% of cases with 30K transactions. > And this seems like a lot of work, ~10 CPU seconds. How can we reduce the time to reproduce? I think you basically need to force some, but not all, of the modifying transactions to be open for a bit longer, so that it's more likely that there's a chance to prune vs CIC waiting. Might also be helpful to update rows multiple times within an xact. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 28 May 2022, at 12:02, Andres Freund <andres@anarazel.de> wrote: > > I think you basically need to force some, but not all, of the modifying > transactions to be open for a bit longer, so that it's more likely that > there's a chance to prune vs CIC waiting. Might also be helpful to update rows > multiple times within an xact. Now I've got 2 different versions of test for master branch. Both fail in 50% of cases on my machine. Both take approximately4 seconds of wallclock time and 1 second of CPU time. v3: wait with a fraction of waiting transactions. This test fails with 0 postgres 0x00000001049ec508 ExceptionalCondition + 124 1 postgres 0x00000001045ea284 heap_page_prune + 2992 2 postgres 0x00000001045e9670 heap_page_prune_opt + 424 3 postgres 0x00000001045e25c0 heapam_index_fetch_tuple + 140 4 postgres 0x0000000100272d60 index_fetch_heap + 104 5 postgres 0x0000000100272e18 index_getnext_slot + 88 6 postgres 0x00000001003bbf4c check_exclusion_or_unique_constraint + 440 7 postgres 0x00000001003bc360 ExecCheckIndexConstraints + 232 8 postgres 0x00000001003ea30c ExecInsert + 1024 9 postgres 0x00000001003e90cc ExecModifyTable + 1536 10 postgres 0x00000001003bd0cc standard_ExecutorRun + 268 11 postgres 0x0000000100542d94 ProcessQuery + 160 12 postgres 0x00000001005423c8 PortalRunMulti + 396 13 postgres 0x0000000100541cfc PortalRun + 476 And reverting d9d0762 does not fix the issue. I'm not sure if I'm observing some other problem here. v4 of a test not use pg_sleep() and fails with regular amcheck failure. Reverting d9d0762 fixes the test. Unless I executethe test for 1 million transactions, then it fail even with a revert... I suspect that v3 and v4 triggers different problems. Best regards, Andrey Borodin.
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-28 19:46:40 +0500, Andrey Borodin wrote: > > On 28 May 2022, at 12:02, Andres Freund <andres@anarazel.de> wrote: > > > > I think you basically need to force some, but not all, of the modifying > > transactions to be open for a bit longer, so that it's more likely that > > there's a chance to prune vs CIC waiting. Might also be helpful to update rows > > multiple times within an xact. > Now I've got 2 different versions of test for master branch. Both fail in 50% of cases on my machine. Both take approximately4 seconds of wallclock time and 1 second of CPU time. > > v3: wait with a fraction of waiting transactions. > This test fails with > 0 postgres 0x00000001049ec508 ExceptionalCondition + 124 > 1 postgres 0x00000001045ea284 heap_page_prune + 2992 > 2 postgres 0x00000001045e9670 heap_page_prune_opt + 424 > 3 postgres 0x00000001045e25c0 heapam_index_fetch_tuple + 140 > 4 postgres 0x0000000100272d60 index_fetch_heap + 104 > 5 postgres 0x0000000100272e18 index_getnext_slot + 88 > 6 postgres 0x00000001003bbf4c check_exclusion_or_unique_constraint + 440 > 7 postgres 0x00000001003bc360 ExecCheckIndexConstraints + 232 > 8 postgres 0x00000001003ea30c ExecInsert + 1024 > 9 postgres 0x00000001003e90cc ExecModifyTable + 1536 > 10 postgres 0x00000001003bd0cc standard_ExecutorRun + 268 > 11 postgres 0x0000000100542d94 ProcessQuery + 160 > 12 postgres 0x00000001005423c8 PortalRunMulti + 396 > 13 postgres 0x0000000100541cfc PortalRun + 476 > > And reverting d9d0762 does not fix the issue. I'm not sure if I'm observing some other problem here. I've not been able to reproduce this issue. Even after increasing the number of clients and transactions, and running the test a number of times. With d9d0762 reverted, the problem doesn't happen anymore for me. Any chance you hit this with d9d0762 reverted? It's easy to e.g. revert and run the tests without recreating the temp-install, to reduce cycle times. Was there anything else running on the system? c98763bf51bf also needs to reverted, of course. > v4 of a test not use pg_sleep() and fails with regular amcheck failure. Reverting d9d0762 fixes the test. Unless I executethe test for 1 million transactions, then it fail even with a revert... What you're saying is that the revert might not actually fix the problem, or that you're hitting a separate bug. Correct? If if it fails after 1 mio xact, what are the symptoms? It might be worth trying to repro the problem in 13 or such. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 29 May 2022, at 00:34, Andres Freund <andres@anarazel.de> wrote: > > c98763bf51bf also needs to > reverted, of course. Reverting c98763bf51bf fixed both tests. I did not understand that we are going to revert it too. Thanks! Best regards, Andrey Borodin.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Sat, May 28, 2022 at 12:34:13PM -0700, Andres Freund wrote: > I've not been able to reproduce this issue. Even after increasing the number > of clients and transactions, and running the test a number of times. With > d9d0762 reverted, the problem doesn't happen anymore for me. Neither was I able to reproduce this issue after a couple of hours making my laptop hotter, using either the TAP tests or some equivalent manual tests, and the revert of d9d0762 helps. > Any chance you hit this with d9d0762 reverted? It's easy to e.g. revert and > run the tests without recreating the temp-install, to reduce cycle times. > > Was there anything else running on the system? c98763bf51bf also needs to > reverted, of course. Yeah, I agree that we'd better revert c98763bf for the time being. And f9900df on top of that? I was trying to think of ways to get an isolation test out of that, but that proves to be sort of tricky as we need to manipulate the HOT chains after the validation phase has begun with the snapshot from the build phase. It is easy to block before the validation transaction starts, like in WaitForLockersMultiple() beforehand, though. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-30, Michael Paquier wrote: > > Was there anything else running on the system? c98763bf51bf also needs to > > reverted, of course. > > Yeah, I agree that we'd better revert c98763bf for the time being. > And f9900df on top of that? Yeah, both commits need to be reverted, since the latter depends critically on the former. I'll get the revert pushed soon. > I was trying to think of ways to get an isolation test out of that, > but that proves to be sort of tricky as we need to manipulate the HOT > chains after the validation phase has begun with the snapshot from the > build phase. It is easy to block before the validation transaction > starts, like in WaitForLockersMultiple() beforehand, though. Hmm. I suppose for the next try of implementing a feature like this, we'll definitely want to incorporate some tests that can catch problems of this sort. But I don't think we need to come up with something right now. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-29 09:15:15 +0500, Andrey Borodin wrote: > > On 29 May 2022, at 00:34, Andres Freund <andres@anarazel.de> wrote: > > > > c98763bf51bf also needs to > > reverted, of course. > Reverting c98763bf51bf fixed both tests. I did not understand that we are going to revert it too. We're going to need to revert both, but I don't understand why c98763bf51bf would affect this your tests. Afaics it should just affect CIC, not RIC. Could you make sure you can reproduce the issue without c98763bf51bf reverted? Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
On 2022-05-30 15:54:08 +0900, Michael Paquier wrote: > Yeah, I agree that we'd better revert c98763bf for the time being. > And f9900df on top of that? Well, f9900df needs to be reverted, because it caused the problem at hand, and is ontop of c98763bf... > I was trying to think of ways to get an isolation test out of that, > but that proves to be sort of tricky as we need to manipulate the HOT > chains after the validation phase has begun with the snapshot from the > build phase. It is easy to block before the validation transaction > starts, like in WaitForLockersMultiple() beforehand, though. I think it's ok if we have a heuristic test for this kind of thing. It sometimes can even be good, because it means we'll get different schedulings over time, hitting "unknown" bugs. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Mon, May 30, 2022 at 12:10 PM Andres Freund <andres@anarazel.de> wrote: > > I was trying to think of ways to get an isolation test out of that, > > but that proves to be sort of tricky as we need to manipulate the HOT > > chains after the validation phase has begun with the snapshot from the > > build phase. It is easy to block before the validation transaction > > starts, like in WaitForLockersMultiple() beforehand, though. > > I think it's ok if we have a heuristic test for this kind of thing. It > sometimes can even be good, because it means we'll get different schedulings > over time, hitting "unknown" bugs. As long as it has a reasonably good chance of failing with the bug, it's still a valid test IMV. As you say, there may be some value in not over-specifying what the problem is -- that could actually bring unknown issues to light, especially if the test uses amcheck. Clearly there is never any strict guarantee that writing a test will avoid even one bug in the future. You're always working off some intuition about what related problems might happen in the future, weighed against the costs (mostly the added test cycles). And so adding an imprecise test really isn't very different to adding a precise test that reliably catches the bug that the test was written to catch. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-30, Andres Freund wrote: > On 2022-05-30 15:54:08 +0900, Michael Paquier wrote: > > Yeah, I agree that we'd better revert c98763bf for the time being. > > And f9900df on top of that? > > Well, f9900df needs to be reverted, because it caused the problem at hand, and > is ontop of c98763bf... I'm not sure I understand what you're saying here. I understood that both RIC and CIC are affected, so the reversal is of the following commits: c98763bf51bf Avoid spurious waits in concurrent indexing f9900df5f949 Avoid spurious wait in concurrent reindex d9d076222f5b VACUUM: ignore indexing operations with CONCURRENTLY -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Use it up, wear it out, make it do, or do without"
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-30, Alvaro Herrera wrote: > I'm not sure I understand what you're saying here. I understood that > both RIC and CIC are affected, so the reversal is of the following > commits: > > c98763bf51bf Avoid spurious waits in concurrent indexing > f9900df5f949 Avoid spurious wait in concurrent reindex > d9d076222f5b VACUUM: ignore indexing operations with CONCURRENTLY This is the full reversal in branch master. I'm unable to reproduce a problem with the v3 patch Andrey last provided, however. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Tiene valor aquel que admite que es un cobarde" (Fernandel)
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
On 2022-05-30 22:40:39 +0200, Alvaro Herrera wrote: > On 2022-May-30, Andres Freund wrote: > > > On 2022-05-30 15:54:08 +0900, Michael Paquier wrote: > > > Yeah, I agree that we'd better revert c98763bf for the time being. > > > And f9900df on top of that? > > > > Well, f9900df needs to be reverted, because it caused the problem at hand, and > > is ontop of c98763bf... > > I'm not sure I understand what you're saying here. I understood that > both RIC and CIC are affected, so the reversal is of the following > commits: I was just a bit confused about Michael's phrasing of reverting f9900df "on top of" c98763bf. Not important... > c98763bf51bf Avoid spurious waits in concurrent indexing > f9900df5f949 Avoid spurious wait in concurrent reindex > d9d076222f5b VACUUM: ignore indexing operations with CONCURRENTLY That looks right, yes. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Mon, May 30, 2022 at 12:10:04PM -0700, Andres Freund wrote: > I think it's ok if we have a heuristic test for this kind of thing. It > sometimes can even be good, because it means we'll get different schedulings > over time, hitting "unknown" bugs. Yeah, the tricky part is how to parametize that to be cheap, still useful. Having a larger number of attributes makes the particular problem of this thread easier to hit because it enlarges the validation window in the concurrent build, and generating such dummy data with a simple schema should be quick enough, but I would not accept in the tree a test that takes ~5s to run without a rather good hit rate. For this particular issue, I would say that something able to fail up to 20%~25% of the time for a short run-time would be quite decent, actually, even if that sounds low in a single run because the odds of detecting one failure increase a "lot" after a few repeated runs. On my own laptop, the original test of the reporter takes up to 3~4s to reproduce the issue, for example, all the time, so that's good :) -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Mon, May 30, 2022 at 10:43:41PM +0200, Alvaro Herrera wrote: > This is the full reversal in branch master. This looks fine to me, thanks. I can see that you have kept the reference to ComputeXidHorizons for vacuum_defer_cleanup_age in guc.c. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> 31 мая 2022 г., в 00:08, Andres Freund <andres@anarazel.de> написал(а): > > We're going to need to revert both, but I don't understand why c98763bf51bf > would affect this your tests. Afaics it should just affect CIC, not RIC. Could > you make sure you can reproduce the issue without c98763bf51bf reverted? Here's the revision with tests v3 and revert of d9d0762 [0]. In CI you can see the tests pass on Linux and FreeBSD, but failon MacOS [1]. Best regards, Andrey Borodin. [0] https://github.com/x4m/postgres_g/commits/test [1] https://cirrus-ci.com/task/4580203115577344?logs=test_world#L1566
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Michael Paquier wrote: > On Mon, May 30, 2022 at 10:43:41PM +0200, Alvaro Herrera wrote: > > This is the full reversal in branch master. > > This looks fine to me, thanks. I can see that you have kept the > reference to ComputeXidHorizons for vacuum_defer_cleanup_age in > guc.c. Yeah, there was a typo fix there that I also kept as well as one newline change. Basically I scanned the reversal patch manually and removed the changes I thought were better kept. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El miedo atento y previsor es la madre de la seguridad" (E. Burke)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Michael Paquier wrote: > Yeah, the tricky part is how to parametize that to be cheap, still > useful. Having a larger number of attributes makes the particular > problem of this thread easier to hit because it enlarges the > validation window in the concurrent build, Hmm, I wonder if it's possible to use an index ON tab (pg_sleep(100ms)) or something like that (perhaps even use an expresion involving an advisory lock acquisition, which is being held by the other session), to widen the window. The stop events patch that A. Korotkov posted [0] would likely be very useful here. [0] https://postgr.es/m/CAPpHfdtSEOHX8dSk9Qp+Z++i4BGQoffKip6JDWngEA+g7Z-XmQ@mail.gmail.com > On my own laptop, the original test of the reporter takes up to 3~4s > to reproduce the issue, for example, all the time, so that's good :) Sadly, my laptop is somehow broken and I haven't reproduced the failure even a single time, and my other regular computers are at high seas (I hope). -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Michael Paquier
Date:
On Tue, May 31, 2022 at 11:19:01AM +0200, Alvaro Herrera wrote: > Hmm, I wonder if it's possible to use an index ON tab (pg_sleep(100ms)) > or something like that (perhaps even use an expresion involving an > advisory lock acquisition, which is being held by the other session), to > widen the window. I have not considered the use of expressions to enforce a lock at this stage. I'll give it a shot, even if it comes to the abuse of an immutable function with volatile internals :) > The stop events patch that A. Korotkov posted [0] would likely be very > useful here. > > [0] https://postgr.es/m/CAPpHfdtSEOHX8dSk9Qp+Z++i4BGQoffKip6JDWngEA+g7Z-XmQ@mail.gmail.com Thanks for the reference. I've wanted to resurrect this stuff, actually, after dealing with 6dced63 to be able to have a stop point in the middle of a restart point. -- Michael
Attachment
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-26, Christophe Pettus wrote: > Unless we are 100% sure that amcheck will never return a false > negative in this situation, we may have to recommend reindexing > everything. I am not aware of any reason why amcheck would return a false negative. Is there any evidence behind this assertion? If there is none, the recommendation should be to use amcheck on all btree indexes and reindex those that have the problem; and reindex all indexes of other AMs that could have been reindexed or created concurrently in 14beta1 or later (implying: an index that was created in 13 and pg_upgraded but not touched afterwards is not at risk). -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Alvaro Herrera wrote: > If there is none, the recommendation should be to use amcheck on all > btree indexes and reindex those that have the problem; and reindex all > indexes of other AMs that could have been reindexed or created > concurrently in 14beta1 or later (implying: an index that was created in > 13 and pg_upgraded but not touched afterwards is not at risk). Another possibility for very large indexes may be to disable all types of index scans, then apply no-op UPDATEs to the unindexed rows until the migrate to some other heap block, then vacuum. After that, amcheck should issue a clean report. This is much less intrusive than reindexing possibly several TB of data. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Tue, May 31, 2022 at 7:43 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > I am not aware of any reason why amcheck would return a false negative. > Is there any evidence behind this assertion? With heapallindexed verification, there is a theoretical risk of them from the Bloom filter itself, due to hash collisions. But the chance of that happening is rather low. And there is a random seed, so absent an extreme shortage of memory the collision shouldn't recur on a second or a third attempt at verification. Or with a second or third missing tuple in any single attempt at verifying a corrupt index. > If there is none, the recommendation should be to use amcheck on all > btree indexes and reindex those that have the problem; and reindex all > indexes of other AMs that could have been reindexed or created > concurrently in 14beta1 or later (implying: an index that was created in > 13 and pg_upgraded but not touched afterwards is not at risk). I'm in favor of this. We agonized about these false negatives back in 2017, when the feature went in, but I think that they're a negligible issue compared to everything else. It just doesn't need to be discussed in the release notes. The large majority of people that do this will find nothing at all, because there is no corruption. Lots of people that already use amcheck probably don't use heapallindexed verification at all. For one thing it's not the default with pg_amcheck. It's also significantly more expensive. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-28, Andrey Borodin wrote: > And reverting d9d0762 does not fix the issue. I'm not sure if I'm observing some other problem here. > > v4 of a test not use pg_sleep() and fails with regular amcheck > failure. Reverting d9d0762 fixes the test. Unless I execute the test > for 1 million transactions, then it fail even with a revert... > > I suspect that v3 and v4 triggers different problems. Hmm, the only failure I see with v4 is a deadlock of this sort: 2022-05-31 17:26:13.400 CEST [130375] 004_rc.pl ERROR: deadlock detected 2022-05-31 17:26:13.400 CEST [130375] 004_rc.pl DETAIL: Process 130375 waits for ShareLock on transaction 36108; blockedby process 130372. Process 130372 waits for ShareLock on transaction 36107; blocked by process 130375. Process 130375: INSERT INTO tbl VALUES(random()*1000,0,0,0,now()) on conflict(i) do update set updated_at = now(); Process 130372: INSERT INTO tbl VALUES(random()*1000,0,0,0,now()) on conflict(i) do update set updated_at = now(); which is of course not very interesting. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ <Schwern> It does it in a really, really complicated way <crab> why does it need to be complicated? <Schwern> Because it's MakeMaker.
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andrey Borodin
Date:
> On 31 May 2022, at 20:30, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-May-28, Andrey Borodin wrote: > >> And reverting d9d0762 does not fix the issue. I'm not sure if I'm observing some other problem here. >> >> v4 of a test not use pg_sleep() and fails with regular amcheck >> failure. Reverting d9d0762 fixes the test. Unless I execute the test >> for 1 million transactions, then it fail even with a revert... >> >> I suspect that v3 and v4 triggers different problems. > > Hmm, the only failure I see with v4 is a deadlock of this sort: As seen in CI with reverting only d9d0762: v3 fails only on MacOS [0,1] v4 fails on Linux and FreeBSD [2,3] So it's kind of system-dependent... > On 31 May 2022, at 19:43, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > (implying: an index that was created in > 13 and pg_upgraded but not touched afterwards is not at risk). 13.5+ to be accurate. Other 13 releases contained CIC\RC bug fixed by fdd965d0. Best regards, Andrey Borodin. [0] https://github.com/x4m/postgres_g/runs/6665396918 [1] https://cirrus-ci.com/task/4580203115577344?logs=test_world#L1571 [2] https://github.com/x4m/postgres_g/runs/6675153300 [3] https://cirrus-ci.com/task/6540481319403520?logs=test_world#L842
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-30, Andres Freund wrote: > On 2022-05-30 22:40:39 +0200, Alvaro Herrera wrote: > > c98763bf51bf Avoid spurious waits in concurrent indexing > > f9900df5f949 Avoid spurious wait in concurrent reindex > > d9d076222f5b VACUUM: ignore indexing operations with CONCURRENTLY > > That looks right, yes. Sorry, after going through these commits again, this is a misunderstanding on my part. The first two commits need not be reverted; only the third one does, which is the one that changes the way the system-wide Xmin is determined (the commit message says it affects VACUUM, but naturally it affects HOT pruning as well.) The point of the first two changes is to remove snapshot waits in CIC and RC. Before those commits, each CIC would wait for all other CICs and RCs in the system. AFAICS this is unrelated to the HOT-pruning bug. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Para tener más hay que desear menos"
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Andrey Borodin wrote: > As seen in CI with reverting only d9d0762: > v3 fails only on MacOS [0,1] > v4 fails on Linux and FreeBSD [2,3] > [0] https://github.com/x4m/postgres_g/runs/6665396918 > [1] https://cirrus-ci.com/task/4580203115577344?logs=test_world#L1571 > [2] https://github.com/x4m/postgres_g/runs/6675153300 > [3] https://cirrus-ci.com/task/6540481319403520?logs=test_world#L842 Hmm ... Exactly what code is being tested here? I see in one of these backtraces that there is a reference to a function called heap_prune_from_root, but I can't find that in the Postgres sources. At this point, I'm clear that d9d076222f5b needs to be reverted because of the interactions with HOT prune, but it's not at all clear to me what is the relationship of the other two commits to bugs. I am going to revert that one now, so that we can continue to diagnose any lingering problem that may be related to the other two commits -- but perhaps it is something else. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Tue, May 31, 2022 at 11:42 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Hmm ... Exactly what code is being tested here? I see in one of these > backtraces that there is a reference to a function called > heap_prune_from_root, but I can't find that in the Postgres sources. This includes a patch from me, to make pruning more robust. It could technically be a bug in my patch, though a bug revealed by my patch seems quite possible too. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Peter Geoghegan wrote: > On Tue, May 31, 2022 at 11:42 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Hmm ... Exactly what code is being tested here? I see in one of these > > backtraces that there is a reference to a function called > > heap_prune_from_root, but I can't find that in the Postgres sources. > > This includes a patch from me, to make pruning more robust. It could > technically be a bug in my patch, though a bug revealed by my patch > seems quite possible too. Ah, I see. Well, the explanation that depends on Xmin doesn't apply to this case. I'm not saying there are no bugs in the other two commits, just that such bug(s) have a different explanation, so it makes sense to revert separately. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Tue, May 31, 2022 at 11:54 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > On 2022-May-31, Peter Geoghegan wrote: > > This includes a patch from me, to make pruning more robust. It could > > technically be a bug in my patch, though a bug revealed by my patch > > seems quite possible too. > > Ah, I see. Well, the explanation that depends on Xmin doesn't apply to > this case. I'm not saying there are no bugs in the other two commits, > just that such bug(s) have a different explanation, so it makes sense to > revert separately. The assertion failure takes place in heapam_index_build_range_scan(), at the point that it is called by amcheck itself -- during an opportunistic prune that occurs in passing. This is the specific assertion that fails: https://github.com/x4m/postgres_g/blob/09d76bdc2528b476ef2f04c793061dcbf23372b5/src/backend/access/heap/pruneheap.c#L762 Another bug in HEAD definitely seems possible here, but it's hard to know without careful testing. The failure that we see is limited to the "macOS - Monterey" run, which only fails on this new test. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-31 12:09:04 -0700, Peter Geoghegan wrote: > On Tue, May 31, 2022 at 11:54 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-May-31, Peter Geoghegan wrote: > > > This includes a patch from me, to make pruning more robust. It could > > > technically be a bug in my patch, though a bug revealed by my patch > > > seems quite possible too. > > > > Ah, I see. Well, the explanation that depends on Xmin doesn't apply to > > this case. I'm not saying there are no bugs in the other two commits, > > just that such bug(s) have a different explanation, so it makes sense to > > revert separately. > > The assertion failure takes place in heapam_index_build_range_scan(), > at the point that it is called by amcheck itself -- during an > opportunistic prune that occurs in passing. This is the specific > assertion that fails: > > https://github.com/x4m/postgres_g/blob/09d76bdc2528b476ef2f04c793061dcbf23372b5/src/backend/access/heap/pruneheap.c#L762 I don't think that assertion is correct. Consider transactions aborting concurrently with heap pruning. You could have done a HTSV for one chain element, a concurrent abort happened, then you did the HTSV for another chain element. If the HTSVs were not in the order of the HOT chain you could see HEAPTUPLE_DEAD for an earlier chain element, while seeing HEAPTUPLE_INSERT_IN_PROGRESS in a later one. There's several other scenarios with subtransaction aborts as well, I think. Greetings, Andres Freund
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Peter Geoghegan wrote: > The assertion failure takes place in heapam_index_build_range_scan(), > at the point that it is called by amcheck itself -- during an > opportunistic prune that occurs in passing. This is the specific > assertion that fails: > > https://github.com/x4m/postgres_g/blob/09d76bdc2528b476ef2f04c793061dcbf23372b5/src/backend/access/heap/pruneheap.c#L762 > > Another bug in HEAD definitely seems possible here, but it's hard to > know without careful testing. The failure that we see is limited to > the "macOS - Monterey" run, which only fails on this new test. Actually, it fails on Linux, FreeBSD and macOS -- it's only that one particular version of the test fails in macOS and the other fails in the other two systems. But both of those failures occur in this assertion. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Peter Geoghegan
Date:
On Tue, May 31, 2022 at 12:19 PM Andres Freund <andres@anarazel.de> wrote: > I don't think that assertion is correct. > > Consider transactions aborting concurrently with heap pruning. You could have > done a HTSV for one chain element, a concurrent abort happened, then you did > the HTSV for another chain element. If the HTSVs were not in the order of the > HOT chain you could see HEAPTUPLE_DEAD for an earlier chain element, while > seeing HEAPTUPLE_INSERT_IN_PROGRESS in a later one. There's several other > scenarios with subtransaction aborts as well, I think. I think that it probably was correct before I rebased the patch on top of your bugfix commit 18b87b201f. The original version would have actually called HTSV directly, at the point that it accessed each tuple from a HOT chain. If nothing else this suggests that the patch should be clear on this point about not calling HTSV in HOT chain order. Offhand I think that it probably would still work if it was limited to HEAPTUPLE_LIVE (no more asserting in the HEAPTUPLE_INSERT_IN_PROGRESS case). Not sure if that's worth it. A topic for another time. -- Peter Geoghegan
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Alvaro Herrera
Date:
On 2022-May-31, Alvaro Herrera wrote: > At this point, I'm clear that d9d076222f5b needs to be reverted because > of the interactions with HOT prune, but it's not at all clear to me what > is the relationship of the other two commits to bugs. I am going to > revert that one now, so that we can continue to diagnose any lingering > problem that may be related to the other two commits -- but perhaps it > is something else. Reverted. (This is the same commit that Michaël found to blame at the start of the thread.) -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: BUG #17485: Records missing from Primary Key index when doing REINDEX INDEX CONCURRENTLY
From
Andres Freund
Date:
Hi, On 2022-05-31 16:43:00 +0200, Alvaro Herrera wrote: > On 2022-May-26, Christophe Pettus wrote: > > > Unless we are 100% sure that amcheck will never return a false > > negative in this situation, we may have to recommend reindexing > > everything. > > I am not aware of any reason why amcheck would return a false negative. > Is there any evidence behind this assertion? I think it's very likely to be possible to have false negatives: 1) It's pretty obvious that you could have constraint violations that went unnoticed due to the index corruption, were the missing heap-tuple since has been removed. Consider e.g. a foreign key cascading deletion that missed the tuple due to the missing index entry - if the wrongly-not-deleted tuple is subsequently updated, without changing the fkey relevant columns, everything would look fine from amcheck's view now. 2) It looks like we may not just end up with too few index entries, but can also end up with multiple index entries for a HOT chain. I don't think amcheck can detect this right now. 3) I think it's possible to end up with chain bogus root pointers being used when inserting index entries due to the bug. Those indexes will pass heapallindexed, but the index entries will point to bogus tuples. Greetings, Andres Freund