Re: gist microvacuum doesn't appear to care about hot standby? - Mailing list pgsql-hackers
From | Alexander Korotkov |
---|---|
Subject | Re: gist microvacuum doesn't appear to care about hot standby? |
Date | |
Msg-id | CAPpHfdsKS0K8q1sJ-XyMrU=L+e6XSAOgS09NXp1bQDQts+qz+g@mail.gmail.com |
In response to | Re: gist microvacuum doesn't appear to care about hot standby? (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses | Re: gist microvacuum doesn't appear to care about hot standby? |
List | pgsql-hackers |
On Tue, Dec 18, 2018 at 2:04 AM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
> On Mon, Dec 17, 2018 at 3:35 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Mon, Dec 17, 2018 at 3:40 AM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > On Mon, Dec 17, 2018 at 1:25 AM Andres Freund <andres@anarazel.de> wrote:
> > > > On 2018-12-17 01:03:52 +0300, Alexander Korotkov wrote:
> > > > > Sorry for delay.  Attached patch implements conflict handling for gist
> > > > > microvacuum like btree and hash.  I'm going to push it if no
> > > > > objections.
> > > > >
> > > > > Note, that it implements new WAL record type.  So, new WAL can't be
> > > > > replayed on old minor release.  I'm not sure if we claim that it's
> > > > > usually possible.  Should we state something explicitly for this case?
> > > >
> > > > Please hold off committing for a bit. Adding new WAL records in a minor
> > > > release ought to be very well considered and a measure of last resort.
> > > >
> > > > Couldn't we determine the xid horizon on the primary, and reuse an
> > > > existing WAL record to trigger the conflict? Or something along those
> > > > lines?
> > >
> > > I thought about that, but decided it's better to mimic B-tree and hash
> > > behavior rather than invent new logic in a minor release.  But given
> > > that new WAL record in minor release is a substantial problem, that
> > > argument doesn't matter.
> > >
> > > Yes, it seems to be possible.  We can determine xid horizon on primary
> > > in the same way you proposed for B-tree and hash [1] and use
> > > XLOG_HEAP2_CLEANUP_INFO record to trigger the conflict.  Do you like
> > > me to make such patch for GiST based on your patch?
> >
> > Got another tricky idea.  Now, deleted offset numbers are written to
> > buffer data.  We can also append them to record data.  So, basing on
> > record length we can resolve conflicts when offsets are provided in
> > record data.  Unpatched version will just ignore extra record data
> > tail.  That would cost us some redundant bigger wal records, but solve
> > other problems.  Any thoughts?
>
> Please, find backpatch version of patch implementing this approach
> attached.  I found it more attractive than placing xid horizon
> calculation to primary.  Because xid horizon calculation on primary is
> substantially new behavior, which is unwanted for backpatching.  I've
> not yet tested this patch.
>
> I'm going to test this patch including WAL compatibility.  If
> everything will be OK, then commit.

I've managed to reproduce the problem and test my backpatch solution.

primary (patched)    standby 1 (patched)    standby 2 (unpatched)

drop table if exists test;
create table test (p point) with (fillfactor = 50, autovacuum_enabled = false);
insert into test (select point(i % 100, i / 100) from generate_series(0,9999) i);
vacuum test;
create index test_gist_idx on test using gist (p);
alter table test set (fillfactor = 100);

begin isolation level repeatable read;
select count(*) from test where p <@ box(point(0,0),point(99,99));
 count
-------
 10000
(1 row)

begin isolation level repeatable read;
select count(*) from test where p <@ box(point(0,0),point(99,99));
 count
-------
 10000
(1 row)

delete from test where p[0]::int % 10 = 0 and p[1]::int % 10 = 0;
set enable_seqscan = off;
set enable_bitmapscan = off;
set enable_indexonlyscan = off;
select count(*) from test where p <@ box(point(0,0),point(99,99));
insert into test (select point(i % 100, i / 100) from generate_series(0,9999) i);

select count(*) from test where p <@ box(point(0,0),point(99,99));
 count
-------
 10000
(1 row)

select count(*) from test where p <@ box(point(0,0),point(99,99));
 count
-------
  9961
(1 row)

select count(*) from test where p <@ box(point(0,0),point(99,99));
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

So, two standbys were reading the same WAL generated by the patched
primary.  The patched standby got a conflict: it gives the correct query
answer, then drops the transaction.  The unpatched standby replays the WAL
stream without conflict, so it gives the wrong query answer, as if it were
reading WAL from an unpatched master.  When experimenting with an unpatched
primary, both standbys give the wrong query answer without conflict.

Please find attached the two patches I'm going to commit: one for master
and one for backpatching.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
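
For readers who want to see the record-tail idea from the quoted discussion in miniature, here is a toy model in Python. It is only a sketch under stated assumptions and is not the PostgreSQL code: the byte layout, the function names and the deleter_xid_at mapping are all hypothetical, and xid wraparound is ignored. It shows a record whose fixed-size main data can be followed by an optional tail of deleted offsets, a patched replay that uses the record length to decide whether the tail is there, and an unpatched replay that reads only the part it knows about.

```python
import struct
from typing import Dict, List, Optional

# Hypothetical layout: a 16-bit item count, optionally followed by that many
# 16-bit offset numbers. The real WAL record format is different.
HEADER = struct.Struct("<H")
OFFSET = struct.Struct("<H")


def build_record(offsets: List[int], append_tail: bool) -> bytes:
    """Build the main data; a patched primary also appends the offsets."""
    data = HEADER.pack(len(offsets))
    if append_tail:
        data += b"".join(OFFSET.pack(o) for o in offsets)
    return data


def unpatched_item_count(record: bytes) -> int:
    """Unpatched replay: read the fixed-size part and ignore any tail."""
    return HEADER.unpack_from(record, 0)[0]


def offsets_from_tail(record: bytes) -> Optional[List[int]]:
    """Patched replay: use the tail only if the record is long enough."""
    (count,) = HEADER.unpack_from(record, 0)
    tail = record[HEADER.size:]
    if len(tail) < count * OFFSET.size:
        return None  # record written by an unpatched primary
    return [OFFSET.unpack_from(tail, i * OFFSET.size)[0] for i in range(count)]


def replay_patched(record: bytes,
                   deleter_xid_at: Dict[int, int],
                   snapshot_xmin: int) -> str:
    """Decide whether a standby snapshot conflicts with this record.

    deleter_xid_at is a hypothetical map from page offset to the xid that
    deleted the row the index item points to.
    """
    offsets = offsets_from_tail(record)
    if not offsets:
        return "no conflict check"
    latest_removed = max(deleter_xid_at[o] for o in offsets)
    # The snapshot may still need a removed row version if its xmin is not
    # newer than the latest removed xid.
    return "conflict" if snapshot_xmin <= latest_removed else "ok"
```

In this model a record built with append_tail=True is still readable by unpatched_item_count, which matches the observation above that the unpatched standby replays the same WAL without error but also without a conflict.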
Attachment