Thread: Hot Standby: First integrated patch
First integrated patch for Hot Standby, allowing queries to be executed
while in recovery mode.

The patch tests successfully with the enclosed files:
* primary_setup_test.sql - run it on primary node
* standby_allowed.sql - run on standby - should all succeed
* standby_disallowed.sql - run on standby - should all fail
plus other manual testing.

This is still WIP - it's good enough to release for comments, though I am
not yet confident enough to claim it bug free.

What this doesn't do YET:
* cope fully with subxid cache overflows (some parts still to add)
* cope with prepared transactions on master
* work correctly when running queries AND replaying WAL
* work correctly with regard to AccessExclusiveLocks, which should
  prevent access to tables

These last four points are what I'm working on over the next two weeks,
plus any other holes people point out along the way. I have worked out
designs for most of these aspects and will discuss them on -hackers,
though most design notes are in the Wiki. I'm still looking into
prepared transactions.

Comments appreciated.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Attachment
On Fri, 2008-10-17 at 15:38 +0100, Simon Riggs wrote:
> First integrated patch for Hot Standby, allowing queries to be executed
> while in recovery mode.

Patch with --context=10 to get around apply problems reported by Merlin.
Thanks.

Some additional info on categorisation of changes to give a better
overview of the change footprint:

Major changes for Hot Standby
 backend/access/transam/xact.c       | 679 ++++++++++++++++++!!!!!!!!!
 backend/storage/ipc/procarray.c     | 787 +++++++++++++++++++++++++!!!
 backend/storage/lmgr/proc.c         | 107 ++++
 backend/postmaster/postmaster.c     |  75 ++
 backend/utils/time/tqual.c          |  29 !

Minor changes for Hot Standby
 backend/utils/init/postinit.c       |   8
 backend/access/transam/multixact.c  |  14
 backend/access/transam/slru.c       |  16
 backend/access/transam/twophase.c   |   2

Required changes for bgwriter in recovery mode
 backend/access/transam/xlog.c       | 670 +++++++++++++++++--!!!!!!!!!!
 backend/postmaster/bgwriter.c       | 206 ++++++++-

Minor utility changes
 bin/pg_controldata/pg_controldata.c |   3
 bin/pg_resetxlog/pg_resetxlog.c     |   2

Changed header files for above
 include/access/xact.h               |  27 +
 include/access/xlog.h               |  54 +!
 include/access/xlog_internal.h      |   6
 include/catalog/pg_control.h        |   6
 include/postmaster/bgwriter.h       |   6
 include/storage/pmsignal.h          |   1
 include/storage/proc.h              |   4
 include/storage/procarray.h         |  17
 include/utils/snapshot.h            |  67 ++!

Prevent operations that would fail in recovery mode
 backend/access/heap/pruneheap.c     |   8
 backend/commands/discard.c          |   1
 backend/commands/lockcmds.c         |  12
 backend/commands/sequence.c         |   2
 backend/storage/lmgr/lock.c         |   9
 backend/tcop/utility.c              |  21
 backend/utils/adt/txid.c            |   6
 include/miscadmin.h                 |   3

Additional/Changed Comments only
 backend/access/transam/clog.c       |   3
 backend/access/transam/subtrans.c   |   3
 backend/storage/buffer/README       |   9

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Attachment
On Fri, Oct 17, 2008 at 10:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> First integrated patch for Hot Standby, allowing queries to be executed
> while in recovery mode.
>
> The patch tests successfully with the enclosed files:
> * primary_setup_test.sql - run it on primary node
> * standby_allowed.sql - run on standby - should all succeed
> * standby_disallowed.sql - run on standby - should all fail
> plus other manual testing.
>
> This is still WIP - it's good enough to release for comments, though I am
> not yet confident enough to claim it bug free.
>
> What this doesn't do YET:
> * cope fully with subxid cache overflows (some parts still to add)
> * cope with prepared transactions on master
> * work correctly when running queries AND replaying WAL
> * work correctly with regard to AccessExclusiveLocks, which should
> prevent access to tables
>
> These last four points are what I'm working on over the next two weeks,
> plus any other holes people point out along the way. I have worked out
> designs for most of these aspects and will discuss them on -hackers,
> though most design notes are in the Wiki. I'm still looking into
> prepared transactions.
>
> Comments appreciated.

It appears to be working, at least in some fashion. The supplied
tests all pass.

At first glance it seems like I have to force changes to the standby
with pg_switch_xlog().

hmm.

This probably isn't right:
postgres=# \d

No relations found.
postgres=# select count(*) from foo;
  count
---------
 1000000
(1 row)

I created a table, ran pg_switch_xlog, and queried several times. I then
dropped the table and ran pg_switch_xlog again: the table is 'gone', but
the query still returns data.

exit/enter session, now it's gone. Sometimes I have to exit/enter
session to get an up to date standby. These are just first
impressions...

merlin
On Fri, 2008-10-17 at 16:47 -0400, Merlin Moncure wrote:
> On Fri, Oct 17, 2008 at 10:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > First integrated patch for Hot Standby, allowing queries to be executed
> > while in recovery mode.
> >
> > The patch tests successfully with the enclosed files:
> > * primary_setup_test.sql - run it on primary node
> > * standby_allowed.sql - run on standby - should all succeed
> > * standby_disallowed.sql - run on standby - should all fail
> > plus other manual testing.
> >
> > This is still WIP - it's good enough to release for comments, though I am
> > not yet confident enough to claim it bug free.
> >
> > What this doesn't do YET:
> > * cope fully with subxid cache overflows (some parts still to add)
> > * cope with prepared transactions on master
> > * work correctly when running queries AND replaying WAL
> > * work correctly with regard to AccessExclusiveLocks, which should
> > prevent access to tables
> >
> > These last four points are what I'm working on over the next two weeks,
> > plus any other holes people point out along the way. I have worked out
> > designs for most of these aspects and will discuss them on -hackers,
> > though most design notes are in the Wiki. I'm still looking into
> > prepared transactions.
> >
> > Comments appreciated.
>
> It appears to be working, at least in some fashion. The supplied
> tests all pass.

Cool

Thanks for testing so far.

> At first glance it seems like I have to force changes to the standby
> with pg_switch_xlog().
>
> hmm.

You'll have to explain some more. Normally files don't get sent until
they are full, so yes, you would need to do a pg_switch_xlog().

This is not "streaming replication". Others are working on that.

> This probably isn't right:
> postgres=# \d
>
> No relations found.
> postgres=# select count(*) from foo;
>   count
> ---------
>  1000000
> (1 row)
>
> I created a table, pg_switch_xlog, query several times,i dropped a
> table, pg_switch_xlog, table is 'gone', but still returns data
>
> exit/enter session, now its gone. Sometimes I have to exit/enter
> session to get an up to date standby. These are just first
> impressions...

Replaying and queries don't mix yet, so that is expected. I'm working on
this in phases. This patch is phase 1 - it is not the "finished patch".

Phase 2: working on correct block locking to allow concurrent DML
changes to occur while we run queries.

Phase 3: working on correct relation locking/relcache to allow
concurrent DDL changes to occur while we run queries.

I have designs of the above and expect to complete in the next two weeks.

The reason for the above behaviour is that DDL changes need to fire
relcache invalidation messages so that the query backend sees the
change. The reason the table is still there is because the files haven't
been dropped yet. So everything you have seen is expected, by me.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
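The behaviour discussed in this exchange (\d shows nothing, yet the open session can still count rows in foo) can be pictured with a toy model. The Python sketch below is only an illustration under stated assumptions, not PostgreSQL code: the Standby and Session classes, the per-session relcache dict, and the replay_create/replay_drop helpers are all hypothetical names. It assumes that replaying a DROP removes the catalog entry but neither sends an invalidation message nor removes the data file, which is how the explanation above reads.

```python
class Standby:
    """Toy standby: a shared catalog plus data files on disk (hypothetical model)."""

    def __init__(self):
        self.catalog = {}  # relation name -> file id
        self.files = {}    # file id -> row count

    def replay_create(self, name, file_id, rows):
        self.catalog[name] = file_id
        self.files[file_id] = rows

    def replay_drop(self, name):
        # Assumption: the catalog entry goes away, but the data file is
        # not removed yet and no invalidation message reaches sessions.
        del self.catalog[name]


class Session:
    """Toy query backend with its own relation cache."""

    def __init__(self, standby):
        self.standby = standby
        self.relcache = {}  # relation name -> file id, cached per session

    def list_relations(self):
        # Reads the shared catalog directly, loosely like \d.
        return sorted(self.standby.catalog)

    def count(self, name):
        # A cached entry is trusted until something invalidates it.
        if name not in self.relcache:
            self.relcache[name] = self.standby.catalog[name]
        return self.standby.files[self.relcache[name]]


standby = Standby()
standby.replay_create("foo", file_id=1, rows=1000000)

session = Session(standby)
before = session.count("foo")        # caches the entry for foo
standby.replay_drop("foo")           # DROP TABLE replayed on the standby
listed = session.list_relations()    # catalog no longer lists foo
stale = session.count("foo")         # cached entry still resolves

try:
    Session(standby).count("foo")    # a fresh session has no cached entry
    fresh_error = False
except KeyError:
    fresh_error = True

print(before, listed, stale, fresh_error)  # 1000000 [] 1000000 True
```

In this model the stale result disappears only when a new session starts with an empty cache, which matches the "exit/enter session" observation; an invalidation message would amount to removing the cached entry in every open session.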
On Sat, Oct 18, 2008 at 4:11 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Fri, 2008-10-17 at 16:47 -0400, Merlin Moncure wrote:
>> On Fri, Oct 17, 2008 at 10:38 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> >
>> > First integrated patch for Hot Standby, allowing queries to be executed
>> > while in recovery mode.
>> >
>> > The patch tests successfully with the enclosed files:
>> > * primary_setup_test.sql - run it on primary node
>> > * standby_allowed.sql - run on standby - should all succeed
>> > * standby_disallowed.sql - run on standby - should all fail
>> > plus other manual testing.
>> >
>> > This is still WIP - it's good enough to release for comments, though I am
>> > not yet confident enough to claim it bug free.
>> >
>> > What this doesn't do YET:
>> > * cope fully with subxid cache overflows (some parts still to add)
>> > * cope with prepared transactions on master
>> > * work correctly when running queries AND replaying WAL
>> > * work correctly with regard to AccessExclusiveLocks, which should
>> > prevent access to tables
>> >
>> > These last four points are what I'm working on over the next two weeks,
>> > plus any other holes people point out along the way. I have worked out
>> > designs for most of these aspects and will discuss them on -hackers,
>> > though most design notes are in the Wiki. I'm still looking into
>> > prepared transactions.
>> >
>> > Comments appreciated.
>>
>> It appears to be working, at least in some fashion. The supplied
>> tests all pass.
>
> Cool
>
> Thanks for testing so far.
>
>> At first glance it seems like I have to force changes to the standby
>> with pg_switch_xlog().
>>
>> hmm.
>
> You'll have to explain some more. Normally files don't get sent until
> they are full, so yes, you would need to do a pg_switch_xlog().
>
> This is not "streaming replication". Others are working on that.

Right... this was expected. Less the missing parts, things are working
very well.

merlin
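The point about files not being sent until they are full can be sketched as a toy model of file-based log shipping. This is a hedged illustration only, not PostgreSQL code: the Primary and Standby classes, the segment_size of four records, and the switch_segment method (standing in for pg_switch_xlog()) are assumptions made for the example.

```python
class Primary:
    """Toy primary that writes WAL records into fixed-size segments."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.current = []    # records in the segment still being filled
        self.archived = []   # completed segments, available to the standby

    def write(self, record):
        self.current.append(record)
        if len(self.current) == self.segment_size:
            self._archive()

    def switch_segment(self):
        # Stand-in for pg_switch_xlog(): close a partly filled segment early.
        if self.current:
            self._archive()

    def _archive(self):
        self.archived.append(list(self.current))
        self.current = []


class Standby:
    """Toy standby that replays only whole, archived segments."""

    def __init__(self):
        self.replayed = []
        self.segments_seen = 0

    def replay_from(self, primary):
        for segment in primary.archived[self.segments_seen:]:
            self.replayed.extend(segment)
        self.segments_seen = len(primary.archived)


primary = Primary(segment_size=4)
standby = Standby()

primary.write("CREATE TABLE foo")
primary.write("INSERT INTO foo")
standby.replay_from(primary)
print(standby.replayed)   # [] because the segment is not full yet

primary.switch_segment()
standby.replay_from(primary)
print(standby.replayed)   # ['CREATE TABLE foo', 'INSERT INTO foo']
```

On a lightly loaded primary the segment may take a long time to fill, which is why changes appear on the standby only after a forced switch in this model.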