Thread: Streaming replication: sequences on slave seemingly ahead of sequences on master
Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Vincent de Phily
Date:
Hi list, we have two 9.1.2 servers on debian squeeze, and are setting up a simple streaming replication between the two. * wal_keep_segments is set high on the master * the slave's recovery.conf contains just standbay_mode=on and primary_conninfo=foo * we use a simple start_backup/rsync/stop_backup to create the base copy before starting the slave. It all seems to be working fine, except that when checking the data (selecting latest primary key and sequence value for all tables) on master and slave, some sequence ids are higher on the slave than on the master. I could understand if they were lower, but this is weird. * The slave's sequences can be anywhere between 1 and 50 ids ahead. * The actual table data is properly in sync. * We look at the slave before the master. * We ignore readings where pg_current_xlog_location() != pg_last_xlog_replay_location(). * It only happens on frequently-updated sequences. * During recovery, we have warnings of the form: 2012-05-04 10:32:08 CEST WARNING: xlog min recovery request 16A/2A03BDD0 is past current point 16A/1E72A880 2012-05-04 10:32:08 CEST CONTEXT: writing block 0 of relation base/35355/42224_vm xlog redo vacuum: rel 1663/1562168/1563037; blk 12122, lastBlockVacuumed 12070 2012-05-04 10:32:12 CEST WARNING: xlog min recovery request 16A/469F2120 is past current point 16A/1E9B6EB8 2012-05-04 10:32:12 CEST CONTEXT: writing block 0 of relation base/56308/57181_vm xlog redo vacuum: rel 1663/1562168/1563037; blk 21875, lastBlockVacuumed 21329 2012-05-04 10:32:17 CEST WARNING: xlog min recovery request 16A/22D497B8 is past current point 16A/1FF69258 * servers have near-identical hardware and software * monitoring via munin show at most 1-2 KB of replication lag * we retried the base backup twice So... * any likely mistake on our side ? * can it be fixed ? * is this harmless and to be ignored ? Thank you. -- Vincent de Phily
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Merlin Moncure
Date:
On Fri, May 4, 2012 at 8:55 AM, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: > Hi list, > > we have two 9.1.2 servers on debian squeeze, and are setting up a simple > streaming replication between the two. > > * wal_keep_segments is set high on the master > * the slave's recovery.conf contains just standbay_mode=on and > primary_conninfo=foo > * we use a simple start_backup/rsync/stop_backup to create the base copy > before starting the slave. > > > It all seems to be working fine, except that when checking the data (selecting > latest primary key and sequence value for all tables) on master and slave, > some sequence ids are higher on the slave than on the master. I could > understand if they were lower, but this is weird. > > * The slave's sequences can be anywhere between 1 and 50 ids ahead. how did you determine that exactly? how do you know the transactions are committing in sequence order? merlin
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Vincent de Phily
Date:
On Friday 04 May 2012 09:47:16 Merlin Moncure wrote: > On Fri, May 4, 2012 at 8:55 AM, Vincent de Phily > > <vincent.dephily@mobile-devices.fr> wrote: > > Hi list, > > > > we have two 9.1.2 servers on debian squeeze, and are setting up a simple > > streaming replication between the two. > > > > * wal_keep_segments is set high on the master > > * the slave's recovery.conf contains just standbay_mode=on and > > primary_conninfo=foo > > * we use a simple start_backup/rsync/stop_backup to create the base copy > > before starting the slave. > > > > > > It all seems to be working fine, except that when checking the data > > (selecting latest primary key and sequence value for all tables) on > > master and slave, some sequence ids are higher on the slave than on the > > master. I could understand if they were lower, but this is weird. > > > > * The slave's sequences can be anywhere between 1 and 50 ids ahead. > > how did you determine that exactly? Quick and dirty : SQL=$(psql -tA -h $MASTER $DB <<< "select E'select \''||table_name||E'\', '|| column_name||' from '||table_name||' order by '||column_name||' desc limit 1;' from information_schema.columns where table_schema='public' and ordinal_position=1 order by table_name;select E'select \''||sequence_name|| E'\', last_value from '||sequence_name||';' from information_schema.sequences where sequence_schema='public' order by sequence_name;") psql -tA -h $SLAVE $DB <<< "select pg_last_xlog_replay_location();$SQL" > $SLAVE.check psql -tA -h $MASTER $DB <<< "select pg_current_xlog_location();$SQL" > $MASTER.check if diff -u $MASTER.check $SLAVE.check; then cat $MASTER.check echo -e "\e[32msync ok\e[m" else echo -e "\e[31msync bad\e[m" fi > how do you know the transactions > are committing in sequence order? I dont, actually. But whichever order the transactions eventually commit in, I'd expect that order to be the same on the slave and the host ? And I wouldn't expect anything to finish on the slave before it finishes on the master ? -- Vincent de Phily Mobile Devices +33 (0) 142 119 325 +353 (0) 85 710 6320 Warning This message (and any associated files) is intended only for the use of its intended recipient and may contain information that is confidential, subject to copyright or constitutes a trade secret. If you are not the intended recipient you are hereby notified that any dissemination, copying or distribution of this message, or files associated with this message, is strictly prohibited. If you have received this message in error, please notify us immediately by replying to the message and deleting it from your computer. Any views or opinions presented are solely those of the author vincent.dephily@mobile-devices.fr and do not necessarily represent those of the company. Although the company has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments.
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Michael Nolan
Date:
This is due to how sequences are pre-allocated in blocks to sessions running on the master.
Since the slave is updated via the WALs, and not via 'nextval' function calls in queries, the sequences that are actually used will remain in sync with the master.
--
Mike Nolan
Since the slave is updated via the WALs, and not via 'nextval' function calls in queries, the sequences that are actually used will remain in sync with the master.
--
Mike Nolan
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Simon Riggs
Date:
On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: > It all seems to be working fine, except that when checking the data (selecting > latest primary key and sequence value for all tables) on master and slave, > some sequence ids are higher on the slave than on the master. I could > understand if they were lower, but this is weird. > > * The slave's sequences can be anywhere between 1 and 50 ids ahead. This is normal. The sequences are advanced in chunks of 100, so the master's value will be the nextval() while the value on standby will be the start of the next chunk, so as you say, slightly ahead of the master. The same thing would also happen in case of a crash. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Vincent de Phily
Date:
On Sunday 06 May 2012 10:29:17 Simon Riggs wrote: > On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: > > It all seems to be working fine, except that when checking the data > > (selecting latest primary key and sequence value for all tables) on > > master and slave, some sequence ids are higher on the slave than on the > > master. I could understand if they were lower, but this is weird. > > > > * The slave's sequences can be anywhere between 1 and 50 ids ahead. > > This is normal. The sequences are advanced in chunks of 100, so the > master's value will be the nextval() while the value on standby will > be the start of the next chunk, so as you say, slightly ahead of the > master. > > The same thing would also happen in case of a crash. Thanks for the explanation (Michael's too). Would be nice to see it added to the documentation (unless I just didn't find it ?), as it is quite surprising, and might lead to problems if people expect to be able to read sequence values from the slave. As a bonus question, I guess it would be the same if using synchroneous replication ? -- Vincent de Phily
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Simon Riggs
Date:
On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: > Would be nice to see it added to the documentation (unless I just didn't find > it ?), as it is quite surprising, and might lead to problems if people expect > to be able to read sequence values from the slave. If you think so, please submit a patch. That's how it works here. > As a bonus question, I guess it would be the same if using synchroneous > replication ? Yes -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Magnus Hagander
Date:
On Mon, May 7, 2012 at 10:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: > >> Would be nice to see it added to the documentation (unless I just didn't find >> it ?), as it is quite surprising, and might lead to problems if people expect >> to be able to read sequence values from the slave. > > If you think so, please submit a patch. That's how it works here. FWIW, I think this would be a reasonable thing to document, given that it violates the principle of least surprise for people who are not intimately familiar with how replication and/or sequences work wrt wal logging in postgresql. (but no, I'm not actually volunteering at this point to write said patch due to my backlog already being too large :P) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Simon Riggs
Date:
On 7 May 2012 09:19, Magnus Hagander <magnus@hagander.net> wrote: > On Mon, May 7, 2012 at 10:14 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: >> >>> Would be nice to see it added to the documentation (unless I just didn't find >>> it ?), as it is quite surprising, and might lead to problems if people expect >>> to be able to read sequence values from the slave. >> >> If you think so, please submit a patch. That's how it works here. > > FWIW, I think this would be a reasonable thing to document, given that > it violates the principle of least surprise for people who are not > intimately familiar with how replication and/or sequences work wrt wal > logging in postgresql. (but no, I'm not actually volunteering at this > point to write said patch due to my backlog already being too large > :P) Mine also. Its important that everybody understands that submitting a patch is the way to get change, plus its a good test of whether the change is actually worth the effort to make it happen. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes: > On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: >> Would be nice to see it added to the documentation (unless I just didn't find >> it ?), as it is quite surprising, and might lead to problems if people expect >> to be able to read sequence values from the slave. > If you think so, please submit a patch. That's how it works here. Any documentation patch should be written by somebody who's actually researched the behavior a bit; in particular I believe this can be adjusted with the sequence CACHE setting. regards, tom lane
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Michael Nolan
Date:
On Mon, May 7, 2012 at 4:01 AM, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote:
What people need to understand is that there is no way to 'read' a sequence value from a slave. 'SELECT * from sequence_name' will not reliably give you either the most recently assigned or the next sequence value. This is currently covered in the documentation for sequences, but could probably be improved upon and mentioned somewhere in the documentation on setting up slave servers. (I will look at adding it to the binary replication tutorial wiki page.)
Since 'nextval' cannot be called on a sequence on a slave (because a slave can only support read-only transactions), 'currval' will by definition return an error.
To cross-pollinate with another thread, if temporary tables (and insert/delete/update transactions to them) are to be supported on a slave, will the applications using those temporary tables expect to be able to use 'nextval' on inserts to temporary tables as well?
Yes.
--On Sunday 06 May 2012 10:29:17 Simon Riggs wrote:Would be nice to see it added to the documentation (unless I just didn't find
> On 4 May 2012 14:55, Vincent de Phily <vincent.dephily@mobile-devices.fr>
wrote:
it ?), as it is quite surprising, and might lead to problems if people expect
to be able to read sequence values from the slave.
What people need to understand is that there is no way to 'read' a sequence value from a slave. 'SELECT * from sequence_name' will not reliably give you either the most recently assigned or the next sequence value. This is currently covered in the documentation for sequences, but could probably be improved upon and mentioned somewhere in the documentation on setting up slave servers. (I will look at adding it to the binary replication tutorial wiki page.)
Since 'nextval' cannot be called on a sequence on a slave (because a slave can only support read-only transactions), 'currval' will by definition return an error.
To cross-pollinate with another thread, if temporary tables (and insert/delete/update transactions to them) are to be supported on a slave, will the applications using those temporary tables expect to be able to use 'nextval' on inserts to temporary tables as well?
As a bonus question, I guess it would be the same if using synchroneous
replication ?
Yes.
Mike Nolan
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Fujii Masao
Date:
On Mon, May 7, 2012 at 10:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Simon Riggs <simon@2ndQuadrant.com> writes: >> On 7 May 2012 09:01, Vincent de Phily <vincent.dephily@mobile-devices.fr> wrote: >>> Would be nice to see it added to the documentation (unless I just didn't find >>> it ?), as it is quite surprising, and might lead to problems if people expect >>> to be able to read sequence values from the slave. > >> If you think so, please submit a patch. That's how it works here. > > Any documentation patch should be written by somebody who's actually > researched the behavior a bit; in particular I believe this can be > adjusted with the sequence CACHE setting. No. That behavior is caused by the hard-coded value SEQ_LOG_VALS (= 32 in sequence.c) rather than CACHE setting. Regards, -- Fujii Masao
Re: Streaming replication: sequences on slave seemingly ahead of sequences on master
From
Merlin Moncure
Date:
On Mon, May 7, 2012 at 8:52 AM, Michael Nolan <htfoot@gmail.com> wrote: > To cross-pollinate with another thread, if temporary tables (and > insert/delete/update transactions to them) are to be supported on a slave, > will the applications using those temporary tables expect to be able to use > 'nextval' on inserts to temporary tables as well? That's a good question. I re-asked it for you on -hackers on the GTT thread. Look for follow ups there. merlin