Thread: xlog switch & last record before the switch

xlog switch & last record before the switch

From
"Florian G. Pflug"
Date:
Hi

To test my PITR-slave readonly-query patch, I continously do
insert into test ...
pg_switch_xlog()
sleep 1
on the master, and let the slave process the generated xlogs

The log output on the slave looks the following (unnecessary lines remove)

LOG:  restored log file "00000001000000000000016E" from archive
LOG:  REDO @ 0/16E00020; LSN 0/16E0004C: prev 0/16D00098; xid 1346: Transaction
- commit: 2007-07-12 15:43:00.686056+02
LOG:  REDO @ 0/16E0004C; LSN 0/16E0008C: prev 0/16E00020; xid 0: XLOG -
checkpoint: redo 0/16E00020; tli 1; xid 0/1347; oid 24576; multi 1; offset 0; online
LOG:  REDO @ 0/16E0008C; LSN 0/16E00140: prev 0/16E0004C; xid 1350: Sequence -
log: rel 1663/1/16384
LOG:  REDO @ 0/16E00140; LSN 0/16E00B88: prev 0/16E0008C; xid 1350; bkpb1: Heap
- insert: rel 1663/1/16386; tid 2/49
LOG:  REDO @ 0/16E00B88; LSN 0/16F00000: prev 0/16E00140; xid 1350: XLOG - xlog
switch

LOG:  restored log file "00000001000000000000016F" from archive
LOG:  REDO @ 0/16F00020; LSN 0/16F0004C: prev 0/16E00B88; xid 1350: Transaction
- commit: 2007-07-12 15:43:02.159717+02
LOG:  REDO @ 0/16F0004C; LSN 0/16F00098: prev 0/16F00020; xid 1352: Heap -
insert: rel 1663/1/16386; tid 2/50
LOG:  REDO @ 0/16F00098; LSN 0/17000000: prev 0/16F0004C; xid 1352: XLOG - xlog
switch

LOG:  restored log file "000000010000000000000170" from archive
LOG:  REDO @ 0/17000020; LSN 0/1700004C: prev 0/16F00098; xid 1352: Transaction
- commit: 2007-07-12 15:43:02.26456+02
LOG:  REDO @ 0/1700004C; LSN 0/17000098: prev 0/17000020; xid 1356: Heap -
insert: rel 1663/1/16386; tid 2/51
LOG:  REDO @ 0/17000098; LSN 0/17100000: prev 0/1700004C; xid 1356: XLOG - xlog
switch

As you can see, the COMMIT records seems to end up being logged *after*
the xlog switch. I would have expected the order
"heap-insert, commit, switch, heap-insert, commit, switch, ...", not
"heap-insert, switch, commit, heap-insert, switch, commit, ...".

Is this the expected behaviour, or just an artefact of the implementation
of xlog switches?

greetings, Florian Pflug



Re: xlog switch & last record before the switch

From
"Simon Riggs"
Date:
On Thu, 2007-07-12 at 16:17 +0200, Florian G. Pflug wrote:

> To test my PITR-slave readonly-query patch, I continously do
> insert into test ...
> pg_switch_xlog()
> sleep 1
> on the master, and let the slave process the generated xlogs
> 
> The log output on the slave looks the following (unnecessary lines remove)
> 
> LOG:  restored log file "00000001000000000000016E" from archive
> LOG:  REDO @ 0/16E00020; LSN 0/16E0004C: prev 0/16D00098; xid 1346: Transaction
> - commit: 2007-07-12 15:43:00.686056+02
> LOG:  REDO @ 0/16E0004C; LSN 0/16E0008C: prev 0/16E00020; xid 0: XLOG -
> checkpoint: redo 0/16E00020; tli 1; xid 0/1347;
>   oid 24576; multi 1; offset 0; online
> LOG:  REDO @ 0/16E0008C; LSN 0/16E00140: prev 0/16E0004C; xid 1350: Sequence -
> log: rel 1663/1/16384
> LOG:  REDO @ 0/16E00140; LSN 0/16E00B88: prev 0/16E0008C; xid 1350; bkpb1: Heap
> - insert: rel 1663/1/16386; tid 2/49
> LOG:  REDO @ 0/16E00B88; LSN 0/16F00000: prev 0/16E00140; xid 1350: XLOG - xlog
> switch
> 
> LOG:  restored log file "00000001000000000000016F" from archive
> LOG:  REDO @ 0/16F00020; LSN 0/16F0004C: prev 0/16E00B88; xid 1350: Transaction
> - commit: 2007-07-12 15:43:02.159717+02
> LOG:  REDO @ 0/16F0004C; LSN 0/16F00098: prev 0/16F00020; xid 1352: Heap -
> insert: rel 1663/1/16386; tid 2/50
> LOG:  REDO @ 0/16F00098; LSN 0/17000000: prev 0/16F0004C; xid 1352: XLOG - xlog
> switch
> 
> LOG:  restored log file "000000010000000000000170" from archive
> LOG:  REDO @ 0/17000020; LSN 0/1700004C: prev 0/16F00098; xid 1352: Transaction
> - commit: 2007-07-12 15:43:02.26456+02
> LOG:  REDO @ 0/1700004C; LSN 0/17000098: prev 0/17000020; xid 1356: Heap -
> insert: rel 1663/1/16386; tid 2/51
> LOG:  REDO @ 0/17000098; LSN 0/17100000: prev 0/1700004C; xid 1356: XLOG - xlog
> switch
> 
> As you can see, the COMMIT records seems to end up being logged *after*
> the xlog switch. I would have expected the order
> "heap-insert, commit, switch, heap-insert, commit, switch, ...", not
> "heap-insert, switch, commit, heap-insert, switch, commit, ...".
> 
> Is this the expected behaviour, or just an artefact of the implementation
> of xlog switches?

Can you show the exact SQL executed? 

If you do INSERT ... ;select pg_switch_xlog() then the COMMIT will occur
after the switch. If you do:
INSERT ...;
select pg_switch_xlog();
then the COMMIT will occur before the switch.

If it were otherwise this would imply statements were executed prior to
the previous commit, which I hope and pray is never the case.

--  Simon Riggs EnterpriseDB  http://www.enterprisedb.com



Re: xlog switch & last record before the switch

From
"Florian G. Pflug"
Date:
Simon Riggs wrote:
> On Thu, 2007-07-12 at 16:17 +0200, Florian G. Pflug wrote:
> 
>> To test my PITR-slave readonly-query patch, I continously do
>> insert into test ...
>> pg_switch_xlog()
>> sleep 1
>> on the master, and let the slave process the generated xlogs
>>
>> The log output on the slave looks the following (unnecessary lines remove)>>>> <snipped log output>>
>>
>> As you can see, the COMMIT records seems to end up being logged *after*
>> the xlog switch. I would have expected the order
>> "heap-insert, commit, switch, heap-insert, commit, switch, ...", not
>> "heap-insert, switch, commit, heap-insert, switch, commit, ...".
>>
>> Is this the expected behaviour, or just an artefact of the implementation
>> of xlog switches?
> 
> Can you show the exact SQL executed? 
> 
> If you do INSERT ... ;select pg_switch_xlog() then the COMMIT will occur
> after the switch. If you do:
> INSERT ...;
> select pg_switch_xlog();
> then the COMMIT will occur before the switch.

Ah, you caught me - I had forgotten that pgsql -c "<statement>"
executes the statement in one transaction.

I was just going to suggest that pg_start_backup, pg_stop_backup
and pg_switch_xlog emit a warning or even an error if called
from within a larger transaction, because that's quite certainly
not what the user wants. But since those are just plain functions,
I guess checking for that might not be trivial...

> If it were otherwise this would imply statements were executed prior to
> the previous commit, which I hope and pray is never the case.
You can relax, the bug was located in front of the screen :-)

greetings, Florian Pflug