Thread: Latest version of Hot Standby patch
http://wiki.postgresql.org/wiki/Hot_Standby now contains a link to latest version of this patch. This is still at "v5", just brought forward to CVS HEAD. I will be doing further work on the patch from here and will reply to this post each time I submit a new version. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: > http://wiki.postgresql.org/wiki/Hot_Standby > > now contains a link to latest version of this patch. This is still at > "v5", just brought forward to CVS HEAD. > > I will be doing further work on the patch from here and will reply to > this post each time I submit a new version. New version (already!) v5 - this time slightly broken down to aid review -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Can't we use the existing backendid in place of the slot id? (sorry if this has been discussed already; can't remember) -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, 2008-12-18 at 15:13 +0200, Heikki Linnakangas wrote: > Can't we use the existing backendid in place of the slot id? > > (sorry if this has been discussed already; can't remember) Exactly the sort of question we need, but unfortunately I'm a little hazy; I just woke up, so maybe some coffee will change that answer later. They're certainly related and I did try it initially. SlotId was not present in early versions of the patch up to 13 Oct, though backendId was. IIRC there were a couple of reasons:

* corner case behaviour of backendids - bgwriter writes checkpoint WAL records. Has no backendid, but needs a slotid (possibly others)
* slotids are assigned once and never changed, so allowing them to be used as array lookups directly

I think it would be possible to use slotids as backendids, but not the other way around. Anyway, it seemed like a great way to induce bugs into the sinval code, so I didn't try too hard to make them identical. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: > http://wiki.postgresql.org/wiki/Hot_Standby > > now contains a link to latest version of this patch. This is still at > "v5", just brought forward to CVS HEAD. > > I will be doing further work on the patch from here and will reply to > this post each time I submit a new version. New version: patch apply fixed, doc typos corrected. First half of Wiki docs checked and updated to explain User and Admin Overview exactly as currently implemented. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > * corner case behaviour of backendids - bgwriter writes checkpoint WAL > records. Has no backendid, but needs a slotid (possibly others) Why does bgwriter need a slotid? It doesn't run any transactions. > * slotids are assigned once and never changed, so allowing them to be > used as array lookups directly So are backend ids. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-12-19 at 10:59 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > * corner case behaviour of backendids - bgwriter writes checkpoint WAL > > records. Has no backendid, but needs a slotid (possibly others) > > Why does bgwriter need a slotid? It doesn't run any transactions. > > > * slotids are assigned once and never changed, so allowing them to be > > used as array lookups directly > > So are backend ids. I'm a little hazy, to be sure. I'm pretty sure there was a blocker, but if I cannot recall it we should assume it doesn't exist. Where are you going with the thought? Remove slotId from each proc and then use backendId to identify the recovery proc? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Fri, 2008-12-19 at 10:59 +0200, Heikki Linnakangas wrote: >> Simon Riggs wrote: >>> * corner case behaviour of backendids - bgwriter writes checkpoint WAL >>> records. Has no backendid, but needs a slotid (possibly others) >> Why does bgwriter need a slotid? It doesn't run any transactions. >> >>> * slotids are assigned once and never changed, so allowing them to be >>> used as array lookups directly >> So are backend ids. > > I'm a little hazy, to be sure. I'm pretty sure there was a blocker, but > if I cannot recall it we should assume it doesn't exist. > > Where are you going with the thought? Remove slotId from each proc and > then use backendId to identify the recovery proc? Yep. Well, to be honest, I don't much like the whole notion of tracking the slots. I think we should just rely on the XLOG_RECOVERY_END records to purge stale PGPROC entries, belonging to backends that died without writing an abort record. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Well, to be honest, I don't much like the whole notion of tracking the > slots. I think we should just rely on the XLOG_RECOVERY_END records to > purge stale PGPROC entries, belonging to backends that died without > writing an abort record. Sorry, I meant XLOG_XACT_RUNNING_XACTS, not XLOG_RECOVERY_END. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Fri, 2008-12-19 at 14:00 +0200, Heikki Linnakangas wrote: > Heikki Linnakangas wrote: > > Well, to be honest, I don't much like the whole notion of tracking the > > slots. I think we should just rely on the XLOG_RECOVERY_END records to > > purge stale PGPROC entries, belonging to backends that died without > > writing an abort record. > > Sorry, I meant XLOG_XACT_RUNNING_XACTS, not XLOG_RECOVERY_END. OK, previous response aborted, but the problem is fairly similar. If we just assign a recovery proc to each new transaction started then

* we would need an expandable number of recovery procs to cope with the fact that we have potentially more emulated transactions than procs. This info needs to be in shared memory, so however many you allocate, it can still run out. Then what do you do? PANIC recovery and permanently fail? Kick off all queries until an XLOG_XACT_RUNNING_XACTS arrives?
* there would be no direct mapping between the commit record and the proc, so each commit would need to scan the whole proc array to get the correct proc (which is potentially getting bigger and bigger because of the first point)

The slotid (or backendid) is essential to keeping the number of emulated transactions bounded at all times. I don't see that it costs much, if anything, and the resulting code is at least as simple as the alternatives. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
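For illustration only (hypothetical names, not the patch code), here is a sketch of the bounded O(1) property the slotid buys: with a slot id carried in each WAL record, replay finds the right recovery proc by direct array index, while without one every commit record must search an array that the first point above is meanwhile forcing to grow.

#include <stddef.h>

#define MAX_RECOVERY_PROCS 64           /* fixed at startup: bounded shared memory */

/* RecoveryProc and both lookup functions are hypothetical illustration. */
typedef struct RecoveryProc
{
    unsigned int xid;                   /* emulated transaction, 0 = free */
} RecoveryProc;

static RecoveryProc recoveryProcs[MAX_RECOVERY_PROCS];

/* With a slotid in each WAL record, commit replay is a direct lookup. */
static RecoveryProc *
proc_by_slotid(int slotid)
{
    return &recoveryProcs[slotid];
}

/* Without it, each commit record forces a scan of the whole array. */
static RecoveryProc *
proc_by_xid_scan(unsigned int xid)
{
    for (int i = 0; i < MAX_RECOVERY_PROCS; i++)
    {
        if (recoveryProcs[i].xid == xid)
            return &recoveryProcs[i];
    }
    return NULL;                        /* not found */
}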
On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: > http://wiki.postgresql.org/wiki/Hot_Standby > > now contains a link to latest version of this patch. v6 of Hot Standby now uploaded to Wiki (link above), with these changes:

* Must ignore_killed_tuples and never kill_prior_tuple during index scans in recovery (v6)
* XLOG_BTREE_DELETE records handled correctly (v6)
* btree VACUUM code - must scan every block of index (v6)
* BEGIN TRANSACTION READ WRITE should throw error (v6)

New test cycle starting with this patch over next few days. Work continues on other items. Happy New Year everyone, -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: >> http://wiki.postgresql.org/wiki/Hot_Standby >> >> now contains a link to latest version of this patch. > > v6 of Hot Standby now uploaded to Wiki (link above), with these changes: > > * Must ignore_killed_tuples and never kill_prior_tuple during index > scans in recovery (v6) > * XLOG_BTREE_DELETE records handled correctly (v6) > * btree VACUUM code - must scan every block of index (v6) > * BEGIN TRANSACTION READ WRITE should throw error (v6) > > New test cycle starting with this patch over next few days. > I use latest CVS version. I tried to apply the patches and I have the following error :

./hs.apply.sh: line 4: hs.prevent.v5.patch

I think you forgot to update the script. hs.prevent.v5.patch doesn't exist in the tar file but hs.prevent.v6.patch does. Not sure we really need this new file because I have a compilation error if I use the new one. I don't have this error when I don't use the hs.prevent.v6.patch file. Compilation error is:

utility.c: In function ‘ProcessUtility’:
utility.c:292: error: expected ‘)’ before ‘{’ token
utility.c:306: error: expected expression before ‘}’ token

And one file didn't want to get patched:

patching file src/include/catalog/pg_proc.h
Hunk #1 FAILED at 3223.
1 out of 1 hunk FAILED -- saving rejects to file src/include/catalog/pg_proc.h.rej

Not sure why. I did patch it manually, but I still have my compilation error. Regards. -- Guillaume. http://www.postgresqlfr.org http://dalibo.com
On Fri, 2009-01-02 at 11:02 +0100, Guillaume Lelarge wrote: > Simon Riggs wrote: > > On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: > >> http://wiki.postgresql.org/wiki/Hot_Standby > >> > >> now contains a link to latest version of this patch. > > > > v6 of Hot Standby now uploaded to Wiki (link above), with these changes: > > > > * Must ignore_killed_tuples and never kill_prior_tuple during index > > scans in recovery (v6) > > * XLOG_BTREE_DELETE records handled correctly (v6) > > * btree VACUUM code - must scan every block of index (v6) > > * BEGIN TRANSACTION READ WRITE should throw error (v6) > > > > New test cycle starting with this patch over next few days. > > > > I use latest CVS version. I tried to apply the patches and I have the > following error : Thanks, will fix. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Fri, 2009-01-02 at 17:35 +0000, Simon Riggs wrote: > > I use latest CVS version. I tried to apply the patches and I have the > > following error : > > Thanks, will fix. Fixed various bit rots and re-packaged. v6a now up, v6 unlinked. Thanks, -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Fri, 2009-01-02 at 17:35 +0000, Simon Riggs wrote: > >>> I use latest CVS version. I tried to apply the patches and I have the >>> following error : >> Thanks, will fix. > > Fixed various bit rots and re-packaged. v6a now up, v6 unlinked. > Thanks. I only did a few checks and it worked great for me. I will try to put some more time on it. -- Guillaume. http://www.postgresqlfr.org http://dalibo.com
On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote: > bench=# select now(),count(*) from history; > ERROR: could not open relation base/16384/16394: No such file or > directory Thanks for the report. I'm attempting to recreate now. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: >> http://wiki.postgresql.org/wiki/Hot_Standby >> >> now contains a link to latest version of this patch. > > v6 of Hot Standby now uploaded to Wiki (link above), with these changes: > > * Must ignore_killed_tuples and never kill_prior_tuple during index > scans in recovery (v6) > * XLOG_BTREE_DELETE records handled correctly (v6) > * btree VACUUM code - must scan every block of index (v6) > * BEGIN TRANSACTION READ WRITE should throw error (v6) > > New test cycle starting with this patch over next few days. I'm running some tests on v6a. The setup is:

- install master, setup standby as usual, start standby
- create database bench on master
- initialize pgbench dataset size 100 on master
- start 4 clients doing 500000 transactions each

After about 100000 transactions have been processed on the master, query the standby:

bench=# \d history
        Table "public.history"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 tid    | integer                     |
 bid    | integer                     |
 aid    | integer                     |
 delta  | integer                     |
 mtime  | timestamp without time zone |
 filler | character(22)               |

bench=# select now(),count(*) from history;
ERROR: could not open relation base/16384/16394: No such file or directory

regards Mark
On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:

> bench=# \d history
>         Table "public.history"
>  Column |            Type             | Modifiers
> --------+-----------------------------+-----------
>  tid    | integer                     |
>  bid    | integer                     |
>  aid    | integer                     |
>  delta  | integer                     |
>  mtime  | timestamp without time zone |
>  filler | character(22)               |
>
> bench=# select now(),count(*) from history;
> ERROR: could not open relation base/16384/16394: No such file or directory

From my recreating your test case, the oids are consistent with the History table. So the cache looks good. md.c should be caching the file descriptor so the second use of the file should not be reopening it. I've not touched smgr/md so a missing file error is a surprise. I wonder if this is an error associated with large file handling and file forks? Smells like an FSM or VM error. Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394* -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Mark Kirkwood wrote: > Simon Riggs wrote: >> On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:
>>
>>> bench=# \d history
>>>         Table "public.history"
>>>  Column |            Type             | Modifiers
>>> --------+-----------------------------+-----------
>>>  tid    | integer                     |
>>>  bid    | integer                     |
>>>  aid    | integer                     |
>>>  delta  | integer                     |
>>>  mtime  | timestamp without time zone |
>>>  filler | character(22)               |
>>>
>>> bench=# select now(),count(*) from history;
>>> ERROR: could not open relation base/16384/16394: No such file or directory
>>
>> From my recreating your test case, the oids are consistent with the History table. So the cache looks good.
>>
>> md.c should be caching the file descriptor so the second use of the file should not be reopening it. I've not touched smgr/md so a missing file error is a surprise.
>>
>> I wonder if this is an error associated with large file handling and file forks? Smells like an FSM or VM error.
>>
>> Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*
>
> Yeah -
> $ ls -l $PGDATA/base/16384/16394*
> ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory

This might be useful: the other tables in the dataset (accounts, branches, tellers) all behave as expected:

bench=# select now(),count(*) from branches;
              now              | count
-------------------------------+-------
 2009-01-04 22:17:00.298597+13 |   100
(1 row)

I'm guessing something tied up with the fact that history has no rows to start with...
On Sun, 2009-01-04 at 22:13 +1300, Mark Kirkwood wrote: > Simon Riggs wrote: > > > > Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394* > > > > > Yeah - > $ ls -l $PGDATA/base/16384/16394* > ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory What else is missing? Files, Directories etc? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Sun, 2009-01-04 at 21:03 +1300, Mark Kirkwood wrote:
>
>> bench=# \d history
>>         Table "public.history"
>>  Column |            Type             | Modifiers
>> --------+-----------------------------+-----------
>>  tid    | integer                     |
>>  bid    | integer                     |
>>  aid    | integer                     |
>>  delta  | integer                     |
>>  mtime  | timestamp without time zone |
>>  filler | character(22)               |
>>
>> bench=# select now(),count(*) from history;
>> ERROR: could not open relation base/16384/16394: No such file or directory
>
> From my recreating your test case, the oids are consistent with the History table. So the cache looks good. md.c should be caching the file descriptor so the second use of the file should not be reopening it. I've not touched smgr/md so a missing file error is a surprise. I wonder if this is an error associated with large file handling and file forks? Smells like an FSM or VM error. Is the file actually missing? i.e. ls -l mydatadir/base/16384/16394*

Yeah -

$ ls -l $PGDATA/base/16384/16394*
ls: /data0/pgslave/8.4/base/16384/16394*: No such file or directory

regards Mark
On Sun, 2009-01-04 at 22:18 +1300, Mark Kirkwood wrote: > >>> > >>> bench=# select now(),count(*) from history; > >>> ERROR: could not open relation base/16384/16394: No such file or > >>> directory > >>> > I'm guessing something tied up with the fact that history has no rows > to > start with... Good guess, thanks. I can recreate the error now, though not by following the actions in the order you mentioned. I guess the files hadn't applied fully before you ran the test. The problem I can re-create looks like this:

1. Create standby set-up, with both primary and standby active
2. Create new table on primary, but don't add data; wait for apply
3. Attempt to access new table on standby, throws ERROR as shown
4. Add 1 row on primary; wait for apply
5. Attempt to access new table on standby, no ERROR

It looks to me like WAL for CREATE TABLE doesn't actually create a file; we just rely on the ability of mdextend() to create the file if required during recovery. So it looks to me like either an outstanding error with the current system, or a new error introduced with recent-ish md/smgr changes. Second opinion please Heikki, if you are available? I'll come back to this in a few hours, but I have some other things I need to do right now. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Sun, 2009-01-04 at 22:18 +1300, Mark Kirkwood wrote: >>>>> bench=# select now(),count(*) from history; >>>>> ERROR: could not open relation base/16384/16394: No such file or >>>>> directory >>>>> > >> I'm guessing something tied up with the fact that history has no rows >> to >> start with... > > Good guess, thanks. I can recreate the error now, though not by > following the actions in the order you mentioned. I guess the files > hadn't applied fully before you ran the test. > > The problem I can re-create looks like this: > > 1. Create standby set-up, with both primary and standby active > 2. Create new table on primary, but don't add data; wait for apply > 3. Attempt to access new table on standby, throws ERROR as shown > 4. Add 1 row on primary; wait for apply > 5. Attempt to access new table on standby, no ERROR > > It looks to me like WAL for CREATE TABLE doesn't actually create a file, > we just rely on the ability of mdextend() to create the file if required > during recovery. > > So it looks to me like either an outstanding error with current system, > or a new error introduced with recent-ish md/smgr changes. Second > opinion please Heikki, if you are available? Hmm, that's odd. Table creation calls RelationCreateStorage, which calls smgrcreate and writes the WAL record. smgr_redo certainly does call smgrcreate. I can reproduce that too with CVS HEAD, so it's clearly a bug. I probably introduced it with the recent smgr changes; I'll try to hunt it down. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > I can reproduce that too with CVS HEAD, so it's clearly a bug. I > probably introduced it with the recent smgr changes; I'll try to hunt it > down. Now that was an embarrassingly simple bug:

--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -106,7 +106,7 @@ RelationCreateStorage(RelFileNode rnode, bool istemp)
 	srel = smgropen(rnode);
 	smgrcreate(srel, MAIN_FORKNUM, false);
 
-	if (istemp)
+	if (!istemp)
 	{
 		/*
 		 * Make an XLOG entry showing the file creation. If we abort, the file

Fixed, as well as the same bug in RelationTruncate. Thanks for report, I'll keep this brown paper bag on for a few days... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
What I find interesting about this is that whereas I had been concerned that adding hot standby late in the development cycle might destabilize the tree and add lots of time to the release cycle, it seems having it might actually increase our ability to see problems in the recovery code, which was previously quite hard to test. -- Greg

On 4 Jan 2009, at 09:59, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> Heikki Linnakangas wrote:
>> I can reproduce that too with CVS HEAD, so it's clearly a bug. I
>> probably introduced it with the recent smgr changes; I'll try to
>> hunt it down.
>
> Now that was an embarrassingly simple bug:
>
> --- a/src/backend/catalog/storage.c
> +++ b/src/backend/catalog/storage.c
> @@ -106,7 +106,7 @@ RelationCreateStorage(RelFileNode rnode, bool istemp)
>  	srel = smgropen(rnode);
>  	smgrcreate(srel, MAIN_FORKNUM, false);
>
> -	if (istemp)
> +	if (!istemp)
>  	{
>  		/*
>  		 * Make an XLOG entry showing the file creation. If we abort, the file
>
> Fixed, as well as the same bug in RelationTruncate. Thanks for report, I'll keep this brown paper bag on for a few days...
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
Simon Riggs wrote: > * btree VACUUM code - must scan every block of index (v6) Need to unlock them too.

--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -472,7 +472,7 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	xlrec = (xl_btree_vacuum *) XLogRecGetData(record);
 
 	/*
-	 * We need to ensure everyy block is unpinned between the
+	 * We need to ensure every block is pinned between the
 	 * lastBlockVacuumed and the current block, if there are any.
 	 * This ensures that every block in the index is touched during
 	 * VACUUM as required to ensure scans work correctly.
@@ -482,7 +482,11 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 		BlockNumber blkno = xlrec->lastBlockVacuumed + 1;
 
 		for (; blkno < xlrec->block; blkno++)
+		{
 			buffer = XLogReadBufferForCleanup(xlrec->node, blkno, false);
+			if (BufferIsValid(buffer))
+				UnlockReleaseBuffer(buffer);
+		}
 	}
 
 	/*

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-01-07 at 11:35 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > * btree VACUUM code - must scan every block of index (v6) > > Need to unlock them too. Oh c**p. Thanks. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
There's still something wrong with the way subtransactions are handled. I got:

postgres=# SELECT * FROM foo;
ERROR: could not access status of transaction 118649
DETAIL: Could not open file "pg_subtrans/0001": No such file or directory.

in the standby after some testing. I created a lot of subtransactions in the master, each inserting a row to table 'foo', and left the transaction open. In another session, I did a lot of dummy activity (truncate bar; insert into bar ...) to generate WAL, and also checkpoints and pg_switch_xlog() calls. I then restarted the standby, and got the above error. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-01-07 at 13:18 +0200, Heikki Linnakangas wrote: > There's still something wrong with the way subtransactions are handled. > I got: > > postgres=# SELECT * FROM foo; > ERROR: could not access status of transaction 118649 > DETAIL: Could not open file "pg_subtrans/0001": No such file or directory. > > in the standby after some testing. Please tell me some more. Was 118649 active etc? 118649 should be in pg_subtrans/057 shouldn't it?? Hmmm. > I created a lot of subtransactions in the master, each inserting a row > to table 'foo', and left the transaction open. In another session, I did > a lot of dummy activity (truncate bar; insert into bar ...) to generate > WAL, and also checkpoints and pg_switch_xlog() calls. I then restarted > the standby, and got the above error. Can you confirm this works in normal running? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2008-12-17 at 15:21 +0000, Simon Riggs wrote: > http://wiki.postgresql.org/wiki/Hot_Standby > > now contains a link to latest version of this patch. v6b now available via Wiki (link above), fixes 5 reported issues. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Another annoyance I noticed while testing the case of a lot of subtransactions (overflowing the procarray cache) is that when you have a transaction with a lot of subtransactions open, getting the initial snapshot fails, and the standby doesn't open for read-only queries. Normally, GetRunningTransactionData determines the latest running xid by scanning the procarray. If the subxid cache has overflowed, it simply gives up. Comment there suggests that it could call ReadNewTransactionId() instead, like it does when there's no active xids in proc array. I think we should do that, or something else to alleviate the problem. When there's no xids in the procarray, couldn't we just use latestCompletedXid instead of calling ReadNewTransactionId()? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: > Another annoyance I noticed while testing I'm sorry this has annoyed you. Thanks for testing. > the case of a lot of > subtransactions (overflowing the procarray cache) is that when you have > a transaction with a lot of subtransactions open, getting the initial > snapshot fails, ...on that attempt only, yes. > and the standby doesn't open for read-only queries. It will eventually do so, the first time there is no overflow, so in most cases the wait will be fairly short. I thought it was code that would run so seldom that it was unlikely to be bug-free. I would prefer to record this as a possible enhancement once committed rather than an essential fix; would you agree? > Normally, GetRunningTransactionData determines the latest running xid > by scanning the procarray. If the subxid cache has > overflowed, it simply gives up. Comment there suggests that it could > call ReadNewTransactionId() instead, like it does when there's no active > xids in proc array. I think we should do that, or something else to > alleviate the problem. I mention in comments that I was worried about the contention that would cause, since this runs in all servers. > When there's no xids in the procarray, couldn't we just use > latestCompletedXid instead of calling ReadNewTransactionId()? latestCompletedXid is protected by ProcArrayLock so not much difference between those two. If you're saying you'd prefer latestCompletedXid then I can make that change. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: >> When there's no xids in the procarray, couldn't we just use >> latestCompletedXid instead of calling ReadNewTransactionId()? > > latestCompletedXid is protected by ProcArrayLock so not much difference > between those two. The big difference is that we're already holding ProcArrayLock. You could read the value of latestCompletedXid before releasing ProcArrayLock, and wouldn't need the retry logic. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
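A minimal sketch of the pattern Heikki describes, as it might sit inside GetRunningTransactionData(); the helper name is hypothetical, while latestCompletedXid, ProcArrayLock and the lock calls are the real machinery. The point is that the fallback value is read under the ProcArrayLock the function already holds, so no ReadNewTransactionId() call or retry loop is needed.

#include "postgres.h"
#include "access/transam.h"
#include "storage/lwlock.h"
#include "storage/procarray.h"

/* Hypothetical helper; stands in for the existing procarray scan. */
extern TransactionId MaxRunningXidInProcArray(void);

static TransactionId
latest_running_xid_sketch(void)
{
	TransactionId xid;

	LWLockAcquire(ProcArrayLock, LW_SHARED);

	xid = MaxRunningXidInProcArray();

	/*
	 * No running xids: fall back to latestCompletedXid, which is also
	 * protected by ProcArrayLock, so we can read it before releasing
	 * the lock we already hold.
	 */
	if (!TransactionIdIsValid(xid))
		xid = ShmemVariableCache->latestCompletedXid;

	LWLockRelease(ProcArrayLock);
	return xid;
}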
On Wed, 2009-01-07 at 23:56 +0200, Heikki Linnakangas wrote: > Simon Riggs wrote: > > On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: > >> When there's no xids in the procarray, couldn't we just use > >> latestCompletedXid instead of calling ReadNewTransactionId()? > > > > latestCompletedXid is protected by ProcArrayLock so not much difference > > between those two. > > The big difference is that we're already holding ProcArrayLock. You > could read the value of latestCompletedXid before releasing > ProcArrayLock, and wouldn't need the retry logic. Sounds good to me then. Will rework. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2009-01-07 at 22:08 +0000, Simon Riggs wrote: > On Wed, 2009-01-07 at 23:56 +0200, Heikki Linnakangas wrote: > > Simon Riggs wrote: > > > On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: > > >> When there's no xids in the procarray, couldn't we just use > > >> latestCompletedXid instead of calling ReadNewTransactionId()? > > > > > > latestCompletedXid is protected by ProcArrayLock so not much difference > > > between those two. > > > > The big difference is that we're already holding ProcArrayLock. You > > could read the value of latestCompletedXid before releasing > > ProcArrayLock, and wouldn't need the retry logic. > > Sounds good to me then. Will rework. Applies brakes suddenly. I realise this is a subtle trap I almost fell into the first time I coded it. The function in question is GetRunningTransactionData(), so we are interested in the latest running xid, not the latest completed xid. The latter is sufficient for snapshots, but the information derived by GetRunningTransactionData() is used to maintain UnobservedXids. Now we might have a discussion about whether we *need* that information, but the code is correct as it currently stands. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
Simon Riggs wrote: > On Wed, 2009-01-07 at 22:08 +0000, Simon Riggs wrote: >> On Wed, 2009-01-07 at 23:56 +0200, Heikki Linnakangas wrote: >>> Simon Riggs wrote: >>>> On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: >>>>> When there's no xids in the procarray, couldn't we just use >>>>> latestCompletedXid instead of calling ReadNewTransactionId()? >>>> latestCompletedXid is protected by ProcArrayLock so not much difference >>>> between those two. >>> The big difference is that we're already holding ProcArrayLock. You >>> could read the value of latestCompletedXid before releasing >>> ProcArrayLock, and wouldn't need the retry logic. >> Sounds good to me then. Will rework. > > Applies brakes suddenly. > > I realise this is a subtle trap I almost fell into the first time I coded > it. The function in question is GetRunningTransactionData(), so we are > interested in the latest running xid, not the latest completed xid. The > latter is sufficient for snapshots, but the information derived by > GetRunningTransactionData() is used to maintain UnobservedXids. If there are no transactions running, the latest completed xid is just what we need. When there are transactions in the procarray, we should take the max xid of those, as the patch already does. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-01-07 at 13:18 +0200, Heikki Linnakangas wrote: > There's still something wrong with the way subtransactions are handled. > I got: > > postgres=# SELECT * FROM foo; > ERROR: could not access status of transaction 118649 > DETAIL: Could not open file "pg_subtrans/0001": No such file or directory. > > in the standby after some testing. Currently, we don't run TruncateSubtrans() when we perform restartpoints. That means that it's impossible for HS to see an xid as running for which its segment has been deleted. The only other possibility is that the segment has not yet been created. subtrans is not WAL logged, so new subtrans pages are never created during recovery, so "not yet created" is a typical case. This looks to be essentially the same error I had already fixed, just that my fix in slru.c is not sufficient. The reason for this is that initially this error occurred in the startup process when attempting to record a subtransaction start. Now that has been optimised away, the same error occurs, but now while reading rather than writing. We don't wish to introduce WAL logging of subtrans, so the correct fix is to simply widen the error-checking in SlruPhysicalReadPage() from using InRecovery to IsRecoveryProcessingMode(). That then causes us to return zeroes for the page requested rather than error out. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
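A simplified, self-contained sketch of that widened check; the real code lives in SlruPhysicalReadPage() in slru.c and the names below are stand-ins for illustration. The behavioural change is only which "are we in recovery" test guards the zero-page fallback.

#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define BLCKSZ 8192

/* Stand-in for IsRecoveryProcessingMode(): true for *any* backend during
 * recovery, where InRecovery was true only in the startup process. */
static bool in_recovery_processing_mode;

/*
 * Sketch of the open-failure path: a missing segment file during
 * recovery is answered with a page of zeroes instead of an error.
 */
static bool
slru_read_page_sketch(int fd, int open_errno, char *page_buffer)
{
	if (fd < 0)
	{
		if (open_errno != ENOENT || !in_recovery_processing_mode)
			return false;               /* genuine failure: caller reports it */

		memset(page_buffer, 0, BLCKSZ); /* missing in recovery: zeroed page */
		return true;
	}

	/* ... normal read path elided ... */
	return true;
}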
On Thu, 2009-01-08 at 12:12 +0200, Heikki Linnakangas wrote: > >> Sounds good to me then. Will rework. > > > > Applies brakes suddenly. > > > > I realise this is a subtle trap I almost fell into the first time I coded > > it. The function in question is GetRunningTransactionData(), so we are > > interested in the latest running xid, not the latest completed xid. The > > latter is sufficient for snapshots, but the information derived by > > GetRunningTransactionData() is used to maintain UnobservedXids. > > If there are no transactions running, the latest completed xid is just what we > need. > When there are transactions in the procarray, we should take the > max xid of those, as the patch already does. OK, I don't now see the need for the special case in the way I've done it. There could still be problems there, but if there are they should apply to all cases, not just the no-transactions-running case. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2009-01-07 at 15:43 +0200, Heikki Linnakangas wrote: > Normally, GetRunningTransactionData determines the latest running xid > by scanning the procarray. If the subxid cache has > overflowed, it simply gives up. Comment there suggests that it could > call ReadNewTransactionId() instead, like it does when there's no > active xids in proc array. I think we should do that, or something > else to alleviate the problem. > > When there's no xids in the procarray, couldn't we just use > latestCompletedXid instead of calling ReadNewTransactionId()? Done. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
On Wed, 2009-01-07 at 13:18 +0200, Heikki Linnakangas wrote: > There's still something wrong with the way subtransactions are handled. > I got: > > postgres=# SELECT * FROM foo; > ERROR: could not access status of transaction 118649 > DETAIL: Could not open file "pg_subtrans/0001": No such file or directory. > > in the standby after some testing. > > I created a lot of subtransactions in the master, each inserting a row > to table 'foo', and left the transaction open. In another session, I did > a lot of dummy activity (truncate bar; insert into bar ...) to generate > WAL, and also checkpoints and pg_switch_xlog() calls. I then restarted > the standby, and got the above error. I fixed this by ignoring I/O errors on pg_subtrans, but I think that's the wrong approach and could mask errors. ExtendSubtrans() doesn't generate WAL records, but it could. I don't want to do that either, for performance reasons. Best way seems to be to do (almost) the same thing as ExtendSubtrans() during recovery, so we zero out pages at the point that recovering transactions get written to pg_subtrans. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
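A hedged sketch of that idea, written as if inside subtrans.c; the function and its page bookkeeping are hypothetical, while SimpleLruZeroPage(), SubTransCtl and SubtransControlLock are the existing SLRU machinery it would lean on.

/* Hypothetical sketch, as if added to src/backend/access/transam/subtrans.c */

static int	last_zeroed_pageno = -1;	/* recovery-side bookkeeping (hypothetical) */

static void
RecoveryExtendSubtrans(TransactionId xid)
{
	int		pageno = TransactionIdToPage(xid);	/* existing macro in subtrans.c */

	/*
	 * pg_subtrans is not WAL-logged, so zero the page here, mirroring
	 * what the normal-running extend code does at xid assignment time.
	 */
	if (pageno == last_zeroed_pageno)
		return;

	LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
	SimpleLruZeroPage(SubTransCtl, pageno);
	last_zeroed_pageno = pageno;
	LWLockRelease(SubtransControlLock);
}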
Thanks for picking this up, even though my report was so vague. Simon Riggs wrote: > Best way seems to be to do (almost) the same thing as ExtendSubtrans() > during recovery, so we zero out pages at the point that recovering > transactions get written to pg_subtrans. Yep. Do we have the same bug with ExtendCLOG? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Wed, 2009-01-14 at 08:27 +0200, Heikki Linnakangas wrote: > Thanks for picking this up, even though my report was so vague. > > Simon Riggs wrote: > > Best way seems to be to do (almost) the same thing as ExtendSubtrans() > > during recovery, so we zero out pages at the point that recovering > > transactions get written to pg_subtrans. > > Yep. > > Do we have the same bug with ExtendCLOG? No, the clog issues WAL records for that. 8 times less than subtrans would if we allowed it to, so it's not as bad performance-wise either. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support
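For contrast, the clog side, paraphrasing the shape of ZeroCLOGPage() in src/backend/access/transam/clog.c (treat the details as approximate): extending clog emits a WAL record for each newly zeroed page, so recovery replays the zeroing itself and never finds a clog page missing.

/* Paraphrase of clog.c's page-zeroing (approximate, for illustration) */
static int
ZeroCLOGPage(int pageno, bool writeXlog)
{
	int		slotno = SimpleLruZeroPage(ClogCtl, pageno);

	if (writeXlog)
		WriteZeroPageXlogRec(pageno);	/* WAL record replayed by clog_redo */

	return slotno;
}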