Thread: BUG #6041: Unlogged table was created bad in slave node

BUG #6041: Unlogged table was created bad in slave node

From

"Emanuel"

Date:

26 May 2011, 08:37:33

The following bug has been logged online:

Bug reference:      6041
Logged by:          Emanuel
Email address:      postgres.arg@gmail.com
PostgreSQL version: 9.1 beta
Operating system:   Ubuntu
Description:        Unlogged table was created bad in slave node
Details:

MASTER=6666
SLAVE SYNC=6667
SLAVE ASYNC=6668

root@dell-desktop:/usr/local/pgsql# bin/psql -Upostgres -p6666
psql (9.1beta1)
Type "help" for help.

postgres=# CREATE UNLOGGED TABLE voidy AS SELECT i, random() FROM
generate_series(1,1000) i(i);
SELECT 1000
postgres=# \q
root@dell-desktop:/usr/local/pgsql# bin/psql -Upostgres -p6667
psql (9.1beta1)
Type "help" for help.

postgres=# \d voidy
     Unlogged table "public.voidy"
 Column |       Type       | Modifiers
--------+------------------+-----------
 i      | integer          |
 random | double precision |

postgres=# select * from voidy ;
ERROR:  could not open file "base/12071/17034": No existe el archivo o
directorio
postgres=# \q
root@dell-desktop:/usr/local/pgsql# bin/psql -Upostgres -p6668
psql (9.1beta1)
Type "help" for help.

postgres=# \d voidy
     Unlogged table "public.voidy"
 Column |       Type       | Modifiers
--------+------------------+-----------
 i      | integer          |
 random | double precision |

postgres=# select * from voidy ;
ERROR:  could not open file "base/12071/17034": No existe el archivo o
directorio

Re: BUG #6041: Unlogged table was created bad in slave node

From

Euler Taveira de Oliveira

Date:

26 May 2011, 13:07:04

Em 26-05-2011 08:37, Emanuel escreveu:

> Description:        Unlogged table was created bad in slave node
>
Unlogged table contents are not available to slave nodes [1].

> postgres=# select * from voidy ;
> ERROR:  could not open file "base/12071/17034": No existe el archivo o
> directorio
>
I think we should emit the real cause in those cases, if possible (not too
much overhead). The message would be "Unlogged table content is not available
in standby server".


[1] http://www.postgresql.org/docs/9.1/static/sql-createtable.html


--
   Euler Taveira de Oliveira - Timbira       http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Re: BUG #6041: Unlogged table was created bad in slave node

From

Alvaro Herrera

Date:

26 May 2011, 13:25:19

Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011:
> Em 26-05-2011 08:37, Emanuel escreveu:
>
> > Description:        Unlogged table was created bad in slave node
> >
> Unlogged table contents are not available to slave nodes [1].
>
> > postgres=# select * from voidy ;
> > ERROR:  could not open file "base/12071/17034": No existe el archivo o
> > directorio
> >
> I think we should emit the real cause in those cases, if possible (not too
> much overhead). The message would be "Unlogged table content is not available
> in standby server".

I guess what it should do is create an empty file in the slave.

--
Ãlvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #6041: Unlogged table was created bad in slave node

From

Tom Lane

Date:

26 May 2011, 13:34:47

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011:
>> I think we should emit the real cause in those cases, if possible (not too
>> much overhead). The message would be "Unlogged table content is not available
>> in standby server".

> I guess what it should do is create an empty file in the slave.

Probably it should, because won't the table malfunction after the slave
is promoted to master, if there's no file at all there?  Or will the
process of coming live create an empty file even if there was none?

But Euler is pointing out a different issue, which is usability.  If the
slave just acts like the table is present but empty, we are likely to
get bug reports about that too.  An error telling you you aren't allowed
to access such a table on slaves would be more user-friendly, if we can
do it without too much pain.

            regards, tom lane

Re: BUG #6041: Unlogged table was created bad in slave node

From

Emanuel Calvo

Date:

26 May 2011, 14:38:43

2011/5/26 Euler Taveira de Oliveira <euler@timbira.com>:
> Em 26-05-2011 08:37, Emanuel escreveu:
>
>> Description: =C2=A0 =C2=A0 =C2=A0 =C2=A0Unlogged table was created bad i=
n slave node
>>
> Unlogged table contents are not available to slave nodes [1].
>

I know it. But the error raised isn't pretty nice for common users.
IMHO it could be an empty file with a message of WARNING level,
for notify administratros to watch up these tables.

>> postgres=3D# select * from voidy ;
>> ERROR: =C2=A0could not open file "base/12071/17034": No existe el archiv=
o o
>> directorio
>>
> I think we should emit the real cause in those cases, if possible (not too
> much overhead). The message would be "Unlogged table content is not
> available in standby server".
>

I think that detecting if table is unlogged && file not exists, rise
this message.
But the problem will comes when the standby is promoted to master, in that
case I didn't test what happens (file not exists, i think it will
couldn't insert data
directly or may force to the dba drop the table from catalog).


--=20
--
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Emanuel Calvo
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Helpame.com

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

01 June 2011, 15:30:18

On Thu, May 26, 2011 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011:
>>> I think we should emit the real cause in those cases, if possible (not too
>>> much overhead). The message would be "Unlogged table content is not available
>>> in standby server".
>
>> I guess what it should do is create an empty file in the slave.
>
> Probably it should, because won't the table malfunction after the slave
> is promoted to master, if there's no file at all there?  Or will the
> process of coming live create an empty file even if there was none?

Coming live creates an empty file.

> But Euler is pointing out a different issue, which is usability.  If the
> slave just acts like the table is present but empty, we are likely to
> get bug reports about that too.  An error telling you you aren't allowed
> to access such a table on slaves would be more user-friendly, if we can
> do it without too much pain.

I looked into this a bit.  A few observations:

(1) This problem is actually not confined to unlogged tables;
temporary tables have the same issue.  For example, if you create a
temporary table on the master and then, on the slave, do SELECT * FROM
 pg_temp_3.hi_mom (or whatever the name of the temp schema where the
temp table is) you get the same error.  In fact I suspect if you took
a base backup that included the temporary relation and matched the
backend ID you could even manage to read out the old contents (modulo
any fun and exciting XID wraparound issues).  But the problem is of
course more noticeable for unlogged tables since they're not hidden
away in a special funny schema.

(2) It's somewhat difficult to get a clean fix for this error because
it's coming from deep down in the bowels of the system, inside
mdnblocks().  At that point, we haven't got a Relation available, just
an SMgrRelation, so the information that this relation is unlogged is
not really available, at least not without some sort of gross
modularity violation like calling smgrexists() on the init fork.  It
doesn't particularly seem like a good idea to conditionalize the
behavior of mdnblocks() on relpersistence anyway; that's a pretty
gross modularity violation all by itself.

(3) mdnblocks() gets called twice during the process of doing a simple
select on a relation - once from the planner, via estimate_rel_size,
and again from the executor, via initscan.  A fairly obvious fix for
this problem is to skip the call to estimate_rel_size() if we're
dealing with a temporary or unlogged relation and are in recovery; and
make heap_beginscan_internal() throw a suitable error under similar
circumstances (or do we want to throw the error at plan time?  tossing
it out from all the way down in estimate_rel_size() seems a bit
wacky).  Patch taking this approach attached.

(4) It strikes me that it might be possible to address this problem a
bit more cleanly by allowing mdnblocks() and smgrnblocks() and
RelationGetNumberOfBlocksInFork() to take a boolean argument
indicating whether or not an error should be thrown if the underlying
physical file happens not to exist.  When no error is to be signaled,
we simply return 0 when the main fork doesn't exist, rather than
throwing an error.  We could then make estimate_rel_size() use this
variant, eliminating the need for an explicit test in
get_relation_info().  I'm sort of inclined to go this route even
though it would require touching a bit more code.  We've already
whacked around RelationGetNumberOfBlocks() a bit in this release (the
function that formerly had that name is now
RelationGetNumberOfBlocksInFork(), and RelationGetNumberOfBlocks() is
now a macro that calls that function w/MAIN_FORKNUM) so it doesn't
seem like a bad time to get any other API changes that might be useful
out of our system.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

reject-unlogged-during-recovery.patch

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

03 June 2011, 13:45:03

On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 26, 2011 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>>> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:0=
5 -0400 2011:
>>>> I think we should emit the real cause in those cases, if possible (not=
 too
>>>> much overhead). The message would be "Unlogged table content is not av=
ailable
>>>> in standby server".
>>
>>> I guess what it should do is create an empty file in the slave.
>>
>> Probably it should, because won't the table malfunction after the slave
>> is promoted to master, if there's no file at all there? =A0Or will the
>> process of coming live create an empty file even if there was none?
>
> Coming live creates an empty file.
>
>> But Euler is pointing out a different issue, which is usability. =A0If t=
he
>> slave just acts like the table is present but empty, we are likely to
>> get bug reports about that too. =A0An error telling you you aren't allow=
ed
>> to access such a table on slaves would be more user-friendly, if we can
>> do it without too much pain.
>
> I looked into this a bit. =A0A few observations:
>
> (1) This problem is actually not confined to unlogged tables;
> temporary tables have the same issue. =A0For example, if you create a
> temporary table on the master and then, on the slave, do SELECT * FROM
> =A0pg_temp_3.hi_mom (or whatever the name of the temp schema where the
> temp table is) you get the same error. =A0In fact I suspect if you took
> a base backup that included the temporary relation and matched the
> backend ID you could even manage to read out the old contents (modulo
> any fun and exciting XID wraparound issues). =A0But the problem is of
> course more noticeable for unlogged tables since they're not hidden
> away in a special funny schema.
>
> (2) It's somewhat difficult to get a clean fix for this error because
> it's coming from deep down in the bowels of the system, inside
> mdnblocks(). =A0At that point, we haven't got a Relation available, just
> an SMgrRelation, so the information that this relation is unlogged is
> not really available, at least not without some sort of gross
> modularity violation like calling smgrexists() on the init fork. =A0It
> doesn't particularly seem like a good idea to conditionalize the
> behavior of mdnblocks() on relpersistence anyway; that's a pretty
> gross modularity violation all by itself.
>
> (3) mdnblocks() gets called twice during the process of doing a simple
> select on a relation - once from the planner, via estimate_rel_size,
> and again from the executor, via initscan. =A0A fairly obvious fix for
> this problem is to skip the call to estimate_rel_size() if we're
> dealing with a temporary or unlogged relation and are in recovery; and
> make heap_beginscan_internal() throw a suitable error under similar
> circumstances (or do we want to throw the error at plan time? =A0tossing
> it out from all the way down in estimate_rel_size() seems a bit
> wacky). =A0Patch taking this approach attached.
>
> (4) It strikes me that it might be possible to address this problem a
> bit more cleanly by allowing mdnblocks() and smgrnblocks() and
> RelationGetNumberOfBlocksInFork() to take a boolean argument
> indicating whether or not an error should be thrown if the underlying
> physical file happens not to exist. =A0When no error is to be signaled,
> we simply return 0 when the main fork doesn't exist, rather than
> throwing an error. =A0We could then make estimate_rel_size() use this
> variant, eliminating the need for an explicit test in
> get_relation_info(). =A0I'm sort of inclined to go this route even
> though it would require touching a bit more code. =A0We've already
> whacked around RelationGetNumberOfBlocks() a bit in this release (the
> function that formerly had that name is now
> RelationGetNumberOfBlocksInFork(), and RelationGetNumberOfBlocks() is
> now a macro that calls that function w/MAIN_FORKNUM) so it doesn't
> seem like a bad time to get any other API changes that might be useful
> out of our system.
>
> Thoughts?

Anyone?  Anyone?  Bueller?

If we don't want to gum this with the above-mentioned cruft, the other
obvious alternative here is to do nothing, and live with the
non-beauty of the resulting error message.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6041: Unlogged table was created bad in slave node

From

Alvaro Herrera

Date:

03 June 2011, 14:02:16

Excerpts from Robert Haas's message of vie jun 03 12:44:45 -0400 2011:
> On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> > (4) It strikes me that it might be possible to address this problem a
> > bit more cleanly by allowing mdnblocks() and smgrnblocks() and
> > RelationGetNumberOfBlocksInFork() to take a boolean argument
> > indicating whether or not an error should be thrown if the underlying
> > physical file happens not to exist. Â When no error is to be signaled,
> > we simply return 0 when the main fork doesn't exist, rather than
> > throwing an error.
>
> If we don't want to gum this with the above-mentioned cruft, the other
> obvious alternative here is to do nothing, and live with the
> non-beauty of the resulting error message.

Option 4 seems reasonable to me ... can you get rid of the dupe
smgrnblocks call simultaneously?

--
Ãlvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

06 June 2011, 23:29:10

On Fri, Jun 3, 2011 at 1:01 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of vie jun 03 12:44:45 -0400 2011:
>> On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> > (4) It strikes me that it might be possible to address this problem a
>> > bit more cleanly by allowing mdnblocks() and smgrnblocks() and
>> > RelationGetNumberOfBlocksInFork() to take a boolean argument
>> > indicating whether or not an error should be thrown if the underlying
>> > physical file happens not to exist.  When no error is to be signaled,
>> > we simply return 0 when the main fork doesn't exist, rather than
>> > throwing an error.
>>
>> If we don't want to gum this with the above-mentioned cruft, the other
>> obvious alternative here is to do nothing, and live with the
>> non-beauty of the resulting error message.
>
> Option 4 seems reasonable to me ... can you get rid of the dupe
> smgrnblocks call simultaneously?

What dup smgrnblocks call?

Patch along these lines attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

reject-unlogged-during-recovery-v2.patch

Re: BUG #6041: Unlogged table was created bad in slave node

From

Simon Riggs

Date:

07 June 2011, 03:57:52

On Wed, Jun 1, 2011 at 7:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 26, 2011 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Alvaro Herrera <alvherre@commandprompt.com> writes:
>>> Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:0=
5 -0400 2011:
>>>> I think we should emit the real cause in those cases, if possible (not=
 too
>>>> much overhead). The message would be "Unlogged table content is not av=
ailable
>>>> in standby server".
>>
>>> I guess what it should do is create an empty file in the slave.
>>
>> Probably it should, because won't the table malfunction after the slave
>> is promoted to master, if there's no file at all there? =A0Or will the
>> process of coming live create an empty file even if there was none?
>
> Coming live creates an empty file.
>
>> But Euler is pointing out a different issue, which is usability. =A0If t=
he
>> slave just acts like the table is present but empty, we are likely to
>> get bug reports about that too. =A0An error telling you you aren't allow=
ed
>> to access such a table on slaves would be more user-friendly, if we can
>> do it without too much pain.
>
> I looked into this a bit. =A0A few observations:
>
> (1) This problem is actually not confined to unlogged tables;
> temporary tables have the same issue. =A0For example, if you create a
> temporary table on the master and then, on the slave, do SELECT * FROM
> =A0pg_temp_3.hi_mom (or whatever the name of the temp schema where the
> temp table is) you get the same error. =A0In fact I suspect if you took
> a base backup that included the temporary relation and matched the
> backend ID you could even manage to read out the old contents (modulo
> any fun and exciting XID wraparound issues). =A0But the problem is of
> course more noticeable for unlogged tables since they're not hidden
> away in a special funny schema.

Seems like you're trying to fix the problem directly, which as you
say, has problems.

At some point we resolve from a word mentioned in the FROM clause to a
relfilenode.

Surely somewhere there we can notice its unlogged before we end up
down in the guts of smgr?

--=20
=A0Simon Riggs=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 http:/=
/www.2ndQuadrant.com/
=A0PostgreSQL Development, 24x7 Support, Training & Services

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 09:16:13

On Tue, Jun 7, 2011 at 2:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Seems like you're trying to fix the problem directly, which as you
> say, has problems.
>
> At some point we resolve from a word mentioned in the FROM clause to a
> relfilenode.
>
> Surely somewhere there we can notice its unlogged before we end up
> down in the guts of smgr?

Probably.  I guess the question is whether we want this to fail in (a)
the parser, (b) the planner, or (c) the executor.  I'm not quite sure
what is best, but if I had to guess I would have picked (c) in
preference to (b) in preference to (a), and you seem to be proposing
(a).  Having parserOpenTable() or transformSelectStmt() or some such
place barf doesn't feel right - it's not the job of those functions to
decide whether the query can actually be executed at the moment, just
whether it's well-formed.  Failing in the planner seems better, but it
seems like a crude hack to have estimate_rel_size() be the place that
bails out just because that's where we happen to call smgrnblocks().
Also, it seems like we oughtn't error out if someone wants to, say,
PREPARE a query while running in HS mode and then wait until after the
server is promoted to EXECUTE it, which won't work if we fail in the
planner.  So that led me to the approach I took in the patch I posted
last night: tweak estimate_rel_size() just enough so it doesn't fail,
and then fail for real inside the executor.

This is not to say I'm not open to other ideas, but just to give a
brief history of how I ended up where I am.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6041: Unlogged table was created bad in slave node

From

Alvaro Herrera

Date:

07 June 2011, 11:24:28

Excerpts from Robert Haas's message of mar jun 07 08:16:01 -0400 2011:
> On Tue, Jun 7, 2011 at 2:57 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > Seems like you're trying to fix the problem directly, which as you
> > say, has problems.
> >
> > At some point we resolve from a word mentioned in the FROM clause to a
> > relfilenode.
> >
> > Surely somewhere there we can notice its unlogged before we end up
> > down in the guts of smgr?
>
> Probably.  I guess the question is whether we want this to fail in (a)
> the parser, (b) the planner, or (c) the executor.  I'm not quite sure
> what is best, but if I had to guess I would have picked (c) in
> preference to (b) in preference to (a), and you seem to be proposing
> (a).  Having parserOpenTable() or transformSelectStmt() or some such
> place barf doesn't feel right - it's not the job of those functions to
> decide whether the query can actually be executed at the moment, just
> whether it's well-formed.

Really?  I thought it was the job of the parse analysis phase to figure
out if table and column names are valid or not, and such.  If that's the
case, wouldn't it make sense to disallow usage of a table that doesn't
"exist" in a certain sense?

--
Ãlvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #6041: Unlogged table was created bad in slave node

From

Tom Lane

Date:

07 June 2011, 11:37:46

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Excerpts from Robert Haas's message of mar jun 07 08:16:01 -0400 2011:
>> Probably.  I guess the question is whether we want this to fail in (a)
>> the parser, (b) the planner, or (c) the executor.

> Really?  I thought it was the job of the parse analysis phase to figure
> out if table and column names are valid or not, and such.  If that's the
> case, wouldn't it make sense to disallow usage of a table that doesn't
> "exist" in a certain sense?

If you hope ever to support the proposed UNLOGGED-to-LOGGED or vice
versa table state changes, you don't want to be testing this in the
parser.  It has to be done at plan or execute time.

            regards, tom lane

Re: BUG #6041: Unlogged table was created bad in slave node

From

Tom Lane

Date:

07 June 2011, 12:50:40

Robert Haas <robertmhaas@gmail.com> writes:
> Patch along these lines attached.

Frankly, I find this quite ugly, and much prefer the general approach of
your previous patch in <BANLkTim433vF5HWjbJ0FSWm_-xA8DDaGNg@mail.gmail.com>.

However, I don't like where you put the execution-time test there.  I'd
put it in ExecOpenScanRelation instead, so that it covers both seqscan
and indexscan accesses.

            regards, tom lane

Re: BUG #6041: Unlogged table was created bad in slave node

From

Andres Freund

Date:

07 June 2011, 14:43:00

On Tuesday, June 07, 2011 04:29:02 Robert Haas wrote:
> On Fri, Jun 3, 2011 at 1:01 PM, Alvaro Herrera
>
> <alvherre@commandprompt.com> wrote:
> > Excerpts from Robert Haas's message of vie jun 03 12:44:45 -0400 2011:
> >> On Wed, Jun 1, 2011 at 2:28 PM, Robert Haas <robertmhaas@gmail.com>
wrote:
> >> > (4) It strikes me that it might be possible to address this problem a
> >> > bit more cleanly by allowing mdnblocks() and smgrnblocks() and
> >> > RelationGetNumberOfBlocksInFork() to take a boolean argument
> >> > indicating whether or not an error should be thrown if the underlying
> >> > physical file happens not to exist.  When no error is to be signaled,
> >> > we simply return 0 when the main fork doesn't exist, rather than
> >> > throwing an error.
> >>
> >> If we don't want to gum this with the above-mentioned cruft, the other
> >> obvious alternative here is to do nothing, and live with the
> >> non-beauty of the resulting error message.
> >
> > Option 4 seems reasonable to me ... can you get rid of the dupe
> > smgrnblocks call simultaneously?
>
> What dup smgrnblocks call?
>
> Patch along these lines attached.
> diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
> index b8fc87e..edd1674 100644
> --- a/src/include/storage/bufmgr.h
> +++ b/src/include/storage/bufmgr.h
> @@ -186,7 +186,7 @@ extern void DropRelFileNodeBuffers(RelFileNodeBackend
> rnode,
>
>  extern void DropDatabaseBuffers(Oid dbid);
>
>  #define RelationGetNumberOfBlocks(reln) \
>
> -    RelationGetNumberOfBlocksInFork(reln, MAIN_FORKNUM)
> +    RelationGetNumberOfBlocksInFork(reln, MAIN_FORKNUM, true)
>
>  #ifdef NOT_USED
>  extern void PrintPinnedBufs(void);

That hunk seems to be a bit dangerous given that RelationGetNumberOfBlocks is
used in the executor. Maybe all the callsites are actually safe but it seems
to be too fragile to me.
In my opinion RelationGetNumberOfBlocks should grow missing_ok as well.

Andres

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 15:05:19

On Tue, Jun 7, 2011 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Patch along these lines attached.
>
> Frankly, I find this quite ugly, and much prefer the general approach of
> your previous patch in <BANLkTim433vF5HWjbJ0FSWm_-xA8DDaGNg@mail.gmail.co=
m>.
>
> However, I don't like where you put the execution-time test there. =A0I'd
> put it in ExecOpenScanRelation instead, so that it covers both seqscan
> and indexscan accesses.

Ah, OK.  I was wondering if there was a better place.  I'll do it that
way, then.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 16:25:03

On Tue, Jun 7, 2011 at 2:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 7, 2011 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Patch along these lines attached.
>>
>> Frankly, I find this quite ugly, and much prefer the general approach of
>> your previous patch in <BANLkTim433vF5HWjbJ0FSWm_-xA8DDaGNg@mail.gmail.com>.
>>
>> However, I don't like where you put the execution-time test there.  I'd
>> put it in ExecOpenScanRelation instead, so that it covers both seqscan
>> and indexscan accesses.
>
> Ah, OK.  I was wondering if there was a better place.  I'll do it that
> way, then.

I found a few other holes in my previous patch as well.  I think this
plugs them all, but it's hard to be sure there aren't any other calls
to RelationGetNumberOfBlocks() that could bomb out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

reject-unlogged-during-recovery-v3.patch

Re: BUG #6041: Unlogged table was created bad in slave node

From

Tom Lane

Date:

07 June 2011, 16:54:11

Robert Haas <robertmhaas@gmail.com> writes:
> I found a few other holes in my previous patch as well.  I think this
> plugs them all, but it's hard to be sure there aren't any other calls
> to RelationGetNumberOfBlocks() that could bomb out.

[ squint... ]  Do we need those additional tests in plancat.c?  I
haven't paid attention to whether we support unlogged indexes on logged
tables, but if we do, protecting the RelationGetNumberOfBlocks() call is
the least of your worries.  You ought to be fixing things so the planner
won't consider the index valid at all (cf. the indisvalid test at line
165).  Similarly, the change in estimate_rel_size seems to be at an
awfully low level, akin to locking the barn door after the horses are
out.  What code path are you thinking will reach there on an unlogged
table?

It might be that it'd be best just to have both the planner and executor
throwing errors on unlogged tables, rather than rejiggering pieces of
the planner to sort-of not fail on an unlogged table.

            regards, tom lane

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 18:00:52

On Tue, Jun 7, 2011 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I found a few other holes in my previous patch as well. =A0I think this
>> plugs them all, but it's hard to be sure there aren't any other calls
>> to RelationGetNumberOfBlocks() that could bomb out.
>
> [ squint... ] =A0Do we need those additional tests in plancat.c? =A0I
> haven't paid attention to whether we support unlogged indexes on logged
> tables, but if we do, protecting the RelationGetNumberOfBlocks() call is
> the least of your worries. =A0You ought to be fixing things so the planner
> won't consider the index valid at all (cf. the indisvalid test at line
> 165).

Right now, RelationNeedsWAL() is always the same for a table and for
an index belonging to that table.  That is, indexes on temporary
tables are temporary; indees on unlogged tables are unlogged; indexes
on permanent tables are permanent.  But I agree that's something we'll
have to deal with if and when someone implements unlogged indexes on
logged tables.  (Though frankly I hope someone will come up with a
better name for that; else it's going to be worse than
constraint_exclusion vs. exclusion constraints.)

> Similarly, the change in estimate_rel_size seems to be at an
> awfully low level, akin to locking the barn door after the horses are
> out. =A0What code path are you thinking will reach there on an unlogged
> table?

Well, it gets there; I found this out empirically.
get_relation_info() calls it in two different places.  Actually, I see
now that the v3 patch has a few leftovers: the test in
estimate_relation_size() makes the first of the two checks in
get_relaton_info() redundant -- but the second hunk in
get_relation_info() is needed, because there it calls
RelationGetNumberOfBlocks() directly.  This is why I thought it might
be better to provide a version of RelationGetNumberOfBlocks() that
doesn't fail if the file is missing, instead of trying to plug these
holes one by one.

> It might be that it'd be best just to have both the planner and executor
> throwing errors on unlogged tables, rather than rejiggering pieces of
> the planner to sort-of not fail on an unlogged table.

Mmm, that's not a bad thought either.  Although I think if we can be
certain that the planner will error out, the executor checks aren't
necessary.  It would disallow preparing a statement and then executing
it after promotion, but that doesn't seem terribly important.  Any
idea where to put the check?

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6041: Unlogged table was created bad in slave node

From

Tom Lane

Date:

07 June 2011, 18:05:17

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jun 7, 2011 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It might be that it'd be best just to have both the planner and executor
>> throwing errors on unlogged tables, rather than rejiggering pieces of
>> the planner to sort-of not fail on an unlogged table.

> Mmm, that's not a bad thought either.  Although I think if we can be
> certain that the planner will error out, the executor checks aren't
> necessary.  It would disallow preparing a statement and then executing
> it after promotion, but that doesn't seem terribly important.  Any
> idea where to put the check?

Well, I'd recommend keeping the test in ExecOpenScanRelation, since it's
cheap insurance against the situation changing since the plan was made.
But for the planner, why not just put the same kind of test in
get_relation_info, just after it does heap_open?

            regards, tom lane

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 19:06:28

On Tue, Jun 7, 2011 at 5:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Jun 7, 2011 at 3:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It might be that it'd be best just to have both the planner and executor
>>> throwing errors on unlogged tables, rather than rejiggering pieces of
>>> the planner to sort-of not fail on an unlogged table.
>
>> Mmm, that's not a bad thought either. =A0Although I think if we can be
>> certain that the planner will error out, the executor checks aren't
>> necessary. =A0It would disallow preparing a statement and then executing
>> it after promotion, but that doesn't seem terribly important. =A0Any
>> idea where to put the check?
>
> Well, I'd recommend keeping the test in ExecOpenScanRelation, since it's
> cheap insurance against the situation changing since the plan was made.

Well, the system can't very well go back into recovery once it's
emerged.  I guess it's possible that a table could be switched from
LOGGED to UNLOGGED during recovery though, in some hypothetical future
release.  No one's proposed that feature yet, but since there IS a
proposal to go the other way it doesn't seem unlikely we may see the
other direction eventually too.  I can't get too excited about
blocking this in multiple places just for the sake of preserving a
nicer error message in the face of a possible future feature
development, though.

> But for the planner, why not just put the same kind of test in
> get_relation_info, just after it does heap_open?

OK, let me try that.

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #6041: Unlogged table was created bad in slave node

From

Robert Haas

Date:

07 June 2011, 23:17:56

On Tue, Jun 7, 2011 at 6:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> But for the planner, why not just put the same kind of test in
>> get_relation_info, just after it does heap_open?
>
> OK, let me try that.

Seems to work beautifully, so committed that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company