Thread: pg_basebackup failure after setting default_table_access_method option
Hi,
I noticed pg_basebackup failure when default_table_access_method option is set.
Test steps:
Step 1: Init database
./initdb -D data
Step 2: Start Server
./postgres -D data &
Step 3: Set Guc option
export PGOPTIONS='-c default_table_access_method=zheap'
Step 4: Peform backup
/pg_basebackup -D backup -p 5432 --no-sync
2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without having selected a database
pg_basebackup: error: could not connect to server: FATAL: cannot read pg_class without having selected a database
Reason why it is failing:
pg_basebackup does not use any database to connect to server as it backs up the whole data instance.
As the option default_table_access_method is set.
It tries to validate this option, but while validating the option in ScanPgRelation function:
if (!OidIsValid(MyDatabaseId))
elog(FATAL, "cannot read pg_class without having selected a database");
Here as pg_basebackup uses no database the command fails.
Fix:
The patch has the fix for the above issue:
Let me know your opinion on this.
Regards,
vignesh
vignesh
Attachment
Re: pg_basebackup failure after setting default_table_access_method option
From
Haribabu Kommi
Date:
On Thu, Jun 6, 2019 at 1:46 AM vignesh C <vignesh21@gmail.com> wrote:
Hi,I noticed pg_basebackup failure when default_table_access_method option is set.Test steps:Step 1: Init database./initdb -D dataStep 2: Start Server./postgres -D data &Step 3: Set Guc optionexport PGOPTIONS='-c default_table_access_method=zheap'Step 4: Peform backup/pg_basebackup -D backup -p 5432 --no-sync2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without having selected a databasepg_basebackup: error: could not connect to server: FATAL: cannot read pg_class without having selected a databaseReason why it is failing:pg_basebackup does not use any database to connect to server as it backs up the whole data instance.As the option default_table_access_method is set.It tries to validate this option, but while validating the option in ScanPgRelation function:if (!OidIsValid(MyDatabaseId))elog(FATAL, "cannot read pg_class without having selected a database");Here as pg_basebackup uses no database the command fails.
Thanks for the details steps to reproduce the bug, I am also able to reproduce the problem.
Fix:The patch has the fix for the above issue:Let me know your opinion on this.
Thanks for the patch and it fixes the problem.
Regards,
Haribabu Kommi
Fujitsu Australia
Re: pg_basebackup failure after setting default_table_access_methodoption
From
Michael Paquier
Date:
On Thu, Jun 06, 2019 at 11:19:48AM +1000, Haribabu Kommi wrote: > Thanks for the details steps to reproduce the bug, I am also able to > reproduce the problem. This way is even more simple, no need for zheap to be around: =# create access method heap2 TYPE table HANDLER heap_tableam_handler; CREATE ACCESS METHOD And then: PGOPTIONS="-c default_table_access_method=heap2" psql "replication=1" psql: error: could not connect to server: FATAL: cannot read pg_class without having selected a database > Thanks for the patch and it fixes the problem. I was wondering if we actually need at all a catalog lookup at this stage, simplifying get_table_am_oid() on the way so as we always throw an error (its missing_ok is here to allow a proper error in the GUC context). The table AM lookup happens only when creating a table, so we could just get a failure when attempting to create a table with this incorrect value. Actually, when updating a value and reloading and/or restarting the server, it is possible to easily get in a state where we have an invalid table AM parameter stored in the GUC, which is what the callback is here to avoid. If you attempt to update the parameter with ALTER SYSTEM, then the command complains. So it seems to me that the user experience is inconsistent. -- Michael
Attachment
On Thu, Jun 6, 2019 at 6:50 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Thu, Jun 6, 2019 at 1:46 AM vignesh C <vignesh21@gmail.com> wrote:Hi,I noticed pg_basebackup failure when default_table_access_method option is set.Test steps:Step 1: Init database./initdb -D dataStep 2: Start Server./postgres -D data &Step 3: Set Guc optionexport PGOPTIONS='-c default_table_access_method=zheap'Step 4: Peform backup/pg_basebackup -D backup -p 5432 --no-sync2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without having selected a databasepg_basebackup: error: could not connect to server: FATAL: cannot read pg_class without having selected a databaseReason why it is failing:pg_basebackup does not use any database to connect to server as it backs up the whole data instance.As the option default_table_access_method is set.It tries to validate this option, but while validating the option in ScanPgRelation function:if (!OidIsValid(MyDatabaseId))elog(FATAL, "cannot read pg_class without having selected a database");Here as pg_basebackup uses no database the command fails.Thanks for the details steps to reproduce the bug, I am also able to reproduce the problem.Fix:The patch has the fix for the above issue:Let me know your opinion on this.Thanks for the patch and it fixes the problem.Regards,Haribabu KommiFujitsu Australia
Regards,
vignesh
Have a nice day
vignesh
Have a nice day
> On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote: > > I was wondering if we actually need at all a catalog lookup at this > stage, simplifying get_table_am_oid() on the way so as we always > throw an error (its missing_ok is here to allow a proper error in the > GUC context). Just for me to understand, do you suggest to not check default_table_access_method existence in check_default_table_access_method? If yes, then > The table AM lookup happens only when creating a table, so we could just get > a failure when attempting to create a table with this incorrect value. is correct, but doesn't it leave the room for some problems in the future with a wrong assumptions about correctness of default_table_access_method? > Actually, when updating a value and reloading and/or restarting the > server, it is possible to easily get in a state where we have an > invalid table AM parameter stored in the GUC, which is what the > callback is here to avoid. If you attempt to update the parameter > with ALTER SYSTEM, then the command complains. So it seems to me that > the user experience is inconsistent. Right, as far as I see the there is the same for e.g. default_tablespace due to IsTransactionState condition. In fact looks like one can see the very same issue with this option too, so probably it also needs to have MyDatabaseId check.
Hi, On 2019-06-06 16:06:36 +0900, Michael Paquier wrote: > On Thu, Jun 06, 2019 at 11:19:48AM +1000, Haribabu Kommi wrote: > > Thanks for the details steps to reproduce the bug, I am also able to > > reproduce the problem. > > This way is even more simple, no need for zheap to be around: > =# create access method heap2 TYPE table HANDLER heap_tableam_handler; > CREATE ACCESS METHOD > And then: > PGOPTIONS="-c default_table_access_method=heap2" psql "replication=1" > psql: error: could not connect to server: FATAL: cannot read pg_class > without having selected a database Yea, need to fix that. > > Thanks for the patch and it fixes the problem. > > I was wondering if we actually need at all a catalog lookup at this > stage, simplifying get_table_am_oid() on the way so as we always > throw an error (its missing_ok is here to allow a proper error in the > GUC context). The table AM lookup happens only when creating a table, > so we could just get a failure when attempting to create a table with > this incorrect value. I think that'd be a bad plan. We check other such GUCs, e.g. default_tablespace where this behaviour has been copied from, even if not bulletproof. > Actually, when updating a value and reloading and/or restarting the > server, it is possible to easily get in a state where we have an > invalid table AM parameter stored in the GUC, which is what the > callback is here to avoid. We have plenty other callbacks that aren't bulletproof, so I don't think this is really something we should / can change in isolation here. Greetings, Andres Freund
Hi, On 2019-06-08 16:03:09 +0200, Dmitry Dolgov wrote: > > On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote: > > The table AM lookup happens only when creating a table, so we could just get > > a failure when attempting to create a table with this incorrect value. > > is correct, but doesn't it leave the room for some problems in the future with > a wrong assumptions about correctness of default_table_access_method? What do you mean by that? Every single use of default_table_access_method (and similarly default_tablespace) has to check the value, because it could be outdated / not checked due to wrong context. Greetings, Andres Freund
> On Sat, Jun 8, 2019 at 5:30 PM Andres Freund <andres@anarazel.de> wrote: > > On 2019-06-08 16:03:09 +0200, Dmitry Dolgov wrote: > > > On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote: > > > The table AM lookup happens only when creating a table, so we could just get > > > a failure when attempting to create a table with this incorrect value. > > > > is correct, but doesn't it leave the room for some problems in the future with > > a wrong assumptions about correctness of default_table_access_method? > > What do you mean by that? I didn't have any particular problem in mind, just an abstract and probably wrong observation. One more observation is that this > Every single use of default_table_access_method (and similarly > default_tablespace) has to check the value, because it could be outdated / > not checked due to wrong context. for default_tablespace clearly expressed in GetDefaultTablespace function (if you see something like that, obviously you better use it), but there is nothing like similar for default_table_access_method so one have to keep it in mind (although of course it's not a problem so far, since it's being used in only one place).
Re: pg_basebackup failure after setting default_table_access_methodoption
From
Michael Paquier
Date:
On Sat, Jun 08, 2019 at 08:26:07AM -0700, Andres Freund wrote: > We have plenty other callbacks that aren't bulletproof, so I don't think > this is really something we should / can change in isolation here. Good point. I was looking at the check callbacks in guc.c for similar changes, and missed these. After testing, only these parameters fail with the same error: - default_table_access_method - default_text_search_config For the second one it's much older. -- Michael
Attachment
Hi, On 2019-06-10 16:37:33 +0900, Michael Paquier wrote: > On Sat, Jun 08, 2019 at 08:26:07AM -0700, Andres Freund wrote: > > We have plenty other callbacks that aren't bulletproof, so I don't think > > this is really something we should / can change in isolation here. > > Good point. I was looking at the check callbacks in guc.c for similar > changes, and missed these. After testing, only these parameters fail > with the same error: > - default_table_access_method > - default_text_search_config > > For the second one it's much older. Yea, that's where the default_table_access_method code originates from, obviously. I'll backpatch the default_text_search_config fix separately (and first). Greetings, Andres Freund
Re: pg_basebackup failure after setting default_table_access_methodoption
From
Michael Paquier
Date:
On Mon, Jun 10, 2019 at 10:33:37PM -0700, Andres Freund wrote: > Yea, that's where the default_table_access_method code originates from, > obviously. I'll backpatch the default_text_search_config fix separately > (and first). So you are just planning to add a check on MyDatabaseId for both? No objections to that. -- Michael
Attachment
Hi, On 2019-06-11 14:56:36 +0900, Michael Paquier wrote: > On Mon, Jun 10, 2019 at 10:33:37PM -0700, Andres Freund wrote: > > Yea, that's where the default_table_access_method code originates from, > > obviously. I'll backpatch the default_text_search_config fix separately > > (and first). > > So you are just planning to add a check on MyDatabaseId for both? No > objections to that. Well, all four. Given it's just copied code I don't see much code in splitting the commit anymore. I noticed some other uglyness: check_timezone calls interval_in(), without any checks. Not a huge fan of doing all that in postmaster, even leaving the wrong error reporting aside :(. But that seems like a plenty different enough issue to fix it separately, if we decide we want to do so. Greetings, Andres Freund
Re: pg_basebackup failure after setting default_table_access_methodoption
From
Michael Paquier
Date:
On Mon, Jun 10, 2019 at 11:49:03PM -0700, Andres Freund wrote: > Well, all four. Given it's just copied code I don't see much code in > splitting the commit anymore. Thanks for pushing the fix, the result looks fine. > I noticed some other uglyness: check_timezone calls interval_in(), > without any checks. Not a huge fan of doing all that in postmaster, even > leaving the wrong error reporting aside :(. But that seems like a > plenty different enough issue to fix it separately, if we decide we want > to do so. Indeed, I have not noticed this one :( -- Michael