pg_basebackup failure after setting default_table_access_method option

Started by vignesh Cover 6 years ago13 messages
#1vignesh C
vignesh21@gmail.com
1 attachment(s)

Hi,

*I noticed pg_basebackup failure when default_table_access_method option is
set.*

*Test steps:*

Step 1: Init database
./initdb -D data

Step 2: Start Server
./postgres -D data &

Step 3: Set Guc option
export PGOPTIONS='-c default_table_access_method=zheap'

Step 4: Peform backup
/pg_basebackup -D backup -p 5432 --no-sync
2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without
having selected a database
pg_basebackup: error: could not connect to server: FATAL: cannot read
pg_class without having selected a database

*Reason why it is failing:*
pg_basebackup does not use any database to connect to server as it backs up
the whole data instance.
As the option default_table_access_method is set.
It tries to validate this option, but while validating the option in
ScanPgRelation function:
if (!OidIsValid(MyDatabaseId))
elog(FATAL, "cannot read pg_class without having selected a database");

Here as pg_basebackup uses no database the command fails.

Fix:
The patch has the fix for the above issue:

Let me know your opinion on this.

--
Regards,
vignesh
EnterpriseDB: http://www.enterprisedb.com

Attachments:

0001-pg-backup-guc-option-validation-failure.patchapplication/octet-stream; name=0001-pg-backup-guc-option-validation-failure.patchDownload
diff --git a/src/backend/access/table/tableamapi.c b/src/backend/access/table/tableamapi.c
index 32877e7..8277fd2 100644
--- a/src/backend/access/table/tableamapi.c
+++ b/src/backend/access/table/tableamapi.c
@@ -21,6 +21,7 @@
 #include "utils/fmgroids.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
+#include "miscadmin.h"
 
 
 /*
@@ -121,7 +122,7 @@ check_default_table_access_method(char **newval, void **extra, GucSource source)
 	 * If we aren't inside a transaction, we cannot do database access so
 	 * cannot verify the name.  Must accept the value on faith.
 	 */
-	if (IsTransactionState())
+	if (IsTransactionState() && MyDatabaseId != InvalidOid)
 	{
 		if (!OidIsValid(get_table_am_oid(*newval, true)))
 		{
#2Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: vignesh C (#1)
Re: pg_basebackup failure after setting default_table_access_method option

On Thu, Jun 6, 2019 at 1:46 AM vignesh C <vignesh21@gmail.com> wrote:

Hi,

*I noticed pg_basebackup failure when default_table_access_method option
is set.*

*Test steps:*

Step 1: Init database
./initdb -D data

Step 2: Start Server
./postgres -D data &

Step 3: Set Guc option
export PGOPTIONS='-c default_table_access_method=zheap'

Step 4: Peform backup
/pg_basebackup -D backup -p 5432 --no-sync
2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without
having selected a database
pg_basebackup: error: could not connect to server: FATAL: cannot read
pg_class without having selected a database

*Reason why it is failing:*
pg_basebackup does not use any database to connect to server as it backs
up the whole data instance.
As the option default_table_access_method is set.
It tries to validate this option, but while validating the option in
ScanPgRelation function:
if (!OidIsValid(MyDatabaseId))
elog(FATAL, "cannot read pg_class without having selected a database");

Here as pg_basebackup uses no database the command fails.

Thanks for the details steps to reproduce the bug, I am also able to
reproduce the problem.

Fix:
The patch has the fix for the above issue:

Let me know your opinion on this.

Thanks for the patch and it fixes the problem.

Regards,
Haribabu Kommi
Fujitsu Australia

#3Michael Paquier
michael@paquier.xyz
In reply to: Haribabu Kommi (#2)
Re: pg_basebackup failure after setting default_table_access_method option

On Thu, Jun 06, 2019 at 11:19:48AM +1000, Haribabu Kommi wrote:

Thanks for the details steps to reproduce the bug, I am also able to
reproduce the problem.

This way is even more simple, no need for zheap to be around:
=# create access method heap2 TYPE table HANDLER heap_tableam_handler;
CREATE ACCESS METHOD
And then:
PGOPTIONS="-c default_table_access_method=heap2" psql "replication=1"
psql: error: could not connect to server: FATAL: cannot read pg_class
without having selected a database

Thanks for the patch and it fixes the problem.

I was wondering if we actually need at all a catalog lookup at this
stage, simplifying get_table_am_oid() on the way so as we always
throw an error (its missing_ok is here to allow a proper error in the
GUC context). The table AM lookup happens only when creating a table,
so we could just get a failure when attempting to create a table with
this incorrect value.

Actually, when updating a value and reloading and/or restarting the
server, it is possible to easily get in a state where we have an
invalid table AM parameter stored in the GUC, which is what the
callback is here to avoid. If you attempt to update the parameter
with ALTER SYSTEM, then the command complains. So it seems to me that
the user experience is inconsistent.
--
Michael

#4vignesh C
vignesh21@gmail.com
In reply to: Haribabu Kommi (#2)
Re: pg_basebackup failure after setting default_table_access_method option

Thanks Hari for helping in verifying.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

On Thu, Jun 6, 2019 at 6:50 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:

On Thu, Jun 6, 2019 at 1:46 AM vignesh C <vignesh21@gmail.com> wrote:

Hi,

*I noticed pg_basebackup failure when default_table_access_method option
is set.*

*Test steps:*

Step 1: Init database
./initdb -D data

Step 2: Start Server
./postgres -D data &

Step 3: Set Guc option
export PGOPTIONS='-c default_table_access_method=zheap'

Step 4: Peform backup
/pg_basebackup -D backup -p 5432 --no-sync
2019-06-05 20:35:04.088 IST [11601] FATAL: cannot read pg_class without
having selected a database
pg_basebackup: error: could not connect to server: FATAL: cannot read
pg_class without having selected a database

*Reason why it is failing:*
pg_basebackup does not use any database to connect to server as it backs
up the whole data instance.
As the option default_table_access_method is set.
It tries to validate this option, but while validating the option in
ScanPgRelation function:
if (!OidIsValid(MyDatabaseId))
elog(FATAL, "cannot read pg_class without having selected a database");

Here as pg_basebackup uses no database the command fails.

Thanks for the details steps to reproduce the bug, I am also able to
reproduce the problem.

Fix:
The patch has the fix for the above issue:

Let me know your opinion on this.

Thanks for the patch and it fixes the problem.

Regards,
Haribabu Kommi
Fujitsu Australia

--
Regards,
vignesh
Have a nice day

#5Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Michael Paquier (#3)
Re: pg_basebackup failure after setting default_table_access_method option

On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote:

I was wondering if we actually need at all a catalog lookup at this
stage, simplifying get_table_am_oid() on the way so as we always
throw an error (its missing_ok is here to allow a proper error in the
GUC context).

Just for me to understand, do you suggest to not check
default_table_access_method existence in check_default_table_access_method? If
yes, then

The table AM lookup happens only when creating a table, so we could just get
a failure when attempting to create a table with this incorrect value.

is correct, but doesn't it leave the room for some problems in the future with
a wrong assumptions about correctness of default_table_access_method?

Actually, when updating a value and reloading and/or restarting the
server, it is possible to easily get in a state where we have an
invalid table AM parameter stored in the GUC, which is what the
callback is here to avoid. If you attempt to update the parameter
with ALTER SYSTEM, then the command complains. So it seems to me that
the user experience is inconsistent.

Right, as far as I see the there is the same for e.g. default_tablespace due to
IsTransactionState condition. In fact looks like one can see the very same
issue with this option too, so probably it also needs to have MyDatabaseId
check.

#6Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#3)
Re: pg_basebackup failure after setting default_table_access_method option

Hi,

On 2019-06-06 16:06:36 +0900, Michael Paquier wrote:

On Thu, Jun 06, 2019 at 11:19:48AM +1000, Haribabu Kommi wrote:

Thanks for the details steps to reproduce the bug, I am also able to
reproduce the problem.

This way is even more simple, no need for zheap to be around:
=# create access method heap2 TYPE table HANDLER heap_tableam_handler;
CREATE ACCESS METHOD
And then:
PGOPTIONS="-c default_table_access_method=heap2" psql "replication=1"
psql: error: could not connect to server: FATAL: cannot read pg_class
without having selected a database

Yea, need to fix that.

Thanks for the patch and it fixes the problem.

I was wondering if we actually need at all a catalog lookup at this
stage, simplifying get_table_am_oid() on the way so as we always
throw an error (its missing_ok is here to allow a proper error in the
GUC context). The table AM lookup happens only when creating a table,
so we could just get a failure when attempting to create a table with
this incorrect value.

I think that'd be a bad plan. We check other such GUCs,
e.g. default_tablespace where this behaviour has been copied from, even
if not bulletproof.

Actually, when updating a value and reloading and/or restarting the
server, it is possible to easily get in a state where we have an
invalid table AM parameter stored in the GUC, which is what the
callback is here to avoid.

We have plenty other callbacks that aren't bulletproof, so I don't think
this is really something we should / can change in isolation here.

Greetings,

Andres Freund

#7Andres Freund
andres@anarazel.de
In reply to: Dmitry Dolgov (#5)
Re: pg_basebackup failure after setting default_table_access_method option

Hi,

On 2019-06-08 16:03:09 +0200, Dmitry Dolgov wrote:

On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote:
The table AM lookup happens only when creating a table, so we could just get
a failure when attempting to create a table with this incorrect value.

is correct, but doesn't it leave the room for some problems in the future with
a wrong assumptions about correctness of default_table_access_method?

What do you mean by that? Every single use of
default_table_access_method (and similarly default_tablespace) has to
check the value, because it could be outdated / not checked due to wrong
context.

Greetings,

Andres Freund

#8Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Andres Freund (#7)
Re: pg_basebackup failure after setting default_table_access_method option

On Sat, Jun 8, 2019 at 5:30 PM Andres Freund <andres@anarazel.de> wrote:

On 2019-06-08 16:03:09 +0200, Dmitry Dolgov wrote:

On Thu, Jun 6, 2019 at 9:06 AM Michael Paquier <michael@paquier.xyz> wrote:
The table AM lookup happens only when creating a table, so we could just get
a failure when attempting to create a table with this incorrect value.

is correct, but doesn't it leave the room for some problems in the future with
a wrong assumptions about correctness of default_table_access_method?

What do you mean by that?

I didn't have any particular problem in mind, just an abstract and probably
wrong observation. One more observation is that this

Every single use of default_table_access_method (and similarly
default_tablespace) has to check the value, because it could be outdated /
not checked due to wrong context.

for default_tablespace clearly expressed in GetDefaultTablespace function (if
you see something like that, obviously you better use it), but there is nothing
like similar for default_table_access_method so one have to keep it in mind
(although of course it's not a problem so far, since it's being used in only
one place).

#9Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#6)
Re: pg_basebackup failure after setting default_table_access_method option

On Sat, Jun 08, 2019 at 08:26:07AM -0700, Andres Freund wrote:

We have plenty other callbacks that aren't bulletproof, so I don't think
this is really something we should / can change in isolation here.

Good point. I was looking at the check callbacks in guc.c for similar
changes, and missed these. After testing, only these parameters fail
with the same error:
- default_table_access_method
- default_text_search_config

For the second one it's much older.
--
Michael

#10Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#9)
Re: pg_basebackup failure after setting default_table_access_method option

Hi,

On 2019-06-10 16:37:33 +0900, Michael Paquier wrote:

On Sat, Jun 08, 2019 at 08:26:07AM -0700, Andres Freund wrote:

We have plenty other callbacks that aren't bulletproof, so I don't think
this is really something we should / can change in isolation here.

Good point. I was looking at the check callbacks in guc.c for similar
changes, and missed these. After testing, only these parameters fail
with the same error:
- default_table_access_method
- default_text_search_config

For the second one it's much older.

Yea, that's where the default_table_access_method code originates from,
obviously. I'll backpatch the default_text_search_config fix separately
(and first).

Greetings,

Andres Freund

#11Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#10)
Re: pg_basebackup failure after setting default_table_access_method option

On Mon, Jun 10, 2019 at 10:33:37PM -0700, Andres Freund wrote:

Yea, that's where the default_table_access_method code originates from,
obviously. I'll backpatch the default_text_search_config fix separately
(and first).

So you are just planning to add a check on MyDatabaseId for both? No
objections to that.
--
Michael

#12Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#11)
Re: pg_basebackup failure after setting default_table_access_method option

Hi,

On 2019-06-11 14:56:36 +0900, Michael Paquier wrote:

On Mon, Jun 10, 2019 at 10:33:37PM -0700, Andres Freund wrote:

Yea, that's where the default_table_access_method code originates from,
obviously. I'll backpatch the default_text_search_config fix separately
(and first).

So you are just planning to add a check on MyDatabaseId for both? No
objections to that.

Well, all four. Given it's just copied code I don't see much code in
splitting the commit anymore.

I noticed some other uglyness: check_timezone calls interval_in(),
without any checks. Not a huge fan of doing all that in postmaster, even
leaving the wrong error reporting aside :(. But that seems like a
plenty different enough issue to fix it separately, if we decide we want
to do so.

Greetings,

Andres Freund

#13Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#12)
Re: pg_basebackup failure after setting default_table_access_method option

On Mon, Jun 10, 2019 at 11:49:03PM -0700, Andres Freund wrote:

Well, all four. Given it's just copied code I don't see much code in
splitting the commit anymore.

Thanks for pushing the fix, the result looks fine.

I noticed some other uglyness: check_timezone calls interval_in(),
without any checks. Not a huge fan of doing all that in postmaster, even
leaving the wrong error reporting aside :(. But that seems like a
plenty different enough issue to fix it separately, if we decide we want
to do so.

Indeed, I have not noticed this one :(
--
Michael