cache lookup errors for missing replication origins

Started by Michael Paquierover 8 years ago5 messages
#1Michael Paquier
michael.paquier@gmail.com

Hi all,

Cache lookup errors with elog() can be triggered easily by users at
SQL level using replication origin functions:
=# select pg_replication_origin_advance('popo', '0/1');
ERROR: XX000: cache lookup failed for replication origin 'popo'
LOCATION: replorigin_by_name, origin.c:229
=# select pg_replication_origin_drop('popo');
ERROR: XX000: cache lookup failed for replication origin 'popo'
LOCATION: replorigin_by_name, origin.c:229
=# select pg_replication_origin_session_setup('popo');
ERROR: XX000: cache lookup failed for replication origin 'popo'
LOCATION: replorigin_by_name, origin.c:229
=# select pg_replication_origin_progress('popo', true);
ERROR: XX000: cache lookup failed for replication origin 'popo';
LOCATION: replorigin_by_name, origin.c:229
A cache lookup means that an illogical status has been reached, but
those code paths don't refer to that. So I think that the error in
replorigin_by_name should be changed to that:
ERROR: 42704: replication slot "%s" does not exist

As far as I can see, replorigin_by_oid makes no use of its missing_ok
= false in the backend code, so letting it untouched would have no
impact. replorigin_by_name with missing_ok = false is only used with
SQL-callable functions, so it could be changed without any impact
elsewhere (without considering externally-maintained replication
modules).

Thanks,
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Michael Paquier
michael.paquier@gmail.com
In reply to: Michael Paquier (#1)
1 attachment(s)
Re: cache lookup errors for missing replication origins

On Tue, Sep 5, 2017 at 12:59 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

ERROR: 42704: replication slot "%s" does not exist

s/slot/origin/

As far as I can see, replorigin_by_oid makes no use of its missing_ok
= false in the backend code, so letting it untouched would have no
impact. replorigin_by_name with missing_ok = false is only used with
SQL-callable functions, so it could be changed without any impact
elsewhere (without considering externally-maintained replication
modules).

As long as I don't forget, attached is a patch added as well to the
next CF. replorigin_by_oid should have the same switch from elog to
ereport in my opinion. Additional regression tests are included.
--
Michael

Attachments:

replorigin-elogs.patchapplication/octet-stream; name=replorigin-elogs.patchDownload
diff --git a/contrib/test_decoding/expected/replorigin.out b/contrib/test_decoding/expected/replorigin.out
index 76d4ea986d..8ea4ddda97 100644
--- a/contrib/test_decoding/expected/replorigin.out
+++ b/contrib/test_decoding/expected/replorigin.out
@@ -26,7 +26,14 @@ SELECT pg_replication_origin_drop('test_decoding: temp');
 (1 row)
 
 SELECT pg_replication_origin_drop('test_decoding: temp');
-ERROR:  cache lookup failed for replication origin 'test_decoding: temp'
+ERROR:  replication origin "test_decoding: temp" does not exist
+-- various failure checks for undefined slots
+select pg_replication_origin_advance('test_decoding: temp', '0/1');
+ERROR:  replication origin "test_decoding: temp" does not exist
+select pg_replication_origin_session_setup('test_decoding: temp');
+ERROR:  replication origin "test_decoding: temp" does not exist
+select pg_replication_origin_progress('test_decoding: temp', true);
+ERROR:  replication origin "test_decoding: temp" does not exist
 SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
  ?column? 
 ----------
diff --git a/contrib/test_decoding/sql/replorigin.sql b/contrib/test_decoding/sql/replorigin.sql
index 7870f0ea32..451cd4bc3b 100644
--- a/contrib/test_decoding/sql/replorigin.sql
+++ b/contrib/test_decoding/sql/replorigin.sql
@@ -13,6 +13,11 @@ SELECT pg_replication_origin_create('test_decoding: temp');
 SELECT pg_replication_origin_drop('test_decoding: temp');
 SELECT pg_replication_origin_drop('test_decoding: temp');
 
+-- various failure checks for undefined slots
+select pg_replication_origin_advance('test_decoding: temp', '0/1');
+select pg_replication_origin_session_setup('test_decoding: temp');
+select pg_replication_origin_progress('test_decoding: temp', true);
+
 SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');
 
 -- origin tx
diff --git a/src/backend/replication/logical/origin.c b/src/backend/replication/logical/origin.c
index edc6efb8a6..1fc8f2dee9 100644
--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -225,8 +225,10 @@ replorigin_by_name(char *roname, bool missing_ok)
 		ReleaseSysCache(tuple);
 	}
 	else if (!missing_ok)
-		elog(ERROR, "cache lookup failed for replication origin '%s'",
-			 roname);
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_OBJECT),
+				 errmsg("replication origin \"%s\" does not exist",
+						roname)));
 
 	return roident;
 }
@@ -437,8 +439,10 @@ replorigin_by_oid(RepOriginId roident, bool missing_ok, char **roname)
 		*roname = NULL;
 
 		if (!missing_ok)
-			elog(ERROR, "cache lookup failed for replication origin with oid %u",
-				 roident);
+			ereport(ERROR,
+					(errcode(ERRCODE_UNDEFINED_OBJECT),
+					 errmsg("replication origin with OID %u does not exist",
+							roident)));
 
 		return false;
 	}
#3Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Michael Paquier (#2)
Re: cache lookup errors for missing replication origins

On Wed, Sep 6, 2017 at 3:51 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Tue, Sep 5, 2017 at 12:59 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:

ERROR: 42704: replication slot "%s" does not exist

s/slot/origin/

As far as I can see, replorigin_by_oid makes no use of its missing_ok
= false in the backend code, so letting it untouched would have no
impact. replorigin_by_name with missing_ok = false is only used with
SQL-callable functions, so it could be changed without any impact
elsewhere (without considering externally-maintained replication
modules).

As long as I don't forget, attached is a patch added as well to the
next CF. replorigin_by_oid should have the same switch from elog to
ereport in my opinion. Additional regression tests are included.

The patch passes the regression test and I found no problems in this
patch. I've marked it as Ready for Committer.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Robert Haas
robertmhaas@gmail.com
In reply to: Masahiko Sawada (#3)
Re: cache lookup errors for missing replication origins

On Tue, Oct 3, 2017 at 2:16 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The patch passes the regression test and I found no problems in this
patch. I've marked it as Ready for Committer.

Committed and back-patched to 9.5, which was as far as it applied cleanly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Michael Paquier
michael.paquier@gmail.com
In reply to: Robert Haas (#4)
Re: cache lookup errors for missing replication origins

On Thu, Oct 5, 2017 at 9:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 3, 2017 at 2:16 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The patch passes the regression test and I found no problems in this
patch. I've marked it as Ready for Committer.

Committed and back-patched to 9.5, which was as far as it applied cleanly.

Thanks all.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers