Error while creating subscription when server is running in single user mode
Hi,
There is an error while creating a subscription when the server is running in
single-user mode:
[centos@centos-cpula bin]$ ./postgres --single postgres -D m1data
PostgreSQL stand-alone backend 10beta1
backend> create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.318 BST [10469] LOG: statement: create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.326 BST [10469] ERROR: epoll_ctl() failed: Bad file descriptor
--
regards,tushar
EnterpriseDB https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 31, 2017 at 7:54 AM, tushar <tushar.ahuja@enterprisedb.com> wrote:
[centos@centos-cpula bin]$ ./postgres --single postgres -D m1data
PostgreSQL stand-alone backend 10beta1
backend> create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.318 BST [10469] LOG: statement: create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.326 BST [10469] ERROR: epoll_ctl() failed: Bad file descriptor
IMHO, single-user mode cannot support replication (it cannot have a
background WALReceiver task). However, if that is correct, I believe there
should be a proper error message.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, May 31, 2017 at 7:01 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, May 31, 2017 at 7:54 AM, tushar <tushar.ahuja@enterprisedb.com> wrote:
[centos@centos-cpula bin]$ ./postgres --single postgres -D m1data
PostgreSQL stand-alone backend 10beta1
backend> create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.318 BST [10469] LOG: statement: create subscription sub connection 'dbname=postgres port=5433 user=centos' publication p with (create_slot=0,enabled=off);
2017-05-31 12:53:09.326 BST [10469] ERROR: epoll_ctl() failed: Bad file descriptor
IMHO, single-user mode cannot support replication (it cannot have a
background WALReceiver task). However, if that is correct, I believe there
should be a proper error message.
Yeah, see 0e0f43d6 for example. A simple fix is to look at
IsUnderPostmaster when creating, altering or dropping a subscription
in subscriptioncmds.c.
--
Michael
On Wed, May 31, 2017 at 2:20 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Yeah, see 0e0f43d6 for example. A simple fix is to look at
IsUnderPostmaster when creating, altering or dropping a subscription
in subscriptioncmds.c.
Yeah, below patch fixes that.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment: subscription_error.patch (application/octet-stream, +16 -0)
On Wed, May 31, 2017 at 10:49 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, May 31, 2017 at 2:20 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Yeah, see 0e0f43d6 for example. A simple fix is to look at
IsUnderPostmaster when creating, altering or dropping a subscription
in subscriptioncmds.c.
Yeah, below patch fixes that.
Thanks, this looks correct to me at quick glance.
+    if (!IsUnderPostmaster)
+        ereport(FATAL,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("subscription commands are not supported by single-user servers")));
The messages could be more detailed, for example by naming the operation
(CREATE/ALTER/DROP SUBSCRIPTION) directly in each error message. But that's
a nit.
--
Michael
On Thu, Jun 1, 2017 at 1:02 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Thanks, this looks correct to me at quick glance.
+    if (!IsUnderPostmaster)
+        ereport(FATAL,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("subscription commands are not supported by single-user servers")));
The messages could be more detailed, for example by naming the operation
(CREATE/ALTER/DROP SUBSCRIPTION) directly in each error message. But that's
a nit.
Thanks for looking into it. Yeah, I think it's better to give a
specific message for each command instead of a generic one, because we
still support some of the subscription commands even in single-user
mode, e.g. ALTER SUBSCRIPTION OWNER. My patch doesn't block this command,
and there is no harm in supporting it in single-user mode, but does it
make any sense? We could construct a use case such as creating a
subscription in normal mode and then running ALTER OWNER in single-user
mode, but it makes little sense to me.
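For illustration only, the per-command idea described above might look roughly like the sketch below. It is not the attached patch: the enum, the function name and the message wording are all hypothetical, and real code would presumably call ereport() from subscriptioncmds.c rather than return a string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command tags, invented for this sketch. */
typedef enum
{
    SKETCH_CREATE_SUBSCRIPTION,
    SKETCH_ALTER_SUBSCRIPTION,
    SKETCH_ALTER_SUBSCRIPTION_OWNER,
    SKETCH_DROP_SUBSCRIPTION
} SketchSubCmd;

/*
 * Return the error text a command would raise in single-user mode, or NULL
 * if the command may proceed. The message wording is an assumption.
 */
const char *
sketch_single_user_error(SketchSubCmd cmd, bool is_under_postmaster)
{
    if (is_under_postmaster)
        return NULL;            /* normal multi-process server: no restriction */

    switch (cmd)
    {
        case SKETCH_CREATE_SUBSCRIPTION:
            return "CREATE SUBSCRIPTION is not supported by single-user servers";
        case SKETCH_ALTER_SUBSCRIPTION:
            return "ALTER SUBSCRIPTION is not supported by single-user servers";
        case SKETCH_DROP_SUBSCRIPTION:
            return "DROP SUBSCRIPTION is not supported by single-user servers";
        case SKETCH_ALTER_SUBSCRIPTION_OWNER:
            return NULL;        /* left unblocked, as described above */
    }

    assert(false);              /* unreachable for valid command tags */
    return NULL;
}
```

The point of the sketch is only that each command carries its own message and that ALTER ... OWNER stays available; where the check lives is a separate question.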
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Attachment: subscription_error_v1.patch (application/octet-stream, +16 -0)
On 6/1/17 04:49, Dilip Kumar wrote:
On Thu, Jun 1, 2017 at 1:02 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Thanks, this looks correct to me at quick glance.
+    if (!IsUnderPostmaster)
+        ereport(FATAL,
+                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                 errmsg("subscription commands are not supported by single-user servers")));
The messages could be more detailed, for example by naming the operation
(CREATE/ALTER/DROP SUBSCRIPTION) directly in each error message. But that's
a nit.
Thanks for looking into it. Yeah, I think it's better to give a
specific message for each command instead of a generic one, because we
still support some of the subscription commands even in single-user
mode, e.g. ALTER SUBSCRIPTION OWNER. My patch doesn't block this command,
and there is no harm in supporting it in single-user mode, but does it
make any sense? We could construct a use case such as creating a
subscription in normal mode and then running ALTER OWNER in single-user
mode, but it makes little sense to me.
We should look at what the underlying problem is before we prohibit
anything at a high level.
When I try it, I get a
TRAP: FailedAssertion("!(event->fd != (-1))", File: "latch.c", Line: 861)
which might indicate that there is a more general problem with latch use
in single-user mode.
If I remove that assertion, things work fine after that. The originally
reported error "epoll_ctl() failed: Bad file descriptor" might indicate
that there is platform-dependent behavior.
I think the general problem is that the latch code that checks for
postmaster death does not handle single-user mode well.
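As a side note on the originally reported message: on Linux, epoll_ctl() fails with EBADF when asked to register an invalid descriptor such as -1, which is the value the assertion above checks for. The sketch below only demonstrates that kernel behaviour; the function name is hypothetical and none of this is PostgreSQL code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Try to register file descriptor -1 with a fresh epoll instance and return
 * the resulting errno, or 0 if the call unexpectedly succeeds. Linux only.
 */
int
sketch_epoll_add_invalid_fd(void)
{
    struct epoll_event ev = {0};
    int         saved_errno = 0;
    int         epfd;

    epfd = epoll_create1(0);
    assert(epfd >= 0);          /* sketch: no real error handling here */

    ev.events = EPOLLIN;
    ev.data.fd = -1;

    /* -1 is never a valid descriptor, so the kernel rejects the request. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, -1, &ev) < 0)
        saved_errno = errno;

    close(epfd);
    return saved_errno;
}
```

strerror(EBADF) is "Bad file descriptor" on glibc, which matches the text in the reported log line.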
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017-06-01 21:42:41 -0400, Peter Eisentraut wrote:
We should look at what the underlying problem is before we prohibit
anything at a high level.
I'm not sure there's any underlying issue here, except being in single
user mode.
When I try it, I get a
TRAP: FailedAssertion("!(event->fd != (-1))", File: "latch.c", Line: 861)
which might indicate that there is a more general problem with latch use
in single-user mode.
That just means that the latch isn't initialized. Which makes:
If I remove that assertion, things work fine after that. The originally
reported error "epoll_ctl() failed: Bad file descriptor" might indicate
that there is platform-dependent behavior.
quite unsurprising. I'm not sure how this hints at platform dependent
behaviour?
libpqrcv_connect() uses MyProc->procLatch, which doesn't exist/isn't
initialized in single user mode. I'm very unclear why that code uses
MyProc->procLatch rather than MyLatch, but that'd not change anything -
the tablesync stuff etc would still not work.
- Andres
On 6/1/17 21:55, Andres Freund wrote:
On 2017-06-01 21:42:41 -0400, Peter Eisentraut wrote:
We should look at what the underlying problem is before we prohibit
anything at a high level.
I'm not sure there's any underlying issue here, except being in single
user mode.
My point is that we shouldn't be putting checks into DDL commands about
single-user mode if the actual cause of the issue is in a lower-level
system. Not all uses of a particular DDL command necessarily use a latch,
for example. Also, there could be other things that hit a latch that
are reachable in single-user mode that we haven't found yet.
So I think the check should either go somewhere in the latch code, or
possibly in the libpqwalreceiver code. Or we make the latch code work
so that the check-for-postmaster-death code becomes a noop in
single-user mode. Suggestions?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017-06-02 15:00:21 -0400, Peter Eisentraut wrote:
On 6/1/17 21:55, Andres Freund wrote:
On 2017-06-01 21:42:41 -0400, Peter Eisentraut wrote:
We should look at what the underlying problem is before we prohibit
anything at a high level.
I'm not sure there's any underlying issue here, except being in single
user mode.
My point is that we shouldn't be putting checks into DDL commands about
single-user mode if the actual cause of the issue is in a lower-level
system.
But it's not really.
Not all uses of a particular DDL command necessarily use a latch,
for example. Also, there could be other things that hit a latch that
are reachable in single-user mode that we haven't found yet.
Latches work in single user mode, it's just that the new code for some
reason uses uninitialized memory as the latch. As I pointed out above,
the new code really should just use MyLatch instead of
MyProc->procLatch.
So I think the check should either go somewhere in the latch code, or
possibly in the libpqwalreceiver code. Or we make the latch code work
so that the check-for-postmaster-death code becomes a noop in
single-user mode. Suggestions?
I don't think the postmaster death code is really the issue here. Nor
is libpqwalreceiver really the issue. We can put ERRORs in a bunch of
unrelated subsystems, sure, but that doesn't really solve the issue that
logical rep pretty essentially requires multiple processes. We've
prevented parallelism from being used in general (cf. standard_planner),
we've not put checks in all the subsystems it uses.
Greetings,
Andres Freund
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
My point is that we shouldn't be putting checks into DDL commands about
single-user mode if the actual cause of the issue is in a lower-level
system. Not all uses of a particular DDL command necessarily use a latch,
for example. Also, there could be other things that hit a latch that
are reachable in single-user mode that we haven't found yet.
So I think the check should either go somewhere in the latch code, or
possibly in the libpqwalreceiver code. Or we make the latch code work
so that the check-for-postmaster-death code becomes a noop in
single-user mode. Suggestions?
It's certainly plausible that we could have the latch code just ignore
WL_POSTMASTER_DEATH if not IsUnderPostmaster. I think that the original
reasoning for not doing that was that the calling code should know which
environment it's in, and not pass an unimplementable wait-exit reason;
so silently ignoring the bit could mask a bug. Perhaps that argument is
no longer attractive. Alternatively, we could fix the relevant call sites
to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
behavior for the majority of call sites.
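The two alternatives above can be sketched side by side. This is a simplified model, not latch.c: the flag values and the IsUnderPostmaster variable are stand-ins defined locally for the example, and the three sketch_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in definitions for this sketch; the real ones live in PostgreSQL. */
#define WL_LATCH_SET        (1 << 0)
#define WL_TIMEOUT          (1 << 3)
#define WL_POSTMASTER_DEATH (1 << 4)

bool        IsUnderPostmaster = false;

/*
 * Alternative 1: the wait primitive silently drops the postmaster-death
 * flag when there is no postmaster to watch.
 */
int
sketch_events_lenient(int wakeEvents)
{
    if (!IsUnderPostmaster)
        wakeEvents &= ~WL_POSTMASTER_DEATH;
    return wakeEvents;
}

/*
 * Alternative 2, call-site half: the caller knows its environment and only
 * asks for postmaster-death notification when it can be delivered.
 */
int
sketch_callsite_events(void)
{
    return WL_LATCH_SET | WL_TIMEOUT |
        (IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0);
}

/*
 * Alternative 2, primitive half: stay strict and treat an unimplementable
 * request as a caller bug.
 */
int
sketch_events_strict(int wakeEvents)
{
    assert(IsUnderPostmaster || !(wakeEvents & WL_POSTMASTER_DEATH));
    return wakeEvents;
}
```

The lenient version needs one change in one place; the strict version keeps the bug-catching assertion but requires touching every call site that can be reached in single-user mode.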
regards, tom lane
On 6/2/17 15:41, Tom Lane wrote:
It's certainly plausible that we could have the latch code just ignore
WL_POSTMASTER_DEATH if not IsUnderPostmaster. I think that the original
reasoning for not doing that was that the calling code should know which
environment it's in, and not pass an unimplementable wait-exit reason;
so silently ignoring the bit could mask a bug. Perhaps that argument is
no longer attractive. Alternatively, we could fix the relevant call sites
to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
behavior for the majority of call sites.
There are a lot of those call sites. (And a lot of duplicate code for
what to do if postmaster death actually happens.) I doubt we want to
check them all.
The attached patch fixes the reported issue for me.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment: 0001-Ignore-WL_POSTMASTER_DEATH-latch-event-in-single-use.patch (text/plain, +1 -2)
On 6/2/17 15:24, Andres Freund wrote:
but that doesn't really solve the issue that
logical rep pretty essentially requires multiple processes.
But it may be sensible to execute certain DDL commands for repair, which
is why I'm arguing for a finer-grained approach than just prohibiting
everything.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jun 02, 2017 at 11:06:52PM -0400, Peter Eisentraut wrote:
On 6/2/17 15:41, Tom Lane wrote:
It's certainly plausible that we could have the latch code just ignore
WL_POSTMASTER_DEATH if not IsUnderPostmaster. I think that the original
reasoning for not doing that was that the calling code should know which
environment it's in, and not pass an unimplementable wait-exit reason;
so silently ignoring the bit could mask a bug. Perhaps that argument is
no longer attractive. Alternatively, we could fix the relevant call sites
to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
behavior for the majority of call sites.
There are a lot of those call sites. (And a lot of duplicate code for
what to do if postmaster death actually happens.) I doubt we want to
check them all.
[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Peter,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.
[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
On 6/2/17 23:06, Peter Eisentraut wrote:
On 6/2/17 15:41, Tom Lane wrote:
It's certainly plausible that we could have the latch code just ignore
WL_POSTMASTER_DEATH if not IsUnderPostmaster. I think that the original
reasoning for not doing that was that the calling code should know which
environment it's in, and not pass an unimplementable wait-exit reason;
so silently ignoring the bit could mask a bug. Perhaps that argument is
no longer attractive. Alternatively, we could fix the relevant call sites
to do "(IsUnderPostmaster ? WL_POSTMASTER_DEATH : 0)", and keep the strict
behavior for the majority of call sites.
There are a lot of those call sites. (And a lot of duplicate code for
what to do if postmaster death actually happens.) I doubt we want to
check them all.
The attached patch fixes the reported issue for me.
committed
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jun 2, 2017 at 3:24 PM, Andres Freund <andres@anarazel.de> wrote:
Latches work in single user mode, it's just that the new code for some
reason uses uninitialized memory as the latch. As I pointed out above,
the new code really should just use MyLatch instead of
MyProc->procLatch.
We seem to have accumulated quite a few instances of that.
[rhaas pgsql]$ git grep MyLatch | wc -l
116
[rhaas pgsql]$ git grep 'MyProc->procLatch' | wc -l
33
Most of the offenders are in src/backend/replication, but there are
some that are related to parallelism as well (bgworker.c, pqmq.c,
parallel.c, condition_variable.c). Maybe we (you?) should just go and
change them all. I don't think using MyLatch instead of
MyProc->procLatch has become automatic for everyone yet.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2017-06-06 15:48:42 -0400, Robert Haas wrote:
On Fri, Jun 2, 2017 at 3:24 PM, Andres Freund <andres@anarazel.de> wrote:
Latches work in single user mode, it's just that the new code for some
reason uses uninitialized memory as the latch. As I pointed out above,
the new code really should just use MyLatch instead of
MyProc->procLatch.
FWIW, I'd misremembered some code here, and we actually reach the
function initializing the shared latch, even in single user mode.
We seem to have accumulated quite a few instances of that.
[rhaas pgsql]$ git grep MyLatch | wc -l
116
[rhaas pgsql]$ git grep 'MyProc->procLatch' | wc -l
33
Most of the offenders are in src/backend/replication, but there are
some that are related to parallelism as well (bgworker.c, pqmq.c,
parallel.c, condition_variable.c). Maybe we (you?) should just go and
change them all. I don't think using MyLatch instead of
MyProc->procLatch has become automatic for everyone yet.
Nevertheless this should be changed. Will do.
- Andres
On 2017-06-06 12:53:21 -0700, Andres Freund wrote:
On 2017-06-06 15:48:42 -0400, Robert Haas wrote:
On Fri, Jun 2, 2017 at 3:24 PM, Andres Freund <andres@anarazel.de> wrote:
Latches work in single user mode, it's just that the new code for some
reason uses uninitialized memory as the latch. As I pointed out above,
the new code really should just use MyLatch instead of
MyProc->procLatch.
FWIW, I'd misremembered some code here, and we actually reach the
function initializing the shared latch, even in single user mode.
We seem to have accumulated quite a few instances of that.
[rhaas pgsql]$ git grep MyLatch | wc -l
116
[rhaas pgsql]$ git grep 'MyProc->procLatch' | wc -l
33
Most of the offenders are in src/backend/replication, but there are
some that are related to parallelism as well (bgworker.c, pqmq.c,
parallel.c, condition_variable.c). Maybe we (you?) should just go and
change them all. I don't think using MyLatch instead of
MyProc->procLatch has become automatic for everyone yet.
Nevertheless this should be changed. Will do.
Here's the patch for that, also addressing some issues I found while
updating those callsites (separate thread started, too).
- Andres