Parallel worker error

Started by Pavan Deolaseeover 8 years ago30 messages
#1Pavan Deolasee
pavan.deolasee@gmail.com

Hello,

While investing an issue in Postgres-XL 10, I came across this rather
surprisingly behaviour in PG master. See this test case:

create role testuser1;
set role to testuser1;
show role; -- shows testuser1

-- reset back to default role
reset session authorization ;
show role; -- shows none

set force_parallel_mode TO 1;
select count(*) from pg_class ; -- ok

-- now drop the role and try parallel query again
drop role testuser1 ;
select count(*) from pg_class ; -- fails

The last statement in this test fails with an error:
ERROR: role "testuser1" does not exist
CONTEXT: parallel worker

Looks like the leader process when serialises the GUC state, saves the
"role" value, which is still set to testuser1 (and which is now dropped).
Later, when the parallel worker process tries to restore the GUC state, it
fails on catalog lookup.

Comments in show_role() mentions a kludge because apparently SET SESSION
AUTHORIZATION cannot call set_config_option and change the current value of
"role". So that probably explains why show_role() displays the correct
information, but parallel worker fails to handle the case.

It's quite possible that I don't understand the differences in "role" and
"session authorization", but it still looks like a bug to me. May
be SerializeGUCState() should check if SetRoleIsActive is true and only
then save the role information?

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2Robert Haas
robertmhaas@gmail.com
In reply to: Pavan Deolasee (#1)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 6:58 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:

It's quite possible that I don't understand the differences in "role" and
"session authorization", but it still looks like a bug to me. May be
SerializeGUCState() should check if SetRoleIsActive is true and only then
save the role information?

Ugh. Well, this is definitely a bug, but I'm not sure if that's the right fix.

Mutually interdependent GUCs are bad news.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavan Deolasee (#1)
Re: Parallel worker error

Pavan Deolasee <pavan.deolasee@gmail.com> writes:

The last statement in this test fails with an error:
ERROR: role "testuser1" does not exist
CONTEXT: parallel worker

I can duplicate this in HEAD and v10, but not 9.6, so I've added it
as an open issue for v10. No idea what broke it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#2)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 5:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 30, 2017 at 6:58 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:

It's quite possible that I don't understand the differences in "role" and
"session authorization", but it still looks like a bug to me. May be
SerializeGUCState() should check if SetRoleIsActive is true and only then
save the role information?

Ugh. Well, this is definitely a bug, but I'm not sure if that's the right fix.

Yeah.

Mutually interdependent GUCs are bad news.

I am able to reproduce this without involving session authorization
guc as well. One needs to drop the newly created role from another
session, then also we can see the same error.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Parallel worker error

I wrote:

I can duplicate this in HEAD and v10, but not 9.6, so I've added it
as an open issue for v10. No idea what broke it.

Oh, scratch that: the issue exists in 9.6, it's just that the particular
test query you're using here does not generate a parallelized plan in
9.6, as shown by "explain". With a different query that does get
parallelized, 9.6 fails too:

regression=# select sum(ten) from tenk1;
ERROR: role "testuser1" does not exist
CONTEXT: parallel worker

Still, I think it's reasonable to characterize this as "must fix for v10".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Amit Kapila (#4)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 5:19 PM, Amit Kapila <amit.kapila16@gmail.com>
wrote:

I am able to reproduce this without involving session authorization
guc as well. One needs to drop the newly created role from another
session, then also we can see the same error.

Yeah, that's how I first created the case. But concurrent dropping of role
(which is surprisingly allowed even when there are active connections with
the role active) opens another set of errors. Hence I tried to reproduce
this in a single session. While it might be interesting to fix the
concurrent role drop problem someday, the single session problem looks more
pressing.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#4)
Re: Parallel worker error

Amit Kapila <amit.kapila16@gmail.com> writes:

I am able to reproduce this without involving session authorization
guc as well. One needs to drop the newly created role from another
session, then also we can see the same error.

Hm. I suspect the basic shape of what's happening here is "an existing
session can continue to run with OuterUserId corresponding to a dropped
role, but we fail when trying to duplicate that state into a parallel
worker". I wonder whether there aren't similar gotchas for other GUCs
whose interpretation depends on catalog lookups, eg search_path.

We might need to redesign the GUC-propagation mechanism so it sends
the various internal representations of GUC values, not the user-visible
strings. (I'm thinking of the blobs that guc.c can use to restore a
previous state at transaction abort ... don't recall what the code
calls them ATM.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#7)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 8:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

We might need to redesign the GUC-propagation mechanism so it sends
the various internal representations of GUC values, not the user-visible
strings.

That would probably be better in the long run, but I'm not keen to do
it in a back-branch under time pressure. I think one approach is to
stick another test against InitializingParallelWorker into check_role,
making it treat that case like no-role-specified. But I'm a little
worried about whether that would ever lead to a case where the role
should get set and doesn't, leading to a security bug.

Another approach is to attack this problem right at the root, which
IMHO is that changing "session_authorization" is implicitly setting
"role", but for the reasons mentioned in the comment in show_role(),
it doesn't set it explicitly. Well, that results in a situation where
the actual value assigned to the GUC doesn't match the value that
needs to be passed to the worker, so it breaks. I'm not sure what
would be a practical approach to that problem, but it's a thought.

A third approach is to do something like what you're proposing but
just limited to "role". In particular, the obvious thing to do would
be exclude it from SerializeGUCState and somehow propagate it as part
of FixedParallelState, which already has authenticated_user_id and
current_user_id.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#8)
Re: Parallel worker error

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Aug 30, 2017 at 8:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

We might need to redesign the GUC-propagation mechanism so it sends
the various internal representations of GUC values, not the user-visible
strings.

That would probably be better in the long run, but I'm not keen to do
it in a back-branch under time pressure.

Definitely a valid objection. But before assuming that this issue is
limited to SET ROLE, it'd be wise to push a bit on the other GUCs with
catalog-dependent values, to see if there are any others we need to
worry about. I"m okay with a narrow solution if SET ROLE really is
the only problem, but at this stage I'm not convinced of that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#9)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 9:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Aug 30, 2017 at 8:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

We might need to redesign the GUC-propagation mechanism so it sends
the various internal representations of GUC values, not the user-visible
strings.

That would probably be better in the long run, but I'm not keen to do
it in a back-branch under time pressure.

Definitely a valid objection. But before assuming that this issue is
limited to SET ROLE, it'd be wise to push a bit on the other GUCs with
catalog-dependent values, to see if there are any others we need to
worry about. I"m okay with a narrow solution if SET ROLE really is
the only problem, but at this stage I'm not convinced of that.

I don't think the problem with role is that it's catalog-dependent,
but that the effective value is changed by code that never calls
SetConfigOption() or set_config_option() or anything other mechanism
that the GUC code knows about. That coding pattern is known to be
broken (see the commit message for
a16d421ca4fc639929bc964b2585e8382cf16e33, for example) and my bet is
the only reason why set role has gotten by with it for so long is
because the code was written a long time ago and nobody wants to risk
breaking anything by trying to clean it up. It's almost unfair to
blame this on parallel query; if someone were to submit a patch
tomorrow that changes the effective value of a GUC without going
through SetConfigOption(), you'd punt it into outer space at
relativistic speed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#10)
Re: Parallel worker error

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Aug 30, 2017 at 9:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I"m okay with a narrow solution if SET ROLE really is
the only problem, but at this stage I'm not convinced of that.

I don't think the problem with role is that it's catalog-dependent,
but that the effective value is changed by code that never calls
SetConfigOption() or set_config_option() or anything other mechanism
that the GUC code knows about.

It's certainly possible that that's a contributing factor, but the
variant that Amit alluded to demonstrates that catalog dependency
is part of the problem:

regression=# create user testuser1;
CREATE ROLE
regression=# \c - testuser1
You are now connected to database "regression" as user "testuser1".

-- in a different session, do "drop user testuser1;", then:

regression=> show role;
role
------
none
(1 row)

regression=> show session authorization;
session_authorization
-----------------------
testuser1
(1 row)

regression=> select count(*) from pg_class;
count
-------
1113
(1 row)

regression=> set force_parallel_mode TO 1;
SET
regression=> select count(*) from pg_class;
ERROR: role with OID 110981 does not exist
CONTEXT: parallel worker
regression=> set force_parallel_mode TO 0;
SET
regression=> select count(*) from pg_class;
count
-------
1113
(1 row)

The problem here is exactly that we cannot transmit the leader's
state to the worker. You can't blame it on SET ROLE, because
I didn't do one.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#11)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 9:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The problem here is exactly that we cannot transmit the leader's
state to the worker. You can't blame it on SET ROLE, because
I didn't do one.

Hmm, that's a good reason for holding it blameless. In this case,
I'll blame the fact that we allow a role to be dropped while there are
users connected using that role. That's about as sensible as allowing
a table to be dropped while there are users reading from it, or a
database to be dropped while there are users connected to it, both of
which we in fact prohibit. IOW, I'd say the problem is that we've
allowed the system state to become inconsistent and now we're sad
because stuff doesn't work.

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Robert Haas (#12)
Re: Parallel worker error

Robert Haas wrote:

In this case,
I'll blame the fact that we allow a role to be dropped while there are
users connected using that role. That's about as sensible as allowing
a table to be dropped while there are users reading from it, or a
database to be dropped while there are users connected to it, both of
which we in fact prohibit. IOW, I'd say the problem is that we've
allowed the system state to become inconsistent and now we're sad
because stuff doesn't work.

But since that's an established design fl^H^Hprinciple,

Actually, my first comment when Pavan mentioned this on IM was that we
should look into fixing that problem sometime. It's not terribly urgent
since it doesn't seem to hurt anything too badly, but it's still a bug.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#13)
Re: Parallel worker error

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Robert Haas wrote:

In this case,
I'll blame the fact that we allow a role to be dropped while there are
users connected using that role.

Actually, my first comment when Pavan mentioned this on IM was that we
should look into fixing that problem sometime. It's not terribly urgent
since it doesn't seem to hurt anything too badly, but it's still a bug.

My feeling is that it's going to be unreasonably expensive. Are we
going to take a lock every time we call a SECURITY DEFINER function?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#12)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 11:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

Could I get some opinions on the virtues of this approach, vs. any of
the other suggestions at or near
/messages/by-id/CA+TgmoaSP90E33-MU2YpGs73TtJ37m5Hv-xqHjc7TPqX9wX8ew@mail.gmail.com
?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#15)
Re: Parallel worker error

On Thu, Aug 31, 2017 at 03:11:10PM -0400, Robert Haas wrote:

On Wed, Aug 30, 2017 at 11:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

Could I get some opinions on the virtues of this approach, vs. any of
the other suggestions at or near
/messages/by-id/CA+TgmoaSP90E33-MU2YpGs73TtJ37m5Hv-xqHjc7TPqX9wX8ew@mail.gmail.com
?

It seems good to me, better than the other options in that mail. This does
rely on "role" being on the only vulnerable GUC. Looking at callers of
GUC_check_errmsg(), I don't expect trouble in any other GUC. (I suspect one
can hit the "permission denied to set role" during parallel initialization
after a concurrent ALTER ROLE removes a membership.)

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1]/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#12)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 8:49 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 30, 2017 at 9:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The problem here is exactly that we cannot transmit the leader's
state to the worker. You can't blame it on SET ROLE, because
I didn't do one.

Hmm, that's a good reason for holding it blameless. In this case,
I'll blame the fact that we allow a role to be dropped while there are
users connected using that role.

The similar could happen with schema as well. For example, even if
you have set the schema name in the search_path of session-1, it will
still be allowed to drop the schema from another session. Now, we
will pass the dropped schema name as part of search_path to the
parallel worker. It won't lead to any error because we don't check
catalog while setting the value of search_path GUC.

I am not sure whether we need to bother about this, but I thought it
might help in choosing the approach to fix this problem.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Noah Misch (#16)
Re: Parallel worker error

On Sat, Sep 2, 2017 at 12:21 PM, Noah Misch <noah@leadboat.com> wrote:

On Thu, Aug 31, 2017 at 03:11:10PM -0400, Robert Haas wrote:

On Wed, Aug 30, 2017 at 11:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

Could I get some opinions on the virtues of this approach, vs. any of
the other suggestions at or near
/messages/by-id/CA+TgmoaSP90E33-MU2YpGs73TtJ37m5Hv-xqHjc7TPqX9wX8ew@mail.gmail.com
?

It seems good to me, better than the other options in that mail. This does
rely on "role" being on the only vulnerable GUC. Looking at callers of
GUC_check_errmsg(), I don't expect trouble in any other GUC. (I suspect one
can hit the "permission denied to set role" during parallel initialization
after a concurrent ALTER ROLE removes a membership.)

I think that error won't happen during parallel initialization because
of 'InitializingParallelWorker' check.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#16)
Re: Parallel worker error

On Sat, Sep 2, 2017 at 2:51 AM, Noah Misch <noah@leadboat.com> wrote:

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

It's not really a valid v10 open item, because it's not new in v10.
But I agree it should be fixed soon.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Robert Haas (#12)
Re: Parallel worker error

On Wed, Aug 30, 2017 at 8:49 PM, Robert Haas <robertmhaas@gmail.com> wrote:

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

Seems like a reasonable approach to me.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Noah Misch (#16)
2 attachment(s)
Re: Parallel worker error

On Sat, Sep 2, 2017 at 12:21 PM, Noah Misch <noah@leadboat.com> wrote:

On Thu, Aug 31, 2017 at 03:11:10PM -0400, Robert Haas wrote:

On Wed, Aug 30, 2017 at 11:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:

But since that's an established design fl^H^Hprinciple, maybe that
means we should go with the approach of teaching SerializeGUCState()
to ignore role altogether and instead have ParallelWorkerMain call
SetCurrentRoleId using information passed via the FixedParallelState
(not sure of the precise details here).

Could I get some opinions on the virtues of this approach, vs. any of
the other suggestions at or near
/messages/by-id/CA+TgmoaSP90E33-MU2YpGs73TtJ37m5Hv-xqHjc7TPqX9wX8ew@mail.gmail.com
?

It seems good to me, better than the other options in that mail.

It seems like the consensus is to move forward with this approach. I
have written a patch implementing the above idea. Note, that to use
SetCurrentRoleId, we need the value of guc "is_superuser" for the
current user and we don't pass this value to parallel workers as this
is PGC_INTERNAL guc variable. So, I have passed this value via
FixedParallelState.

After this patch, I think the check of InitializingParallelWorker in
check_role function is redundant. I have prepared a separate patch
for it, but may be it can be handled with the main patch to fix the
issue.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_role_handling_parallel_worker_v1.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v1.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..88f5f76 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -74,6 +74,7 @@ typedef struct FixedParallelState
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_current_user_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -197,6 +198,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
+	const char *varval;
 	Snapshot	transaction_snapshot = GetTransactionSnapshot();
 	Snapshot	active_snapshot = GetActiveSnapshot();
 
@@ -276,6 +278,11 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
+	varval = GetConfigOptionByName("is_superuser", NULL, true);
+	if (varval && strcmp(varval, "on") == 0)
+		fps->is_current_user_superuser = true;
+	else
+		fps->is_current_user_superuser = false;
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
 	fps->parallel_master_pgproc = MyProc;
@@ -1082,6 +1089,12 @@ ParallelWorkerMain(Datum main_arg)
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
+	/*
+	 * Skip verifying whether session user is allowed to become this role and
+	 * blindly restore the leader's state for current role.
+	 */
+	SetCurrentRoleId(fps->current_user_id, fps->is_current_user_superuser);
+
 	/* Restore temp-namespace state to ensure search path matches leader's. */
 	SetTempNamespaceState(fps->temp_namespace_id,
 						  fps->temp_toast_namespace_id);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 246fea8..40d8537 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2ae600f..b5287ab 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -488,6 +488,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 89fe80a..801df0b 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -195,6 +195,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
remove_redundant_check_from_check_role_v1.patchapplication/octet-stream; name=remove_redundant_check_from_check_role_v1.patchDownload
diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index 3ed1c56..0d2bbe4 100644
--- a/src/backend/commands/variable.c
+++ b/src/backend/commands/variable.c
@@ -901,12 +901,9 @@ check_role(char **newval, void **extra, GucSource source)
 		ReleaseSysCache(roleTup);
 
 		/*
-		 * Verify that session user is allowed to become this role, but skip
-		 * this in parallel mode, where we must blindly recreate the parallel
-		 * leader's state.
+		 * Verify that session user is allowed to become this role.
 		 */
-		if (!InitializingParallelWorker &&
-			!is_member_of_role(GetSessionUserId(), roleid))
+		if (!is_member_of_role(GetSessionUserId(), roleid))
 		{
 			GUC_check_errcode(ERRCODE_INSUFFICIENT_PRIVILEGE);
 			GUC_check_errmsg("permission denied to set role \"%s\"",
#22Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#21)
Re: Parallel worker error

On Mon, Sep 4, 2017 at 5:46 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

It seems like the consensus is to move forward with this approach. I
have written a patch implementing the above idea. Note, that to use
SetCurrentRoleId, we need the value of guc "is_superuser" for the
current user and we don't pass this value to parallel workers as this
is PGC_INTERNAL guc variable. So, I have passed this value via
FixedParallelState.

If I apply this patch and run 'make check', the tests all pass. If I
then revert all the changes to parallel.c and keep only the changes to
guc.c, the tests still pass. IOW, the part of this patch that makes
the tests pass is the changes to guc.c, that just skip saving and
restoring the role altogether.

It looks to me like we need to get all of the following variables from
miscinit.c set to the correct values:

static Oid AuthenticatedUserId = InvalidOid;
static Oid SessionUserId = InvalidOid;
static Oid OuterUserId = InvalidOid;
static Oid CurrentUserId = InvalidOid;
static bool AuthenticatedUserIsSuperuser = false;
static bool SessionUserIsSuperuser = false;
static int SecurityRestrictionContext = 0;
static bool SetRoleIsActive = false;

To do that, I think we first need to call InitializeSessionUserId(),
which will set AuthenticatedUserId and AuthenticatedUserIsSuperuser.
Then, we need to call SetSessionUserId(), which will set SessionUserId
and SessionUserIsSuperuser. Then, we need to call SetCurrentRoleId(),
which will set SetRoleIsActive and OuterUserId. Then finally we need
to call SetUserIdAndSecContext, which sets CurrentUserId and
SecurityRestrictionContext. The order matters, because the earlier
functions in this series overwrite the values set by later ones as a
side-effect. Furthermore, it matters in each case that the values
passed to those functions are taken from the corresponding variables
in the leader.

In the unpatched code, looking at ParallelWorkerMain(), we call
BackgroundWorkerInitializeConnectionByOid() first; that calls
InitializeSessionUserId(). Next we call RestoreGUCState(), which
because of the restore-the-role-last hack calls SetSessionUserId (when
session_authorization is restored) followed by SetCurrentRoleId()
(when role is restored). Finally it calls SetUserIdAndSecContext().

With your patch, the order is wrong. SetCurrentRoleId() is called
AFTER SetUserIdAndSecContext(). Furthermore, the role ID passed to
SetCurrentRoleId() is always the same one passed to
SetUserIdAndSecContext(). But those roles might be different.
fps->current_user_id is set by a call to GetUserIdAndSecContext(),
which reads CurrentUserId, not OuterUserId. I think you need to pass
the value returned by GetCurrentRoleId() as an additional element of
FixedParallelState.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#22)
1 attachment(s)
Re: Parallel worker error

On Fri, Sep 8, 2017 at 3:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Sep 4, 2017 at 5:46 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

It seems like the consensus is to move forward with this approach. I
have written a patch implementing the above idea. Note, that to use
SetCurrentRoleId, we need the value of guc "is_superuser" for the
current user and we don't pass this value to parallel workers as this
is PGC_INTERNAL guc variable. So, I have passed this value via
FixedParallelState.

With your patch, the order is wrong. SetCurrentRoleId() is called
AFTER SetUserIdAndSecContext(). Furthermore, the role ID passed to
SetCurrentRoleId() is always the same one passed to
SetUserIdAndSecContext(). But those roles might be different.
fps->current_user_id is set by a call to GetUserIdAndSecContext(),
which reads CurrentUserId, not OuterUserId. I think you need to pass
the value returned by GetCurrentRoleId() as an additional element of
FixedParallelState.

You are right. I have changed the ordering and passed OuterUserId via
FixedParallelState.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_role_handling_parallel_worker_v2.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v2.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..248e2db 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -71,9 +71,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_current_user_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -197,6 +199,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
+	const char *varval;
 	Snapshot	transaction_snapshot = GetTransactionSnapshot();
 	Snapshot	active_snapshot = GetActiveSnapshot();
 
@@ -275,7 +278,13 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
+	varval = GetConfigOptionByName("is_superuser", NULL, true);
+	if (varval && strcmp(varval, "on") == 0)
+		fps->is_current_user_superuser = true;
+	else
+		fps->is_current_user_superuser = false;
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
 	fps->parallel_master_pgproc = MyProc;
@@ -1079,6 +1088,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_current_user_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 969e80f..e4bed34 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2ae600f..b5287ab 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -488,6 +488,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 89fe80a..801df0b 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -195,6 +195,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
#24Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#23)
Re: Parallel worker error

On Fri, Sep 8, 2017 at 1:17 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

You are right. I have changed the ordering and passed OuterUserId via
FixedParallelState.

This looks a little strange:

+ SetCurrentRoleId(fps->outer_user_id, fps->is_current_user_superuser);

The first argument says "outer" but the second says "current". I'm
wondering if we can just make the second one is_superuser.

I'm also wondering if, rather than using GetConfigOptionByName, we
should just make the GUC underlying is_superuser non-static and use
the value directly. If not, then I'm alternatively wondering whether
we should maybe use a less-generic name than varval.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#24)
2 attachment(s)
Re: Parallel worker error

On Fri, Sep 8, 2017 at 3:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 8, 2017 at 1:17 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

You are right. I have changed the ordering and passed OuterUserId via
FixedParallelState.

This looks a little strange:

+ SetCurrentRoleId(fps->outer_user_id, fps->is_current_user_superuser);

The first argument says "outer" but the second says "current". I'm
wondering if we can just make the second one is_superuser.

No issues changed as per suggestion.

I'm also wondering if, rather than using GetConfigOptionByName, we
should just make the GUC underlying is_superuser non-static and use
the value directly. If not, then I'm alternatively wondering whether
we should maybe use a less-generic name than varval.

I think we can go either way. So prepared patches with both
approaches. In fix_role_handling_parallel_worker_v3_1.patch, I have
changed the variable name and in
fix_role_handling_parallel_worker_v3_2.patch, I have exposed the guc
is_superuser.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_role_handling_parallel_worker_v3_1.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v3_1.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..62b287e 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -71,9 +71,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -197,6 +199,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
+	const char *is_superuser_var;
 	Snapshot	transaction_snapshot = GetTransactionSnapshot();
 	Snapshot	active_snapshot = GetActiveSnapshot();
 
@@ -275,7 +278,13 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
+	is_superuser_var = GetConfigOptionByName("is_superuser", NULL, true);
+	if (is_superuser_var && strcmp(is_superuser_var, "on") == 0)
+		fps->is_superuser = true;
+	else
+		fps->is_superuser = false;
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
 	fps->parallel_master_pgproc = MyProc;
@@ -1079,6 +1088,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 969e80f..e4bed34 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2ae600f..b5287ab 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -488,6 +488,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 89fe80a..801df0b 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -195,6 +195,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
fix_role_handling_parallel_worker_v3_2.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v3_2.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..e1a6a75 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -71,9 +71,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -275,6 +277,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
+	fps->is_superuser = session_auth_is_superuser;
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
@@ -1079,6 +1083,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 969e80f..51748a2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -446,6 +446,7 @@ char	   *event_source;
 bool		row_security;
 bool		check_function_bodies = true;
 bool		default_with_oids = false;
+bool		session_auth_is_superuser;
 
 int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
@@ -492,7 +493,6 @@ int			huge_pages;
  * and is kept in sync by assign_hooks.
  */
 static char *syslog_ident_str;
-static bool session_auth_is_superuser;
 static double phony_random_seed;
 static char *client_encoding_string;
 static char *datestyle_string;
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index c1870d2..aa2c7c1 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -244,6 +244,7 @@ extern bool log_btree_build_stats;
 
 extern PGDLLIMPORT bool check_function_bodies;
 extern bool default_with_oids;
+extern bool	session_auth_is_superuser;
 
 extern int	log_min_error_statement;
 extern int	log_min_messages;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2ae600f..b5287ab 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -488,6 +488,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 89fe80a..801df0b 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -195,6 +195,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
#26Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#25)
4 attachment(s)
Re: Parallel worker error

On Sat, Sep 9, 2017 at 7:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Sep 8, 2017 at 3:13 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Sep 8, 2017 at 1:17 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

You are right. I have changed the ordering and passed OuterUserId via
FixedParallelState.

This looks a little strange:

+ SetCurrentRoleId(fps->outer_user_id, fps->is_current_user_superuser);

The first argument says "outer" but the second says "current". I'm
wondering if we can just make the second one is_superuser.

No issues changed as per suggestion.

I'm also wondering if, rather than using GetConfigOptionByName, we
should just make the GUC underlying is_superuser non-static and use
the value directly. If not, then I'm alternatively wondering whether
we should maybe use a less-generic name than varval.

I think we can go either way. So prepared patches with both
approaches. In fix_role_handling_parallel_worker_v3_1.patch, I have
changed the variable name and in
fix_role_handling_parallel_worker_v3_2.patch, I have exposed the guc)
is_superuser.

This patch no longer applies, so attached rebased patches. I have
also created patches for v10 as we might want to backpatch the fix.
Added the patch in CF (https://commitfest.postgresql.org/15/1342/)

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_role_handling_parallel_worker_v4_1.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v4_1.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d683050..6813577 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -75,9 +75,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -201,6 +203,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
+	const char *is_superuser_var;
 	dsm_handle	session_dsm_handle = DSM_HANDLE_INVALID;
 	Snapshot	transaction_snapshot = GetTransactionSnapshot();
 	Snapshot	active_snapshot = GetActiveSnapshot();
@@ -296,7 +299,13 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
+	is_superuser_var = GetConfigOptionByName("is_superuser", NULL, true);
+	if (is_superuser_var && strcmp(is_superuser_var, "on") == 0)
+		fps->is_superuser = true;
+	else
+		fps->is_superuser = false;
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
 	fps->parallel_master_pgproc = MyProc;
@@ -1115,6 +1124,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae22185..521c0fc 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 3c63ad1..ac9ad06 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -508,6 +508,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 720495c..495f033 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -201,6 +201,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
fix_role_handling_parallel_worker_v4_2.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_v4_2.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index d683050..1f542ed 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -75,9 +75,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -296,6 +298,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
+	fps->is_superuser = session_auth_is_superuser;
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
@@ -1115,6 +1119,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ae22185..65372d7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -446,6 +446,7 @@ char	   *event_source;
 bool		row_security;
 bool		check_function_bodies = true;
 bool		default_with_oids = false;
+bool		session_auth_is_superuser;
 
 int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
@@ -492,7 +493,6 @@ int			huge_pages;
  * and is kept in sync by assign_hooks.
  */
 static char *syslog_ident_str;
-static bool session_auth_is_superuser;
 static double phony_random_seed;
 static char *client_encoding_string;
 static char *datestyle_string;
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 467125a..aa130cd 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -245,6 +245,7 @@ extern bool log_btree_build_stats;
 
 extern PGDLLIMPORT bool check_function_bodies;
 extern bool default_with_oids;
+extern bool	session_auth_is_superuser;
 
 extern int	log_min_error_statement;
 extern int	log_min_messages;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 3c63ad1..ac9ad06 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -508,6 +508,22 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 720495c..495f033 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -201,6 +201,17 @@ SELECT make_record(x) FROM (SELECT generate_series(1, 5) x) ss ORDER BY x;
 ROLLBACK TO SAVEPOINT settings;
 DROP function make_record(n int);
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 -- to increase the parallel query test coverage
 SAVEPOINT settings;
 SET LOCAL force_parallel_mode = 1;
fix_role_handling_parallel_worker_10_v4_1.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_10_v4_1.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..62b287e 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -71,9 +71,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -197,6 +199,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
+	const char *is_superuser_var;
 	Snapshot	transaction_snapshot = GetTransactionSnapshot();
 	Snapshot	active_snapshot = GetActiveSnapshot();
 
@@ -275,7 +278,13 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
+	is_superuser_var = GetConfigOptionByName("is_superuser", NULL, true);
+	if (is_superuser_var && strcmp(is_superuser_var, "on") == 0)
+		fps->is_superuser = true;
+	else
+		fps->is_superuser = false;
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
 	fps->parallel_master_pgproc = MyProc;
@@ -1079,6 +1088,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 58a4cf9..490acf7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 7458870..6bbcd34 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -424,6 +424,22 @@ select string4 from tenk1 order by string4 limit 5;
 
 reset max_parallel_workers;
 reset enable_hashagg;
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 set force_parallel_mode=1;
 explain (costs off)
   select stringu1::int2 from tenk1 where unique1 = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 09c1b03..cf02d84 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -162,6 +162,17 @@ select string4 from tenk1 order by string4 limit 5;
 reset max_parallel_workers;
 reset enable_hashagg;
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 set force_parallel_mode=1;
 
 explain (costs off)
fix_role_handling_parallel_worker_10_v4_2.patchapplication/octet-stream; name=fix_role_handling_parallel_worker_10_v4_2.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ce1b907..e1a6a75 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -71,9 +71,11 @@ typedef struct FixedParallelState
 	Oid			database_id;
 	Oid			authenticated_user_id;
 	Oid			current_user_id;
+	Oid			outer_user_id;
 	Oid			temp_namespace_id;
 	Oid			temp_toast_namespace_id;
 	int			sec_context;
+	bool		is_superuser;
 	PGPROC	   *parallel_master_pgproc;
 	pid_t		parallel_master_pid;
 	BackendId	parallel_master_backend_id;
@@ -275,6 +277,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_allocate(pcxt->toc, sizeof(FixedParallelState));
 	fps->database_id = MyDatabaseId;
 	fps->authenticated_user_id = GetAuthenticatedUserId();
+	fps->outer_user_id = GetCurrentRoleId();
+	fps->is_superuser = session_auth_is_superuser;
 	GetUserIdAndSecContext(&fps->current_user_id, &fps->sec_context);
 	GetTempNamespaceState(&fps->temp_namespace_id,
 						  &fps->temp_toast_namespace_id);
@@ -1079,6 +1083,13 @@ ParallelWorkerMain(Datum main_arg)
 	 */
 	InvalidateSystemCaches();
 
+	/*
+	 * Restore current role id.  Skip verifying whether session user is
+	 * allowed to become this role and blindly restore the leader's state for
+	 * current role.
+	 */
+	SetCurrentRoleId(fps->outer_user_id, fps->is_superuser);
+
 	/* Restore user ID and security context. */
 	SetUserIdAndSecContext(fps->current_user_id, fps->sec_context);
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 58a4cf9..bee9cc5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -446,6 +446,7 @@ char	   *event_source;
 bool		row_security;
 bool		check_function_bodies = true;
 bool		default_with_oids = false;
+bool		session_auth_is_superuser;
 
 int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
@@ -492,7 +493,6 @@ int			huge_pages;
  * and is kept in sync by assign_hooks.
  */
 static char *syslog_ident_str;
-static bool session_auth_is_superuser;
 static double phony_random_seed;
 static char *client_encoding_string;
 static char *datestyle_string;
@@ -8986,12 +8986,18 @@ read_nondefault_variables(void)
  * constants; a few, like server_encoding and lc_ctype, are handled specially
  * outside the serialize/restore procedure.  Therefore, SerializeGUCState()
  * never sends these, and RestoreGUCState() never changes them.
+ *
+ * Role is a special variable in the sense that its current value can be an
+ * invalid value and there are multiple ways by which that can happen (like
+ * after setting the role, someone drops it).  So we handle it outside of
+ * serialize/restore machinery.
  */
 static bool
 can_skip_gucvar(struct config_generic *gconf)
 {
 	return gconf->context == PGC_POSTMASTER ||
-		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT;
+		gconf->context == PGC_INTERNAL || gconf->source == PGC_S_DEFAULT ||
+		strcmp(gconf->name, "role") == 0;
 }
 
 /*
@@ -9252,7 +9258,6 @@ SerializeGUCState(Size maxsize, char *start_address)
 	Size		actual_size;
 	Size		bytes_left;
 	int			i;
-	int			i_role = -1;
 
 	/* Reserve space for saving the actual size of the guc state */
 	Assert(maxsize > sizeof(actual_size));
@@ -9260,19 +9265,7 @@ SerializeGUCState(Size maxsize, char *start_address)
 	bytes_left = maxsize - sizeof(actual_size);
 
 	for (i = 0; i < num_guc_variables; i++)
-	{
-		/*
-		 * It's pretty ugly, but we've got to force "role" to be initialized
-		 * after "session_authorization"; otherwise, the latter will override
-		 * the former.
-		 */
-		if (strcmp(guc_variables[i]->name, "role") == 0)
-			i_role = i;
-		else
-			serialize_variable(&curptr, &bytes_left, guc_variables[i]);
-	}
-	if (i_role >= 0)
-		serialize_variable(&curptr, &bytes_left, guc_variables[i_role]);
+		serialize_variable(&curptr, &bytes_left, guc_variables[i]);
 
 	/* Store actual size without assuming alignment of start_address. */
 	actual_size = maxsize - bytes_left - sizeof(actual_size);
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index c1870d2..aa2c7c1 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -244,6 +244,7 @@ extern bool log_btree_build_stats;
 
 extern PGDLLIMPORT bool check_function_bodies;
 extern bool default_with_oids;
+extern bool	session_auth_is_superuser;
 
 extern int	log_min_error_statement;
 extern int	log_min_messages;
diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 7458870..6bbcd34 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -424,6 +424,22 @@ select string4 from tenk1 order by string4 limit 5;
 
 reset max_parallel_workers;
 reset enable_hashagg;
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+NOTICE:  role "regress_parallel_worker" does not exist, skipping
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+ count 
+-------
+ 10000
+(1 row)
+
+reset force_parallel_mode;
+reset role;
 set force_parallel_mode=1;
 explain (costs off)
   select stringu1::int2 from tenk1 where unique1 = 1;
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 09c1b03..cf02d84 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -162,6 +162,17 @@ select string4 from tenk1 order by string4 limit 5;
 reset max_parallel_workers;
 reset enable_hashagg;
 
+-- test the sanity of parallel query after the active role is dropped.
+drop role if exists regress_parallel_worker;
+create role regress_parallel_worker;
+set role regress_parallel_worker;
+reset session authorization;
+drop role regress_parallel_worker;
+set force_parallel_mode = 1;
+select count(*) from tenk1;
+reset force_parallel_mode;
+reset role;
+
 set force_parallel_mode=1;
 
 explain (costs off)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#26)
Re: Parallel worker error

On Sun, Oct 29, 2017 at 12:02 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

This patch no longer applies, so attached rebased patches. I have
also created patches for v10 as we might want to backpatch the fix.
Added the patch in CF (https://commitfest.postgresql.org/15/1342/)

Thanks. I picked the second variant, committed, and also back-patched to 9.6.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#27)
Re: Parallel worker error

On Sun, Oct 29, 2017 at 1:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Oct 29, 2017 at 12:02 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

This patch no longer applies, so attached rebased patches. I have
also created patches for v10 as we might want to backpatch the fix.
Added the patch in CF (https://commitfest.postgresql.org/15/1342/)

Thanks. I picked the second variant, committed, and also back-patched to 9.6.

Thanks. I have closed this entry in CF app, however, I am not sure
what is the best way to deal with the entry present in PostgreSQL 10
Open Items page[1]https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items. The entry for this bug seems to be present in
Older Bugs section. Shall we leave it as it is or do we want to do
something else with it?

[1]: https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#28)
Re: Parallel worker error

On Mon, Oct 30, 2017 at 8:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks. I have closed this entry in CF app, however, I am not sure
what is the best way to deal with the entry present in PostgreSQL 10
Open Items page[1]. The entry for this bug seems to be present in
Older Bugs section. Shall we leave it as it is or do we want to do
something else with it?

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items

How about just adding an additional bullet point for that item that
says "fixed by commit blah blah for 10.1"?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#29)
Re: Parallel worker error

On Mon, Oct 30, 2017 at 10:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Oct 30, 2017 at 8:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

Thanks. I have closed this entry in CF app, however, I am not sure
what is the best way to deal with the entry present in PostgreSQL 10
Open Items page[1]. The entry for this bug seems to be present in
Older Bugs section. Shall we leave it as it is or do we want to do
something else with it?

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items

How about just adding an additional bullet point for that item that
says "fixed by commit blah blah for 10.1"?

Sounds good, so updated the Open items page accordingly.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers