pgsql: Harden new test case against force_parallel_mode = regress.
Harden new test case against force_parallel_mode = regress.
Per buildfarm: worker processes can't see a role created in
the current transaction.
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/98a88bc2bcd60e41ca70e2f1e13eee827e23eefb
Modified Files
--------------
src/test/regress/expected/psql.out | 3 ++-
src/test/regress/sql/psql.sql | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
On Thu, Mar 2, 2023 at 5:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Harden new test case against force_parallel_mode = regress.
Per buildfarm: worker processes can't see a role created in
the current transaction.
Now why would that happen? Surely the snapshot for each command is
passed down from leader to worker, and the worker is not free to
invent a snapshot from nothing.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Thu, Mar 2, 2023 at 5:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Per buildfarm: worker processes can't see a role created in
the current transaction.
Now why would that happen? Surely the snapshot for each command is
passed down from leader to worker, and the worker is not free to
invent a snapshot from nothing.
The workers were failing at startup, eg (from [1]):
+ERROR: role "regress_psql_user" does not exist
+CONTEXT: while setting parameter "session_authorization" to "regress_psql_user"
Maybe this says that worker startup needs to install the snapshot before
doing any catalog accesses? Anyway, I'd be happy to revert this test
hack if you care to make the case work.
regards, tom lane
On Fri, 3 Mar 2023 at 17:16, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 2, 2023 at 5:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Harden new test case against force_parallel_mode = regress.
Per buildfarm: worker processes can't see a role created in
the current transaction.Now why would that happen? Surely the snapshot for each command is
passed down from leader to worker, and the worker is not free to
invent a snapshot from nothing.
Probably because we nitialize which user and database to use in the
backend before we load the parent process' snapshot:
in ParallelWorkerMain (parallel.c, as of HEAD @ b6a0d469):
/* Restore database connection. */
BackgroundWorkerInitializeConnectionByOid(fps->database_id,
fps->authenticated_user_id,
0);
[...]
/* Crank up a transaction state appropriate to a parallel worker. */
tstatespace = shm_toc_lookup(toc, PARALLEL_KEY_TRANSACTION_STATE, false);
StartParallelWorkerTransaction(tstatespace);
/* Restore combo CID state. */
combocidspace = shm_toc_lookup(toc, PARALLEL_KEY_COMBO_CID, false);
RestoreComboCIDState(combocidspace);
-Matthias
I wrote:
The workers were failing at startup, eg (from [1]):
argh, forgot to add the link:
regards, tom lane
On Fri, Mar 3, 2023 at 11:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The workers were failing at startup, eg (from [1]):
+ERROR: role "regress_psql_user" does not exist +CONTEXT: while setting parameter "session_authorization" to "regress_psql_user"Maybe this says that worker startup needs to install the snapshot before
doing any catalog accesses? Anyway, I'd be happy to revert this test
hack if you care to make the case work.
Oh, that's interesting (and sad). A parallel worker has a "startup
transaction" that is used to restore library and GUC state, and then
after that transaction commits, it starts up a new transaction that
uses the same snapshot and settings as the transaction in the parallel
leader. So the problem here is that the startup transaction can't see
the uncommitted work of some unrelated (as far as it knows)
transaction, and that prevents restoring the session_authorization
GUC.
That startup transaction has broken stuff before, and it would be nice
to get rid of it. Unfortunately, I don't remember right now why we
need it in the first place. I'm fairly sure that if you load the
library and GUC state without any transaction, that doesn't work,
because a bunch of important processing gets skipped. And I think if
you try to do those things in the "real" transaction that fails for
some reason too, maybe that there's no guarantee that all the relevant
GUCs can be changed at that point, but I'm fuzzy on the details at the
moment.
So I don't know how to fix this right now, but thanks for the details.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Mar 3, 2023 at 11:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
+ERROR: role "regress_psql_user" does not exist +CONTEXT: while setting parameter "session_authorization" to "regress_psql_user"
Oh, that's interesting (and sad). A parallel worker has a "startup
transaction" that is used to restore library and GUC state, and then
after that transaction commits, it starts up a new transaction that
uses the same snapshot and settings as the transaction in the parallel
leader. So the problem here is that the startup transaction can't see
the uncommitted work of some unrelated (as far as it knows)
transaction, and that prevents restoring the session_authorization
GUC.
Got it.
That startup transaction has broken stuff before, and it would be nice
to get rid of it. Unfortunately, I don't remember right now why we
need it in the first place. I'm fairly sure that if you load the
library and GUC state without any transaction, that doesn't work,
because a bunch of important processing gets skipped. And I think if
you try to do those things in the "real" transaction that fails for
some reason too, maybe that there's no guarantee that all the relevant
GUCs can be changed at that point, but I'm fuzzy on the details at the
moment.
Couldn't we install the leader's snapshot into both transactions?
regards, tom lane
On Fri, Mar 3, 2023 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Couldn't we install the leader's snapshot into both transactions?
Yeah, maybe that would Just Work. Not sure.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Fri, Mar 3, 2023 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Couldn't we install the leader's snapshot into both transactions?
Yeah, maybe that would Just Work. Not sure.
Well, IIUC the worker is currently getting a brand new snapshot
for its startup transaction, which is exactly what you said upthread
it should never do. Seems like that could have more failure modes
than just this one.
regards, tom lane