WIP: Fix parallel workers connection bug in pg_dump (Bug #13727)
Parallel workers were failing to connect to the database when running
pg_dump with a connection string. The first of the following two commands
runs without errors, while the second one fails:
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd -f
my-dump
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
--jobs=9 -f my-dump
The error message:
pg_dump: [parallel archiver] connection to database "my-db" failed:
fe_sendauth: no password supplied
The password is not being stored correctly in the PGconn object when
connecting with a connection string.
This is my first time contributing to Postgres, so I tried to stick to the
instructions from the "Submitting a Patch" wiki. This submission is for
discussion because I haven't figured out how to write regression tests for
this patch yet (and I would appreciate guidance).
Target branch: master
Compiles and tests successfully: true
Platform-specific items: none
Regression tests: still needed
Documentation: N/A
Performance implications: none
Attachments:
pworker-connection-fix-v1.patchapplication/octet-stream; name=pworker-connection-fix-v1.patchDownload
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index 5968c20..7502f93 100644
*** a/src/bin/pg_dump/pg_backup_db.c
--- b/src/bin/pg_dump/pg_backup_db.c
***************
*** 282,288 **** ConnectDatabase(Archive *AHX,
}
} while (new_pass);
! AH->savedPassword = password;
/* check to see that the backend connection was successfully made */
if (PQstatus(AH->connection) == CONNECTION_BAD)
--- 282,288 ----
}
} while (new_pass);
! AH->savedPassword = PQpass(AH->connection);
/* check to see that the backend connection was successfully made */
if (PQstatus(AH->connection) == CONNECTION_BAD)
I'm still unclear on how to write regression tests for a connectivity bug.
Are they necessary in this case?
On Sun, Oct 25, 2015 at 5:55 PM, Zeus Kronion <zkronion@gmail.com> wrote:
Show quoted text
Parallel workers were failing to connect to the database when running
pg_dump with a connection string. The first of the following two commands
runs without errors, while the second one fails:
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
-f my-dump
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
--jobs=9 -f my-dumpThe error message:
pg_dump: [parallel archiver] connection to database "my-db" failed:
fe_sendauth: no password suppliedThe password is not being stored correctly in the PGconn object when
connecting with a connection string.This is my first time contributing to Postgres, so I tried to stick to the
instructions from the "Submitting a Patch" wiki. This submission is for
discussion because I haven't figured out how to write regression tests for
this patch yet (and I would appreciate guidance).Target branch: master
Compiles and tests successfully: true
Platform-specific items: none
Regression tests: still needed
Documentation: N/A
Performance implications: none
On 30-10-2015 10:04, Zeus Kronion wrote:
I'm still unclear on how to write regression tests for a connectivity
bug. Are they necessary in this case?
There aren't regression tests for pg_dump. However, your instructions
are sufficient to demonstrate the bug.
You could continue the thread in -bugs because the discussion started
there. Sometimes people track -bugs ML to make sure that some bugs
aren't forgotten. Add your patch to the next CF [1]https://commitfest.postgresql.org/7/.
[1]: https://commitfest.postgresql.org/7/
--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Oct 30, 2015 at 2:42 PM, Euler Taveira <euler@timbira.com.br> wrote:
On 30-10-2015 10:04, Zeus Kronion wrote:
I'm still unclear on how to write regression tests for a connectivity
bug. Are they necessary in this case?There aren't regression tests for pg_dump. However, your instructions are
sufficient to demonstrate the bug.
Well, we could have something in pg_dump/t/, though the instance set
by standard_initdb would require some update in pg_hba.conf to switch
to md5 before running the dump.
You could continue the thread in -bugs because the discussion started there.
Sometimes people track -bugs ML to make sure that some bugs aren't
forgotten. Add your patch to the next CF [1].
Yep. Things get easily lost.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/25/15 10:55 PM, Zeus Kronion wrote:
Parallel workers were failing to connect to the database when running
pg_dump with a connection string. The first of the following two commands
runs without errors, while the second one fails:
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd -f
my-dump
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
--jobs=9 -f my-dumpThe error message:
pg_dump: [parallel archiver] connection to database "my-db" failed:
fe_sendauth: no password suppliedThe password is not being stored correctly in the PGconn object when
connecting with a connection string.
Yeah, the current code is definitely broken for this case. However, I
don't feel like this patch is quite there yet, either. _connectDB has
similar logic in it which might be hit in case e.g. a a user's HBA is
changed from a non-password-requiring method to a password-requiring one
after the one or more connections has been initiated. That one needs
changing as well.
However, I don't quite like the way the password cache is kept up to
date in the old *or* the new code. It seems to me that it should
instead look like:
if (PQconnectionUsedPassword(AH->connection))
AH->savedPassword = PQpass(AH->connection);
What do you think?
.m
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Nov 1, 2015 5:04 PM, "Marko Tiikkaja" <marko@joh.to> wrote:
On 10/25/15 10:55 PM, Zeus Kronion wrote:
Parallel workers were failing to connect to the database when running
pg_dump with a connection string. The first of the following two commands
runs without errors, while the second one fails:
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
-f
my-dump
pg_dump "postgres://my-user:my-password@my.hostname.com:5432/my-db" -Fd
--jobs=9 -f my-dumpThe error message:
pg_dump: [parallel archiver] connection to database "my-db" failed:
fe_sendauth: no password suppliedThe password is not being stored correctly in the PGconn object when
connecting with a connection string.Yeah, the current code is definitely broken for this case. However, I
don't feel like this patch is quite there yet, either. _connectDB has
similar logic in it which might be hit in case e.g. a a user's HBA is
changed from a non-password-requiring method to a password-requiring one
after the one or more connections has been initiated. That one needs
changing as well.
I wasn't aware of that case. Should be an easy fix to make this weekend.
However, I don't quite like the way the password cache is kept up to date
in the old *or* the new code. It seems to me that it should instead look
like:
if (PQconnectionUsedPassword(AH->connection))
AH->savedPassword = PQpass(AH->connection);What do you think?
I don't understand why this logic is preferable. Is your concern that
AH->savedPassword may contain a password even when none is needed? Or is
the change simply meant to give the reader a better sense of what is
actually going on?
-CS
On 11/5/15 4:11 PM, Zeus Kronion wrote:
On Nov 1, 2015 5:04 PM, "Marko Tiikkaja" <marko@joh.to> wrote:
However, I don't quite like the way the password cache is kept up to date
in the old *or* the new code. It seems to me that it should instead look
like:if (PQconnectionUsedPassword(AH->connection))
AH->savedPassword = PQpass(AH->connection);What do you think?
I don't understand why this logic is preferable. Is your concern that
AH->savedPassword may contain a password even when none is needed?
The opposite, sort of. If the first connection uses a password, the
second one doesn't, and the third one does again, you need to ask for a
password again because you emptied the cache on the second attempt since
it didn't use a password. Granted, this situation is quite unlikely to
occur in practice, but I find the "correct" code *also* more readable.
To me it reads like "if the what we're caching was applied during the
connection attempt, update the cache; otherwise keep the previous value
in case it's useful in the future".
.m
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Nov 6, 2015 at 12:23 AM, Marko Tiikkaja <marko@joh.to> wrote:
On 11/5/15 4:11 PM, Zeus Kronion wrote:
On Nov 1, 2015 5:04 PM, "Marko Tiikkaja" <marko@joh.to> wrote:
However, I don't quite like the way the password cache is kept up to date
in the old *or* the new code. It seems to me that it should instead look
like:if (PQconnectionUsedPassword(AH->connection))
AH->savedPassword = PQpass(AH->connection);What do you think?
I don't understand why this logic is preferable. Is your concern that
AH->savedPassword may contain a password even when none is needed?The opposite, sort of. If the first connection uses a password, the second
one doesn't, and the third one does again, you need to ask for a password
again because you emptied the cache on the second attempt since it didn't
use a password. Granted, this situation is quite unlikely to occur in
practice, but I find the "correct" code *also* more readable. To me it reads
like "if the what we're caching was applied during the connection attempt,
update the cache; otherwise keep the previous value in case it's useful in
the future".
OK, I think that we had better address this bug for this CF. I am
attaching an updated patch following Marko's suggestion, and marking
this patch as ready for committer. I have noticed that the fix was
actually not complete: ConnectDatabase needs a similar fix, this is a
code path when a connection string like "dbname=db user=foo" is used.
And this was failing as well when the password is passed directly in
the connection string for multiple jobs.
--
Michael
Attachments:
20151222_pgdump_jobs_pass.patchbinary/octet-stream; name=20151222_pgdump_jobs_pass.patchDownload
diff --git a/src/bin/pg_dump/pg_backup_db.c b/src/bin/pg_dump/pg_backup_db.c
index 5968c20..5f07e51 100644
--- a/src/bin/pg_dump/pg_backup_db.c
+++ b/src/bin/pg_dump/pg_backup_db.c
@@ -192,7 +192,8 @@ _connectDB(ArchiveHandle *AH, const char *reqdb, const char *requser)
}
} while (new_pass);
- AH->savedPassword = password;
+ if (PQconnectionUsedPassword(newConn))
+ AH->savedPassword = pg_strdup(PQpass(newConn));
/* check for version mismatch */
_check_database_version(AH);
@@ -282,7 +283,8 @@ ConnectDatabase(Archive *AHX,
}
} while (new_pass);
- AH->savedPassword = password;
+ if (PQconnectionUsedPassword(AH->connection))
+ AH->savedPassword = pg_strdup(PQpass(AH->connection));
/* check to see that the backend connection was successfully made */
if (PQstatus(AH->connection) == CONNECTION_BAD)
Michael Paquier <michael.paquier@gmail.com> writes:
OK, I think that we had better address this bug for this CF. I am
attaching an updated patch following Marko's suggestion, and marking
this patch as ready for committer. I have noticed that the fix was
actually not complete: ConnectDatabase needs a similar fix, this is a
code path when a connection string like "dbname=db user=foo" is used.
And this was failing as well when the password is passed directly in
the connection string for multiple jobs.
As written, this would leak password strings, and it even seems possible
that it would leave savedPassword pointing at dangling memory, since the
free(password) inside the loop might free what savedPassword is pointing
at, and then in principle we might fail to overwrite savedPassword
afterwards. This probably can't happen in practice because it'd require
successive connection attempts to come to different conclusions about
PQconnectionNeedsPassword/PQconnectionUsedPassword. But it seems pretty
fragile in the face of future changes to this code. I modified it further
so that "password" and "savedPassword" never share storage, and pushed it.
A larger concern is that I suspect we need to abandon this whole approach
of passing a different representation of the connection parameters than
we were given the first time through. If dbname is a connection URI
(or keyword=value connection string), it might well contain more
information than just host + port + username + password. We're losing
any such details during the workers' reconnections. But that looks like
it would be a rather wide-ranging rewrite, so I just committed what we
had for now; at least we fixed the reported bug symptom.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Dec 24, 2015 at 4:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
As written, this would leak password strings, and it even seems possible
that it would leave savedPassword pointing at dangling memory, since the
free(password) inside the loop might free what savedPassword is pointing
at, and then in principle we might fail to overwrite savedPassword
afterwards. This probably can't happen in practice because it'd require
successive connection attempts to come to different conclusions about
PQconnectionNeedsPassword/PQconnectionUsedPassword.
Yes, that's what I was assuming.
But it seems pretty
fragile in the face of future changes to this code. I modified it further
so that "password" and "savedPassword" never share storage, and pushed it.
OK, thanks for fixing the issue!
A larger concern is that I suspect we need to abandon this whole approach
of passing a different representation of the connection parameters than
we were given the first time through. If dbname is a connection URI
(or keyword=value connection string), it might well contain more
information than just host + port + username + password. We're losing
any such details during the workers' reconnections. But that looks like
it would be a rather wide-ranging rewrite, so I just committed what we
had for now; at least we fixed the reported bug symptom.
vacuumdb suffers the same symptoms I think...
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers