Patch proposal: make use of regular expressions for the username in pg_hba.conf
Hi hackers,
Attached is a patch proposal to allow the use of regular expressions for
the username in pg_hba.conf.
Using regular expressions for the username in the pg_hba.conf file is
convenient in situations where an organization has a large number of
users and needs an expressive way to map them.
For example, if an organization wants to allow gss connections only for
users having their principal, e.g. @BDTFOREST.LOCAL, they could make use
of an entry in pg_hba.conf such as:
host all /^.*@BDTFOREST.LOCAL$ 0.0.0.0/0 gss
Without this patch, I can think of three alternatives with existing
functionality, which all of tradeoffs. This includes:
1) Create an entry per user: this is challenging for organizations
managing large numbers of users (e.g. 1000s). This is also not dynamic,
i.e. the HBA file would need to be updated when users are added or removed.
2) Use a mapping in pg_ident.conf, for example:
Here is an entry in pg_hba.conf that uses a map:
host all all 0.0.0.0/0 gss map=mygssmap
and by defining this mapping in pg_ident.conf:
mygssmap /^(.*)@BDTFOREST\.LOCAL$ \1@BDTFOREST.LOCAL
That works for filtering the username.
LOG: connection authenticated: identity="bertrand@BDTFOREST.LOCAL"
method=gss (/pg_installed/data/pg_hba.conf:95)
$ grep -n mygssmap /pg_installed/data/pg_hba.conf
95:host all all 0.0.0.0/0 gss map=mygssmap
However, the behavior is not the same for the ones that don’t match the
mapping in pg_ident.conf: indeed the connection attempt stop here and
the next HBA line won’t be evaluated.
FATAL: GSSAPI authentication failed for user "bdt"
DETAIL: Connection matched pg_hba.conf line 95: "host all
all 0.0.0.0/0 gss map=mygssmap"
3) Make use of a role in pg_hba.conf, e.g. “+BDTONLY”. That would work
too, and also allow the evaluation of the next HBA line for the ones
that are not part of the role.
However:
- That’s not as dynamic as the regular expression, as new users
would need to be granted the role and some users who are moving in the
company may need to have the role revoked.
- Looking at the regular expression in the HBA file makes it clear
what filtering needs to be done. This is not obvious when looking at the
role, even if it has a meaningful name. This can generate “incorrect
filtering” should one user be granted the role by mistake, or make it
more difficult to debug why a user is not being matched to a particular
line in the HBA file.
This is why I think username filtering with regular expressions would
provide its own advantages.
Thoughts? Looking forward to your feedback,
Regards,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-hba_with_regexp.patchtext/plain; charset=UTF-8; name=v1-0001-hba_with_regexp.patchDownload+75-16
On Fri, Aug 19, 2022 at 1:13 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
This is why I think username filtering with regular expressions would
provide its own advantages.Thoughts? Looking forward to your feedback,
I think your motivation for the feature is solid. It is killing me a
bit that this is making it easier to switch authentication methods
based on the role name, when I suspect what someone might really want
is to switch authentication methods based on the ID the user is trying
to authenticate with. But that's not your fault or problem to fix,
because the startup packet doesn't currently have that information.
(It does make me wonder whether I withdrew my PGAUTHUSER proposal [1]https://commitfest.postgresql.org/38/3314/
a month too early. And man, do I wish that pg_ident and pg_hba were
one file.)
I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.
Speaking of partial matches, should this feature allow them? Maybe
rules should have to match the entire username instead, and sidestep
the inevitable "I forgot to anchor my regex" problems?
Thanks,
--Jacob
Jacob Champion <jchampion@timescale.com> writes:
On Fri, Aug 19, 2022 at 1:13 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
This is why I think username filtering with regular expressions would
provide its own advantages.
I think your motivation for the feature is solid.
Yeah. I'm not sure that I buy the argument that this is more useful
than writing a role name and controlling things with GRANT ROLE, but
it's a plausible alternative with properties that might win in some
use-cases. So I see little reason not to allow it.
I'd actually ask why stop here? In particular, why not do the same
with the database-name column, especially since that does *not*
have the ability to use roles as a substitute for a wildcard entry?
I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.
Meh ... that concern seems overblown to me. I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible. Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.
Speaking of partial matches, should this feature allow them? Maybe
rules should have to match the entire username instead, and sidestep
the inevitable "I forgot to anchor my regex" problems?
I think the pg_ident.conf precedent is binding on us here. If we
make this one work differently, nobody's going to thank us for it,
they're just going to wonder "did the left hand not know what the
right hand already did?"
regards, tom lane
Hi,
On 9/9/22 2:46 AM, Tom Lane wrote:
Jacob Champion<jchampion@timescale.com> writes:
On Fri, Aug 19, 2022 at 1:13 AM Drouvot, Bertrand<bdrouvot@amazon.com> wrote:
This is why I think username filtering with regular expressions would
provide its own advantages.I think your motivation for the feature is solid.
Yeah. I'm not sure that I buy the argument that this is more useful
than writing a role name and controlling things with GRANT ROLE, but
it's a plausible alternative with properties that might win in some
use-cases. So I see little reason not to allow it.
Thank you both for your feedback.
I'd actually ask why stop here? In particular, why not do the same
with the database-name column, especially since that does *not*
have the ability to use roles as a substitute for a wildcard entry?
I think that's a fair point, I'll look at it.
I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.Meh ... that concern seems overblown to me. I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible. Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.
Agree that it seems unlikely but maybe we could add a new GUC to turn
the regex usage on the hba file on/off (and use off as the default)?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com
"Drouvot, Bertrand" <bdrouvot@amazon.com> writes:
Agree that it seems unlikely but maybe we could add a new GUC to turn
the regex usage on the hba file on/off (and use off as the default)?
I think that will just add useless complication.
regards, tom lane
On Thu, Sep 8, 2022 at 5:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jacob Champion <jchampion@timescale.com> writes:
I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.Meh ... that concern seems overblown to me. I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible. Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.
What percentage of users actually use pg_ident maps? My assumption
would be that a change to pg_hba would affect many more people, but
then I don't have any proof that there are users with role names that
look like that to begin with. I won't pound the table with it.
Speaking of partial matches, should this feature allow them? Maybe
rules should have to match the entire username instead, and sidestep
the inevitable "I forgot to anchor my regex" problems?I think the pg_ident.conf precedent is binding on us here. If we
make this one work differently, nobody's going to thank us for it,
they're just going to wonder "did the left hand not know what the
right hand already did?"
Hmm... yeah, I suppose. From the other direction, it'd be bad to train
users that unanchored regexes are safe in pg_hba only to take those
guardrails off in pg_ident. I will tuck that away as a potential
behavior change, for a different thread.
Thanks,
--Jacob
On 8/19/22 01:12, Drouvot, Bertrand wrote:
+ wstr = palloc((strlen(tok->string + 1) + 1) * sizeof(pg_wchar)); + wlen = pg_mb2wchar_with_len(tok->string + 1, + wstr, strlen(tok->string + 1));
The (tok->string + 1) construction comes up often enough that I think it
should be put in a `regex` variable or similar. That would help my eyes
with the (strlen(tok->string + 1) + 1) construction, especially.
I noticed that for pg_ident, we precompile the regexes per-line and
reuse those in child processes. Whereas here we're compiling, using, and
then discarding the regex for each check. I think the example set by the
pg_ident code is probably the one to follow, unless you have a good
reason not to.
+# Testing with regular expression for username +reset_pg_hba($node, '/^.*md.*$', 'password'); +test_role($node, 'md5_role', 'password from pgpass and regular expression for username', 0); +
IMO the coverage for this patch needs to be filled out. Negative test
cases are more important than positive ones for security-related code.
Other than that, and Tom's note on potentially expanding this to other
areas, I think this is a pretty straightforward patch.
Thanks,
--Jacob
Hi,
On 9/10/22 1:21 AM, Jacob Champion wrote:
On 8/19/22 01:12, Drouvot, Bertrand wrote:
+ wstr = palloc((strlen(tok->string + 1) + 1) * sizeof(pg_wchar)); + wlen = pg_mb2wchar_with_len(tok->string + 1, + wstr, strlen(tok->string + 1));The (tok->string + 1) construction comes up often enough that I think it
should be put in a `regex` variable or similar. That would help my eyes
with the (strlen(tok->string + 1) + 1) construction, especially.I noticed that for pg_ident, we precompile the regexes per-line and
reuse those in child processes. Whereas here we're compiling, using, and
then discarding the regex for each check. I think the example set by the
pg_ident code is probably the one to follow, unless you have a good
reason not to.
Thanks for the feedback.
Yeah fully agree. I'll provide a new version that follow the same logic
as the pg_ident code.
+# Testing with regular expression for username +reset_pg_hba($node, '/^.*md.*$', 'password'); +test_role($node, 'md5_role', 'password from pgpass and regular expression for username', 0); +IMO the coverage for this patch needs to be filled out. Negative test
cases are more important than positive ones for security-related code.
Agree, will do.
Other than that, and Tom's note on potentially expanding this to other
areas,
I'll add regexp usage for the database column and also the for the
address one when non CIDR is provided (so host name(s)) (I think it also
makes sense specially as we don't allow multiple values for this column).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services:https://aws.amazon.com
Hi,
On 9/12/22 9:55 AM, Drouvot, Bertrand wrote:
Hi,
On 9/10/22 1:21 AM, Jacob Champion wrote:
On 8/19/22 01:12, Drouvot, Bertrand wrote:
+ wstr = palloc((strlen(tok->string + 1) + 1) * sizeof(pg_wchar)); + wlen = pg_mb2wchar_with_len(tok->string + 1, + wstr, strlen(tok->string + 1));The (tok->string + 1) construction comes up often enough that I think it
should be put in a `regex` variable or similar. That would help my eyes
with the (strlen(tok->string + 1) + 1) construction, especially.I noticed that for pg_ident, we precompile the regexes per-line and
reuse those in child processes. Whereas here we're compiling, using, and
then discarding the regex for each check. I think the example set by the
pg_ident code is probably the one to follow, unless you have a good
reason not to.Thanks for the feedback.
Yeah fully agree. I'll provide a new version that follow the same logic
as the pg_ident code.+# Testing with regular expression for username +reset_pg_hba($node, '/^.*md.*$', 'password'); +test_role($node, 'md5_role', 'password from pgpass and regular expression for username', 0); +IMO the coverage for this patch needs to be filled out. Negative test
cases are more important than positive ones for security-related code.Agree, will do.
Other than that, and Tom's note on potentially expanding this to other
areas,I'll add regexp usage for the database column and also the for the
address one when non CIDR is provided (so host name(s)) (I think it also
makes sense specially as we don't allow multiple values for this column).
Please find attached v2 addressing the comments mentioned above.
v2 also provides regular expression usage for the database and the
address columns (when a host name is being used).
Remark:
The CF bot is failing for Windows (all other tests are green) and only
for the new tap test related to the regular expression on the host name
(the ones on database and role are fine).
The issue is not related to the patch. The issue is that the Windows
Cirrus test does not like when a host name is provided for a "host"
entry in pg_hba.conf (while it works fine when a CIDR is provided).
You can see an example in [1]https://github.com/bdrouvot/postgres/branches on branch “host_non_cidr” where the only change is to replace the
CIDR by "localhost" in 002_scram.pl. As you can see the Cirrus tests are
failing on Windows only (its log file is here [2]https://api.cirrus-ci.com/v1/artifact/task/6507279833890816/log/src/test/ssl/tmp_check/log/002_scram_primary.log).
I'll look at this "Windows" related issue but would appreciate any
guidance/help if someone has experience in this area on windows.
[1]: https://github.com/bdrouvot/postgres/branches on branch “host_non_cidr”
[2]: https://api.cirrus-ci.com/v1/artifact/task/6507279833890816/log/src/test/ssl/tmp_check/log/002_scram_primary.log
https://api.cirrus-ci.com/v1/artifact/task/6507279833890816/log/src/test/ssl/tmp_check/log/002_scram_primary.log
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-hba_with_regexp.patchtext/plain; charset=UTF-8; name=v2-0001-hba_with_regexp.patchDownload+258-81
On Fri, Sep 16, 2022 at 06:24:07PM +0200, Drouvot, Bertrand wrote:
The CF bot is failing for Windows (all other tests are green) and only for
the new tap test related to the regular expression on the host name (the
ones on database and role are fine).The issue is not related to the patch. The issue is that the Windows Cirrus
test does not like when a host name is provided for a "host" entry in
pg_hba.conf (while it works fine when a CIDR is provided).You can see an example in [1] where the only change is to replace the CIDR
by "localhost" in 002_scram.pl. As you can see the Cirrus tests are failing
on Windows only (its log file is here [2]).I'll look at this "Windows" related issue but would appreciate any
guidance/help if someone has experience in this area on windows.
I recall that being able to do a reverse lookup of a hostname on
Windows for localhost requires a few extra setup steps as that's not
guaranteed to be set in all environments by default, which is why we
go at great length to use 127.0.0.1 in the TAP test setup for example
(see Cluster.pm). Looking at your patch, the goal is to test the
mapping of regular expression for host names, user names and database
names. If the first case is not guaranteed, my guess is that it is
fine to skip this portion of the tests on Windows.
While reading the patch, I am a bit confused about token_regcomp() and
token_regexec(). It would help the review a lot if these were
documented with proper comments, even if these act roughly as wrappers
for pg_regexec() and pg_regcomp().
--
Michael
Hi,
On 9/17/22 8:53 AM, Michael Paquier wrote:
On Fri, Sep 16, 2022 at 06:24:07PM +0200, Drouvot, Bertrand wrote:
The CF bot is failing for Windows (all other tests are green) and only for
the new tap test related to the regular expression on the host name (the
ones on database and role are fine).The issue is not related to the patch. The issue is that the Windows Cirrus
test does not like when a host name is provided for a "host" entry in
pg_hba.conf (while it works fine when a CIDR is provided).You can see an example in [1] where the only change is to replace the CIDR
by "localhost" in 002_scram.pl. As you can see the Cirrus tests are failing
on Windows only (its log file is here [2]).I'll look at this "Windows" related issue but would appreciate any
guidance/help if someone has experience in this area on windows.I recall that being able to do a reverse lookup of a hostname on
Windows for localhost requires a few extra setup steps as that's not
guaranteed to be set in all environments by default, which is why we
go at great length to use 127.0.0.1 in the TAP test setup for example
(see Cluster.pm). Looking at your patch, the goal is to test the
mapping of regular expression for host names, user names and database
names. If the first case is not guaranteed, my guess is that it is
fine to skip this portion of the tests on Windows.
Thanks for looking at it!
That sounds reasonable, v3 attached is skipping the regular expression
tests for the hostname on Windows.
While reading the patch, I am a bit confused about token_regcomp() and
token_regexec(). It would help the review a lot if these were
documented with proper comments, even if these act roughly as wrappers
for pg_regexec() and pg_regcomp().
Fully agree, comments were missing. They've been added in v3 attached.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0001-hba_with_regexp.patchtext/plain; charset=UTF-8; name=v3-0001-hba_with_regexp.patchDownload+275-81
On Fri, Sep 09, 2022 at 03:05:18PM -0700, Jacob Champion wrote:
On Thu, Sep 8, 2022 at 5:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jacob Champion <jchampion@timescale.com> writes:
I think you're going to have to address backwards compatibility
concerns. Today, I can create a role named "/a", and I can put that
into the HBA without quoting it. I'd be unamused if, after an upgrade,
my rule suddenly matched any role name containing an 'a'.Meh ... that concern seems overblown to me. I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible. Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.What percentage of users actually use pg_ident maps? My assumption
would be that a change to pg_hba would affect many more people, but
then I don't have any proof that there are users with role names that
look like that to begin with. I won't pound the table with it.
This concern does not sound overblown to me. A change in pg_hba.conf
impacts everybody. I was just looking at this patch, and the logic
with usernames and databases is changed so as we would *always* treat
*any* entries beginning with a slash as a regular expression, skipping
them if they don't match with an error fed to the logs and
pg_hba_file_rules. This could lead to silent security issues as
stricter HBA policies need to be located first in pg_hba.conf and
these could suddenly fail to load. It would be much safer to me if we
had in place some restrictions to avoid such problems to happen,
meaning some extra checks in the DDL code paths for such object names
and perhaps even something in pg_upgrade with a scan at pg_database
and pg_authid.
On the bright side, I have been looking at some of the RFCs covering
the set of characters allowed in DNS names and slashes are not
authorized in hostnames, making this change rather safe AFAIK.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Thu, Sep 8, 2022 at 5:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Meh ... that concern seems overblown to me. I guess it's possible
that somebody has an HBA entry that looks like that, but it doesn't
seem very plausible. Note that we made this exact same change in
pg_ident.conf years ago, and AFAIR we got zero complaints.
This concern does not sound overblown to me.
You have to assume that somebody (a) has a role or DB name starting
with slash, (b) has an explicit reference to that name in their
pg_hba.conf, (c) doesn't read the release notes, and (d) doesn't
notice that things are misbehaving until after some hacker manages
to break into their installation on the strength of the misbehaving
entry. OK, I'll grant that the probability of (c) is depressingly
close to unity; but each of the other steps seems quite low probability.
All four of them happening in one installation is something I doubt
will happen.
On the contrary side, if we make this work differently from the
pg_ident.conf precedent, or install weird rules to try to prevent
accidental misinterpretations, that could also lead to security
problems because things don't work as someone would expect. I see
no a-priori reason to believe that this risk is negligible compared
to the other one.
regards, tom lane
On Tue, Sep 20, 2022 at 12:09:33AM -0400, Tom Lane wrote:
You have to assume that somebody (a) has a role or DB name starting
with slash, (b) has an explicit reference to that name in their
pg_hba.conf, (c) doesn't read the release notes, and (d) doesn't
notice that things are misbehaving until after some hacker manages
to break into their installation on the strength of the misbehaving
entry. OK, I'll grant that the probability of (c) is depressingly
close to unity; but each of the other steps seems quite low probability.
All four of them happening in one installation is something I doubt
will happen.
It is the kind of things that could blow up as a CVE and some bad PR
for the project, so I cannot get excited about enforcing this new rule
in an authentication file (aka before a role is authenticated) while
we are talking about 3~4 code paths (?) that would need an extra check
to make sure that no instances have such object names.
On the contrary side, if we make this work differently from the
pg_ident.conf precedent, or install weird rules to try to prevent
accidental misinterpretations, that could also lead to security
problems because things don't work as someone would expect. I see
no a-priori reason to believe that this risk is negligible compared
to the other one.
I also do like a lot the idea of making things consistent across all
the auth configuration files for all the fields where this can be
applied.
--
Michael
Hi,
On 9/20/22 6:30 AM, Michael Paquier wrote:
On Tue, Sep 20, 2022 at 12:09:33AM -0400, Tom Lane wrote:
You have to assume that somebody (a) has a role or DB name starting
with slash, (b) has an explicit reference to that name in their
pg_hba.conf, (c) doesn't read the release notes, and (d) doesn't
notice that things are misbehaving until after some hacker manages
to break into their installation on the strength of the misbehaving
entry. OK, I'll grant that the probability of (c) is depressingly
close to unity; but each of the other steps seems quite low probability.
All four of them happening in one installation is something I doubt
will happen.It is the kind of things that could blow up as a CVE and some bad PR
for the project, so I cannot get excited about enforcing this new rule
in an authentication file (aka before a role is authenticated) while
we are talking about 3~4 code paths (?) that would need an extra check
to make sure that no instances have such object names.
I also have the feeling that having (a), (b) and (d) is low probability.
That said, If the user "/john" already exists and has a hba entry then
this entry will still match with the patch. Problem is that all the
users that contain "john" would also now match.
But things get worst if say /a is an existing user and hba entry as the
entry would match any users that contains "a" with the patch.
I assume (maybe i should not) that if objects starting with / already
exist there is very good reason(s) behind. Then I don't think that
preventing their creation in the DDL would help (quite the contrary for
the ones that really need them).
It looks to me that adding a GUC (off by default) to enable/disable the
regexp usage in the hba could be a fair compromise. It won't block any
creation starting with a / and won't open more doors (if such objects
exist) by default.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Sep 19, 2022 at 9:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
You have to assume that somebody (a) has a role or DB name starting
with slash, (b) has an explicit reference to that name in their
pg_hba.conf, (c) doesn't read the release notes, and (d) doesn't
notice that things are misbehaving until after some hacker manages
to break into their installation on the strength of the misbehaving
entry. OK, I'll grant that the probability of (c) is depressingly
close to unity; but each of the other steps seems quite low probability.
All four of them happening in one installation is something I doubt
will happen.
I can't argue with (a) or (b), but (d) seems decently likely to me. If
your normal user base consists of people who are authorized to access
your system, what clues would you have that your HBA is silently
failing open?
--Jacob
On Tue, Sep 20, 2022 at 01:33:09PM +0200, Drouvot, Bertrand wrote:
I assume (maybe i should not) that if objects starting with / already exist
there is very good reason(s) behind. Then I don't think that preventing
their creation in the DDL would help (quite the contrary for the ones that
really need them).
I have been pondering on this point for the last few weeks, and I'd
like to change my opinion and side with Tom on this one as per the
very unlikeliness of this being a problem in the wild. I have studied
the places that would require restrictions but that was just feeling
adding a bit more bloat into the CREATE/ALTER ROLE paths for what's
aimed at providing a consistent experience for the user across
pg_hba.conf and pg_ident.conf.
It looks to me that adding a GUC (off by default) to enable/disable the
regexp usage in the hba could be a fair compromise. It won't block any
creation starting with a / and won't open more doors (if such objects exist)
by default.
Enforcing a behavior change in HBA policies with a GUC does not strike
me as a good thing in the long term. I am ready to bet that it would
just sit around for nothing like the compatibility GUCs.
Anyway, I have looked at the patch.
+ List *roles_re;
+ List *databases_re;
+ regex_t hostname_re;
I am surprised by the approach of using separate lists for the regular
expressions and the raw names. Wouldn't it be better to store
everything in a single list but assign an entry type? In this case it
would be either regex or plain string. This would minimize the
footprint of the changes (no extra arguments *_re in the routines
checking for a match on the roles, databases or hosts). And it seems
to me that this would make unnecessary the use of re_num here and
there. The hostname is different, of course, requiring only an extra
field for its type, or something like that.
Perhaps the documentation would gain in clarity if there were more
examples, like a set of comma-separated examples (mix of regex and raw
strings for example, for all the field types that gain support for
regexes)?
-$node->append_conf('postgresql.conf', "log_connections = on\n");
+$node->append_conf(
+ 'postgresql.conf', qq{
+listen_addresses = '127.0.0.1'
+log_connections = on
+});
Hmm. I think that we may need to reconsider the location of the tests
for the regexes with the host name, as the "safe" regression tests
should not switch listen_addresses. One location where we already do
that is src/test/ssl/, so these could be moved there. Keeping the
database and user name parts in src/test/authentication/ is fine.
Something that stood out on a first review is the refactoring of
001_password.pl that can be done independently of the main patch:
- test_role() -> test_conn() to be able to pass down a database name.
- reset_pg_hba() to control the host, db and user parts. The host
part does not really apply after moving the hosts checks to a more
secure location, so I guess that this had better be extended just for
the user and database, keeping host=local all the time.
I am planning to apply 0001 attached independently, reducing the
footprint of 0002, which is your previous patch left untouched
(mostly!).
--
Michael
Hi,
On 10/5/22 9:24 AM, Michael Paquier wrote:
Something that stood out on a first review is the refactoring of
001_password.pl that can be done independently of the main patch:
Good idea, thanks for the proposal.
- test_role() -> test_conn() to be able to pass down a database name.
- reset_pg_hba() to control the host, db and user parts. The host
part does not really apply after moving the hosts checks to a more
secure location, so I guess that this had better be extended just for
the user and database, keeping host=local all the time.
I am planning to apply 0001 attached independently,
0001 looks good to me.
reducing the
footprint of 0002, which is your previous patch left untouched
(mostly!).
Thanks! I'll look at it and the comments you just made up-thread.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Oct 05, 2022 at 03:32:20PM +0200, Drouvot, Bertrand wrote:
On 10/5/22 9:24 AM, Michael Paquier wrote:
- test_role() -> test_conn() to be able to pass down a database name.
- reset_pg_hba() to control the host, db and user parts. The host
part does not really apply after moving the hosts checks to a more
secure location, so I guess that this had better be extended just for
the user and database, keeping host=local all the time.
I am planning to apply 0001 attached independently,0001 looks good to me.
Thanks. I have applied this refactoring, leaving the host part out of
the equation as we should rely only on local connections for this
part of the test. The best fit I can think about for the checks on
the hostname patterns would be either the ssl, ldap or krb5 tests.
SSL is more widely tested than the two others.
reducing the
footprint of 0002, which is your previous patch left untouched
(mostly!).Thanks! I'll look at it and the comments you just made up-thread.
Cool, thanks. One thing that matters a lot IMO (that I forgot to
mention previously) is to preserve the order of the items parsed from
the configuration files.
Also, I am wondering whether we'd better have some regression tests
where a regex includes a comma and a role name itself has a comma,
actually, just to stress more the parsing of individual items in the
HBA file.
--
Michael
Hi,
On 10/5/22 9:24 AM, Michael Paquier wrote:
On Tue, Sep 20, 2022 at 01:33:09PM +0200, Drouvot, Bertrand wrote:
Anyway, I have looked at the patch.+ List *roles_re;
+ List *databases_re;
+ regex_t hostname_re;
I am surprised by the approach of using separate lists for the regular
expressions and the raw names. Wouldn't it be better to store
everything in a single list but assign an entry type? In this case it
would be either regex or plain string. This would minimize the
footprint of the changes (no extra arguments *_re in the routines
checking for a match on the roles, databases or hosts). And it seems
to me that this would make unnecessary the use of re_num here and
there.
Please find attached v5 addressing this. I started with an union but it
turns out that we still need the plain string when a regex is used. This
is not needed for the authentication per say but for fill_hba_line(). So
I ended up creating a new struct without union in v5.
The hostname is different, of course, requiring only an extra
field for its type, or something like that.
I'm using the same new struct as described above for the hostname.
Perhaps the documentation would gain in clarity if there were more
examples, like a set of comma-separated examples (mix of regex and raw
strings for example, for all the field types that gain support for
regexes)?
Right, I added more examples in v5.
-$node->append_conf('postgresql.conf', "log_connections = on\n"); +$node->append_conf( + 'postgresql.conf', qq{ +listen_addresses = '127.0.0.1' +log_connections = on +}); Hmm. I think that we may need to reconsider the location of the tests for the regexes with the host name, as the "safe" regression tests should not switch listen_addresses. One location where we already do that is src/test/ssl/, so these could be moved there.
Good point, I moved the hostname related tests in src/test/ssl.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com