Feature: Use DNS SRV records for connecting

Started by Feike Steenbergenover 6 years ago5 messageshackers
Jump to latest
#1Feike Steenbergen
feikesteenbergen@gmail.com

Hi all,

I'd like to get some feedback on whether or not implementing a DNS SRV feature
for connecting to PostgreSQL would be desirable/useful.

The main use case is to have a DNS SRV record that lists all the possible
primaries of a given replicated PostgreSQL cluster. With auto failover
solutions like patroni, pg_auto_failover, stolon, etc. any of these endpoints
could be serving the primary server at any point in time.

Combined with target_session_attrs a connection string to a highly-available
cluster could be something like:

psql "dnssrv=mydb.prod.example.com target_session_attr=read_write"

Which would then resolve the SRV record _postgresql._tcp.mydb.prod.example.com
and using the method described in RFC 2782 connect to the host/port combination
one by one until it finds the primary.

A benefit of using SRV records would be that the port is also part of the DNS
record and therefore a single IP could be used to serve many databases on
separate ports. When working with a cloud environment or containerized setup
(or both) this would open up some good possibilities.

Note: We currently can already do this somehow by specifying multiple
hosts/ports in the connection string, however it would be useful if we could
refer to a single SRV record instead, as that would have a list of hosts
and ports to connect to.

DNS SRV is described in detail here:
https://tools.ietf.org/html/rfc2782

I'd love to hear some support/dissent,

regards,

Feike

#2Graham Leggett
minfrin@sharp.fm
In reply to: Feike Steenbergen (#1)
Re: Feature: Use DNS SRV records for connecting

On 13 Aug 2019, at 11:50, Feike Steenbergen <feikesteenbergen@gmail.com> wrote:

I'd like to get some feedback on whether or not implementing a DNS SRV feature
for connecting to PostgreSQL would be desirable/useful.

A big +1.

We currently use SRV records to tell postgresql what kind of server it is. This way all of our postgresql servers have an identical configuration, they just tailor themselves on startup as appropriate:

_postgresql-master._tcp.sql.example.com.

The above record in our case declares who the master is. If the postgresql startup says “hey, that’s me” it configures itself as a master. If the postgresql startup says “hey, that’s not me” it configures itself as a slave of the master.

We also use TXT records to define the databases we want (with protection against DNS security issues, we never remove a database based on a TXT record, but signed DNS records will help here).

_postgresql.sql.example.com TXT "v=PGSQL1;d=mydb;u=myuser"

We use a series of systemd “daemons” that are configured to run before and after postgresql to do the actual configuration on bootup, but it would be great if postgresql could just do this out the box.

Regards,
Graham

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Feike Steenbergen (#1)
Re: Feature: Use DNS SRV records for connecting

Feike Steenbergen <feikesteenbergen@gmail.com> writes:

I'd like to get some feedback on whether or not implementing a DNS SRV feature
for connecting to PostgreSQL would be desirable/useful.

How would we get at that data without writing our own DNS client?
(AFAIK, our existing DNS interactions are all handled by getnameinfo()
or other library-supplied functions.)

Maybe that'd be worth doing, but it sounds like a lot of work and a
lot of new code to maintain, relative to the value of the feature.

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: Feature: Use DNS SRV records for connecting

Hi,

On 2019-08-13 10:43:07 -0400, Tom Lane wrote:

How would we get at that data without writing our own DNS client?
(AFAIK, our existing DNS interactions are all handled by getnameinfo()
or other library-supplied functions.)

Maybe that'd be worth doing, but it sounds like a lot of work and a
lot of new code to maintain, relative to the value of the feature.

It might have enough independent advantages to make it worthwhile
though.

Right now our non-blocking interfaces aren't actually in a number of
cases, due to name resolution being blocking. While that's documented,
it imo means that our users need to use a non-blocking DNS library, if
they need non-blocking PQconnectPoll() - it's imo not that practical to
just use IPs in most cases.

We also don't have particularly good control over the order of hostnames
returned by getaddrinfo, which makes it harder to implement reliable
round-robin etc.

Greetings,

Andres Freund

#5Andrey Borodin
amborodin@acm.org
In reply to: Andres Freund (#4)
Re: Feature: Use DNS SRV records for connecting

On 14 Aug 2019, at 23:01, Andres Freund <andres@anarazel.de> wrote:

On 2019-08-13 10:43:07 -0400, Tom Lane wrote:

How would we get at that data without writing our own DNS client?
(AFAIK, our existing DNS interactions are all handled by getnameinfo()
or other library-supplied functions.)

Maybe that'd be worth doing, but it sounds like a lot of work and a
lot of new code to maintain, relative to the value of the feature.

It might have enough independent advantages to make it worthwhile
though.

Here is a patch prototyping SRV-based discovery for libpq, reviving
the idea from the 2019 thread.

Tom asked how we'd get SRV data without writing our own DNS client.
On POSIX, res_query(3) from libresolv does the lookup and
dn_expand(3) decompresses names in the response - the same library
that Kerberos and LDAP support already depend on. On Windows,
DnsQuery() from windns.h returns a typed linked list with no wire-format
parsing needed. The resolver is about 200 lines and uses no DNS logic
of our own.

Regarding non-blocking resolution raised by Andres: res_query() is
blocking, as is the existing getaddrinfo() call for regular hosts.
This patch does not make things worse, but does not improve them
either. Async DNS resolution is a separate, larger problem.

A new connection parameter "srvhost" (env: "PGSRVHOST") causes libpq
to query "_postgresql._tcp.<srvhost>" before connecting. I chose
"srvhost" over the original "dnssrv" suggestion to align with the
existing "host" parameter naming. Two URI schemes are supported as
shorthand:

psql "postgresql+srv://cluster.example.com/mydb?target_session_attrs=read-write"
psql "postgres+srv://cluster.example.com/mydb?target_session_attrs=read-write"

Given the DNS records:

_postgresql._tcp.cluster.example.com. SRV 10 50 5432 primary.example.com.
_postgresql._tcp.cluster.example.com. SRV 20 50 5432 standby1.example.com.
_postgresql._tcp.cluster.example.com. SRV 20 50 5432 standby2.example.com.

The resolved host list is sorted per RFC 2782 and injected into the
existing multi-host machinery before connhost[] is built, so
target_session_attrs, load_balance_hosts, and failover work on the
expanded list without any changes to PQconnectPoll.

"srvhost" is mutually exclusive with "host" and "hostaddr". Multiple
hosts in a "+srv" URI are rejected - expansion is DNS's responsibility.

I'm proposing this across the PostgreSQL driver ecosystem (pgx, pgjdbc,
npgsql, and now libpq). I would be happy to hear a feedback.

Best regards, Andrey Borodin.

Attachments:

0001-libpq-Add-DNS-SRV-record-support-for-service-discove.patchapplication/octet-stream; name=0001-libpq-Add-DNS-SRV-record-support-for-service-discove.patch; x-unix-mode=0644Download+863-7