default result formats setting

Started by Peter Eisentrautabout 5 years ago23 messages
#1Peter Eisentraut
peter.eisentraut@2ndquadrant.com
1 attachment(s)

During the discussion on dynamic result sets[0]/messages/by-id/6e747f98-835f-2e05-cde5-86ee444a7140@2ndquadrant.com, it became apparent that
the current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses, and keeping that
style around would also make the proposed protocol extensions very
complicated.

The premise here is that a client library has hard-coded knowledge on
how to deal with binary format for certain, but not all, data types.
(Most client libraries process everything in text, and some client
libraries process everything in binary. Neither of these extremes are
of concern here.) Such a client always has to request a result row
description (Describe statement) before sending a Bind message, in order
to be able to pick out the result columns in should request in binary.
The feedback was that this extra round trip is often not worth it in
terms of performance, and so it is not done and binary format is not
used when it could be.

The conceptual solution is to allow a client to register for a session
which types it wants to always get in binary, unless it says otherwise.
In the discussion in [0]/messages/by-id/6e747f98-835f-2e05-cde5-86ee444a7140@2ndquadrant.com, I pondered a new protocol message for that,
but after further thought, a GUC setting would do just as well.

The attached patch implements this. For example, to get int2, int4,
int8 in binary by default, you could set

SET default_result_formats = '21=1,23=1,20=1';

This is a list of oid=format pairs.

I think this format satisfies the current requirements of the JDBC
driver. But the format could also be extended in the future to allow
type names to be listed or some other ways of identifying the types.

In order to be able to test this via libpq, I had to add a little hack.
Currently, PQexecParams() and similar functions can only pass exactly
one result format code, which per protocol is then applied to all result
columns. There is no support for sending zero result format codes to
make the session default apply. I enabled this by allowing -1 to be
passed as the format code. I'm not sure if we want to make this part of
the official API, but it would be useful to have something like this
somehow.

[0]: /messages/by-id/6e747f98-835f-2e05-cde5-86ee444a7140@2ndquadrant.com
/messages/by-id/6e747f98-835f-2e05-cde5-86ee444a7140@2ndquadrant.com

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v1-0001-Add-default_result_formats-setting.patchtext/plain; charset=UTF-8; name=v1-0001-Add-default_result_formats-setting.patch; x-mac-creator=0; x-mac-type=0Download
From fa0e6968985cbe1f100746ea562b7f7712baf646 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 26 Oct 2020 09:10:43 +0100
Subject: [PATCH v1] Add default_result_formats setting

---
 doc/src/sgml/config.sgml       |  42 ++++++++++++++
 doc/src/sgml/libpq.sgml        |   3 +
 doc/src/sgml/protocol.sgml     |   2 +-
 src/backend/tcop/pquery.c      | 100 ++++++++++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c   |  12 ++++
 src/include/tcop/pquery.h      |   5 ++
 src/interfaces/libpq/fe-exec.c |  14 ++++-
 7 files changed, 173 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f043433e31..6009e13899 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8530,6 +8530,48 @@ <title>Statement Behavior</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-default-result-formats" xreflabel="default_result_formats">
+      <term><varname>default_result_formats</varname> (<type>string</type>)
+      <indexterm>
+       <primary><varname>default_result_formats</varname></primary>
+       <secondary>configuration parameter</secondary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        This parameter specifies the default result formats by data type for
+        rows returned in the extended query protocol when no result formats
+        are specified in the Bind message.  It is intended to be used by
+        client libraries that prefer to handle certain data types in binary
+        format.  The typical usage would be that the client library sets this
+        value when it starts a connection.  (A client library that wants to
+        handle <emphasis>all</emphasis> types in binary doesn't need to use
+        this because it can just specify the format code for all types at once
+        in the protocol message.)
+       </para>
+       <para>
+        The value is a list of
+        <replaceable>typeoid</replaceable>=<replaceable>format</replaceable>,
+        separated by commas.  <replaceable>typeoid</replaceable> is the OID of
+        a type, from the <structname>pg_type</structname> system catalog.
+        <replaceable>format</replaceable> is the format code, currently 0 for
+        text and 1 for binary.  0 is the default for all types, so only 1
+        needs to be specified explicitly.  For example, if you want to
+        automatically get values of the types <type>int2</type>,
+        <type>int4</type>, and <type>int8</type> in binary while leaving the
+        rest in text, an appropriate setting would be
+        <literal>21=1,23=1,20=1</literal>.
+       </para>
+       <para>
+        Invalid format codes are an error.  Nonexistent type OIDs are not
+        diagnosed.  This setting applies only to result rows from the extended
+        query protocol, so it does not affect usage of
+        <application>psql</application> or <application>pg_dump</application>
+        for example.  Also, it does not affect the format of query parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index de60281fcb..6361d5a93b 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2711,6 +2711,9 @@ <title>Main Functions</title>
             results in binary format.  (There is not currently a provision
             to obtain different result columns in different formats,
             although that is possible in the underlying protocol.)
+
+            XXX Specify -1 to send no result formats, so that <xref
+            linkend="guc-default-result-formats"/> can take effect.
            </para>
           </listitem>
          </varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3a64db6f55..e98f92ee78 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -3694,7 +3694,7 @@ <title>Message Formats</title>
                 (denoted <replaceable>R</replaceable> below).
                 This can be zero to indicate that there are no result columns
                 or that the result columns should all use the default format
-                (text);
+                (usually text, but see <xref linkend="guc-default-result-formats"/>);
                 or one, in which case the specified format code is applied
                 to all result columns (if any); or it can equal the actual
                 number of result columns of the query.
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 96ea74f118..1dff772d1e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -26,6 +26,7 @@
 #include "tcop/utility.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
+#include "utils/varlena.h"
 
 
 /*
@@ -597,6 +598,98 @@ PortalStart(Portal portal, ParamListInfo params,
 	portal->status = PORTAL_READY;
 }
 
+char *default_result_formats;
+
+bool
+check_default_result_formats(char **newval, void **extra, GucSource source)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(*newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		GUC_check_errdetail("List syntax is invalid.");
+		pfree(rawstring);
+		list_free(elemlist);
+		return false;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		short int	format;
+
+		if (sscanf(str, "%u=%hd", &oid, &format) != 2)
+		{
+			GUC_check_errdetail("Invalid list entry: %s", str);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+
+		if (format !=0 && format != 1)
+		{
+			GUC_check_errdetail("Invalid format code: %d", format);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+List *default_result_formats_binary = NIL;
+
+void
+assign_default_result_formats(const char *newval, void *extra)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		pfree(rawstring);
+		list_free(elemlist);
+		return;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		short int	format;
+		MemoryContext oldcontext;
+
+		if (sscanf(str, "%u=%hd", &oid, &format) != 2)
+		{
+			pfree(rawstring);
+			list_free(elemlist);
+			return;
+		}
+
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+		switch (format)
+		{
+			case 0:
+				default_result_formats_binary = list_delete_oid(default_result_formats_binary, oid);
+				break;
+			case 1:
+				default_result_formats_binary = list_append_unique_oid(default_result_formats_binary, oid);
+				break;
+		}
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+}
+
 /*
  * PortalSetResultFormat
  *		Select the format codes for a portal's output.
@@ -642,7 +735,12 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 	{
 		/* use default format for all columns */
 		for (i = 0; i < natts; i++)
-			portal->formats[i] = 0;
+		{
+			if (list_member_oid(default_result_formats_binary, TupleDescAttr(portal->tupDesc, i)->atttypid))
+				portal->formats[i] = 1;
+			else
+				portal->formats[i] = 0;
+		}
 	}
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a62d64eaa4..8d4cd50055 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -83,6 +83,7 @@
 #include "storage/predicate.h"
 #include "storage/proc.h"
 #include "storage/standby.h"
+#include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tsearch/ts_cache.h"
 #include "utils/acl.h"
@@ -4448,6 +4449,17 @@ static struct config_string ConfigureNamesString[] =
 		check_backtrace_functions, assign_backtrace_functions, NULL
 	},
 
+	{
+		{"default_result_formats", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Which format codes to use for types if nothing else is specified in the protocol."),
+			NULL,
+			GUC_LIST_INPUT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&default_result_formats,
+		"",
+		check_default_result_formats, assign_default_result_formats, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 437642cc72..1d431fe013 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -15,10 +15,12 @@
 #define PQUERY_H
 
 #include "nodes/parsenodes.h"
+#include "utils/guc.h"
 #include "utils/portal.h"
 
 
 extern PGDLLIMPORT Portal ActivePortal;
+extern char *default_result_formats;
 
 
 extern PortalStrategy ChoosePortalStrategy(List *stmts);
@@ -30,6 +32,9 @@ extern List *FetchStatementTargetList(Node *stmt);
 extern void PortalStart(Portal portal, ParamListInfo params,
 						int eflags, Snapshot snapshot);
 
+extern bool check_default_result_formats(char **newval, void **extra, GucSource source);
+extern void assign_default_result_formats(const char *newval, void *extra);
+
 extern void PortalSetResultFormat(Portal portal, int nFormats,
 								  int16 *formats);
 
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237c3a..e41216e2a1 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1617,9 +1617,17 @@ PQsendQueryGuts(PGconn *conn,
 				goto sendFailed;
 		}
 	}
-	if (pqPutInt(1, 2, conn) < 0 ||
-		pqPutInt(resultFormat, 2, conn))
-		goto sendFailed;
+	if (resultFormat >= 0)
+	{
+		if (pqPutInt(1, 2, conn) < 0 ||
+			pqPutInt(resultFormat, 2, conn))
+			goto sendFailed;
+	}
+	else
+	{
+		if (pqPutInt(0, 2, conn) < 0)
+			goto sendFailed;
+	}
 	if (pqPutMsgEnd(conn) < 0)
 		goto sendFailed;
 
-- 
2.28.0

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#1)
Re: default result formats setting

po 26. 10. 2020 v 9:31 odesílatel Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> napsal:

During the discussion on dynamic result sets[0], it became apparent that
the current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses, and keeping that
style around would also make the proposed protocol extensions very
complicated.

The premise here is that a client library has hard-coded knowledge on
how to deal with binary format for certain, but not all, data types.
(Most client libraries process everything in text, and some client
libraries process everything in binary. Neither of these extremes are
of concern here.) Such a client always has to request a result row
description (Describe statement) before sending a Bind message, in order
to be able to pick out the result columns in should request in binary.
The feedback was that this extra round trip is often not worth it in
terms of performance, and so it is not done and binary format is not
used when it could be.

The conceptual solution is to allow a client to register for a session
which types it wants to always get in binary, unless it says otherwise.
In the discussion in [0], I pondered a new protocol message for that,
but after further thought, a GUC setting would do just as well.

The attached patch implements this. For example, to get int2, int4,
int8 in binary by default, you could set

SET default_result_formats = '21=1,23=1,20=1';

Using SET statement for this case looks very obscure :/

This is a protocol related issue, and should be solved by protocol
extending. I don't think so SQL level is good for that.

More, this format is not practical for custom types, and the list can be
pretty long.

This is a list of oid=format pairs.

I think this format satisfies the current requirements of the JDBC
driver. But the format could also be extended in the future to allow
type names to be listed or some other ways of identifying the types.

In order to be able to test this via libpq, I had to add a little hack.
Currently, PQexecParams() and similar functions can only pass exactly
one result format code, which per protocol is then applied to all result
columns. There is no support for sending zero result format codes to
make the session default apply. I enabled this by allowing -1 to be
passed as the format code. I'm not sure if we want to make this part of
the official API, but it would be useful to have something like this
somehow.

+1 to this feature, but -1 for design. It should be solved on protocol
level.

Regards

Pavel

Show quoted text

[0]:

/messages/by-id/6e747f98-835f-2e05-cde5-86ee444a7140@2ndquadrant.com

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: default result formats setting

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

The conceptual solution is to allow a client to register for a session
which types it wants to always get in binary, unless it says otherwise.

OK.

In the discussion in [0], I pondered a new protocol message for that,
but after further thought, a GUC setting would do just as well.

I think a GUC is conceptually the wrong level ...

In order to be able to test this via libpq, I had to add a little hack.

... which is part of the reason why you have to kluge this. I'm not
entirely certain which levels of the client stack need to know about
this, but surely libpq is one.

I'm also quite worried about failures (maybe even security problems)
arising from the "wrong level" of the client stack setting the GUC.

Independently of that, how would you implement "says otherwise" here,
ie do a single-query override of the session's prevailing setting?
Maybe the right thing for that is to define -1 all the way down to the
protocol level as meaning "use the session's per-type default", and
then if you don't want that you can pass 0 or 1. An advantage of that
is that you couldn't accidentally break an application that wasn't
ready for this feature, because it would not be the default to use it.

regards, tom lane

#4Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Pavel Stehule (#2)
Re: default result formats setting

On 2020-10-26 09:45, Pavel Stehule wrote:

The attached patch implements this.  For example, to get int2, int4,
int8 in binary by default, you could set

SET default_result_formats = '21=1,23=1,20=1';

Using SET statement for this case looks very obscure :/

This is a protocol related issue, and should be solved by protocol
extending. I don't think so SQL level is good for that.

We could also make it a protocol message, but it would essentially
implement the same thing, just again separately. And then you'd have no
support to inspect the current setting, test out different settings
interactively, etc. That seems pretty wasteful and complicated for no
real gain.

More, this format is not practical for custom types, and the list can
be pretty long.

The list is what the list is. I don't see how you can make it any
shorter. You have to list the data types that you're interested in
somehow. Any other ideas?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: default result formats setting

On 2020-10-26 15:35, Tom Lane wrote:

In the discussion in [0], I pondered a new protocol message for that,
but after further thought, a GUC setting would do just as well.

I think a GUC is conceptually the wrong level ...

It does feel that way, but it gets the job done well and you can use all
the functionality already existing, such as being able to inspect
settings, temporarily change settings, etc. Otherwise we'd have to
implement a lot of things like that again. That would turn this 200
line patch into a 2000 line patch without any real additional benefit.

In order to be able to test this via libpq, I had to add a little hack.

... which is part of the reason why you have to kluge this. I'm not
entirely certain which levels of the client stack need to know about
this, but surely libpq is one.

I'm also quite worried about failures (maybe even security problems)
arising from the "wrong level" of the client stack setting the GUC.

I don't think libpq needs to know about this very deeply. The protocol
provides format information with the result set. Libpq programs can
query that with PQfformat() and act accordingly. Nothing else is needed.

The real consumer of this would be the JDBC driver, which has built-in
knowledge of the binary formats of some data types. Libpq doesn't, so
it wouldn't use this facility anyway. (Not saying someone couldn't
write a higher-level C library that does this, but it doesn't exist now.
... hmm ... ecpg ...)

Independently of that, how would you implement "says otherwise" here,
ie do a single-query override of the session's prevailing setting?
Maybe the right thing for that is to define -1 all the way down to the
protocol level as meaning "use the session's per-type default", and
then if you don't want that you can pass 0 or 1. An advantage of that
is that you couldn't accidentally break an application that wasn't
ready for this feature, because it would not be the default to use it.

Yeah, that sounds a lot better. I'll look into that.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#4)
Re: default result formats setting

čt 5. 11. 2020 v 21:48 odesílatel Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> napsal:

On 2020-10-26 09:45, Pavel Stehule wrote:

The attached patch implements this. For example, to get int2, int4,
int8 in binary by default, you could set

SET default_result_formats = '21=1,23=1,20=1';

Using SET statement for this case looks very obscure :/

This is a protocol related issue, and should be solved by protocol
extending. I don't think so SQL level is good for that.

We could also make it a protocol message, but it would essentially
implement the same thing, just again separately. And then you'd have no
support to inspect the current setting, test out different settings
interactively, etc. That seems pretty wasteful and complicated for no
real gain.

If you need a debug API, then it can be better implemented with functions.
But why do you need it on SQL level?

This is a protocol related thing.

Show quoted text

More, this format is not practical for custom types, and the list can
be pretty long.

The list is what the list is. I don't see how you can make it any
shorter. You have to list the data types that you're interested in
somehow. Any other ideas?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Peter Eisentraut (#5)
1 attachment(s)
Re: default result formats setting

On 2020-11-05 22:03, Peter Eisentraut wrote:

Independently of that, how would you implement "says otherwise" here,
ie do a single-query override of the session's prevailing setting?
Maybe the right thing for that is to define -1 all the way down to the
protocol level as meaning "use the session's per-type default", and
then if you don't want that you can pass 0 or 1. An advantage of that
is that you couldn't accidentally break an application that wasn't
ready for this feature, because it would not be the default to use it.

Yeah, that sounds a lot better. I'll look into that.

Here is a new patch updated to work that way. Feels better now.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachments:

v2-0001-Add-default_result_formats-setting.patchtext/plain; charset=UTF-8; name=v2-0001-Add-default_result_formats-setting.patch; x-mac-creator=0; x-mac-type=0Download
From 30d3118e3f104fe1eef50c5782ab0fb5c2fb2b55 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 9 Nov 2020 11:06:07 +0100
Subject: [PATCH v2] Add default_result_formats setting

The current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses.  Some clients, for
example the JDBC driver, have built-in support for handling specific
data types in binary.  Such a client would always have to request a
result row description (Describe statement) before sending a Bind
message, in order to be able to pick out the result columns it should
request in binary.  The feedback was that this extra round trip is
often not worth it in terms of performance, and so it is not done and
binary format is not used when it could be.

The solution is to allow a client to register for a session which
types it wants to always get in binary.  This is done by a new GUC
setting.  For example, to get int2, int4, int8 in binary by default,
you could set

SET default_result_formats = '21=1,23=1,20=1';

This is a list of oid=format pairs.

To request result formats based on this setting, send format code
-1 (instead of 0 or 1) in the Bind message.

This format satisfies the current requirements of the JDBC
driver (which knows about types by their OID).  But the format could
also be extended in the future to allow type names to be listed or
some other ways of identifying the types.

Discussion: https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
---
 doc/src/sgml/config.sgml     |  43 +++++++++++++
 doc/src/sgml/libpq.sgml      |  10 +--
 doc/src/sgml/protocol.sgml   |   7 ++-
 src/backend/tcop/pquery.c    | 119 ++++++++++++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c |  12 ++++
 src/include/tcop/pquery.h    |   5 ++
 6 files changed, 186 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f043433e31..8703c06538 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8530,6 +8530,49 @@ <title>Statement Behavior</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-default-result-formats" xreflabel="default_result_formats">
+      <term><varname>default_result_formats</varname> (<type>string</type>)
+      <indexterm>
+       <primary><varname>default_result_formats</varname></primary>
+       <secondary>configuration parameter</secondary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        This parameter specifies the default result formats by data type for
+        rows returned in the extended query protocol when result format code
+        -1 is specified in the <link linkend="protocol-message-Bind">Bind
+        message</link>.  It is intended to be used by client libraries that
+        prefer to handle specific, but not all, data types in binary format.
+        The typical usage would be that the client library sets this value
+        when it starts a connection.  (A client library that wants to handle
+        <emphasis>all</emphasis> types in binary doesn't need to use this
+        because it can just specify the format code for all types at once in
+        the protocol message.)
+       </para>
+       <para>
+        The value is a list of
+        <replaceable>typeoid</replaceable>=<replaceable>format</replaceable>,
+        separated by commas.  <replaceable>typeoid</replaceable> is the OID of
+        a type, from the <structname>pg_type</structname> system catalog.
+        <replaceable>format</replaceable> is the format code, currently 0 for
+        text and 1 for binary.  0 is the default for all types, so only 1
+        needs to be specified explicitly.  For example, if you want to
+        automatically get values of the types <type>int2</type>,
+        <type>int4</type>, and <type>int8</type> in binary while leaving the
+        rest in text, an appropriate setting would be
+        <literal>21=1,23=1,20=1</literal>.
+       </para>
+       <para>
+        Invalid format codes are an error.  Nonexistent type OIDs are not
+        diagnosed.  This setting applies only to result rows from the extended
+        query protocol, so it does not affect usage of
+        <application>psql</application> or <application>pg_dump</application>
+        for example.  Also, it does not affect the format of query parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 9ce32fb39b..14560c271f 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2707,10 +2707,12 @@ <title>Main Functions</title>
           <term><parameter>resultFormat</parameter></term>
           <listitem>
            <para>
-            Specify zero to obtain results in text format, or one to obtain
-            results in binary format.  (There is not currently a provision
-            to obtain different result columns in different formats,
-            although that is possible in the underlying protocol.)
+            Specify 0 to obtain results in text format, or 1 to obtain results
+            in binary format, or -1 to have the setting of <xref
+            linkend="guc-default-result-formats"/> be applied by the server.
+            (There is not currently a provision to obtain different result
+            columns in different formats, although that is possible in the
+            underlying protocol.)
            </para>
           </listitem>
          </varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9a95d7b734..4e18ab7817 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -3565,7 +3565,7 @@ <title>Message Formats</title>
 </varlistentry>
 
 
-<varlistentry>
+<varlistentry id="protocol-message-Bind">
 <term>
 Bind (F)
 </term>
@@ -3707,8 +3707,9 @@ <title>Message Formats</title>
 </term>
 <listitem>
 <para>
-                The result-column format codes.  Each must presently be
-                zero (text) or one (binary).
+                The result-column format codes.  Each must be 0 for text, or 1
+                for binary, or -1 to apply the setting of <xref
+                linkend="guc-default-result-formats"/>.
 </para>
 </listitem>
 </varlistentry>
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 96ea74f118..2560f116a5 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -26,6 +26,7 @@
 #include "tcop/utility.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
+#include "utils/varlena.h"
 
 
 /*
@@ -597,6 +598,116 @@ PortalStart(Portal portal, ParamListInfo params,
 	portal->status = PORTAL_READY;
 }
 
+char *default_result_formats;
+
+bool
+check_default_result_formats(char **newval, void **extra, GucSource source)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(*newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		GUC_check_errdetail("List syntax is invalid.");
+		pfree(rawstring);
+		list_free(elemlist);
+		return false;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		short int	format;
+
+		if (sscanf(str, "%u=%hd", &oid, &format) != 2)
+		{
+			GUC_check_errdetail("Invalid list entry: %s", str);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+
+		if (format !=0 && format != 1)
+		{
+			GUC_check_errdetail("Invalid format code: %d", format);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+List *default_result_formats_binary = NIL;
+
+void
+assign_default_result_formats(const char *newval, void *extra)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		pfree(rawstring);
+		list_free(elemlist);
+		return;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		short int	format;
+		MemoryContext oldcontext;
+
+		if (sscanf(str, "%u=%hd", &oid, &format) != 2)
+		{
+			pfree(rawstring);
+			list_free(elemlist);
+			return;
+		}
+
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+		switch (format)
+		{
+			case 0:
+				default_result_formats_binary = list_delete_oid(default_result_formats_binary, oid);
+				break;
+			case 1:
+				default_result_formats_binary = list_append_unique_oid(default_result_formats_binary, oid);
+				break;
+		}
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+}
+
+/*
+ * Convenience routine for PortalSetResultFormat(): Return format code,
+ * resolving code -1 by using default_result_formats setting.
+ */
+static int16
+resolve_result_format(int16 format, Form_pg_attribute attr)
+{
+	if (format == -1)
+	{
+		if (list_member_oid(default_result_formats_binary, attr->atttypid))
+			return 1;
+		else
+			return 0;
+	}
+	else
+		return format;
+}
+
 /*
  * PortalSetResultFormat
  *		Select the format codes for a portal's output.
@@ -628,7 +739,9 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 					(errcode(ERRCODE_PROTOCOL_VIOLATION),
 					 errmsg("bind message has %d result formats but query has %d columns",
 							nFormats, natts)));
-		memcpy(portal->formats, formats, natts * sizeof(int16));
+
+		for (i = 0; i < natts; i++)
+			portal->formats[i] = resolve_result_format(formats[i], TupleDescAttr(portal->tupDesc, i));
 	}
 	else if (nFormats > 0)
 	{
@@ -636,11 +749,11 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 		int16		format1 = formats[0];
 
 		for (i = 0; i < natts; i++)
-			portal->formats[i] = format1;
+			portal->formats[i] = resolve_result_format(format1, TupleDescAttr(portal->tupDesc, i));
 	}
 	else
 	{
-		/* use default format for all columns */
+		/* by default use text format for all columns */
 		for (i = 0; i < natts; i++)
 			portal->formats[i] = 0;
 	}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bb34630e8e..b6ebcf40aa 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -83,6 +83,7 @@
 #include "storage/predicate.h"
 #include "storage/proc.h"
 #include "storage/standby.h"
+#include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tsearch/ts_cache.h"
 #include "utils/acl.h"
@@ -4448,6 +4449,17 @@ static struct config_string ConfigureNamesString[] =
 		check_backtrace_functions, assign_backtrace_functions, NULL
 	},
 
+	{
+		{"default_result_formats", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Which format codes to use for types if nothing else is specified in the protocol."),
+			NULL,
+			GUC_LIST_INPUT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&default_result_formats,
+		"",
+		check_default_result_formats, assign_default_result_formats, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 437642cc72..1d431fe013 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -15,10 +15,12 @@
 #define PQUERY_H
 
 #include "nodes/parsenodes.h"
+#include "utils/guc.h"
 #include "utils/portal.h"
 
 
 extern PGDLLIMPORT Portal ActivePortal;
+extern char *default_result_formats;
 
 
 extern PortalStrategy ChoosePortalStrategy(List *stmts);
@@ -30,6 +32,9 @@ extern List *FetchStatementTargetList(Node *stmt);
 extern void PortalStart(Portal portal, ParamListInfo params,
 						int eflags, Snapshot snapshot);
 
+extern bool check_default_result_formats(char **newval, void **extra, GucSource source);
+extern void assign_default_result_formats(const char *newval, void *extra);
+
 extern void PortalSetResultFormat(Portal portal, int nFormats,
 								  int16 *formats);
 

base-commit: ef60de67ebde6dfd1ea09b4d08166736bf05698c
-- 
2.29.1

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#7)
Re: default result formats setting

On 11/9/20 5:10 AM, Peter Eisentraut wrote:

On 2020-11-05 22:03, Peter Eisentraut wrote:

Independently of that, how would you implement "says otherwise" here,
ie do a single-query override of the session's prevailing setting?
Maybe the right thing for that is to define -1 all the way down to the
protocol level as meaning "use the session's per-type default", and
then if you don't want that you can pass 0 or 1.  An advantage of that
is that you couldn't accidentally break an application that wasn't
ready for this feature, because it would not be the default to use it.

Yeah, that sounds a lot better.  I'll look into that.

Here is a new patch updated to work that way.  Feels better now.

I think this is conceptually OK, although it feels a bit odd.

Might it be better to have the values as typename={binary,text} pairs
instead of oid={0,1} pairs, which are fairly opaque? That might make
things easier for things like UDTs where the oid might not be known or
constant.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#9Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andrew Dunstan (#8)
1 attachment(s)
Re: default result formats setting

On 2020-11-16 16:15, Andrew Dunstan wrote:

I think this is conceptually OK, although it feels a bit odd.

Might it be better to have the values as typename={binary,text} pairs
instead of oid={0,1} pairs, which are fairly opaque? That might make
things easier for things like UDTs where the oid might not be known or
constant.

Yes, type names would be better. I was hesitant because of all the
parsing work involved, but I bit the bullet and did it in the new patch.

To simplify the format, I changed the parameter so it's just a list of
types that you want in binary, rather than type=value pairs. If we ever
want to add another format, we would revisit this, but it seems unlikely
in the near future.

Also, I have changed the naming of the parameter since this is no longer
the "default" but something you choose explicitly. I'm thinking in the
direction of "auto" mode for the naming. Obviously, the name is easy to
tweak in any case.

--
Peter Eisentraut
2ndQuadrant, an EDB company
https://www.2ndquadrant.com/

Attachments:

v3-0001-Add-result_format_auto_binary_types-setting.patchtext/plain; charset=UTF-8; name=v3-0001-Add-result_format_auto_binary_types-setting.patch; x-mac-creator=0; x-mac-type=0Download
From cfe184cbb45134360713405bb4d39b674a298672 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 25 Nov 2020 07:46:25 +0100
Subject: [PATCH v3] Add result_format_auto_binary_types setting

The current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses.  Some clients, for
example the JDBC driver, have built-in support for handling specific
data types in binary.  Such a client would always have to request a
result row description (Describe statement) before sending a Bind
message, in order to be able to pick out the result columns it should
request in binary.  The feedback was that this extra round trip is
often not worth it in terms of performance, and so it is not done and
binary format is not used when it could be.

The solution is to allow a client to register for a session which
types it wants to always get in binary.  This is done by a new GUC
setting.  For example, to get int2, int4, int8 in binary by default,
you could set

    SET result_format_auto_binary_types = int2,int4,int8;

To request result formats based on this setting, send format code
-1 (instead of 0 or 1) in the Bind message.

Discussion: https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
---
 doc/src/sgml/config.sgml     |  53 ++++++++
 doc/src/sgml/libpq.sgml      |  10 +-
 doc/src/sgml/protocol.sgml   |   7 +-
 src/backend/tcop/pquery.c    | 246 ++++++++++++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c |  12 ++
 src/include/tcop/pquery.h    |   5 +
 6 files changed, 323 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3795c57004..20049fe7df 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8530,6 +8530,59 @@ <title>Statement Behavior</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-result-format-auto-binary-types" xreflabel="result_format_auto_binary_types">
+      <term><varname>result_format_auto_binary_types</varname> (<type>string</type>)
+      <indexterm>
+       <primary><varname>result_format_auto_binary_types</varname></primary>
+       <secondary>configuration parameter</secondary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        This parameter specifies which data types are to be sent from the
+        server in binary format for rows returned in the extended query
+        protocol when result format code -1 is specified in the <link
+        linkend="protocol-message-Bind">Bind message</link>.  It is intended
+        to be used by client libraries that prefer to handle specific, but not
+        all, data types in binary format.  The typical usage would be that the
+        client library sets this value when it starts a connection.  (A client
+        library that wants to handle <emphasis>all</emphasis> types in binary
+        doesn't need to use this because it can just specify the format code
+        for all types at once in the protocol message.)
+       </para>
+
+       <para>
+        The value is a comma-separated list of type names.  For example, if
+        you want to automatically get values of the types <type>int2</type>,
+        <type>int4</type>, and <type>int8</type> in binary while leaving the
+        rest in text, an appropriate setting would be
+        <literal>int2,int4,int8</literal> or
+        <literal>smallint,int,bigint</literal>.  Type names may also be
+        double-quoted, as in SQL syntax.  The parsing of this value is run
+        with a temporary search path that only contains the
+        <literal>pg_catalog</literal> schema.  Therefore, if data types in
+        other schemas are to be included, they must be fully schema-qualified.
+       </para>
+
+       <para>
+        A non-existing type is ignored (but a message is written to the server
+        log).  This allows using the same value for different databases that
+        might have different extensions installed.  Because validating this
+        setting requires system catalog access, invalid settings are not
+        diagnosed at the time the value is set but only when the value is used
+        the first time, which happens when the first extended query protocol
+        result is returned to the client.
+       </para>
+
+       <para>
+        This setting applies only to result rows from the extended query
+        protocol, so it does not affect usage of
+        <application>psql</application> or <application>pg_dump</application>
+        for example.  Also, it does not affect the format of query parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 9d4b6ab4a8..2a4ec7ff63 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2707,10 +2707,12 @@ <title>Main Functions</title>
           <term><parameter>resultFormat</parameter></term>
           <listitem>
            <para>
-            Specify zero to obtain results in text format, or one to obtain
-            results in binary format.  (There is not currently a provision
-            to obtain different result columns in different formats,
-            although that is possible in the underlying protocol.)
+            Specify 0 to obtain results in text format, or 1 to obtain results
+            in binary format, or -1 to have the setting of <xref
+            linkend="guc-result-format-auto-binary-types"/> be applied by the
+            server.  (There is not currently a provision to obtain different
+            result columns in different formats, although that is possible in
+            the underlying protocol.)
            </para>
           </listitem>
          </varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index cee28889e1..d0e447803a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -3567,7 +3567,7 @@ <title>Message Formats</title>
 </varlistentry>
 
 
-<varlistentry>
+<varlistentry id="protocol-message-Bind">
 <term>
 Bind (F)
 </term>
@@ -3709,8 +3709,9 @@ <title>Message Formats</title>
 </term>
 <listitem>
 <para>
-                The result-column format codes.  Each must presently be
-                zero (text) or one (binary).
+                The result-column format codes.  Each must be 0 for text, or 1
+                for binary, or -1 to apply the setting of <xref
+                linkend="guc-result-format-auto-binary-types"/>.
 </para>
 </listitem>
 </varlistentry>
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 96ea74f118..a06d5f3635 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -18,14 +18,19 @@
 #include <limits.h>
 
 #include "access/xact.h"
+#include "catalog/namespace.h"
 #include "commands/prepare.h"
 #include "executor/tstoreReceiver.h"
 #include "miscadmin.h"
+#include "parser/parse_type.h"
+#include "parser/scansup.h"
 #include "pg_trace.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
+#include "utils/inval.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
+#include "utils/syscache.h"
 
 
 /*
@@ -597,6 +602,235 @@ PortalStart(Portal portal, ParamListInfo params,
 	portal->status = PORTAL_READY;
 }
 
+/*
+ * Support for result_format_auto_binary_types setting
+ *
+ * Interpreting this setting requires catalog access, so we cannot do most of
+ * the work in the usual check or assign hooks.  Instead, we do it the first
+ * time it is used.
+ */
+
+/* GUC variable */
+char *result_format_auto_binary_types;
+
+/* remembers whether the internal representation is up to date */
+static bool result_format_auto_binary_types_internal_valid = false;
+
+/*
+ * SplitGUCTypeList() --- parse a list of types
+ *
+ * Similar to SplitIdentifierString() etc., but does not strip quotes or alter
+ * the list elements in any way, since this is done by parseTypeString()
+ * later.  It only checks for comma as a separator and keeps track of balanced
+ * double quotes.
+ */
+static bool
+SplitGUCTypeList(char *rawstring, List **namelist)
+{
+	const char separator = ',';
+	char	   *nextp = rawstring;
+	bool		done = false;
+
+	*namelist = NIL;
+
+	while (scanner_isspace(*nextp))
+		nextp++;				/* skip leading whitespace */
+
+	if (*nextp == '\0')
+		return true;			/* allow empty string */
+
+	/* At the top of the loop, we are at start of a new list element. */
+	do
+	{
+		char	   *curname;
+		char	   *endp;
+		bool		in_quote = false;
+
+		curname = nextp;
+
+		while (*nextp)
+		{
+			if (!*nextp)
+				break;
+
+			if (*nextp == '"')
+				in_quote = !in_quote;
+
+			if (*nextp == separator && !in_quote)
+				break;
+
+			nextp++;
+		}
+
+		endp = nextp;
+		if (curname == nextp)
+			return false;	/* empty element not allowed */
+
+		if (*nextp == separator)
+			nextp++;
+		else if (*nextp == '\0')
+		{
+			if (in_quote)
+				return false;
+			done = true;
+		}
+		else
+			return false;		/* invalid syntax */
+
+		/* Now safe to overwrite separator with a null */
+		*endp = '\0';
+
+		/*
+		 * Finished isolating current name --- add it to list
+		 */
+		*namelist = lappend(*namelist, curname);
+
+		/* Loop back if we didn't reach end of string */
+	} while (!done);
+
+	return true;
+}
+
+bool
+check_result_format_auto_binary_types(char **newval, void **extra, GucSource source)
+{
+	char       *rawstring;
+	List       *elemlist;
+
+	rawstring = pstrdup(*newval);
+	if (!SplitGUCTypeList(rawstring, &elemlist))
+	{
+		GUC_check_errdetail("List syntax is invalid.");
+		pfree(rawstring);
+		list_free(elemlist);
+		return false;
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+
+	return true;
+}
+
+/*
+ * GUC assign hook invalidates internal representation when the setting changes
+ */
+void
+assign_result_format_auto_binary_types(const char *newval, void *extra)
+{
+	result_format_auto_binary_types_internal_valid = false;
+}
+
+/*
+ * Invalidate when anything relevant in the system catalogs changes
+ */
+static void
+invalidate_result_format_auto_binary_types_internal(Datum arg, int cacheid, uint32 hashvalue)
+{
+	result_format_auto_binary_types_internal_valid = false;
+}
+
+/*
+ * Error context callback, so the error messages are clearer, since they
+ * happen as part of query processing, not GUC processing.
+ */
+static void
+_parse_rfabt_error_callback(void *arg)
+{
+	const char *value = (const char *) arg;
+
+	errcontext("parsing %s element \"%s\"",
+			   "result_format_auto_binary_types",
+			   value);
+}
+
+/*
+ * Subroutine for PortalSetResultFormat(): Return format code for type by
+ * using result_format_auto_binary_types setting.
+ */
+static int16
+resolve_result_format(Oid typeid)
+{
+	static List *result_format_auto_binary_types_internal = NIL;	/* type OID list */
+	static bool syscache_callback_set = false;
+
+	if (!syscache_callback_set)
+	{
+		CacheRegisterSyscacheCallback(TYPEOID, invalidate_result_format_auto_binary_types_internal, (Datum) 0);
+		syscache_callback_set = true;
+	}
+
+	if (!result_format_auto_binary_types_internal_valid)
+	{
+		MemoryContext oldcontext;
+		char	   *rawstring;
+		List	   *elemlist;
+		ListCell   *lc;
+
+		rawstring = pstrdup(result_format_auto_binary_types);
+		if (!SplitGUCTypeList(rawstring, &elemlist))
+		{
+			pfree(rawstring);
+			list_free(elemlist);
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("invalid list syntax in parameter \"%s\"",
+							"result_format_auto_binary_types")));
+		}
+
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+
+		list_free(result_format_auto_binary_types_internal);
+		result_format_auto_binary_types_internal = NIL;
+
+		foreach(lc, elemlist)
+		{
+			char	   *str = lfirst(lc);
+			ErrorContextCallback myerrcontext;
+			OverrideSearchPath temppath = { .addCatalog = true };
+			Oid			typeid;
+
+			myerrcontext.callback = _parse_rfabt_error_callback;
+			myerrcontext.arg = str;
+			myerrcontext.previous = error_context_stack;
+			error_context_stack = &myerrcontext;
+
+			/* use pg_catalog-only search_path */
+			PushOverrideSearchPath(&temppath);
+
+			parseTypeString(str, &typeid, NULL, true);
+
+			PopOverrideSearchPath();
+
+			if (typeid)
+				result_format_auto_binary_types_internal =
+					list_append_unique_oid(result_format_auto_binary_types_internal,
+										   typeid);
+			else
+				/*
+				 * We ignore nonexisting types, so that one setting can be
+				 * shared between different databases that might have
+				 * different extensions installed.  But we should diagnose
+				 * that we are ignoring a type, otherwise typos and similar
+				 * might never get noticed.
+				 */
+				ereport(LOG,
+						errmsg("type %s does not exist", str));
+
+			error_context_stack = myerrcontext.previous;
+		}
+
+		MemoryContextSwitchTo(oldcontext);
+
+		pfree(rawstring);
+		list_free(elemlist);
+
+		result_format_auto_binary_types_internal_valid = true;
+	}
+
+	return list_member_oid(result_format_auto_binary_types_internal, typeid) ? 1 : 0;
+}
+
 /*
  * PortalSetResultFormat
  *		Select the format codes for a portal's output.
@@ -628,7 +862,11 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 					(errcode(ERRCODE_PROTOCOL_VIOLATION),
 					 errmsg("bind message has %d result formats but query has %d columns",
 							nFormats, natts)));
-		memcpy(portal->formats, formats, natts * sizeof(int16));
+
+		for (i = 0; i < natts; i++)
+			portal->formats[i] = (formats[i] == -1
+								  ? resolve_result_format(TupleDescAttr(portal->tupDesc, i)->atttypid)
+								  : formats[i]);
 	}
 	else if (nFormats > 0)
 	{
@@ -636,11 +874,13 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 		int16		format1 = formats[0];
 
 		for (i = 0; i < natts; i++)
-			portal->formats[i] = format1;
+			portal->formats[i] = (format1 == -1
+								  ? resolve_result_format(TupleDescAttr(portal->tupDesc, i)->atttypid)
+								  : format1);
 	}
 	else
 	{
-		/* use default format for all columns */
+		/* by default use text format for all columns */
 		for (i = 0; i < natts; i++)
 			portal->formats[i] = 0;
 	}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bb34630e8e..79491857fa 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -83,6 +83,7 @@
 #include "storage/predicate.h"
 #include "storage/proc.h"
 #include "storage/standby.h"
+#include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tsearch/ts_cache.h"
 #include "utils/acl.h"
@@ -4448,6 +4449,17 @@ static struct config_string ConfigureNamesString[] =
 		check_backtrace_functions, assign_backtrace_functions, NULL
 	},
 
+	{
+		{"result_format_auto_binary_types", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Which data types to send in binary for format -1 in the extended query protocol."),
+			NULL,
+			GUC_LIST_INPUT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&result_format_auto_binary_types,
+		"",
+		check_result_format_auto_binary_types, assign_result_format_auto_binary_types, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 437642cc72..bceca24d2d 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -15,10 +15,12 @@
 #define PQUERY_H
 
 #include "nodes/parsenodes.h"
+#include "utils/guc.h"
 #include "utils/portal.h"
 
 
 extern PGDLLIMPORT Portal ActivePortal;
+extern char *result_format_auto_binary_types;
 
 
 extern PortalStrategy ChoosePortalStrategy(List *stmts);
@@ -30,6 +32,9 @@ extern List *FetchStatementTargetList(Node *stmt);
 extern void PortalStart(Portal portal, ParamListInfo params,
 						int eflags, Snapshot snapshot);
 
+extern bool check_result_format_auto_binary_types(char **newval, void **extra, GucSource source);
+extern void assign_result_format_auto_binary_types(const char *newval, void *extra);
+
 extern void PortalSetResultFormat(Portal portal, int nFormats,
 								  int16 *formats);
 

base-commit: a7e65dc88b6f088fc2fcf5a660d866de644b1300
-- 
2.29.2

#10David Steele
david@pgmasters.net
In reply to: Peter Eisentraut (#9)
Re: default result formats setting

On 11/25/20 2:06 AM, Peter Eisentraut wrote:

On 2020-11-16 16:15, Andrew Dunstan wrote:

I think this is conceptually OK, although it feels a bit odd.

Might it be better to have the values as typename={binary,text} pairs
instead of oid={0,1} pairs, which are fairly opaque? That might make
things easier for things like UDTs where the oid might not be known or
constant.

Yes, type names would be better.  I was hesitant because of all the
parsing work involved, but I bit the bullet and did it in the new patch.

To simplify the format, I changed the parameter so it's just a list of
types that you want in binary, rather than type=value pairs.  If we ever
want to add another format, we would revisit this, but it seems unlikely
in the near future.

Also, I have changed the naming of the parameter since this is no longer
the "default" but something you choose explicitly.  I'm thinking in the
direction of "auto" mode for the naming.  Obviously, the name is easy to
tweak in any case.

Andrew, Tom, does the latest patch address your concerns?

Regards,
--
-David
david@pgmasters.net

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Steele (#10)
Re: default result formats setting

David Steele <david@pgmasters.net> writes:

Andrew, Tom, does the latest patch address your concerns?

[ reads patch quickly... ] I think the definition is fine now,
modulo possible bikeshedding on the GUC name. (I have no
great suggestion on that right now, but the current proposal
seems mighty verbose.)

The implementation feels weird though, mainly in that I don't like
Peter's choices for where to put the code. pquery.c is not where
I would have expected to find the support for this, and I do not
have any confidence that applying the format conversion while
filling portal->formats[] is enough to cover all cases. I'd have
thought that access/common/printtup.c or somewhere near there
would be where to do it.

Likewise, the code associated with caching the results of the type
OID lookups seems like it should be someplace where you'd be more
likely to find (a) type name lookup and (b) caching logic. I'm
not quite sure about the best place for that, but we could do
worse than put it in parse_type.c. (As I recall, the parser
already has some caching related to operator lookup, so doing
part (b) there isn't too much of a stretch.)

Also, if we need YA string-splitting function, please put it
beside the ones that already exist (SplitIdentifierString etc in
varlena.c). That way (a) it's available if some other code needs
it, and (b) when somebody gets around to refactoring all the
splitters, they won't have to dig into nooks and crannies to find
them.

Having said that, I wonder if we should define the parameter's
contents this way, i.e. as things that parseTypeString will
accept. At best that's overspecification, e.g. should people
expect that varchar(7) and varchar(9) are different things
(and, perhaps, that such entries *don't* match varchars of other
lengths?) I think a case could be made for requiring the entries
to be identifiers exactly matching pg_type.typname, possibly with
schema qualification. This'd allow tighter verification of the
GUC value's format in the GUC check hook.

Or we could drop all of that and go back to having it be a list
of type OIDs, which would remove a *whole lot* of the complexity,
and I'm not sure that it's materially less friendly. Applications
have had to deal with type OIDs in the protocol since forever.

BTW, I wonder whether we still need to restrict the GUC to not
be settable from postgresql.conf. The fact that the client has
to explicitly pass -1 seems to reduce any security issues quite
a bit.

regards, tom lane

#12Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#11)
1 attachment(s)
Re: default result formats setting

On 09.03.21 19:04, Tom Lane wrote:

The implementation feels weird though, mainly in that I don't like
Peter's choices for where to put the code. pquery.c is not where
I would have expected to find the support for this, and I do not
have any confidence that applying the format conversion while
filling portal->formats[] is enough to cover all cases. I'd have
thought that access/common/printtup.c or somewhere near there
would be where to do it.

done

Or we could drop all of that and go back to having it be a list
of type OIDs, which would remove a *whole lot* of the complexity,
and I'm not sure that it's materially less friendly. Applications
have had to deal with type OIDs in the protocol since forever.

also done

The client driver needs to be able to interpret the OIDs that the
RowDescription sends back, so it really needs to be able to deal in
OIDs, and having the option to specify type names won't help it right now.

BTW, I wonder whether we still need to restrict the GUC to not
be settable from postgresql.conf. The fact that the client has
to explicitly pass -1 seems to reduce any security issues quite
a bit.

There was no security concern, but I don't think it's useful. The
driver would specify "send int4 in binary, I know how to handle that".
There doesn't seem to be a point in specifying that sort of thing globally.

Attachments:

v4-0001-Add-result_format_auto_binary_types-setting.patchtext/plain; charset=UTF-8; name=v4-0001-Add-result_format_auto_binary_types-setting.patch; x-mac-creator=0; x-mac-type=0Download
From 3410ac02bc34b60b79c99ad96505dabe4362570f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 18 Mar 2021 21:07:42 +0100
Subject: [PATCH v4] Add result_format_auto_binary_types setting

The current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses.  Some clients, for
example the JDBC driver, have built-in support for handling specific
data types in binary.  Such a client would always have to request a
result row description (Describe statement) before sending a Bind
message, in order to be able to pick out the result columns it should
request in binary.  The feedback was that this extra round trip is
often not worth it in terms of performance, and so it is not done and
binary format is not used when it could be.

The solution is to allow a client to register for a session which
types it wants to always get in binary.  This is done by a new GUC
setting.  For example, to get int2, int4, int8 in binary by default,
you could set

    SET result_format_auto_binary_types = 21,23,20;

To request result formats based on this setting, send format code
-1 (instead of 0 or 1) in the Bind message.

Discussion: https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
---
 doc/src/sgml/config.sgml                      | 40 ++++++++
 doc/src/sgml/libpq.sgml                       | 10 +-
 doc/src/sgml/protocol.sgml                    |  7 +-
 src/backend/access/common/printtup.c          | 94 +++++++++++++++++++
 src/backend/utils/misc/guc.c                  | 12 +++
 src/include/access/printtup.h                 |  5 +
 src/test/modules/Makefile                     |  1 +
 src/test/modules/libpq_extended/.gitignore    |  6 ++
 src/test/modules/libpq_extended/Makefile      | 20 ++++
 src/test/modules/libpq_extended/README        |  1 +
 .../libpq_extended/t/001_result_format.pl     | 33 +++++++
 11 files changed, 222 insertions(+), 7 deletions(-)
 create mode 100644 src/test/modules/libpq_extended/.gitignore
 create mode 100644 src/test/modules/libpq_extended/Makefile
 create mode 100644 src/test/modules/libpq_extended/README
 create mode 100644 src/test/modules/libpq_extended/t/001_result_format.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8603cf3f94..0fed8abfd2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8671,6 +8671,46 @@ <title>Statement Behavior</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-result-format-auto-binary-types" xreflabel="result_format_auto_binary_types">
+      <term><varname>result_format_auto_binary_types</varname> (<type>string</type>)
+      <indexterm>
+       <primary><varname>result_format_auto_binary_types</varname></primary>
+       <secondary>configuration parameter</secondary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        This parameter specifies which data types are to be sent from the
+        server in binary format for rows returned in the extended query
+        protocol when result format code -1 is specified in the <link
+        linkend="protocol-message-Bind">Bind message</link>.  It is intended
+        to be used by client libraries that prefer to handle specific, but not
+        all, data types in binary format.  The typical usage would be that the
+        client library sets this value when it starts a connection.  (A client
+        library that wants to handle <emphasis>all</emphasis> types in binary
+        doesn't need to use this because it can just specify the format code
+        for all types at once in the protocol message.)
+       </para>
+
+       <para>
+        The value is a comma-separated list of type OIDs.  For example, if you
+        want to automatically get values of the types <type>int2</type>,
+        <type>int4</type>, and <type>int8</type> in binary while leaving the
+        rest in text, an appropriate setting would be
+        <literal>21,23,20</literal>.  A non-existing type OIDs are ignored.
+        This allows using the same value for different databases that might
+        have different extensions installed.
+       </para>
+
+       <para>
+        This setting applies only to result rows from the extended query
+        protocol, so it does not affect usage of
+        <application>psql</application> or <application>pg_dump</application>
+        for example.  Also, it does not affect the format of query parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index be674fbaa9..ef55f09487 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2834,10 +2834,12 @@ <title>Main Functions</title>
           <term><parameter>resultFormat</parameter></term>
           <listitem>
            <para>
-            Specify zero to obtain results in text format, or one to obtain
-            results in binary format.  (There is not currently a provision
-            to obtain different result columns in different formats,
-            although that is possible in the underlying protocol.)
+            Specify 0 to obtain results in text format, or 1 to obtain results
+            in binary format, or -1 to have the setting of <xref
+            linkend="guc-result-format-auto-binary-types"/> be applied by the
+            server.  (There is not currently a provision to obtain different
+            result columns in different formats, although that is possible in
+            the underlying protocol.)
            </para>
           </listitem>
          </varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 43092fe62a..e3a614e80e 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -3587,7 +3587,7 @@ <title>Message Formats</title>
 </varlistentry>
 
 
-<varlistentry>
+<varlistentry id="protocol-message-Bind">
 <term>
 Bind (F)
 </term>
@@ -3729,8 +3729,9 @@ <title>Message Formats</title>
 </term>
 <listitem>
 <para>
-                The result-column format codes.  Each must presently be
-                zero (text) or one (binary).
+                The result-column format codes.  Each must be 0 for text, or 1
+                for binary, or -1 to apply the setting of <xref
+                linkend="guc-result-format-auto-binary-types"/>.
 </para>
 </listitem>
 </varlistentry>
diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index 54b539f6fb..0754ec7e9b 100644
--- a/src/backend/access/common/printtup.c
+++ b/src/backend/access/common/printtup.c
@@ -22,6 +22,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
+#include "utils/varlena.h"
 
 
 static void printtup_startup(DestReceiver *self, int operation,
@@ -30,6 +31,13 @@ static bool printtup(TupleTableSlot *slot, DestReceiver *self);
 static void printtup_shutdown(DestReceiver *self);
 static void printtup_destroy(DestReceiver *self);
 
+/* GUC variable */
+char *result_format_auto_binary_types;
+
+/* internal parsed representation */
+static List *result_format_auto_binary_types_internal = NIL;	/* type OID list */
+
+
 /* ----------------------------------------------------------------
  *		printtup / debugtup support
  * ----------------------------------------------------------------
@@ -231,6 +239,9 @@ SendRowDescriptionMessage(StringInfo buf, TupleDesc typeinfo,
 		else
 			format = 0;
 
+		if (format == -1)
+			format = list_member_oid(result_format_auto_binary_types_internal, atttypid) ? 1 : 0;
+
 		pq_writestring(buf, NameStr(att->attname));
 		pq_writeint32(buf, resorigtbl);
 		pq_writeint16(buf, resorigcol);
@@ -271,6 +282,9 @@ printtup_prepare_info(DR_printtup *myState, TupleDesc typeinfo, int numAttrs)
 		int16		format = (formats ? formats[i] : 0);
 		Form_pg_attribute attr = TupleDescAttr(typeinfo, i);
 
+		if (format == -1)
+			format = list_member_oid(result_format_auto_binary_types_internal, attr->atttypid) ? 1 : 0;
+
 		thisState->format = format;
 		if (format == 0)
 		{
@@ -483,3 +497,83 @@ debugtup(TupleTableSlot *slot, DestReceiver *self)
 
 	return true;
 }
+
+
+/*
+ * Support for result_format_auto_binary_types setting
+ */
+
+bool
+check_result_format_auto_binary_types(char **newval, void **extra, GucSource source)
+{
+	char       *rawstring;
+	List       *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(*newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		GUC_check_errdetail("List syntax is invalid.");
+		pfree(rawstring);
+		list_free(elemlist);
+		return false;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+
+		if (atooid(str) == 0)
+		{
+			GUC_check_errdetail("Invalid list entry: %s", str);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+
+	return true;
+}
+
+void
+assign_result_format_auto_binary_types(const char *newval, void *extra)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		pfree(rawstring);
+		list_free(elemlist);
+		return;
+	}
+
+	list_free(result_format_auto_binary_types_internal);
+	result_format_auto_binary_types_internal = NIL;
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		MemoryContext oldcontext;
+
+		if ((oid = atooid(str)) == 0)
+		{
+			pfree(rawstring);
+			list_free(elemlist);
+			return;
+		}
+
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+		result_format_auto_binary_types_internal = list_append_unique_oid(result_format_auto_binary_types_internal, oid);
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index b263e3493b..4d5447b09a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -31,6 +31,7 @@
 
 #include "access/commit_ts.h"
 #include "access/gin.h"
+#include "access/printtup.h"
 #include "access/rmgr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -4514,6 +4515,17 @@ static struct config_string ConfigureNamesString[] =
 		check_backtrace_functions, assign_backtrace_functions, NULL
 	},
 
+	{
+		{"result_format_auto_binary_types", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Which data types to send in binary for format -1 in the extended query protocol."),
+			NULL,
+			GUC_LIST_INPUT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&result_format_auto_binary_types,
+		"",
+		check_result_format_auto_binary_types, assign_result_format_auto_binary_types, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/access/printtup.h b/src/include/access/printtup.h
index c9b37538a0..9ceaf4cd82 100644
--- a/src/include/access/printtup.h
+++ b/src/include/access/printtup.h
@@ -14,6 +14,7 @@
 #ifndef PRINTTUP_H
 #define PRINTTUP_H
 
+#include "utils/guc.h"
 #include "utils/portal.h"
 
 extern DestReceiver *printtup_create_DR(CommandDest dest);
@@ -27,6 +28,10 @@ extern void debugStartup(DestReceiver *self, int operation,
 						 TupleDesc typeinfo);
 extern bool debugtup(TupleTableSlot *slot, DestReceiver *self);
 
+extern char *result_format_auto_binary_types;
+extern bool check_result_format_auto_binary_types(char **newval, void **extra, GucSource source);
+extern void assign_result_format_auto_binary_types(const char *newval, void *extra);
+
 /* XXX these are really in executor/spi.c */
 extern void spi_dest_startup(DestReceiver *self, int operation,
 							 TupleDesc typeinfo);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 93e7829c67..65dcbd2a09 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -10,6 +10,7 @@ SUBDIRS = \
 		  delay_execution \
 		  dummy_index_am \
 		  dummy_seclabel \
+		  libpq_extended \
 		  libpq_pipeline \
 		  plsample \
 		  snapshot_too_old \
diff --git a/src/test/modules/libpq_extended/.gitignore b/src/test/modules/libpq_extended/.gitignore
new file mode 100644
index 0000000000..6fb386d93b
--- /dev/null
+++ b/src/test/modules/libpq_extended/.gitignore
@@ -0,0 +1,6 @@
+/test-*
+
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/libpq_extended/Makefile b/src/test/modules/libpq_extended/Makefile
new file mode 100644
index 0000000000..eb6f308206
--- /dev/null
+++ b/src/test/modules/libpq_extended/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/libpq_extended/Makefile
+
+PROGRAM = test-result-format
+OBJS = test-result-format.o
+
+PG_CPPFLAGS = -I$(libpq_srcdir)
+PG_LIBS_INTERNAL += $(libpq_pgport)
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/libpq_pipeline
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/libpq_extended/README b/src/test/modules/libpq_extended/README
new file mode 100644
index 0000000000..842c02d4f0
--- /dev/null
+++ b/src/test/modules/libpq_extended/README
@@ -0,0 +1 @@
+Tests for libpq extended query protocol support
diff --git a/src/test/modules/libpq_extended/t/001_result_format.pl b/src/test/modules/libpq_extended/t/001_result_format.pl
new file mode 100644
index 0000000000..0ceea44bcc
--- /dev/null
+++ b/src/test/modules/libpq_extended/t/001_result_format.pl
@@ -0,0 +1,33 @@
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+use Cwd;
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+
+$ENV{PATH} = "$ENV{PATH}:" . getcwd();
+
+# int2,int4,int8
+$ENV{PGOPTIONS} = '-c result_format_auto_binary_types=21,23,20,20000';
+$node->command_like([
+	'test-result-format',
+	$node->connstr('postgres'),
+	"SELECT 1::int4, 2::int8, 'abc'::text",
+	],
+	qr/0->1 1->1 2->0/,
+	"binary format is applied, nonexisting OID ignored");
+
+$ENV{PGOPTIONS} = '-c result_format_auto_binary_types=foo,bar';
+$node->command_fails([
+	'test-result-format',
+	$node->connstr('postgres'),
+	"SELECT 1::int4, 2::int8, 'abc'::text",
+	],
+	"invalid setting is an error");
+
+$node->stop('fast');

base-commit: ed62d3737c1b823f796d974060b1d0295a3dd831
-- 
2.30.2

#13Emre Hasegeli
emre@hasegeli.com
In reply to: Peter Eisentraut (#12)
Re: default result formats setting

I applied the patch, tried running the test and got the following:

rm -rf '/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended'/tmp_check
/bin/sh ../../../../config/install-sh -c -d '/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended'/tmp_check
cd . && TESTDIR='/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended' PATH="/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/bin:$PATH" DYLD_LIBRARY_PATH="/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/lib" PGPORT='65432' PG_REGRESS='/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended/../../../../src/test/regress/pg_regress' REGRESS_SHLIB='/Users/hasegeli/Developer/postgres/src/test/regress/regress.so' /usr/bin/prove -I ../../../../src/test/perl/ -I . t/*.pl
t/001_result_format.pl .. # Looks like your test exited with 2 before it could output anything.
t/001_result_format.pl .. Dubious, test returned 2 (wstat 512, 0x200)
Failed 4/4 subtests

Test Summary Report
-------------------
t/001_result_format.pl (Wstat: 512 Tests: 0 Failed: 0)
Non-zero exit status: 2
Parse errors: Bad plan. You planned 4 tests but ran 0.

#14Peter Eisentraut
peter@eisentraut.org
In reply to: Emre Hasegeli (#13)
Re: default result formats setting

On 19.03.21 15:55, Emre Hasegeli wrote:

I applied the patch, tried running the test and got the following:

rm -rf '/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended'/tmp_check
/bin/sh ../../../../config/install-sh -c -d '/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended'/tmp_check
cd . && TESTDIR='/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended' PATH="/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/bin:$PATH" DYLD_LIBRARY_PATH="/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/lib" PGPORT='65432' PG_REGRESS='/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended/../../../../src/test/regress/pg_regress' REGRESS_SHLIB='/Users/hasegeli/Developer/postgres/src/test/regress/regress.so' /usr/bin/prove -I ../../../../src/test/perl/ -I . t/*.pl
t/001_result_format.pl .. # Looks like your test exited with 2 before it could output anything.
t/001_result_format.pl .. Dubious, test returned 2 (wstat 512, 0x200)
Failed 4/4 subtests

Test Summary Report
-------------------
t/001_result_format.pl (Wstat: 512 Tests: 0 Failed: 0)
Non-zero exit status: 2
Parse errors: Bad plan. You planned 4 tests but ran 0.

Could you look into the log files in that test directory what is going
on? The test setup is closely modeled after
src/test/modules/libpq_pipeline/. Does that one run ok?

#15Emre Hasegeli
emre@hasegeli.com
In reply to: Peter Eisentraut (#14)
Re: default result formats setting

Could you look into the log files in that test directory what is going
on?

Command 'test-result-format' not found in
/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/bin,
/Users/hasegeli/.local/bin, /opt/homebrew/bin, /usr/local/bin,
/usr/bin, /bin, /usr/sbin, /sbin,
/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended at
/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended/../../../../src/test/perl/TestLib.pm
line 818.

Maybe you forgot to commit the file in the test?

The test setup is closely modeled after
src/test/modules/libpq_pipeline/. Does that one run ok?

Yes

#16Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Emre Hasegeli (#15)
1 attachment(s)
Re: default result formats setting

On 21.03.21 20:18, Emre Hasegeli wrote:

Could you look into the log files in that test directory what is going
on?

Command 'test-result-format' not found in
/Users/hasegeli/Developer/postgres/tmp_install/Users/hasegeli/.local/pgsql/bin,
/Users/hasegeli/.local/bin, /opt/homebrew/bin, /usr/local/bin,
/usr/bin, /bin, /usr/sbin, /sbin,
/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended at
/Users/hasegeli/Developer/postgres/src/test/modules/libpq_extended/../../../../src/test/perl/TestLib.pm
line 818.

Maybe you forgot to commit the file in the test?

Indeed. Here is an updated patch.

Attachments:

v5-0001-Add-result_format_auto_binary_types-setting.patchtext/plain; charset=UTF-8; name=v5-0001-Add-result_format_auto_binary_types-setting.patch; x-mac-creator=0; x-mac-type=0Download
From 61796d1077bd146daa6952dae799819539c05a96 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 22 Mar 2021 13:36:35 +0100
Subject: [PATCH v5] Add result_format_auto_binary_types setting

The current way binary results are requested in the extended query
protocol is too cumbersome for some practical uses.  Some clients, for
example the JDBC driver, have built-in support for handling specific
data types in binary.  Such a client would always have to request a
result row description (Describe statement) before sending a Bind
message, in order to be able to pick out the result columns it should
request in binary.  The feedback was that this extra round trip is
often not worth it in terms of performance, and so it is not done and
binary format is not used when it could be.

The solution is to allow a client to register for a session which
types it wants to always get in binary.  This is done by a new GUC
setting.  For example, to get int2, int4, int8 in binary by default,
you could set

    SET result_format_auto_binary_types = 21,23,20;

To request result formats based on this setting, send format code
-1 (instead of 0 or 1) in the Bind message.

Discussion: https://www.postgresql.org/message-id/flat/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com
---
 doc/src/sgml/config.sgml                      | 40 ++++++++
 doc/src/sgml/libpq.sgml                       | 10 +-
 doc/src/sgml/protocol.sgml                    |  7 +-
 src/backend/access/common/printtup.c          | 94 +++++++++++++++++++
 src/backend/utils/misc/guc.c                  | 12 +++
 src/include/access/printtup.h                 |  5 +
 src/test/modules/Makefile                     |  1 +
 src/test/modules/libpq_extended/.gitignore    |  6 ++
 src/test/modules/libpq_extended/Makefile      | 20 ++++
 src/test/modules/libpq_extended/README        |  1 +
 .../libpq_extended/t/001_result_format.pl     | 33 +++++++
 .../libpq_extended/test-result-format.c       | 39 ++++++++
 12 files changed, 261 insertions(+), 7 deletions(-)
 create mode 100644 src/test/modules/libpq_extended/.gitignore
 create mode 100644 src/test/modules/libpq_extended/Makefile
 create mode 100644 src/test/modules/libpq_extended/README
 create mode 100644 src/test/modules/libpq_extended/t/001_result_format.pl
 create mode 100644 src/test/modules/libpq_extended/test-result-format.c

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ee4925d6d9..0c81a96170 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8671,6 +8671,46 @@ <title>Statement Behavior</title>
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-result-format-auto-binary-types" xreflabel="result_format_auto_binary_types">
+      <term><varname>result_format_auto_binary_types</varname> (<type>string</type>)
+      <indexterm>
+       <primary><varname>result_format_auto_binary_types</varname></primary>
+       <secondary>configuration parameter</secondary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        This parameter specifies which data types are to be sent from the
+        server in binary format for rows returned in the extended query
+        protocol when result format code -1 is specified in the <link
+        linkend="protocol-message-Bind">Bind message</link>.  It is intended
+        to be used by client libraries that prefer to handle specific, but not
+        all, data types in binary format.  The typical usage would be that the
+        client library sets this value when it starts a connection.  (A client
+        library that wants to handle <emphasis>all</emphasis> types in binary
+        doesn't need to use this because it can just specify the format code
+        for all types at once in the protocol message.)
+       </para>
+
+       <para>
+        The value is a comma-separated list of type OIDs.  For example, if you
+        want to automatically get values of the types <type>int2</type>,
+        <type>int4</type>, and <type>int8</type> in binary while leaving the
+        rest in text, an appropriate setting would be
+        <literal>21,23,20</literal>.  A non-existing type OIDs are ignored.
+        This allows using the same value for different databases that might
+        have different extensions installed.
+       </para>
+
+       <para>
+        This setting applies only to result rows from the extended query
+        protocol, so it does not affect usage of
+        <application>psql</application> or <application>pg_dump</application>
+        for example.  Also, it does not affect the format of query parameters.
+       </para>
+      </listitem>
+     </varlistentry>
+
      </variablelist>
     </sect2>
      <sect2 id="runtime-config-client-format">
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index be674fbaa9..ef55f09487 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2834,10 +2834,12 @@ <title>Main Functions</title>
           <term><parameter>resultFormat</parameter></term>
           <listitem>
            <para>
-            Specify zero to obtain results in text format, or one to obtain
-            results in binary format.  (There is not currently a provision
-            to obtain different result columns in different formats,
-            although that is possible in the underlying protocol.)
+            Specify 0 to obtain results in text format, or 1 to obtain results
+            in binary format, or -1 to have the setting of <xref
+            linkend="guc-result-format-auto-binary-types"/> be applied by the
+            server.  (There is not currently a provision to obtain different
+            result columns in different formats, although that is possible in
+            the underlying protocol.)
            </para>
           </listitem>
          </varlistentry>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 43092fe62a..e3a614e80e 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -3587,7 +3587,7 @@ <title>Message Formats</title>
 </varlistentry>
 
 
-<varlistentry>
+<varlistentry id="protocol-message-Bind">
 <term>
 Bind (F)
 </term>
@@ -3729,8 +3729,9 @@ <title>Message Formats</title>
 </term>
 <listitem>
 <para>
-                The result-column format codes.  Each must presently be
-                zero (text) or one (binary).
+                The result-column format codes.  Each must be 0 for text, or 1
+                for binary, or -1 to apply the setting of <xref
+                linkend="guc-result-format-auto-binary-types"/>.
 </para>
 </listitem>
 </varlistentry>
diff --git a/src/backend/access/common/printtup.c b/src/backend/access/common/printtup.c
index 54b539f6fb..0754ec7e9b 100644
--- a/src/backend/access/common/printtup.c
+++ b/src/backend/access/common/printtup.c
@@ -22,6 +22,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memdebug.h"
 #include "utils/memutils.h"
+#include "utils/varlena.h"
 
 
 static void printtup_startup(DestReceiver *self, int operation,
@@ -30,6 +31,13 @@ static bool printtup(TupleTableSlot *slot, DestReceiver *self);
 static void printtup_shutdown(DestReceiver *self);
 static void printtup_destroy(DestReceiver *self);
 
+/* GUC variable */
+char *result_format_auto_binary_types;
+
+/* internal parsed representation */
+static List *result_format_auto_binary_types_internal = NIL;	/* type OID list */
+
+
 /* ----------------------------------------------------------------
  *		printtup / debugtup support
  * ----------------------------------------------------------------
@@ -231,6 +239,9 @@ SendRowDescriptionMessage(StringInfo buf, TupleDesc typeinfo,
 		else
 			format = 0;
 
+		if (format == -1)
+			format = list_member_oid(result_format_auto_binary_types_internal, atttypid) ? 1 : 0;
+
 		pq_writestring(buf, NameStr(att->attname));
 		pq_writeint32(buf, resorigtbl);
 		pq_writeint16(buf, resorigcol);
@@ -271,6 +282,9 @@ printtup_prepare_info(DR_printtup *myState, TupleDesc typeinfo, int numAttrs)
 		int16		format = (formats ? formats[i] : 0);
 		Form_pg_attribute attr = TupleDescAttr(typeinfo, i);
 
+		if (format == -1)
+			format = list_member_oid(result_format_auto_binary_types_internal, attr->atttypid) ? 1 : 0;
+
 		thisState->format = format;
 		if (format == 0)
 		{
@@ -483,3 +497,83 @@ debugtup(TupleTableSlot *slot, DestReceiver *self)
 
 	return true;
 }
+
+
+/*
+ * Support for result_format_auto_binary_types setting
+ */
+
+bool
+check_result_format_auto_binary_types(char **newval, void **extra, GucSource source)
+{
+	char       *rawstring;
+	List       *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(*newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		GUC_check_errdetail("List syntax is invalid.");
+		pfree(rawstring);
+		list_free(elemlist);
+		return false;
+	}
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+
+		if (atooid(str) == 0)
+		{
+			GUC_check_errdetail("Invalid list entry: %s", str);
+			pfree(rawstring);
+			list_free(elemlist);
+			return false;
+		}
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+
+	return true;
+}
+
+void
+assign_result_format_auto_binary_types(const char *newval, void *extra)
+{
+	char	   *rawstring;
+	List	   *elemlist;
+	ListCell   *lc;
+
+	rawstring = pstrdup(newval);
+	if (!SplitGUCList(rawstring, ',', &elemlist))
+	{
+		pfree(rawstring);
+		list_free(elemlist);
+		return;
+	}
+
+	list_free(result_format_auto_binary_types_internal);
+	result_format_auto_binary_types_internal = NIL;
+
+	foreach(lc, elemlist)
+	{
+		char	   *str = lfirst(lc);
+		Oid			oid;
+		MemoryContext oldcontext;
+
+		if ((oid = atooid(str)) == 0)
+		{
+			pfree(rawstring);
+			list_free(elemlist);
+			return;
+		}
+
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+		result_format_auto_binary_types_internal = list_append_unique_oid(result_format_auto_binary_types_internal, oid);
+		MemoryContextSwitchTo(oldcontext);
+	}
+
+	pfree(rawstring);
+	list_free(elemlist);
+}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3b36a31a47..a1fb787782 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -31,6 +31,7 @@
 
 #include "access/commit_ts.h"
 #include "access/gin.h"
+#include "access/printtup.h"
 #include "access/rmgr.h"
 #include "access/tableam.h"
 #include "access/toast_compression.h"
@@ -4543,6 +4544,17 @@ static struct config_string ConfigureNamesString[] =
 		check_backtrace_functions, assign_backtrace_functions, NULL
 	},
 
+	{
+		{"result_format_auto_binary_types", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Which data types to send in binary for format -1 in the extended query protocol."),
+			NULL,
+			GUC_LIST_INPUT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&result_format_auto_binary_types,
+		"",
+		check_result_format_auto_binary_types, assign_result_format_auto_binary_types, NULL
+	},
+
 	/* End-of-list marker */
 	{
 		{NULL, 0, 0, NULL, NULL}, NULL, NULL, NULL, NULL, NULL
diff --git a/src/include/access/printtup.h b/src/include/access/printtup.h
index c9b37538a0..9ceaf4cd82 100644
--- a/src/include/access/printtup.h
+++ b/src/include/access/printtup.h
@@ -14,6 +14,7 @@
 #ifndef PRINTTUP_H
 #define PRINTTUP_H
 
+#include "utils/guc.h"
 #include "utils/portal.h"
 
 extern DestReceiver *printtup_create_DR(CommandDest dest);
@@ -27,6 +28,10 @@ extern void debugStartup(DestReceiver *self, int operation,
 						 TupleDesc typeinfo);
 extern bool debugtup(TupleTableSlot *slot, DestReceiver *self);
 
+extern char *result_format_auto_binary_types;
+extern bool check_result_format_auto_binary_types(char **newval, void **extra, GucSource source);
+extern void assign_result_format_auto_binary_types(const char *newval, void *extra);
+
 /* XXX these are really in executor/spi.c */
 extern void spi_dest_startup(DestReceiver *self, int operation,
 							 TupleDesc typeinfo);
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 93e7829c67..65dcbd2a09 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -10,6 +10,7 @@ SUBDIRS = \
 		  delay_execution \
 		  dummy_index_am \
 		  dummy_seclabel \
+		  libpq_extended \
 		  libpq_pipeline \
 		  plsample \
 		  snapshot_too_old \
diff --git a/src/test/modules/libpq_extended/.gitignore b/src/test/modules/libpq_extended/.gitignore
new file mode 100644
index 0000000000..e3ad34260f
--- /dev/null
+++ b/src/test/modules/libpq_extended/.gitignore
@@ -0,0 +1,6 @@
+/test-result-format
+
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/libpq_extended/Makefile b/src/test/modules/libpq_extended/Makefile
new file mode 100644
index 0000000000..eb6f308206
--- /dev/null
+++ b/src/test/modules/libpq_extended/Makefile
@@ -0,0 +1,20 @@
+# src/test/modules/libpq_extended/Makefile
+
+PROGRAM = test-result-format
+OBJS = test-result-format.o
+
+PG_CPPFLAGS = -I$(libpq_srcdir)
+PG_LIBS_INTERNAL += $(libpq_pgport)
+
+TAP_TESTS = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/libpq_pipeline
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/libpq_extended/README b/src/test/modules/libpq_extended/README
new file mode 100644
index 0000000000..842c02d4f0
--- /dev/null
+++ b/src/test/modules/libpq_extended/README
@@ -0,0 +1 @@
+Tests for libpq extended query protocol support
diff --git a/src/test/modules/libpq_extended/t/001_result_format.pl b/src/test/modules/libpq_extended/t/001_result_format.pl
new file mode 100644
index 0000000000..0ceea44bcc
--- /dev/null
+++ b/src/test/modules/libpq_extended/t/001_result_format.pl
@@ -0,0 +1,33 @@
+use strict;
+use warnings;
+
+use PostgresNode;
+use TestLib;
+use Test::More tests => 4;
+use Cwd;
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+
+$ENV{PATH} = "$ENV{PATH}:" . getcwd();
+
+# int2,int4,int8
+$ENV{PGOPTIONS} = '-c result_format_auto_binary_types=21,23,20,20000';
+$node->command_like([
+	'test-result-format',
+	$node->connstr('postgres'),
+	"SELECT 1::int4, 2::int8, 'abc'::text",
+	],
+	qr/0->1 1->1 2->0/,
+	"binary format is applied, nonexisting OID ignored");
+
+$ENV{PGOPTIONS} = '-c result_format_auto_binary_types=foo,bar';
+$node->command_fails([
+	'test-result-format',
+	$node->connstr('postgres'),
+	"SELECT 1::int4, 2::int8, 'abc'::text",
+	],
+	"invalid setting is an error");
+
+$node->stop('fast');
diff --git a/src/test/modules/libpq_extended/test-result-format.c b/src/test/modules/libpq_extended/test-result-format.c
new file mode 100644
index 0000000000..c3a2209fa6
--- /dev/null
+++ b/src/test/modules/libpq_extended/test-result-format.c
@@ -0,0 +1,39 @@
+#include "postgres_fe.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include "libpq-fe.h"
+
+int
+main(int argc, char *argv[])
+{
+	PGconn	   *conn;
+	PGresult   *res;
+
+	conn = PQconnectdb("");
+
+	if (PQstatus(conn) != CONNECTION_OK)
+	{
+		fprintf(stderr, "Connection to database failed: %s",
+				PQerrorMessage(conn));
+		exit(1);
+	}
+
+	res = PQexecParams(conn, "SELECT 1::int4, 2::int8, 'abc'::text",
+					   0, NULL, NULL, NULL, NULL,
+					   -1);
+
+	if (PQresultStatus(res) != PGRES_TUPLES_OK)
+	{
+		fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
+		exit(1);
+	}
+
+	for (int i = 0; i < PQnfields(res); i++)
+	{
+		printf("%d->%d ", i, PQfformat(res, i));
+	}
+
+	PQfinish(conn);
+	return 0;
+}

base-commit: 909b449e00fc2f71e1a38569bbddbb6457d28485
-- 
2.30.2

#17Emre Hasegeli
emre@hasegeli.com
In reply to: Peter Eisentraut (#16)
Re: default result formats setting

I think this is a good feature that would be useful to JDBC and more.

I don't know the surrounding code very well, but the patch looks good to me.

I agree with Tom Lane that the name of the variable is too verbose.
Maybe "auto_binary_types" is enough. Do we gain much by prefixing
"result_format_"? Wouldn't we use the same variable, if we support
binary inputs one day?

It is nice that the patch comes with the test module. The name
"libpq_extended" sounds a bit vague to me. Maybe it's a better idea
to call it "libpq_result_format" and test "format=1" in it as well.

My last nitpicking about the names is the "test-result-format"
command. All the rest of the test modules name the commands with
underscores. It would be nicer if this one complies.

There is one place that needs to be updated on the Makefile of the test:

+subdir = src/test/modules/libpq_pipeline

s/pipeline/extended/

Then the test runs successfully.

#18Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#4)
Re: default result formats setting

On Thu, Nov 5, 2020 at 3:49 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:

We could also make it a protocol message, but it would essentially
implement the same thing, just again separately. And then you'd have no
support to inspect the current setting, test out different settings
interactively, etc. That seems pretty wasteful and complicated for no
real gain.

But ... if it's just a GUC, it can be set by code on the server side
that the client knows nothing about, breaking the client. That seems
pretty bad to me.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#18)
Re: default result formats setting

Robert Haas <robertmhaas@gmail.com> writes:

But ... if it's just a GUC, it can be set by code on the server side
that the client knows nothing about, breaking the client. That seems
pretty bad to me.

It's impossible for the proposed patch to break *existing* clients,
because they all send requested format 0 or 1, and that is exactly
what they'll get back.

A client that is sending -1 and assuming that it will get back
a particular format could get broken if the GUC doesn't have the
value it thinks, true. But I'd argue that such code is unreasonably
non-robust. Can't we solve this by recommending that clients using
this feature always double-check which format they actually got?
ISTM that the use-cases for the feature involve checking what data
type you got anyway, so that's not an unreasonable added requirement.

regards, tom lane

#20Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#19)
Re: default result formats setting

On Wed, Mar 24, 2021 at 10:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

But ... if it's just a GUC, it can be set by code on the server side
that the client knows nothing about, breaking the client. That seems
pretty bad to me.

It's impossible for the proposed patch to break *existing* clients,
because they all send requested format 0 or 1, and that is exactly
what they'll get back.

OK.

A client that is sending -1 and assuming that it will get back
a particular format could get broken if the GUC doesn't have the
value it thinks, true. But I'd argue that such code is unreasonably
non-robust. Can't we solve this by recommending that clients using
this feature always double-check which format they actually got?
ISTM that the use-cases for the feature involve checking what data
type you got anyway, so that's not an unreasonable added requirement.

I suppose that's a fair idea, but to me it still feels a bit like a
round peg in the square hole. Suppose for example that there's a
client application which wants to talk to a connection pooler which in
turn wants to talk to the server. Let's also suppose that connection
pooler isn't just a pass-through, but wants to redirect client
connections to various servers or even intercept queries and result
sets and make changes as the data passes by. It can do that by parsing
SQL and solving the halting problem, whereas if this were a
protocol-level option it would be completely doable. Now you could say
"well, by that argument, DateStyle ought to be a protocol-level
option, too," and that's pretty a pretty fair criticism of what I'm
saying here. On the other hand, I'm not too sure that wouldn't have
been the right call. Using SQL to tailor the wire protocol format
feels like some kind of layering inversion to me. I think we should be
working toward a state where it's more clear which things are "owned"
at the wire protocol level and which things are "owned" at the SQL
level, and this seems to be going in exactly the opposite direction,
and in fact probably taking things further in that direction than
we've ever gone before.

--
Robert Haas
EDB: http://www.enterprisedb.com

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#20)
Re: default result formats setting

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Mar 24, 2021 at 10:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

A client that is sending -1 and assuming that it will get back
a particular format could get broken if the GUC doesn't have the
value it thinks, true. But I'd argue that such code is unreasonably
non-robust. Can't we solve this by recommending that clients using
this feature always double-check which format they actually got?
ISTM that the use-cases for the feature involve checking what data
type you got anyway, so that's not an unreasonable added requirement.

I suppose that's a fair idea, but to me it still feels a bit like a
round peg in the square hole. Suppose for example that there's a
client application which wants to talk to a connection pooler which in
turn wants to talk to the server. Let's also suppose that connection
pooler isn't just a pass-through, but wants to redirect client
connections to various servers or even intercept queries and result
sets and make changes as the data passes by. It can do that by parsing
SQL and solving the halting problem, whereas if this were a
protocol-level option it would be completely doable. Now you could say
"well, by that argument, DateStyle ought to be a protocol-level
option, too," and that's pretty a pretty fair criticism of what I'm
saying here. On the other hand, I'm not too sure that wouldn't have
been the right call. Using SQL to tailor the wire protocol format
feels like some kind of layering inversion to me.

I can't say that I'm 100% comfortable with it either, but the alternative
seems quite unpleasant, precisely because the client side might have
multiple layers involved. If we make it a wire-protocol thing then
a whole lot of client API thrashing is going to ensue to transmit the
desired setting up and down the stack. As an example, libpq doesn't
really give a darn which data format is returned: it is the application
using libpq that would want to be able to set this option. If libpq
has to be involved in transmitting the option to the backend, then we
need a new libpq API call to tell it to do that. Rinse and repeat
for anything that wraps libpq. And, in the end, it's not real clear
which client-side layer *should* have control of this. In some cases
you might want the decision to be taken quite high up, because which
format is really more efficient will depend on the total usage picture
for a given application, which low-level code like libpq wouldn't know.
Having a library decide that "this buck stops with me" is likely to be
the wrong thing.

I do not understand the structure of the client stack for JDBC, but
I wonder whether there won't be similar issues there.

As you say, DateStyle and the like are precedents for things that
*could* break application stacks, and in another universe maybe we'd
have managed them differently. In the end though, they've been like
that for a long time and we've not heard many complaints about them.
So I'm inclined to think that that precedent says this is OK too.

BTW, I thought briefly about whether we could try to lock things down
a bit by marking the GUC as PGC_BACKEND, which would effectively mean
that clients would have to send it in the startup packet. However,
that would verge on making it unusable for non-built-in datatypes,
for which you need to look up the OID first. So I don't think that'd
be an improvement.

I think we should be
working toward a state where it's more clear which things are "owned"
at the wire protocol level and which things are "owned" at the SQL
level, and this seems to be going in exactly the opposite direction,

I don't think I buy the premise that there are exactly two levels
on the client side.

regards, tom lane

#22Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
Re: default result formats setting

On Wed, Mar 24, 2021 at 12:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think I buy the premise that there are exactly two levels
on the client side.

Thanks for sharing your thoughts on this. I agree it's a complex
issue, and the idea that there are possibly more than two logical
levels is, for me, maybe your most interesting observation.

--
Robert Haas
EDB: http://www.enterprisedb.com

#23Dave Cramer
davecramer@postgres.rocks
In reply to: Robert Haas (#22)
Re: default result formats setting

On Sun, 7 Aug 2022 at 09:58, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Mar 24, 2021 at 12:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think I buy the premise that there are exactly two levels
on the client side.

Thanks for sharing your thoughts on this. I agree it's a complex
issue, and the idea that there are possibly more than two logical
levels is, for me, maybe your most interesting observation.

--
Robert Haas
EDB: http://www.enterprisedb.co <http://www.enterprisedb.com&gt;

I'd like to revive this thread.

I have put in a patch to do the same thing.
PostgreSQL: Re: Proposal to provide the facility to set binary format
output for specific OID's per session
</messages/by-id/CADK3HHJFVS1VWxGDKov8XMeFzyxyGJAyzCRQUwjvso+NMo+ofA@mail.gmail.com&gt;

Upthread Tom mused about how the JDBC driver would handle it. I can tell
you that it handles it fine with no changes as does the go driver. Further
as Jack pointed out it provides significant performance benefits.

The original discussion correctly surmises that the DESCRIBE statement is
rarely (if ever) used as any advantages of sending are nullified by the
cost of sending it.

I prefer the GUC as this allows pools to be configured to reset the setting
when returning the connection to the pool and setting it correctly for the
client when borrowing the connection.

Regards,

Dave

Show quoted text