Proposal to provide the facility to set binary format output for specific OID's per session

Started by Dave Cramerover 3 years ago15 messages
#1Dave Cramer
davecramer@gmail.com
1 attachment(s)

Greetings,

Jack Christensen the author of the go pgx driver had suggested Default
result formats should be settable per session · Discussion #5 ·
postgresql-interfaces/enhancement-ideas (github.com)
<https://github.com/postgresql-interfaces/enhancement-ideas/discussions/5&gt;

The JDBC driver has a similar problem and defers switching to binary format
until a statement has been reused 5 times; at which point we create a named
prepared statement and incur the overhead of an extra round trip for the
DESCRIBE statement. Because the extra round trip generally negates any
performance enhancements that receiving the data in binary format may
provide, we avoid using binary and receive everything in text format until
we are sure the extra trip is worth it.

Connection pools further complicate the issue: We can't use named
statements with connection pools since there is no binding of the
connection to the client. As such in the JDBC driver we recommend turning
off the ability to create a named statement and thus binary formats.

As a proof of concept I provide the attached patch which implements the
ability to specify which oids will be returned in binary format per
session.

IE set format_binary='20,21,25' for instance.

After which the specified oids will be output in binary format if there is
no describe statement or even using simpleQuery.

Both the JDBC driver and the go driver can exploit this change with no
changes. I haven't confirmed if other drivers would work without changes.

Furthermore jackc/postgresql_simple_protocol_binary_format_bench
(github.com)
<https://github.com/jackc/postgresql_simple_protocol_binary_format_bench&gt;
suggests
that there is a considerable performance benefit. To quote 'At 100 rows the
text format takes 48% longer than the binary format.'

Regards,
Dave Cramer

Attachments:

0001-add-format_binary.patchapplication/octet-stream; name=0001-add-format_binary.patchDownload
From 4ba50fcf8c1eaf0e13e34ef94b6c456a7ecc5af9 Mon Sep 17 00:00:00 2001
From: Dave Cramer <davecramer@gmail.com>
Date: Sun, 17 Jul 2022 21:40:32 -0400
Subject: [PATCH] add format_binary

fix error

working version of setting binary oids

rename binary_formats, comments

added error checking in check_format_binary
---
 src/backend/tcop/postgres.c      |  2 +
 src/backend/tcop/pquery.c        | 54 +++++++++++++++++---
 src/backend/utils/init/globals.c |  1 +
 src/backend/utils/misc/guc.c     | 85 ++++++++++++++++++++++++++++++++
 src/include/miscadmin.h          |  1 +
 5 files changed, 136 insertions(+), 7 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..ac62094707 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -105,6 +105,8 @@ int			PostAuthDelay = 0;
 /* Time between checks that the client is still connected. */
 int			client_connection_check_interval = 0;
 
+Oid			*binary_format_oids = NULL;
+
 /* ----------------
  *		private typedefs etc
  * ----------------
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..e11f55b36e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -58,6 +58,8 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static int findOid( Oid *binary_oids, Oid oid);
+extern Oid  *binary_format_oids;
 
 
 /*
@@ -644,18 +646,56 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 	}
 	else if (nFormats > 0)
 	{
-		/* single format specified, use for all columns */
-		int16		format1 = formats[0];
+		// The client has requested binary formats for some types
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
 
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = format1;
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else
+		{
+			/* single format specified, use for all columns */
+			int16		format1 = formats[0];
+
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = format1;
+		}
 	}
 	else
 	{
-		/* use default format for all columns */
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = 0;
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
+
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else {
+			/* use default format for all columns */
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = 0;
+		}
+	}
+}
+
+/*
+* Linear search through the array of oids.
+* I don't expect this to ever be a large array
+*/
+static int findOid( Oid *binary_oids, Oid oid)
+{
+	Oid *tmp = binary_oids;
+	while (tmp && *tmp != InvalidOid)
+	{
+		if (*tmp++ == oid) return 1;
 	}
+	return 0;
 }
 
 /*
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1a5d29ac9b..31188462ec 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -123,6 +123,7 @@ int			IntervalStyle = INTSTYLE_POSTGRES;
 bool		enableFsync = true;
 bool		allowSystemTableMods = false;
 int			work_mem = 4096;
+char         *format_binary = NULL;
 double		hash_mem_multiplier = 2.0;
 int			maintenance_work_mem = 65536;
 int			max_parallel_maintenance_workers = 2;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..f8bdfcffe5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -143,6 +143,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool ignore_invalid_pages;
 extern bool synchronize_seqscans;
+extern Oid  *binary_format_oids;
 
 #ifdef TRACE_SYNCSCAN
 extern bool trace_syncscan;
@@ -242,6 +243,8 @@ static bool check_recovery_target_lsn(char **newval, void **extra, GucSource sou
 static void assign_recovery_target_lsn(const char *newval, void *extra);
 static bool check_primary_slot_name(char **newval, void **extra, GucSource source);
 static bool check_default_with_oids(bool *newval, void **extra, GucSource source);
+static bool check_format_binary(char **newval, void **extra, GucSource source);
+static void assign_format_binary(const char*newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -4296,6 +4299,17 @@ static struct config_string ConfigureNamesString[] =
 		"",
 		NULL, NULL, NULL
 	},
+	{
+		{"format_binary", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Sets the type Oid's to be returned in binary format"),
+			gettext_noop("Set by the client to indicate which types are to be "
+						 "returned in binary format. "),
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&format_binary,
+		"",
+		check_format_binary, assign_format_binary, NULL
+	},
 
 	{
 		{"search_path", PGC_USERSET, CLIENT_CONN_STATEMENT,
@@ -12892,4 +12906,75 @@ check_default_with_oids(bool *newval, void **extra, GucSource source)
 	return true;
 }
 
+static bool
+check_format_binary( char **newval, void **extra, GucSource source)
+{
+	// sanity check
+	if (*newval == NULL)
+		return false;
+	
+	if (strcmp(*newval,"") == 0)
+		return true;
+
+	char *tmp = palloc(strlen(*newval));
+	strcpy(tmp, *newval);
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		Oid candidate = atooid(token);
+		if (candidate > OID_MAX) 
+			GUC_check_errdetail("OID out of range found in %s, %s", *newval, token);
+
+		// atooid will return 0 aka InvalidOid if it can't convert the string or 0 
+		// if it's really 0	
+		if (candidate == InvalidOid)
+		{	
+			if (errno == EINVAL)
+				GUC_check_errdetail("%s has invalid characters at %s",
+					*newval, token);
+			else
+				GUC_check_errdetail("InvalidOid (0) found in %s", *newval);
+			return false;
+		}
+		else
+			token = strtok(NULL, ",");
+	}
+	return true;
+}
+
+static void
+assign_format_binary(const char *newval, void *extra)
+{
+	// check for errors or nothing to do
+	if (newval == NULL || strcmp(newval, "") == 0)
+		return;
+
+	char *tmp = palloc(strlen(newval));
+	strcpy(tmp, newval);
+	// unlikely to have more than 16
+	int length = 16;
+	// +1 for the InvalidOid marker at the end
+	Oid *tmpOids = palloc(length+1);
+	int i = 0;
+	
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		tmpOids[i++] = atooid(token);
+		if (i > length)
+		{
+			length += 16;
+			tmpOids = repalloc(tmpOids, length+1);
+		}
+		token = strtok(NULL, ",");
+	}
+	tmpOids[i] = InvalidOid;
+	binary_format_oids = tmpOids;
+
+}
+
+
+
 #include "guc-file.c"
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index ea9a56d395..fbe1707a01 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -275,6 +275,7 @@ extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
 extern PGDLLIMPORT bool VacuumCostActive;
+extern PGDLLIMPORT char *format_binary;
 
 
 /* in tcop/postgres.c */
-- 
2.32.1 (Apple Git-133)

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Dave Cramer (#1)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com> wrote in

As a proof of concept I provide the attached patch which implements the
ability to specify which oids will be returned in binary format per
session.

...

Both the JDBC driver and the go driver can exploit this change with no
changes. I haven't confirmed if other drivers would work without changes.

I'm not sure about the needs of that, but binary exchange format is
not the one that can be turned on ignoring the peer's capability. If
JDBC driver wants some types be sent in binary format, it seems to be
able to be specified in bind message.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Dave Cramer
davecramer@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Dave Cramer

On Sun, 24 Jul 2022 at 23:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:

At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com>
wrote in

As a proof of concept I provide the attached patch which implements the
ability to specify which oids will be returned in binary format per
session.

...

Both the JDBC driver and the go driver can exploit this change with no
changes. I haven't confirmed if other drivers would work without changes.

I'm not sure about the needs of that, but binary exchange format is
not the one that can be turned on ignoring the peer's capability.

I'm not sure what this means. The client is specifying which types it wants
in binary format.

If
JDBC driver wants some types be sent in binary format, it seems to be
able to be specified in bind message.

To be clear it's not just the JDBC client; the original idea came from the
author of go driver.
And yes you can specify it in the bind message but you have to specify it
in *every* bind message which pretty much negates any advantage you might
get out of binary format due to the extra round trip.

Regards,
Dave

Show quoted text

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#4Jack Christensen
jack@jackchristensen.com
In reply to: Dave Cramer (#3)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On Mon, Jul 25, 2022 at 4:57 AM Dave Cramer <davecramer@gmail.com> wrote:

Dave Cramer

On Sun, 24 Jul 2022 at 23:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:

At Fri, 22 Jul 2022 11:00:18 -0400, Dave Cramer <davecramer@gmail.com>
wrote in

As a proof of concept I provide the attached patch which implements the
ability to specify which oids will be returned in binary format per
session.

...

Both the JDBC driver and the go driver can exploit this change with no
changes. I haven't confirmed if other drivers would work without

changes.

I'm not sure about the needs of that, but binary exchange format is
not the one that can be turned on ignoring the peer's capability.

I'm not sure what this means. The client is specifying which types it
wants in binary format.

If
JDBC driver wants some types be sent in binary format, it seems to be
able to be specified in bind message.

To be clear it's not just the JDBC client; the original idea came from the
author of go driver.
And yes you can specify it in the bind message but you have to specify it
in *every* bind message which pretty much negates any advantage you might
get out of binary format due to the extra round trip.

Regards,
Dave

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

The advantage is to be able to use the binary format with only a single
network round trip in cases where prepared statements are not possible.
e.g. when using PgBouncer. Using the simple protocol with this patch lets
users of pgx (the Go driver mentioned above) and PgBouncer use the binary
format. The performance gains can be significant especially with types such
as timestamptz that are very slow to parse.

As far as only sending binary types that the client can understand, the
client driver would call `set format_binary` at the beginning of the
session.

Jack Christensen

#5Joe Conway
mail@joeconway.com
In reply to: Jack Christensen (#4)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On 7/25/22 10:07, Jack Christensen wrote:

The advantage is to be able to use the binary format with only a single
network round trip in cases where prepared statements are not possible.
e.g. when using PgBouncer. Using the simple protocol with this patch
lets users of pgx (the Go driver mentioned above) and PgBouncer use the
binary format. The performance gains can be significant especially with
types such as timestamptz that are very slow to parse.

As far as only sending binary types that the client can understand, the
client driver would call `set format_binary` at the beginning of the
session.

+1 makes a lot of sense to me.

Dave please add this to the open commitfest (202209)

--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#6Sehrope Sarkuni
sehrope@jackdb.com
In reply to: Joe Conway (#5)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Idea here makes sense and I've seen this brought up repeatedly on the JDBC
lists.

Does the driver need to be aware that this SET command was executed? I'm
wondering what happens if an end user executes this with an OID the driver
does not actually know how to handle.

+ Oid *tmpOids = palloc(length+1);
...
+ tmpOids = repalloc(tmpOids, length+1);

These should be: sizeof(Oid) * (length + 1)

Also, I think you need to specify an explicit context via
MemoryContextAlloc or the allocated memory will be in the default context
and released at the end of the command.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

#7Dave Cramer
davecramer@gmail.com
In reply to: Sehrope Sarkuni (#6)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Hi Sehrope,

On Mon, 25 Jul 2022 at 17:22, Sehrope Sarkuni <sehrope@jackdb.com> wrote:

Idea here makes sense and I've seen this brought up repeatedly on the JDBC
lists.

Does the driver need to be aware that this SET command was executed? I'm
wondering what happens if an end user executes this with an OID the driver
does not actually know how to handle.

I suppose there would be a failure to read the attribute correctly.

+ Oid *tmpOids = palloc(length+1);
...
+ tmpOids = repalloc(tmpOids, length+1);

These should be: sizeof(Oid) * (length + 1)

Yes they should, thanks!

Also, I think you need to specify an explicit context via
MemoryContextAlloc or the allocated memory will be in the default context
and released at the end of the command.

Also good catch

Thanks,

Dave

Show quoted text
#8Dave Cramer
davecramer@gmail.com
In reply to: Dave Cramer (#7)
1 attachment(s)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Hi Sehrope,

On Mon, 25 Jul 2022 at 17:53, Dave Cramer <davecramer@gmail.com> wrote:

Hi Sehrope,

On Mon, 25 Jul 2022 at 17:22, Sehrope Sarkuni <sehrope@jackdb.com> wrote:

Idea here makes sense and I've seen this brought up repeatedly on the
JDBC lists.

Does the driver need to be aware that this SET command was executed? I'm
wondering what happens if an end user executes this with an OID the driver
does not actually know how to handle.

I suppose there would be a failure to read the attribute correctly.

+ Oid *tmpOids = palloc(length+1);
...
+ tmpOids = repalloc(tmpOids, length+1);

These should be: sizeof(Oid) * (length + 1)

Yes they should, thanks!

Also, I think you need to specify an explicit context via
MemoryContextAlloc or the allocated memory will be in the default context
and released at the end of the command.

Also good catch

Thanks,

Attached patch to correct these deficiencies.

Thanks again,

Show quoted text

Dave

Attachments:

0002-allocate-the-correct-amount-of-memory-in-permanent-s.patchapplication/octet-stream; name=0002-allocate-the-correct-amount-of-memory-in-permanent-s.patchDownload
From 27fec1bbb076a11ac6139cb41456cd02ea25b02a Mon Sep 17 00:00:00 2001
From: Dave Cramer <davecramer@gmail.com>
Date: Tue, 26 Jul 2022 08:01:30 -0400
Subject: [PATCH 2/2] allocate the correct amount of memory in permanent
 storage

---
 src/backend/utils/misc/guc.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f8bdfcffe5..255e541d55 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12952,10 +12952,14 @@ assign_format_binary(const char *newval, void *extra)
 
 	char *tmp = palloc(strlen(newval));
 	strcpy(tmp, newval);
+
+	/* Must save OID list in permanent storage. */
+	MemoryContext oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+
 	// unlikely to have more than 16
 	int length = 16;
 	// +1 for the InvalidOid marker at the end
-	Oid *tmpOids = palloc(length+1);
+	Oid *tmpOids = palloc(sizeof(Oid)*(length+1));
 	int i = 0;
 	
 	char *token = strtok(tmp, ",");
@@ -12966,12 +12970,13 @@ assign_format_binary(const char *newval, void *extra)
 		if (i > length)
 		{
 			length += 16;
-			tmpOids = repalloc(tmpOids, length+1);
+			tmpOids = repalloc(tmpOids, sizeof(Oid)*(length+1));
 		}
 		token = strtok(NULL, ",");
 	}
 	tmpOids[i] = InvalidOid;
 	binary_format_oids = tmpOids;
+	MemoryContextSwitchTo(oldcxt);
 
 }
 
-- 
2.32.1 (Apple Git-133)

#9Justin Pryzby
pryzby@telsasoft.com
In reply to: Dave Cramer (#8)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:
/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

--
Justin

#10Dave Cramer
davecramer@gmail.com
In reply to: Justin Pryzby (#9)
1 attachment(s)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:

/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

squashed patch attached

Dave

Attachments:

0001-add-format_binary.patchapplication/octet-stream; name=0001-add-format_binary.patchDownload
From dcbb6b597ef44b109d8499274a07eb13779e07e7 Mon Sep 17 00:00:00 2001
From: Dave Cramer <davecramer@gmail.com>
Date: Sun, 17 Jul 2022 21:40:32 -0400
Subject: [PATCH] add format_binary

fix error

working version of setting binary oids

rename binary_formats, comments

added error checking in check_format_binary

allocate the correct amount of memory in permanent storage
---
 src/backend/tcop/postgres.c      |  2 +
 src/backend/tcop/pquery.c        | 54 ++++++++++++++++---
 src/backend/utils/init/globals.c |  1 +
 src/backend/utils/misc/guc.c     | 90 ++++++++++++++++++++++++++++++++
 src/include/miscadmin.h          |  1 +
 5 files changed, 141 insertions(+), 7 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 6f18b68856..ac62094707 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -105,6 +105,8 @@ int			PostAuthDelay = 0;
 /* Time between checks that the client is still connected. */
 int			client_connection_check_interval = 0;
 
+Oid			*binary_format_oids = NULL;
+
 /* ----------------
  *		private typedefs etc
  * ----------------
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..e11f55b36e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -58,6 +58,8 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static int findOid( Oid *binary_oids, Oid oid);
+extern Oid  *binary_format_oids;
 
 
 /*
@@ -644,18 +646,56 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 	}
 	else if (nFormats > 0)
 	{
-		/* single format specified, use for all columns */
-		int16		format1 = formats[0];
+		// The client has requested binary formats for some types
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
 
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = format1;
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else
+		{
+			/* single format specified, use for all columns */
+			int16		format1 = formats[0];
+
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = format1;
+		}
 	}
 	else
 	{
-		/* use default format for all columns */
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = 0;
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
+
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else {
+			/* use default format for all columns */
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = 0;
+		}
+	}
+}
+
+/*
+* Linear search through the array of oids.
+* I don't expect this to ever be a large array
+*/
+static int findOid( Oid *binary_oids, Oid oid)
+{
+	Oid *tmp = binary_oids;
+	while (tmp && *tmp != InvalidOid)
+	{
+		if (*tmp++ == oid) return 1;
 	}
+	return 0;
 }
 
 /*
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1a5d29ac9b..31188462ec 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -123,6 +123,7 @@ int			IntervalStyle = INTSTYLE_POSTGRES;
 bool		enableFsync = true;
 bool		allowSystemTableMods = false;
 int			work_mem = 4096;
+char         *format_binary = NULL;
 double		hash_mem_multiplier = 2.0;
 int			maintenance_work_mem = 65536;
 int			max_parallel_maintenance_workers = 2;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..255e541d55 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -143,6 +143,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool ignore_invalid_pages;
 extern bool synchronize_seqscans;
+extern Oid  *binary_format_oids;
 
 #ifdef TRACE_SYNCSCAN
 extern bool trace_syncscan;
@@ -242,6 +243,8 @@ static bool check_recovery_target_lsn(char **newval, void **extra, GucSource sou
 static void assign_recovery_target_lsn(const char *newval, void *extra);
 static bool check_primary_slot_name(char **newval, void **extra, GucSource source);
 static bool check_default_with_oids(bool *newval, void **extra, GucSource source);
+static bool check_format_binary(char **newval, void **extra, GucSource source);
+static void assign_format_binary(const char*newval, void *extra);
 
 /* Private functions in guc-file.l that need to be called from guc.c */
 static ConfigVariable *ProcessConfigFileInternal(GucContext context,
@@ -4296,6 +4299,17 @@ static struct config_string ConfigureNamesString[] =
 		"",
 		NULL, NULL, NULL
 	},
+	{
+		{"format_binary", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Sets the type Oid's to be returned in binary format"),
+			gettext_noop("Set by the client to indicate which types are to be "
+						 "returned in binary format. "),
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&format_binary,
+		"",
+		check_format_binary, assign_format_binary, NULL
+	},
 
 	{
 		{"search_path", PGC_USERSET, CLIENT_CONN_STATEMENT,
@@ -12892,4 +12906,80 @@ check_default_with_oids(bool *newval, void **extra, GucSource source)
 	return true;
 }
 
+static bool
+check_format_binary( char **newval, void **extra, GucSource source)
+{
+	// sanity check
+	if (*newval == NULL)
+		return false;
+	
+	if (strcmp(*newval,"") == 0)
+		return true;
+
+	char *tmp = palloc(strlen(*newval));
+	strcpy(tmp, *newval);
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		Oid candidate = atooid(token);
+		if (candidate > OID_MAX) 
+			GUC_check_errdetail("OID out of range found in %s, %s", *newval, token);
+
+		// atooid will return 0 aka InvalidOid if it can't convert the string or 0 
+		// if it's really 0	
+		if (candidate == InvalidOid)
+		{	
+			if (errno == EINVAL)
+				GUC_check_errdetail("%s has invalid characters at %s",
+					*newval, token);
+			else
+				GUC_check_errdetail("InvalidOid (0) found in %s", *newval);
+			return false;
+		}
+		else
+			token = strtok(NULL, ",");
+	}
+	return true;
+}
+
+static void
+assign_format_binary(const char *newval, void *extra)
+{
+	// check for errors or nothing to do
+	if (newval == NULL || strcmp(newval, "") == 0)
+		return;
+
+	char *tmp = palloc(strlen(newval));
+	strcpy(tmp, newval);
+
+	/* Must save OID list in permanent storage. */
+	MemoryContext oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+
+	// unlikely to have more than 16
+	int length = 16;
+	// +1 for the InvalidOid marker at the end
+	Oid *tmpOids = palloc(sizeof(Oid)*(length+1));
+	int i = 0;
+	
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		tmpOids[i++] = atooid(token);
+		if (i > length)
+		{
+			length += 16;
+			tmpOids = repalloc(tmpOids, sizeof(Oid)*(length+1));
+		}
+		token = strtok(NULL, ",");
+	}
+	tmpOids[i] = InvalidOid;
+	binary_format_oids = tmpOids;
+	MemoryContextSwitchTo(oldcxt);
+
+}
+
+
+
 #include "guc-file.c"
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index ea9a56d395..fbe1707a01 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -275,6 +275,7 @@ extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
 extern PGDLLIMPORT bool VacuumCostActive;
+extern PGDLLIMPORT char *format_binary;
 
 
 /* in tcop/postgres.c */
-- 
2.32.1 (Apple Git-133)

#11Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Dave Cramer (#10)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:

On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot
doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:

/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

squashed patch attached

Dave

The patch does not apply successfully; a rebase is required.

=== applying patch ./0001-add-format_binary.patch
patching file src/backend/tcop/postgres.c
Hunk #1 succeeded at 97 (offset -8 lines).
patching file src/backend/tcop/pquery.c
patching file src/backend/utils/init/globals.c
patching file src/backend/utils/misc/guc.c
Hunk #1 succeeded at 144 (offset 1 line).
Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
Hunk #3 succeeded at 4298 (offset -1 lines).
Hunk #4 FAILED at 12906.
1 out of 4 hunks FAILED -- saving rejects to file
src/backend/utils/misc/guc.c.rej
patching file src/include/miscadmin.h

--
Ibrar Ahmed

#12Dave Cramer
davecramer@gmail.com
In reply to: Ibrar Ahmed (#11)
1 attachment(s)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

On Tue, 6 Sept 2022 at 02:30, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:

On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:

On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot
doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:

/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

squashed patch attached

Dave

The patch does not apply successfully; a rebase is required.

=== applying patch ./0001-add-format_binary.patch
patching file src/backend/tcop/postgres.c
Hunk #1 succeeded at 97 (offset -8 lines).
patching file src/backend/tcop/pquery.c
patching file src/backend/utils/init/globals.c
patching file src/backend/utils/misc/guc.c
Hunk #1 succeeded at 144 (offset 1 line).
Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
Hunk #3 succeeded at 4298 (offset -1 lines).
Hunk #4 FAILED at 12906.
1 out of 4 hunks FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej
patching file src/include/miscadmin.h

Thanks,

New rebased patch attached

Dave

Show quoted text

Attachments:

0001-add-format_binary.patchapplication/octet-stream; name=0001-add-format_binary.patchDownload
From c958582ec89ec065101f61a9b61df4cd800cb3e9 Mon Sep 17 00:00:00 2001
From: Dave Cramer <davecramer@gmail.com>
Date: Sun, 17 Jul 2022 21:40:32 -0400
Subject: [PATCH] add format_binary

fix error

working version of setting binary oids

rename binary_formats, comments

added error checking in check_format_binary

allocate the correct amount of memory in permanent storage
---
 src/backend/tcop/postgres.c      |  2 +
 src/backend/tcop/pquery.c        | 54 ++++++++++++++++---
 src/backend/utils/init/globals.c |  1 +
 src/backend/utils/misc/guc.c     | 91 ++++++++++++++++++++++++++++++++
 src/include/miscadmin.h          |  1 +
 5 files changed, 142 insertions(+), 7 deletions(-)

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7bec4e4ff5..8a47bc2733 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -97,6 +97,8 @@ int			PostAuthDelay = 0;
 /* Time between checks that the client is still connected. */
 int			client_connection_check_interval = 0;
 
+Oid			*binary_format_oids = NULL;
+
 /* ----------------
  *		private typedefs etc
  * ----------------
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 5aa5a350f3..e11f55b36e 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -58,6 +58,8 @@ static uint64 DoPortalRunFetch(Portal portal,
 							   long count,
 							   DestReceiver *dest);
 static void DoPortalRewind(Portal portal);
+static int findOid( Oid *binary_oids, Oid oid);
+extern Oid  *binary_format_oids;
 
 
 /*
@@ -644,18 +646,56 @@ PortalSetResultFormat(Portal portal, int nFormats, int16 *formats)
 	}
 	else if (nFormats > 0)
 	{
-		/* single format specified, use for all columns */
-		int16		format1 = formats[0];
+		// The client has requested binary formats for some types
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
 
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = format1;
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else
+		{
+			/* single format specified, use for all columns */
+			int16		format1 = formats[0];
+
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = format1;
+		}
 	}
 	else
 	{
-		/* use default format for all columns */
-		for (i = 0; i < natts; i++)
-			portal->formats[i] = 0;
+		if ( binary_format_oids != NULL )
+		{
+			Oid targetOid;
+
+			for (i = 0; i < natts; i++){
+				targetOid = portal->tupDesc->attrs[i].atttypid;
+				portal->formats[i] = findOid(binary_format_oids, targetOid);
+			}
+		}
+		else {
+			/* use default format for all columns */
+			for (i = 0; i < natts; i++)
+				portal->formats[i] = 0;
+		}
+	}
+}
+
+/*
+* Linear search through the array of oids.
+* I don't expect this to ever be a large array
+*/
+static int findOid( Oid *binary_oids, Oid oid)
+{
+	Oid *tmp = binary_oids;
+	while (tmp && *tmp != InvalidOid)
+	{
+		if (*tmp++ == oid) return 1;
 	}
+	return 0;
 }
 
 /*
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 1a5d29ac9b..31188462ec 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -123,6 +123,7 @@ int			IntervalStyle = INTSTYLE_POSTGRES;
 bool		enableFsync = true;
 bool		allowSystemTableMods = false;
 int			work_mem = 4096;
+char         *format_binary = NULL;
 double		hash_mem_multiplier = 2.0;
 int			maintenance_work_mem = 65536;
 int			max_parallel_maintenance_workers = 2;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c336698ad5..beb91c5664 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -144,6 +144,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool ignore_invalid_pages;
 extern bool synchronize_seqscans;
+extern Oid  *binary_format_oids;
 
 #ifdef TRACE_SYNCSCAN
 extern bool trace_syncscan;
@@ -243,6 +244,8 @@ static bool check_recovery_target_lsn(char **newval, void **extra, GucSource sou
 static void assign_recovery_target_lsn(const char *newval, void *extra);
 static bool check_primary_slot_name(char **newval, void **extra, GucSource source);
 static bool check_default_with_oids(bool *newval, void **extra, GucSource source);
+static bool check_format_binary(char **newval, void **extra, GucSource source);
+static void assign_format_binary(const char*newval, void *extra);
 
 /*
  * Track whether there were any deferred checks for custom resource managers
@@ -4295,6 +4298,17 @@ static struct config_string ConfigureNamesString[] =
 		"",
 		NULL, NULL, NULL
 	},
+	{
+		{"format_binary", PGC_USERSET, CLIENT_CONN_STATEMENT,
+			gettext_noop("Sets the type Oid's to be returned in binary format"),
+			gettext_noop("Set by the client to indicate which types are to be "
+						 "returned in binary format. "),
+			GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+		},
+		&format_binary,
+		"",
+		check_format_binary, assign_format_binary, NULL
+	},
 
 	{
 		{"search_path", PGC_USERSET, CLIENT_CONN_STATEMENT,
@@ -13333,3 +13347,80 @@ check_default_with_oids(bool *newval, void **extra, GucSource source)
 
 	return true;
 }
+
+static bool
+check_format_binary( char **newval, void **extra, GucSource source)
+{
+	// sanity check
+	if (*newval == NULL)
+		return false;
+	
+	if (strcmp(*newval,"") == 0)
+		return true;
+
+	char *tmp = palloc(strlen(*newval));
+	strcpy(tmp, *newval);
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		Oid candidate = atooid(token);
+		if (candidate > OID_MAX) 
+			GUC_check_errdetail("OID out of range found in %s, %s", *newval, token);
+
+		// atooid will return 0 aka InvalidOid if it can't convert the string or 0 
+		// if it's really 0	
+		if (candidate == InvalidOid)
+		{	
+			if (errno == EINVAL)
+				GUC_check_errdetail("%s has invalid characters at %s",
+					*newval, token);
+			else
+				GUC_check_errdetail("InvalidOid (0) found in %s", *newval);
+			return false;
+		}
+		else
+			token = strtok(NULL, ",");
+	}
+	return true;
+}
+
+static void
+assign_format_binary(const char *newval, void *extra)
+{
+	// check for errors or nothing to do
+	if (newval == NULL || strcmp(newval, "") == 0)
+		return;
+
+	char *tmp = palloc(strlen(newval));
+	strcpy(tmp, newval);
+
+	/* Must save OID list in permanent storage. */
+	MemoryContext oldcxt = MemoryContextSwitchTo(TopMemoryContext);
+
+	// unlikely to have more than 16
+	int length = 16;
+	// +1 for the InvalidOid marker at the end
+	Oid *tmpOids = palloc(sizeof(Oid)*(length+1));
+	int i = 0;
+	
+	char *token = strtok(tmp, ",");
+
+	while(token != NULL)
+	{
+		tmpOids[i++] = atooid(token);
+		if (i > length)
+		{
+			length += 16;
+			tmpOids = repalloc(tmpOids, sizeof(Oid)*(length+1));
+		}
+		token = strtok(NULL, ",");
+	}
+	tmpOids[i] = InvalidOid;
+	binary_format_oids = tmpOids;
+	MemoryContextSwitchTo(oldcxt);
+
+}
+
+#include "guc-file.c"
+
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 65cf4ba50f..0f9fc3562c 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -275,6 +275,7 @@ extern PGDLLIMPORT int64 VacuumPageDirty;
 
 extern PGDLLIMPORT int VacuumCostBalance;
 extern PGDLLIMPORT bool VacuumCostActive;
+extern PGDLLIMPORT char *format_binary;
 
 
 /* in tcop/postgres.c */
-- 
2.32.1 (Apple Git-133)

#13Dave Cramer
davecramer@postgres.rocks
In reply to: Dave Cramer (#12)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Waiting on the author to do what ? I'm waiting for a review.

#14Ian Lawrence Barwick
barwick@gmail.com
In reply to: Dave Cramer (#12)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

2022年9月6日(火) 21:32 Dave Cramer <davecramer@gmail.com>:

On Tue, 6 Sept 2022 at 02:30, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:

On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com> wrote:

On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot doesn't
know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:
/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

squashed patch attached

Dave

The patch does not apply successfully; a rebase is required.

=== applying patch ./0001-add-format_binary.patch
patching file src/backend/tcop/postgres.c
Hunk #1 succeeded at 97 (offset -8 lines).
patching file src/backend/tcop/pquery.c
patching file src/backend/utils/init/globals.c
patching file src/backend/utils/misc/guc.c
Hunk #1 succeeded at 144 (offset 1 line).
Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
Hunk #3 succeeded at 4298 (offset -1 lines).
Hunk #4 FAILED at 12906.
1 out of 4 hunks FAILED -- saving rejects to file src/backend/utils/misc/guc.c.rej
patching file src/include/miscadmin.h

Thanks,

New rebased patch attached

Hi

cfbot reports the patch no longer applies [1]http://cfbot.cputube.org/patch_40_3777.log. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch again.

[1]: http://cfbot.cputube.org/patch_40_3777.log

Thanks

Ian Barwick

#15Dave Cramer
davecramer@gmail.com
In reply to: Ian Lawrence Barwick (#14)
Re: Proposal to provide the facility to set binary format output for specific OID's per session

Hi Ian,

Thanks, will do
Dave Cramer

On Thu, 3 Nov 2022 at 21:36, Ian Lawrence Barwick <barwick@gmail.com> wrote:

Show quoted text

2022年9月6日(火) 21:32 Dave Cramer <davecramer@gmail.com>:

On Tue, 6 Sept 2022 at 02:30, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:

On Fri, Aug 12, 2022 at 5:48 PM Dave Cramer <davecramer@gmail.com>

wrote:

On Fri, 5 Aug 2022 at 17:51, Justin Pryzby <pryzby@telsasoft.com>

wrote:

On Tue, Jul 26, 2022 at 08:11:04AM -0400, Dave Cramer wrote:

Attached patch to correct these deficiencies.

You sent a patch to be applied on top of the first patch, but cfbot

doesn't

know that, so it says the patch doesn't apply.
http://cfbot.cputube.org/dave-cramer.html

BTW, a previous discussion about this idea is here:

/messages/by-id/40cbb35d-774f-23ed-3079-03f938aacdae@2ndquadrant.com

squashed patch attached

Dave

The patch does not apply successfully; a rebase is required.

=== applying patch ./0001-add-format_binary.patch
patching file src/backend/tcop/postgres.c
Hunk #1 succeeded at 97 (offset -8 lines).
patching file src/backend/tcop/pquery.c
patching file src/backend/utils/init/globals.c
patching file src/backend/utils/misc/guc.c
Hunk #1 succeeded at 144 (offset 1 line).
Hunk #2 succeeded at 244 with fuzz 2 (offset 1 line).
Hunk #3 succeeded at 4298 (offset -1 lines).
Hunk #4 FAILED at 12906.
1 out of 4 hunks FAILED -- saving rejects to file

src/backend/utils/misc/guc.c.rej

patching file src/include/miscadmin.h

Thanks,

New rebased patch attached

Hi

cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch
again.

[1] http://cfbot.cputube.org/patch_40_3777.log

Thanks

Ian Barwick