libpq Alternate Row Processor

Started by Kyle Gearhartalmost 9 years ago10 messages
#1Kyle Gearhart
kyle.gearhart@indigohill.io

The guts of pqRowProcessor in libpq does a good bit of work to maintain the internal data structure of a PGresult. There are a few use cases where the caller doesn't need the ability to access the result set row by row, column by column using PQgetvalue. Think of an ORM that is just going to copy the data from PGresult for each row into its own structures.

I've got a working proof of concept that allows the caller to attach a callback that pqRowProcessor will call instead of going thru its own routine. This eliminates all the copying of data from the PGconn buffer to a PGresult buffer and then ultimately a series of PQgetvalue calls by the client. The callback allows the caller to receive each row's data directly from the PGconn buffer.

It would require exposing struct pgDataValue in libpq-fe.h. The prototype for the callback pointer would be:
int (*PQrowProcessorCB)(PGresult*, const PGdataValue*, int col_count, void *user_data);

My initial testing shows a significant performance improvement. I'd like some opinions on this before wiring up a performance proof and updating the documentation for a formal patch submission.

Kyle Gearhart

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Kyle Gearhart (#1)
Re: libpq Alternate Row Processor

On 2/3/17 3:53 PM, Kyle Gearhart wrote:

The guts of pqRowProcessor in libpq does a good bit of work to maintain the internal data structure of a PGresult. There are a few use cases where the caller doesn't need the ability to access the result set row by row, column by column using PQgetvalue. Think of an ORM that is just going to copy the data from PGresult for each row into its own structures.

I've got a working proof of concept that allows the caller to attach a callback that pqRowProcessor will call instead of going thru its own routine. This eliminates all the copying of data from the PGconn buffer to a PGresult buffer and then ultimately a series of PQgetvalue calls by the client. The callback allows the caller to receive each row's data directly from the PGconn buffer.

It would require exposing struct pgDataValue in libpq-fe.h. The prototype for the callback pointer would be:
int (*PQrowProcessorCB)(PGresult*, const PGdataValue*, int col_count, void *user_data);

My initial testing shows a significant performance improvement. I'd like some opinions on this before wiring up a performance proof and updating the documentation for a formal patch submission.

I just did essentially the same thing for SPI (use a callback to allow
the caller to handle the tuple instead of shoving it into a tuplestore).
A simple test in plpython showed a 460% improvement.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kyle Gearhart (#1)
Re: libpq Alternate Row Processor

Kyle Gearhart <kyle.gearhart@indigohill.io> writes:

The guts of pqRowProcessor in libpq does a good bit of work to maintain the internal data structure of a PGresult. There are a few use cases where the caller doesn't need the ability to access the result set row by row, column by column using PQgetvalue. Think of an ORM that is just going to copy the data from PGresult for each row into its own structures.

It seems like you're sort of reinventing "single row mode":
https://www.postgresql.org/docs/devel/static/libpq-single-row-mode.html

Do we really need yet another way of breaking the unitary-query-result
abstraction?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Kyle Gearhart
kyle.gearhart@indigohill.io
In reply to: Tom Lane (#3)
Re: libpq Alternate Row Processor

From: Tom Lane [mailto:tgl@sss.pgh.pa.us]:

Kyle Gearhart <kyle.gearhart@indigohill.io> writes:

The guts of pqRowProcessor in libpq does a good bit of work to maintain the internal data structure of a PGresult. There are a few use cases where the caller doesn't need the ability to access the result set row by row, column by column using PQgetvalue. Think of an ORM that is just going to copy the data from PGresult for each row into its own structures.

It seems like you're sort of reinventing "single row mode":

https://www.postgresql.org/docs/devel/static/libpq-single-row-mode.html

Do we really need yet another way of breaking the unitary-query-result abstraction?

If it's four times faster...then the option should be available in libpq. I'm traveling tomorrow but will try to get a patch and proof with pgbench dataset up by the middle of the week.

The performance gains are consistent with Jim Nasby's findings with SPI.

Kyle Gearhart

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Kyle Gearhart
kyle.gearhart@indigohill.io
In reply to: Kyle Gearhart (#4)
2 attachment(s)
Re: libpq Alternate Row Processor

From: Tom Lane [mailto:tgl@sss.pgh.pa.us]:

Kyle Gearhart <kyle.gearhart@indigohill.io> writes:

The guts of pqRowProcessor in libpq does a good bit of work to maintain the internal data structure of a PGresult. There are a few use cases where the caller doesn't need the ability to access the result set row by row, column by column using PQgetvalue. Think of an ORM that is just going to copy the data from PGresult for each row into its own structures.

It seems like you're sort of reinventing "single row mode":

https://www.postgresql.org/docs/devel/static/libpq-single-row-mode.html

Do we really need yet another way of breaking the unitary-query-result abstraction?

If it's four times faster...then the option should be available in libpq. I'm traveling tomorrow but will try to get a patch and proof with pgbench dataset up by the middle of the week.

Attached is a proof, test program and test results. No documentation changes have been included at this time.

It was tested against a pgbench_accounts record set with 100,000 records. Overall, wall clock improves 24%. User time elapsed is a 430% improvement. About half the time is spent waiting on the IO with the callback. With the regular pqRowProcessor only about 16% of the time is spent waiting on IO.

The test program follows the pgbench program's command line options, with an added parameter called "m", short for mode. Set the option to "row" for single row processing and "cb" for callback processing.

I did not provision for the test program to accept a password from a prompt, you'll have to pass that in the arguments.

Attachments:

libpq_rp.patchapplication/octet-stream; name=libpq_rp.patchDownload
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 21dd772..729941f 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -171,3 +171,4 @@ PQsslAttributeNames       168
 PQsslAttribute            169
 PQsetErrorContextVisibility 170
 PQresultVerboseErrorMessage 171
+PQsetRowProcessor			172
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b551875..1b3fbd8 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1036,6 +1036,15 @@ pqRowProcessor(PGconn *conn, const char **errmsgp)
 	int			i;
 
 	/*
+	 * If we have a row processor set, just return the row processor.
+	 */
+	if (conn->row_processor_set)
+		return conn->row_processor(res, 
+								   columns, 
+								   nfields, 
+								   conn->row_processor_data); 
+
+	/*
 	 * In single-row mode, make a new PGresult that will hold just this one
 	 * row; the original conn->result is left unchanged so that it can be used
 	 * again as the template for future rows.
@@ -1391,6 +1400,11 @@ PQsendQueryStart(PGconn *conn)
 	/* reset single-row processing mode */
 	conn->singleRowMode = false;
 
+	/* Clean up the row processor */
+	conn->row_processor_set = false;
+	conn->row_processor = 0;
+	conn->row_processor_data = 0;
+
 	/* ready to send command message */
 	return true;
 }
@@ -1623,6 +1637,45 @@ PQsetSingleRowMode(PGconn *conn)
 }
 
 /*
+ * Sets a row processor, allowing the caller to bypass the standard
+ * pqRowProcessor with their own.
+ */
+int
+PQsetRowProcessor(PGconn *conn, PQrowProcessor rp, void *data)
+{
+	if (!conn)
+		return 0;
+
+	if (conn->asyncStatus != PGASYNC_BUSY)
+		return 0;
+
+	if (conn->result)
+		return 0;
+
+	/*
+	 * Setup the row processor
+	 */
+	conn->row_processor_set = true;
+	conn->row_processor = rp;
+	conn->row_processor_data = data;
+
+	return 1;
+}
+
+/*
+ * Clears the row processor
+ */
+int
+PQclearRowProcessor(PGconn *conn)
+{
+	conn->row_processor_set = false;
+	conn->row_processor = NULL;
+	conn->row_processor_data = NULL;
+
+	return 1;
+}
+
+/*
  * Consume any available input from the backend
  * 0 return: some kind of trouble
  * 1 return: no problem
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 1b53d0e..af32fff 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -241,6 +241,47 @@ typedef struct pgresAttDesc
 	int			atttypmod;		/* type-specific modifier info */
 } PGresAttDesc;
 
+/*
+ * Data for a single attribute of a single tuple
+ *
+ * We use char* for Attribute values.
+ *
+ * The value pointer always points to a null-terminated area; we add a
+ * null (zero) byte after whatever the backend sends us.  This is only
+ * particularly useful for text values ... with a binary value, the
+ * value might have embedded nulls, so the application can't use C string
+ * operators on it.  But we add a null anyway for consistency.
+ * Note that the value itself does not contain a length word.
+ *
+ * A NULL attribute is a special case in two ways: its len field is NULL_LEN
+ * and its value field points to null_field in the owning PGresult.  All the
+ * NULL attributes in a query result point to the same place (there's no need
+ * to store a null string separately for each one).
+ */
+
+#define PG_NULL_LEN		(-1)	/* pg_result len for NULL value */
+
+/* PGdataValue represents a data field value being passed to a row processor.
+ * It could be either text or binary data; text data is not zero-terminated.
+ * A SQL NULL is represented by len < 0; then value is still valid but there
+ * are no data bytes there.
+ */
+typedef struct pgDataValue
+{
+	int			len;			/* data length in bytes, or <0 if NULL */
+	const char *value;			/* data value, without zero-termination */
+} PGdataValue;
+
+/*
+ * When PQsetRowProcessor sets a PQrowProcessor, libpq will call the
+ * function for each row received from the back end.  It's the users
+ * responsibilty to COPY the data for each column to memory.  PQrowProcessor
+ * should return 1 in the event of an error and 0 for sucessfull processing.
+ */
+typedef int (*PQrowProcessor)(PGresult *res, const PGdataValue *cols,
+							  int col_count, void *data); 
+
+
 /* ----------------
  * Exported functions of libpq
  * ----------------
@@ -418,6 +459,8 @@ extern int PQsendQueryPrepared(PGconn *conn,
 					int resultFormat);
 extern int	PQsetSingleRowMode(PGconn *conn);
 extern PGresult *PQgetResult(PGconn *conn);
+extern int PQsetRowProcessor(PGconn *conn, PQrowProcessor rp, void *data); 
+extern int PQclearRowProcessor(PGconn *conn);
 
 /* Routines for managing an asynchronous query */
 extern int	PQisBusy(PGconn *conn);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 7289bd1..d4ce72d 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -281,16 +281,6 @@ typedef struct pgLobjfuncs
 	Oid			fn_lo_write;	/* OID of backend function LOwrite		*/
 } PGlobjfuncs;
 
-/* PGdataValue represents a data field value being passed to a row processor.
- * It could be either text or binary data; text data is not zero-terminated.
- * A SQL NULL is represented by len < 0; then value is still valid but there
- * are no data bytes there.
- */
-typedef struct pgDataValue
-{
-	int			len;			/* data length in bytes, or <0 if NULL */
-	const char *value;			/* data value, without zero-termination */
-} PGdataValue;
 
 typedef enum pg_conn_host_type
 {
@@ -389,6 +379,11 @@ struct pg_conn
 	char		copy_is_binary; /* 1 = copy binary, 0 = copy text */
 	int			copy_already_done;		/* # bytes already returned in COPY
 										 * OUT */
+
+	bool 		row_processor_set;
+	PQrowProcessor row_processor;		/* pqRowProcessor call row_processor if */
+	void 		*row_processor_data;	/* row_processor_set */
+
 	PGnotify   *notifyHead;		/* oldest unreported Notify msg */
 	PGnotify   *notifyTail;		/* newest unreported Notify msg */
 
performance.zip.zipapplication/x-zip-compressed; name=performance.zip.zipDownload
#6Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Kyle Gearhart (#5)
Re: libpq Alternate Row Processor

On 2/8/17 5:11 PM, Kyle Gearhart wrote:

Overall, wall clock improves 24%. User time elapsed is a 430% improvement. About half the time is spent waiting on the IO with the callback. With the regular pqRowProcessor only about 16% of the time is spent waiting on IO.

To wit...

real user sys
single row 0.214 0.131 0.048
callback 0.161 0.030 0.051

Those are averaged over 11 runs.

Can you run a trace to see where all the time is going in the single row
case? I don't see an obvious time-suck with a quick look through the
code. It'd be interesting to see how things change if you eliminate the
filler column from the SELECT.

Also, the backend should be buffering ~8kb of data before handing that
to the socket. If that's more than the kernel can buffer I'd expect a
serious performance hit.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Kyle Gearhart
kyle.gearhart@indigohill.io
In reply to: Jim Nasby (#6)
4 attachment(s)
Re: libpq Alternate Row Processor

On 2/9/17 7:15 PM, Jim Nasby wrote:

Can you run a trace to see where all the time is going in the single row case? I don't see an obvious time-suck with a quick look through the code. It'd be interesting to see how things change if you eliminate the filler column from the SELECT.

Traces are attached, these are with callgrind.

profile_nofiller.txt: single row without filler column
profile_filler.txt: single row with filler column
profile_filler_callback.txt: callback with filler column

pqResultAlloc looks to hit malloc pretty hard. The callback reduces all of that to a single malloc for each row.

Without the filler, here is the average over 11 runs:
Real user sys
Callback .133 .033 .035
Single Row .170 .112 .029

For the callback case, it's slightly higher than the prior results with the filler column.

Attachments:

profile_nofiller.txttext/plain; name=profile_nofiller.txtDownload
profile_filler.txttext/plain; name=profile_filler.txtDownload
profile_filler_callback.txttext/plain; name=profile_filler_callback.txtDownload
Performance Results.xlsxapplication/vnd.openxmlformats-officedocument.spreadsheetml.sheet; name="Performance Results.xlsx"Download
#8Merlin Moncure
mmoncure@gmail.com
In reply to: Kyle Gearhart (#7)
Re: libpq Alternate Row Processor

On Mon, Feb 13, 2017 at 8:46 AM, Kyle Gearhart
<kyle.gearhart@indigohill.io> wrote:

On 2/9/17 7:15 PM, Jim Nasby wrote:

Can you run a trace to see where all the time is going in the single row case? I don't see an obvious time-suck with a quick look through the code. It'd be interesting to see how things change if you eliminate the filler column from the SELECT.

Traces are attached, these are with callgrind.

profile_nofiller.txt: single row without filler column
profile_filler.txt: single row with filler column
profile_filler_callback.txt: callback with filler column

pqResultAlloc looks to hit malloc pretty hard. The callback reduces all of that to a single malloc for each row.

Couldn't that be optimized, say, by preserving malloc'd memory when in
single row mode and recycling it? (IIRC during the single row mode
discussion this optimization was voted down).

A barebones callback mode ISTM is a complete departure from the
classic PGresult interface. This code is pretty unpleasant IMO:
acct->abalance = *((int*)PQgetvalue(res, 0, i));
acct->abalance = __bswap_32(acct->abalance);

Your code is faster but foists a lot of the work on the user, so it's
kind of cheating in a way (although very carefully written
applications might be able to benefit).

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Kyle Gearhart
kyle.gearhart@indigohill.io
In reply to: Merlin Moncure (#8)
Re: libpq Alternate Row Processor

On Mon, Feb 13, 2017 Merlin Moncure wrote:

A barebones callback mode ISTM is a complete departure from the classic PGresult interface. This code is pretty unpleasant IMO:

acct->abalance = *((int*)PQgetvalue(res, 0, i)); abalance =
acct->__bswap_32(acct->abalance);

Your code is faster but foists a lot of the work on the user, so it's kind of cheating in a way (although very carefully written applications might be able to benefit).

The bit you call out above is for single row mode. Binary mode is a slippery slope, with or without the proposed callback.

Let's remember that one of the biggest, often overlooked, gains when using an ORM is that it abstracts all this mess away. The goal here is to prevent all the ORM/framework folks from having to implement protocol. Otherwise they get to wait on libpq to copy from the socket to the PGconn buffer to the PGresult structure to their buffers. The callback keeps the slowest guy on the team...on the bench.

Kyle Gearhart

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Kyle Gearhart (#7)
Re: libpq Alternate Row Processor

On 2/13/17 8:46 AM, Kyle Gearhart wrote:

profile_filler.txt
61,410,901 ???:_int_malloc [/usr/lib64/libc-2.17.so]
38,321,887 ???:_int_free [/usr/lib64/libc-2.17.so]
31,400,139 ???:pqResultAlloc [/usr/local/pgsql/lib/libpq.so.5.10]
22,839,505 ???:pqParseInput3 [/usr/local/pgsql/lib/libpq.so.5.10]
17,600,004 ???:pqRowProcessor [/usr/local/pgsql/lib/libpq.so.5.10]
16,002,817 ???:malloc [/usr/lib64/libc-2.17.so]
14,716,359 ???:pqGetInt [/usr/local/pgsql/lib/libpq.so.5.10]
14,400,000 ???:check_tuple_field_number [/usr/local/pgsql/lib/libpq.so.5.10]
13,800,324 main.c:main [/usr/local/src/postgresql-perf/test]

profile_filler_callback.txt
16,842,303 ???:pqParseInput3 [/usr/local/pgsql/lib/libpq.so.5.10]
14,810,783 ???:_int_malloc [/usr/lib64/libc-2.17.so]
12,616,338 ???:pqGetInt [/usr/local/pgsql/lib/libpq.so.5.10]
10,000,000 ???:pqSkipnchar [/usr/local/pgsql/lib/libpq.so.5.10]
9,200,004 main.c:process_callback [/usr/local/src/postgresql-perf/test]

Wow, that's a heck of a difference.

There's a ton of places where the backend copies data for no other
purpose than to put it into a different memory context. I'm wondering if
there's improvement to be had there as well, or whether palloc is so
much faster than malloc that it's not an issue. I suspect that some of
the effects are being masked by other things since presumably palloc and
memcpy are pretty cheap on small volumes of data...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers