selecting large result sets in psql using cursors

Started by Chris Mair over 19 years ago, 51 messages
#1 Chris Mair
list@1006.org
1 attachment(s)

Hi there,

attached is the new and fixed version of the patch for selecting
large result sets from psql using cursors.

It was previously discussed on hackers:
http://archives.postgresql.org/pgsql-hackers/2006-07/msg00231.php

Thanks again to Neil Conway for helping with this (the first
sketch of the patch was his :)

Wondering if this has a chance of getting accepted...

Bye :-)
Chris.

Attachments:

psql_cursor-4.patch (text/x-patch; charset=ISO-8859-15)
diff -rc pgsql.original/doc/src/sgml/ref/psql-ref.sgml pgsql/doc/src/sgml/ref/psql-ref.sgml
*** pgsql.original/doc/src/sgml/ref/psql-ref.sgml	Fri Jul 28 02:24:08 2006
--- pgsql/doc/src/sgml/ref/psql-ref.sgml	Fri Jul 28 02:33:20 2006
***************
*** 1201,1206 ****
--- 1201,1231 ----
          </listitem>
        </varlistentry>
  
+ 
+       <varlistentry>
+         <term><literal>\u</literal> [ { <replaceable class="parameter">filename</replaceable> | <literal>|</literal><replaceable class="parameter">command</replaceable> } ]</term>
+ 
+         <listitem>
+         <para>
+         Sends the current query input buffer to the server and
+         optionally stores the query's output in <replaceable
+         class="parameter">filename</replaceable> or pipes the output
+         into a separate Unix shell executing <replaceable
+         class="parameter">command</replaceable>.
+         Unlike <literal>\g</literal>, <literal>\u</literal> works only
+         for SELECT statements and uses a cursor to retrieve the result set.
+         Therefore <literal>\u</literal> uses only a limited amount of memory,
+         regardless of the size of the result set. It can be used whenever a
+         result set needs to be retrieved that exceeds the client's memory
+         resources. <literal>\u</literal> always gives unaligned output. It
+         does, however, use the current field separator (see <command>\pset</command>).
+         <literal>\u</literal> gives an error when trying to execute anything
+         that is not a SELECT statement.
+         </para>
+         </listitem>
+       </varlistentry>
+ 
+ 
        <varlistentry>
          <term><literal>\help</literal> (or <literal>\h</literal>) <literal>[ <replaceable class="parameter">command</replaceable> ]</literal></term>
          <listitem>
diff -rc pgsql.original/src/bin/psql/command.c pgsql/src/bin/psql/command.c
*** pgsql.original/src/bin/psql/command.c	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/command.c	Fri Jul 28 02:32:47 2006
***************
*** 830,835 ****
--- 830,866 ----
  		}
  	}
  
+ 	/*
+ 	 *  \u executes the current query buffer using a cursor
+ 	 */
+ 	else if (strcmp(cmd, "u") == 0)
+ 	{
+ 		char 	   *fname = psql_scan_slash_option(scan_state,
+ 								OT_FILEPIPE, NULL, false);
+ 
+ 		if (!fname)
+ 			pset.gfname = NULL;
+ 		else
+ 		{
+ 			expand_tilde(&fname);
+ 			pset.gfname = pg_strdup(fname);
+ 		}
+ 		free(fname);
+ 
+ 		if (query_buf && query_buf->len == 0)
+ 		{
+ 			if (!quiet)
+ 			{
+ 				puts(_("Query buffer is empty."));
+ 				fflush(stdout);
+ 			}
+ 		}
+ 		else
+ 		{
+ 			status = PSQL_CMD_SEND_USING_CURSOR;
+ 		}
+ 	}
+ 
  	/* \unset */
  	else if (strcmp(cmd, "unset") == 0)
  	{
diff -rc pgsql.original/src/bin/psql/command.h pgsql/src/bin/psql/command.h
*** pgsql.original/src/bin/psql/command.h	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/command.h	Fri Jul 28 02:32:47 2006
***************
*** 16,21 ****
--- 16,22 ----
  {
  	PSQL_CMD_UNKNOWN = 0,			/* not done parsing yet (internal only) */
  	PSQL_CMD_SEND,					/* query complete; send off */
+ 	PSQL_CMD_SEND_USING_CURSOR,		/* query complete; send off using cursor */
  	PSQL_CMD_SKIP_LINE,				/* keep building query */
  	PSQL_CMD_TERMINATE,				/* quit program */
  	PSQL_CMD_NEWEDIT,				/* query buffer was changed (e.g., via \e) */
diff -rc pgsql.original/src/bin/psql/common.c pgsql/src/bin/psql/common.c
*** pgsql.original/src/bin/psql/common.c	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/common.c	Fri Jul 28 02:32:47 2006
***************
*** 28,33 ****
--- 28,34 ----
  #include "command.h"
  #include "copy.h"
  #include "mb/pg_wchar.h"
+ #include "mbprint.h"
  
  
  /* Workarounds for Windows */
***************
*** 52,58 ****
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! 
  static bool command_no_begin(const char *query);
  
  /*
--- 53,59 ----
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! static bool is_select_command(const char *query);
  static bool command_no_begin(const char *query);
  
  /*
***************
*** 934,939 ****
--- 935,1128 ----
  
  
  /*
+  * SendQueryUsingCursor: send the (SELECT) query string to the backend
+  * using a cursor and print out results.
+  *
+  * Unlike with SendQuery(), single step mode, ON_ERROR_ROLLBACK mode,
+  * timing and format settings (except delimiters) are NOT honoured.
+  *
+  * Returns true if the query executed successfully, false otherwise.
+  */
+ bool
+ SendQueryUsingCursor(const char *query)
+ {
+ 	PGresult		*results;
+ 	bool			started_txn			= false;
+ 	PQExpBufferData	buf;
+ 	FILE			*queryFout_copy 	= NULL;
+ 	bool			queryFoutPipe_copy	= false;
+ 	int				ntuples, nfields = -1;
+ 	int				i, j;
+ 
+ 	if (!pset.db)
+ 	{
+ 		psql_error("You are currently not connected to a database.\n");
+ 		return false;
+ 	}
+ 
+ 	if (!is_select_command(query))
+ 	{
+ 		psql_error("Need a SELECT command to perform cursor fetch.\n");
+ 		return false;
+ 	}
+ 
+ 	if (VariableEquals(pset.vars, "ECHO", "queries"))
+ 	{
+ 		puts(query);
+ 		fflush(stdout);
+ 	}
+ 
+ 	if (pset.logfile)
+ 	{
+ 		fprintf(pset.logfile,
+ 				_("********* QUERY **********\n"
+ 				  "%s\n"
+ 				  "**************************\n\n"), query);
+ 		fflush(pset.logfile);
+ 	}
+ 
+ 	SetCancelConn();
+ 
+ 	/* prepare to write output to \u argument, if any */
+ 	if (pset.gfname)
+ 	{
+ 		queryFout_copy = pset.queryFout;
+ 		queryFoutPipe_copy = pset.queryFoutPipe;
+ 
+ 		pset.queryFout = stdout;    /* so it doesn't get closed */
+ 
+ 		/* open file/pipe */    
+ 		if (!setQFout(pset.gfname))
+ 		{
+ 			pset.queryFout = queryFout_copy;
+ 			pset.queryFoutPipe = queryFoutPipe_copy;
+ 			ResetCancelConn();
+ 			return false;
+ 		}
+ 	}
+ 
+ 	/* If we're not in a transaction, start one */
+ 	if (PQtransactionStatus(pset.db) == PQTRANS_IDLE)
+ 	{
+ 		results = PQexec(pset.db, "BEGIN");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 
+ 		PQclear(results);
+ 		started_txn = true;
+ 	}
+ 
+ 	initPQExpBuffer(&buf);
+ 	appendPQExpBuffer(&buf,
+ 					  "DECLARE _psql_cursor NO SCROLL CURSOR FOR %s",
+ 					  query);
+ 
+ 	results = PQexec(pset.db, buf.data);
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 
+ 	PQclear(results);
+ 	termPQExpBuffer(&buf);
+ 
+ 	for (;;)
+ 	{
+ 		/* space-time trade-off: get 100 tuples at a time */
+ 		results = PQexec(pset.db, "FETCH FORWARD 100 FROM _psql_cursor");
+ 		if (PQresultStatus(results) != PGRES_TUPLES_OK)
+ 			goto error;
+ 
+ 		ntuples = PQntuples(results);
+ 		if (ntuples == 0)
+ 		{
+ 			PQclear(results);
+ 			break;
+ 		}
+ 
+ 		if (nfields == -1)
+ 			nfields = PQnfields(results);
+ 
+ 		for (j = 0; j < ntuples; j++)
+ 		{
+ 			for (i = 0; i < nfields; i++)
+ 			{ 
+ 				fprintf(
+ 					pset.queryFout, "%s", 
+ 					(char*) mbvalidate((unsigned char*) PQgetvalue(results, j, i),
+ 					pset.popt.topt.encoding)
+ 				);
+ 				if (i != nfields - 1)
+ 				{
+ 					fprintf(pset.queryFout, "%s", pset.popt.topt.fieldSep);
+ 				}
+ 			}
+ 			fprintf(pset.queryFout, "\n");
+ 		}
+ 
+ 		PQclear(results);
+ 
+ 		if (cancel_pressed)
+ 			break;
+ 	}
+ 
+ 	/* close \u argument file/pipe, restore old setting */
+ 	if (pset.gfname)
+ 	{
+ 		setQFout(NULL);
+ 
+ 		pset.queryFout = queryFout_copy;
+ 		pset.queryFoutPipe = queryFoutPipe_copy;
+ 
+ 		free(pset.gfname);
+ 		pset.gfname = NULL;
+ 	}
+ 
+ 	results = PQexec(pset.db, "CLOSE _psql_cursor");
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 	PQclear(results);
+ 
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "ROLLBACK");
+ 		started_txn = false;
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 		PQclear(results);
+ 	}
+ 
+ 	/* check for events that may occur during query execution */
+ 
+ 	if (pset.encoding != PQclientEncoding(pset.db) &&
+ 		PQclientEncoding(pset.db) >= 0)
+ 	{
+ 		/* track effects of SET CLIENT_ENCODING */
+ 		pset.encoding = PQclientEncoding(pset.db);
+ 		pset.popt.topt.encoding = pset.encoding;
+ 		SetVariable(pset.vars, "ENCODING",
+ 					pg_encoding_to_char(pset.encoding));
+ 	}
+ 
+ 	PrintNotifications();
+ 
+ 	return true;
+ 
+ error:
+ 	psql_error("%s", PQerrorMessage(pset.db));
+ 	if (results)
+ 		PQclear(results);
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "ROLLBACK");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			psql_error("%s", PQerrorMessage(pset.db));
+ 		PQclear(results);
+ 	}
+ 	ResetCancelConn();
+ 	return false;
+ }
+ 
+ 
+ /*
   * Advance the given char pointer over white space and SQL comments.
   */
  static const char *
***************
*** 994,999 ****
--- 1183,1215 ----
  
  
  /*
+  * Check whether the specified command is a SELECT.
+  */
+ static bool
+ is_select_command(const char *query)
+ {
+ 	int wordlen;
+ 
+ 	/*
+ 	 * First we must advance over any whitespace and comments.
+ 	 */
+ 	query = skip_white_space(query);
+ 
+ 	/*
+ 	 * Check word length (since "selectx" is not "select").
+ 	 */
+ 	wordlen = 0;
+ 	while (isalpha((unsigned char) query[wordlen]))
+ 		wordlen += PQmblen(&query[wordlen], pset.encoding);
+ 
+ 	if (wordlen == 6 && pg_strncasecmp(query, "select", 6) == 0)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ 
+ /*
   * Check whether a command is one of those for which we should NOT start
   * a new transaction block (ie, send a preceding BEGIN).
   *
diff -rc pgsql.original/src/bin/psql/common.h pgsql/src/bin/psql/common.h
*** pgsql.original/src/bin/psql/common.h	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/common.h	Fri Jul 28 02:32:47 2006
***************
*** 55,60 ****
--- 55,61 ----
  extern PGresult *PSQLexec(const char *query, bool start_xact);
  
  extern bool SendQuery(const char *query);
+ extern bool SendQueryUsingCursor(const char *query);
  
  extern bool is_superuser(void);
  extern bool standard_strings(void);
diff -rc pgsql.original/src/bin/psql/help.c pgsql/src/bin/psql/help.c
*** pgsql.original/src/bin/psql/help.c	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/help.c	Fri Jul 28 02:32:47 2006
***************
*** 190,195 ****
--- 190,196 ----
  	fprintf(output, _("Query Buffer\n"));
  	fprintf(output, _("  \\e [FILE]      edit the query buffer (or file) with external editor\n"));
  	fprintf(output, _("  \\g [FILE]      send query buffer to server (and results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\u [FILE]      like \\g, but use cursor for SELECT (for large result sets)\n"));
  	fprintf(output, _("  \\p             show the contents of the query buffer\n"));
  	fprintf(output, _("  \\r             reset (clear) the query buffer\n"));
  #ifdef USE_READLINE
diff -rc pgsql.original/src/bin/psql/mainloop.c pgsql/src/bin/psql/mainloop.c
*** pgsql.original/src/bin/psql/mainloop.c	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/mainloop.c	Fri Jul 28 02:32:47 2006
***************
*** 264,270 ****
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
--- 264,270 ----
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT || slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
***************
*** 282,287 ****
--- 282,298 ----
  					/* flush any paren nesting info after forced send */
  					psql_scan_reset(scan_state);
  				}
+ 				else if (slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR)
+ 				{
+ 					success = SendQueryUsingCursor(query_buf->data);
+ 
+ 					resetPQExpBuffer(previous_buf);
+ 					appendPQExpBufferStr(previous_buf, query_buf->data);
+ 					resetPQExpBuffer(query_buf);
+ 
+ 					/* flush any paren nesting info after forced send */
+ 					psql_scan_reset(scan_state);
+ 				}
  				else if (slashCmdStatus == PSQL_CMD_NEWEDIT)
  				{
  					/* rescan query_buf as new input */
diff -rc pgsql.original/src/bin/psql/settings.h pgsql/src/bin/psql/settings.h
*** pgsql.original/src/bin/psql/settings.h	Fri Jul 28 02:24:15 2006
--- pgsql/src/bin/psql/settings.h	Fri Jul 28 02:32:47 2006
***************
*** 37,43 ****
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
--- 37,43 ----
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g and \u */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
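The heart of the patch above is the cursor loop: DECLARE ... NO SCROLL CURSOR, repeated FETCH FORWARD 100 round trips, then CLOSE. The constant-memory idea can be sketched without a server as a batched drain loop; `fetch_batch` and `drain_cursor` below are hypothetical stand-ins for the libpq calls, not psql code.

```c
#include <assert.h>

/* Batch size mirrors the patch's "FETCH FORWARD 100" round trips. */
#define BATCH_SIZE 100

static long rows_remaining;     /* stands in for the server-side cursor */

/* Stand-in for PQexec(conn, "FETCH FORWARD 100 FROM _psql_cursor"):
 * fills batch[] with up to BATCH_SIZE rows and reports how many. */
static int
fetch_batch(long batch[])
{
    int n = 0;

    while (n < BATCH_SIZE && rows_remaining > 0)
        batch[n++] = rows_remaining--;
    return n;
}

/* Drain the whole "result set" batch by batch; peak client memory is
 * one batch (100 longs), no matter how large total_rows is. */
static long
drain_cursor(long total_rows)
{
    long batch[BATCH_SIZE];
    long seen = 0;
    int  n;

    rows_remaining = total_rows;
    for (;;)
    {
        n = fetch_batch(batch);   /* one FETCH round trip */
        if (n == 0)
            break;                /* empty fetch: cursor is exhausted */
        seen += n;                /* here psql would print the rows */
    }
    return seen;
}
```

The empty-fetch break is exactly how the patch detects cursor exhaustion (PQntuples() returning 0).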
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chris Mair (#1)
Re: selecting large result sets in psql using cursors

Chris Mair <list@1006.org> writes:

attached is the new and fixed version of the patch for selecting
large result sets from psql using cursors.

The is_select_command bit is wrong because it doesn't allow for left
parentheses in front of the SELECT keyword (something entirely
reasonable when considering big union/intersect/except trees).
Also you'd need to allow for VALUES as the first keyword.
But isn't the whole thing unnecessary? ISTM you could just ship the
query with the DECLARE CURSOR prepended, and see whether you get a
syntax error or not.
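Tom's two points can be illustrated with a small standalone sketch (a hypothetical helper, not code from the patch): skip leading whitespace and left parentheses before looking at the first keyword, and accept VALUES as well as SELECT. The real code would also skip SQL comments (skip_white_space) and handle multibyte encodings (PQmblen); both are omitted here.

```c
#include <assert.h>
#include <ctype.h>
#include <strings.h>            /* strncasecmp (POSIX) */

/* Sketch of the relaxed check Tom asks for. */
static int
is_select_or_values(const char *query)
{
    size_t wordlen = 0;

    /* allow e.g. "((SELECT ...) UNION (SELECT ...))" */
    while (isspace((unsigned char) *query) || *query == '(')
        query++;

    /* measure the first word, so "selectx" is not taken for "select" */
    while (isalpha((unsigned char) query[wordlen]))
        wordlen++;

    if (wordlen == 6 && strncasecmp(query, "select", 6) == 0)
        return 1;
    if (wordlen == 6 && strncasecmp(query, "values", 6) == 0)
        return 1;
    return 0;
}
```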

At some point we ought to extend libpq enough to expose the V3-protocol
feature that allows partial fetches from portals; that would be a
cleaner way to implement this feature. However since nobody has yet
proposed a good API for this in libpq, I don't object to implementing
\u with DECLARE CURSOR for now.

BTW, \u seems not to have any mnemonic value whatsoever ... isn't there
some other name we could use?

regards, tom lane

#3 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: [HACKERS] selecting large result sets in psql using cursors

Tom Lane wrote:

BTW, \u seems not to have any mnemonic value whatsoever ... isn't
there some other name we could use?

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#4 Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#3)
Re: [HACKERS] selecting large result sets in psql using

Peter Eisentraut wrote:

Tom Lane wrote:

BTW, \u seems not to have any mnemonic value whatsoever ... isn't
there some other name we could use?

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

I see the original posting here:

http://archives.postgresql.org/pgsql-patches/2006-07/msg00287.php

but I don't remember seeing this posting at all, and it isn't saved in
my mailbox either. Strange.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#5 Simon Riggs
simon@2ndquadrant.com
In reply to: Bruce Momjian (#4)
Re: [HACKERS] selecting large result sets in psql using

On Thu, 2006-08-17 at 03:14 -0400, Bruce Momjian wrote:

Peter Eisentraut wrote:

Tom Lane wrote:

BTW, \u seems not to have any mnemonic value whatsoever ... isn't
there some other name we could use?

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

I see the original posting here:

http://archives.postgresql.org/pgsql-patches/2006-07/msg00287.php

but I don't remember seeing this posting at all, and it isn't saved in
my mailbox either. Strange.

FWIW I saw it.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#3)
pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Peter Eisentraut <peter_e@gmx.net> writes:

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to revert
to the old way?

regards, tom lane

#7 Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#6)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to revert
to the old way?

I'd vote for reverting to the old way. Anyone serious about hacking
should be on both lists.

Joe

#8 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Joe Conway (#7)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Joe Conway wrote:

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to revert
to the old way?

I'd vote for reverting to the old way. Anyone serious about hacking
should be on both lists.

+1

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#9 Josh Berkus
josh@agliodbs.com
In reply to: Joe Conway (#7)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Tom, all:

I thought the strategy was to provide a way to subscribe to
pgsql-patches, get the text of the messages, and not get the
attachments. Was that technically infeasible?

--Josh

#10 Chris Mair
chrisnospam@1006.org
In reply to: Tom Lane (#2)
1 attachment(s)
Re: selecting large result sets in psql using cursors

Hi,

thanks for reviewing this :)

attached is the new and fixed version of the patch for selecting
large result sets from psql using cursors.

The is_select_command bit is wrong because it doesn't allow for left
parentheses in front of the SELECT keyword (something entirely
reasonable when considering big union/intersect/except trees).
Also you'd need to allow for VALUES as the first keyword.

You're right, I improved is_select_command to take these into account.
(Btw, I didn't even know a VALUES command existed...)

But isn't the whole thing unnecessary? ISTM you could just ship the
query with the DECLARE CURSOR prepended, and see whether you get a
syntax error or not.

I find it neat that \u gives a good error message if someone
executes a non-select query. If I leave that out there is no way to tell
a real syntax error from one caused by executing non-selects...

Anyway, if we don't want the extra check, I can skip the
is_select_command call, of course.

Patch with fix against current CVS is attached.

At some point we ought to extend libpq enough to expose the V3-protocol
feature that allows partial fetches from portals; that would be a
cleaner way to implement this feature. However since nobody has yet
proposed a good API for this in libpq, I don't object to implementing
\u with DECLARE CURSOR for now.

BTW, \u seems not to have any mnemonic value whatsoever ... isn't there
some other name we could use?

True :)
Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

Bye, Chris.

PS: I'm traveling Fri 18th - Fri 25th and won't check mail often.

--

Chris Mair
http://www.1006.org

Attachments:

psql_cursor-5.patch (text/x-patch; charset=ISO-8859-15)
diff -rc pgsql.original/doc/src/sgml/ref/psql-ref.sgml pgsql/doc/src/sgml/ref/psql-ref.sgml
*** pgsql.original/doc/src/sgml/ref/psql-ref.sgml	2006-08-17 16:50:58.000000000 +0200
--- pgsql/doc/src/sgml/ref/psql-ref.sgml	2006-08-17 16:54:41.000000000 +0200
***************
*** 1201,1206 ****
--- 1201,1231 ----
          </listitem>
        </varlistentry>
  
+ 
+       <varlistentry>
+         <term><literal>\u</literal> [ { <replaceable class="parameter">filename</replaceable> | <literal>|</literal><replaceable class="parameter">command</replaceable> } ]</term>
+ 
+         <listitem>
+         <para>
+         Sends the current query input buffer to the server and
+         optionally stores the query's output in <replaceable
+         class="parameter">filename</replaceable> or pipes the output
+         into a separate Unix shell executing <replaceable
+         class="parameter">command</replaceable>.
+         Unlike <literal>\g</literal>, <literal>\u</literal> works only
+         for SELECT statements and uses a cursor to retrieve the result set.
+         Therefore <literal>\u</literal> uses only a limited amount of memory,
+         regardless of the size of the result set. It can be used whenever a
+         result set needs to be retrieved that exceeds the client's memory
+         resources. <literal>\u</literal> always gives unaligned output. It
+         does, however, use the current field separator (see <command>\pset</command>).
+         <literal>\u</literal> gives an error when trying to execute anything
+         that is not a SELECT statement.
+         </para>
+         </listitem>
+       </varlistentry>
+ 
+ 
        <varlistentry>
          <term><literal>\help</literal> (or <literal>\h</literal>) <literal>[ <replaceable class="parameter">command</replaceable> ]</literal></term>
          <listitem>
diff -rc pgsql.original/src/bin/psql/command.c pgsql/src/bin/psql/command.c
*** pgsql.original/src/bin/psql/command.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/command.c	2006-08-17 16:55:25.000000000 +0200
***************
*** 830,835 ****
--- 830,866 ----
  		}
  	}
  
+ 	/*
+ 	 *  \u executes the current query buffer using a cursor
+ 	 */
+ 	else if (strcmp(cmd, "u") == 0)
+ 	{
+ 		char 	   *fname = psql_scan_slash_option(scan_state,
+ 								OT_FILEPIPE, NULL, false);
+ 
+ 		if (!fname)
+ 			pset.gfname = NULL;
+ 		else
+ 		{
+ 			expand_tilde(&fname);
+ 			pset.gfname = pg_strdup(fname);
+ 		}
+ 		free(fname);
+ 
+ 		if (query_buf && query_buf->len == 0)
+ 		{
+ 			if (!quiet)
+ 			{
+ 				puts(_("Query buffer is empty."));
+ 				fflush(stdout);
+ 			}
+ 		}
+ 		else
+ 		{
+ 			status = PSQL_CMD_SEND_USING_CURSOR;
+ 		}
+ 	}
+ 
  	/* \unset */
  	else if (strcmp(cmd, "unset") == 0)
  	{
diff -rc pgsql.original/src/bin/psql/command.h pgsql/src/bin/psql/command.h
*** pgsql.original/src/bin/psql/command.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/command.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 16,21 ****
--- 16,22 ----
  {
  	PSQL_CMD_UNKNOWN = 0,			/* not done parsing yet (internal only) */
  	PSQL_CMD_SEND,					/* query complete; send off */
+ 	PSQL_CMD_SEND_USING_CURSOR,		/* query complete; send off using cursor */
  	PSQL_CMD_SKIP_LINE,				/* keep building query */
  	PSQL_CMD_TERMINATE,				/* quit program */
  	PSQL_CMD_NEWEDIT,				/* query buffer was changed (e.g., via \e) */
diff -rc pgsql.original/src/bin/psql/common.c pgsql/src/bin/psql/common.c
*** pgsql.original/src/bin/psql/common.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/common.c	2006-08-17 16:59:05.000000000 +0200
***************
*** 28,33 ****
--- 28,34 ----
  #include "command.h"
  #include "copy.h"
  #include "mb/pg_wchar.h"
+ #include "mbprint.h"
  
  
  /* Workarounds for Windows */
***************
*** 52,58 ****
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! 
  static bool command_no_begin(const char *query);
  
  /*
--- 53,59 ----
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! static bool is_select_command(const char *query);
  static bool command_no_begin(const char *query);
  
  /*
***************
*** 952,957 ****
--- 953,1146 ----
  
  
  /*
+  * SendQueryUsingCursor: send the (SELECT) query string to the backend
+  * using a cursor and print out results.
+  *
+  * Unlike with SendQuery(), single step mode, ON_ERROR_ROLLBACK mode,
+  * timing and format settings (except delimiters) are NOT honoured.
+  *
+  * Returns true if the query executed successfully, false otherwise.
+  */
+ bool
+ SendQueryUsingCursor(const char *query)
+ {
+ 	PGresult		*results;
+ 	bool			started_txn			= false;
+ 	PQExpBufferData	buf;
+ 	FILE			*queryFout_copy 	= NULL;
+ 	bool			queryFoutPipe_copy	= false;
+ 	int				ntuples, nfields = -1;
+ 	int				i, j;
+ 
+ 	if (!pset.db)
+ 	{
+ 		psql_error("You are currently not connected to a database.\n");
+ 		return false;
+ 	}
+ 
+ 	if (!is_select_command(query))
+ 	{
+ 		psql_error("Need a SELECT command to perform cursor fetch.\n");
+ 		return false;
+ 	}
+ 
+ 	if (VariableEquals(pset.vars, "ECHO", "queries"))
+ 	{
+ 		puts(query);
+ 		fflush(stdout);
+ 	}
+ 
+ 	if (pset.logfile)
+ 	{
+ 		fprintf(pset.logfile,
+ 				_("********* QUERY **********\n"
+ 				  "%s\n"
+ 				  "**************************\n\n"), query);
+ 		fflush(pset.logfile);
+ 	}
+ 
+ 	SetCancelConn();
+ 
+ 	/* prepare to write output to \u argument, if any */
+ 	if (pset.gfname)
+ 	{
+ 		queryFout_copy = pset.queryFout;
+ 		queryFoutPipe_copy = pset.queryFoutPipe;
+ 
+ 		pset.queryFout = stdout;    /* so it doesn't get closed */
+ 
+ 		/* open file/pipe */    
+ 		if (!setQFout(pset.gfname))
+ 		{
+ 			pset.queryFout = queryFout_copy;
+ 			pset.queryFoutPipe = queryFoutPipe_copy;
+ 			ResetCancelConn();
+ 			return false;
+ 		}
+ 	}
+ 
+ 	/* If we're not in a transaction, start one */
+ 	if (PQtransactionStatus(pset.db) == PQTRANS_IDLE)
+ 	{
+ 		results = PQexec(pset.db, "BEGIN");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 
+ 		PQclear(results);
+ 		started_txn = true;
+ 	}
+ 
+ 	initPQExpBuffer(&buf);
+ 	appendPQExpBuffer(&buf,
+ 					  "DECLARE _psql_cursor NO SCROLL CURSOR FOR %s",
+ 					  query);
+ 
+ 	results = PQexec(pset.db, buf.data);
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 
+ 	PQclear(results);
+ 	termPQExpBuffer(&buf);
+ 
+ 	for (;;)
+ 	{
+ 		/* space-time trade-off: get 100 tuples at a time */
+ 		results = PQexec(pset.db, "FETCH FORWARD 100 FROM _psql_cursor");
+ 		if (PQresultStatus(results) != PGRES_TUPLES_OK)
+ 			goto error;
+ 
+ 		ntuples = PQntuples(results);
+ 		if (ntuples == 0)
+ 		{
+ 			PQclear(results);
+ 			break;
+ 		}
+ 
+ 		if (nfields == -1)
+ 			nfields = PQnfields(results);
+ 
+ 		for (j = 0; j < ntuples; j++)
+ 		{
+ 			for (i = 0; i < nfields; i++)
+ 			{ 
+ 				fprintf(
+ 					pset.queryFout, "%s", 
+ 					(char*) mbvalidate((unsigned char*) PQgetvalue(results, j, i),
+ 					pset.popt.topt.encoding)
+ 				);
+ 				if (i != nfields - 1)
+ 				{
+ 					fprintf(pset.queryFout, "%s", pset.popt.topt.fieldSep);
+ 				}
+ 			}
+ 			fprintf(pset.queryFout, "\n");
+ 		}
+ 
+ 		PQclear(results);
+ 
+ 		if (cancel_pressed)
+ 			break;
+ 	}
+ 
+ 	/* close \u argument file/pipe, restore old setting */
+ 	if (pset.gfname)
+ 	{
+ 		setQFout(NULL);
+ 
+ 		pset.queryFout = queryFout_copy;
+ 		pset.queryFoutPipe = queryFoutPipe_copy;
+ 
+ 		free(pset.gfname);
+ 		pset.gfname = NULL;
+ 	}
+ 
+ 	results = PQexec(pset.db, "CLOSE _psql_cursor");
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 	PQclear(results);
+ 
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "ROLLBACK");
+ 		started_txn = false;
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 		PQclear(results);
+ 	}
+ 
+ 	/* check for events that may occur during query execution */
+ 
+ 	if (pset.encoding != PQclientEncoding(pset.db) &&
+ 		PQclientEncoding(pset.db) >= 0)
+ 	{
+ 		/* track effects of SET CLIENT_ENCODING */
+ 		pset.encoding = PQclientEncoding(pset.db);
+ 		pset.popt.topt.encoding = pset.encoding;
+ 		SetVariable(pset.vars, "ENCODING",
+ 					pg_encoding_to_char(pset.encoding));
+ 	}
+ 
+ 	PrintNotifications();
+ 
+ 	return true;
+ 
+ error:
+ 	psql_error("%s", PQerrorMessage(pset.db));
+ 	if (results)
+ 		PQclear(results);
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "ROLLBACK");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			psql_error("%s", PQerrorMessage(pset.db));
+ 		PQclear(results);
+ 	}
+ 	ResetCancelConn();
+ 	return false;
+ }
+ 
+ 
+ /*
   * Advance the given char pointer over white space and SQL comments.
   */
  static const char *
***************
*** 1012,1017 ****
--- 1201,1244 ----
  
  
  /*
+  * Check whether the specified command is a SELECT (or VALUES).
+  */
+ static bool
+ is_select_command(const char *query)
+ {
+ 	int wordlen;
+ 	const char *pos = 0;
+ 
+ 	/*
+ 	 * First we must advance over any whitespace, comments and left parentheses.
+ 	 */
+ 	while (pos != query)
+ 	{
+ 		pos = query;
+ 		query = skip_white_space(query);
+ 		if (query[0] == '(') {
+ 			query++;
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * Check word length (since "selectx" is not "select").
+ 	 */
+ 	wordlen = 0;
+ 	while (isalpha((unsigned char) query[wordlen]))
+ 		wordlen += PQmblen(&query[wordlen], pset.encoding);
+ 
+ 	if (wordlen == 6 && pg_strncasecmp(query, "select", 6) == 0)
+ 		return true;
+ 	
+ 	if (wordlen == 6 && pg_strncasecmp(query, "values", 6) == 0)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ 
+ /*
   * Check whether a command is one of those for which we should NOT start
   * a new transaction block (ie, send a preceding BEGIN).
   *
diff -rc pgsql.original/src/bin/psql/common.h pgsql/src/bin/psql/common.h
*** pgsql.original/src/bin/psql/common.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/common.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 55,60 ****
--- 55,61 ----
  extern PGresult *PSQLexec(const char *query, bool start_xact);
  
  extern bool SendQuery(const char *query);
+ extern bool SendQueryUsingCursor(const char *query);
  
  extern bool is_superuser(void);
  extern bool standard_strings(void);
diff -rc pgsql.original/src/bin/psql/help.c pgsql/src/bin/psql/help.c
*** pgsql.original/src/bin/psql/help.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/help.c	2006-08-17 16:55:25.000000000 +0200
***************
*** 190,195 ****
--- 190,196 ----
  	fprintf(output, _("Query Buffer\n"));
  	fprintf(output, _("  \\e [FILE]      edit the query buffer (or file) with external editor\n"));
  	fprintf(output, _("  \\g [FILE]      send query buffer to server (and results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\u [FILE]      like \\g, but use cursor for SELECT (for large result sets)\n"));
  	fprintf(output, _("  \\p             show the contents of the query buffer\n"));
  	fprintf(output, _("  \\r             reset (clear) the query buffer\n"));
  #ifdef USE_READLINE
diff -rc pgsql.original/src/bin/psql/mainloop.c pgsql/src/bin/psql/mainloop.c
*** pgsql.original/src/bin/psql/mainloop.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/mainloop.c	2006-08-17 16:55:25.000000000 +0200
***************
*** 264,270 ****
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
--- 264,270 ----
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT || slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
***************
*** 282,287 ****
--- 282,298 ----
  					/* flush any paren nesting info after forced send */
  					psql_scan_reset(scan_state);
  				}
+ 				else if (slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR)
+ 				{
+ 					success = SendQueryUsingCursor(query_buf->data);
+ 
+ 					resetPQExpBuffer(previous_buf);
+ 					appendPQExpBufferStr(previous_buf, query_buf->data);
+ 					resetPQExpBuffer(query_buf);
+ 
+ 					/* flush any paren nesting info after forced send */
+ 					psql_scan_reset(scan_state);
+ 				}
  				else if (slashCmdStatus == PSQL_CMD_NEWEDIT)
  				{
  					/* rescan query_buf as new input */
diff -rc pgsql.original/src/bin/psql/settings.h pgsql/src/bin/psql/settings.h
*** pgsql.original/src/bin/psql/settings.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/settings.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 37,43 ****
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
--- 37,43 ----
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g and \u */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Tom Lane wrote:

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

Since almost the first day I hacked on PostgreSQL I have been filtering
both lists into the same folder, so they pretty much appear to be one
and the same to me anyway. The only step that would optimize that
situation further would be doing away with pgsql-patches and telling
people to send patches to pgsql-hackers. I understand that some people
may not care for the extra volume that the patches bring in. But with
250+ kB of hackers mail a day, the few patches don't seem all that
significant. And to be serious about hacking (and tracking the
hacking) you need to get both lists anyway, so it would make sense to
me to just have one.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#11)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Peter Eisentraut wrote:

Tom Lane wrote:

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

Since almost the first day I hacked on PostgreSQL I have been filtering
both lists into the same folder, so they pretty much appear to be one
and the same to me anyway. The only step that would optimize that
situation further would be doing away with pgsql-patches and telling
people to send patches to pgsql-hackers. I understand that some people
may not care for the extra volume that the patches bring in. But with
250+ kB of hackers mail a day, the few patches don't seem all that
significant. And to be serious about hacking (and tracking the
hacking) you need to get both lists anyway, so it would make sense to
me to just have one.

how many very large patches are sent? Not too many. We could in fact put
a limit on the attachment size and make people publish very large
patches some other way (on the web, say?)

cheers

andrew

#13Magnus Hagander
mha@sollentuna.net
In reply to: Alvaro Herrera (#8)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Ever since pgsql-patches replies started going to -hackers,
threading doesn't work anymore, so I for one can't tell what this
refers to at all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

I'd vote for reverting to the old way. Anyone serious about hacking
should be on both lists.

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and the
focus of both lists is developers, then why not just remove one of them
and get rid of the problem?

//Magnus

#14Bruce Momjian
bruce@momjian.us
In reply to: Chris Mair (#10)
Re: [HACKERS] selecting large result sets in psql using

Chris Mair wrote:

At some point we ought to extend libpq enough to expose the V3-protocol
feature that allows partial fetches from portals; that would be a
cleaner way to implement this feature. However since nobody has yet
proposed a good API for this in libpq, I don't object to implementing
\u with DECLARE CURSOR for now.

BTW, \u seems not to have any mnemonic value whatsoever ... isn't there
some other name we could use?

True :)
Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
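
(A note on mechanics: the DECLARE CURSOR approach under discussion
amounts to psql issuing a sequence like the following on the user's
behalf. This is a sketch based on SendQueryUsingCursor in the patch;
"bigtable" is a placeholder, and the cursor name and batch size of
100 are the patch's own choices.)

    BEGIN;
    DECLARE _psql_cursor NO SCROLL CURSOR FOR SELECT * FROM bigtable;
    FETCH FORWARD 100 FROM _psql_cursor;  -- repeated until no rows remain
    CLOSE _psql_cursor;
    COMMIT;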

#15Joe Conway
mail@joeconway.com
In reply to: Magnus Hagander (#13)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Magnus Hagander wrote:

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and the
focus of both lists is developers, then why not just remove one of them
and get rid of the problem?

I wouldn't argue with that. It would be at least equally good from my
perspective, and maybe slightly better.

Joe

#16Steve Atkins
steve@blighty.com
In reply to: Magnus Hagander (#13)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

On Aug 17, 2006, at 9:30 AM, Magnus Hagander wrote:

Ever since pgsql-patches replies started going to -hackers,
threading doesn't work anymore, so I for one can't tell what this
refers to at all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

I'd vote for reverting to the old way. Anyone serious about hacking
should be on both lists.

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and
the focus of both lists is developers, then why not just remove one
of them and get rid of the problem?

One reason might be that a lot of application developers who develop
applications or modules associated with PG, but not the core PG code
itself, also lurk on -hackers, as it's by far the best way to keep up
with the status of various PG enhancements (and also an excellent
place to pick up a lot of undocumented good practices).

Cheers,
Steve

#17Chris Mair
chrisnospam@1006.org
In reply to: Chris Mair (#10)
Re: [HACKERS] selecting large result sets in psql using

Replying to myself...

Patch with fix against current CVS is attached.

Alvaro Herrera sent two fixes off-list: a typo and
at the end of SendQueryUsingCursor I should COMMIT, not ROLLBACK.

So, one more version (6) that fixes these too is attached.

Bye, Chris.

PS: I'm keeping this on both lists now, hope it's ok.

--
Chris Mair
http://www.1006.org

#18Chris Mair
chrisnospam@1006.org
In reply to: Chris Mair (#17)
1 attachment(s)
Re: [HACKERS] selecting large result sets in psql using

Patch with fix against current CVS is attached.

Forgot the attachment... sorry.

--

Chris Mair
http://www.1006.org

Attachments:

psql_cursor-6.patch (text/x-patch; charset=ISO-8859-15)
diff -rc pgsql.original/doc/src/sgml/ref/psql-ref.sgml pgsql/doc/src/sgml/ref/psql-ref.sgml
*** pgsql.original/doc/src/sgml/ref/psql-ref.sgml	2006-08-17 16:50:58.000000000 +0200
--- pgsql/doc/src/sgml/ref/psql-ref.sgml	2006-08-17 18:02:29.000000000 +0200
***************
*** 1201,1206 ****
--- 1201,1231 ----
          </listitem>
        </varlistentry>
  
+ 
+       <varlistentry>
+         <term><literal>\u</literal> [ { <replaceable class="parameter">filename</replaceable> | <literal>|</literal><replaceable class="parameter">command</replaceable> } ]</term>
+ 
+         <listitem>
+         <para>
+         Sends the current query input buffer to the server and
+         optionally stores the query's output in <replaceable
+         class="parameter">filename</replaceable> or pipes the output
+         into a separate Unix shell executing <replaceable
+         class="parameter">command</replaceable>.
+         Unlike <literal>\g</literal>, <literal>\u</literal> works only
+         for select statements and uses a cursor to retrieve the result set.
+         Therefore <literal>\u</literal> uses only a limited amount of memory,
+         regardless of the size of the result set. It can be used whenever a result
+         set needs to be retrieved that exceeds the client's memory resources.
+         <literal>\u</literal> always gives unaligned output. It does, however,
+         use the current field separator (see <command>\pset</command>).
+         <literal>\u</literal> gives an error when trying to execute something
+         that is not a SELECT statement.
+         </para>
+         </listitem>
+       </varlistentry>
+ 
+ 
        <varlistentry>
          <term><literal>\help</literal> (or <literal>\h</literal>) <literal>[ <replaceable class="parameter">command</replaceable> ]</literal></term>
          <listitem>
diff -rc pgsql.original/src/bin/psql/command.c pgsql/src/bin/psql/command.c
*** pgsql.original/src/bin/psql/command.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/command.c	2006-08-17 18:02:49.000000000 +0200
***************
*** 830,835 ****
--- 830,866 ----
  		}
  	}
  
+ 	/*
+ 	 *  \u executes the current query buffer using a cursor
+ 	 */
+ 	else if (strcmp(cmd, "u") == 0)
+ 	{
+ 		char 	   *fname = psql_scan_slash_option(scan_state,
+ 								OT_FILEPIPE, NULL, false);
+ 
+ 		if (!fname)
+ 			pset.gfname = NULL;
+ 		else
+ 		{
+ 			expand_tilde(&fname);
+ 			pset.gfname = pg_strdup(fname);
+ 		}
+ 		free(fname);
+ 
+ 		if (query_buf && query_buf->len == 0)
+ 		{
+ 			if (!quiet)
+ 			{
+ 				puts(_("Query buffer is empty."));
+ 				fflush(stdout);
+ 			}
+ 		}
+ 		else
+ 		{
+ 			status = PSQL_CMD_SEND_USING_CURSOR;
+ 		}
+ 	}
+ 
  	/* \unset */
  	else if (strcmp(cmd, "unset") == 0)
  	{
diff -rc pgsql.original/src/bin/psql/command.h pgsql/src/bin/psql/command.h
*** pgsql.original/src/bin/psql/command.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/command.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 16,21 ****
--- 16,22 ----
  {
  	PSQL_CMD_UNKNOWN = 0,			/* not done parsing yet (internal only) */
  	PSQL_CMD_SEND,					/* query complete; send off */
+ 	PSQL_CMD_SEND_USING_CURSOR,		/* query complete; send off using cursor */
  	PSQL_CMD_SKIP_LINE,				/* keep building query */
  	PSQL_CMD_TERMINATE,				/* quit program */
  	PSQL_CMD_NEWEDIT,				/* query buffer was changed (e.g., via \e) */
diff -rc pgsql.original/src/bin/psql/common.c pgsql/src/bin/psql/common.c
*** pgsql.original/src/bin/psql/common.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/common.c	2006-08-17 18:40:51.000000000 +0200
***************
*** 28,33 ****
--- 28,34 ----
  #include "command.h"
  #include "copy.h"
  #include "mb/pg_wchar.h"
+ #include "mbprint.h"
  
  
  /* Workarounds for Windows */
***************
*** 52,58 ****
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! 
  static bool command_no_begin(const char *query);
  
  /*
--- 53,59 ----
  	 ((T)->millitm - (U)->millitm))
  #endif
  
! static bool is_select_command(const char *query);
  static bool command_no_begin(const char *query);
  
  /*
***************
*** 952,957 ****
--- 953,1146 ----
  
  
  /*
+  * SendQueryUsingCursor: send the (SELECT) query string to the backend
+  * using a cursor and print out results.
+  *
+  * Unlike with SendQuery(), single step mode, ON_ERROR_ROLLBACK mode,
+  * timing and format settings (except delimiters) are NOT honoured.
+  *
+  * Returns true if the query executed successfully, false otherwise.
+  */
+ bool
+ SendQueryUsingCursor(const char *query)
+ {
+ 	PGresult		*results;
+ 	bool			started_txn			= false;
+ 	PQExpBufferData	buf;
+ 	FILE			*queryFout_copy 	= NULL;
+ 	bool			queryFoutPipe_copy	= false;
+ 	int				ntuples, nfields = -1;
+ 	int				i, j;
+ 
+ 	if (!pset.db)
+ 	{
+ 		psql_error("You are currently not connected to a database.\n");
+ 		return false;
+ 	}
+ 
+ 	if (!is_select_command(query))
+ 	{
+ 		psql_error("Need a SELECT command to perform cursor fetch.\n");
+ 		return false;
+ 	}
+ 
+ 	if (VariableEquals(pset.vars, "ECHO", "queries"))
+ 	{
+ 		puts(query);
+ 		fflush(stdout);
+ 	}
+ 
+ 	if (pset.logfile)
+ 	{
+ 		fprintf(pset.logfile,
+ 				_("********* QUERY **********\n"
+ 				  "%s\n"
+ 				  "**************************\n\n"), query);
+ 		fflush(pset.logfile);
+ 	}
+ 
+ 	SetCancelConn();
+ 
+ 	/* prepare to write output to \u argument, if any */
+ 	if (pset.gfname)
+ 	{
+ 		queryFout_copy = pset.queryFout;
+ 		queryFoutPipe_copy = pset.queryFoutPipe;
+ 
+ 		pset.queryFout = stdout;    /* so it doesn't get closed */
+ 
+ 		/* open file/pipe */    
+ 		if (!setQFout(pset.gfname))
+ 		{
+ 			pset.queryFout = queryFout_copy;
+ 			pset.queryFoutPipe = queryFoutPipe_copy;
+ 			ResetCancelConn();
+ 			return false;
+ 		}
+ 	}
+ 
+ 	/* If we're not in a transaction, start one */
+ 	if (PQtransactionStatus(pset.db) == PQTRANS_IDLE)
+ 	{
+ 		results = PQexec(pset.db, "BEGIN");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 
+ 		PQclear(results);
+ 		started_txn = true;
+ 	}
+ 
+ 	initPQExpBuffer(&buf);
+ 	appendPQExpBuffer(&buf,
+ 					  "DECLARE _psql_cursor NO SCROLL CURSOR FOR %s",
+ 					  query);
+ 
+ 	results = PQexec(pset.db, buf.data);
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 
+ 	PQclear(results);
+ 	termPQExpBuffer(&buf);
+ 
+ 	for (;;)
+ 	{
+ 		/* space-time trade-off: get 100 tuples at a time */
+ 		results = PQexec(pset.db, "FETCH FORWARD 100 FROM _psql_cursor");
+ 		if (PQresultStatus(results) != PGRES_TUPLES_OK)
+ 			goto error;
+ 
+ 		ntuples = PQntuples(results);
+ 		if (ntuples == 0)
+ 		{
+ 			PQclear(results);
+ 			break;
+ 		}
+ 
+ 		if (nfields == -1)
+ 			nfields = PQnfields(results);
+ 
+ 		for (j = 0; j < ntuples; j++)
+ 		{
+ 			for (i = 0; i < nfields; i++)
+ 			{ 
+ 				fprintf(
+ 					pset.queryFout, "%s", 
+ 					(char*) mbvalidate((unsigned char*) PQgetvalue(results, j, i),
+ 					pset.popt.topt.encoding)
+ 				);
+ 				if (i != nfields - 1)
+ 				{
+ 					fprintf(pset.queryFout, "%s", pset.popt.topt.fieldSep);
+ 				}
+ 			}
+ 			fprintf(pset.queryFout, "\n");
+ 		}
+ 
+ 		PQclear(results);
+ 
+ 		if (cancel_pressed)
+ 			break;
+ 	}
+ 
+ 	/* close \u argument file/pipe, restore old setting */
+ 	if (pset.gfname)
+ 	{
+ 		setQFout(NULL);
+ 
+ 		pset.queryFout = queryFout_copy;
+ 		pset.queryFoutPipe = queryFoutPipe_copy;
+ 
+ 		free(pset.gfname);
+ 		pset.gfname = NULL;
+ 	}
+ 
+ 	results = PQexec(pset.db, "CLOSE _psql_cursor");
+ 	if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 		goto error;
+ 	PQclear(results);
+ 
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "COMMIT");
+ 		started_txn = false;
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 		PQclear(results);
+ 	}
+ 
+ 	/* check for events that may occur during query execution */
+ 
+ 	if (pset.encoding != PQclientEncoding(pset.db) &&
+ 		PQclientEncoding(pset.db) >= 0)
+ 	{
+ 		/* track effects of SET CLIENT_ENCODING */
+ 		pset.encoding = PQclientEncoding(pset.db);
+ 		pset.popt.topt.encoding = pset.encoding;
+ 		SetVariable(pset.vars, "ENCODING",
+ 					pg_encoding_to_char(pset.encoding));
+ 	}
+ 
+ 	PrintNotifications();
+ 
+ 	return true;
+ 
+ error:
+ 	psql_error("%s", PQerrorMessage(pset.db));
+ 	if (results)
+ 		PQclear(results);
+ 	if (started_txn)
+ 	{
+ 		results = PQexec(pset.db, "ROLLBACK");
+ 		if (PQresultStatus(results) != PGRES_COMMAND_OK)
+ 			goto error;
+ 		PQclear(results);
+ 	}
+ 	ResetCancelConn();
+ 	return false;
+ }
+ 
+ 
+ /*
   * Advance the given char pointer over white space and SQL comments.
   */
  static const char *
***************
*** 1012,1017 ****
--- 1201,1244 ----
  
  
  /*
+  * Check whether the specified command is a SELECT (or VALUES).
+  */
+ static bool
+ is_select_command(const char *query)
+ {
+ 	int wordlen;
+ 	const char *pos = 0;
+ 
+ 	/*
+ 	 * First we must advance over any whitespace, comments and left parentheses.
+ 	 */
+ 	while (pos != query)
+ 	{
+ 		pos = query;
+ 		query = skip_white_space(query);
+ 		if (query[0] == '(') {
+ 			query++;
+ 		}
+ 	}
+ 
+ 	/*
+ 	 * Check word length (since "selectx" is not "select").
+ 	 */
+ 	wordlen = 0;
+ 	while (isalpha((unsigned char) query[wordlen]))
+ 		wordlen += PQmblen(&query[wordlen], pset.encoding);
+ 
+ 	if (wordlen == 6 && pg_strncasecmp(query, "select", 6) == 0)
+ 		return true;
+ 	
+ 	if (wordlen == 6 && pg_strncasecmp(query, "values", 6) == 0)
+ 		return true;
+ 
+ 	return false;
+ }
+ 
+ 
+ /*
   * Check whether a command is one of those for which we should NOT start
   * a new transaction block (ie, send a preceding BEGIN).
   *
diff -rc pgsql.original/src/bin/psql/common.h pgsql/src/bin/psql/common.h
*** pgsql.original/src/bin/psql/common.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/common.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 55,60 ****
--- 55,61 ----
  extern PGresult *PSQLexec(const char *query, bool start_xact);
  
  extern bool SendQuery(const char *query);
+ extern bool SendQueryUsingCursor(const char *query);
  
  extern bool is_superuser(void);
  extern bool standard_strings(void);
diff -rc pgsql.original/src/bin/psql/help.c pgsql/src/bin/psql/help.c
*** pgsql.original/src/bin/psql/help.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/help.c	2006-08-17 16:55:25.000000000 +0200
***************
*** 190,195 ****
--- 190,196 ----
  	fprintf(output, _("Query Buffer\n"));
  	fprintf(output, _("  \\e [FILE]      edit the query buffer (or file) with external editor\n"));
  	fprintf(output, _("  \\g [FILE]      send query buffer to server (and results to file or |pipe)\n"));
+ 	fprintf(output, _("  \\u [FILE]      like \\g, but use cursor for SELECT (for large result sets)\n"));
  	fprintf(output, _("  \\p             show the contents of the query buffer\n"));
  	fprintf(output, _("  \\r             reset (clear) the query buffer\n"));
  #ifdef USE_READLINE
diff -rc pgsql.original/src/bin/psql/mainloop.c pgsql/src/bin/psql/mainloop.c
*** pgsql.original/src/bin/psql/mainloop.c	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/mainloop.c	2006-08-17 16:55:25.000000000 +0200
***************
*** 264,270 ****
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
--- 264,270 ----
  
  				success = slashCmdStatus != PSQL_CMD_ERROR;
  
! 				if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT || slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR) &&
  					query_buf->len == 0)
  				{
  					/* copy previous buffer to current for handling */
***************
*** 282,287 ****
--- 282,298 ----
  					/* flush any paren nesting info after forced send */
  					psql_scan_reset(scan_state);
  				}
+ 				else if (slashCmdStatus == PSQL_CMD_SEND_USING_CURSOR)
+ 				{
+ 					success = SendQueryUsingCursor(query_buf->data);
+ 
+ 					resetPQExpBuffer(previous_buf);
+ 					appendPQExpBufferStr(previous_buf, query_buf->data);
+ 					resetPQExpBuffer(query_buf);
+ 
+ 					/* flush any paren nesting info after forced send */
+ 					psql_scan_reset(scan_state);
+ 				}
  				else if (slashCmdStatus == PSQL_CMD_NEWEDIT)
  				{
  					/* rescan query_buf as new input */
diff -rc pgsql.original/src/bin/psql/settings.h pgsql/src/bin/psql/settings.h
*** pgsql.original/src/bin/psql/settings.h	2006-08-17 16:51:04.000000000 +0200
--- pgsql/src/bin/psql/settings.h	2006-08-17 16:55:25.000000000 +0200
***************
*** 37,43 ****
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
--- 37,43 ----
  	printQueryOpt popt;
  	VariableSpace vars;			/* "shell variable" repository */
  
! 	char	   *gfname;			/* one-shot file output argument for \g and \u */
  
  	bool		notty;			/* stdin or stdout is not a tty (as determined
  								 * on startup) */
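
As the psql-ref.sgml hunk above documents, \u is used like \g: type
the query, then \u with an optional file or pipe argument. A
hypothetical session (table and file names made up):

    test=> SELECT id, payload FROM huge_table
    test-> \u /tmp/out.txt

The rows are written to /tmp/out.txt in unaligned format, fetched
from the server 100 at a time instead of all at once.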
#19Magnus Hagander
mha@sollentuna.net
In reply to: Steve Atkins (#16)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

I'd vote for reverting to the old way. Anyone serious about hacking
should be on both lists.

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and
the focus of both lists is developers, then why not just remove one of
them and get rid of the problem?

One reason might be that a lot of application developers who
develop applications or modules associated with PG, but not
the core PG code itself, also lurk on -hackers, as it's by far
the best way to keep up with the status of various PG
enhancements (and also an excellent place to pick up a lot of
undocumented good practices).

Won't you learn even more good practices if you actually see the patches
as well? :-P

The bottom line is, I think, does the volume of mail on -patches
actually make a big difference given the much higher volume on -hackers?
(If you just want to skip the patches, just set up attachment filtering
on the list..)

//Magnus

#20Gregory Stark
gsstark@mit.edu
In reply to: Steve Atkins (#16)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

On Aug 17, 2006, at 9:30 AM, Magnus Hagander wrote:

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and the
focus of both lists is developers, then why not just remove one of them
and get rid of the problem?

Didn't I say something about not being able to convince people by arguing but
being sure people would come around eventually? :)

Steve Atkins <steve@blighty.com> writes:

One reason might be that a lot of application developers who develop
applications or modules associated with PG, but not the core PG code
itself also lurk on -hackers, as it's by far the best way to keep up with
the status of various PG enhancements (and also an excellent place
to pick up a lot of undocumented good practices).

Well if they want to keep up with the status of various PG enhancements they
had better be seeing the patches too since that's where that information is!
They don't have to read the actual patches but at least see the messages
describing them and their status. As the work progresses that's the only way
to clearly understand the status of it.

I originally suggested having the list manager strip out attachments, save
them in a web-accessible place and insert a URL in the message. I think we're
blocking on having that implemented in majordomo. If people are coming around
to my suggestion then I'll talk to Marc and see if I can help implement that.
I'm not sure what the majordomo code looks like so I don't know how easy it is
to hack in filters like that.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#21Chris Mair
chrisnospam@1006.org
In reply to: Bruce Momjian (#14)
Re: [HACKERS] selecting large result sets in psql using

BTW, \u seems not to have any mnemonic value whatsoever ... isn't there
some other name we could use?

True :)
Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

IMHO with \pset I'd have different places where I'd need to figure
out whether to do the cursor thing and I was a bit reluctant to add
stuff to existing code paths. Also the other \pset options are somewhat
orthogonal to this one. Just my two EUR cents, of course... :)

Bye, Chris.

--

Chris Mair
http://www.1006.org

#22Bruce Momjian
bruce@momjian.us
In reply to: Chris Mair (#21)
Re: [HACKERS] selecting large result sets in psql using

Chris Mair wrote:

BTW, \u seems not to have any mnemonic value whatsoever ... isn't there
some other name we could use?

True :)
Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

IMHO with \pset I'd have different places where I'd need to figure
out whether to do the cursor thing and I was a bit reluctant to add
stuff to existing code paths. Also the other \pset options are somewhat
orthogonal to this one. Just my two EUR cents, of course... :)

Well, let's see what others say, but \pset seems _much_ more natural for
this type of thing to me.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#23Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#6)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

On Thu, Aug 17, 2006 at 09:20:43AM -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Ever since pgsql-patches replies started going to -hackers, threading
doesn't work anymore, so I for one can't tell what this refers to at
all.

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to revert
to the old way?

Has that actually been working? I seem to still get replies in both
places...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joe Conway (#15)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Joe Conway <mail@joeconway.com> writes:

Magnus Hagander wrote:

Then why bother with two different lists?

If developers need to be on both lists (which I believe they do), and the
focus of both lists is developers, then why not just remove one of them
and get rid of the problem?

I wouldn't argue with that. It would be at least equally good from my
perspective, and maybe slightly better.

One big difference between the two lists is the maximum-message-size
policy ;-). To unify them we would need to relax the size limit on
-hackers, and I'm not convinced that's a good idea. It would likely
drive away at least some people who currently provide valuable ideas
even though they don't care to receive -patches.

regards, tom lane

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: [HACKERS] selecting large result sets in psql using cursors

Bruce Momjian <bruce@momjian.us> writes:

Chris Mair wrote:

Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

That seems like it would be very awkward to use: you'd have to type
quite a bit to go from one mode to the other.

Personally I think that insisting on a one-letter command name is not
such a good idea if you can't pick a reasonably memorable name.
I'd suggest "\gc" (\g with a Cursor) or "\gb" (\g for a Big query)
or some such.

regards, tom lane

#26Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#25)
Re: [HACKERS] selecting large result sets in psql using

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Chris Mair wrote:

Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

That seems like it would be very awkward to use: you'd have to type
quite a bit to go from one mode to the other.

Personally I think that insisting on a one-letter command name is not
such a good idea if you can't pick a reasonably memorable name.
I'd suggest "\gc" (\g with a Cursor) or "\gb" (\g for a Big query)
or some such.

So add it as a modifier to \g? Yea, that works, but it doesn't work for
';' as a 'go' command, of course, which is perhaps OK.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#27Chris Mair
chrisnospam@1006.org
In reply to: Bruce Momjian (#26)
Re: [HACKERS] selecting large result sets in psql using

Since buffer commands all have a single char I wanted a single char one
too. The "c" for "cursor" was taken already, so I chose the "u" (second
char in "cursor"). If somebody has a better suggestion, let us know ;)

I think a new backslash variable isn't the way to go. I would use a
\pset variable to control what is happening.

That seems like it would be very awkward to use: you'd have to type
quite a bit to go from one mode to the other.

Personally I think that insisting on a one-letter command name is not
such a good idea if you can't pick a reasonably memorable name.
I'd suggest "\gc" (\g with a Cursor) or "\gb" (\g for a Big query)
or some such.

\gc sounds like a good idea to me :)

(I must admit gc reminds me of 'garbage collector', which in a weird
way is related to what we're doing here... At least more related than
'Great Britain' ;)

So add it as a modifier to \g? Yea, that works, but it doesn't work for
';' as a 'go' command, of course, which is perhaps OK.

Yes, it was intended to differentiate this command from ';'.

Bye, Chris.

--

Chris Mair
http://www.1006.org

#28Peter Eisentraut
peter_e@gmx.net
In reply to: Chris Mair (#27)
Re: [HACKERS] selecting large result sets in psql using

Am Donnerstag, 17. August 2006 20:05 schrieb Chris Mair:

\gc sounds like a good idea to me :)

Strictly speaking, in the randomly defined grammar of psql, \gc is \g with an
argument of 'c' (try it, it works).

I'm not sure what use case you envision for this feature. Obviously, this is
for queries with large result sets. I'd guess that people will not normally
look at those result sets interactively. If the target audience is instead
psql scripting, you don't really need the most convenient command possible.
A \set variable would make sense to me.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
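
For comparison, a \set-based interface along these lines might look
like this in a script (the variable name is made up; the thread does
not settle on one):

    -- hypothetical: make \g and ';' use cursor-based retrieval
    \set use_cursor on
    SELECT * FROM huge_table;
    \unset use_cursor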

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#28)
Re: [PATCHES] selecting large result sets in psql using

Peter Eisentraut <peter_e@gmx.net> writes:

A \set variable would make sense to me.

So Peter and Bruce like a \set variable, Chris and I like a different
command. Seems like a tie ... more votes out there anywhere?

regards, tom lane

#30David Fetter
david@fetter.org
In reply to: Tom Lane (#29)
Re: [PATCHES] selecting large result sets in psql using

On Fri, Aug 18, 2006 at 10:16:12AM -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

A \set variable would make sense to me.

So Peter and Bruce like a \set variable, Chris and I like a
different command. Seems like a tie ... more votes out there
anywhere?

It seems to me that a \set variable lets people use minimal
intrusiveness on scripts, etc., as they'll just set it when they start
needing cursor-ized result sets and unset it when finished.

Just my $.02 :)

Cheers,
D
--
David Fetter <david@fetter.org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

#31Bruce Momjian
bruce@momjian.us
In reply to: David Fetter (#30)
Re: [PATCHES] selecting large result sets in psql using

David Fetter wrote:

On Fri, Aug 18, 2006 at 10:16:12AM -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

A \set variable would make sense to me.

So Peter and Bruce like a \set variable, Chris and I like a
different command. Seems like a tie ... more votes out there
anywhere?

It seems to me that a \set variable lets people use minimal
intrusiveness on scripts, etc., as they'll just set it when they start
needing cursor-ized result sets and unset it when finished.

True. They could even put it in .psqlrc if they want. Basically need a
way to modify \g. Seems a \set is the way we have always done such
modifications in the past. The big question is whether this is somehow
different. Personally, I don't think so.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#31)
Re: [PATCHES] selecting large result sets in psql using

Bruce Momjian <bruce@momjian.us> writes:

True. They could even put it in .psqlrc if they want. Basically need a
way to modify \g. Seems a \set is the way we have always done such
modifications in the past. The big question is whether this is somehow
different. Personally, I don't think so.

If you want a \set variable, then at least make it do something useful:
make it an integer var that sets the fetch count, rather than hard-wiring
the count as is done in Chris' existing patch. Zero (or perhaps unset)
disables.

regards, tom lane
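Tom's suggestion above — an integer variable that sets the fetch count, with zero (or unset) disabling chunked retrieval — can be sketched as follows. This is an illustrative Python sketch of the semantics, not the psql patch itself:

```python
def fetch_result(rows, fetch_count=0):
    """Yield batches of rows; fetch_count == 0 means one big batch.

    A fetch_count > 0 bounds memory use to one batch at a time,
    which is the point of the cursor-based code path; 0 falls back
    to the old behaviour of materializing the whole result set.
    """
    rows = list(rows)
    if fetch_count <= 0:
        yield rows                      # old behaviour: everything at once
        return
    for i in range(0, len(rows), fetch_count):
        yield rows[i:i + fetch_count]   # cursor-style: bounded batches

print([len(b) for b in fetch_result(range(10), fetch_count=4)])  # -> [4, 4, 2]
print([len(b) for b in fetch_result(range(10))])                 # -> [10]
```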

#33Noname
chrisnospam@1006.org
In reply to: Tom Lane (#32)
Re: [PATCHES] selecting large result sets in psql using

True. They could even put it in .psqlrc if they want. Basically need
a way to modify \g. Seems a \set is the way we have always done such
modifications in the past. The big question is whether this is
somehow different. Personally, I don't think so.

If you want a \set variable, then at least make it do something useful:
make it an integer var that sets the fetch count, rather than
hard-wiring the count as is done in Chris' existing patch. Zero (or
perhaps unset) disables.

regards, tom lane

Hello,

first I must admit that I misunderstood Bruce's post. I thought he meant
to tweak \pset (psql command to set formatting). This didn't make
sense to me. Only now I realize everyone is talking about \set
(psql internal variable).

That being said, I'm a bit unsure now what we should do.

As Peter said, it is true that mostly this feature would be
used for scripting where \set and \unset are not as cumbersome
to use as in an interactive session.
Tom's idea to factor in the fetch count as an option is also
very tempting.

To cut the Gordon knot I'm going to suggest we use:

\set CURSOR_FETCH fetch_count

and \g and ; are modified such that when they see
this variable set to fetch_count > 0 and the buffer
is a select they would use the modified fetch/output code.

Does this sound reasonable to everyone?

Bye :)
Chris.

--
Chris Mair
http://www.1006.org

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#33)
Re: [PATCHES] selecting large result sets in psql using

<chrisnospam@1006.org> writes:

To cut the Gordon knot I'm going to suggest we use:

\set CURSOR_FETCH fetch_count

and \g and ; are modified such that when they see
this variable set to fetch_count > 0 and the buffer
is a select they would use the modified fetch/output code.

Does this sound reasonable to everyone?

OK with me, but maybe call the variable FETCH_COUNT, to avoid the
presupposition that the implementation uses a cursor. As I mentioned
before, I expect we'll someday rework it to not use that.

regards, tom lane
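With the name settled, the usage being proposed in this thread would look something like this in a psql session (a sketch of the proposed behaviour; the chunk size 100 is arbitrary):

```
-- retrieve results via a cursor, 100 rows per fetch
\set FETCH_COUNT 100
SELECT * FROM huge_table;   -- ; and \g now stream the result set

-- back to the ordinary one-shot behaviour
\set FETCH_COUNT 0
```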

#35Robert Treat
xzilla@users.sourceforge.net
In reply to: Peter Eisentraut (#11)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

On Thursday 17 August 2006 11:55, Peter Eisentraut wrote:

Tom Lane wrote:

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

Since almost the first day I hacked on PostgreSQL I have been filtering
both lists into the same folder, so they pretty much appear to be one
and the same to me anyway.

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

#36Bruce Momjian
bruce@momjian.us
In reply to: Robert Treat (#35)
Re: pgsql-patches reply-to (was Re: [PATCHES]

Robert Treat wrote:

On Thursday 17 August 2006 11:55, Peter Eisentraut wrote:

Tom Lane wrote:

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

Since almost the first day I hacked on PostgreSQL I have been filtering
both lists into the same folder, so they pretty much appear to be one
and the same to me anyway.

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

Yes, all mine are in one folder, and I use elm ME. It is faster than a
GUI email client.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#37Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Treat (#35)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

Well as someone who is also on almost all of the PostgreSQL lists, plus
a number of sub projects :)

I filter everything postgresql except for the funds list into a single
box and I process each in order :). I used to break them up, but I found
with cross posting, and trying to reference back and forth it was just
easier to have a single box.

I used to be a big pine user but due to the large amount of email I do
process I had to move to Thunderbird which makes certain things just
much easier.

Sincerely,

Joshua D. Drake

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#36)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Bruce Momjian <bruce@momjian.us> writes:

Robert Treat wrote:

... some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

Yes, all mine are in one folder, and I use elm ME. It is faster than a
GUI email client.

All my PG list mail goes into one folder too. The list bot is pretty
good (not perfect :-() about sending only one copy of crossposted
messages. Personally I use exmh, but I don't expect people who don't
remember the Mesozoic era to know what that is.

regards, tom lane

#39Michael Glaesemann
grzm@seespotcode.net
In reply to: Robert Treat (#35)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

On Aug 23, 2006, at 12:15, Robert Treat wrote:

On Thursday 17 August 2006 11:55, Peter Eisentraut wrote:

Tom Lane wrote:

Yeah, that experiment hasn't seemed to work all that well for me
either. Do you have another idea to try, or do you just want to
revert to the old way?

Since almost the first day I hacked on PostgreSQL I have been filtering
both lists into the same folder, so they pretty much appear to be one
and the same to me anyway.

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

Reading pg ml mail is relatively high on my list of things I want to
do, so I have it all come into my inbox. However, with other mailing
lists (e.g., ruby-talk and the RoR lists which have the highest
volume of any mailing list I'm subscribed to) I generally have them
routed into their own folder. I usually let lower-volume mailing
lists just end up in my inbox as well.

Mail.app on Mac OS X 10.4. I make heavy use of the Mail Act-On[1]
plugin to make further processing of mail easier (such as archiving
to appropriate folders).

Michael Glaesemann
grzm seespotcode net

[1]: http://www.indev.ca/MailActOn.html

#40Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#38)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Robert Treat wrote:

... some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

Yes, all mine are in one folder, and I use elm ME. It is faster than a
GUI email client.

All my PG list mail goes into one folder too. The list bot is pretty
good (not perfect :-() about sending only one copy of crossposted
messages. Personally I use exmh, but I don't expect people who don't
remember the Mesozoic era to know what that is.

I know what it is from textbooks ;). Practical Unix 3rd Ed, by Sobel I
think it was.

Sincerely,

Joshua D. Drake


--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

#41Dave Page
dpage@vale-housing.co.uk
In reply to: Robert Treat (#35)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Treat
Sent: 23 August 2006 04:16
To: pgsql-hackers@postgresql.org
Cc: Peter Eisentraut; Tom Lane
Subject: Re: pgsql-patches reply-to (was Re: [HACKERS]
[PATCHES] selecting large result sets in psql using cursors)

I've always been
quite amazed how much email some of the folks here manage to
process... I
suppose I could just chalk it up to a pine vs. gui thing, but
I suspect there
are some other tricks people have to make emails more
manageable (anyone
combine all pg mail to one folder?)

More or less - one for -www, webmaster and slaves stuff, and another for
-odbc, -hackers, -patches, -committers, -perform, -general and so on. I
do keep additional ones for FG and -core though. Everything is
auto-filtered at our Exchange server so it's organised as I like whether
I pick it up on PDA, webmail, PC or Mac.

Regards, Dave.

#42Noname
chrisnospam@1006.org
In reply to: Tom Lane (#34)
Re: [PATCHES] selecting large result sets in psql using

To cut the Gordon knot I'm going to suggest we use:

\set CURSOR_FETCH fetch_count

and \g and ; are modified such that when they see
this variable set to fetch_count > 0 and the buffer
is a select they would use the modified fetch/output code.

Does this sound reasonable to everyone?

OK with me, but maybe call the variable FETCH_COUNT, to avoid the
presupposition that the implementation uses a cursor. As I mentioned
before, I expect we'll someday rework it to not use that.

regards, tom lane

Ok,
sounds good.
I'm travelling this week, but can send an updated patch during the weekend.

Bye,
Chris.

--
Chris Mair
http://www.1006.org

#43Bruno Wolff III
bruno@wolff.to
In reply to: Robert Treat (#35)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

On Tue, Aug 22, 2006 at 23:15:59 -0400,
Robert Treat <xzilla@users.sourceforge.net> wrote:

On Thursday 17 August 2006 11:55, Peter Eisentraut wrote:

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

I do, but it is a lot of email and if I miss a few days it takes a while to
catch up again. At some point I will probably do some smarter filtering, but
I don't want to spend the effort to figure that out right now.

#44Alvaro Herrera
alvherre@commandprompt.com
In reply to: Bruno Wolff III (#43)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Bruno Wolff III wrote:

On Tue, Aug 22, 2006 at 23:15:59 -0400,
Robert Treat <xzilla@users.sourceforge.net> wrote:

On Thursday 17 August 2006 11:55, Peter Eisentraut wrote:

I'm curious, do you combine any other lists like that? I've played around
with that idea (for example, I used to combine webmaster emails, pgsql-www,
and -slaves emails but the slaves traffic was too high so I had to split it
back out). As someone subscribed to a good dozen pg lists, I've always been
quite amazed how much email some of the folks here manage to process... I
suppose I could just chalk it up to a pine vs. gui thing, but I suspect there
are some other tricks people have to make emails more manageable (anyone
combine all pg mail to one folder?)

I do, but it is a lot of email and if I miss a few days it takes a while to
catch up again. At some point I will probably do some smarter filtering, but
I don't want to spend the effort to figure that out right now.

I was at some point doing the "smarter filtering", i.e. each list to its
own folder, but eventually found out that it's better to combine the
whole thing, which is what I do now. I also managed to figure out that
it's better to put stuff that doesn't pass through the list, but has a
Cc: some-list header, in the same folder; that way, duplicates (of which
I do get a few) are easier to handle. (You can choose to remove dupes
by telling Majordomo not to send you mails that have you on Cc:, but
I've found that I lose some people's emails due to my own spam
filtering.) I have on my TODO to have procmail throw away an email that
it already delivered (e.g. by comparing Message-Id's), so if someone has
a solution to that I'd like to know.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#45Bruno Wolff III
bruno@wolff.to
In reply to: Alvaro Herrera (#44)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

On Wed, Aug 23, 2006 at 15:03:24 -0400,
Alvaro Herrera <alvherre@commandprompt.com> wrote:

Bruno Wolff III wrote:

I do, but it is a lot of email and if I miss a few days it takes a while to
catch up again. At some point I will probably do some smarter filtering, but
I don't want to spend the effort to figure that out right now.

I was at some point doing the "smarter filtering", i.e. each list to its
own folder, but eventually found out that it's better to combine the
whole thing, which is what I do now. I also managed to figure out that
it's better to put stuff that doesn't pass through the list, but has a
Cc: some-list header, in the same folder; that way, duplicates (of which
I do get a few) are easier to handle. (You can choose to remove dupes
by telling Majordomo not to send you mails that have you on Cc:, but
I've found that I lose some people's emails due to my own spam
filtering.) I have on my TODO to have procmail throw away an email that
it already delivered (e.g. by comparing Message-Id's), so if someone has
a solution to that I'd like to know.

I don't have cc's removed because that still sometimes gets me faster replies,
but I do have it set so that I get only one message when a message is posted
to several lists.
I use mutt to read mail and maildrop to do filtering.
I think for me smarter filtering would be to split the lists into two or three
groups. There are lists I see a fair number of interesting messages on, lists
I can often answer questions on, and other postgres lists. When I fall behind,
doing a D.* on the other postgres lists is something I should do more than
I currently am.

#46Andrew Dunstan
andrew@dunslane.net
In reply to: Noname (#33)
Re: [PATCHES] selecting large result sets in psql using

chrisnospam@1006.org wrote:

To cut the Gordon knot I'm going to suggest we use:

ITYM "Gordian" - see http://en.wikipedia.org/wiki/Gordian_Knot

cheers

andrew ;-)

#47Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#44)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Alvaro Herrera wrote:

I have on my TODO to have procmail
throw away an email that it already delivered (e.g. by comparing
Message-Id's), so if someone has a solution to that I'd like to know.

:0 Wh: msgid.lock
| formail -D 65536 $HOME/.msgid.cache

I don't use the eliminatecc feature either, for the known reasons, but
the above works without fail.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
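Peter's procmail recipe above works by having formail keep a size-capped cache of Message-Ids and silently dropping any message whose Id has already been seen. The same idea in a quick Python sketch (a hypothetical helper, not formail itself; this cache is bounded by entry count rather than by bytes):

```python
from collections import OrderedDict

def make_dedup(max_entries=1000):
    """Return a predicate: True the first time a Message-Id is seen."""
    cache = OrderedDict()
    def first_seen(message_id):
        if message_id in cache:
            return False                # duplicate: would be discarded
        cache[message_id] = True
        if len(cache) > max_entries:    # evict oldest, like a size-capped file
            cache.popitem(last=False)
        return True
    return first_seen

seen = make_dedup()
print(seen("<msg1@example.org>"))  # True  (deliver)
print(seen("<msg1@example.org>"))  # False (duplicate, discard)
```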

#48Alvaro Herrera
alvherre@commandprompt.com
In reply to: Peter Eisentraut (#47)
Re: pgsql-patches reply-to (was Re: [PATCHES] selecting large result sets in psql using cursors)

Peter Eisentraut wrote:

Alvaro Herrera wrote:

I have on my TODO to have procmail
throw away an email that it already delivered (e.g. by comparing
Message-Id's), so if someone has a solution to that I'd like to know.

:0 Wh: msgid.lock
| formail -D 65536 $HOME/.msgid.cache

I don't use the eliminatecc feature either, for the known reasons, but
the above works without fail.

Thanks! I installed it and it certainly works like a charm.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#49Jim C. Nasby
jnasby@pervasive.com
In reply to: Tom Lane (#29)
Re: [PATCHES] selecting large result sets in psql using

On Fri, Aug 18, 2006 at 10:16:12AM -0400, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

A \set variable would make sense to me.

So Peter and Bruce like a \set variable, Chris and I like a different
command. Seems like a tie ... more votes out there anywhere?

If this will be used interactively, it would be nice to have both. That
way if you're running a bunch of cursor fetches, you can just do one
\set, but if you only want to run one or a few you can use \gc and not
mess around with \set. But I don't know how common interactive usage
will be. Presumably code can easily be taught to do either, though \set
would probably be less invasive to older code that someone wants to
change.

Another thought (which probably applies more to \set than \gc): if you
could set a threshold of how many rows the planner is estimating before
automatically switching to a cursor, that would simplify things.
Interactively, you could just let psql/PostgreSQL decide which was best
for each query. Same is true in code, though it probably matters more
for existing code than new code.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
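The auto-switch idea above could be sketched like this (hypothetical; `estimated_rows` would have to come from the planner, e.g. by running EXPLAIN first):

```python
def pick_strategy(estimated_rows, threshold):
    """Return 'cursor' when the planner's row estimate meets the
    threshold, otherwise 'direct'; threshold 0 disables cursors."""
    if threshold > 0 and estimated_rows >= threshold:
        return "cursor"   # stream via DECLARE/FETCH, bounded memory
    return "direct"       # ordinary retrieval, full result in memory

print(pick_strategy(5_000_000, threshold=10_000))  # cursor
print(pick_strategy(42, threshold=10_000))         # direct
```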

#50Noname
chrisnospam@1006.org
In reply to: Jim C. Nasby (#49)
Re: [PATCHES] selecting large result sets in psql using

If this will be used interactively, it would be nice to have both. That
way if you're running a bunch of cursor fetches, you can just do one
\set, but if you only want to run one or a few you can use \gc and not
mess around with \set. But I don't know how common interactive usage
will be. Presumably code can easily be taught to do either, though \set
would probably be less invasive to older code that someone wants to
change.

I don't know if having both is really that desirable. In particular,
as Peter pointed out, \gc is not possible because it means \g outputting
to file 'c' in the current version of psql.

Another thought (which probably applies more to \set than \gc): if you
could set a threshold of how many rows the planner is estimating before
automatically switching to a cursor, that would simplify things.
Interactively, you could just let psql/PostgreSQL decide which was best
for each query. Same is true in code, though it probably matters more
for existing code than new code.

Right now, this would be very hard, because the existing output code
cannot readily be adapted to using cursors. My patch does fetching and
output in a new code path that is very simple, but doesn't do all the
nice formatting for human readability. So moving seamlessly between the
two behind the scenes is not possible, short of refactoring the whole
output code of psql.

Tom Lane mentioned the solution at the root of all this eventually might
be a new version of libpq that does large fetches in chunks on its own.
But, we're talking > 8.2.0 then...

Bye :)
Chris.

--
Chris Mair
http://www.1006.org

#51Noname
chrisnospam@1006.org
In reply to: Noname (#42)
Re: selecting large result sets in psql using

To cut the Gordon knot I'm going to suggest we use:

\set CURSOR_FETCH fetch_count

and \g and ; are modified such that when they see
this variable set to fetch_count > 0 and the buffer
is a select they would use the modified fetch/output code.

Does this sound reasonable to everyone?

OK with me, but maybe call the variable FETCH_COUNT, to avoid the
presupposition that the implementation uses a cursor. As I mentioned
before, I expect we'll someday rework it to not use that.

regards, tom lane

Ok,
sounds good.
I'm travelling this week, but can send an updated patch during the
weekend.

I've just submitted an updated patch to pgsql-patches in a new
thread :)

Bye,
Chris.

--
Chris Mair
http://www.1006.org