pgbench - allow backslash-continuations in custom scripts

Started by Fabien COELHO over 10 years ago, 70 messages
#1 Fabien COELHO
coelho@cri.ensmp.fr
1 attachment(s)

Add backslash continuations to pgbench custom scripts.

The benefit of this approach is that it is upward compatible, and it is
also pretty simple to implement. The downside is that backslash
continuation is not the best syntax ever invented, but then you do not
have to use it if you do not like it.

The alternative would be to make the semicolon a mandatory end-of-line
marker, which would introduce an incompatibility and require more effort
to implement, including some kind of SQL-compatible lexer.

IMHO this approach is the best compromise.

--
Fabien.

Attachments:

pgbench-conts-1.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index a808546..f68acb2 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -697,11 +697,13 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
   </para>
 
   <para>
-   The format of a script file is one SQL command per line; multiline
-   SQL commands are not supported.  Empty lines and lines beginning with
-   <literal>--</> are ignored.  Script file lines can also be
-   <quote>meta commands</>, which are interpreted by <application>pgbench</>
-   itself, as described below.
+   The format of a script file is composed of lines which are each either
+   one SQL command or one <quote>meta command</> interpreted by
+   <application>pgbench</> itself, as described below.
+   Commands can be spread over multiple lines using backslash (<literal>\</>)
+   continuations, in which case the set of continuated lines is considered
+   as just one line.
+   Empty lines and lines beginning with <literal>--</> are ignored.
   </para>
 
   <para>
@@ -769,7 +771,8 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
       Examples:
 <programlisting>
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (100000 * :scale) + 1
+\set aid \
+  (1021 * :aid) % (100000 * :scale) + 1
 </programlisting></para>
     </listitem>
    </varlistentry>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8b8b591..8991702 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2437,7 +2437,7 @@ process_commands(char *buf, const char *source, const int lineno)
 }
 
 /*
- * Read a line from fd, and return it in a malloc'd buffer.
+ * Read a possibly \-continuated line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
  *
  * The buffer will typically be larger than necessary, but we don't care
@@ -2462,9 +2462,25 @@ read_line_from_file(FILE *fd)
 		memcpy(buf + used, tmpbuf, thislen + 1);
 		used += thislen;
 
-		/* Done if we collected a newline */
-		if (thislen > 0 && tmpbuf[thislen - 1] == '\n')
-			break;
+		/* If we collected a newline */
+		if (used > 0 && buf[used - 1] == '\n')
+		{
+			/* Handle simple \-continuations */
+			if (used >= 2 && buf[used - 2] == '\\')
+			{
+				buf[used - 2] = '\0';
+				used -= 2;
+			}
+			else if (used >= 3 && buf[used - 2] == '\r' &&
+					 buf[used - 3] == '\\')
+			{
+				buf[used - 3] = '\0';
+				used -= 3;
+			}
+			else
+				/* Else we are done */
+				break;
+		}
 
 		/* Else, enlarge buf to ensure we can append next bufferload */
 		buflen += BUFSIZ;
#2 Josh Berkus
josh@agliodbs.com
In reply to: Fabien COELHO (#1)
Re: pgbench - allow backslash-continuations in custom scripts

On 05/14/2015 12:10 PM, Fabien COELHO wrote:

Add backslash continuations to pgbench custom scripts.

The benefit of this approach is that it is upward compatible, and it is
also pretty simple to implement. The downside is that backslash
continuation is not the best syntax ever invented, but then you do not
have to use it if you do not like it.

The alternative would be to make semi-colon a mandatory end-of-line
marker, which would introduce an incompatibility and requires more
efforts to implement, including some kind of SQL-compatible lexer.

IMHO this approach is the best compromise.

I don't personally agree. I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

As someone who actively uses pgbench to write custom benchmarks, I need
to write queries which I can test. \ continuation does NOT work on the
psql command line, so that's useless for testing my queries; I still
have to reformat and troubleshoot. If we added \ continuation, I
wouldn't use it.

I think we should support line breaks, and require semicolons for
end-of-statement. Backwards-compatibility in custom pgbench scripts is
not critical; pgbench scripts are neither used in production, nor used
in automated systems much that I know of.

I'm not clear on why we'd need a full SQL lexer.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#2)
Re: pgbench - allow backslash-continuations in custom scripts

Josh Berkus <josh@agliodbs.com> writes:

On 05/14/2015 12:10 PM, Fabien COELHO wrote:

Add backslash continuations to pgbench custom scripts.

I don't personally agree. I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

I tend to agree on that bottom line; having this be inconsistent with psql
does not seem like a win.

I'm not clear on why we'd need a full SQL lexer.

So you don't get fooled by semicolons embedded in string literals or
comments.

regards, tom lane

#4 Josh Berkus
josh@agliodbs.com
In reply to: Fabien COELHO (#1)
Re: pgbench - allow backslash-continuations in custom scripts

On 06/19/2015 02:51 PM, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

On 05/14/2015 12:10 PM, Fabien COELHO wrote:

Add backslash continuations to pgbench custom scripts.

I don't personally agree. I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

I tend to agree on that bottom line; having this be inconsistent with psql
does not seem like a win.

I'm not clear on why we'd need a full SQL lexer.

So you don't get fooled by semicolons embedded in string literals or
comments.

I take it we ignore those now? I mean, personally, it wouldn't break
anything for me but since some other benchmarks involve random text
generators ....

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#5 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Josh Berkus (#4)
Re: pgbench - allow backslash-continuations in custom scripts

I tend to agree on that bottom line; having this be inconsistent with psql
does not seem like a win.

I'm not clear on why we'd need a full SQL lexer.

So you don't get fooled by semicolons embedded in string literals or
comments.

I take it we ignore those now? I mean, personally, it wouldn't break
anything for me but since some other benchmarks involve random text
generators ....

If backward compatibility is not an issue (I'm surprised:-), and failure
is acceptable in contrived cases, a simple implementation would be to
accumulate lines till one ends with ";\s*$".

Otherwise, maybe managing the "states" of the lexer is enough (in single
quotes, in double quotes, in comment, in other stuff), so this can be
implemented without actually requiring another lexer in pgbench, and still
be robust.
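
A minimal sketch of the state tracking suggested above, in C. The names
ScanState and find_statement_end are made up for illustration and do not
come from pgbench or from any patch in this thread. Block comments, dollar
quoting and backslash escapes are deliberately left out, so this is only the
"simple states" idea, not a full lexer.

```c
#include <stddef.h>

/* Scanner states, named after the list in the message above. */
typedef enum
{
    IN_STUFF,           /* ordinary SQL text */
    IN_SINGLE_QUOTES,   /* inside '...' */
    IN_DOUBLE_QUOTES,   /* inside "..." */
    IN_COMMENT          /* inside a -- comment, up to end of line */
} ScanState;

/*
 * Hypothetical helper: return the offset of the first ';' that is not
 * inside quotes or a -- comment, or -1 if there is none.
 */
long
find_statement_end(const char *s)
{
    ScanState   state = IN_STUFF;
    size_t      i;

    for (i = 0; s[i] != '\0'; i++)
    {
        char        c = s[i];

        switch (state)
        {
            case IN_STUFF:
                if (c == ';')
                    return (long) i;
                else if (c == '\'')
                    state = IN_SINGLE_QUOTES;
                else if (c == '"')
                    state = IN_DOUBLE_QUOTES;
                else if (c == '-' && s[i + 1] == '-')
                    state = IN_COMMENT;
                break;
            case IN_SINGLE_QUOTES:
                /* a doubled '' leaves and re-enters this state */
                if (c == '\'')
                    state = IN_STUFF;
                break;
            case IN_DOUBLE_QUOTES:
                if (c == '"')
                    state = IN_STUFF;
                break;
            case IN_COMMENT:
                if (c == '\n')
                    state = IN_STUFF;
                break;
        }
    }
    return -1;
}
```

With this, a semicolon inside a string literal or after "--" is skipped, and
an unterminated literal yields -1, meaning "keep reading lines".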

--
Fabien.

#6 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Josh Berkus (#2)
2 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Josh,

Add backslash continuations to pgbench custom scripts.
[...]
IMHO this approach is the best compromise.

I don't personally agree. I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

As someone who actively uses pgbench to write custom benchmarks, I need
to write queries which I can test. \ continuation does NOT work on the
psql command line, so that's useless for testing my queries; I still
have to reformat and troubleshoot. If we added \ continuation, I
wouldn't use it.

I think we should support line breaks, and require semicolons for
end-of-statement. Backwards-compatibility in custom pgbench scripts is
not critical; pgbench scripts are neither used in production, nor used
in automated systems much that I know of.

I'm not clear on why we'd need a full SQL lexer.

Attached is a "without lexer" version which does ;-terminated SQL commands
and \-continuated meta commands (may be useful for \shell and long \set
expressions).

Also attached is a small pgbench script to test the feature.

Without a lexer it is possible to fool pgbench with somewhat contrived
examples, say with:

SELECT 'hello;
world';

The ";" within the string will be considered as end-of-line.

Also, comments intermixed with sql on the same line would generate errors.

SELECT 1 -- one
+ 3;

Would fail, but comments on lines of their own are ok.

It may be argued that these are not likely scripts and that this
behavior could be declared as a "feature" for keeping the code simple.

ISTM that it would be an overall improvement, but also the ;-termination
requirement breaks backward compatibility.
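
For illustration, here is a small C sketch of the termination rule described
in this message: a line completes an SQL command when its last non-blank
character is ";". The function name line_ends_statement is hypothetical and
the code is a simplification, not the code from the attached patch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the "without lexer" rule: a line ends the
 * SQL command when its last non-blank character is ';'.
 */
bool
line_ends_statement(const char *line)
{
    size_t      len = strlen(line);

    /* skip trailing blanks, including the newline */
    while (len > 0 && isspace((unsigned char) line[len - 1]))
        len--;

    return len > 0 && line[len - 1] == ';';
}
```

Applied to the first example above, the line "SELECT 'hello;" already counts
as a complete command, which is exactly how the rule gets fooled.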

--
Fabien.

Attachments:

test.sql (application/x-sql)
pgbench-conts-2.patch (text/x-diff)
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index a808546..7990564 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -697,11 +697,15 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
   </para>
 
   <para>
-   The format of a script file is one SQL command per line; multiline
-   SQL commands are not supported.  Empty lines and lines beginning with
-   <literal>--</> are ignored.  Script file lines can also be
-   <quote>meta commands</>, which are interpreted by <application>pgbench</>
-   itself, as described below.
+   The format of a script file is composed of lines which are each either
+   one SQL command or one <quote>meta command</> interpreted by
+   <application>pgbench</> itself, as described below.
+   Meta-commands can be spread over multiple lines using backslash
+   (<literal>\</>) continuations, in which case the set of continuated
+   lines is considered as just one line.
+   SQL commands may be spead over several lines and must be
+   <literal>;</>-terminated.
+   Empty lines and lines beginning with <literal>--</> are ignored.
   </para>
 
   <para>
@@ -769,7 +773,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
       Examples:
 <programlisting>
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (100000 * :scale) + 1
+-- update an already defined aid:
+\set aid \
+  (1021 * :aid) % (100000 * :scale) + 1
 </programlisting></para>
     </listitem>
    </varlistentry>
@@ -932,11 +938,15 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 \setrandom tid 1 :ntellers
 \setrandom delta -5000 5000
 BEGIN;
-UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
+UPDATE pgbench_accounts
+  SET abalance = abalance + :delta WHERE aid = :aid;
 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
-UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
-UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
-INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+UPDATE pgbench_tellers
+  SET tbalance = tbalance + :delta WHERE tid = :tid;
+UPDATE pgbench_branches
+  SET bbalance = bbalance + :delta WHERE bid = :bid;
+INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
+  VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
 END;
 </programlisting>
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 6f35db4..ff41d91 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2450,7 +2450,9 @@ process_commands(char *buf, const char *source, const int lineno)
 }
 
 /*
- * Read a line from fd, and return it in a malloc'd buffer.
+ * Read a possibly \-continuated (for backslash commands) or ;-terminated
+ * (for SQL statements) lines from fd, and return it in a malloc'd buffer.
+ *
  * Return NULL at EOF.
  *
  * The buffer will typically be larger than necessary, but we don't care
@@ -2463,6 +2465,8 @@ read_line_from_file(FILE *fd)
 	char	   *buf;
 	size_t		buflen = BUFSIZ;
 	size_t		used = 0;
+	bool		is_sql_statement = false;
+	bool		is_backslash_command = false;
 
 	buf = (char *) palloc(buflen);
 	buf[0] = '\0';
@@ -2471,13 +2475,87 @@ read_line_from_file(FILE *fd)
 	{
 		size_t		thislen = strlen(tmpbuf);
 
+		/* coldly skip comments and empty lines */
+		{
+			int i = 0;
+
+			while (i < thislen && isspace(tmpbuf[i]))
+				i++;
+
+			if (tmpbuf[i] == '\0') /* blank */
+				continue;
+
+			if (tmpbuf[i] == '-' && tmpbuf[i+1] == '-') /* comment */
+				continue;
+		}
+
 		/* Append tmpbuf to whatever we had already */
 		memcpy(buf + used, tmpbuf, thislen + 1);
 		used += thislen;
 
-		/* Done if we collected a newline */
-		if (thislen > 0 && tmpbuf[thislen - 1] == '\n')
-			break;
+		if (!is_backslash_command && !is_sql_statement)
+		{
+			/* determined what the current line is */
+			int i = 0;
+
+			while (i < thislen && isspace(tmpbuf[i]))
+				i++;
+
+			if (tmpbuf[i] == '\\')
+				is_backslash_command = true;
+			else if (tmpbuf[i] != '\0')
+				is_sql_statement = true;
+		}
+
+		/* If we collected a newline */
+		if (used > 0 && buf[used - 1] == '\n')
+		{
+			if (is_backslash_command)
+			{
+				/* Handle simple \-continuations */
+				if (used >= 2 && buf[used - 2] == '\\')
+				{
+					buf[used - 2] = '\0';
+					used -= 2;
+				}
+				else if (used >= 3 && buf[used - 2] == '\r' &&
+						 buf[used - 3] == '\\')
+				{
+					buf[used - 3] = '\0';
+					used -= 3;
+				}
+				else
+					/* Else we are done */
+					break;
+			}
+			else if (is_sql_statement)
+			{
+				/* look for a terminating ";" */
+				int i = 2;
+
+				/* backward skip blanks */
+				while (used-i >= 0 && isspace(buf[used-i]))
+					i++;
+
+				if (used-i >= 0 && buf[used-i] == ';')
+					break;
+				else
+				{
+					/* scratch newline because process_commands approach to
+					 * parsing is simplistic and expects just one line.
+					 */
+					if (buf[used-1] == '\n')
+						buf[used-1] = ' ';
+					if (used >= 2 && buf[used-2] == '\r')
+					{
+						buf[used-2] = ' ';
+						buf[used-1] = '\0';
+						used --;
+					}
+				}
+			}
+			/* else it was a blank line */
+		}
 
 		/* Else, enlarge buf to ensure we can append next bufferload */
 		buflen += BUFSIZ;
#7 Josh Berkus
josh@agliodbs.com
In reply to: Fabien COELHO (#1)
Re: pgbench - allow backslash-continuations in custom scripts

Fabien,

Without a lexer it is possible to fool pgbench with somehow contrived
examples, say with:

SELECT 'hello;
world';

The ";" within the string will be considered as end-of-line.

Also, comments intermixed with sql on the same line would generate errors.

SELECT 1 -- one
+ 3;

Would fail, but comments on lines of their own are ok.

It may be argued that these are not a likely scripts and that this
behavior could be declared as a "feature" for keeping the code simple.

Yeah, these seem pretty contrived. I would personally be OK with
breaking them.

ISTM that it would be an overall improvement, but also the ;-termination
requirement breaks backward compatibility.

Look, how many people in the world develop their own pgbench scripts?
And how many of those aren't on this list right now, reading this
thread? I expect I could count them on my fingers and toes.
Backwards-compatibility for pg_dump, pg_basebackup, initdb, etc. matters.
The worst case with pgbench is that we break two people's test scripts,
they read the release notes, and fix them.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#8 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Josh Berkus (#7)
Re: pgbench - allow backslash-continuations in custom scripts

The worst case with pgbench is that we break two people's test scripts,
they read the release notes, and fix them.

Sure, I agree that breaking pgbench custom scripts compatibility is no big
deal, and having pgbench consistent with psql is useful.

--
Fabien.

#9 Josh Berkus
josh@agliodbs.com
In reply to: Fabien COELHO (#1)
Re: pgbench - allow backslash-continuations in custom scripts

On 06/21/2015 01:37 PM, Fabien COELHO wrote:

The worst case with pgbench is that we break two people's test scripts,
they read the release notes, and fix them.

Sure, I agree that breaking pgbench custom scripts compatibility is no
big deal, and having pgbench consistent with psql is useful.

... apparently nobody disagrees ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#10 Tatsuo Ishii
ishii@postgresql.org
In reply to: Josh Berkus (#7)
Re: pgbench - allow backslash-continuations in custom scripts

Look, how many people in the world develop their own pgbench scripts?
And how many of those aren't on this list right now, reading this
thread? I expect I could count them on my fingers and toes.

I'm not against you breaking the backward compatibility of pgbench custom
scripts.

However I just want to let you know that PostgreSQL Enterprise
Consortium has published a couple of scripts along with reports on
evaluating PostgreSQL, which have been downloaded in considerable
numbers.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#11 Jeff Janes
jeff.janes@gmail.com
In reply to: Josh Berkus (#9)
Re: pgbench - allow backslash-continuations in custom scripts

On Wed, Jun 24, 2015 at 1:22 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 06/21/2015 01:37 PM, Fabien COELHO wrote:

The worst case with pgbench is that we break two people's test scripts,
they read the release notes, and fix them.

Sure, I agree that breaking pgbench custom scripts compatibility is no
big deal, and having pgbench consistent with psql is useful.

... apparently nobody disagrees ...

I'm fine re-punctuating my current scripts. And since pgbench doesn't have
to be version-matched to the server it connects to, people can just keep
using the older version if they want to against a new server.

Are there other breaking changes we have been wanting to make? If so,
should we try to get them all in during the same release?

Cheers,

Jeff

#12 Michael Paquier
michael.paquier@gmail.com
In reply to: Tatsuo Ishii (#10)
Re: pgbench - allow backslash-continuations in custom scripts

On Thu, Jun 25, 2015 at 10:51 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:

Look, how many people in the world develop their own pgbench scripts?
And how many of those aren't on this list right now, reading this
thread? I expect I could count them on my fingers and toes.

I'm not against you break the backward compatibility of pgbench custom
scripts.

However I just want to let you know that PostgreSQL Enterprise
Consortium has been published couple of scripts along with reports on
evaluating PostgreSQL, which have been downloaded considerable
numbers.

pgbench is a tool dedicated to developers, hence to people who should
be able to fix their scripts properly as long as we inform them by
documenting it in the release notes, so I am not sure that this is
actually a problem.
--
Michael

#13 Tatsuo Ishii
ishii@postgresql.org
In reply to: Michael Paquier (#12)
Re: pgbench - allow backslash-continuations in custom scripts

I'm not against you break the backward compatibility of pgbench custom
scripts.

However I just want to let you know that PostgreSQL Enterprise
Consortium has been published couple of scripts along with reports on
evaluating PostgreSQL, which have been downloaded considerable
numbers.

pgbench is a tool for dedicated to developers. Hence people who should
be able to fix scripts properly as long as we inform them by
documenting it in the release notes so I am not sure that this is
actually a problem.

I'm not sure what you mean by "developers" here but if that means
people who are developing PostgreSQL itself, I am strongly against the
"pgbench is a tool dedicated to developers" concept. Pgbench has
been heavily used by users who want to measure PostgreSQL's
performance.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

#14 Michael Paquier
michael.paquier@gmail.com
In reply to: Tatsuo Ishii (#13)
Re: pgbench - allow backslash-continuations in custom scripts

On Fri, Jun 26, 2015 at 9:01 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:

I'm not against you break the backward compatibility of pgbench custom
scripts.

However I just want to let you know that PostgreSQL Enterprise
Consortium has been published couple of scripts along with reports on
evaluating PostgreSQL, which have been downloaded considerable
numbers.

pgbench is a tool for dedicated to developers. Hence people who should
be able to fix scripts properly as long as we inform them by
documenting it in the release notes so I am not sure that this is
actually a problem.

I'm not sure what you mean by "developers" here but if that means
people who are developing PostgreSQL itself, I am strongly against
"pgbench is a tool for dedicated to developers" concept. Pgbench has
been heavily used by users who want to measure PostgreSQL's
performance.

I meant "people who can write SQL". Sorry for the misunderstanding.
Please do not take any offense ;)
--
Michael

#15 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Fabien COELHO (#6)
Re: pgbench - allow backslash-continuations in custom scripts

On 06/21/2015 11:12 AM, Fabien COELHO wrote:

Hello Josh,

Add backslash continuations to pgbench custom scripts.
[...]
IMHO this approach is the best compromise.

I don't personally agree. I believe that it is worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

As someone who actively uses pgbench to write custom benchmarks, I need
to write queries which I can test. \ continuation does NOT work on the
psql command line, so that's useless for testing my queries; I still
have to reformat and troubleshoot. If we added \ continuation, I
wouldn't use it.

I think we should support line breaks, and require semicolons for
end-of-statement. Backwards-compatibility in custom pgbench scripts is
not critical; pgbench scripts are neither used in production, nor used
in automated systems much that I know of.

I'm not clear on why we'd need a full SQL lexer.

Attached is a "without lexer" version which does ;-terminated SQL commands
and \-continuated meta commands (may be useful for \shell and long \set
expressions).

As Tom pointed out, you need the full lexer to do this correctly. You
can argue that something that handles the most common cases is enough,
but realistically, by the time you've handled all the common cases
correctly, you've just re-invented the lexer.

The home-grown lexer is missing e.g. dollar-quoting support, so this is
not parsed correctly:

do $$
begin
...
end;
$$;

That would be very nice to handle correctly, I've used DO-blocks in
pgbench scripts many times, and it's a pain to have to write them in a
single line.
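
To make the dollar-quoting point concrete, here is a rough C sketch of the
extra state a line reader would need so that the "end;" inside the DO block
is not taken as the end of the statement. The helper name
update_dollar_quote_state is hypothetical, and only the bare "$$" delimiter
is recognized; tagged delimiters such as $body$ and "$$" appearing inside
ordinary string literals are ignored, which a real lexer would have to
handle.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: scan one line and return whether we are inside a
 * $$ ... $$ block after it, given whether we were inside one before it.
 * Only the bare "$$" delimiter is recognized, not tagged forms.
 */
bool
update_dollar_quote_state(const char *line, bool in_dollar_quote)
{
    const char *p = line;

    while (*p != '\0')
    {
        if (p[0] == '$' && p[1] == '$')
        {
            /* each "$$" either opens or closes the quoted block */
            in_dollar_quote = !in_dollar_quote;
            p += 2;
        }
        else
            p++;
    }
    return in_dollar_quote;
}
```

A reader would carry the returned flag from line to line and only look for a
terminating ";" while the flag is false.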

Also worth noting that you can currently test so-called multi-statements
with pgbench, by putting multiple statements on a single line. Your
patch seems to still do that, but if we went with a full-blown SQL
lexer, they would probably be split into two statements.

I think we should either bite the bullet and include the full SQL lexer
in pgbench, or come up with some new syntax for marking the beginning
and end of a statement. We could do something like bash here-documents
or Postgres dollar-quoting, for example:

\set ...
select 1234; -- A statement on a single line, no change here

-- Begin a multi-line statement
\multi-line-statement END_TOKEN
select *
from complicated;
END_TOKEN

- Heikki

#16 Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#15)
Re: pgbench - allow backslash-continuations in custom scripts

On 2015-07-03 13:50:02 +0300, Heikki Linnakangas wrote:

As Tom pointed out, you need the full lexer to do this correctly. You can
argue that something that handles the most common cases is enough, but
realistically, by the time you've handled all the common cases correctly,
you've just re-invented the lexer.

Yes.

I think we should either bite the bullet and include the full SQL lexer in
pgbench, or come up with some new syntax for marking the beginning and end
of a statement.

I'm pretty clearly in favor of doing correct lexing. I think we should
generalize that and make it reusable. psql has its own hacked up
version already, there seems little point in having variedly good copies
around.

We could do something like bash here-documents or Postgres
dollar-quoting, for example:

\set ...
select 1234; -- A statement on a single line, no change here

-- Begin a multi-line statement
\multi-line-statement END_TOKEN
select *
from complicated;
END_TOKEN

Not pretty imo. I could see including something simpler, in addition to
the lexer, to allow sending multiple statements in one go.

#17 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Heikki Linnakangas (#15)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Heikki,

I'm not clear on why we'd need a full SQL lexer.

Attached is a "without lexer" version which does ;-terminated SQL commands
and \-continuated meta commands (may be useful for \shell and long \set
expressions).

As Tom pointed out, you need the full lexer to do this correctly. You can
argue that something that handles the most common cases is enough, but
realistically, by the time you've handled all the common cases correctly,
you've just re-invented the lexer.

Sure. I understand that part of Josh's argument is that we are discussing
pgbench test scripts here, not real full-blown applications, and these are
expected to be quite basic, mostly plain SQL things.

The home-grown lexer is missing e.g. dollar-quoting support, so this is not
be parsed correctly:

do $$
begin
...
end;
$$;

Hmmm, good one, if indeed you want to use PL/pgSQL or even any arbitrary
language in a pgbench script... I would rather have created functions
(once, outside of pgbench) and would call them from the script, so that
would be a simple SELECT.

That would be very nice to handle correctly, I've used DO-blocks in pgbench
scripts many times, and it's a pain to have to write them in a single line.

Yep. With some languages I'm not sure that it is even possible.

Also worth noting that you can currently test so-called multi-statements
with pgbench, by putting multiple statements on a single line.

Yes indeed, under the hood pgbench expects just one line, or you have to
significantly change the way statements are handled, which is way beyond
my initial intentions on this one, and this would mean quite a lot of
changes for more or less corner cases.

Your patch seems to still do that, but if we went with a full-blown SQL
lexer, they would probably be split into two statements.

I think we should either bite the bullet and include the full SQL lexer in
pgbench, or come up with some new syntax for marking the beginning and end of
a statement. We could do something like bash here-documents or Postgres
dollar-quoting, for example:

\set ...
select 1234; -- A statement on a single line, no change here

-- Begin a multi-line statement
\multi-line-statement END_TOKEN
select *
from complicated;
END_TOKEN

I do not like the aesthetic of the above much. I really liked the idea of
simply writing SQL queries just as in psql.

So maybe just handling $$-quoting would be enough to handle reasonable
use-cases without troubling pgbench internal working too much? That would
be a simple local change in the line reader.

--
Fabien.

#18 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Andres Freund (#16)
Re: pgbench - allow backslash-continuations in custom scripts

I'm pretty clearly in favor of doing correct lexing. I think we should
generalize that and make it reusable. psql has its own hacked up
version already, there seems little point in having variedly good copies
around.

I must admit that I do not know how to share lexer rules but have
different actions on them (psql vs sql parser vs ...), as the action code
is intrinsically intertwined with expressions. Maybe some hack is
possible. Having yet another SQL-lexer to maintain seems highly
undesirable, especially just for pgbench.

I could see including something simpler, in addition to the lexer, to
allow sending multiple statements in one go.

Currently, probably

SELECT 1; SELECT 1;

Does 2 statements in one go, but it is on one line.

Maybe by allowing both continuations and ";" at the same time:

-- two statements in one go
SELECT 1; \
SELECT 1;
-- next statement on its own
SELECT
1;

Which could be reasonably neat, and easy to implement.
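
The continuation half of this idea amounts to the trimming already done in
the first patch of this thread: when the buffer ends with a backslash
followed by a newline (or CR-LF), drop that suffix and keep reading. A
minimal C sketch follows; strip_continuation is a hypothetical name, and
the function is a simplified stand-alone variant, not the patch code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: if buf (of length *used) ends with a backslash
 * followed by "\n" or "\r\n", cut that suffix off and return true so the
 * caller keeps reading; otherwise leave buf alone and return false.
 */
bool
strip_continuation(char *buf, size_t *used)
{
    size_t      n = *used;

    if (n >= 2 && buf[n - 1] == '\n' && buf[n - 2] == '\\')
        n -= 2;
    else if (n >= 3 && buf[n - 1] == '\n' && buf[n - 2] == '\r' &&
             buf[n - 3] == '\\')
        n -= 3;
    else
        return false;

    buf[n] = '\0';
    *used = n;
    return true;
}
```

In the example above, the first "SELECT 1; \" line would be trimmed and
joined with the next one, so both statements end up on a single logical line.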

--
Fabien.

#19 Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Heikki Linnakangas (#15)
2 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

The home-grown lexer is missing e.g. dollar-quoting support, so this is not
parsed correctly:

do $$
begin
...
end;
$$;

That would be very nice to handle correctly, I've used DO-blocks in pgbench
scripts many times, and it's a pain to have to write them in a single line.

Attached is a version which does that (I think), and a test script.

- backslash-commands can be \-continued
- sql-commands may include $$-quotes and must end with a ';'
- double-dash comments and blank lines are skipped.

Obviously it is still a non-lexer hack which can be easily defeated, but
ISTM that it handles non-contrived cases well. Anyway ISTM that dollar
quoting cannot be handled as such and simply by a lexer; it is really an
exception mechanism.
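
To make the ";"-termination and $$-quote rules concrete, here is a small stand-alone sketch. It is not the code from the attached patch: the function name sql_statement_complete is hypothetical, and it assumes the same simplistic tag pattern, i.e. a '$', any number of alphanumerics, then a '$'.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true when the accumulated text ends with ';'
 * (ignoring trailing blanks) outside of any $tag$ ... $tag$ quote.
 * Return false when more lines are needed.
 */
bool
sql_statement_complete(const char *buf)
{
	const char *p = buf;
	const char *last = NULL;	/* last non-blank char outside quotes */

	assert(buf != NULL);
	while (*p != '\0')
	{
		if (*p == '$')
		{
			const char *q = p + 1;

			while (isalnum((unsigned char) *q))
				q++;
			if (*q == '$')
			{
				/* opening delimiter spans p..q; search for the same text */
				size_t		len = (size_t) (q - p) + 1;
				const char *end = q + 1;

				while (*end != '\0' && strncmp(end, p, len) != 0)
					end++;
				if (*end == '\0')
					return false;	/* quote still open */
				p = end + len;		/* resume after the closing delimiter */
				last = p - 1;
				continue;
			}
		}
		if (!isspace((unsigned char) *p))
			last = p;
		p++;
	}
	return last != NULL && *last == ';';
}
```

With this rule the DO block quoted above is only complete once the final "$$;" line has been appended, because the "end;" inside the quote does not count as a terminator.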

--
Fabien.

Attachments:

pgbench-conts-3.patchtext/x-diff; name=pgbench-conts-3.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 2517a3a..b816673 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -696,11 +696,16 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
   </para>
 
   <para>
-   The format of a script file is one SQL command per line; multiline
-   SQL commands are not supported.  Empty lines and lines beginning with
-   <literal>--</> are ignored.  Script file lines can also be
-   <quote>meta commands</>, which are interpreted by <application>pgbench</>
-   itself, as described below.
+   The format of a script file is composed of lines which are each either
+   one SQL command or one <quote>meta command</> interpreted by
+   <application>pgbench</> itself, as described below.
+   Meta-commands can be spread over multiple lines using backslash
+   (<literal>\</>) continuations, in which case the set of continuated
+   lines is considered as just one line.
+   SQL commands may be spead over several lines and must be
+   <literal>;</>-terminated, and may contain simple dollar-quoted strings
+   over multiple lines.
+   Empty lines and lines beginning with <literal>--</> are ignored.
   </para>
 
   <para>
@@ -768,7 +773,9 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
       Examples:
 <programlisting>
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (100000 * :scale) + 1
+-- update an already defined aid:
+\set aid \
+  (1021 * :aid) % (100000 * :scale) + 1
 </programlisting></para>
     </listitem>
    </varlistentry>
@@ -931,11 +938,15 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
 \setrandom tid 1 :ntellers
 \setrandom delta -5000 5000
 BEGIN;
-UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
+UPDATE pgbench_accounts
+  SET abalance = abalance + :delta WHERE aid = :aid;
 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
-UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
-UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
-INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+UPDATE pgbench_tellers
+  SET tbalance = tbalance + :delta WHERE tid = :tid;
+UPDATE pgbench_branches
+  SET bbalance = bbalance + :delta WHERE bid = :bid;
+INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
+  VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
 END;
 </programlisting>
 
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 95be62c..c10fb29 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2430,7 +2430,10 @@ process_commands(char *buf, const char *source, const int lineno)
 }
 
 /*
- * Read a line from fd, and return it in a malloc'd buffer.
+ * Read a possibly \-continuated (for backslash commands) or ;-terminated
+ * (for SQL statements) lines from fd, and return it in a malloc'd buffer.
+ * Also handle possible $$-quotes.
+ *
  * Return NULL at EOF.
  *
  * The buffer will typically be larger than necessary, but we don't care
@@ -2443,6 +2446,13 @@ read_line_from_file(FILE *fd)
 	char	   *buf;
 	size_t		buflen = BUFSIZ;
 	size_t		used = 0;
+	bool		is_sql_statement = false;
+	bool		is_backslash_command = false;
+	/* simplistic $$-quoting handling */
+	int			ddquote_start = 0;
+	int 		ddquote_length = 0;
+	int 		ddquote_end = 0; /* where the previous $$-quote ended */
+	char	   *ddquote_string = NULL;
 
 	buf = (char *) palloc(buflen);
 	buf[0] = '\0';
@@ -2451,13 +2461,130 @@ read_line_from_file(FILE *fd)
 	{
 		size_t		thislen = strlen(tmpbuf);
 
+		/* coldly skip comments and empty lines */
+		{
+			int i = 0;
+
+			while (i < thislen && isspace(tmpbuf[i]))
+				i++;
+
+			if (tmpbuf[i] == '\0') /* blank */
+				continue;
+
+			if (tmpbuf[i] == '-' && tmpbuf[i+1] == '-') /* comment */
+				continue;
+		}
+
 		/* Append tmpbuf to whatever we had already */
 		memcpy(buf + used, tmpbuf, thislen + 1);
 		used += thislen;
 
-		/* Done if we collected a newline */
-		if (thislen > 0 && tmpbuf[thislen - 1] == '\n')
-			break;
+		/* Determined what the current line is */
+		if (!is_backslash_command && !is_sql_statement)
+		{
+			int i = 0;
+
+			while (i < thislen && isspace(tmpbuf[i]))
+				i++;
+
+			if (tmpbuf[i] == '\\')
+				is_backslash_command = true;
+			else if (tmpbuf[i] != '\0')
+				is_sql_statement = true;
+		}
+
+		/* Handle simple $$-quoting, may not work if several quotes on a line.
+		 */
+		if (is_sql_statement && ddquote_string != NULL)
+		{
+			char * found = strstr(buf + ddquote_start + ddquote_length,
+								  ddquote_string);
+			if (found != NULL)
+			{
+				pg_free(ddquote_string);
+				ddquote_string = NULL;
+				ddquote_end = found + ddquote_length - buf;
+			}
+		}
+
+		/* Is there a starting $$-quote?
+		 */
+		if (is_sql_statement && ddquote_string == NULL)
+		{
+			int i = ddquote_end;
+
+			/* \$[a-zA-Z0-9]*\$ pattern */
+			while (i < thislen - 1)
+			{
+				if (buf[i] == '$') /* starting '$' */
+				{
+					int j = i+1;
+					while (j < thislen && isalnum(buf[j]))
+						j++; /* eat alnum characters */
+					if (buf[j] == '$') /* final '$' */
+					{
+						ddquote_start = i;
+						ddquote_length = j-i+1;
+						ddquote_string = pg_malloc(ddquote_length + 1);
+						strncpy(ddquote_string, buf+i, ddquote_length);
+						ddquote_string[ddquote_length] = '\0';
+						break;
+					}
+				}
+				i++;
+			}
+		}
+
+		/* If we collected a newline */
+		if (used > 0 && buf[used - 1] == '\n')
+		{
+			if (is_backslash_command)
+			{
+				/* Handle simple \-continuations */
+				if (used >= 2 && buf[used - 2] == '\\')
+				{
+					buf[used - 2] = '\0';
+					used -= 2;
+				}
+				else if (used >= 3 && buf[used - 2] == '\r' &&
+						 buf[used - 3] == '\\')
+				{
+					buf[used - 3] = '\0';
+					used -= 3;
+				}
+				else
+					/* Else we are done */
+					break;
+			}
+			else if (is_sql_statement)
+			{
+				if (ddquote_string == NULL)
+				{
+					/* look for a terminating ";" */
+					int i = 2;
+
+					/* backward skip blanks */
+					while (used-i >= 0 && isspace(buf[used-i]))
+						i++;
+
+					if (used-i >= 0 && buf[used-i] == ';')
+						break;
+				}
+
+				/* scratch newline because process_commands approach to
+				 * parsing is simplistic and expects just one line.
+				 */
+				if (buf[used-1] == '\n')
+					buf[used-1] = ' ';
+				if (used >= 2 && buf[used-2] == '\r')
+				{
+					buf[used-2] = ' ';
+					buf[used-1] = '\0';
+					used --;
+				}
+			}
+			/* else it was a blank line */
+		}
 
 		/* Else, enlarge buf to ensure we can append next bufferload */
 		buflen += BUFSIZ;
test.sqlapplication/x-sql; name=test.sqlDownload
#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#18)
Re: pgbench - allow backslash-continuations in custom scripts

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I'm pretty clearly in favor of doing correct lexing. I think we should
generalize that and make it reusable. psql has its own hacked up
version already, there seems little point in having variedly good copies
around.

I must admit that I do not know how to share lexer rules but have
different actions on them (psql vs sql parser vs ...), as the action code
is intrinsically intertwined with expressions.

Obviously this is scope creep of the first magnitude, but ISTM that
it would be possible to share a lexer between psql and pgbench, since
in both of them the basic requirement is "break SQL commands apart and
identify newline-terminated backslash commands". If we're gonna break
pgbench's backwards compatibility anyway, there would be a whole lot
to be said for just going over to psql's input parsing rules, lock
stock 'n barrel; and this would be a good way to achieve that.

As it stands, psqlscan.l has some external dependencies on the rest of
psql, but we could perhaps refactor some of those away, and provide dummy
implementations to satisfy others (eg pgbench could provide a dummy
GetVariable() that just always returns NULL).

So I'm imagining symlinking psqlscan.l into src/bin/pgbench and using it
as-is (possibly after refactoring in psql). A possible issue is avoiding
unnecessary invocations of flex, though. Maybe symlinking the .c file
would work better.
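
As a rough illustration of the dummy-implementation idea, a stub could look like the sketch below. The VariableSpace typedef here is only a stand-in declared so the fragment compiles on its own; the real type and prototype live in psql's variables.h and should be checked there.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for psql's opaque variable store (assumption, see above). */
typedef struct _variable *VariableSpace;

/*
 * Dummy lookup for pgbench: no variable is ever defined while the script
 * is being split, so every ":name" reference is left untouched by the lexer.
 */
const char *
GetVariable(VariableSpace space, const char *name)
{
	assert(name != NULL);
	(void) space;				/* unused on purpose */
	return NULL;
}
```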

regards, tom lane

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#20)
Re: pgbench - allow backslash-continuations in custom scripts

I wrote:

As it stands, psqlscan.l has some external dependencies on the rest of
psql, but we could perhaps refactor some of those away, and provide dummy
implementations to satisfy others (eg pgbench could provide a dummy
GetVariable() that just always returns NULL).

So I'm imagining symlinking psqlscan.l into src/bin/pgbench and using it
as-is (possibly after refactoring in psql). A possible issue is avoiding
unnecessary invocations of flex, though. Maybe symlinking the .c file
would work better.

A quick experiment with compiling psqlscan inside pgbench yields the
following failures:

pgbench.o: In function `psql_scan_setup':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1239: undefined reference to `pset'
pgbench.o: In function `escape_variable':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1950: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1950: undefined reference to `GetVariable'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1956: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1957: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1971: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1973: undefined reference to `psql_error'
pgbench.o: In function `evaluate_backtick':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1701: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1712: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1722: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1728: undefined reference to `psql_error'
pgbench.o: In function `yylex':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:511: undefined reference to `standard_strings'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:743: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:743: undefined reference to `GetVariable'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:751: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1037: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1037: undefined reference to `GetVariable'
pgbench.o: In function `psql_scan_slash_option':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1619: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1628: undefined reference to `psql_error'

The pset references are to pset.encoding, pset.db, or pset.vars. I'd
think the best way to deal with the encoding and connection is to pass
them as parameters to psql_scan_setup() which'd store them in
the PsqlScanState. pset.vars is only passed to GetVariable. We could
refactor that away somehow (although actually, why wouldn't we want to
just implement variable substitution exactly like it is in psql? Maybe
the right answer is to import psql/variables.c lock stock n barrel too...)
psql_error() and standard_strings() wouldn't be hard to provide.

So this is looking *eminently* doable.
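
The parameter-passing idea can be sketched in isolation like this. All names below (DemoConn, DemoScanState, demo_scan_setup) are hypothetical stand-ins chosen so that the fragment compiles without libpq or psql headers; they only mirror the shape of the suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the connection type (plays the role of PGconn). */
typedef struct DemoConn DemoConn;

/* Scanner state carrying what used to be read from a global. */
typedef struct DemoScanState
{
	const char *scanline;		/* current input line */
	int			line_len;
	DemoConn   *db;				/* may be NULL while only parsing scripts */
	int			encoding;		/* client encoding id */
} DemoScanState;

/* Store the caller-supplied context instead of consulting global state. */
void
demo_scan_setup(DemoScanState *state, const char *line, int line_len,
				DemoConn *db, int encoding)
{
	assert(state != NULL && line != NULL);
	state->scanline = line;
	state->line_len = line_len;
	state->db = db;
	state->encoding = encoding;
}
```

The point of the design is that a caller without a live connection, such as a script parser, can pass NULL and still reuse the scanner.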

regards, tom lane

#22Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#21)
Re: pgbench - allow backslash-continuations in custom scripts

(although actually, why wouldn't we want to just implement variable
substitution exactly like it is in psql?

Pgbench variable substitution is performed when the script is run, not
while the file is being processed for being split, which is when a lexer
would be used. The situation is not the same with psql. The most it could
do would be to keep track of what substitutions are done in queries.
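
To illustrate what "when the script is run" means, here is a minimal sketch of run-time substitution. It is not pgbench's actual code: the Var table, lookup and substitute_vars are hypothetical, and unknown names are simply copied through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical variable table; pgbench keeps its own per-client store. */
typedef struct
{
	const char *name;
	const char *value;
} Var;

static const char *
lookup(const Var *vars, size_t nvars, const char *name, size_t len)
{
	for (size_t i = 0; i < nvars; i++)
		if (strlen(vars[i].name) == len &&
			strncmp(vars[i].name, name, len) == 0)
			return vars[i].value;
	return NULL;
}

/*
 * Copy sql into out, replacing each known :name by its value.
 * Returns 0 on success, -1 if out (capacity outlen) is too small.
 */
int
substitute_vars(const char *sql, const Var *vars, size_t nvars,
				char *out, size_t outlen)
{
	size_t		used = 0;

	assert(sql != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	while (*sql != '\0')
	{
		const char *chunk = sql;	/* default: copy one character */
		size_t		chunklen = 1;

		if (*sql == ':')
		{
			size_t		n = 0;
			const char *value;

			while (isalnum((unsigned char) sql[1 + n]) || sql[1 + n] == '_')
				n++;
			value = (n > 0) ? lookup(vars, nvars, sql + 1, n) : NULL;
			if (value != NULL)
			{
				chunk = value;
				chunklen = strlen(value);
				sql += n;		/* skip the name; ':' is skipped below */
			}
		}
		if (used + chunklen >= outlen)
			return -1;
		memcpy(out + used, chunk, chunklen);
		used += chunklen;
		sql++;
	}
	out[used] = '\0';
	return 0;
}
```

Because the values can change between transactions (for instance after \setrandom), this step has to happen per execution, whereas splitting the file into commands happens once.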

So this is looking *eminently* doable.

Possibly. How much more effort would be involved compared to the quick
patch I did, I wonder:-)

--
Fabien.

#23Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Fabien COELHO (#22)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Tom,

(although actually, why wouldn't we want to just implement variable
substitution exactly like it is in psql?

Pgbench variable substitution is performed when the script is run, not while
the file is being processed for being split, which is when a lexer would be
used. The situation is not the same with psql. The most it could do would be
to keep track of what substitutions are done in queries.

So this is looking *eminently* doable.

Possibly. How much more effort would be involved compared to the quick patch
I did, I wonder:-)

I had a quick look at the code, and although it seems doable to hack the
psql lexer for pgbench benefit, I do not think it is a good idea:

- improving pgbench scripts is really a small feature which requires
a lightweight approach in my opinion. There is no real benefit
in having a lexer solution which can handle contrived cases, because
they would be contrived cases and not the kind of tests really written
by pgbench users.

- the solution would probably be fragile to changes in psql, or
could prevent such changes because of the pgbench dependency,
and this does not look desirable.

- it would involve much more time than I'm ready to give on such a
small feature.

So the current patch, or possibly variants of this patch to fix issues
that may be raised, is what I'm ready to put forward on this.

If you feel that this feature only deserves a lexer solution, then the
patch should be "returned with feedback".

--
Fabien.

#24Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#23)
3 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hi,

If you feel that this feature only deserves a lexer solution, then the
patch should be "returned with feedback".

It would be unfortunate to abandon this idea, so I tried it and made it
run with psql's parser. I think it works as expected.

The attached files are as follows.

- 0001-Prepare-for-share-psqlscan-with-pgbench.patch
A patch to modify psql so that psqlscan can be shared with other modules.

- 0002-Make-use-of-psqlscan-in-pgbench.patch
A patch to use psqlscan in pgbench.

- hoge.sql
A sample custom script including multiline statements and line comments.

I can't judge whether this is a new version of Fabien's patch
following Tom's suggestion or a brand-new one. Anyway, I'd like to
post it on this thread.

======
At Fri, 17 Jul 2015 21:26:44 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1507172113080.31314@sto>

Pgbench variable substitution is performed when the script is run, not
while the file is being processed for being split, which is when a
lexer would be used. The situation is not the same with psql. The most
it could do would be to keep track of what substitutions are done in
queries.

So this is looking *eminently* doable.

Possibly. How much more effort would be involved compared to the
quick patch I did, I wonder:-)

The patch set consists of two parts.

The first modifies psqlscan.l to work in pgbench, mostly along the
lines of Tom's suggestion.

- Eliminate direct reading of pset; the needed values are stored into
PsqlScanState in psql_scan_setup instead.

- variables, common, settings and prompt in pgbench are
shrunken versions of those in psql.

The second part modifies pgbench to use the modified version of
psqlscan.l. As a result,

- Multiline SQL statements (without backslash continuation) are allowed
in custom scripts (also in builtins, but that is of no use).

- Backslash commands are handled the same as before: multiline
is not allowed.

A sample script is also attached.

Suggestions? Opinions?

I have no idea how to deal with the copies of psqlscan.[lh] taken from
psql. Currently they are simply dead copies of those in psql.

- Modifying psqlscan in psql requires considering how it is
used in pgbench.

- They are rather small, but common, variables and prompt are
essentially needless files.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Prepare-for-share-psqlscan-with-pgbench.patchtext/x-patch; charset=us-asciiDownload
>From 13153d3645c480341fe1e94a14af0609f2ec54d4 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/2] Prepare for share psqlscan with pgbench.

psql_scan no more accesses directly to pset struct and allow omission
of VariableSpace.
---
 src/bin/psql/mainloop.c |  6 +++--
 src/bin/psql/psqlscan.h |  7 +++---
 src/bin/psql/psqlscan.l | 59 ++++++++++++++++++++++++++++---------------------
 src/bin/psql/startup.c  |  4 ++--
 4 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index b6cef94..e98cb94 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -233,7 +233,8 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		psql_scan_setup(scan_state, line, strlen(line),
+						pset.db, pset.vars, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +374,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									pset.db, pset.vars, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 55070ca..1b6361b 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,7 +11,7 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-
+#include "variables.h"
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
@@ -36,12 +36,11 @@ enum slash_option_type
 	OT_NO_EVAL					/* no expansion of backticks or variables */
 };
 
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index be059ab..08bd9d2 100644
--- a/src/bin/psql/psqlscan.l
+++ b/src/bin/psql/psqlscan.l
@@ -43,11 +43,6 @@
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
 /*
  * We use a stack of flex buffers to handle substitution of psql variables.
  * Each stacked buffer contains the as-yet-unread text from one psql variable.
@@ -81,10 +76,12 @@ typedef struct PsqlScanStateData
 	const char *scanline;		/* current input line at outer level */
 
 	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
 	int			encoding;		/* encoding being used now */
 	bool		safe_encoding;	/* is current encoding "safe"? */
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
@@ -736,11 +733,14 @@ other			.
 
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char   *varname = NULL;
+					const char *value = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
 
 					if (value)
 					{
@@ -769,7 +769,8 @@ other			.
 						ECHO;
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
 :'{variable_char}+'	{
@@ -1033,9 +1034,12 @@ other			.
 						char   *varname;
 						const char *value;
 
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
 
 						/*
 						 * The variable value is just emitted without any
@@ -1227,17 +1231,19 @@ psql_scan_destroy(PsqlScanState state)
  * or freed until after psql_scan_finish is called.
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
 
 	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
+	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
+	state->vars = vars;
+
 	/* needed for prepare_buffer */
 	cur_state = state;
 
@@ -1615,7 +1621,7 @@ psql_scan_slash_option(PsqlScanState state,
 					{
 						if (!inquotes && type == OT_SQLID)
 							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
+						cp += PQmblen(cp, cur_state->encoding);
 					}
 				}
 			}
@@ -1944,15 +1950,18 @@ escape_variable(bool as_ident)
 	char	   *varname;
 	const char *value;
 
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
 
 	/* Escaping. */
 	if (value)
 	{
-		if (!pset.db)
+		if (!cur_state->db)
 			psql_error("can't escape without active connection\n");
 		else
 		{
@@ -1960,14 +1969,14 @@ escape_variable(bool as_ident)
 
 			if (as_ident)
 				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
 			else
 				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
+					PQescapeLiteral(cur_state->db, value, strlen(value));
 
 			if (escaped_value == NULL)
 			{
-				const char *error = PQerrorMessage(pset.db);
+				const char *error = PQerrorMessage(cur_state->db);
 
 				psql_error("%s", error);
 			}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 28ba75a..c143dfe 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -305,8 +305,8 @@ main(int argc, char *argv[])
 
 		scan_state = psql_scan_create();
 		psql_scan_setup(scan_state,
-						options.action_string,
-						strlen(options.action_string));
+						options.action_string, strlen(options.action_string),
+						pset.db, pset.vars, pset.encoding);
 
 		successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 			? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1

0002-Make-use-of-psqlscan-in-pgbench.patchtext/x-patch; charset=us-asciiDownload
>From 0c7292cd008d2bfe5729ecfc4890352a68ef6481 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Jul 2015 10:58:23 +0900
Subject: [PATCH 2/2] Make use of psqlscan in pgbench.

Make pgbench to use psqlscan of psql as is. This allows SQL statements
in custom scirpts to take the same form as psql e.g, multiline
statements or line comments. backslash commands are not allowed split
into multi lines as it used to be.
---
 src/bin/pgbench/Makefile    |    3 +-
 src/bin/pgbench/common.c    |   25 +
 src/bin/pgbench/common.h    |   17 +
 src/bin/pgbench/pgbench.c   |  345 ++++----
 src/bin/pgbench/prompt.h    |   25 +
 src/bin/pgbench/psqlscan.h  |   63 ++
 src/bin/pgbench/psqlscan.l  | 1997 +++++++++++++++++++++++++++++++++++++++++++
 src/bin/pgbench/variables.c |   22 +
 src/bin/pgbench/variables.h |   22 +
 9 files changed, 2368 insertions(+), 151 deletions(-)
 create mode 100644 src/bin/pgbench/common.c
 create mode 100644 src/bin/pgbench/common.h
 create mode 100644 src/bin/pgbench/prompt.h
 create mode 100644 src/bin/pgbench/psqlscan.h
 create mode 100644 src/bin/pgbench/psqlscan.l
 create mode 100644 src/bin/pgbench/variables.c
 create mode 100644 src/bin/pgbench/variables.h

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..9edae63 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -7,8 +7,7 @@ subdir = src/bin/pgbench
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = pgbench.o exprparse.o $(WIN32RES)
-
+OBJS = pgbench.o exprparse.o psqlscan.o common.o variables.o $(WIN32RES)
 override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
diff --git a/src/bin/pgbench/common.c b/src/bin/pgbench/common.c
new file mode 100644
index 0000000..16eb7b6
--- /dev/null
+++ b/src/bin/pgbench/common.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/common.c
+ */
+
+#include "common.h"
+
+void
+psql_error(const char *fmt,...)
+{
+	fprintf(stderr, "psql_error is called. abort.\n");
+	exit(1);
+}
+
+/*
+ * This function originally returns the setting of standard_conforming_strings
+ *  but pgbench doesn't have connection while parsing scripts return always
+ *  true as the default value for 9.1 or later.
+ */
+bool
+standard_strings(void)
+{
+	return true;
+}
diff --git a/src/bin/pgbench/common.h b/src/bin/pgbench/common.h
new file mode 100644
index 0000000..80c6ac2
--- /dev/null
+++ b/src/bin/pgbench/common.h
@@ -0,0 +1,17 @@
+/*
+ * dummy functions for pgbench to use psqlscan
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/common.h
+ */
+#ifndef COMMON_H
+#define COMMON_H
+
+#include "postgres_fe.h"
+#include "libpq-fe.h"
+
+extern void psql_error(const char *fmt,...) pg_attribute_printf(1, 2);
+extern bool standard_strings(void);
+
+#endif   /* COMMON_H */
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e839fa3..daeca78 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,7 +54,7 @@
 #endif
 
 #include "pgbench.h"
-
+#include "psqlscan.h"
 /*
  * Multi-platform pthread implementations
  */
@@ -2221,9 +2221,9 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+/* Parse a backslash command; return a Command struct  */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(char *buf, const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
@@ -2231,207 +2231,232 @@ process_commands(char *buf, const char *source, const int lineno)
 	int			j;
 	char	   *p,
 			   *tok;
+	int			max_args = -1;
 
 	/* Make the string buf end at the next newline */
 	if ((p = strchr(buf, '\n')) != NULL)
 		*p = '\0';
 
-	/* Skip leading whitespace */
 	p = buf;
-	while (isspace((unsigned char) *p))
-		p++;
-
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
-		return NULL;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
 	my_commands->line = pg_strdup(buf);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
 
-	if (*p == '\\')
+	if (*p != '\\')
 	{
-		int			max_args = -1;
+		fprintf(stderr, "process_backslash_commands has been called at wrong place. abort.\n");
+		exit(1);
+	}		
 
-		my_commands->type = META_COMMAND;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	j = 0;
+	tok = strtok(++p, delim);
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
+
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
 
-		while (tok != NULL)
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*
+		 * parsing: \setrandom variable min max [uniform] \setrandom
+		 * variable min max (gaussian|exponential) threshold
+		 */
+
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*
-			 * parsing: \setrandom variable min max [uniform] \setrandom
-			 * variable min max (gaussian|exponential) threshold
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
-			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-					 "missing threshold argument", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
+							 "missing threshold argument", my_commands->argv[4], -1);
 			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.  We
+		 * don't have to terminate the number argument with null because it
+		 * will be parsed with atoi, which ignores trailing non-digit
+		 * characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
 
+/* Parse one input line; return non-NULL if a command terminates on it. */
+static Command *
+process_commands(PsqlScanState scan_state, PQExpBuffer qbuf, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	PsqlScanResult scan_result;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+
+	psql_scan_setup(scan_state, buf, strlen(buf), NULL, NULL, 0);
+
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = pg_strdup(qbuf->data); /* line? */
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data); /* remove leading ws?*/
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
 	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		/* backslash commands are always one-liners */
+		command = process_backslash_commands(buf, source, lineno);
+	}
 
-	return my_commands;
+	psql_scan_finish(scan_state);
+
+	return command;
 }
 
+
 /*
  * Read a line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
@@ -2486,6 +2511,8 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	PsqlScanState scan_state;
+	PQExpBuffer query_buf = createPQExpBuffer();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2506,33 +2533,46 @@ process_file(char *filename)
 		return false;
 	}
 
+	scan_state = psql_scan_create();
+	resetPQExpBuffer(query_buf);
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
-
+		command = process_commands(scan_state, query_buf, buf,
+								   filename, lineno);
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE.  In either case, go on to read the next
+			 * line.
+			 */
 			continue;
+		}
 
 		my_commands[index] = command;
 		index++;
-
+		resetPQExpBuffer(query_buf);
 		if (index >= alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands = pg_realloc(my_commands,
+									 sizeof(Command *) * alloc_num);
 		}
 	}
 	fclose(fd);
 
+	psql_scan_finish(scan_state);
+
 	my_commands[index] = NULL;
 
 	sql_files[num_files++] = my_commands;
@@ -2550,10 +2590,15 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	PsqlScanState scan_state;
+	PQExpBuffer query_buf = createPQExpBuffer();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
 
+	scan_state = psql_scan_create();
+	resetPQExpBuffer(query_buf);
+
 	lineno = 0;
 	index = 0;
 
@@ -2576,10 +2621,11 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(scan_state, query_buf, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		resetPQExpBuffer(query_buf);
 		my_commands[index] = command;
 		index++;
 
@@ -2591,6 +2637,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	psql_scan_finish(scan_state);
 
 	return my_commands;
 }
diff --git a/src/bin/pgbench/prompt.h b/src/bin/pgbench/prompt.h
new file mode 100644
index 0000000..e3f6ce5
--- /dev/null
+++ b/src/bin/pgbench/prompt.h
@@ -0,0 +1,25 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/prompt.h
+ */
+#ifndef PROMPT_H
+#define PROMPT_H
+
+typedef enum _promptStatus
+{
+	PROMPT_READY,
+	PROMPT_CONTINUE,
+	PROMPT_COMMENT,
+	PROMPT_SINGLEQUOTE,
+	PROMPT_DOUBLEQUOTE,
+	PROMPT_DOLLARQUOTE,
+	PROMPT_PAREN,
+	PROMPT_COPY
+} promptStatus_t;
+
+char	   *get_prompt(promptStatus_t status);
+
+#endif   /* PROMPT_H */
diff --git a/src/bin/pgbench/psqlscan.h b/src/bin/pgbench/psqlscan.h
new file mode 100644
index 0000000..1b6361b
--- /dev/null
+++ b/src/bin/pgbench/psqlscan.h
@@ -0,0 +1,63 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan.h
+ */
+#ifndef PSQLSCAN_H
+#define PSQLSCAN_H
+
+#include "pqexpbuffer.h"
+
+#include "prompt.h"
+#include "variables.h"
+
+/* Abstract type for lexer's internal state */
+typedef struct PsqlScanStateData *PsqlScanState;
+
+/* Termination states for psql_scan() */
+typedef enum
+{
+	PSCAN_SEMICOLON,			/* found command-ending semicolon */
+	PSCAN_BACKSLASH,			/* found backslash command */
+	PSCAN_INCOMPLETE,			/* end of line, SQL statement incomplete */
+	PSCAN_EOL					/* end of line, SQL possibly complete */
+} PsqlScanResult;
+
+/* Different ways for scan_slash_option to handle parameter words */
+enum slash_option_type
+{
+	OT_NORMAL,					/* normal case */
+	OT_SQLID,					/* treat as SQL identifier */
+	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
+	OT_FILEPIPE,				/* it's a filename or pipe */
+	OT_WHOLE_LINE,				/* just snarf the rest of the line */
+	OT_NO_EVAL					/* no expansion of backticks or variables */
+};
+
+extern PsqlScanState psql_scan_create(void);
+extern void psql_scan_destroy(PsqlScanState state);
+
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
+extern void psql_scan_finish(PsqlScanState state);
+
+extern PsqlScanResult psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt);
+
+extern void psql_scan_reset(PsqlScanState state);
+
+extern bool psql_scan_in_quote(PsqlScanState state);
+
+extern char *psql_scan_slash_command(PsqlScanState state);
+
+extern char *psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon);
+
+extern void psql_scan_slash_command_end(PsqlScanState state);
+
+#endif   /* PSQLSCAN_H */
diff --git a/src/bin/pgbench/psqlscan.l b/src/bin/pgbench/psqlscan.l
new file mode 100644
index 0000000..08bd9d2
--- /dev/null
+++ b/src/bin/pgbench/psqlscan.l
@@ -0,0 +1,1997 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscan.l
+ *	  lexical scanner for psql
+ *
+ * This code is mainly needed to determine where the end of a SQL statement
+ * is: we are looking for semicolons that are not within quotes, comments,
+ * or parentheses.  The most reliable way to handle this is to borrow the
+ * backend's flex lexer rules, lock, stock, and barrel.  The rules below
+ * are (except for a few) the same as the backend's, but their actions are
+ * just ECHO whereas the backend's actions generally do other things.
+ *
+ * XXX The rules in this file must be kept in sync with the backend lexer!!!
+ *
+ * XXX Avoid creating backtracking cases --- see the backend lexer for info.
+ *
+ * The most difficult aspect of this code is that we need to work in multibyte
+ * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
+ * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
+ * all our lexing rules treat all high-bit-set characters alike, we don't
+ * really need to care whether such a byte is part of a sequence or not.
+ * In an "unsafe" encoding, we still expect the first byte of a multibyte
+ * sequence to be >= 0x80, but later bytes might not be.  If we scan such
+ * a sequence as-is, the lexing rules could easily be fooled into matching
+ * such bytes to ordinary ASCII characters.  Our solution for this is to
+ * substitute 0xFF for each non-first byte within the data presented to flex.
+ * The flex rules will then pass the FF's through unmolested.  The emit()
+ * subroutine is responsible for looking back to the original string and
+ * replacing FF's with the corresponding original bytes.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscan.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+
+#include "psqlscan.h"
+
+#include <ctype.h>
+
+/*
+ * We use a stack of flex buffers to handle substitution of psql variables.
+ * Each stacked buffer contains the as-yet-unread text from one psql variable.
+ * When we pop the stack all the way, we resume reading from the outer buffer
+ * identified by scanbufhandle.
+ */
+typedef struct StackElem
+{
+	YY_BUFFER_STATE buf;		/* flex input control structure */
+	char	   *bufstring;		/* data actually being scanned by flex */
+	char	   *origstring;		/* copy of original data, if needed */
+	char	   *varname;		/* name of variable providing data, or NULL */
+	struct StackElem *next;
+} StackElem;
+
+/*
+ * All working state of the lexer must be stored in PsqlScanStateData
+ * between calls.  This allows us to have multiple open lexer operations,
+ * which is needed for nested include files.  The lexer itself is not
+ * recursive, but it must be re-entrant.
+ */
+typedef struct PsqlScanStateData
+{
+	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
+	/*
+	 * These variables always refer to the outer buffer, never to any
+	 * stacked variable-expansion buffer.
+	 */
+	YY_BUFFER_STATE scanbufhandle;
+	char	   *scanbuf;		/* start of outer-level input buffer */
+	const char *scanline;		/* current input line at outer level */
+
+	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
+	int			encoding;		/* encoding being used now */
+	bool		safe_encoding;	/* is current encoding "safe"? */
+	const char *curline;		/* actual flex input string for cur buf */
+	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
+
+	/*
+	 * All this state lives across successive input lines, until explicitly
+	 * reset by psql_scan_reset.
+	 */
+	int			start_state;	/* saved YY_START */
+	int			paren_depth;	/* depth of nesting in parentheses */
+	int			xcdepth;		/* depth of nesting in slash-star comments */
+	char	   *dolqstart;		/* current $foo$ quote start string */
+} PsqlScanStateData;
+
+static PsqlScanState cur_state;	/* current state while active */
+
+static PQExpBuffer output_buf;	/* current output buffer */
+
+/* these variables do not need to be saved across calls */
+static enum slash_option_type option_type;
+static char *option_quote;
+static int	unquoted_option_chars;
+static int	backtick_start_offset;
+
+
+/* Return values from yylex() */
+#define LEXRES_EOL			0	/* end of input */
+#define LEXRES_SEMI			1	/* command-terminating semicolon found */
+#define LEXRES_BACKSLASH	2	/* backslash command start */
+#define LEXRES_OK			3	/* OK completion of backslash argument */
+
+
+static void evaluate_backtick(void);
+static void push_new_buffer(const char *newstr, const char *varname);
+static void pop_buffer_stack(PsqlScanState state);
+static bool var_is_current_source(PsqlScanState state, const char *varname);
+static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
+									  char **txtcopy);
+static void emit(const char *txt, int len);
+static char *extract_substring(const char *txt, int len);
+static void escape_variable(bool as_ident);
+
+#define ECHO emit(yytext, yyleng)
+
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/*
+ * OK, here is a short description of lex/flex rules behavior.
+ * The longest pattern which matches an input string is always chosen.
+ * For equal-length patterns, the first occurring in the rules list is chosen.
+ * INITIAL is the starting state, to which all non-conditional rules apply.
+ * Exclusive states change parsing rules while the state is active.  When in
+ * an exclusive state, only those rules defined for that state apply.
+ *
+ * We use exclusive states for quoted strings, extended comments,
+ * and to eliminate parsing troubles for numeric strings.
+ * Exclusive states:
+ *  <xb> bit string literal
+ *  <xc> extended C-style comments
+ *  <xd> delimited identifiers (double-quoted identifiers)
+ *  <xh> hexadecimal numeric string
+ *  <xq> standard quoted strings
+ *  <xe> extended quoted strings (support backslash escape sequences)
+ *  <xdolq> $foo$ quoted strings
+ *  <xui> quoted identifier with Unicode escapes
+ *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
+ *  <xus> quoted string with Unicode escapes
+ *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
+ *
+ * Note: we intentionally don't mimic the backend's <xeu> state; we have
+ * no need to distinguish it from <xe> state, and no good way to get out
+ * of it in error cases.  The backend just throws yyerror() in those
+ * cases, but that's not an option here.
+ */
+
+%x xb
+%x xc
+%x xd
+%x xh
+%x xe
+%x xq
+%x xdolq
+%x xui
+%x xuiend
+%x xus
+%x xusend
+/* Additional exclusive states for psql only: lex backslash commands */
+%x xslashcmd
+%x xslashargstart
+%x xslasharg
+%x xslashquote
+%x xslashbackquote
+%x xslashdquote
+%x xslashwholeline
+%x xslashend
+
+/*
+ * In order to make the world safe for Windows and Mac clients as well as
+ * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
+ * sequence will be seen as two successive newlines, but that doesn't cause
+ * any problems.  Comments that start with -- and extend to the next
+ * newline are treated as equivalent to a single whitespace character.
+ *
+ * NOTE a fine point: if there is no newline following --, we will absorb
+ * everything to the end of the input as a comment.  This is correct.  Older
+ * versions of Postgres failed to recognize -- as a comment if the input
+ * did not end with a newline.
+ *
+ * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ *
+ * XXX if you change the set of whitespace characters, fix scanner_isspace()
+ * to agree, and see also the plpgsql lexer.
+ */
+
+space			[ \t\n\r\f]
+horiz_space		[ \t\f]
+newline			[\n\r]
+non_newline		[^\n\r]
+
+comment			("--"{non_newline}*)
+
+whitespace		({space}+|{comment})
+
+/*
+ * SQL requires at least one newline in the whitespace separating
+ * string literals that are to be concatenated.  Silly, but who are we
+ * to argue?  Note that {whitespace_with_newline} should not have * after
+ * it, whereas {whitespace} should generally have a * after it...
+ */
+
+special_whitespace		({space}+|{comment}{newline})
+horiz_whitespace		({horiz_space}|{comment})
+whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+
+/*
+ * To ensure that {quotecontinue} can be scanned without having to back up
+ * if the full pattern isn't matched, we include trailing whitespace in
+ * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
+ * except for {quote} followed by whitespace and just one "-" (not two,
+ * which would start a {comment}).  To cover that we have {quotefail}.
+ * The actions for {quotestop} and {quotefail} must throw back characters
+ * beyond the quote proper.
+ */
+quote			'
+quotestop		{quote}{whitespace}*
+quotecontinue	{quote}{whitespace_with_newline}{quote}
+quotefail		{quote}{whitespace}*"-"
+
+/* Bit string
+ * It is tempting to scan the string for only those characters
+ * which are allowed. However, this leads to silently swallowed
+ * characters if illegal characters are included in the string.
+ * For example, if xbinside is [01] then B'ABCD' is interpreted
+ * as a zero-length string, and the ABCD' is lost!
+ * Better to pass the string forward and let the input routines
+ * validate the contents.
+ */
+xbstart			[bB]{quote}
+xbinside		[^']*
+
+/* Hexadecimal number */
+xhstart			[xX]{quote}
+xhinside		[^']*
+
+/* National character */
+xnstart			[nN]{quote}
+
+/* Quoted string that allows backslash escapes */
+xestart			[eE]{quote}
+xeinside		[^\\']+
+xeescape		[\\][^0-7]
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
+xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqstart			{quote}
+xqdouble		{quote}{quote}
+xqinside		[^']+
+
+/* $foo$ style quotes ("dollar quoting")
+ * The quoted string starts with $foo$ where "foo" is an optional string
+ * in the form of an identifier, except that it may not contain "$",
+ * and extends to the first occurrence of an identical string.
+ * There is *no* processing of the quoted text.
+ *
+ * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
+ * fails to match its trailing "$".
+ */
+dolq_start		[A-Za-z\200-\377_]
+dolq_cont		[A-Za-z\200-\377_0-9]
+dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
+dolqfailed		\${dolq_start}{dolq_cont}*
+dolqinside		[^$]+
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+xdstart			{dquote}
+xdstop			{dquote}
+xddouble		{dquote}{dquote}
+xdinside		[^"]+
+
+/* Unicode escapes */
+uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
+/* error rule to avoid backup */
+uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
+
+/* Quoted identifier with Unicode escapes */
+xuistart		[uU]&{dquote}
+
+/* Quoted string with Unicode escapes */
+xusstart		[uU]&{quote}
+
+/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
+xustop1		{uescapefail}?
+xustop2		{uescape}
+
+/* error rule to avoid backup */
+xufailed		[uU]&
+
+
+/* C-style comments
+ *
+ * The "extended comment" syntax closely resembles allowable operator syntax.
+ * The tricky part here is to get lex to recognize a string starting with
+ * slash-star as a comment, when interpreting it as an operator would produce
+ * a longer match --- remember lex will prefer a longer match!  Also, if we
+ * have something like plus-slash-star, lex will think this is a 3-character
+ * operator whereas we want to see it as a + operator and a comment start.
+ * The solution is two-fold:
+ * 1. append {op_chars}* to xcstart so that it matches as much text as
+ *    {operator} would. Then the tie-breaker (first matching rule of same
+ *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
+ *    in case it contains a star-slash that should terminate the comment.
+ * 2. In the operator rule, check for slash-star within the operator, and
+ *    if found throw it back with yyless().  This handles the plus-slash-star
+ *    problem.
+ * Dash-dash comments have similar interactions with the operator rule.
+ */
+xcstart			\/\*{op_chars}*
+xcstop			\*+\/
+xcinside		[^*/]+
+
+digit			[0-9]
+ident_start		[A-Za-z\200-\377_]
+ident_cont		[A-Za-z\200-\377_0-9\$]
+
+identifier		{ident_start}{ident_cont}*
+
+/* Assorted special-case operators and operator-like tokens */
+typecast		"::"
+dot_dot			\.\.
+colon_equals	":="
+equals_greater	"=>"
+less_equals		"<="
+greater_equals	">="
+less_greater	"<>"
+not_equals		"!="
+
+/*
+ * "self" is the set of chars that should be returned as single-character
+ * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
+ * which can be one or more characters long (but if a single-char token
+ * appears in the "self" set, it is not to be returned as an Op).  Note
+ * that the sets overlap, but each has some chars that are not in the other.
+ *
+ * If you change either set, adjust the character lists appearing in the
+ * rule for "operator"!
+ */
+self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
+operator		{op_chars}+
+
+/* we no longer allow unary minus in numbers.
+ * instead we pass it separately to parser. there it gets
+ * coerced via doNegate() -- Leon aug 20 1999
+ *
+ * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
+ *
+ * {realfail1} and {realfail2} are added to prevent the need for scanner
+ * backup when the {real} rule fails to match completely.
+ */
+
+integer			{digit}+
+decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+decimalfail		{digit}+\.\.
+real			({integer}|{decimal})[Ee][-+]?{digit}+
+realfail1		({integer}|{decimal})[Ee]
+realfail2		({integer}|{decimal})[Ee][-+]
+
+param			\${integer}
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+/*
+ * Dollar quoted strings are totally opaque, and no escaping is done on them.
+ * Other quoted strings must allow some special characters such as single-quote
+ *  and newline.
+ * Embedded single-quotes are implemented both in the SQL standard
+ *  style of two adjacent single quotes "''" and in the Postgres/Java style
+ *  of escaped-quote "\'".
+ * Other embedded escaped characters are matched explicitly and the leading
+ *  backslash is dropped from the string.
+ * Note that xcstart must appear before operator, as explained above!
+ *  Also whitespace (comment) must appear before operator.
+ */
+
+%%
+
+{whitespace}	{
+					/*
+					 * Note that the whitespace rule includes both true
+					 * whitespace and single-line ("--" style) comments.
+					 * We suppress whitespace at the start of the query
+					 * buffer.  We also suppress all single-line comments,
+					 * which is pretty dubious but is the historical
+					 * behavior.
+					 */
+					if (!(output_buf->len == 0 || yytext[0] == '-'))
+						ECHO;
+				}
+
+{xcstart}		{
+					cur_state->xcdepth = 0;
+					BEGIN(xc);
+					/* Put back any characters past slash-star; see above */
+					yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstart}	{
+					cur_state->xcdepth++;
+					/* Put back any characters past slash-star; see above */
+					yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstop}	{
+					if (cur_state->xcdepth <= 0)
+					{
+						BEGIN(INITIAL);
+					}
+					else
+						cur_state->xcdepth--;
+					ECHO;
+				}
+
+<xc>{xcinside}	{
+					ECHO;
+				}
+
+<xc>{op_chars}	{
+					ECHO;
+				}
+
+<xc>\*+			{
+					ECHO;
+				}
+
+{xbstart}		{
+					BEGIN(xb);
+					ECHO;
+				}
+<xb>{quotestop}	|
+<xb>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xh>{xhinside}	|
+<xb>{xbinside}	{
+					ECHO;
+				}
+<xh>{quotecontinue}	|
+<xb>{quotecontinue}	{
+					ECHO;
+				}
+
+{xhstart}		{
+					/* Hexadecimal bit type.
+					 * At some point we should simply pass the string
+					 * forward to the parser and label it there.
+					 * In the meantime, place a leading "x" on the string
+					 * to mark it for the input routine as a hex string.
+					 */
+					BEGIN(xh);
+					ECHO;
+				}
+<xh>{quotestop}	|
+<xh>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+
+{xnstart}		{
+					yyless(1);				/* eat only 'n' this time */
+					ECHO;
+				}
+
+{xqstart}		{
+					if (standard_strings())
+						BEGIN(xq);
+					else
+						BEGIN(xe);
+					ECHO;
+				}
+{xestart}		{
+					BEGIN(xe);
+					ECHO;
+				}
+{xusstart}		{
+					BEGIN(xus);
+					ECHO;
+				}
+<xq,xe>{quotestop}	|
+<xq,xe>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xus>{quotestop} |
+<xus>{quotefail} {
+					yyless(1);
+					BEGIN(xusend);
+					ECHO;
+				}
+<xusend>{whitespace} {
+					ECHO;
+				}
+<xusend>{other} |
+<xusend>{xustop1} {
+					yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xusend>{xustop2} {
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xq,xe,xus>{xqdouble} {
+					ECHO;
+				}
+<xq,xus>{xqinside}  {
+					ECHO;
+				}
+<xe>{xeinside}  {
+					ECHO;
+				}
+<xe>{xeunicode} {
+					ECHO;
+				}
+<xe>{xeunicodefail}	{
+					ECHO;
+				}
+<xe>{xeescape}  {
+					ECHO;
+				}
+<xe>{xeoctesc}  {
+					ECHO;
+				}
+<xe>{xehexesc}  {
+					ECHO;
+				}
+<xq,xe,xus>{quotecontinue} {
+					ECHO;
+				}
+<xe>.			{
+					/* This is only needed for \ just before EOF */
+					ECHO;
+				}
+
+{dolqdelim}		{
+					cur_state->dolqstart = pg_strdup(yytext);
+					BEGIN(xdolq);
+					ECHO;
+				}
+{dolqfailed}	{
+					/* throw back all but the initial "$" */
+					yyless(1);
+					ECHO;
+				}
+<xdolq>{dolqdelim} {
+					if (strcmp(yytext, cur_state->dolqstart) == 0)
+					{
+						free(cur_state->dolqstart);
+						cur_state->dolqstart = NULL;
+						BEGIN(INITIAL);
+					}
+					else
+					{
+						/*
+						 * When we fail to match $...$ to dolqstart, transfer
+						 * the $... part to the output, but put back the final
+						 * $ for rescanning.  Consider $delim$...$junk$delim$
+						 */
+						yyless(yyleng-1);
+					}
+					ECHO;
+				}
+<xdolq>{dolqinside} {
+					ECHO;
+				}
+<xdolq>{dolqfailed} {
+					ECHO;
+				}
+<xdolq>.		{
+					/* This is only needed for $ inside the quoted text */
+					ECHO;
+				}
+
+{xdstart}		{
+					BEGIN(xd);
+					ECHO;
+				}
+{xuistart}		{
+					BEGIN(xui);
+					ECHO;
+				}
+<xd>{xdstop}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xui>{dquote} {
+					yyless(1);
+					BEGIN(xuiend);
+					ECHO;
+				}
+<xuiend>{whitespace} {
+					ECHO;
+				}
+<xuiend>{other} |
+<xuiend>{xustop1} {
+					yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xuiend>{xustop2}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xd,xui>{xddouble}	{
+					ECHO;
+				}
+<xd,xui>{xdinside}	{
+					ECHO;
+				}
+
+{xufailed}	{
+					/* throw back all but the initial u/U */
+					yyless(1);
+					ECHO;
+				}
+
+{typecast}		{
+					ECHO;
+				}
+
+{dot_dot}		{
+					ECHO;
+				}
+
+{colon_equals}	{
+					ECHO;
+				}
+
+{equals_greater} {
+					ECHO;
+				}
+
+{less_equals}	{
+					ECHO;
+				}
+
+{greater_equals} {
+					ECHO;
+				}
+
+{less_greater}	{
+					ECHO;
+				}
+
+{not_equals}	{
+					ECHO;
+				}
+
+	/*
+	 * These rules are specific to psql --- they implement parenthesis
+	 * counting and detection of command-ending semicolon.  These must
+	 * appear before the {self} rule so that they take precedence over it.
+	 */
+
+"("				{
+					cur_state->paren_depth++;
+					ECHO;
+				}
+
+")"				{
+					if (cur_state->paren_depth > 0)
+						cur_state->paren_depth--;
+					ECHO;
+				}
+
+";"				{
+					ECHO;
+					if (cur_state->paren_depth == 0)
+					{
+						/* Terminate lexing temporarily */
+						return LEXRES_SEMI;
+					}
+				}
+
+	/*
+	 * psql-specific rules to handle backslash commands and variable
+	 * substitution.  We want these before {self}, also.
+	 */
+
+"\\"[;:]		{
+					/* Force a semicolon or colon into the query buffer */
+					emit(yytext + 1, 1);
+				}
+
+"\\"			{
+					/* Terminate lexing temporarily */
+					return LEXRES_BACKSLASH;
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					char   *varname = NULL;
+					const char *value = NULL;
+
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
+
+					if (value)
+					{
+						/* It is a variable, check for recursion */
+						if (var_is_current_source(cur_state, varname))
+						{
+							/* Recursive expansion --- don't go there */
+							psql_error("skipping recursive expansion of variable \"%s\"\n",
+									   varname);
+							/* Instead copy the string as is */
+							ECHO;
+						}
+						else
+						{
+							/* OK, perform substitution */
+							push_new_buffer(value, varname);
+							/* yy_scan_string already made buffer active */
+						}
+					}
+					else
+					{
+						/*
+						 * if the variable doesn't exist we'll copy the
+						 * string as is
+						 */
+						ECHO;
+					}
+
+					if (varname)
+						free(varname);
+				}
+
+:'{variable_char}+'	{
+					escape_variable(false);
+				}
+
+:\"{variable_char}+\"	{
+					escape_variable(true);
+				}
+
+	/*
+	 * These rules just avoid the need for scanner backup if one of the
+	 * two rules above fails to match completely.
+	 */
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					ECHO;
+				}
+
+	/*
+	 * Back to backend-compatible rules.
+	 */
+
+{self}			{
+					ECHO;
+				}
+
+{operator}		{
+					/*
+					 * Check for embedded slash-star or dash-dash; those
+					 * are comment starts, so operator must stop there.
+					 * Note that slash-star or dash-dash at the first
+					 * character will match a prior rule, not this one.
+					 */
+					int		nchars = yyleng;
+					char   *slashstar = strstr(yytext, "/*");
+					char   *dashdash = strstr(yytext, "--");
+
+					if (slashstar && dashdash)
+					{
+						/* if both appear, take the first one */
+						if (slashstar > dashdash)
+							slashstar = dashdash;
+					}
+					else if (!slashstar)
+						slashstar = dashdash;
+					if (slashstar)
+						nchars = slashstar - yytext;
+
+					/*
+					 * For SQL compatibility, '+' and '-' cannot be the
+					 * last char of a multi-char operator unless the operator
+					 * contains chars that are not in SQL operators.
+					 * The idea is to lex '=-' as two operators, but not
+					 * to forbid operator names like '?-' that could not be
+					 * sequences of SQL operators.
+					 */
+					while (nchars > 1 &&
+						   (yytext[nchars-1] == '+' ||
+							yytext[nchars-1] == '-'))
+					{
+						int		ic;
+
+						for (ic = nchars-2; ic >= 0; ic--)
+						{
+							if (strchr("~!@#^&|`?%", yytext[ic]))
+								break;
+						}
+						if (ic >= 0)
+							break; /* found a char that makes it OK */
+						nchars--; /* else remove the +/-, and check again */
+					}
+
+					if (nchars < yyleng)
+					{
+						/* Strip the unwanted chars from the token */
+						yyless(nchars);
+					}
+					ECHO;
+				}
+
+{param}			{
+					ECHO;
+				}
+
+{integer}		{
+					ECHO;
+				}
+{decimal}		{
+					ECHO;
+				}
+{decimalfail}	{
+					/* throw back the .., and treat as integer */
+					yyless(yyleng-2);
+					ECHO;
+				}
+{real}			{
+					ECHO;
+				}
+{realfail1}		{
+					/*
+					 * throw back the [Ee], and treat as {decimal}.  Note
+					 * that it is possible the input is actually {integer},
+					 * but since this case will almost certainly lead to a
+					 * syntax error anyway, we don't bother to distinguish.
+					 */
+					yyless(yyleng-1);
+					ECHO;
+				}
+{realfail2}		{
+					/* throw back the [Ee][+-], and proceed as above */
+					yyless(yyleng-2);
+					ECHO;
+				}
+
+
+{identifier}	{
+					ECHO;
+				}
+
+{other}			{
+					ECHO;
+				}
+
+
+	/*
+	 * Everything from here down is psql-specific.
+	 */
+
+<<EOF>>			{
+					StackElem  *stackelem = cur_state->buffer_stack;
+
+					if (stackelem == NULL)
+						return LEXRES_EOL; /* end of input reached */
+
+					/*
+					 * We were expanding a variable, so pop the inclusion
+					 * stack and keep lexing
+					 */
+					pop_buffer_stack(cur_state);
+
+					stackelem = cur_state->buffer_stack;
+					if (stackelem != NULL)
+					{
+						yy_switch_to_buffer(stackelem->buf);
+						cur_state->curline = stackelem->bufstring;
+						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
+					}
+					else
+					{
+						yy_switch_to_buffer(cur_state->scanbufhandle);
+						cur_state->curline = cur_state->scanbuf;
+						cur_state->refline = cur_state->scanline;
+					}
+				}
+
+	/*
+	 * Exclusive lexer states to handle backslash command lexing
+	 */
+
+<xslashcmd>{
+	/* command name ends at whitespace or backslash; eat all else */
+
+{space}|"\\"	{
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashargstart>{
+	/*
+	 * Discard any whitespace before argument, then go to xslasharg state.
+	 * An exception is that "|" is only special at start of argument, so we
+	 * check for it here.
+	 */
+
+{space}+		{ }
+
+"|"				{
+					if (option_type == OT_FILEPIPE)
+					{
+						/* treat like whole-string case */
+						ECHO;
+						BEGIN(xslashwholeline);
+					}
+					else
+					{
+						/* vertical bar is not special otherwise */
+						yyless(0);
+						BEGIN(xslasharg);
+					}
+				}
+
+{other}			{
+					yyless(0);
+					BEGIN(xslasharg);
+				}
+
+}
+
+<xslasharg>{
+	/*
+	 * Default processing of text in a slash command's argument.
+	 *
+	 * Note: unquoted_option_chars counts the number of characters at the
+	 * end of the argument that were not subject to any form of quoting.
+	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
+	 */
+
+{space}|"\\"	{
+					/*
+					 * Unquoted space is end of arg; do not eat.  Likewise
+					 * backslash is end of command or next command, do not eat
+					 *
+					 * XXX this means we can't conveniently accept options
+					 * that include unquoted backslashes; therefore, option
+					 * processing that encourages use of backslashes is rather
+					 * broken.
+					 */
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+{quote}			{
+					*option_quote = '\'';
+					unquoted_option_chars = 0;
+					BEGIN(xslashquote);
+				}
+
+"`"				{
+					backtick_start_offset = output_buf->len;
+					*option_quote = '`';
+					unquoted_option_chars = 0;
+					BEGIN(xslashbackquote);
+				}
+
+{dquote}		{
+					ECHO;
+					*option_quote = '"';
+					unquoted_option_chars = 0;
+					BEGIN(xslashdquote);
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						char   *varname;
+						const char *value = NULL;
+
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
+
+						/*
+						 * The variable value is just emitted without any
+						 * further examination.  This is consistent with the
+						 * pre-8.0 code behavior, if not with the way that
+						 * variables are handled outside backslash commands.
+						 * Note that we needn't guard against recursion here.
+						 */
+						if (value)
+							appendPQExpBufferStr(output_buf, value);
+						else
+							ECHO;
+
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}+'	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(false);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+
+:\"{variable_char}+\"	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(true);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+{other}			{
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+}
+
+<xslashquote>{
+	/*
+	 * single-quoted text: copy literally except for '' and backslash
+	 * sequences
+	 */
+
+{quote}			{ BEGIN(xslasharg); }
+
+{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
+
+"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
+"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
+"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
+"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
+"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
+
+{xeoctesc}		{
+					/* octal case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 1, NULL, 8));
+				}
+
+{xehexesc}		{
+					/* hex case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 2, NULL, 16));
+				}
+
+"\\".			{ emit(yytext + 1, 1); }
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashbackquote>{
+	/*
+	 * backticked text: copy everything until next backquote, then evaluate.
+	 *
+	 * XXX Possible future behavioral change: substitute for :VARIABLE?
+	 */
+
+"`"				{
+					/* In NO_EVAL mode, don't evaluate the command */
+					if (option_type != OT_NO_EVAL)
+						evaluate_backtick();
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashdquote>{
+	/* double-quoted text: copy verbatim, including the double quotes */
+
+{dquote}		{
+					ECHO;
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashwholeline>{
+	/* copy everything until end of input line */
+	/* but suppress leading whitespace */
+
+{space}+		{
+					if (output_buf->len > 0)
+						ECHO;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashend>{
+	/* at end of command, eat a double backslash, but not anything else */
+
+"\\\\"			{ return LEXRES_OK; }
+
+{other}|\n		{
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+}
+
+%%
+
+/*
+ * Create a lexer working state struct.
+ */
+PsqlScanState
+psql_scan_create(void)
+{
+	PsqlScanState state;
+
+	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
+
+	psql_scan_reset(state);
+
+	return state;
+}
+
+/*
+ * Destroy a lexer working state struct, releasing all resources.
+ */
+void
+psql_scan_destroy(PsqlScanState state)
+{
+	psql_scan_finish(state);
+
+	psql_scan_reset(state);
+
+	free(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+void
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* Do we need to hack the character set encoding? */
+	state->encoding = encoding;
+	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+
+	state->vars = vars;
+	state->db = db;
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Do lexical analysis of SQL command text.
+ *
+ * The text previously passed to psql_scan_setup is scanned, and appended
+ * (possibly with transformation) to query_buf.
+ *
+ * The return value indicates the condition that stopped scanning:
+ *
+ * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
+ * transferred to query_buf.)  The command accumulated in query_buf should
+ * be executed, then clear query_buf and call again to scan the remainder
+ * of the line.
+ *
+ * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
+ * Any previous data on the line has been transferred to query_buf.
+ * The caller will typically next call psql_scan_slash_command(),
+ * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
+ *
+ * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
+ * incomplete SQL command.  *prompt is set to the appropriate prompt type.
+ *
+ * PSCAN_EOL: the end of the line was reached, and there is no lexical
+ * reason to consider the command incomplete.  The caller may or may not
+ * choose to send it.  *prompt is set to the appropriate prompt type if
+ * the caller chooses to collect more input.
+ *
+ * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
+ * be called next, then the cycle may be repeated with a fresh input line.
+ *
+ * In all cases, *prompt is set to an appropriate prompt type code for the
+ * next line-input operation.
+ */
+PsqlScanResult
+psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt)
+{
+	PsqlScanResult result;
+	int			lexresult;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = query_buf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(state->start_state);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/* Update static vars back to the state struct */
+	state->start_state = YY_START;
+
+	/*
+	 * Check termination state and return appropriate result info.
+	 */
+	switch (lexresult)
+	{
+		case LEXRES_EOL:		/* end of input */
+			switch (state->start_state)
+			{
+				/* This switch must cover all non-slash-command states. */
+				case INITIAL:
+				case xuiend:	/* we treat these like INITIAL */
+				case xusend:
+					if (state->paren_depth > 0)
+					{
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_PAREN;
+					}
+					else if (query_buf->len > 0)
+					{
+						result = PSCAN_EOL;
+						*prompt = PROMPT_CONTINUE;
+					}
+					else
+					{
+						/* never bother to send an empty buffer */
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_READY;
+					}
+					break;
+				case xb:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xc:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_COMMENT;
+					break;
+				case xd:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xh:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xe:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xdolq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOLLARQUOTE;
+					break;
+				case xui:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xus:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				default:
+					/* can't get here */
+					fprintf(stderr, "invalid YY_START\n");
+					exit(1);
+			}
+			break;
+		case LEXRES_SEMI:		/* semicolon */
+			result = PSCAN_SEMICOLON;
+			*prompt = PROMPT_READY;
+			break;
+		case LEXRES_BACKSLASH:	/* backslash */
+			result = PSCAN_BACKSLASH;
+			*prompt = PROMPT_READY;
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid yylex result\n");
+			exit(1);
+	}
+
+	return result;
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yy_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+void
+psql_scan_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
+/*
+ * Return true if lexer is currently in an "inside quotes" state.
+ *
+ * This is pretty grotty but is needed to preserve the old behavior
+ * that mainloop.c drops blank lines not inside quotes without even
+ * echoing them.
+ */
+bool
+psql_scan_in_quote(PsqlScanState state)
+{
+	return state->start_state != INITIAL;
+}
+
+/*
+ * Scan the command name of a psql backslash command.  This should be called
+ * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
+ * has been consumed through the leading backslash.
+ *
+ * The return value is a malloc'd copy of the command name, as parsed off
+ * from the input.
+ */
+char *
+psql_scan_slash_command(PsqlScanState state)
+{
+	PQExpBufferData mybuf;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashcmd);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+
+	return mybuf.data;
+}
+
+/*
+ * Parse off the next argument for a backslash command, and return it as a
+ * malloc'd string.  If there are no more arguments, returns NULL.
+ *
+ * type tells what processing, if any, to perform on the option string;
+ * for example, if it's a SQL identifier, we want to downcase any unquoted
+ * letters.
+ *
+ * if quote is not NULL, *quote is set to 0 if no quoting was found, else
+ * the last quote symbol used in the argument.
+ *
+ * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
+ * be taken as part of the option string will be stripped.
+ *
+ * NOTE: the only possible syntax errors for backslash options are unmatched
+ * quotes, which are detected when we run out of input.  Therefore, on a
+ * syntax error we just throw away the string and return NULL; there is no
+ * need to worry about flushing remaining input.
+ */
+char *
+psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon)
+{
+	PQExpBufferData mybuf;
+	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
+	char		local_quote;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	if (quote == NULL)
+		quote = &local_quote;
+	*quote = 0;
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+	option_type = type;
+	option_quote = quote;
+	unquoted_option_chars = 0;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	if (type == OT_WHOLE_LINE)
+		BEGIN(xslashwholeline);
+	else
+		BEGIN(xslashargstart);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/*
+	 * Check the lex result: we should have gotten back either LEXRES_OK
+	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
+	 * a quoted string, as indicated by YY_START, EOL is an error.
+	 */
+	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
+
+	switch (YY_START)
+	{
+		case xslashargstart:
+			/* empty arg */
+			break;
+		case xslasharg:
+			/* Strip any unquoted trailing semi-colons if requested */
+			if (semicolon)
+			{
+				while (unquoted_option_chars-- > 0 &&
+					   mybuf.len > 0 &&
+					   mybuf.data[mybuf.len - 1] == ';')
+				{
+					mybuf.data[--mybuf.len] = '\0';
+				}
+			}
+
+			/*
+			 * If SQL identifier processing was requested, then we strip out
+			 * excess double quotes and downcase unquoted letters.
+			 * Doubled double-quotes become output double-quotes, per spec.
+			 *
+			 * Note that a string like FOO"BAR"BAZ will be converted to
+			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
+			 * which would have us parse it as several identifiers.  But
+			 * for psql's purposes, we want a string like "foo"."bar" to
+			 * be treated as one option, so there's little choice.
+			 */
+			if (type == OT_SQLID || type == OT_SQLIDHACK)
+			{
+				bool		inquotes = false;
+				char	   *cp = mybuf.data;
+
+				while (*cp)
+				{
+					if (*cp == '"')
+					{
+						if (inquotes && cp[1] == '"')
+						{
+							/* Keep the first quote, remove the second */
+							cp++;
+						}
+						inquotes = !inquotes;
+						/* Collapse out quote at *cp */
+						memmove(cp, cp + 1, strlen(cp));
+						mybuf.len--;
+						/* do not advance cp */
+					}
+					else
+					{
+						if (!inquotes && type == OT_SQLID)
+							*cp = pg_tolower((unsigned char) *cp);
+						cp += PQmblen(cp, cur_state->encoding);
+					}
+				}
+			}
+			break;
+		case xslashquote:
+		case xslashbackquote:
+		case xslashdquote:
+			/* must have hit EOL inside quotes */
+			psql_error("unterminated quoted string\n");
+			termPQExpBuffer(&mybuf);
+			return NULL;
+		case xslashwholeline:
+			/* always okay */
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid YY_START\n");
+			exit(1);
+	}
+
+	/*
+	 * An unquoted empty argument isn't possible unless we are at end of
+	 * command.  Return NULL instead.
+	 */
+	if (mybuf.len == 0 && *quote == 0)
+	{
+		termPQExpBuffer(&mybuf);
+		return NULL;
+	}
+
+	/* Else return the completed string. */
+	return mybuf.data;
+}
+
+/*
+ * Eat up any unused \\ to complete a backslash command.
+ */
+void
+psql_scan_slash_command_end(PsqlScanState state)
+{
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = NULL;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashend);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+}
+
+/*
+ * Evaluate a backticked substring of a slash command's argument.
+ *
+ * The portion of output_buf starting at backtick_start_offset is evaluated
+ * as a shell command and then replaced by the command's output.
+ */
+static void
+evaluate_backtick(void)
+{
+	char	   *cmd = output_buf->data + backtick_start_offset;
+	PQExpBufferData cmd_output;
+	FILE	   *fd;
+	bool		error = false;
+	char		buf[512];
+	size_t		result;
+
+	initPQExpBuffer(&cmd_output);
+
+	fd = popen(cmd, PG_BINARY_R);
+	if (!fd)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (!error)
+	{
+		do
+		{
+			result = fread(buf, 1, sizeof(buf), fd);
+			if (ferror(fd))
+			{
+				psql_error("%s: %s\n", cmd, strerror(errno));
+				error = true;
+				break;
+			}
+			appendBinaryPQExpBuffer(&cmd_output, buf, result);
+		} while (!feof(fd));
+	}
+
+	if (fd && pclose(fd) == -1)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (PQExpBufferDataBroken(cmd_output))
+	{
+		psql_error("%s: out of memory\n", cmd);
+		error = true;
+	}
+
+	/* Now done with cmd, delete it from output_buf */
+	output_buf->len = backtick_start_offset;
+	output_buf->data[output_buf->len] = '\0';
+
+	/* If no error, transfer result to output_buf */
+	if (!error)
+	{
+		/* strip any trailing newline */
+		if (cmd_output.len > 0 &&
+			cmd_output.data[cmd_output.len - 1] == '\n')
+			cmd_output.len--;
+		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
+	}
+
+	termPQExpBuffer(&cmd_output);
+}
+
+/*
+ * Push the given string onto the stack of stuff to scan.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+static void
+push_new_buffer(const char *newstr, const char *varname)
+{
+	StackElem  *stackelem;
+
+	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
+
+	/*
+	 * In current usage, the passed varname points at the current flex
+	 * input buffer; we must copy it before calling prepare_buffer()
+	 * because that will change the buffer state.
+	 */
+	stackelem->varname = varname ? pg_strdup(varname) : NULL;
+
+	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
+									&stackelem->bufstring);
+	cur_state->curline = stackelem->bufstring;
+	if (cur_state->safe_encoding)
+	{
+		stackelem->origstring = NULL;
+		cur_state->refline = stackelem->bufstring;
+	}
+	else
+	{
+		stackelem->origstring = pg_strdup(newstr);
+		cur_state->refline = stackelem->origstring;
+	}
+	stackelem->next = cur_state->buffer_stack;
+	cur_state->buffer_stack = stackelem;
+}
+
+/*
+ * Pop the topmost buffer stack item (there must be one!)
+ *
+ * NB: after this, the flex input state is unspecified; caller must
+ * switch to an appropriate buffer to continue lexing.
+ */
+static void
+pop_buffer_stack(PsqlScanState state)
+{
+	StackElem  *stackelem = state->buffer_stack;
+
+	state->buffer_stack = stackelem->next;
+	yy_delete_buffer(stackelem->buf);
+	free(stackelem->bufstring);
+	if (stackelem->origstring)
+		free(stackelem->origstring);
+	if (stackelem->varname)
+		free(stackelem->varname);
+	free(stackelem);
+}
+
+/*
+ * Check if specified variable name is the source for any string
+ * currently being scanned
+ */
+static bool
+var_is_current_source(PsqlScanState state, const char *varname)
+{
+	StackElem  *stackelem;
+
+	for (stackelem = state->buffer_stack;
+		 stackelem != NULL;
+		 stackelem = stackelem->next)
+	{
+		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Set up a flex input buffer to scan the given data.  We always make a
+ * copy of the data.  If working in an unsafe encoding, the copy has
+ * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+static YY_BUFFER_STATE
+prepare_buffer(const char *txt, int len, char **txtcopy)
+{
+	char	   *newtxt;
+
+	/* Flex wants two \0 characters after the actual data */
+	newtxt = pg_malloc(len + 2);
+	*txtcopy = newtxt;
+	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
+
+	if (cur_state->safe_encoding)
+		memcpy(newtxt, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		int		i = 0;
+
+		while (i < len)
+		{
+			int		thislen = PQmblen(txt + i, cur_state->encoding);
+
+			/* first byte should always be okay... */
+			newtxt[i] = txt[i];
+			i++;
+			while (--thislen > 0 && i < len)
+				newtxt[i++] = (char) 0xFF;
+		}
+	}
+
+	return yy_scan_buffer(newtxt, len + 2);
+}
+
+/*
+ * emit() --- body for ECHO macro
+ *
+ * NB: this must be used for ALL and ONLY the text copied from the flex
+ * input data.  If you pass it something that is not part of the yytext
+ * string, you are making a mistake.  Internally generated text can be
+ * appended directly to output_buf.
+ */
+static void
+emit(const char *txt, int len)
+{
+	if (cur_state->safe_encoding)
+		appendBinaryPQExpBuffer(output_buf, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			appendPQExpBufferChar(output_buf, ch);
+		}
+	}
+}
+
+/*
+ * extract_substring --- fetch the true value of (part of) the current token
+ *
+ * This is like emit(), except that the data is returned as a malloc'd string
+ * rather than being pushed directly to output_buf.
+ */
+static char *
+extract_substring(const char *txt, int len)
+{
+	char	   *result = (char *) pg_malloc(len + 1);
+
+	if (cur_state->safe_encoding)
+		memcpy(result, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			result[i] = ch;
+		}
+	}
+	result[len] = '\0';
+	return result;
+}
+
+/*
+ * escape_variable --- process :'VARIABLE' or :"VARIABLE"
+ *
+ * If the variable name is found, escape its value using the appropriate
+ * quoting method and emit the value to output_buf.  (Since the result is
+ * surely quoted, there is never any reason to rescan it.)  If we don't
+ * find the variable or the escaping function fails, emit the token as-is.
+ */
+static void
+escape_variable(bool as_ident)
+{
+	char	   *varname;
+	const char *value = NULL;
+
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
+
+	/* Escaping. */
+	if (value)
+	{
+		if (!cur_state->db)
+			psql_error("can't escape without active connection\n");
+		else
+		{
+			char   *escaped_value;
+
+			if (as_ident)
+				escaped_value =
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
+			else
+				escaped_value =
+					PQescapeLiteral(cur_state->db, value, strlen(value));
+
+			if (escaped_value == NULL)
+			{
+				const char *error = PQerrorMessage(cur_state->db);
+
+				psql_error("%s", error);
+			}
+			else
+			{
+				appendPQExpBufferStr(output_buf, escaped_value);
+				PQfreemem(escaped_value);
+				return;
+			}
+		}
+	}
+
+	/*
+	 * If we reach this point, some kind of error has occurred.  Emit the
+	 * original text into the output buffer.
+	 */
+	emit(yytext, yyleng);
+}
diff --git a/src/bin/pgbench/variables.c b/src/bin/pgbench/variables.c
new file mode 100644
index 0000000..ad27b51
--- /dev/null
+++ b/src/bin/pgbench/variables.c
@@ -0,0 +1,22 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/variables.c
+ */
+#include "postgres_fe.h"
+#include "variables.h"
+
+/*
+ * Dummy functions to compile psqlscan.l
+ *
+ * This function is needed to link with psqlscan.l but never called for
+ * pgbench.
+ */
+const char *
+GetVariable(VariableSpace space, const char *name)
+{
+	fprintf(stderr, "GetVariable is called. abort.\n");
+	exit(1);
+}
diff --git a/src/bin/pgbench/variables.h b/src/bin/pgbench/variables.h
new file mode 100644
index 0000000..2bfe557
--- /dev/null
+++ b/src/bin/pgbench/variables.h
@@ -0,0 +1,22 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/variables.h
+ */
+#ifndef VARIABLES_H
+#define VARIABLES_H
+
+#include "common.h"
+
+/*
+ * This file contains the minimal definitions for dummy function and type
+ * needed to link with psqlscan.l but never called for pgbench.
+ */
+struct _variable;
+typedef struct _variable *VariableSpace;
+
+const char *GetVariable(VariableSpace space, const char *name);
+
+#endif   /* VARIABLES_H */
-- 
1.8.3.1

hoge.sqltext/plain; charset=us-asciiDownload
#25Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Kyotaro HORIGUCHI (#24)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Kyotaro-san,

If you feel that this feature only deserves a lexer solution, then the
patch should be "returned with feedback".

It's unfortunate to abandon this idea so I tried this and made it run
with psql's parser. I think it works as expected.

Wow, you are much more courageous than I am!:-)

- 0001-Prepare-for-share-psqlscan-with-pgbench.patch
A patch to modify psql so that psqlscan can be shared with other modules.

- 0002-Make-use-of-psqlscan-in-pgbench.patch
A patch to use psqlscan in pgbench.

- hoge.sql
A sample custom script including multiline statements and line comments

I can't judge whether this is a new version of Fabien's patch
following Tom's suggestion or a brand-new one. Anyway, I'd like to
post it on this thread.

I think it is really a new patch, but posting it here seems logical because
that is where the discussion led.

- Backslash commands are handled the same as before: multiline
is not allowed.

Hmm... that is really the feature I wanted to add initially, too bad it is
the dropped one:-)

Suggestions? Opinions?

I have no idea how to deal with the copies of psqlscan.[lh] from
psql. Currently they are simply dead copies of psql's files.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

- Modifying psqlscan in psql requires considering how it is
used in pgbench.

Yep, that is one of the reasons why I did not want to go this way, bar my
natural laziness.

--
Fabien.


#26Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#25)
4 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hi, all.

Attached is the revised version of this patch.

The first patch is not changed from before.

The second fixes a bug.

The third is a new one that allows backslash continuation for
backslash commands.

hoge.sql is the test custom script.

======
At Fri, 24 Jul 2015 07:39:16 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1507240731050.12839@sto>

- Backslash commands are handled the same as before: multiline
is not allowed.

Hmm... that is really the feature I wanted to add initially, too bad
it is the dropped one:-)

Ouch. The story has been derailed somewhere.

Since SQL statements can span multiple lines without any particular
marker, we cannot implement multiline backslash commands in the
same way.
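
To illustrate why SQL statements need no continuation marker, here is a
minimal sketch, not taken from the patch. It assumes a hypothetical,
heavily simplified scanner (MiniScanState and mini_scan_line are made-up
names) that only tracks single-quoted strings and "--" comments; the real
psqlscan.l also handles dollar quoting, block comments, parentheses and
more. The statement ends at the first semicolon outside a quote, however
many lines that takes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, heavily simplified stand-in for psql_scan(): it only tracks
 * single-quoted strings and "--" comments.
 */
typedef struct
{
	bool		in_quote;		/* inside '...', carried across lines */
} MiniScanState;

/*
 * Scan one input line.  Returns true if an unquoted semicolon was seen,
 * i.e. the accumulated SQL statement is complete; false means the caller
 * should read another line and call again with the same state.
 */
static bool
mini_scan_line(MiniScanState *state, const char *line)
{
	const char *p;

	assert(state != NULL && line != NULL);

	for (p = line; *p != '\0'; p++)
	{
		if (state->in_quote)
		{
			if (*p == '\'')
				state->in_quote = false;
		}
		else if (*p == '\'')
			state->in_quote = true;
		else if (p[0] == '-' && p[1] == '-')
			break;				/* rest of the line is a comment */
		else if (*p == ';')
			return true;		/* statement terminator found */
	}
	return false;
}
```

A backslash command has no such terminator, which is why it needs either
the one-line rule or an explicit continuation character.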

The attached revised patch allows backslash continuation for
backslash commands. I suppose this behaves the same as what you
did. SQL statements can still be continued across lines as psql
does.

I'm not satisfied with the design but I don't see another way.
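
The backslash-continuation rule for backslash commands can be sketched as
follows. This is only an illustration under assumptions, not the code in
the attached patch: strip_continuation is a hypothetical helper that a
line-reading loop could call on each raw line before appending it to the
command accumulated so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: strip the line terminator
 * from "line" in place and report whether the line ended with a backslash
 * continuation.  If it did, the backslash is removed as well, so the caller
 * can simply append the next line to what it has accumulated so far.
 */
static bool
strip_continuation(char *line)
{
	size_t		len;

	assert(line != NULL);

	len = strlen(line);

	/* drop the line terminator, accepting both "\n" and "\r\n" */
	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
		line[--len] = '\0';

	if (len > 0 && line[len - 1] == '\\')
	{
		line[len - 1] = '\0';
		return true;			/* more input belongs to this command */
	}
	return false;				/* command is complete */
}
```

With this rule a script line such as "\setrandom aid 1 \" followed by
":naccounts" would be joined into a single meta command before parsing.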

I have no idea how to deal with the copies of psqlscan.[lh] from
psql. Currently they are simply dead copies of psql's files.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

Yeah, I think so, but symlinks could cause trouble with git and on
Windows. Another way would be to make copies of them from the psql
directory. They live next door to each other.

- Modifying psqlscan in psql requires considering how it is
used in pgbench.

Yep, that is one of the reasons why I did not want to go this way, bar
my natural laziness.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Prepare-for-share-psqlscan-with-pgbench.patchtext/x-patch; charset=us-asciiDownload
>From bc14230e36c5450af642bf2593598f1e5db8f62c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/3] Prepare for share psqlscan with pgbench.

psql_scan no longer accesses the pset struct directly and allows the
VariableSpace to be omitted.
---
 src/bin/psql/mainloop.c |  6 +++--
 src/bin/psql/psqlscan.h |  7 +++---
 src/bin/psql/psqlscan.l | 59 ++++++++++++++++++++++++++++---------------------
 src/bin/psql/startup.c  |  4 ++--
 4 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index b6cef94..e98cb94 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -233,7 +233,8 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		psql_scan_setup(scan_state, line, strlen(line),
+						pset.db, pset.vars, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +374,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									pset.db, pset.vars, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 55070ca..1b6361b 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,7 +11,7 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-
+#include "variables.h"
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
@@ -36,12 +36,11 @@ enum slash_option_type
 	OT_NO_EVAL					/* no expansion of backticks or variables */
 };
 
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index be059ab..08bd9d2 100644
--- a/src/bin/psql/psqlscan.l
+++ b/src/bin/psql/psqlscan.l
@@ -43,11 +43,6 @@
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
 /*
  * We use a stack of flex buffers to handle substitution of psql variables.
  * Each stacked buffer contains the as-yet-unread text from one psql variable.
@@ -81,10 +76,12 @@ typedef struct PsqlScanStateData
 	const char *scanline;		/* current input line at outer level */
 
 	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
 	int			encoding;		/* encoding being used now */
 	bool		safe_encoding;	/* is current encoding "safe"? */
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
@@ -736,11 +733,14 @@ other			.
 
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char   *varname = NULL;
+					const char *value = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
 
 					if (value)
 					{
@@ -769,7 +769,8 @@ other			.
 						ECHO;
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
 :'{variable_char}+'	{
@@ -1033,9 +1034,12 @@ other			.
 						char   *varname;
 						const char *value;
 
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
 
 						/*
 						 * The variable value is just emitted without any
@@ -1227,17 +1231,19 @@ psql_scan_destroy(PsqlScanState state)
  * or freed until after psql_scan_finish is called.
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
 
 	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
+	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
+	state->db = db;
+	state->vars = vars;
 	/* needed for prepare_buffer */
 	cur_state = state;
 
@@ -1615,7 +1621,7 @@ psql_scan_slash_option(PsqlScanState state,
 					{
 						if (!inquotes && type == OT_SQLID)
 							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
+						cp += PQmblen(cp, cur_state->encoding);
 					}
 				}
 			}
@@ -1944,15 +1950,18 @@ escape_variable(bool as_ident)
 	char	   *varname;
 	const char *value;
 
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
 
 	/* Escaping. */
 	if (value)
 	{
-		if (!pset.db)
+		if (!cur_state->db)
 			psql_error("can't escape without active connection\n");
 		else
 		{
@@ -1960,14 +1969,14 @@ escape_variable(bool as_ident)
 
 			if (as_ident)
 				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
 			else
 				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
+					PQescapeLiteral(cur_state->db, value, strlen(value));
 
 			if (escaped_value == NULL)
 			{
-				const char *error = PQerrorMessage(pset.db);
+				const char *error = PQerrorMessage(cur_state->db);
 
 				psql_error("%s", error);
 			}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 28ba75a..c143dfe 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -305,8 +305,8 @@ main(int argc, char *argv[])
 
 		scan_state = psql_scan_create();
 		psql_scan_setup(scan_state,
-						options.action_string,
-						strlen(options.action_string));
+						options.action_string, strlen(options.action_string),
+						pset.db, pset.vars, pset.encoding);
 
 		successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 			? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1

0002-Make-use-of-psqlscan-in-pgbench.patchtext/x-patch; charset=us-asciiDownload
>From f93355065725602a5e338743e322bdec2272f46c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Jul 2015 10:58:23 +0900
Subject: [PATCH 2/3] Make use of psqlscan in pgbench.

Make pgbench use psql's psqlscan as is. This allows SQL statements
in custom scripts to take the same form as in psql, e.g. multiline
statements or line comments. Backslash commands still cannot be split
across multiple lines, as before.
---
 src/bin/pgbench/Makefile    |    3 +-
 src/bin/pgbench/common.c    |   25 +
 src/bin/pgbench/common.h    |   17 +
 src/bin/pgbench/pgbench.c   |  348 ++++----
 src/bin/pgbench/prompt.h    |   25 +
 src/bin/pgbench/psqlscan.h  |   63 ++
 src/bin/pgbench/psqlscan.l  | 1997 +++++++++++++++++++++++++++++++++++++++++++
 src/bin/pgbench/variables.c |   22 +
 src/bin/pgbench/variables.h |   22 +
 9 files changed, 2371 insertions(+), 151 deletions(-)
 create mode 100644 src/bin/pgbench/common.c
 create mode 100644 src/bin/pgbench/common.h
 create mode 100644 src/bin/pgbench/prompt.h
 create mode 100644 src/bin/pgbench/psqlscan.h
 create mode 100644 src/bin/pgbench/psqlscan.l
 create mode 100644 src/bin/pgbench/variables.c
 create mode 100644 src/bin/pgbench/variables.h

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..9edae63 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -7,8 +7,7 @@ subdir = src/bin/pgbench
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = pgbench.o exprparse.o $(WIN32RES)
-
+OBJS = pgbench.o exprparse.o psqlscan.o common.o variables.o $(WIN32RES)
 override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
diff --git a/src/bin/pgbench/common.c b/src/bin/pgbench/common.c
new file mode 100644
index 0000000..16eb7b6
--- /dev/null
+++ b/src/bin/pgbench/common.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/common.c
+ */
+
+#include "common.h"
+
+void
+psql_error(const char *fmt,...)
+{
+	fprintf(stderr, "psql_error is called. abort.\n");
+	exit(1);
+}
+
+/*
+ * In psql this function returns the setting of standard_conforming_strings,
+ * but pgbench has no connection while parsing scripts, so always return
+ * true, which is the default for 9.1 and later.
+ */
+bool
+standard_strings(void)
+{
+	return true;
+}
diff --git a/src/bin/pgbench/common.h b/src/bin/pgbench/common.h
new file mode 100644
index 0000000..80c6ac2
--- /dev/null
+++ b/src/bin/pgbench/common.h
@@ -0,0 +1,17 @@
+/*
+ * dummy functions for pgbench to use psqlscan
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/common.h
+ */
+#ifndef COMMON_H
+#define COMMON_H
+
+#include "postgres_fe.h"
+#include "libpq-fe.h"
+
+extern void psql_error(const char *fmt,...) pg_attribute_printf(1, 2);
+extern bool standard_strings(void);
+
+#endif   /* COMMON_H */
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index e839fa3..32a5094 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,7 +54,7 @@
 #endif
 
 #include "pgbench.h"
-
+#include "psqlscan.h"
 /*
  * Multi-platform pthread implementations
  */
@@ -2221,9 +2221,9 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+/* Parse a backslash command; return a Command struct  */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(char *buf, const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
@@ -2231,207 +2231,235 @@ process_commands(char *buf, const char *source, const int lineno)
 	int			j;
 	char	   *p,
 			   *tok;
+	int			max_args = -1;
 
 	/* Make the string buf end at the next newline */
 	if ((p = strchr(buf, '\n')) != NULL)
 		*p = '\0';
 
-	/* Skip leading whitespace */
-	p = buf;
+	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
-		return NULL;
+	/* p now points at the first non-blank character */
+
+	if (*p != '\\')
+	{
+		fprintf(stderr, "Invalid backslash found: %s:%d\n", source, lineno);
+		exit(1);
+	}
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
+	j = 0;
+	tok = strtok(++p, delim);
 
-		my_commands->type = META_COMMAND;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*
+		 * parsing: \setrandom variable min max [uniform] \setrandom
+		 * variable min max (gaussian|exponential) threshold
+		 */
 
-		while (tok != NULL)
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*
-			 * parsing: \setrandom variable min max [uniform] \setrandom
-			 * variable min max (gaussian|exponential) threshold
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
+							 "missing threshold argument", my_commands->argv[4], -1);
 			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
-			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-					 "missing threshold argument", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
-			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.  We
+		 * don't have to terminate the number argument with null because it
+		 * will be parsed with atoi, which ignores trailing non-digit
+		 * characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
+
+/* Parse a command line, return non-null if any command terminates. */
+static Command *
+process_commands(PsqlScanState scan_state, PQExpBuffer qbuf, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	PsqlScanResult scan_result;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+
+	psql_scan_setup(scan_state, buf, strlen(buf), NULL, NULL, 0);
+
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
 
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = pg_strdup(qbuf->data); /* line? */
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data); /* remove leading ws?*/
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
 	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		/* backslash commands are always one-liner  */
+		command = process_backslash_commands(buf, source, lineno);
+	}
 
-	return my_commands;
+	psql_scan_finish(scan_state);
+
+	return command;
 }
 
+
 /*
  * Read a line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
@@ -2486,6 +2514,8 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	PsqlScanState scan_state;
+	PQExpBuffer query_buf = createPQExpBuffer();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2506,33 +2536,46 @@ process_file(char *filename)
 		return false;
 	}
 
+	scan_state = psql_scan_create();
+	resetPQExpBuffer(query_buf);
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
-
+		command = process_commands(scan_state, query_buf, buf,
+								   filename, lineno);
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE.  In those cases, just go on to the
+			 * next line.
+			 */
 			continue;
+		}
 
 		my_commands[index] = command;
 		index++;
-
+		resetPQExpBuffer(query_buf);
 		if (index >= alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands = pg_realloc(my_commands,
+									 sizeof(Command *) * alloc_num);
 		}
 	}
 	fclose(fd);
 
+	psql_scan_finish(scan_state);
+
 	my_commands[index] = NULL;
 
 	sql_files[num_files++] = my_commands;
@@ -2550,10 +2593,15 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	PsqlScanState scan_state;
+	PQExpBuffer query_buf = createPQExpBuffer();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
 
+	scan_state = psql_scan_create();
+	resetPQExpBuffer(query_buf);
+
 	lineno = 0;
 	index = 0;
 
@@ -2576,10 +2624,11 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(scan_state, query_buf, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		resetPQExpBuffer(query_buf);
 		my_commands[index] = command;
 		index++;
 
@@ -2591,6 +2640,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	psql_scan_finish(scan_state);
 
 	return my_commands;
 }
diff --git a/src/bin/pgbench/prompt.h b/src/bin/pgbench/prompt.h
new file mode 100644
index 0000000..e3f6ce5
--- /dev/null
+++ b/src/bin/pgbench/prompt.h
@@ -0,0 +1,25 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/prompt.h
+ */
+#ifndef PROMPT_H
+#define PROMPT_H
+
+typedef enum _promptStatus
+{
+	PROMPT_READY,
+	PROMPT_CONTINUE,
+	PROMPT_COMMENT,
+	PROMPT_SINGLEQUOTE,
+	PROMPT_DOUBLEQUOTE,
+	PROMPT_DOLLARQUOTE,
+	PROMPT_PAREN,
+	PROMPT_COPY
+} promptStatus_t;
+
+char	   *get_prompt(promptStatus_t status);
+
+#endif   /* PROMPT_H */
diff --git a/src/bin/pgbench/psqlscan.h b/src/bin/pgbench/psqlscan.h
new file mode 100644
index 0000000..1b6361b
--- /dev/null
+++ b/src/bin/pgbench/psqlscan.h
@@ -0,0 +1,63 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2015, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan.h
+ */
+#ifndef PSQLSCAN_H
+#define PSQLSCAN_H
+
+#include "pqexpbuffer.h"
+
+#include "prompt.h"
+#include "variables.h"
+
+/* Abstract type for lexer's internal state */
+typedef struct PsqlScanStateData *PsqlScanState;
+
+/* Termination states for psql_scan() */
+typedef enum
+{
+	PSCAN_SEMICOLON,			/* found command-ending semicolon */
+	PSCAN_BACKSLASH,			/* found backslash command */
+	PSCAN_INCOMPLETE,			/* end of line, SQL statement incomplete */
+	PSCAN_EOL					/* end of line, SQL possibly complete */
+} PsqlScanResult;
+
+/* Different ways for scan_slash_option to handle parameter words */
+enum slash_option_type
+{
+	OT_NORMAL,					/* normal case */
+	OT_SQLID,					/* treat as SQL identifier */
+	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
+	OT_FILEPIPE,				/* it's a filename or pipe */
+	OT_WHOLE_LINE,				/* just snarf the rest of the line */
+	OT_NO_EVAL					/* no expansion of backticks or variables */
+};
+
+extern PsqlScanState psql_scan_create(void);
+extern void psql_scan_destroy(PsqlScanState state);
+
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
+extern void psql_scan_finish(PsqlScanState state);
+
+extern PsqlScanResult psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt);
+
+extern void psql_scan_reset(PsqlScanState state);
+
+extern bool psql_scan_in_quote(PsqlScanState state);
+
+extern char *psql_scan_slash_command(PsqlScanState state);
+
+extern char *psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon);
+
+extern void psql_scan_slash_command_end(PsqlScanState state);
+
+#endif   /* PSQLSCAN_H */
diff --git a/src/bin/pgbench/psqlscan.l b/src/bin/pgbench/psqlscan.l
new file mode 100644
index 0000000..08bd9d2
--- /dev/null
+++ b/src/bin/pgbench/psqlscan.l
@@ -0,0 +1,1997 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscan.l
+ *	  lexical scanner for psql
+ *
+ * This code is mainly needed to determine where the end of a SQL statement
+ * is: we are looking for semicolons that are not within quotes, comments,
+ * or parentheses.  The most reliable way to handle this is to borrow the
+ * backend's flex lexer rules, lock, stock, and barrel.  The rules below
+ * are (except for a few) the same as the backend's, but their actions are
+ * just ECHO whereas the backend's actions generally do other things.
+ *
+ * XXX The rules in this file must be kept in sync with the backend lexer!!!
+ *
+ * XXX Avoid creating backtracking cases --- see the backend lexer for info.
+ *
+ * The most difficult aspect of this code is that we need to work in multibyte
+ * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
+ * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
+ * all our lexing rules treat all high-bit-set characters alike, we don't
+ * really need to care whether such a byte is part of a sequence or not.
+ * In an "unsafe" encoding, we still expect the first byte of a multibyte
+ * sequence to be >= 0x80, but later bytes might not be.  If we scan such
+ * a sequence as-is, the lexing rules could easily be fooled into matching
+ * such bytes to ordinary ASCII characters.  Our solution for this is to
+ * substitute 0xFF for each non-first byte within the data presented to flex.
+ * The flex rules will then pass the FF's through unmolested.  The emit()
+ * subroutine is responsible for looking back to the original string and
+ * replacing FF's with the corresponding original bytes.
+ *
+ * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscan.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres_fe.h"
+
+#include "psqlscan.h"
+
+#include <ctype.h>
+
+/*
+ * We use a stack of flex buffers to handle substitution of psql variables.
+ * Each stacked buffer contains the as-yet-unread text from one psql variable.
+ * When we pop the stack all the way, we resume reading from the outer buffer
+ * identified by scanbufhandle.
+ */
+typedef struct StackElem
+{
+	YY_BUFFER_STATE buf;		/* flex input control structure */
+	char	   *bufstring;		/* data actually being scanned by flex */
+	char	   *origstring;		/* copy of original data, if needed */
+	char	   *varname;		/* name of variable providing data, or NULL */
+	struct StackElem *next;
+} StackElem;
+
+/*
+ * All working state of the lexer must be stored in PsqlScanStateData
+ * between calls.  This allows us to have multiple open lexer operations,
+ * which is needed for nested include files.  The lexer itself is not
+ * recursive, but it must be re-entrant.
+ */
+typedef struct PsqlScanStateData
+{
+	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
+	/*
+	 * These variables always refer to the outer buffer, never to any
+	 * stacked variable-expansion buffer.
+	 */
+	YY_BUFFER_STATE scanbufhandle;
+	char	   *scanbuf;		/* start of outer-level input buffer */
+	const char *scanline;		/* current input line at outer level */
+
+	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
+	int			encoding;		/* encoding being used now */
+	bool		safe_encoding;	/* is current encoding "safe"? */
+	const char *curline;		/* actual flex input string for cur buf */
+	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
+
+	/*
+	 * All this state lives across successive input lines, until explicitly
+	 * reset by psql_scan_reset.
+	 */
+	int			start_state;	/* saved YY_START */
+	int			paren_depth;	/* depth of nesting in parentheses */
+	int			xcdepth;		/* depth of nesting in slash-star comments */
+	char	   *dolqstart;		/* current $foo$ quote start string */
+} PsqlScanStateData;
+
+static PsqlScanState cur_state;	/* current state while active */
+
+static PQExpBuffer output_buf;	/* current output buffer */
+
+/* these variables do not need to be saved across calls */
+static enum slash_option_type option_type;
+static char *option_quote;
+static int	unquoted_option_chars;
+static int	backtick_start_offset;
+
+
+/* Return values from yylex() */
+#define LEXRES_EOL			0	/* end of input */
+#define LEXRES_SEMI			1	/* command-terminating semicolon found */
+#define LEXRES_BACKSLASH	2	/* backslash command start */
+#define LEXRES_OK			3	/* OK completion of backslash argument */
+
+
+static void evaluate_backtick(void);
+static void push_new_buffer(const char *newstr, const char *varname);
+static void pop_buffer_stack(PsqlScanState state);
+static bool var_is_current_source(PsqlScanState state, const char *varname);
+static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
+									  char **txtcopy);
+static void emit(const char *txt, int len);
+static char *extract_substring(const char *txt, int len);
+static void escape_variable(bool as_ident);
+
+#define ECHO emit(yytext, yyleng)
+
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/*
+ * OK, here is a short description of lex/flex rules behavior.
+ * The longest pattern which matches an input string is always chosen.
+ * For equal-length patterns, the first occurring in the rules list is chosen.
+ * INITIAL is the starting state, to which all non-conditional rules apply.
+ * Exclusive states change parsing rules while the state is active.  When in
+ * an exclusive state, only those rules defined for that state apply.
+ *
+ * We use exclusive states for quoted strings, extended comments,
+ * and to eliminate parsing troubles for numeric strings.
+ * Exclusive states:
+ *  <xb> bit string literal
+ *  <xc> extended C-style comments
+ *  <xd> delimited identifiers (double-quoted identifiers)
+ *  <xh> hexadecimal numeric string
+ *  <xq> standard quoted strings
+ *  <xe> extended quoted strings (support backslash escape sequences)
+ *  <xdolq> $foo$ quoted strings
+ *  <xui> quoted identifier with Unicode escapes
+ *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
+ *  <xus> quoted string with Unicode escapes
+ *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
+ *
+ * Note: we intentionally don't mimic the backend's <xeu> state; we have
+ * no need to distinguish it from <xe> state, and no good way to get out
+ * of it in error cases.  The backend just throws yyerror() in those
+ * cases, but that's not an option here.
+ */
+
+%x xb
+%x xc
+%x xd
+%x xh
+%x xe
+%x xq
+%x xdolq
+%x xui
+%x xuiend
+%x xus
+%x xusend
+/* Additional exclusive states for psql only: lex backslash commands */
+%x xslashcmd
+%x xslashargstart
+%x xslasharg
+%x xslashquote
+%x xslashbackquote
+%x xslashdquote
+%x xslashwholeline
+%x xslashend
+
+/*
+ * In order to make the world safe for Windows and Mac clients as well as
+ * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
+ * sequence will be seen as two successive newlines, but that doesn't cause
+ * any problems.  Comments that start with -- and extend to the next
+ * newline are treated as equivalent to a single whitespace character.
+ *
+ * NOTE a fine point: if there is no newline following --, we will absorb
+ * everything to the end of the input as a comment.  This is correct.  Older
+ * versions of Postgres failed to recognize -- as a comment if the input
+ * did not end with a newline.
+ *
+ * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ *
+ * XXX if you change the set of whitespace characters, fix scanner_isspace()
+ * to agree, and see also the plpgsql lexer.
+ */
+
+space			[ \t\n\r\f]
+horiz_space		[ \t\f]
+newline			[\n\r]
+non_newline		[^\n\r]
+
+comment			("--"{non_newline}*)
+
+whitespace		({space}+|{comment})
+
+/*
+ * SQL requires at least one newline in the whitespace separating
+ * string literals that are to be concatenated.  Silly, but who are we
+ * to argue?  Note that {whitespace_with_newline} should not have * after
+ * it, whereas {whitespace} should generally have a * after it...
+ */
+
+special_whitespace		({space}+|{comment}{newline})
+horiz_whitespace		({horiz_space}|{comment})
+whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+
+/*
+ * To ensure that {quotecontinue} can be scanned without having to back up
+ * if the full pattern isn't matched, we include trailing whitespace in
+ * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
+ * except for {quote} followed by whitespace and just one "-" (not two,
+ * which would start a {comment}).  To cover that we have {quotefail}.
+ * The actions for {quotestop} and {quotefail} must throw back characters
+ * beyond the quote proper.
+ */
+quote			'
+quotestop		{quote}{whitespace}*
+quotecontinue	{quote}{whitespace_with_newline}{quote}
+quotefail		{quote}{whitespace}*"-"
+
+/* Bit string
+ * It is tempting to scan the string for only those characters
+ * which are allowed. However, this leads to silently swallowed
+ * characters if illegal characters are included in the string.
+ * For example, if xbinside is [01] then B'ABCD' is interpreted
+ * as a zero-length string, and the ABCD' is lost!
+ * Better to pass the string forward and let the input routines
+ * validate the contents.
+ */
+xbstart			[bB]{quote}
+xbinside		[^']*
+
+/* Hexadecimal number */
+xhstart			[xX]{quote}
+xhinside		[^']*
+
+/* National character */
+xnstart			[nN]{quote}
+
+/* Quoted string that allows backslash escapes */
+xestart			[eE]{quote}
+xeinside		[^\\']+
+xeescape		[\\][^0-7]
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
+xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqstart			{quote}
+xqdouble		{quote}{quote}
+xqinside		[^']+
+
+/* $foo$ style quotes ("dollar quoting")
+ * The quoted string starts with $foo$ where "foo" is an optional string
+ * in the form of an identifier, except that it may not contain "$",
+ * and extends to the first occurrence of an identical string.
+ * There is *no* processing of the quoted text.
+ *
+ * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
+ * fails to match its trailing "$".
+ */
+dolq_start		[A-Za-z\200-\377_]
+dolq_cont		[A-Za-z\200-\377_0-9]
+dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
+dolqfailed		\${dolq_start}{dolq_cont}*
+dolqinside		[^$]+
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+xdstart			{dquote}
+xdstop			{dquote}
+xddouble		{dquote}{dquote}
+xdinside		[^"]+
+
+/* Unicode escapes */
+uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
+/* error rule to avoid backup */
+uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
+
+/* Quoted identifier with Unicode escapes */
+xuistart		[uU]&{dquote}
+
+/* Quoted string with Unicode escapes */
+xusstart		[uU]&{quote}
+
+/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
+xustop1		{uescapefail}?
+xustop2		{uescape}
+
+/* error rule to avoid backup */
+xufailed		[uU]&
+
+
+/* C-style comments
+ *
+ * The "extended comment" syntax closely resembles allowable operator syntax.
+ * The tricky part here is to get lex to recognize a string starting with
+ * slash-star as a comment, when interpreting it as an operator would produce
+ * a longer match --- remember lex will prefer a longer match!  Also, if we
+ * have something like plus-slash-star, lex will think this is a 3-character
+ * operator whereas we want to see it as a + operator and a comment start.
+ * The solution is two-fold:
+ * 1. append {op_chars}* to xcstart so that it matches as much text as
+ *    {operator} would. Then the tie-breaker (first matching rule of same
+ *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
+ *    in case it contains a star-slash that should terminate the comment.
+ * 2. In the operator rule, check for slash-star within the operator, and
+ *    if found throw it back with yyless().  This handles the plus-slash-star
+ *    problem.
+ * Dash-dash comments have similar interactions with the operator rule.
+ */
+xcstart			\/\*{op_chars}*
+xcstop			\*+\/
+xcinside		[^*/]+
+
+digit			[0-9]
+ident_start		[A-Za-z\200-\377_]
+ident_cont		[A-Za-z\200-\377_0-9\$]
+
+identifier		{ident_start}{ident_cont}*
+
+/* Assorted special-case operators and operator-like tokens */
+typecast		"::"
+dot_dot			\.\.
+colon_equals	":="
+equals_greater	"=>"
+less_equals		"<="
+greater_equals	">="
+less_greater	"<>"
+not_equals		"!="
+
+/*
+ * "self" is the set of chars that should be returned as single-character
+ * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
+ * which can be one or more characters long (but if a single-char token
+ * appears in the "self" set, it is not to be returned as an Op).  Note
+ * that the sets overlap, but each has some chars that are not in the other.
+ *
+ * If you change either set, adjust the character lists appearing in the
+ * rule for "operator"!
+ */
+self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
+operator		{op_chars}+
+
+/* we no longer allow unary minus in numbers.
+ * instead we pass it separately to parser. there it gets
+ * coerced via doNegate() -- Leon aug 20 1999
+ *
+ * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
+ *
+ * {realfail1} and {realfail2} are added to prevent the need for scanner
+ * backup when the {real} rule fails to match completely.
+ */
+
+integer			{digit}+
+decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+decimalfail		{digit}+\.\.
+real			({integer}|{decimal})[Ee][-+]?{digit}+
+realfail1		({integer}|{decimal})[Ee]
+realfail2		({integer}|{decimal})[Ee][-+]
+
+param			\${integer}
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+/*
+ * Dollar quoted strings are totally opaque, and no escaping is done on them.
+ * Other quoted strings must allow some special characters such as single-quote
+ *  and newline.
+ * Embedded single-quotes are implemented both in the SQL standard
+ *  style of two adjacent single quotes "''" and in the Postgres/Java style
+ *  of escaped-quote "\'".
+ * Other embedded escaped characters are matched explicitly and the leading
+ *  backslash is dropped from the string.
+ * Note that xcstart must appear before operator, as explained above!
+ *  Also whitespace (comment) must appear before operator.
+ */
+
+%%
+
+{whitespace}	{
+					/*
+					 * Note that the whitespace rule includes both true
+					 * whitespace and single-line ("--" style) comments.
+					 * We suppress whitespace at the start of the query
+					 * buffer.  We also suppress all single-line comments,
+					 * which is pretty dubious but is the historical
+					 * behavior.
+					 */
+					if (!(output_buf->len == 0 || yytext[0] == '-'))
+						ECHO;
+				}
+
+{xcstart}		{
+					cur_state->xcdepth = 0;
+					BEGIN(xc);
+					/* Put back any characters past slash-star; see above */
+					yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstart}	{
+					cur_state->xcdepth++;
+					/* Put back any characters past slash-star; see above */
+					yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstop}	{
+					if (cur_state->xcdepth <= 0)
+					{
+						BEGIN(INITIAL);
+					}
+					else
+						cur_state->xcdepth--;
+					ECHO;
+				}
+
+<xc>{xcinside}	{
+					ECHO;
+				}
+
+<xc>{op_chars}	{
+					ECHO;
+				}
+
+<xc>\*+			{
+					ECHO;
+				}
+
+{xbstart}		{
+					BEGIN(xb);
+					ECHO;
+				}
+<xb>{quotestop}	|
+<xb>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xh>{xhinside}	|
+<xb>{xbinside}	{
+					ECHO;
+				}
+<xh>{quotecontinue}	|
+<xb>{quotecontinue}	{
+					ECHO;
+				}
+
+{xhstart}		{
+					/* Hexadecimal bit type.
+					 * At some point we should simply pass the string
+					 * forward to the parser and label it there.
+					 * In the meantime, place a leading "x" on the string
+					 * to mark it for the input routine as a hex string.
+					 */
+					BEGIN(xh);
+					ECHO;
+				}
+<xh>{quotestop}	|
+<xh>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+
+{xnstart}		{
+					yyless(1);				/* eat only 'n' this time */
+					ECHO;
+				}
+
+{xqstart}		{
+					if (standard_strings())
+						BEGIN(xq);
+					else
+						BEGIN(xe);
+					ECHO;
+				}
+{xestart}		{
+					BEGIN(xe);
+					ECHO;
+				}
+{xusstart}		{
+					BEGIN(xus);
+					ECHO;
+				}
+<xq,xe>{quotestop}	|
+<xq,xe>{quotefail} {
+					yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xus>{quotestop} |
+<xus>{quotefail} {
+					yyless(1);
+					BEGIN(xusend);
+					ECHO;
+				}
+<xusend>{whitespace} {
+					ECHO;
+				}
+<xusend>{other} |
+<xusend>{xustop1} {
+					yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xusend>{xustop2} {
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xq,xe,xus>{xqdouble} {
+					ECHO;
+				}
+<xq,xus>{xqinside}  {
+					ECHO;
+				}
+<xe>{xeinside}  {
+					ECHO;
+				}
+<xe>{xeunicode} {
+					ECHO;
+				}
+<xe>{xeunicodefail}	{
+					ECHO;
+				}
+<xe>{xeescape}  {
+					ECHO;
+				}
+<xe>{xeoctesc}  {
+					ECHO;
+				}
+<xe>{xehexesc}  {
+					ECHO;
+				}
+<xq,xe,xus>{quotecontinue} {
+					ECHO;
+				}
+<xe>.			{
+					/* This is only needed for \ just before EOF */
+					ECHO;
+				}
+
+{dolqdelim}		{
+					cur_state->dolqstart = pg_strdup(yytext);
+					BEGIN(xdolq);
+					ECHO;
+				}
+{dolqfailed}	{
+					/* throw back all but the initial "$" */
+					yyless(1);
+					ECHO;
+				}
+<xdolq>{dolqdelim} {
+					if (strcmp(yytext, cur_state->dolqstart) == 0)
+					{
+						free(cur_state->dolqstart);
+						cur_state->dolqstart = NULL;
+						BEGIN(INITIAL);
+					}
+					else
+					{
+						/*
+						 * When we fail to match $...$ to dolqstart, transfer
+						 * the $... part to the output, but put back the final
+						 * $ for rescanning.  Consider $delim$...$junk$delim$
+						 */
+						yyless(yyleng-1);
+					}
+					ECHO;
+				}
+<xdolq>{dolqinside} {
+					ECHO;
+				}
+<xdolq>{dolqfailed} {
+					ECHO;
+				}
+<xdolq>.		{
+					/* This is only needed for $ inside the quoted text */
+					ECHO;
+				}
+
+{xdstart}		{
+					BEGIN(xd);
+					ECHO;
+				}
+{xuistart}		{
+					BEGIN(xui);
+					ECHO;
+				}
+<xd>{xdstop}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xui>{dquote} {
+					yyless(1);
+					BEGIN(xuiend);
+					ECHO;
+				}
+<xuiend>{whitespace} {
+					ECHO;
+				}
+<xuiend>{other} |
+<xuiend>{xustop1} {
+					yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xuiend>{xustop2}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xd,xui>{xddouble}	{
+					ECHO;
+				}
+<xd,xui>{xdinside}	{
+					ECHO;
+				}
+
+{xufailed}	{
+					/* throw back all but the initial u/U */
+					yyless(1);
+					ECHO;
+				}
+
+{typecast}		{
+					ECHO;
+				}
+
+{dot_dot}		{
+					ECHO;
+				}
+
+{colon_equals}	{
+					ECHO;
+				}
+
+{equals_greater} {
+					ECHO;
+				}
+
+{less_equals}	{
+					ECHO;
+				}
+
+{greater_equals} {
+					ECHO;
+				}
+
+{less_greater}	{
+					ECHO;
+				}
+
+{not_equals}	{
+					ECHO;
+				}
+
+	/*
+	 * These rules are specific to psql --- they implement parenthesis
+	 * counting and detection of command-ending semicolon.  These must
+	 * appear before the {self} rule so that they take precedence over it.
+	 */
+
+"("				{
+					cur_state->paren_depth++;
+					ECHO;
+				}
+
+")"				{
+					if (cur_state->paren_depth > 0)
+						cur_state->paren_depth--;
+					ECHO;
+				}
+
+";"				{
+					ECHO;
+					if (cur_state->paren_depth == 0)
+					{
+						/* Terminate lexing temporarily */
+						return LEXRES_SEMI;
+					}
+				}
+
+	/*
+	 * psql-specific rules to handle backslash commands and variable
+	 * substitution.  We want these before {self}, also.
+	 */
+
+"\\"[;:]		{
+					/* Force a semicolon or colon into the query buffer */
+					emit(yytext + 1, 1);
+				}
+
+"\\"			{
+					/* Terminate lexing temporarily */
+					return LEXRES_BACKSLASH;
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					char   *varname = NULL;
+					const char *value = NULL;
+
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
+
+					if (value)
+					{
+						/* It is a variable, check for recursion */
+						if (var_is_current_source(cur_state, varname))
+						{
+							/* Recursive expansion --- don't go there */
+							psql_error("skipping recursive expansion of variable \"%s\"\n",
+									   varname);
+							/* Instead copy the string as is */
+							ECHO;
+						}
+						else
+						{
+							/* OK, perform substitution */
+							push_new_buffer(value, varname);
+							/* yy_scan_string already made buffer active */
+						}
+					}
+					else
+					{
+						/*
+						 * if the variable doesn't exist we'll copy the
+						 * string as is
+						 */
+						ECHO;
+					}
+
+					if (varname)
+						free(varname);
+				}
+
+:'{variable_char}+'	{
+					escape_variable(false);
+				}
+
+:\"{variable_char}+\"	{
+					escape_variable(true);
+				}
+
+	/*
+	 * These rules just avoid the need for scanner backup if one of the
+	 * two rules above fails to match completely.
+	 */
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					ECHO;
+				}
+
+	/*
+	 * Back to backend-compatible rules.
+	 */
+
+{self}			{
+					ECHO;
+				}
+
+{operator}		{
+					/*
+					 * Check for embedded slash-star or dash-dash; those
+					 * are comment starts, so operator must stop there.
+					 * Note that slash-star or dash-dash at the first
+					 * character will match a prior rule, not this one.
+					 */
+					int		nchars = yyleng;
+					char   *slashstar = strstr(yytext, "/*");
+					char   *dashdash = strstr(yytext, "--");
+
+					if (slashstar && dashdash)
+					{
+						/* if both appear, take the first one */
+						if (slashstar > dashdash)
+							slashstar = dashdash;
+					}
+					else if (!slashstar)
+						slashstar = dashdash;
+					if (slashstar)
+						nchars = slashstar - yytext;
+
+					/*
+					 * For SQL compatibility, '+' and '-' cannot be the
+					 * last char of a multi-char operator unless the operator
+					 * contains chars that are not in SQL operators.
+					 * The idea is to lex '=-' as two operators, but not
+					 * to forbid operator names like '?-' that could not be
+					 * sequences of SQL operators.
+					 */
+					while (nchars > 1 &&
+						   (yytext[nchars-1] == '+' ||
+							yytext[nchars-1] == '-'))
+					{
+						int		ic;
+
+						for (ic = nchars-2; ic >= 0; ic--)
+						{
+							if (strchr("~!@#^&|`?%", yytext[ic]))
+								break;
+						}
+						if (ic >= 0)
+							break; /* found a char that makes it OK */
+						nchars--; /* else remove the +/-, and check again */
+					}
+
+					if (nchars < yyleng)
+					{
+						/* Strip the unwanted chars from the token */
+						yyless(nchars);
+					}
+					ECHO;
+				}
+
+{param}			{
+					ECHO;
+				}
+
+{integer}		{
+					ECHO;
+				}
+{decimal}		{
+					ECHO;
+				}
+{decimalfail}	{
+					/* throw back the .., and treat as integer */
+					yyless(yyleng-2);
+					ECHO;
+				}
+{real}			{
+					ECHO;
+				}
+{realfail1}		{
+					/*
+					 * throw back the [Ee], and treat as {decimal}.  Note
+					 * that it is possible the input is actually {integer},
+					 * but since this case will almost certainly lead to a
+					 * syntax error anyway, we don't bother to distinguish.
+					 */
+					yyless(yyleng-1);
+					ECHO;
+				}
+{realfail2}		{
+					/* throw back the [Ee][+-], and proceed as above */
+					yyless(yyleng-2);
+					ECHO;
+				}
+
+
+{identifier}	{
+					ECHO;
+				}
+
+{other}			{
+					ECHO;
+				}
+
+
+	/*
+	 * Everything from here down is psql-specific.
+	 */
+
+<<EOF>>			{
+					StackElem  *stackelem = cur_state->buffer_stack;
+
+					if (stackelem == NULL)
+						return LEXRES_EOL; /* end of input reached */
+
+					/*
+					 * We were expanding a variable, so pop the inclusion
+					 * stack and keep lexing
+					 */
+					pop_buffer_stack(cur_state);
+
+					stackelem = cur_state->buffer_stack;
+					if (stackelem != NULL)
+					{
+						yy_switch_to_buffer(stackelem->buf);
+						cur_state->curline = stackelem->bufstring;
+						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
+					}
+					else
+					{
+						yy_switch_to_buffer(cur_state->scanbufhandle);
+						cur_state->curline = cur_state->scanbuf;
+						cur_state->refline = cur_state->scanline;
+					}
+				}
+
+	/*
+	 * Exclusive lexer states to handle backslash command lexing
+	 */
+
+<xslashcmd>{
+	/* command name ends at whitespace or backslash; eat all else */
+
+{space}|"\\"	{
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashargstart>{
+	/*
+	 * Discard any whitespace before argument, then go to xslasharg state.
+	 * An exception is that "|" is only special at start of argument, so we
+	 * check for it here.
+	 */
+
+{space}+		{ }
+
+"|"				{
+					if (option_type == OT_FILEPIPE)
+					{
+						/* treat like whole-string case */
+						ECHO;
+						BEGIN(xslashwholeline);
+					}
+					else
+					{
+						/* vertical bar is not special otherwise */
+						yyless(0);
+						BEGIN(xslasharg);
+					}
+				}
+
+{other}			{
+					yyless(0);
+					BEGIN(xslasharg);
+				}
+
+}
+
+<xslasharg>{
+	/*
+	 * Default processing of text in a slash command's argument.
+	 *
+	 * Note: unquoted_option_chars counts the number of characters at the
+	 * end of the argument that were not subject to any form of quoting.
+	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
+	 */
+
+{space}|"\\"	{
+					/*
+					 * Unquoted space is end of arg; do not eat.  Likewise
+					 * backslash is end of command or next command, do not eat
+					 *
+					 * XXX this means we can't conveniently accept options
+					 * that include unquoted backslashes; therefore, option
+					 * processing that encourages use of backslashes is rather
+					 * broken.
+					 */
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+{quote}			{
+					*option_quote = '\'';
+					unquoted_option_chars = 0;
+					BEGIN(xslashquote);
+				}
+
+"`"				{
+					backtick_start_offset = output_buf->len;
+					*option_quote = '`';
+					unquoted_option_chars = 0;
+					BEGIN(xslashbackquote);
+				}
+
+{dquote}		{
+					ECHO;
+					*option_quote = '"';
+					unquoted_option_chars = 0;
+					BEGIN(xslashdquote);
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						char   *varname;
+						const char *value = NULL;
+
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
+
+						/*
+						 * The variable value is just emitted without any
+						 * further examination.  This is consistent with the
+						 * pre-8.0 code behavior, if not with the way that
+						 * variables are handled outside backslash commands.
+						 * Note that we needn't guard against recursion here.
+						 */
+						if (value)
+							appendPQExpBufferStr(output_buf, value);
+						else
+							ECHO;
+
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}+'	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(false);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+
+:\"{variable_char}+\"	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(true);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+{other}			{
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+}
+
+<xslashquote>{
+	/*
+	 * single-quoted text: copy literally except for '' and backslash
+	 * sequences
+	 */
+
+{quote}			{ BEGIN(xslasharg); }
+
+{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
+
+"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
+"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
+"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
+"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
+"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
+
+{xeoctesc}		{
+					/* octal case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 1, NULL, 8));
+				}
+
+{xehexesc}		{
+					/* hex case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 2, NULL, 16));
+				}
+
+"\\".			{ emit(yytext + 1, 1); }
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashbackquote>{
+	/*
+	 * backticked text: copy everything until next backquote, then evaluate.
+	 *
+	 * XXX Possible future behavioral change: substitute for :VARIABLE?
+	 */
+
+"`"				{
+					/* In NO_EVAL mode, don't evaluate the command */
+					if (option_type != OT_NO_EVAL)
+						evaluate_backtick();
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashdquote>{
+	/* double-quoted text: copy verbatim, including the double quotes */
+
+{dquote}		{
+					ECHO;
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashwholeline>{
+	/* copy everything until end of input line */
+	/* but suppress leading whitespace */
+
+{space}+		{
+					if (output_buf->len > 0)
+						ECHO;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashend>{
+	/* at end of command, eat a double backslash, but not anything else */
+
+"\\\\"			{ return LEXRES_OK; }
+
+{other}|\n		{
+					yyless(0);
+					return LEXRES_OK;
+				}
+
+}
+
+%%
+
+/*
+ * Create a lexer working state struct.
+ */
+PsqlScanState
+psql_scan_create(void)
+{
+	PsqlScanState state;
+
+	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
+
+	psql_scan_reset(state);
+
+	return state;
+}
+
+/*
+ * Destroy a lexer working state struct, releasing all resources.
+ */
+void
+psql_scan_destroy(PsqlScanState state)
+{
+	psql_scan_finish(state);
+
+	psql_scan_reset(state);
+
+	free(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+void
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* Do we need to hack the character set encoding? */
+	state->encoding = encoding;
+	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+
+	state->vars = vars;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Do lexical analysis of SQL command text.
+ *
+ * The text previously passed to psql_scan_setup is scanned, and appended
+ * (possibly with transformation) to query_buf.
+ *
+ * The return value indicates the condition that stopped scanning:
+ *
+ * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
+ * transferred to query_buf.)  The command accumulated in query_buf should
+ * be executed, then clear query_buf and call again to scan the remainder
+ * of the line.
+ *
+ * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
+ * Any previous data on the line has been transferred to query_buf.
+ * The caller will typically next call psql_scan_slash_command(),
+ * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
+ *
+ * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
+ * incomplete SQL command.  *prompt is set to the appropriate prompt type.
+ *
+ * PSCAN_EOL: the end of the line was reached, and there is no lexical
+ * reason to consider the command incomplete.  The caller may or may not
+ * choose to send it.  *prompt is set to the appropriate prompt type if
+ * the caller chooses to collect more input.
+ *
+ * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
+ * be called next, then the cycle may be repeated with a fresh input line.
+ *
+ * In all cases, *prompt is set to an appropriate prompt type code for the
+ * next line-input operation.
+ */
+PsqlScanResult
+psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt)
+{
+	PsqlScanResult result;
+	int			lexresult;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = query_buf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(state->start_state);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/* Update static vars back to the state struct */
+	state->start_state = YY_START;
+
+	/*
+	 * Check termination state and return appropriate result info.
+	 */
+	switch (lexresult)
+	{
+		case LEXRES_EOL:		/* end of input */
+			switch (state->start_state)
+			{
+				/* This switch must cover all non-slash-command states. */
+				case INITIAL:
+				case xuiend:	/* we treat these like INITIAL */
+				case xusend:
+					if (state->paren_depth > 0)
+					{
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_PAREN;
+					}
+					else if (query_buf->len > 0)
+					{
+						result = PSCAN_EOL;
+						*prompt = PROMPT_CONTINUE;
+					}
+					else
+					{
+						/* never bother to send an empty buffer */
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_READY;
+					}
+					break;
+				case xb:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xc:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_COMMENT;
+					break;
+				case xd:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xh:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xe:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xdolq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOLLARQUOTE;
+					break;
+				case xui:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xus:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				default:
+					/* can't get here */
+					fprintf(stderr, "invalid YY_START\n");
+					exit(1);
+			}
+			break;
+		case LEXRES_SEMI:		/* semicolon */
+			result = PSCAN_SEMICOLON;
+			*prompt = PROMPT_READY;
+			break;
+		case LEXRES_BACKSLASH:	/* backslash */
+			result = PSCAN_BACKSLASH;
+			*prompt = PROMPT_READY;
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid yylex result\n");
+			exit(1);
+	}
+
+	return result;
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yy_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+void
+psql_scan_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
+/*
+ * Return true if lexer is currently in an "inside quotes" state.
+ *
+ * This is pretty grotty but is needed to preserve the old behavior
+ * that mainloop.c drops blank lines not inside quotes without even
+ * echoing them.
+ */
+bool
+psql_scan_in_quote(PsqlScanState state)
+{
+	return state->start_state != INITIAL;
+}
+
+/*
+ * Scan the command name of a psql backslash command.  This should be called
+ * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
+ * has been consumed through the leading backslash.
+ *
+ * The return value is a malloc'd copy of the command name, as parsed off
+ * from the input.
+ */
+char *
+psql_scan_slash_command(PsqlScanState state)
+{
+	PQExpBufferData mybuf;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashcmd);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+
+	return mybuf.data;
+}
+
+/*
+ * Parse off the next argument for a backslash command, and return it as a
+ * malloc'd string.  If there are no more arguments, returns NULL.
+ *
+ * type tells what processing, if any, to perform on the option string;
+ * for example, if it's a SQL identifier, we want to downcase any unquoted
+ * letters.
+ *
+ * if quote is not NULL, *quote is set to 0 if no quoting was found, else
+ * the last quote symbol used in the argument.
+ *
+ * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
+ * be taken as part of the option string will be stripped.
+ *
+ * NOTE: the only possible syntax errors for backslash options are unmatched
+ * quotes, which are detected when we run out of input.  Therefore, on a
+ * syntax error we just throw away the string and return NULL; there is no
+ * need to worry about flushing remaining input.
+ */
+char *
+psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon)
+{
+	PQExpBufferData mybuf;
+	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
+	char		local_quote;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	if (quote == NULL)
+		quote = &local_quote;
+	*quote = 0;
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+	option_type = type;
+	option_quote = quote;
+	unquoted_option_chars = 0;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	if (type == OT_WHOLE_LINE)
+		BEGIN(xslashwholeline);
+	else
+		BEGIN(xslashargstart);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/*
+	 * Check the lex result: we should have gotten back either LEXRES_OK
+	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
+	 * a quoted string, as indicated by YY_START, EOL is an error.
+	 */
+	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
+
+	switch (YY_START)
+	{
+		case xslashargstart:
+			/* empty arg */
+			break;
+		case xslasharg:
+			/* Strip any unquoted trailing semi-colons if requested */
+			if (semicolon)
+			{
+				while (unquoted_option_chars-- > 0 &&
+					   mybuf.len > 0 &&
+					   mybuf.data[mybuf.len - 1] == ';')
+				{
+					mybuf.data[--mybuf.len] = '\0';
+				}
+			}
+
+			/*
+			 * If SQL identifier processing was requested, then we strip out
+			 * excess double quotes and downcase unquoted letters.
+			 * Doubled double-quotes become output double-quotes, per spec.
+			 *
+			 * Note that a string like FOO"BAR"BAZ will be converted to
+			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
+			 * which would have us parse it as several identifiers.  But
+			 * for psql's purposes, we want a string like "foo"."bar" to
+			 * be treated as one option, so there's little choice.
+			 */
+			if (type == OT_SQLID || type == OT_SQLIDHACK)
+			{
+				bool		inquotes = false;
+				char	   *cp = mybuf.data;
+
+				while (*cp)
+				{
+					if (*cp == '"')
+					{
+						if (inquotes && cp[1] == '"')
+						{
+							/* Keep the first quote, remove the second */
+							cp++;
+						}
+						inquotes = !inquotes;
+						/* Collapse out quote at *cp */
+						memmove(cp, cp + 1, strlen(cp));
+						mybuf.len--;
+						/* do not advance cp */
+					}
+					else
+					{
+						if (!inquotes && type == OT_SQLID)
+							*cp = pg_tolower((unsigned char) *cp);
+						cp += PQmblen(cp, cur_state->encoding);
+					}
+				}
+			}
+			break;
+		case xslashquote:
+		case xslashbackquote:
+		case xslashdquote:
+			/* must have hit EOL inside quotes */
+			psql_error("unterminated quoted string\n");
+			termPQExpBuffer(&mybuf);
+			return NULL;
+		case xslashwholeline:
+			/* always okay */
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid YY_START\n");
+			exit(1);
+	}
+
+	/*
+	 * An unquoted empty argument isn't possible unless we are at end of
+	 * command.  Return NULL instead.
+	 */
+	if (mybuf.len == 0 && *quote == 0)
+	{
+		termPQExpBuffer(&mybuf);
+		return NULL;
+	}
+
+	/* Else return the completed string. */
+	return mybuf.data;
+}
+
+/*
+ * Eat up any unused \\ to complete a backslash command.
+ */
+void
+psql_scan_slash_command_end(PsqlScanState state)
+{
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = NULL;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashend);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+}
+
+/*
+ * Evaluate a backticked substring of a slash command's argument.
+ *
+ * The portion of output_buf starting at backtick_start_offset is evaluated
+ * as a shell command and then replaced by the command's output.
+ */
+static void
+evaluate_backtick(void)
+{
+	char	   *cmd = output_buf->data + backtick_start_offset;
+	PQExpBufferData cmd_output;
+	FILE	   *fd;
+	bool		error = false;
+	char		buf[512];
+	size_t		result;
+
+	initPQExpBuffer(&cmd_output);
+
+	fd = popen(cmd, PG_BINARY_R);
+	if (!fd)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (!error)
+	{
+		do
+		{
+			result = fread(buf, 1, sizeof(buf), fd);
+			if (ferror(fd))
+			{
+				psql_error("%s: %s\n", cmd, strerror(errno));
+				error = true;
+				break;
+			}
+			appendBinaryPQExpBuffer(&cmd_output, buf, result);
+		} while (!feof(fd));
+	}
+
+	if (fd && pclose(fd) == -1)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (PQExpBufferDataBroken(cmd_output))
+	{
+		psql_error("%s: out of memory\n", cmd);
+		error = true;
+	}
+
+	/* Now done with cmd, delete it from output_buf */
+	output_buf->len = backtick_start_offset;
+	output_buf->data[output_buf->len] = '\0';
+
+	/* If no error, transfer result to output_buf */
+	if (!error)
+	{
+		/* strip any trailing newline */
+		if (cmd_output.len > 0 &&
+			cmd_output.data[cmd_output.len - 1] == '\n')
+			cmd_output.len--;
+		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
+	}
+
+	termPQExpBuffer(&cmd_output);
+}
+
+/*
+ * Push the given string onto the stack of stuff to scan.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+static void
+push_new_buffer(const char *newstr, const char *varname)
+{
+	StackElem  *stackelem;
+
+	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
+
+	/*
+	 * In current usage, the passed varname points at the current flex
+	 * input buffer; we must copy it before calling prepare_buffer()
+	 * because that will change the buffer state.
+	 */
+	stackelem->varname = varname ? pg_strdup(varname) : NULL;
+
+	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
+									&stackelem->bufstring);
+	cur_state->curline = stackelem->bufstring;
+	if (cur_state->safe_encoding)
+	{
+		stackelem->origstring = NULL;
+		cur_state->refline = stackelem->bufstring;
+	}
+	else
+	{
+		stackelem->origstring = pg_strdup(newstr);
+		cur_state->refline = stackelem->origstring;
+	}
+	stackelem->next = cur_state->buffer_stack;
+	cur_state->buffer_stack = stackelem;
+}
+
+/*
+ * Pop the topmost buffer stack item (there must be one!)
+ *
+ * NB: after this, the flex input state is unspecified; caller must
+ * switch to an appropriate buffer to continue lexing.
+ */
+static void
+pop_buffer_stack(PsqlScanState state)
+{
+	StackElem  *stackelem = state->buffer_stack;
+
+	state->buffer_stack = stackelem->next;
+	yy_delete_buffer(stackelem->buf);
+	free(stackelem->bufstring);
+	if (stackelem->origstring)
+		free(stackelem->origstring);
+	if (stackelem->varname)
+		free(stackelem->varname);
+	free(stackelem);
+}
+
+/*
+ * Check if specified variable name is the source for any string
+ * currently being scanned
+ */
+static bool
+var_is_current_source(PsqlScanState state, const char *varname)
+{
+	StackElem  *stackelem;
+
+	for (stackelem = state->buffer_stack;
+		 stackelem != NULL;
+		 stackelem = stackelem->next)
+	{
+		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Set up a flex input buffer to scan the given data.  We always make a
+ * copy of the data.  If working in an unsafe encoding, the copy has
+ * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+static YY_BUFFER_STATE
+prepare_buffer(const char *txt, int len, char **txtcopy)
+{
+	char	   *newtxt;
+
+	/* Flex wants two \0 characters after the actual data */
+	newtxt = pg_malloc(len + 2);
+	*txtcopy = newtxt;
+	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
+
+	if (cur_state->safe_encoding)
+		memcpy(newtxt, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		int		i = 0;
+
+		while (i < len)
+		{
+			int		thislen = PQmblen(txt + i, cur_state->encoding);
+
+			/* first byte should always be okay... */
+			newtxt[i] = txt[i];
+			i++;
+			while (--thislen > 0 && i < len)
+				newtxt[i++] = (char) 0xFF;
+		}
+	}
+
+	return yy_scan_buffer(newtxt, len + 2);
+}
+
+/*
+ * emit() --- body for ECHO macro
+ *
+ * NB: this must be used for ALL and ONLY the text copied from the flex
+ * input data.  If you pass it something that is not part of the yytext
+ * string, you are making a mistake.  Internally generated text can be
+ * appended directly to output_buf.
+ */
+static void
+emit(const char *txt, int len)
+{
+	if (cur_state->safe_encoding)
+		appendBinaryPQExpBuffer(output_buf, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			appendPQExpBufferChar(output_buf, ch);
+		}
+	}
+}
+
+/*
+ * extract_substring --- fetch the true value of (part of) the current token
+ *
+ * This is like emit(), except that the data is returned as a malloc'd string
+ * rather than being pushed directly to output_buf.
+ */
+static char *
+extract_substring(const char *txt, int len)
+{
+	char	   *result = (char *) pg_malloc(len + 1);
+
+	if (cur_state->safe_encoding)
+		memcpy(result, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			result[i] = ch;
+		}
+	}
+	result[len] = '\0';
+	return result;
+}
+
+/*
+ * escape_variable --- process :'VARIABLE' or :"VARIABLE"
+ *
+ * If the variable name is found, escape its value using the appropriate
+ * quoting method and emit the value to output_buf.  (Since the result is
+ * surely quoted, there is never any reason to rescan it.)  If we don't
+ * find the variable or the escaping function fails, emit the token as-is.
+ */
+static void
+escape_variable(bool as_ident)
+{
+	char	   *varname;
+	const char *value = NULL;
+
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
+
+	/* Escaping. */
+	if (value)
+	{
+		if (!cur_state->db)
+			psql_error("can't escape without active connection\n");
+		else
+		{
+			char   *escaped_value;
+
+			if (as_ident)
+				escaped_value =
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
+			else
+				escaped_value =
+					PQescapeLiteral(cur_state->db, value, strlen(value));
+
+			if (escaped_value == NULL)
+			{
+				const char *error = PQerrorMessage(cur_state->db);
+
+				psql_error("%s", error);
+			}
+			else
+			{
+				appendPQExpBufferStr(output_buf, escaped_value);
+				PQfreemem(escaped_value);
+				return;
+			}
+		}
+	}
+
+	/*
+	 * If we reach this point, some kind of error has occurred.  Emit the
+	 * original text into the output buffer.
+	 */
+	emit(yytext, yyleng);
+}
diff --git a/src/bin/pgbench/variables.c b/src/bin/pgbench/variables.c
new file mode 100644
index 0000000..ad27b51
--- /dev/null
+++ b/src/bin/pgbench/variables.c
@@ -0,0 +1,22 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/variables.c
+ */
+#include "postgres_fe.h"
+#include "variables.h"
+
+/*
+ * Dummy functions to compile psqlscan.l
+ *
+ * This function is needed to link with psqlscan.l but never called for
+ * pgbench.
+ */
+const char *
+GetVariable(VariableSpace space, const char *name)
+{
+	fprintf(stderr, "GetVariable is called. abort.\n");
+	exit(1);
+}
diff --git a/src/bin/pgbench/variables.h b/src/bin/pgbench/variables.h
new file mode 100644
index 0000000..2bfe557
--- /dev/null
+++ b/src/bin/pgbench/variables.h
@@ -0,0 +1,22 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * src/bin/pgbench/variables.h
+ */
+#ifndef VARIABLES_H
+#define VARIABLES_H
+
+#include "common.h"
+
+/*
+ * This file contains the minimal definitions for dummy function and type
+ * needed to link with psqlscan.l but never called for pgbench.
+ */
+struct _variable;
+typedef struct _variable *VariableSpace;
+
+const char *GetVariable(VariableSpace space, const char *name);
+
+#endif   /* VARIABLES_H */
-- 
1.8.3.1
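
The 0xFF masking done by prepare_buffer() and undone by emit() in the patch above can be sketched in isolation. This is a minimal standalone illustration, not code from the patch: toy_mblen() is a hypothetical stand-in for PQmblen() that assumes every byte >= 0x81 starts a two-byte character, and mask_trailing_bytes() and unmask_bytes() are made-up names. The point is that a trailing byte which happens to look like an ASCII backslash is hidden from the lexer rules, and the true text is recovered from the original string afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy stand-in for PQmblen(): treat any byte >= 0x81 as the lead byte of a
 * two-byte character.  This is an assumption made for illustration only.
 */
static int
toy_mblen(const char *s)
{
	return ((unsigned char) s[0] >= 0x81) ? 2 : 1;
}

/* Copy src to dst, replacing trailing bytes of multibyte chars with 0xFF. */
static void
mask_trailing_bytes(char *dst, const char *src, size_t len)
{
	size_t		i = 0;

	assert(dst != NULL && src != NULL);

	while (i < len)
	{
		int			thislen = toy_mblen(src + i);

		dst[i] = src[i];		/* the lead byte is kept as-is */
		i++;
		while (--thislen > 0 && i < len)
			dst[i++] = (char) 0xFF;
	}
	dst[len] = '\0';
}

/* Rebuild the true text of masked[0..len) by consulting the original. */
static void
unmask_bytes(char *out, const char *masked, const char *original, size_t len)
{
	size_t		i;

	for (i = 0; i < len; i++)
		out[i] = (masked[i] == (char) 0xFF) ? original[i] : masked[i];
	out[len] = '\0';
}
```

In the patch the lexer only ever sees the masked copy, which is why emit() must be used for all text copied out of the flex input data.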

0003-Allow-backslash-continuation-for-backslash-commads.patch (text/x-patch; charset=us-ascii)
>From ae55ba632d7fc92c60eb2bae327d5e66adaf100a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Jul 2015 17:14:40 +0900
Subject: [PATCH 3/3] Allow backslash continuation for backslash commands.

Unlike SQL statements, backslash command lines are continued by
a backslash at the end of the line.
---
 src/bin/pgbench/pgbench.c | 104 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 76 insertions(+), 28 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 32a5094..f4f4abc 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -293,6 +293,14 @@ typedef struct
 	double		sum2_lag;		/* sum(lag*lag) */
 } AggVals;
 
+typedef struct ProcComState
+{
+	PsqlScanState scan_state;
+	PQExpBuffer	  outbuf;
+	bool		  in_backslash_cmd;
+} ProcComStateData;
+typedef ProcComStateData *ProcComState;
+
 static Command **sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -2221,33 +2229,69 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
+
+static ProcComState
+createProcComState(void)
+{
+	ProcComState ret = (ProcComState) pg_malloc0(sizeof(ProcComStateData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+
+	return ret;
+}
+
+#define proccom_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define proccom_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+#define proccom_in_backslash_command(pcs) ((pcs)->in_backslash_cmd)
+
 /* Parse a backslash command; return a Command struct  */
 static Command *
-process_backslash_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ProcComState proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
 	int			max_args = -1;
 
+	/* Skip leading whitespace */
+ 	p = buf;
+	while (isspace((unsigned char) *p))
+		p++;
+
+	if (*p == '\\')
+		proccom_in_backslash_command(proc_state) = true;
+
+	if (!proccom_in_backslash_command(proc_state))
+		return NULL;
+
 	/* Make the string buf end at the next newline */
+	start = p;
 	if ((p = strchr(buf, '\n')) != NULL)
 		*p = '\0';
 
- 	p = buf;
-	while (isspace((unsigned char) *p))
-		p++;
+	/* Still inside the command if the line ends with a backslash */
+	proccom_in_backslash_command(proc_state) = (*(--p) == '\\');
 
-	p = buf;
+	/* Drop the trailing continuation backslash */
+	if (proccom_in_backslash_command(proc_state))
+		*p-- = '\0';
 
-	if (*p != '\\')
-	{
-		fprintf(stderr, "Invalid backslash found: %s:%d\n", source, lineno);
-		exit(1);
-	}		
+	appendPQExpBufferStr(proc_state->outbuf, start);
+
+	if (proccom_in_backslash_command(proc_state) && !isspace(*p))
+		appendPQExpBufferChar(proc_state->outbuf, ' ');
+
+	/* backslash command is not terminated, go to next line */
+	if (proccom_in_backslash_command(proc_state))
+		return NULL;
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
@@ -2412,13 +2456,21 @@ process_backslash_commands(char *buf, const char *source, const int lineno)
 
 /* Parse a command line, return non-null if any command terminates. */
 static Command *
-process_commands(PsqlScanState scan_state, PQExpBuffer qbuf, char *buf,
+process_commands(ProcComState proc_state, char *buf,
 				 const char *source, const int lineno)
 {
 	Command *command = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
 	PsqlScanResult scan_result;
 	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
 
+	command = process_backslash_commands(proc_state, buf, source, lineno);
+
+	/* go to next line for continuation line of backslash command. */
+	if (command != NULL || proccom_in_backslash_command(proc_state))
+		return command;
+	
 	psql_scan_setup(scan_state, buf, strlen(buf), NULL, NULL, 0);
 						
 	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
@@ -2450,8 +2502,8 @@ process_commands(PsqlScanState scan_state, PQExpBuffer qbuf, char *buf,
 	}
 	else if (scan_result == PSCAN_BACKSLASH)
 	{
-		/* backslash commands are always one-liner  */
-		command = process_backslash_commands(buf, source, lineno);
+		fprintf(stderr, "Backslash continuation is not allowed in SQL statements: %s:%d\n", source, lineno);
+		exit(1);
 	}
 
 	psql_scan_finish(scan_state);
@@ -2514,8 +2566,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
-	PsqlScanState scan_state;
-	PQExpBuffer query_buf = createPQExpBuffer();
+	ProcComState proc_state = createProcComState();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2536,8 +2587,8 @@ process_file(char *filename)
 		return false;
 	}
 
-	scan_state = psql_scan_create();
-	resetPQExpBuffer(query_buf);
+	proccom_reset_outbuf(proc_state);
+	proccom_in_backslash_command(proc_state) = false;
 
 	lineno = 0;
 	index = 0;
@@ -2548,8 +2599,7 @@ process_file(char *filename)
 
 		lineno += 1;
 
-		command = process_commands(scan_state, query_buf, buf,
-								   filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 		free(buf);
 
 		if (command == NULL)
@@ -2563,7 +2613,7 @@ process_file(char *filename)
 		}
 
 		my_commands[index] = command;
-		resetPQExpBuffer(query_buf);
+		proccom_reset_outbuf(proc_state);
 		if (index >= alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
@@ -2574,7 +2624,7 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
-	psql_scan_finish(scan_state);
+	proccom_finish_scan(proc_state);
 
 	my_commands[index] = NULL;
 
@@ -2593,14 +2643,12 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
-	PsqlScanState scan_state;
-	PQExpBuffer query_buf = createPQExpBuffer();
+	ProcComState proc_state = createProcComState();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
 
-	scan_state = psql_scan_create();
-	resetPQExpBuffer(query_buf);
+	proccom_reset_outbuf(proc_state);
 
 	lineno = 0;
 	index = 0;
@@ -2624,11 +2672,11 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(scan_state, query_buf, buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
-		resetPQExpBuffer(query_buf);
+		proccom_reset_outbuf(proc_state);
 		my_commands[index] = command;
 		index++;
 
@@ -2640,7 +2688,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
-	psql_scan_finish(scan_state);
+	proccom_finish_scan(proc_state);
 
 	return my_commands;
 }
-- 
1.8.3.1
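
The line-joining idea behind the third patch can be sketched on its own. This is a minimal illustration under stated assumptions, not the patch's code: append_script_line() is a hypothetical helper, it works on a fixed-size accumulator instead of a PQExpBuffer, and it silently truncates where a real parser would grow the buffer. It strips the newline, treats a backslash as a continuation marker only when it is the last character of the line, and reports whether the command continues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): append one input line to
 * acc, dropping the trailing newline and a trailing continuation backslash.
 * Returns true when the command continues on the next line.
 */
static bool
append_script_line(char *acc, size_t acc_size, const char *line)
{
	size_t		len = strcspn(line, "\n");	/* length without the newline */
	bool		continued = (len > 0 && line[len - 1] == '\\');
	size_t		used = strlen(acc);

	assert(used < acc_size);

	if (continued)
		len--;					/* drop the continuation backslash */

	/* truncate if acc is too small; a real parser would grow the buffer */
	if (len > acc_size - used - 1)
		len = acc_size - used - 1;

	memcpy(acc + used, line, len);
	acc[used + len] = '\0';

	return continued;
}
```

A caller would keep feeding lines into the same accumulator until the function returns false, then parse the accumulated text as one command and clear the accumulator. Using strcspn() rather than strchr() avoids dereferencing a null pointer when the last line of a file has no newline, and checking len > 0 avoids reading before the start of an empty line.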

hoge.sql (text/plain; charset=us-ascii)
#27Fabien COELHO
fabien.coelho@mines-paristech.fr
In reply to: Kyotaro HORIGUCHI (#26)
Re: pgbench - allow backslash-continuations in custom scripts

Attached is the revised version of this patch.

The first patch is unchanged from before.

The second fixes a kind of bug.

The third is a new one that allows backslash continuation for
backslash commands.

Ah, thanks:-)

Would you consider adding the patch to the next commitfest? I may put
myself as a reviewer...

[...] I'm not satisfied by the design but I don't see another way..

I'll try to have a look.

I have no idea how to deal with the copy of psqlscan.[lh] from
psql. Currently they are simply dead copies of those of psql.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

Yeah, I think so, but symlinks could cause trouble with git and on Windows.
Another way would be to copy them from the psql directory. They live
next door to each other.

Indeed there are plenty of links already which are generated by makefiles
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
should be no file duplication within the source tree.

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Kyotaro HORIGUCHI (#26)
Re: pgbench - allow backslash-continuations in custom scripts

<oops, sorry, stalled post because of wrong from, posting again...>

Attached is the revised version of this patch.

The first patch is unchanged from before.

The second fixes a kind of bug.

The third is a new one that allows backslash continuation for
backslash commands.

Ah, thanks:-)

Would you consider adding the patch to the next commitfest? I may put
myself as a reviewer...

[...] I'm not satisfied by the design but I don't see another way..

I'll try to have a look.

I have no idea how to deal with the copy of psqlscan.[lh] from
psql. Currently they are simply dead copies of those of psql.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

Yeah, I think so, but symlinks could cause trouble with git and on Windows.
Another way would be to copy them from the psql directory. They live
next door to each other.

Indeed there are plenty of links already which are generated by makefiles
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
should be no file duplication within the source tree.

--
Fabien.

#29Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#27)
Re: pgbench - allow backslash-continuations in custom scripts

Hello,

Attached is the revised version of this patch.

The first patch is unchanged from before.

The second fixes a kind of bug.

The third is a new one that allows backslash continuation for
backslash commands.

Ah, thanks:-)

Would you consider adding the patch to the next commitfest? I may put
myself as a reviewer...

No problem.

[...] I'm not satisfied by the design but I don't see another way..

I'll try to have a look.

Thanks.

I have no idea how to deal with the copy of psqlscan.[lh] from
psql. Currently they are simply dead copies of the psql files.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

Yeah, I think so, but symlinks could cause trouble with git and on Windows.
Another way would be to copy them from the psql directory. They live
next door to each other.

Indeed there are plenty of links already which are generated by
makefiles (see src/bin/pg_xlogdump/*), and probably a copy is made on
windows. There should be no file duplication within the source tree.

Thank you for pointing that out. The pg_xlogdump Makefile
re-creates symlinks to those files, and they are replaced with cp
on platforms where symlinks are not usable. But the files
are explicitly added to the .sln file for MSVC. Anyway, I'll
address it.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#30Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Kyotaro HORIGUCHI (#26)
Re: pgbench - allow backslash-continuations in custom scripts

On 07/24/2015 11:36 AM, Kyotaro HORIGUCHI wrote:

At Fri, 24 Jul 2015 07:39:16 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1507240731050.12839@sto>

- backslash commands are handled the same as before: multiline
is not allowed.

Hmm... that is really the feature I wanted to add initially, too bad
it is the dropped one:-)

Ouch. The story has been derailed somewhere.

Since SQL statements can span multiple lines without any particular
marker, we cannot implement multiline backslash commands in the
same way..

I don't think we actually want backslash-continuations. The feature we
want is "allow SQL statements to span multiple lines", and using the psql
lexer solves that. We don't need the backslash-continuations when we
have that.

On 07/25/2015 05:53 PM, Fabien COELHO wrote:

I have no idea how to deal with the copy of psqlscan.[lh] from
psql. Currently they are simply dead copies of the psql files.

I think that there should be no copies, but it should use relative
symbolic links so that the files are kept synchronized.

Yeah, I think so, but symlinks could cause trouble with git and on Windows.
Another way would be to copy them from the psql directory. They live
next door to each other.

Indeed there are plenty of links already which are generated by makefiles
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
should be no file duplication within the source tree.

Yeah, following the example of pg_xlogdump and others is the way to go.

Docs need updating, and there's probably some cleanup to do before this
is ready for committing, but overall I think this is definitely the
right direction.

I complained upthread that this makes it impossible to use
"multi-statements" in pgbench, as they would be split into separate
statements, but looking at psqlscan.l there is actually a syntax for
that in psql already. You escape the semicolon as \;, e.g. "SELECT 1 \;
SELECT 2;", and then both queries will be sent to the server as one. So
even that's OK.
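
To illustrate the distinction described above, here is a minimal C sketch
that splits an input string at unescaped semicolons while treating \; as a
semicolon that stays inside the current statement. It is only an
illustration under stated assumptions: next_statement is a hypothetical
helper, not a function from psqlscan.l or from the patch, and it ignores
quoting, comments and dollar-quoting, all of which the real lexer has to
handle. In this sketch the terminating semicolon is dropped from the
returned text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: copy the next statement
 * starting at *cursor into a freshly malloc'd buffer.
 *
 * A bare ';' ends the statement; "\;" is copied as ';' without ending it.
 * Returns NULL when no terminated statement remains (the cursor is then
 * left untouched).  Quotes, comments and dollar-quoting are ignored.
 */
static char *
next_statement(const char **cursor)
{
	const char *p;
	char	   *out;
	size_t		n = 0;

	assert(cursor != NULL && *cursor != NULL);
	p = *cursor;

	/* skip leading whitespace */
	while (*p == ' ' || *p == '\t' || *p == '\n')
		p++;

	/* the output can never be longer than the remaining input */
	out = malloc(strlen(p) + 1);
	if (out == NULL)
		return NULL;

	for (; *p != '\0'; p++)
	{
		if (p[0] == '\\' && p[1] == ';')
		{
			/* escaped semicolon: keep it, stay in the same statement */
			out[n++] = ';';
			p++;
		}
		else if (*p == ';')
		{
			/* unescaped semicolon: the statement is complete */
			out[n] = '\0';
			*cursor = p + 1;
			return out;
		}
		else
			out[n++] = *p;
	}

	/* ran out of input before a terminating ';' */
	free(out);
	return NULL;
}
```

Called repeatedly on "SELECT 1 \; SELECT 2; SELECT 3;", this helper returns
two statements: the first contains both SELECTs, the second only SELECT 3.
The caller owns each returned buffer and frees it.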

- Heikki

#31Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Heikki Linnakangas (#30)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Heikki,

I don't think we actually want backslash-continuations. The feature we want
is "allow SQL statements to span multiple lines", and using the psql lexer
solves that. We don't need the backslash-continuations when we have that.

Sure. The feature *I* initially wanted was to have multi-line
meta-commands. For this feature ISTM that continuations are, alas, the
solution.

Indeed there are plenty of links already which are generated by makefiles
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
should be no file duplication within the source tree.

Yeah, following the example of pg_xlogdump and others is the way to go.

Docs need updating, and there's probably some cleanup to do before this is
ready for committing, but overall I think this is definitely the right
direction.

I've created an entry for the next commitfest, and put the status to
"waiting on author".

I complained upthread that this makes it impossible to use "multi-statements"
in pgbench, as they would be split into separate statements, but looking at
psqlscan.l there is actually a syntax for that in psql already. You escape
the semicolon as \;, e.g. "SELECT 1 \; SELECT 2;", and then both queries will
be sent to the server as one. So even that's OK.

Good!

--
Fabien.

#32Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#31)
3 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hi, all.

I don't think we actually want backslash-continuations. The feature we
want is "allow SQL statements to span multiple lines", and using the psql
lexer solves that. We don't need the backslash-continuations when we
have that.

Sure. The feature *I* initially wanted was to have multi-line
meta-commands. For this feature ISTM that continuations are, alas, the
solution.

Indeed there are plenty of links already which are generated by
makefiles (see src/bin/pg_xlogdump/*), and probably a copy is made on
windows. There should be no file duplication within the source tree.

Yeah, following the example of pg_xlogdump and others is the way to
go.

Docs need updating, and there's probably some cleanup to do before
this is ready for committing, but overall I think this is definitely
the right direction.

I've created an entry for the next commitfest, and put the status to
"waiting on author".

I complained upthread that this makes it impossible to use
"multi-statements" in pgbench, as they would be split into separate
statements, but looking at psqlscan.l there is actually a syntax for
that in psql already. You escape the semicolon as \;, e.g. "SELECT 1
\; SELECT 2;", and then both queries will be sent to the server as
one. So even that's OK.

Good!

Hmm. psqlscan.l handles multi-statements naturally.
I worked on that, and the attached patch set provides:

- backslash continuation for pgbench metacommands.

set variable \
<some value>

- SQL statement natural continuation lines.

SELECT :foo
FROM :bar;

- SQL multi-statement.

SELECT 1; SELECT 2;
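
The backslash continuation for metacommands listed above can be sketched
in C roughly as follows. This is a simplified illustration, not the code
from the attached patches: append_line and its fixed-size accumulator are
hypothetical, whereas the patch itself collects the text in a PQExpBuffer.
The sketch drops a trailing backslash, appends the rest of the line, and
inserts a single space so that words from adjacent lines do not run
together.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not the patch's code: append one input line to the
 * accumulator "acc" (capacity "cap", must already hold a NUL-terminated
 * string shorter than cap).
 *
 * Returns true when the line ended with a backslash, i.e. the command
 * continues on the next line; false when the command is complete.
 */
static bool
append_line(char *acc, size_t cap, const char *line)
{
	size_t		len;
	size_t		used;
	bool		continued = false;

	assert(acc != NULL && line != NULL && cap > 0);

	/* ignore the line terminator, if any */
	len = strcspn(line, "\r\n");

	/* a trailing backslash marks a continuation and is dropped */
	if (len > 0 && line[len - 1] == '\\')
	{
		continued = true;
		len--;
	}

	used = strlen(acc);
	if (len > cap - 1 - used)
		len = cap - 1 - used;	/* truncate rather than overflow */
	memcpy(acc + used, line, len);
	used += len;
	acc[used] = '\0';

	/* keep words on adjacent lines apart */
	if (continued && used > 0 && acc[used - 1] != ' ' && used < cap - 1)
	{
		acc[used++] = ' ';
		acc[used] = '\0';
	}

	return continued;
}
```

A caller would feed lines one at a time and hand the accumulator to the
metacommand parser once append_line returns false, then reset it to the
empty string for the next command.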

The remaining work is to eliminate the double format of the Command
struct.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Prepare-for-share-psqlscan-with-pgbench.patch (text/x-patch; charset=us-ascii)
From 274bc1cd6de4fb5806e308b002b086b1dfdf7479 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/3] Prepare for share psqlscan with pgbench.

Eliminate direct usage of pset variables and enable parts unnecessary
for other than psql to be disabled by defining OUTSIDE_PSQL.
---
 src/bin/psql/mainloop.c |  6 ++--
 src/bin/psql/psqlscan.h | 14 +++++----
 src/bin/psql/psqlscan.l | 79 ++++++++++++++++++++++++++++++++-----------------
 src/bin/psql/startup.c  |  4 +--
 4 files changed, 67 insertions(+), 36 deletions(-)

diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index b6cef94..e98cb94 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -233,7 +233,8 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		psql_scan_setup(scan_state, line, strlen(line),
+						pset.db, pset.vars, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +374,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									pset.db, pset.vars, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 55070ca..4bf8dcb 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,7 +11,11 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-
+#if !defined OUTSIDE_PSQL
+#include "variables.h"
+#else
+typedef int * VariableSpace;
+#endif
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
@@ -36,12 +40,11 @@ enum slash_option_type
 	OT_NO_EVAL					/* no expansion of backticks or variables */
 };
 
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
@@ -52,6 +55,7 @@ extern void psql_scan_reset(PsqlScanState state);
 
 extern bool psql_scan_in_quote(PsqlScanState state);
 
+#if !defined OUTSIDE_PSQL
 extern char *psql_scan_slash_command(PsqlScanState state);
 
 extern char *psql_scan_slash_option(PsqlScanState state,
@@ -60,5 +64,5 @@ extern char *psql_scan_slash_option(PsqlScanState state,
 					   bool semicolon);
 
 extern void psql_scan_slash_command_end(PsqlScanState state);
-
+#endif	 /* if !defined OUTSIDE_PSQL */
 #endif   /* PSQLSCAN_H */
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index be059ab..f9a19cd 100644
--- a/src/bin/psql/psqlscan.l
+++ b/src/bin/psql/psqlscan.l
@@ -43,11 +43,6 @@
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
 /*
  * We use a stack of flex buffers to handle substitution of psql variables.
  * Each stacked buffer contains the as-yet-unread text from one psql variable.
@@ -81,10 +76,12 @@ typedef struct PsqlScanStateData
 	const char *scanline;		/* current input line at outer level */
 
 	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
 	int			encoding;		/* encoding being used now */
 	bool		safe_encoding;	/* is current encoding "safe"? */
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
@@ -126,6 +123,15 @@ static void escape_variable(bool as_ident);
 
 #define ECHO emit(yytext, yyleng)
 
+/* Provide dummy macros when no use of psql variables */
+#if defined OUTSIDE_PSQL
+#define GetVariable(space,name) NULL
+#define standard_strings() true
+#define psql_error(fmt,...) do { \
+	fprintf(stderr, "psql_error is called. abort.\n");\
+	exit(1);\
+} while(0)
+#endif
 %}
 
 %option 8bit
@@ -736,11 +742,14 @@ other			.
 
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char   *varname = NULL;
+					const char *value = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
 
 					if (value)
 					{
@@ -769,7 +778,8 @@ other			.
 						ECHO;
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
 :'{variable_char}+'	{
@@ -1033,9 +1043,12 @@ other			.
 						char   *varname;
 						const char *value;
 
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
 
 						/*
 						 * The variable value is just emitted without any
@@ -1227,17 +1240,19 @@ psql_scan_destroy(PsqlScanState state)
  * or freed until after psql_scan_finish is called.
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
 
 	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
+	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
+	state->vars = vars;
+
 	/* needed for prepare_buffer */
 	cur_state = state;
 
@@ -1459,6 +1474,7 @@ psql_scan_in_quote(PsqlScanState state)
 	return state->start_state != INITIAL;
 }
 
+#if !defined OUTSIDE_PSQL
 /*
  * Scan the command name of a psql backslash command.  This should be called
  * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
@@ -1615,7 +1631,7 @@ psql_scan_slash_option(PsqlScanState state,
 					{
 						if (!inquotes && type == OT_SQLID)
 							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
+						cp += PQmblen(cp, cur_state->encoding);
 					}
 				}
 			}
@@ -1744,6 +1760,14 @@ evaluate_backtick(void)
 
 	termPQExpBuffer(&cmd_output);
 }
+#else
+static void
+evaluate_backtick(void)
+{
+	fprintf(stderr, "Unexpected call of evaluate_backtick.\n");
+	exit(1);
+}
+#endif /* if !defined OUTSIDE_PSQL*/
 
 /*
  * Push the given string onto the stack of stuff to scan.
@@ -1944,15 +1968,18 @@ escape_variable(bool as_ident)
 	char	   *varname;
 	const char *value;
 
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
 
 	/* Escaping. */
 	if (value)
 	{
-		if (!pset.db)
+		if (!cur_state->db)
 			psql_error("can't escape without active connection\n");
 		else
 		{
@@ -1960,16 +1987,14 @@ escape_variable(bool as_ident)
 
 			if (as_ident)
 				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
 			else
 				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
+					PQescapeLiteral(cur_state->db, value, strlen(value));
 
 			if (escaped_value == NULL)
 			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
+				psql_error("%s", PQerrorMessage(cur_state->db));
 			}
 			else
 			{
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 28ba75a..c143dfe 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -305,8 +305,8 @@ main(int argc, char *argv[])
 
 		scan_state = psql_scan_create();
 		psql_scan_setup(scan_state,
-						options.action_string,
-						strlen(options.action_string));
+						options.action_string, strlen(options.action_string),
+						pset.db, pset.vars, pset.encoding);
 
 		successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 			? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1

0002-Make-use-of-psqlscan-for-parsing-of-custom-script.patch (text/x-patch; charset=us-ascii)
From 2fd47e1a7de8ad49999241cc83271de78e4b2b6e Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Jul 2015 10:58:23 +0900
Subject: [PATCH 2/3] Make use of psqlscan for parsing of custom script.

Get rid of home-made parser and allow backslash continuation for
backslash commands, multiline SQL statements and SQL multi statement
in custom scripts.
---
 src/bin/pgbench/Makefile  |  16 +-
 src/bin/pgbench/pgbench.c | 479 +++++++++++++++++++++++++++++++---------------
 2 files changed, 341 insertions(+), 154 deletions(-)

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..a0a736b 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -5,11 +5,13 @@ PGAPPICON = win32
 
 subdir = src/bin/pgbench
 top_builddir = ../../..
+psqlincdir = ../psql
 include $(top_builddir)/src/Makefile.global
 
 OBJS = pgbench.o exprparse.o $(WIN32RES)
 
-override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+
+override CPPFLAGS := -DOUTSIDE_PSQL -I. -I$(srcdir) -I$(libpq_srcdir) -I$(psqlincdir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
 override CFLAGS += $(PTHREAD_CFLAGS)
@@ -18,6 +20,16 @@ endif
 
 all: pgbench
 
+psqlscan.c: FLEXFLAGS = -Cfe -p -p
+psqlscan.c: FLEX_NO_BACKUP=yes
+
+psqlscan.l: % : $(top_srcdir)/src/bin/psql/%
+	 rm -f $@ && $(LN_S)  $< .
+
+psqlscan.c:  psqlscan.l
+
+pgbench.o: psqlscan.c
+
 pgbench: $(OBJS) | submake-libpq submake-libpgport
 	$(CC) $(CFLAGS) $^ $(libpq_pgport) $(PTHREAD_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
 
@@ -39,4 +51,4 @@ clean distclean:
 	rm -f pgbench$(X) $(OBJS)
 
 maintainer-clean: distclean
-	rm -f exprparse.c exprscan.c
+	rm -f exprparse.c exprscan.c psqlscan.l psqlscan.c
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 6f5bd99..3b1ab7a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,7 +54,7 @@
 #endif
 
 #include "pgbench.h"
-
+#include "psqlscan.h"
 /*
  * Multi-platform pthread implementations
  */
@@ -262,7 +262,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -271,6 +271,7 @@ typedef struct
 	char	   *argv[MAX_ARGS]; /* command word list */
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
+	struct Command_t *next;		/* more command if any, for multistatements */
 } Command;
 
 typedef struct
@@ -293,6 +294,21 @@ typedef struct
 	double		sum2_lag;		/* sum(lag*lag) */
 } AggVals;
 
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static Command **sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -2222,217 +2238,348 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string after removing newlines and collapsing whitespaces */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines into spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace(*p) && isspace(*(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct  */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
+
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace(*p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
 		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
+	j = 0;
+	tok = strtok(++p, delim);
 
-		my_commands->type = META_COMMAND;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
+	parse_reset_outbuf(proc_state);
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*
+		 * parsing: \setrandom variable min max [uniform] \setrandom
+		 * variable min max (gaussian|exponential) threshold
+		 */
 
-		while (tok != NULL)
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*
-			 * parsing: \setrandom variable min max [uniform] \setrandom
-			 * variable min max (gaussian|exponential) threshold
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
+							 "missing threshold argument", my_commands->argv[4], -1);
 			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-					 "missing threshold argument", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
-			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.  We
+		 * don't have to terminate the number argument with null because it
+		 * will be parsed with atoi, which ignores trailing non-digit
+		 * characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
+
+/* Parse a input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation line of backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
+	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), NULL, NULL, 0);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Put this command at the end of returning command chain */
+		if (!retcomd)
+			retcomd = command;
+		else
+		{
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
+	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n", source, lineno);
+		exit(1);
 	}
 
-	return my_commands;
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
+
+	return retcomd;
 }
 
+
 /*
  * Read a line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
@@ -2487,6 +2634,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2507,33 +2655,47 @@ process_file(char *filename)
 		return false;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
-
+		command = process_commands(proc_state, buf, filename, lineno);
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. Immediately ask for the next line for the
+			 * cases.
+			 */
 			continue;
+		}
 
-		my_commands[index] = command;
-		index++;
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
 
-		if (index >= alloc_num)
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands = pg_realloc(my_commands,
+									 sizeof(Command *) * alloc_num);
 		}
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	sql_files[num_files++] = my_commands;
@@ -2551,6 +2713,7 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2577,10 +2740,12 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2592,6 +2757,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
@@ -3922,3 +4088,12 @@ pthread_join(pthread_t th, void **thread_return)
 }
 
 #endif   /* WIN32 */
+
+/*
+ * psqlscan.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system
+ * include files, else things tend to break on platforms that have
+ * multiple infrastructures for stdio.h and so on.  flex is absolutely
+ * uncooperative about that, so we can't compile psqlscan.c on its own.
+ */
+#include "psqlscan.c"
-- 
1.8.3.1

0003-Change-MSVC-Build-script.patchtext/x-patch; charset=us-asciiDownload
From 884d7fadc48f59709a466383d0891d714662ab77 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 4 Aug 2015 20:54:28 +0900
Subject: [PATCH 3/3] Change MSVC Build script

---
 src/tools/msvc/Mkvcbuild.pm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 3abbb4c..f018a29 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -68,7 +68,7 @@ my $frontend_extrasource = {
 	  [ 'src/bin/pgbench/exprscan.l', 'src/bin/pgbench/exprparse.y' ], };
 my @frontend_excludes = (
 	'pgevent',     'pg_basebackup', 'pg_rewind', 'pg_dump',
-	'pg_xlogdump', 'scripts');
+	'pg_xlogdump', 'pgbench', 'scripts');
 
 sub mkvcbuild
 {
@@ -671,6 +671,14 @@ sub mkvcbuild
 	}
 	$pg_xlogdump->AddFile('src/backend/access/transam/xlogreader.c');
 
+	# fix up pg_xlogdump once it's been set up
+	# files symlinked on Unix are copied on windows
+	my $pgbench = AddSimpleFrontend('pgbench');
+	$pgbench->AddDefine('FRONTEND');
+	$pgbench->AddDefine('OUTSIDE_PSQL');
+	$pgbench->AddFile('src/bin/psql/psqlscan.l');
+	$pgbench->AddIncludeDir('src/bin/psql');
+
 	$solution->Save();
 	return $solution->{vcver};
 }
-- 
1.8.3.1

#33Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#32)
4 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hello, thank you for registering this for CF-Sep.
I missed the registration window. It ended earlier than usual.

Most troubles have gone and I'll be back next week.

The remaining work is eliminating the double format of the
Command struct.

This is done as an additional fourth patch, not merged into the
previous ones, to show what has changed in the way commands are
stored.

I am reposting on this thread the new version of the patch set,
which includes this patch and those posted before. It is rebased
onto current master.

The behavioral changes brought by this patch set were described
in the previous mail as follows:

Hmm. psqlscan.l handles multistatement naturally.
I worked on that, and the attached patch set provides:

- backslash continuation for pgbench metacommands.

\set variable \
<some value>

- SQL statement natural continuation lines.

SELECT :foo
FROM :bar;

- SQL multi-statement.

SELECT 1; SELECT 2;
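
As a rough illustration of the first item only, backslash continuation amounts to accumulating physical lines until one does not end in a backslash. The sketch below is not taken from the patch: the helper name append_continued_line and the fixed-size accumulator are assumptions made for this example (the patch itself accumulates into a PQExpBuffer).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.  Appends one physical line
 * to the NUL-terminated accumulator "acc" (capacity "accsize").
 * Returns true when the logical line is complete, false when the line
 * ended with a backslash and another line must be read.
 */
static bool
append_continued_line(char *acc, size_t accsize, const char *line)
{
	size_t		used = strlen(acc);
	size_t		len = strcspn(line, "\n");	/* ignore the trailing newline */
	bool		continued = (len > 0 && line[len - 1] == '\\');

	if (continued)
		len--;					/* drop the backslash itself */

	/* copy the line body; snprintf truncates if the accumulator is full */
	snprintf(acc + used, accsize - used, "%.*s", (int) len, line);

	if (continued)
	{
		used = strlen(acc);
		/* keep words on adjacent lines separated by at least one blank */
		if (used > 0 && acc[used - 1] != ' ' && used + 1 < accsize)
		{
			acc[used] = ' ';
			acc[used + 1] = '\0';
		}
	}

	return !continued;
}
```

Feeding it the two lines of the metacommand example above leaves one logical line in the accumulator, which can then be tokenized as before.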

Each of the four patches does the following things:

1. 0001-Prepare-to-share-psqlscan-with-pgbench.patch

The global variable pset, VariableSpace and the backslash syntax
of psql are the obstacles that prevent psqlscan.l from being used
by pgbench. This patch eliminates direct references to pset,
masks the VariableSpace feature (it looks ugly..), and allows the
backslash syntax in psqlscan.l to be hidden outside psql by
defining the symbol OUTSIDE_PSQL.

No behavioral changes to psql are introduced by this patch.

2. 0002-Make-use-of-psqlscan-for-parsing-of-custom-script.patch

This is the core of the patch set. It makes pgbench use
psqlscan.l and enables multi-statement lines, multiline SQL
statements and backslash continuation of metacommands.

The struct Command is modified so that it can be chained, in
order to convey multiple statements on one line. But the commands
are still stored in an array of Command just outside
process_commands, as before. This double formatting will be
removed by the fourth patch.

psqlscan.c is compiled as a part of mainloop.c for the reason
described at the end of that file. I haven't confirmed that the
same problem would occur in pgbench, but I did the same thing for
pgbench.c.

Compilation will fail on Windows as of this patch.

3. 0003-Change-MSVC-Build-script.patch

Changes the build script for the Windows platform. It certainly
works but might look a bit odd because of the anomalous way
psqlscan.l is compiled.

4. 0004-Change-the-way-to-hold-command-list.patch

Changes the way the command list is held from an array to a
linked list, to remove the double formatting of the command list
introduced by the second patch. As a side effect, this removes
the explicit limit on the number of commands in a script.
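
To make the "double formatting" in items 2 and 4 concrete, here is a minimal sketch of copying a chain of commands into a NULL-terminated, growable array. The types Cmd and CmdArray and the function cmd_array_append_chain are invented for this example and are not the names used in pgbench.c. Because one input line may now yield several commands, the sketch checks capacity before every store rather than once per line.

```c
#include <stdlib.h>

/* Hypothetical stand-in for pgbench's Command struct. */
typedef struct Cmd
{
	const char *text;			/* statement text */
	struct Cmd *next;			/* next statement from the same line */
} Cmd;

/* Hypothetical growable, NULL-terminated array of commands. */
typedef struct CmdArray
{
	Cmd		  **items;
	size_t		count;			/* number of stored commands */
	size_t		capacity;		/* allocated slots, including terminator */
} CmdArray;

/*
 * Append every command of "chain" to "arr", growing the array as needed.
 * Returns 0 on success, -1 if memory allocation fails.
 */
static int
cmd_array_append_chain(CmdArray *arr, Cmd *chain)
{
	for (; chain != NULL; chain = chain->next)
	{
		/* need room for this command plus the NULL terminator */
		if (arr->count + 2 > arr->capacity)
		{
			size_t		newcap = arr->capacity ? arr->capacity * 2 : 4;
			Cmd		  **tmp = realloc(arr->items, newcap * sizeof(Cmd *));

			if (tmp == NULL)
				return -1;
			arr->items = tmp;
			arr->capacity = newcap;
		}
		arr->items[arr->count++] = chain;
		arr->items[arr->count] = NULL;
	}
	return 0;
}
```

With a pure linked list, as in the fourth patch, this copy step and its capacity bookkeeping are not needed at all, which is why the limit on the number of commands goes away.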

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Prepare-to-share-psqlscan-with-pgbench.patchtext/x-patch; charset=us-asciiDownload
From 6842b81ba19f779244e4cad3ab3ba69db537b3ea Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/4] Prepare to share psqlscan with pgbench.

Eliminate direct usage of pset variables and enable parts unnecessary
for other than psql to be disabled by defining OUTSIDE_PSQL.
---
 src/bin/psql/mainloop.c |  6 ++--
 src/bin/psql/psqlscan.h | 14 +++++----
 src/bin/psql/psqlscan.l | 79 ++++++++++++++++++++++++++++++++-----------------
 src/bin/psql/startup.c  |  4 +--
 4 files changed, 67 insertions(+), 36 deletions(-)

diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index b6cef94..e98cb94 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -233,7 +233,8 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		psql_scan_setup(scan_state, line, strlen(line),
+						pset.db, pset.vars, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +374,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									pset.db, pset.vars, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 55070ca..4bf8dcb 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,7 +11,11 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-
+#if !defined OUTSIDE_PSQL
+#include "variables.h"
+#else
+typedef int * VariableSpace;
+#endif
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
@@ -36,12 +40,11 @@ enum slash_option_type
 	OT_NO_EVAL					/* no expansion of backticks or variables */
 };
 
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
@@ -52,6 +55,7 @@ extern void psql_scan_reset(PsqlScanState state);
 
 extern bool psql_scan_in_quote(PsqlScanState state);
 
+#if !defined OUTSIDE_PSQL
 extern char *psql_scan_slash_command(PsqlScanState state);
 
 extern char *psql_scan_slash_option(PsqlScanState state,
@@ -60,5 +64,5 @@ extern char *psql_scan_slash_option(PsqlScanState state,
 					   bool semicolon);
 
 extern void psql_scan_slash_command_end(PsqlScanState state);
-
+#endif	 /* if !defined OUTSIDE_PSQL */
 #endif   /* PSQLSCAN_H */
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index be059ab..f9a19cd 100644
--- a/src/bin/psql/psqlscan.l
+++ b/src/bin/psql/psqlscan.l
@@ -43,11 +43,6 @@
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
 /*
  * We use a stack of flex buffers to handle substitution of psql variables.
  * Each stacked buffer contains the as-yet-unread text from one psql variable.
@@ -81,10 +76,12 @@ typedef struct PsqlScanStateData
 	const char *scanline;		/* current input line at outer level */
 
 	/* safe_encoding, curline, refline are used by emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
 	int			encoding;		/* encoding being used now */
 	bool		safe_encoding;	/* is current encoding "safe"? */
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
+	VariableSpace vars;			/* "shell variable" repository */
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
@@ -126,6 +123,15 @@ static void escape_variable(bool as_ident);
 
 #define ECHO emit(yytext, yyleng)
 
+/* Provide dummy macros when no use of psql variables */
+#if defined OUTSIDE_PSQL
+#define GetVariable(space,name) NULL
+#define standard_strings() true
+#define psql_error(fmt,...) do { \
+	fprintf(stderr, "psql_error is called. abort.\n");\
+	exit(1);\
+} while(0)
+#endif
 %}
 
 %option 8bit
@@ -736,11 +742,14 @@ other			.
 
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char   *varname = NULL;
+					const char *value = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
 
 					if (value)
 					{
@@ -769,7 +778,8 @@ other			.
 						ECHO;
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
 :'{variable_char}+'	{
@@ -1033,9 +1043,12 @@ other			.
 						char   *varname;
 						const char *value;
 
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
 
 						/*
 						 * The variable value is just emitted without any
@@ -1227,17 +1240,19 @@ psql_scan_destroy(PsqlScanState state)
  * or freed until after psql_scan_finish is called.
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
 
 	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
+	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
+	state->vars = vars;
+
 	/* needed for prepare_buffer */
 	cur_state = state;
 
@@ -1459,6 +1474,7 @@ psql_scan_in_quote(PsqlScanState state)
 	return state->start_state != INITIAL;
 }
 
+#if !defined OUTSIDE_PSQL
 /*
  * Scan the command name of a psql backslash command.  This should be called
  * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
@@ -1615,7 +1631,7 @@ psql_scan_slash_option(PsqlScanState state,
 					{
 						if (!inquotes && type == OT_SQLID)
 							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
+						cp += PQmblen(cp, cur_state->encoding);
 					}
 				}
 			}
@@ -1744,6 +1760,14 @@ evaluate_backtick(void)
 
 	termPQExpBuffer(&cmd_output);
 }
+#else
+static void
+evaluate_backtick(void)
+{
+	fprintf(stderr, "Unexpected call of evaluate_backtick.\n");
+	exit(1);
+}
+#endif /* if !defined OUTSIDE_PSQL*/
 
 /*
  * Push the given string onto the stack of stuff to scan.
@@ -1944,15 +1968,18 @@ escape_variable(bool as_ident)
 	char	   *varname;
 	const char *value;
 
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
 
 	/* Escaping. */
 	if (value)
 	{
-		if (!pset.db)
+		if (!cur_state->db)
 			psql_error("can't escape without active connection\n");
 		else
 		{
@@ -1960,16 +1987,14 @@ escape_variable(bool as_ident)
 
 			if (as_ident)
 				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
 			else
 				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
+					PQescapeLiteral(cur_state->db, value, strlen(value));
 
 			if (escaped_value == NULL)
 			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
+				psql_error("%s", PQerrorMessage(cur_state->db));
 			}
 			else
 			{
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 28ba75a..c143dfe 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -305,8 +305,8 @@ main(int argc, char *argv[])
 
 		scan_state = psql_scan_create();
 		psql_scan_setup(scan_state,
-						options.action_string,
-						strlen(options.action_string));
+						options.action_string, strlen(options.action_string),
+						pset.db, pset.vars, pset.encoding);
 
 		successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 			? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1

0002-Make-use-of-psqlscan-for-parsing-of-custom-script.patchtext/x-patch; charset=us-asciiDownload
From 71f435b2c9173ddb99a7ba89ce583bfff0c8a400 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 24 Jul 2015 10:58:23 +0900
Subject: [PATCH 2/4] Make use of psqlscan for parsing of custom script.

Make use of psqlscan instead of the home-made parser allowing
backslash continuation for backslash commands, multiline SQL
statements and SQL multi statement in custom scripts.
---
 src/bin/pgbench/Makefile  |  16 +-
 src/bin/pgbench/pgbench.c | 478 +++++++++++++++++++++++++++++++---------------
 2 files changed, 341 insertions(+), 153 deletions(-)

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..a0a736b 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -5,11 +5,13 @@ PGAPPICON = win32
 
 subdir = src/bin/pgbench
 top_builddir = ../../..
+psqlincdir = ../psql
 include $(top_builddir)/src/Makefile.global
 
 OBJS = pgbench.o exprparse.o $(WIN32RES)
 
-override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+
+override CPPFLAGS := -DOUTSIDE_PSQL -I. -I$(srcdir) -I$(libpq_srcdir) -I$(psqlincdir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
 override CFLAGS += $(PTHREAD_CFLAGS)
@@ -18,6 +20,16 @@ endif
 
 all: pgbench
 
+psqlscan.c: FLEXFLAGS = -Cfe -p -p
+psqlscan.c: FLEX_NO_BACKUP=yes
+
+psqlscan.l: % : $(top_srcdir)/src/bin/psql/%
+	 rm -f $@ && $(LN_S)  $< .
+
+psqlscan.c:  psqlscan.l
+
+pgbench.o: psqlscan.c
+
 pgbench: $(OBJS) | submake-libpq submake-libpgport
 	$(CC) $(CFLAGS) $^ $(libpq_pgport) $(PTHREAD_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
 
@@ -39,4 +51,4 @@ clean distclean:
 	rm -f pgbench$(X) $(OBJS)
 
 maintainer-clean: distclean
-	rm -f exprparse.c exprscan.c
+	rm -f exprparse.c exprscan.c psqlscan.l psqlscan.c
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 30e8d2a..b6fd399 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -264,7 +265,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -273,6 +274,7 @@ typedef struct
 	char	   *argv[MAX_ARGS]; /* command word list */
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
+	struct Command_t *next;		/* more command if any, for multistatements */
 } Command;
 
 typedef struct
@@ -295,6 +297,21 @@ typedef struct
 	double		sum2_lag;		/* sum(lag*lag) */
 } AggVals;
 
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static Command **sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -2224,217 +2241,348 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string after removing newlines and collapsing whitespaces */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines into spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace(*p) && isspace(*(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct  */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
+
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace(*p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
 		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
+	j = 0;
+	tok = strtok(++p, delim);
 
-		my_commands->type = META_COMMAND;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
+	parse_reset_outbuf(proc_state);
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*
+		 * parsing: \setrandom variable min max [uniform] \setrandom
+		 * variable min max (gaussian|exponential) threshold
+		 */
 
-		while (tok != NULL)
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*
-			 * parsing: \setrandom variable min max [uniform] \setrandom
-			 * variable min max (gaussian|exponential) threshold
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
+							 "missing threshold argument", my_commands->argv[4], -1);
 			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-					 "missing threshold argument", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
-			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.  We
+		 * don't have to terminate the number argument with null because it
+		 * will be parsed with atoi, which ignores trailing non-digit
+		 * characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
+
+/* Parse a input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation line of backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
+	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), NULL, NULL, 0);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Put this command at the end of returning command chain */
+		if (!retcomd)
+			retcomd = command;
+		else
+		{
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
+	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n", source, lineno);
+		exit(1);
 	}
 
-	return my_commands;
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
+
+	return retcomd;
 }
 
+
 /*
  * Read a line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
@@ -2489,6 +2637,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2509,33 +2658,47 @@ process_file(char *filename)
 		return false;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
-
+		command = process_commands(proc_state, buf, filename, lineno);
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. Immediately ask for the next line for the
+			 * cases.
+			 */
 			continue;
+		}
 
-		my_commands[index] = command;
-		index++;
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
 
-		if (index >= alloc_num)
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands = pg_realloc(my_commands,
+									 sizeof(Command *) * alloc_num);
 		}
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	sql_files[num_files++] = my_commands;
@@ -2553,6 +2716,7 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2579,10 +2743,12 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2594,6 +2760,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
@@ -3934,3 +4101,12 @@ pthread_join(pthread_t th, void **thread_return)
 }
 
 #endif   /* WIN32 */
+
+/*
+ * psqlscan.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system
+ * include files, else things tend to break on platforms that have
+ * multiple infrastructures for stdio.h and so on.  flex is absolutely
+ * uncooperative about that, so we can't compile psqlscan.c on its own.
+ */
+#include "psqlscan.c"
-- 
1.8.3.1

0003-Change-MSVC-Build-script.patch (text/x-patch; charset=us-ascii)
From 0a9a27dbd032bc0aa736b03b48987b71fe21ac3c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Tue, 4 Aug 2015 20:54:28 +0900
Subject: [PATCH 3/4] Change MSVC Build script

---
 src/tools/msvc/Mkvcbuild.pm | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 3abbb4c..f018a29 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -68,7 +68,7 @@ my $frontend_extrasource = {
 	  [ 'src/bin/pgbench/exprscan.l', 'src/bin/pgbench/exprparse.y' ], };
 my @frontend_excludes = (
 	'pgevent',     'pg_basebackup', 'pg_rewind', 'pg_dump',
-	'pg_xlogdump', 'scripts');
+	'pg_xlogdump', 'pgbench', 'scripts');
 
 sub mkvcbuild
 {
@@ -671,6 +671,14 @@ sub mkvcbuild
 	}
 	$pg_xlogdump->AddFile('src/backend/access/transam/xlogreader.c');
 
+	# fix up pg_xlogdump once it's been set up
+	# files symlinked on Unix are copied on windows
+	my $pgbench = AddSimpleFrontend('pgbench');
+	$pgbench->AddDefine('FRONTEND');
+	$pgbench->AddDefine('OUTSIDE_PSQL');
+	$pgbench->AddFile('src/bin/psql/psqlscan.l');
+	$pgbench->AddIncludeDir('src/bin/psql');
+
 	$solution->Save();
 	return $solution->{vcver};
 }
-- 
1.8.3.1

0004-Change-the-way-to-hold-command-list.patch (text/x-patch; charset=us-ascii)
From 328bea28c4fc777d483b2c7837fcb8fafcd08923 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 19 Aug 2015 12:53:13 +0900
Subject: [PATCH 4/4] Change the way to hold command list.

Commands are generated as a linked list and stored into and accessed
as an array. This patch unifies the way to store them to linked list.
---
 src/bin/pgbench/pgbench.c | 189 +++++++++++++++++++++++-----------------------
 1 file changed, 95 insertions(+), 94 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b6fd399..285ccca 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -191,16 +191,29 @@ typedef struct
 
 #define MAX_FILES		128		/* max number of SQL script files allowed */
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
+#define MAX_ARGS		10
 
 /*
  * structures used in custom query mode
  */
 
+typedef struct Command_t
+{
+	char	   *line;			/* full text of command line */
+	int			command_num;	/* unique index of this Command struct */
+	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
+	int			argc;			/* number of command words */
+	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
+	PgBenchExpr *expr;			/* parsed expression */
+	struct Command_t *next;		/* more command if any, for multistatements */
+} Command;
+
 typedef struct
 {
 	PGconn	   *con;			/* connection handle to DB */
 	int			id;				/* client No. */
-	int			state;			/* state No. */
+	Command	   *curr;			/* current command */
 	int			listen;			/* 0 indicates that an async query has been
 								 * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
@@ -252,7 +265,6 @@ typedef struct
  */
 #define SQL_COMMAND		1
 #define META_COMMAND	2
-#define MAX_ARGS		10
 
 typedef enum QueryMode
 {
@@ -265,18 +277,6 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct Command_t
-{
-	char	   *line;			/* full text of command line */
-	int			command_num;	/* unique index of this Command struct */
-	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
-	int			argc;			/* number of command words */
-	char	   *argv[MAX_ARGS]; /* command word list */
-	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
-	PgBenchExpr *expr;			/* parsed expression */
-	struct Command_t *next;		/* more command if any, for multistatements */
-} Command;
-
 typedef struct
 {
 
@@ -312,7 +312,7 @@ typedef struct ParseInfo
 } ParseInfoData;
 typedef ParseInfoData *ParseInfo;
 
-static Command **sql_files[MAX_FILES];	/* SQL script files */
+static Command *sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -1140,12 +1140,27 @@ agg_vals_init(AggVals *aggs, instr_time start)
 	aggs->start_time = INSTR_TIME_GET_DOUBLE(start);
 }
 
+/* Return the ordinal of a command list item in a list */
+static int
+get_command_number(Command *head, Command *curr)
+{
+	int i;
+	Command *p = head;
+
+	for (i = 0 ; p && p != curr ; p = p->next, i++);
+
+	/* curr must be in the list */
+	Assert(p);
+
+	return i;
+}
+
 /* return false iff client should be disconnected */
 static bool
 doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVals *agg)
 {
 	PGresult   *res;
-	Command   **commands;
+	Command    *commands;
 	bool		trans_needs_throttle = false;
 	instr_time	now;
 
@@ -1242,13 +1257,14 @@ top:
 
 	if (st->listen)
 	{							/* are we receiver? */
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			if (debug)
 				fprintf(stderr, "client %d receiving\n", st->id);
 			if (!PQconsumeInput(st->con))
 			{					/* there's something wrong */
-				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->state);
+				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n",
+						st->id,	get_command_number(commands, st->curr));
 				return clientDone(st, false);
 			}
 			if (PQisBusy(st->con))
@@ -1261,7 +1277,7 @@ top:
 		 */
 		if (is_latencies)
 		{
-			int			cnum = commands[st->state]->command_num;
+			int			cnum = st->curr->command_num;
 
 			if (INSTR_TIME_IS_ZERO(now))
 				INSTR_TIME_SET_CURRENT(now);
@@ -1271,7 +1287,7 @@ top:
 		}
 
 		/* transaction finished: calculate latency and log the transaction */
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			/* only calculate latency if an option is used that needs it */
 			if (progress || throttle_delay || latency_limit)
@@ -1304,7 +1320,7 @@ top:
 				doLog(thread, st, logfile, &now, agg, false);
 		}
 
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			/*
 			 * Read and discard the query result; note this is not included in
@@ -1318,7 +1334,8 @@ top:
 					break;		/* OK */
 				default:
 					fprintf(stderr, "client %d aborted in state %d: %s",
-							st->id, st->state, PQerrorMessage(st->con));
+							st->id, get_command_number(commands, st->curr),
+							PQerrorMessage(st->con));
 					PQclear(res);
 					return clientDone(st, false);
 			}
@@ -1326,7 +1343,7 @@ top:
 			discard_response(st);
 		}
 
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (is_connect)
 			{
@@ -1340,12 +1357,12 @@ top:
 		}
 
 		/* increment state counter */
-		st->state++;
-		if (commands[st->state] == NULL)
+		st->curr = st->curr->next;
+		if (st->curr == NULL)
 		{
-			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->curr = commands;
 			st->is_throttled = false;
 
 			/*
@@ -1388,7 +1405,8 @@ top:
 	}
 
 	/* Record transaction start time under logging, progress or throttling */
-	if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0)
+	if ((logfile || progress || throttle_delay || latency_limit) &&
+		st->curr == commands)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1404,9 +1422,9 @@ top:
 	if (is_latencies)
 		INSTR_TIME_SET_CURRENT(st->stmt_begin);
 
-	if (commands[st->state]->type == SQL_COMMAND)
+	if (st->curr->type == SQL_COMMAND)
 	{
-		const Command *command = commands[st->state];
+		const Command *command = st->curr;
 		int			r;
 
 		if (querymode == QUERY_SIMPLE)
@@ -1440,18 +1458,19 @@ top:
 
 			if (!st->prepared[st->use_file])
 			{
-				int			j;
+				int			j = 0;
+				Command		*pcom = commands;
 
-				for (j = 0; commands[j] != NULL; j++)
+				for (; pcom ; pcom = pcom->next, j++)
 				{
 					PGresult   *res;
 					char		name[MAX_PREPARE_NAME];
 
-					if (commands[j]->type != SQL_COMMAND)
+					if (pcom->type != SQL_COMMAND)
 						continue;
 					preparedStatementName(name, st->use_file, j);
 					res = PQprepare(st->con, name,
-						  commands[j]->argv[0], commands[j]->argc - 1, NULL);
+						  pcom->argv[0], pcom->argc - 1, NULL);
 					if (PQresultStatus(res) != PGRES_COMMAND_OK)
 						fprintf(stderr, "%s", PQerrorMessage(st->con));
 					PQclear(res);
@@ -1460,7 +1479,8 @@ top:
 			}
 
 			getQueryParams(st, command, params);
-			preparedStatementName(name, st->use_file, st->state);
+			preparedStatementName(name, st->use_file,
+								  get_command_number(commands, st->curr));
 
 			if (debug)
 				fprintf(stderr, "client %d sending %s\n", st->id, name);
@@ -1480,11 +1500,11 @@ top:
 		else
 			st->listen = 1;		/* flags that should be listened */
 	}
-	else if (commands[st->state]->type == META_COMMAND)
+	else if (st->curr->type == META_COMMAND)
 	{
-		int			argc = commands[st->state]->argc,
+		int			argc = st->curr->argc,
 					i;
-		char	  **argv = commands[st->state]->argv;
+		char	  **argv = st->curr->argv;
 
 		if (debug)
 		{
@@ -1626,7 +1646,7 @@ top:
 		else if (pg_strcasecmp(argv[0], "set") == 0)
 		{
 			char		res[64];
-			PgBenchExpr *expr = commands[st->state]->expr;
+			PgBenchExpr *expr = st->curr->expr;
 			int64		result;
 
 			if (!evaluateExpr(st, expr, &result))
@@ -2629,14 +2649,11 @@ read_line_from_file(FILE *fd)
 static int
 process_file(char *filename)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
 	FILE	   *fd;
-	int			lineno,
-				index;
+	int			lineno;
 	char	   *buf;
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
 	if (num_files >= MAX_FILES)
@@ -2645,23 +2662,18 @@ process_file(char *filename)
 		exit(1);
 	}
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	if (strcmp(filename, "-") == 0)
 		fd = stdin;
 	else if ((fd = fopen(filename, "r")) == NULL)
 	{
 		fprintf(stderr, "could not open file \"%s\": %s\n",
 				filename, strerror(errno));
-		pg_free(my_commands);
 		return false;
 	}
 
 	proc_state->mode = PS_IDLE;
 
 	lineno = 0;
-	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
@@ -2677,52 +2689,42 @@ process_file(char *filename)
 			/*
 			 * command is NULL when psql_scan returns PSCAN_EOL or
 			 * PSCAN_INCOMPLETE. Immediately ask for the next line for the
-			 * cases.
+			 * case.
 			 */
 			continue;
 		}
 
-		while (command)
-		{
-			my_commands[index++] = command;
-			command = command->next;
-		}
+		/* Append new commands at the end of the list */
+		if (my_commands_tail)
+			my_commands_tail->next = command;
+		else
+			my_commands = my_commands_tail = command;
 
-		if (index > alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands,
-									 sizeof(Command *) * alloc_num);
-		}
+		/* Seek to the tail of the list */
+		while (my_commands_tail->next)
+			my_commands_tail = my_commands_tail->next;
 	}
 	fclose(fd);
 
 	parse_finish_scan(proc_state);
 
-	my_commands[index] = NULL;
+	my_commands_tail->next = NULL;
 
 	sql_files[num_files++] = my_commands;
 
 	return true;
 }
 
-static Command **
+static Command *
 process_builtin(char *tb, const char *source)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
-	int			lineno,
-				index;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
+	int			lineno;
 	char		buf[BUFSIZ];
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	lineno = 0;
-	index = 0;
 
 	for (;;)
 	{
@@ -2747,19 +2749,17 @@ process_builtin(char *tb, const char *source)
 		if (command == NULL)
 			continue;
 
-		/* builtin doesn't need multistatements */
+		/* For simplicity, builtin scripts do not use multistatements */
 		Assert(command->next == NULL);
-		my_commands[index] = command;
-		index++;
-
-		if (index >= alloc_num)
+		if (my_commands_tail)
 		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands_tail->next = command;
+			my_commands_tail = command;
 		}
+		else
+			my_commands = my_commands_tail = command;
 	}
 
-	my_commands[index] = NULL;
 	parse_finish_scan(proc_state);
 
 	return my_commands;
@@ -2864,16 +2864,16 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 
 		for (i = 0; i < num_files; i++)
 		{
-			Command   **commands;
+			Command   *command;
 
 			if (num_files > 1)
 				printf("statement latencies in milliseconds, file %d:\n", i + 1);
 			else
 				printf("statement latencies in milliseconds:\n");
 
-			for (commands = sql_files[i]; *commands != NULL; commands++)
+			for (command = sql_files[i]; command ;
+				 command=command->next)
 			{
-				Command    *command = *commands;
 				int			cnum = command->command_num;
 				double		total_time;
 				instr_time	total_exec_elapsed;
@@ -3153,7 +3153,7 @@ main(int argc, char **argv)
 				benchmarking_option_set = true;
 				ttype = 3;
 				filename = pg_strdup(optarg);
-				if (process_file(filename) == false || *sql_files[num_files - 1] == NULL)
+				if (process_file(filename) == false || sql_files[num_files - 1] == NULL)
 					exit(1);
 				break;
 			case 'D':
@@ -3735,17 +3735,19 @@ threadRun(void *arg)
 	for (i = 0; i < nstate; i++)
 	{
 		CState	   *st = &state[i];
-		Command   **commands = sql_files[st->use_file];
+		Command    *commands = sql_files[st->use_file];
 		int			prev_ecnt = st->ecnt;
 
 		st->use_file = getrand(thread, 0, num_files - 1);
+		st->curr = sql_files[st->use_file];
+
 		if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
 			remains--;			/* I've aborted */
 
-		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+		if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 		{
 			fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-					i, st->state);
+					i, get_command_number(commands, st->curr));
 			remains--;			/* I've aborted */
 			PQfinish(st->con);
 			st->con = NULL;
@@ -3766,7 +3768,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_files[st->use_file];
 			int			sock;
 
 			if (st->con == NULL)
@@ -3802,7 +3803,7 @@ threadRun(void *arg)
 						min_usec = this_usec;
 				}
 			}
-			else if (commands[st->state]->type == META_COMMAND)
+			else if (st->curr->type == META_COMMAND)
 			{
 				min_usec = 0;	/* the connection is ready to run */
 				break;
@@ -3872,20 +3873,20 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_files[st->use_file];
+			Command    *commands = sql_files[st->use_file];
 			int			prev_ecnt = st->ecnt;
 
 			if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask)
-							|| commands[st->state]->type == META_COMMAND))
+							|| st->curr->type == META_COMMAND))
 			{
 				if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
 					remains--;	/* I've aborted */
 			}
 
-			if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+			if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 			{
 				fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-						i, st->state);
+						i, get_command_number(commands, st->curr));
 				remains--;		/* I've aborted */
 				PQfinish(st->con);
 				st->con = NULL;
-- 
1.8.3.1

#34Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Kyotaro HORIGUCHI (#33)
1 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hello,

Here is a review, sorry for the delay...

This is done as the additional fourth patch, not merged into
previous ones, to show what's changed in the manner of command
storing.
[...]

- SQL multi-statement.

SELECT 1; SELECT 2;

I think this is really "SELECT 1\; SELECT 2;"

I attach the test script I used.

The purpose of this four-part patch is to reuse the psql scanner in pgbench
so that commands are cleanly separated by ";", including managing dollar
quoting, having \ continuations in backslash-commands, having
multi-statement commands...

This review covers v4 of the four-part patch. The patches apply and compile
cleanly.

I think that the features are worthwhile. I would have preferred more limited
changes to get them, but my earlier attempt was rejected, and the scanner
sharing with psql results in reasonably limited changes, so I would go for
it.

*** 0001 patch about psql scanner reworking

The lexer features which can be reused by pgbench are isolated
and adapted thanks to ifdefs, guards, and putting some stuff in the
current state.

I'm not sure of the "OUTSIDE_PSQL" macro name. ISTM that "PGBENCH_SCANNER"
would be better, as it makes it very clear that it is being used for pgbench,
and if someone wants to use it for something else they should define and handle
their own case explicitly.

Use "void *" instead of "int *" for VariableSpace?

Rule ":{variable_char}+": the ECHO works more or less as a side effect,
and most of the code is really ignored by pgbench. Instead of the different
changes which rely on NULL values, what about a simple ifdef/else/endif
to replace the whole stuff by ECHO for pgbench, without changing the current
code?

Also, on the same part of the code, I'm not sure about having two assignments
on the "const char * value" variable, because of "const".

The "db" parameter is not used by psql_scan_setup, so the state's db field
is never initialized. As that field seems to be used in several places,
"psql" probably does not work properly.

I'm not sure what would happen if there are reconnections from psql (is
that possible without resetting the scanner state?), as there are two
connections, one in pset and one in the scanner state.

Variable lookup guards: why is a database connection necessary for doing
":variables" lookups? It seemed not to be required in the initial version,
and is not really needed.

Avoid changing style without clear motivation, for instance the
PQerrorMessage/psql_error on escape_value==NULL?

*** 0002 patch to use the scanner in pgbench

There is neither documentation nor examples for the new features.
I think that the internal builtin commands and the documentation should
be used to show the new features where appropriate, and should stress that
";" is now required at the end of SQL commands.

If the file starts with a -- comment followed by a backslash-command, eg:
-- there is only one foo currently available
\set foo 1
an error is reported: the comment should probably just be ignored.
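
To make the expected behavior concrete, here is a minimal sketch of the kind
of check I have in mind. It is not taken from the patch: the helper name
is_blank_or_comment is hypothetical, and the real fix may well belong inside
the scanner rather than in front of it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if the line is empty or holds only a
 * "--" comment, so that the caller can skip it before looking for a
 * backslash command on the following line.
 */
static bool
is_blank_or_comment(const char *line)
{
	assert(line != NULL);

	/* skip leading whitespace, including the trailing newline if any */
	while (*line != '\0' && isspace((unsigned char) *line))
		line++;

	return *line == '\0' || (line[0] == '-' && line[1] == '-');
}
```

With such a check, the comment line in the example above would simply be
dropped and "\set foo 1" would be parsed as the first command of the file.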

I'm not sure that the special "next" command parsing management is useful.
I do not see significant added value in specially managing a list of
commands which happen to be on the same line but separated by a simple
";". Could the code not be simplified by just restarting the scanner
where it left off, instead of looping in "process_commands"?

It seems that part of the changes are just reindentations, especially
all the parsing code for backslash commands. This should be avoided if
possible.

Some spurious spaces after "next_command:".

*** 0003 patch for ms build

I don't do windows:-)

The perl script changes look pretty reasonable, although the copied
comments refer to pg_xlogdump; I guess they should rather refer to pgbench.

*** 0004 command list patch

This patch changes the command array to use a linked list.

As the command number is needed in several places and has to be replaced by a
function call which scans the list, I do not think it is a good idea, and
I recommend not considering this part for inclusion.
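
To illustrate the cost, here is a simplified sketch. The struct below is a
cut-down stand-in for pgbench's Command, and the function names are made up
for the comparison; only the list walk mirrors get_command_number from the
0004 patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for pgbench's Command struct (illustration only). */
typedef struct Command
{
	int			command_num;	/* unique index of this Command struct */
	struct Command *next;		/* next command in the script, or NULL */
} Command;

/* Array storage: the client keeps an int state, so the ordinal is free. */
static int
ordinal_from_array(int state)
{
	return state;
}

/*
 * Linked-list storage: the ordinal must be recomputed by walking from the
 * head of the list, which costs O(n) on every call.
 */
static int
ordinal_from_list(const Command *head, const Command *curr)
{
	int			i = 0;
	const Command *p = head;

	while (p != NULL && p != curr)
	{
		p = p->next;
		i++;
	}
	assert(p != NULL);			/* curr must be in the list */
	return i;
}
```

With the array, preparedStatementName and the error messages get the number
directly from the client state; with the list, each of these call sites pays
for a scan.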

--
Fabien.

Attachments:

test.sql (application/x-sql; name=test.sql)
#35Michael Paquier
michael.paquier@gmail.com
In reply to: Fabien COELHO (#34)
Re: pgbench - allow backslash-continuations in custom scripts

On Wed, Oct 14, 2015 at 5:49 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello,

Here is a review, sorry for the delay...

This is done as the additional fourth patch, not merged into
previous ones, to show what's changed in the manner of command
storing.
[...]

- SQL multi-statement.

SELECT 1; SELECT 2;

I think this is really "SELECT 1\; SELECT 2;"

I attach the test script I used.

The purpose of this four-part patch is to reuse the psql scanner in pgbench
so that commands are cleanly separated by ";", including managing dollar
quoting, having \ continuations in backslash-commands, having
multi-statement commands...

This review covers v4 of the four-part patch. The patches apply and compile
cleanly.

I think that the features are worthwhile. I would have preferred more limited
changes to get them, but my earlier attempt was rejected, and the scanner
sharing with psql results in reasonably limited changes, so I would go for
it.

Regarding that:
+#if !defined OUTSIDE_PSQL
+#include "variables.h"
+#else
+typedef int * VariableSpace;
+#endif
And that:
+/* Provide dummy macros when no use of psql variables */
+#if defined OUTSIDE_PSQL
+#define GetVariable(space,name) NULL
+#define standard_strings() true
+#define psql_error(fmt,...) do { \
+    fprintf(stderr, "psql_error is called. abort.\n");\
+    exit(1);\
+} while(0)
+#endif
That's ugly... Wouldn't it be better with something say in src/common
which is frontend-only? We could start with a set of routines allowing
commands to be parsed. That gives us more room for future improvement.
+    # fix up pg_xlogdump once it's been set up
+    # files symlinked on Unix are copied on windows
+    my $pgbench = AddSimpleFrontend('pgbench');
+    $pgbench->AddDefine('FRONTEND');
+    $pgbench->AddDefine('OUTSIDE_PSQL');
+    $pgbench->AddFile('src/bin/psql/psqlscan.l');
+    $pgbench->AddIncludeDir('src/bin/psql');
This is a simple copy-paste, with an incorrect comment at least
(haven't tested compilation with MSVC, I suspect that this is going to
fail still the flags are correctly set).

This patch has been waiting for input from its author for quite some time
now, and the structure of this patch needs a rework. Are folks on this
thread fine if it is returned with feedback?
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#36Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Michael Paquier (#35)
Re: pgbench - allow backslash-continuations in custom scripts

I'm terribly sorry to have overlooked Fabien's comment.

I'll rework this in the next CF, taking into account Fabien's earlier
comments and Michael's comments here.

At Tue, 22 Dec 2015 16:17:07 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqRX_-VymeEH-3ChoPrQLgKh=EGgQ2GUtZ53ccO9uLGmxA@mail.gmail.com>

On Wed, Oct 14, 2015 at 5:49 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello,

Here is a review, sorry for the delay...

This is done as the additional fourth patch, not merged into
previous ones, to show what's changed in the manner of command
storing.
[...]

- SQL multi-statement.

SELECT 1; SELECT 2;

I think this is really "SELECT 1\; SELECT 2;"

Mmm. That differs from my understanding. I'll check.

I attach the test script I used.

Thank you. I'll work with it.

The purpose of this four-part patch is to reuse the psql scanner in pgbench
so that commands are cleanly separated by ";", including managing dollar
quoting, having \ continuations in backslash-commands, having
multi-statement commands...

This review covers v4 of the four-part patch. The patches apply and compile
cleanly.

I think that the features are worthwhile. I would have preferred more limited
changes to get them, but my earlier attempt was rejected, and the scanner
sharing with psql results in reasonably limited changes, so I would go for
it.

Regarding that:
+#if !defined OUTSIDE_PSQL
+#include "variables.h"
+#else
+typedef int * VariableSpace;
+#endif
And that:
+/* Provide dummy macros when no use of psql variables */
+#if defined OUTSIDE_PSQL
+#define GetVariable(space,name) NULL
+#define standard_strings() true
+#define psql_error(fmt,...) do { \
+    fprintf(stderr, "psql_error is called. abort.\n");\
+    exit(1);\
+} while(0)
+#endif
That's ugly...

I have to admit that I think just the same.

Wouldn't it be better with something say in src/common
which is frontend-only? We could start with a set of routines allowing
commands to be parsed. That gives us more room for future improvement.

If I read you correctly, I should cut it out into a new file and
include it. Is it correct?

+    # fix up pg_xlogdump once it's been set up
+    # files symlinked on Unix are copied on windows
+    my $pgbench = AddSimpleFrontend('pgbench');
+    $pgbench->AddDefine('FRONTEND');
+    $pgbench->AddDefine('OUTSIDE_PSQL');
+    $pgbench->AddFile('src/bin/psql/psqlscan.l');
+    $pgbench->AddIncludeDir('src/bin/psql');
This is a simple copy-paste, with an incorrect comment at least
(haven't tested compilation with MSVC, I suspect that this is going to
fail still the flags are correctly set).

Oops. Thank you for pointing that out. It worked for me but, honestly,
I couldn't find a cleaner way to do that. I'll reconsider it.

This patch has been waiting for input from its author for quite some time
now, and the structure of this patch needs a rework. Are folks on this
thread fine if it is returned with feedback?

That's fine with me.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#37Michael Paquier
michael.paquier@gmail.com
In reply to: Kyotaro HORIGUCHI (#36)
Re: pgbench - allow backslash-continuations in custom scripts

On Tue, Dec 22, 2015 at 5:34 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

I wrote:

Wouldn't it be better with something say in src/common
which is frontend-only? We could start with a set of routines allowing
commands to be parsed. That gives us more room for future improvement.

If I read you correctly, I should cut it out into a new file and
include it. Is it correct?

Not really, I meant to see if it would be possible to include this set
of routines directly in libpgcommon (as part of OBJS_FRONTEND). This
way any client application could easily reuse it. If we found that
what was in psql is now useful to pgbench, I have little doubt that
some other folks would make good use of that. I honestly have not
looked at the code to see if that's doable or not, but soft-linking
a psql file directly into pgbench will certainly not help future
maintenance. It increases the complexity of the code.
--
Michael


#38Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Michael Paquier (#37)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Michaël,

If I read you correctly, I should cut it out into a new file and
include it. Is it correct?

Not really, I meant to see if it would be possible to include this set
of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
way any client applications could easily reuse it. If we found that
what was in psql is now useful to pgbench, I have little doubt that
some other folks would make good use of that. I honestly have not
looked at the code to see if that's doable or not, but soft-linking
directly in pgbench a file of psql will not help future maintenance
for sure. This increases the complexity of the code.

Just my 0.02€:

I understand that you suggest some kind of dynamic parametrization... this
is harder to do and potentially as fragile as the link/ifdef option, with
an undetermined set of callbacks to provide... the generic version would
be harder to debug, and this approach would prevent changing lexer
options. Basically I'm not sure that doing all that for improving the
handling of pgbench scripts is worth the effort. I would go with the
simpler option.

--
Fabien.

#39Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#38)
Re: pgbench - allow backslash-continuations in custom scripts

Hello,

At Thu, 24 Dec 2015 07:40:19 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1512240729160.17411@sto>

Hello Michaël,

If I read you correctly, I should cut it out into a new file and
include it. Is it correct?

Not really, I meant to see if it would be possible to include this set
of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
way any client applications could easily reuse it. If we found that
what was in psql is now useful to pgbench, I have little doubt that
some other folks would make good use of that. I honestly have not
looked at the code to see if that's doable or not, but soft-linking
directly in pgbench a file of psql will not help future maintenance
for sure. This increases the complexity of the code.

Thanks. I understand it and I agree with the last sentence. I don't
want them to be exposed as generic features.

Instead, I'd like to separate backslash commands from psqlscan.l
and use callbacks to access variables by ":name" syntax (it is a
common syntax between pgbench and psql). The current psqlscan.l
simply returns to the caller on noticing the leading backslash of
a backslash command, so it can be separated without heavy
surgery. This would do away with the ugliness of overriding
VariableSpace.

standard_strings() checks standard_conforming_strings on the
session. It is necessarily called during parsing, so it can be moved
out into PsqlScanState.

psql_error() redirects messages according to queryFout and adds
input file information. Both depend heavily on psql, so
I'd like to make it a callback in PsqlScanState and do fprintf
as the default behavior (=NULL).

These measures will get rid of the ugliness. What do you think
about this?
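
To make the error-reporting idea above concrete, here is a minimal sketch of a callback that falls back to fprintf-style output when the caller registers nothing (NULL). The names scan_error_fn, default_error_out, capture_error_out and resolve_error_out are assumptions made up for illustration; they are not taken from psql or from any patch in this thread.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical callback type; the name is illustrative only. */
typedef void (*scan_error_fn) (const char *fmt, ...);

/* Buffer filled by the sample callback below. */
static char captured_message[128];

/* Default used when no callback is registered: print to stderr. */
static void
default_error_out(const char *fmt, ...)
{
	va_list		ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Sample caller-supplied callback: keep the message instead of printing. */
static void
capture_error_out(const char *fmt, ...)
{
	va_list		ap;

	va_start(ap, fmt);
	vsnprintf(captured_message, sizeof(captured_message), fmt, ap);
	va_end(ap);
}

/* Map a NULL callback to the default reporter. */
static scan_error_fn
resolve_error_out(scan_error_fn callback)
{
	return callback != NULL ? callback : default_error_out;
}
```

A lexer built this way would report through resolve_error_out(cb)("...", args), so a client that sets the callback to NULL gets plain stderr output, while psql could presumably plug in something that wraps psql_error().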

Just my 0.02€:

I understand that you suggest some kind of dynamic
parametrization... this is harder to do and potentially as fragile as
the link/ifdef option, with an undetermined set of callbacks to
provide... the generic version would be harder to debug, and this
approach would prevent changing lexer options. Basically I'm not sure
that doing all that for improving the handling of pgbench scripts is
worth the effort. I would go with the simpler option.

An undetermined set of callbacks would do so, but it seems to me
that there are only a few known functions to deal with, as shown
above. The shared lexer deals only with SQL and a backslash at the
start of a command.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#40Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#39)
5 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hello,

At Fri, 25 Dec 2015 14:18:24 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20151225.141824.75912397.horiguchi.kyotaro@lab.ntt.co.jp>

Not really, I meant to see if it would be possible to include this set
of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
way any client applications could easily reuse it. If we found that
what was in psql is now useful to pgbench, I have little doubt that
some other folks would make good use of that. I honestly have not
looked at the code to see if that's doable or not, but soft-linking
directly in pgbench a file of psql will not help future maintenance
for sure. This increases the complexity of the code.

Thanks. I understand it and I agree with the last sentence. I don't
want them to be exposed as generic features.

Instead, I'd like to separate backslash commands from psqlscan.l
and use callbacks to access variables by ":name" syntax (it is a
common syntax between pgbench and psql). The current psqlscan.l
simply returns to the caller on noticing the leading backslash of
a backslash command, so it can be separated without heavy
surgery. This would do away with the ugliness of overriding
VariableSpace.

Done. The lexer for backslash commands has been moved out of the SQL
lexer into a standalone lexer. As a result, the SQL lexer that remains
is no longer aware of VariableSpace, and pgbench uses it simply by
linking the object file in the psql directory. In addition,
escape_variable() required PGconn, so the core of the escape feature
has been moved out to the callback side.

standard_strings() checks standard_conforming_strings on the
session. It is necessarily called during parsing, so it can be moved
out into PsqlScanState.

Such a change would not be compatible with the current behavior, so I
made standard_strings() a callback, too.

psql_error() redirects messages according to queryFout and adds
input file information. Both depend heavily on psql, so
I'd like to make it a callback in PsqlScanState and do fprintf
as the default behavior (=NULL).

Done.

These measures will get rid of the ugliness. What do you think
about this?

Just my 0.02€:

I understand that you suggest some kind of dynamic
parametrization... this is harder to do and potentially as fragile as
the link/ifdef option, with an undetermined set of callbacks to
provide... the generic version would be harder to debug, and this
approach would prevent changing lexer options. Basically I'm not sure
that doing all that for improving the handling of pgbench scripts is
worth the effort. I would go with the simpler option.

An undetermined set of callbacks would do so, but it seems to me
that there are only a few known functions to deal with, as shown
above. The shared lexer deals only with SQL and a backslash at the
start of a command.

Finally, PsqlScanState has four callback functions, and all pgbench
needs to do to use it is set all of them to NULL and link the
object file in the psql directory. No link switches or ifdefs are necessary.

| const char *get_variable(const char *, bool escape, bool as_ident, void (**free_fn)(void *));
| int enc_mblen(const char *);
| bool standard_strings(void);
| void error_out(const char *fmt, ...);
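
As an illustration of how a variable-lookup callback of this kind could be consumed, the sketch below replaces ":name" references through a caller-supplied function and leaves the text untouched when the callback is NULL. It is a simplified stand-in, not the patch's code: the callback type get_variable_fn takes only a name, and expand_variables and demo_lookup are hypothetical helpers. It also ignores quoting and comments, which the real flex lexer handles, and it truncates names longer than 63 bytes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified callback: value of a variable, or NULL if unset. */
typedef const char *(*get_variable_fn) (const char *name);

static int
is_var_char(unsigned char c)
{
	return isalnum(c) || c == '_';
}

/* Sample callback for demonstration: only "scale" is defined. */
static const char *
demo_lookup(const char *name)
{
	return strcmp(name, "scale") == 0 ? "42" : NULL;
}

/*
 * Copy src into dst, replacing :name with cb(name).  A NULL callback
 * disables substitution.  Returns 0 on success, -1 if dst is too small.
 */
static int
expand_variables(const char *src, get_variable_fn cb, char *dst, size_t dstsz)
{
	size_t		used = 0;

	if (dstsz == 0)
		return -1;

	while (*src)
	{
		size_t		chunklen = 1;

		if (src[0] == ':' && src[1] == ':')
			chunklen = 2;		/* "::" is a typecast, never a variable */
		else if (cb != NULL && src[0] == ':' &&
				 is_var_char((unsigned char) src[1]))
		{
			char		name[64];
			size_t		n = 0;
			const char *value;

			while (is_var_char((unsigned char) src[1 + n]) &&
				   n < sizeof(name) - 1)
			{
				name[n] = src[1 + n];
				n++;
			}
			name[n] = '\0';

			value = cb(name);
			if (value != NULL)
			{
				size_t		vlen = strlen(value);

				if (used + vlen >= dstsz)
					return -1;
				memcpy(dst + used, value, vlen);
				used += vlen;
				src += 1 + n;
				continue;
			}
			/* unset variable: fall through and copy the ':' literally */
		}

		if (used + chunklen >= dstsz)
			return -1;
		memcpy(dst + used, src, chunklen);
		used += chunklen;
		src += chunklen;
	}
	dst[used] = '\0';
	return 0;
}
```

With this shape, a client that passes NULL gets its input back unchanged, which matches the idea that pgbench can use the shared lexer without supplying any variable handling of its own.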

The attached patches are the following.

- 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch

This diff looks a bit large, but most of it is cut-and-paste
work and the substantive change is rather small.

This refactors psqlscan.l into two .l files. The additional
psqlscan_slash.l is a bit tricky in that scan_state is recreated
on each transition between it and psqlscan.l.

The .c files generated from the .l files are no longer compiled as part of
mainloop.c. They have dedicated envelope .c files so that the
SQL lexer can easily be used outside psql.

- 0002-Change-the-access-method-to-shell-variables.patch

Detaches VariableSpace from psqlscan.

- 0003-Detach-common.c-from-psqlscan.patch

Detaches PGconn from psqlscan by moving the session
encoding stuff out into two added callbacks.

- 0004-pgbench-uses-common-frontend-SQL-parser.patch

This looks larger than what it actually does because an if
branch with a large block in process_commands() was removed.
'git diff -w' shows the far smaller substantive changes.

- 0005-Change-the-way-to-hold-command-list.patch

In 0004, a multi-statement SQL string is passed to the caller as a
linked list, and the caller then copies the elements into an
array of the same data type. This patch changes the way
commands are held entirely to a linked list.
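
For readers unfamiliar with the trade-off in 0005, the following is a minimal sketch of holding commands in a singly linked list with a tail pointer, so that a parser can append statements as it finds them without resizing an array. The Command and CommandList types and the helper names here are assumptions for illustration and are not the structures used in pgbench.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical command node; pgbench's real Command struct has more fields. */
typedef struct Command
{
	char	   *text;			/* private copy of the command text */
	struct Command *next;		/* next command in the script, or NULL */
} Command;

typedef struct CommandList
{
	Command    *head;
	Command    *tail;
	int			count;
} CommandList;

/* Append a copy of text to the list.  Returns 0 on success, -1 on OOM. */
static int
command_list_append(CommandList *list, const char *text)
{
	size_t		len = strlen(text) + 1;
	Command    *cmd = malloc(sizeof(Command));

	if (cmd == NULL)
		return -1;
	cmd->text = malloc(len);
	if (cmd->text == NULL)
	{
		free(cmd);
		return -1;
	}
	memcpy(cmd->text, text, len);
	cmd->next = NULL;

	if (list->tail != NULL)
		list->tail->next = cmd;
	else
		list->head = cmd;
	list->tail = cmd;
	list->count++;
	return 0;
}

/* Release every node and reset the list to empty. */
static void
command_list_free(CommandList *list)
{
	Command    *cmd = list->head;

	while (cmd != NULL)
	{
		Command    *next = cmd->next;

		free(cmd->text);
		free(cmd);
		cmd = next;
	}
	list->head = list->tail = NULL;
	list->count = 0;
}
```

Keeping a tail pointer makes each append constant-time, and a multi-statement result that is itself a list can be spliced in by pointing the tail's next at its head rather than copying element by element.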

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Prepare-for-sharing-psqlscan-with-pgbench.patch (text/x-patch; charset=us-ascii)
From e41c9318c75ae4a70a1e932cb0cbbab19727e935 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/5] Prepare for sharing psqlscan with pgbench.

Lexer is no longer compiled as a part of mainloop.c.  The slash
command lexer is brought out from the command line lexer.  psql_scan
no longer accesses directly to pset struct and VariableSpace. This
change allows psqlscan to be used without these things.
---
 src/bin/psql/Makefile             |   15 +-
 src/bin/psql/command.c            |    1 +
 src/bin/psql/mainloop.c           |   16 +-
 src/bin/psql/psqlscan.c           |   18 +
 src/bin/psql/psqlscan.h           |   32 +-
 src/bin/psql/psqlscan.l           | 1988 -------------------------------------
 src/bin/psql/psqlscan_int.h       |   87 ++
 src/bin/psql/psqlscan_slash.c     |   19 +
 src/bin/psql/psqlscan_slash.h     |   31 +
 src/bin/psql/psqlscan_slashbody.l |  757 ++++++++++++++
 src/bin/psql/psqlscanbody.l       | 1431 ++++++++++++++++++++++++++
 src/bin/psql/startup.c            |    4 +-
 12 files changed, 2363 insertions(+), 2036 deletions(-)
 create mode 100644 src/bin/psql/psqlscan.c
 delete mode 100644 src/bin/psql/psqlscan.l
 create mode 100644 src/bin/psql/psqlscan_int.h
 create mode 100644 src/bin/psql/psqlscan_slash.c
 create mode 100644 src/bin/psql/psqlscan_slash.h
 create mode 100644 src/bin/psql/psqlscan_slashbody.l
 create mode 100644 src/bin/psql/psqlscanbody.l

diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 66e14fb..0435fee 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o psqlscan.o psqlscan_slash.o\
 	$(WIN32RES)
 
 
@@ -44,13 +44,14 @@ sql_help.c: sql_help.h ;
 sql_help.h: create_help.pl $(wildcard $(REFDOCDIR)/*.sgml)
 	$(PERL) $< $(REFDOCDIR) $*
 
-# psqlscan is compiled as part of mainloop
-mainloop.o: psqlscan.c
+psqlscan.o: psqlscan.c psqlscanbody.c common.h psqlscan_int.h
+psqlscan_slash.o: psqlscan_slash.c psqlscan_slashbody.c common.h psqlscan_int.h
 
-psqlscan.c: FLEXFLAGS = -Cfe -p -p
-psqlscan.c: FLEX_NO_BACKUP=yes
+psqlscanbody.c: FLEXFLAGS = -Cfe -p -p
+psqlscan_slashbody.c: FLEXFLAGS = -Cfe -p -p -P yys
+psqlscanbody.c psqlscan_slashbody.c: FLEX_NO_BACKUP=yes
 
-distprep: sql_help.h psqlscan.c
+distprep: sql_help.h psqlscanbody.c psqlscan_slashbody.c
 
 install: all installdirs
 	$(INSTALL_PROGRAM) psql$(X) '$(DESTDIR)$(bindir)/psql$(X)'
@@ -67,4 +68,4 @@ clean distclean:
 	rm -f psql$(X) $(OBJS) dumputils.c keywords.c kwlookup.c lex.backup
 
 maintainer-clean: distclean
-	rm -f sql_help.h sql_help.c psqlscan.c
+	rm -f sql_help.h sql_help.c psqlscanbody.c psqlscan_slashbody.c
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9750a5b..e42fca7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include "mainloop.h"
 #include "print.h"
 #include "psqlscan.h"
+#include "psqlscan_slash.h"
 #include "settings.h"
 #include "variables.h"
 
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index dadbd29..947caff 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -233,7 +233,8 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		psql_scan_setup(scan_state, line, strlen(line),
+						pset.db, pset.vars, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +374,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									pset.db, pset.vars, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
@@ -450,13 +452,3 @@ MainLoop(FILE *source)
 
 	return successResult;
 }	/* MainLoop() */
-
-
-/*
- * psqlscan.c is #include'd here instead of being compiled on its own.
- * This is because we need postgres_fe.h to be read before any system
- * include files, else things tend to break on platforms that have
- * multiple infrastructures for stdio.h and so on.  flex is absolutely
- * uncooperative about that, so we can't compile psqlscan.c on its own.
- */
-#include "psqlscan.c"
diff --git a/src/bin/psql/psqlscan.c b/src/bin/psql/psqlscan.c
new file mode 100644
index 0000000..7f09fa3
--- /dev/null
+++ b/src/bin/psql/psqlscan.c
@@ -0,0 +1,18 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan.c
+ *
+ */
+
+/*
+ * psqlscanbody.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system
+ * include files, else things tend to break on platforms that have
+ * multiple infrastructures for stdio.h and so on.  flex is absolutely
+ * uncooperative about that, so we can't compile psqlscan.c on its own.
+ */
+#include "common.h"
+#include "psqlscanbody.c"
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 674ba69..7615df2 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,7 +11,7 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-
+#include "variables.h"
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
@@ -25,40 +25,18 @@ typedef enum
 	PSCAN_EOL					/* end of line, SQL possibly complete */
 } PsqlScanResult;
 
-/* Different ways for scan_slash_option to handle parameter words */
-enum slash_option_type
-{
-	OT_NORMAL,					/* normal case */
-	OT_SQLID,					/* treat as SQL identifier */
-	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
-	OT_FILEPIPE,				/* it's a filename or pipe */
-	OT_WHOLE_LINE,				/* just snarf the rest of the line */
-	OT_NO_EVAL					/* no expansion of backticks or variables */
-};
-
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PGconn *db, VariableSpace vars, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt);
+								PQExpBuffer query_buf,
+								promptStatus_t *prompt);
 
 extern void psql_scan_reset(PsqlScanState state);
-
 extern bool psql_scan_in_quote(PsqlScanState state);
 
-extern char *psql_scan_slash_command(PsqlScanState state);
-
-extern char *psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon);
-
-extern void psql_scan_slash_command_end(PsqlScanState state);
-
 #endif   /* PSQLSCAN_H */
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
deleted file mode 100644
index bbe0172..0000000
--- a/src/bin/psql/psqlscan.l
+++ /dev/null
@@ -1,1988 +0,0 @@
-%{
-/*-------------------------------------------------------------------------
- *
- * psqlscan.l
- *	  lexical scanner for psql
- *
- * This code is mainly needed to determine where the end of a SQL statement
- * is: we are looking for semicolons that are not within quotes, comments,
- * or parentheses.  The most reliable way to handle this is to borrow the
- * backend's flex lexer rules, lock, stock, and barrel.  The rules below
- * are (except for a few) the same as the backend's, but their actions are
- * just ECHO whereas the backend's actions generally do other things.
- *
- * XXX The rules in this file must be kept in sync with the backend lexer!!!
- *
- * XXX Avoid creating backtracking cases --- see the backend lexer for info.
- *
- * The most difficult aspect of this code is that we need to work in multibyte
- * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
- * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
- * all our lexing rules treat all high-bit-set characters alike, we don't
- * really need to care whether such a byte is part of a sequence or not.
- * In an "unsafe" encoding, we still expect the first byte of a multibyte
- * sequence to be >= 0x80, but later bytes might not be.  If we scan such
- * a sequence as-is, the lexing rules could easily be fooled into matching
- * such bytes to ordinary ASCII characters.  Our solution for this is to
- * substitute 0xFF for each non-first byte within the data presented to flex.
- * The flex rules will then pass the FF's through unmolested.  The emit()
- * subroutine is responsible for looking back to the original string and
- * replacing FF's with the corresponding original bytes.
- *
- * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- *	  src/bin/psql/psqlscan.l
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres_fe.h"
-
-#include "psqlscan.h"
-
-#include <ctype.h>
-
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
-	YY_BUFFER_STATE buf;		/* flex input control structure */
-	char	   *bufstring;		/* data actually being scanned by flex */
-	char	   *origstring;		/* copy of original data, if needed */
-	char	   *varname;		/* name of variable providing data, or NULL */
-	struct StackElem *next;
-} StackElem;
-
-/*
- * All working state of the lexer must be stored in PsqlScanStateData
- * between calls.  This allows us to have multiple open lexer operations,
- * which is needed for nested include files.  The lexer itself is not
- * recursive, but it must be re-entrant.
- */
-typedef struct PsqlScanStateData
-{
-	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
-	/*
-	 * These variables always refer to the outer buffer, never to any
-	 * stacked variable-expansion buffer.
-	 */
-	YY_BUFFER_STATE scanbufhandle;
-	char	   *scanbuf;		/* start of outer-level input buffer */
-	const char *scanline;		/* current input line at outer level */
-
-	/* safe_encoding, curline, refline are used by emit() to replace FFs */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
-	const char *curline;		/* actual flex input string for cur buf */
-	const char *refline;		/* original data for cur buffer */
-
-	/*
-	 * All this state lives across successive input lines, until explicitly
-	 * reset by psql_scan_reset.
-	 */
-	int			start_state;	/* saved YY_START */
-	int			paren_depth;	/* depth of nesting in parentheses */
-	int			xcdepth;		/* depth of nesting in slash-star comments */
-	char	   *dolqstart;		/* current $foo$ quote start string */
-} PsqlScanStateData;
-
-static PsqlScanState cur_state;	/* current state while active */
-
-static PQExpBuffer output_buf;	/* current output buffer */
-
-/* these variables do not need to be saved across calls */
-static enum slash_option_type option_type;
-static char *option_quote;
-static int	unquoted_option_chars;
-static int	backtick_start_offset;
-
-
-/* Return values from yylex() */
-#define LEXRES_EOL			0	/* end of input */
-#define LEXRES_SEMI			1	/* command-terminating semicolon found */
-#define LEXRES_BACKSLASH	2	/* backslash command start */
-#define LEXRES_OK			3	/* OK completion of backslash argument */
-
-
-static void evaluate_backtick(void);
-static void push_new_buffer(const char *newstr, const char *varname);
-static void pop_buffer_stack(PsqlScanState state);
-static bool var_is_current_source(PsqlScanState state, const char *varname);
-static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
-									  char **txtcopy);
-static void emit(const char *txt, int len);
-static char *extract_substring(const char *txt, int len);
-static void escape_variable(bool as_ident);
-
-#define ECHO emit(yytext, yyleng)
-
-%}
-
-%option 8bit
-%option never-interactive
-%option nodefault
-%option noinput
-%option nounput
-%option noyywrap
-%option warn
-
-/*
- * All of the following definitions and rules should exactly match
- * src/backend/parser/scan.l so far as the flex patterns are concerned.
- * The rule bodies are just ECHO as opposed to what the backend does,
- * however.  (But be sure to duplicate code that affects the lexing process,
- * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
- * scan.l has a separate one for each exclusive state.
- */
-
-/*
- * OK, here is a short description of lex/flex rules behavior.
- * The longest pattern which matches an input string is always chosen.
- * For equal-length patterns, the first occurring in the rules list is chosen.
- * INITIAL is the starting state, to which all non-conditional rules apply.
- * Exclusive states change parsing rules while the state is active.  When in
- * an exclusive state, only those rules defined for that state apply.
- *
- * We use exclusive states for quoted strings, extended comments,
- * and to eliminate parsing troubles for numeric strings.
- * Exclusive states:
- *  <xb> bit string literal
- *  <xc> extended C-style comments
- *  <xd> delimited identifiers (double-quoted identifiers)
- *  <xh> hexadecimal numeric string
- *  <xq> standard quoted strings
- *  <xe> extended quoted strings (support backslash escape sequences)
- *  <xdolq> $foo$ quoted strings
- *  <xui> quoted identifier with Unicode escapes
- *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
- *  <xus> quoted string with Unicode escapes
- *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
- *
- * Note: we intentionally don't mimic the backend's <xeu> state; we have
- * no need to distinguish it from <xe> state, and no good way to get out
- * of it in error cases.  The backend just throws yyerror() in those
- * cases, but that's not an option here.
- */
-
-%x xb
-%x xc
-%x xd
-%x xh
-%x xe
-%x xq
-%x xdolq
-%x xui
-%x xuiend
-%x xus
-%x xusend
-/* Additional exclusive states for psql only: lex backslash commands */
-%x xslashcmd
-%x xslashargstart
-%x xslasharg
-%x xslashquote
-%x xslashbackquote
-%x xslashdquote
-%x xslashwholeline
-%x xslashend
-
-/*
- * In order to make the world safe for Windows and Mac clients as well as
- * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
- * sequence will be seen as two successive newlines, but that doesn't cause
- * any problems.  Comments that start with -- and extend to the next
- * newline are treated as equivalent to a single whitespace character.
- *
- * NOTE a fine point: if there is no newline following --, we will absorb
- * everything to the end of the input as a comment.  This is correct.  Older
- * versions of Postgres failed to recognize -- as a comment if the input
- * did not end with a newline.
- *
- * XXX perhaps \f (formfeed) should be treated as a newline as well?
- *
- * XXX if you change the set of whitespace characters, fix scanner_isspace()
- * to agree, and see also the plpgsql lexer.
- */
-
-space			[ \t\n\r\f]
-horiz_space		[ \t\f]
-newline			[\n\r]
-non_newline		[^\n\r]
-
-comment			("--"{non_newline}*)
-
-whitespace		({space}+|{comment})
-
-/*
- * SQL requires at least one newline in the whitespace separating
- * string literals that are to be concatenated.  Silly, but who are we
- * to argue?  Note that {whitespace_with_newline} should not have * after
- * it, whereas {whitespace} should generally have a * after it...
- */
-
-special_whitespace		({space}+|{comment}{newline})
-horiz_whitespace		({horiz_space}|{comment})
-whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
-
-/*
- * To ensure that {quotecontinue} can be scanned without having to back up
- * if the full pattern isn't matched, we include trailing whitespace in
- * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
- * except for {quote} followed by whitespace and just one "-" (not two,
- * which would start a {comment}).  To cover that we have {quotefail}.
- * The actions for {quotestop} and {quotefail} must throw back characters
- * beyond the quote proper.
- */
-quote			'
-quotestop		{quote}{whitespace}*
-quotecontinue	{quote}{whitespace_with_newline}{quote}
-quotefail		{quote}{whitespace}*"-"
-
-/* Bit string
- * It is tempting to scan the string for only those characters
- * which are allowed. However, this leads to silently swallowed
- * characters if illegal characters are included in the string.
- * For example, if xbinside is [01] then B'ABCD' is interpreted
- * as a zero-length string, and the ABCD' is lost!
- * Better to pass the string forward and let the input routines
- * validate the contents.
- */
-xbstart			[bB]{quote}
-xbinside		[^']*
-
-/* Hexadecimal number */
-xhstart			[xX]{quote}
-xhinside		[^']*
-
-/* National character */
-xnstart			[nN]{quote}
-
-/* Quoted string that allows backslash escapes */
-xestart			[eE]{quote}
-xeinside		[^\\']+
-xeescape		[\\][^0-7]
-xeoctesc		[\\][0-7]{1,3}
-xehexesc		[\\]x[0-9A-Fa-f]{1,2}
-xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
-xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
-
-/* Extended quote
- * xqdouble implements embedded quote, ''''
- */
-xqstart			{quote}
-xqdouble		{quote}{quote}
-xqinside		[^']+
-
-/* $foo$ style quotes ("dollar quoting")
- * The quoted string starts with $foo$ where "foo" is an optional string
- * in the form of an identifier, except that it may not contain "$",
- * and extends to the first occurrence of an identical string.
- * There is *no* processing of the quoted text.
- *
- * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
- * fails to match its trailing "$".
- */
-dolq_start		[A-Za-z\200-\377_]
-dolq_cont		[A-Za-z\200-\377_0-9]
-dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
-dolqfailed		\${dolq_start}{dolq_cont}*
-dolqinside		[^$]+
-
-/* Double quote
- * Allows embedded spaces and other special characters into identifiers.
- */
-dquote			\"
-xdstart			{dquote}
-xdstop			{dquote}
-xddouble		{dquote}{dquote}
-xdinside		[^"]+
-
-/* Unicode escapes */
-uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
-/* error rule to avoid backup */
-uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
-
-/* Quoted identifier with Unicode escapes */
-xuistart		[uU]&{dquote}
-
-/* Quoted string with Unicode escapes */
-xusstart		[uU]&{quote}
-
-/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
-xustop1		{uescapefail}?
-xustop2		{uescape}
-
-/* error rule to avoid backup */
-xufailed		[uU]&
-
-
-/* C-style comments
- *
- * The "extended comment" syntax closely resembles allowable operator syntax.
- * The tricky part here is to get lex to recognize a string starting with
- * slash-star as a comment, when interpreting it as an operator would produce
- * a longer match --- remember lex will prefer a longer match!  Also, if we
- * have something like plus-slash-star, lex will think this is a 3-character
- * operator whereas we want to see it as a + operator and a comment start.
- * The solution is two-fold:
- * 1. append {op_chars}* to xcstart so that it matches as much text as
- *    {operator} would. Then the tie-breaker (first matching rule of same
- *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
- *    in case it contains a star-slash that should terminate the comment.
- * 2. In the operator rule, check for slash-star within the operator, and
- *    if found throw it back with yyless().  This handles the plus-slash-star
- *    problem.
- * Dash-dash comments have similar interactions with the operator rule.
- */
-xcstart			\/\*{op_chars}*
-xcstop			\*+\/
-xcinside		[^*/]+
-
-digit			[0-9]
-ident_start		[A-Za-z\200-\377_]
-ident_cont		[A-Za-z\200-\377_0-9\$]
-
-identifier		{ident_start}{ident_cont}*
-
-/* Assorted special-case operators and operator-like tokens */
-typecast		"::"
-dot_dot			\.\.
-colon_equals	":="
-equals_greater	"=>"
-less_equals		"<="
-greater_equals	">="
-less_greater	"<>"
-not_equals		"!="
-
-/*
- * "self" is the set of chars that should be returned as single-character
- * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
- * which can be one or more characters long (but if a single-char token
- * appears in the "self" set, it is not to be returned as an Op).  Note
- * that the sets overlap, but each has some chars that are not in the other.
- *
- * If you change either set, adjust the character lists appearing in the
- * rule for "operator"!
- */
-self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
-op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
-operator		{op_chars}+
-
-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
- *
- * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
- *
- * {realfail1} and {realfail2} are added to prevent the need for scanner
- * backup when the {real} rule fails to match completely.
- */
-
-integer			{digit}+
-decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
-decimalfail		{digit}+\.\.
-real			({integer}|{decimal})[Ee][-+]?{digit}+
-realfail1		({integer}|{decimal})[Ee]
-realfail2		({integer}|{decimal})[Ee][-+]
-
-param			\${integer}
-
-/* psql-specific: characters allowed in variable names */
-variable_char	[A-Za-z\200-\377_0-9]
-
-other			.
-
-/*
- * Dollar quoted strings are totally opaque, and no escaping is done on them.
- * Other quoted strings must allow some special characters such as single-quote
- *  and newline.
- * Embedded single-quotes are implemented both in the SQL standard
- *  style of two adjacent single quotes "''" and in the Postgres/Java style
- *  of escaped-quote "\'".
- * Other embedded escaped characters are matched explicitly and the leading
- *  backslash is dropped from the string.
- * Note that xcstart must appear before operator, as explained above!
- *  Also whitespace (comment) must appear before operator.
- */
-
-%%
-
-{whitespace}	{
-					/*
-					 * Note that the whitespace rule includes both true
-					 * whitespace and single-line ("--" style) comments.
-					 * We suppress whitespace at the start of the query
-					 * buffer.  We also suppress all single-line comments,
-					 * which is pretty dubious but is the historical
-					 * behavior.
-					 */
-					if (!(output_buf->len == 0 || yytext[0] == '-'))
-						ECHO;
-				}
-
-{xcstart}		{
-					cur_state->xcdepth = 0;
-					BEGIN(xc);
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstart}	{
-					cur_state->xcdepth++;
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstop}	{
-					if (cur_state->xcdepth <= 0)
-					{
-						BEGIN(INITIAL);
-					}
-					else
-						cur_state->xcdepth--;
-					ECHO;
-				}
-
-<xc>{xcinside}	{
-					ECHO;
-				}
-
-<xc>{op_chars}	{
-					ECHO;
-				}
-
-<xc>\*+			{
-					ECHO;
-				}
-
-{xbstart}		{
-					BEGIN(xb);
-					ECHO;
-				}
-<xb>{quotestop}	|
-<xb>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xh>{xhinside}	|
-<xb>{xbinside}	{
-					ECHO;
-				}
-<xh>{quotecontinue}	|
-<xb>{quotecontinue}	{
-					ECHO;
-				}
-
-{xhstart}		{
-					/* Hexadecimal bit type.
-					 * At some point we should simply pass the string
-					 * forward to the parser and label it there.
-					 * In the meantime, place a leading "x" on the string
-					 * to mark it for the input routine as a hex string.
-					 */
-					BEGIN(xh);
-					ECHO;
-				}
-<xh>{quotestop}	|
-<xh>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-
-{xnstart}		{
-					yyless(1);				/* eat only 'n' this time */
-					ECHO;
-				}
-
-{xqstart}		{
-					if (standard_strings())
-						BEGIN(xq);
-					else
-						BEGIN(xe);
-					ECHO;
-				}
-{xestart}		{
-					BEGIN(xe);
-					ECHO;
-				}
-{xusstart}		{
-					BEGIN(xus);
-					ECHO;
-				}
-<xq,xe>{quotestop}	|
-<xq,xe>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xus>{quotestop} |
-<xus>{quotefail} {
-					yyless(1);
-					BEGIN(xusend);
-					ECHO;
-				}
-<xusend>{whitespace} {
-					ECHO;
-				}
-<xusend>{other} |
-<xusend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xusend>{xustop2} {
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xq,xe,xus>{xqdouble} {
-					ECHO;
-				}
-<xq,xus>{xqinside}  {
-					ECHO;
-				}
-<xe>{xeinside}  {
-					ECHO;
-				}
-<xe>{xeunicode} {
-					ECHO;
-				}
-<xe>{xeunicodefail}	{
-					ECHO;
-				}
-<xe>{xeescape}  {
-					ECHO;
-				}
-<xe>{xeoctesc}  {
-					ECHO;
-				}
-<xe>{xehexesc}  {
-					ECHO;
-				}
-<xq,xe,xus>{quotecontinue} {
-					ECHO;
-				}
-<xe>.			{
-					/* This is only needed for \ just before EOF */
-					ECHO;
-				}
-
-{dolqdelim}		{
-					cur_state->dolqstart = pg_strdup(yytext);
-					BEGIN(xdolq);
-					ECHO;
-				}
-{dolqfailed}	{
-					/* throw back all but the initial "$" */
-					yyless(1);
-					ECHO;
-				}
-<xdolq>{dolqdelim} {
-					if (strcmp(yytext, cur_state->dolqstart) == 0)
-					{
-						free(cur_state->dolqstart);
-						cur_state->dolqstart = NULL;
-						BEGIN(INITIAL);
-					}
-					else
-					{
-						/*
-						 * When we fail to match $...$ to dolqstart, transfer
-						 * the $... part to the output, but put back the final
-						 * $ for rescanning.  Consider $delim$...$junk$delim$
-						 */
-						yyless(yyleng-1);
-					}
-					ECHO;
-				}
-<xdolq>{dolqinside} {
-					ECHO;
-				}
-<xdolq>{dolqfailed} {
-					ECHO;
-				}
-<xdolq>.		{
-					/* This is only needed for $ inside the quoted text */
-					ECHO;
-				}
-
-{xdstart}		{
-					BEGIN(xd);
-					ECHO;
-				}
-{xuistart}		{
-					BEGIN(xui);
-					ECHO;
-				}
-<xd>{xdstop}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xui>{dquote} {
-					yyless(1);
-					BEGIN(xuiend);
-					ECHO;
-				}
-<xuiend>{whitespace} {
-					ECHO;
-				}
-<xuiend>{other} |
-<xuiend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xuiend>{xustop2}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xd,xui>{xddouble}	{
-					ECHO;
-				}
-<xd,xui>{xdinside}	{
-					ECHO;
-				}
-
-{xufailed}	{
-					/* throw back all but the initial u/U */
-					yyless(1);
-					ECHO;
-				}
-
-{typecast}		{
-					ECHO;
-				}
-
-{dot_dot}		{
-					ECHO;
-				}
-
-{colon_equals}	{
-					ECHO;
-				}
-
-{equals_greater} {
-					ECHO;
-				}
-
-{less_equals}	{
-					ECHO;
-				}
-
-{greater_equals} {
-					ECHO;
-				}
-
-{less_greater}	{
-					ECHO;
-				}
-
-{not_equals}	{
-					ECHO;
-				}
-
-	/*
-	 * These rules are specific to psql --- they implement parenthesis
-	 * counting and detection of command-ending semicolon.  These must
-	 * appear before the {self} rule so that they take precedence over it.
-	 */
-
-"("				{
-					cur_state->paren_depth++;
-					ECHO;
-				}
-
-")"				{
-					if (cur_state->paren_depth > 0)
-						cur_state->paren_depth--;
-					ECHO;
-				}
-
-";"				{
-					ECHO;
-					if (cur_state->paren_depth == 0)
-					{
-						/* Terminate lexing temporarily */
-						return LEXRES_SEMI;
-					}
-				}
-
-	/*
-	 * psql-specific rules to handle backslash commands and variable
-	 * substitution.  We want these before {self}, also.
-	 */
-
-"\\"[;:]		{
-					/* Force a semicolon or colon into the query buffer */
-					emit(yytext + 1, 1);
-				}
-
-"\\"			{
-					/* Terminate lexing temporarily */
-					return LEXRES_BACKSLASH;
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
-
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
-
-					if (value)
-					{
-						/* It is a variable, check for recursion */
-						if (var_is_current_source(cur_state, varname))
-						{
-							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
-									   varname);
-							/* Instead copy the string as is */
-							ECHO;
-						}
-						else
-						{
-							/* OK, perform substitution */
-							push_new_buffer(value, varname);
-							/* yy_scan_string already made buffer active */
-						}
-					}
-					else
-					{
-						/*
-						 * if the variable doesn't exist we'll copy the
-						 * string as is
-						 */
-						ECHO;
-					}
-
-					free(varname);
-				}
-
-:'{variable_char}+'	{
-					escape_variable(false);
-				}
-
-:\"{variable_char}+\"	{
-					escape_variable(true);
-				}
-
-	/*
-	 * These rules just avoid the need for scanner backup if one of the
-	 * two rules above fails to match completely.
-	 */
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-	/*
-	 * Back to backend-compatible rules.
-	 */
-
-{self}			{
-					ECHO;
-				}
-
-{operator}		{
-					/*
-					 * Check for embedded slash-star or dash-dash; those
-					 * are comment starts, so operator must stop there.
-					 * Note that slash-star or dash-dash at the first
-					 * character will match a prior rule, not this one.
-					 */
-					int		nchars = yyleng;
-					char   *slashstar = strstr(yytext, "/*");
-					char   *dashdash = strstr(yytext, "--");
-
-					if (slashstar && dashdash)
-					{
-						/* if both appear, take the first one */
-						if (slashstar > dashdash)
-							slashstar = dashdash;
-					}
-					else if (!slashstar)
-						slashstar = dashdash;
-					if (slashstar)
-						nchars = slashstar - yytext;
-
-					/*
-					 * For SQL compatibility, '+' and '-' cannot be the
-					 * last char of a multi-char operator unless the operator
-					 * contains chars that are not in SQL operators.
-					 * The idea is to lex '=-' as two operators, but not
-					 * to forbid operator names like '?-' that could not be
-					 * sequences of SQL operators.
-					 */
-					while (nchars > 1 &&
-						   (yytext[nchars-1] == '+' ||
-							yytext[nchars-1] == '-'))
-					{
-						int		ic;
-
-						for (ic = nchars-2; ic >= 0; ic--)
-						{
-							if (strchr("~!@#^&|`?%", yytext[ic]))
-								break;
-						}
-						if (ic >= 0)
-							break; /* found a char that makes it OK */
-						nchars--; /* else remove the +/-, and check again */
-					}
-
-					if (nchars < yyleng)
-					{
-						/* Strip the unwanted chars from the token */
-						yyless(nchars);
-					}
-					ECHO;
-				}
-
-{param}			{
-					ECHO;
-				}
-
-{integer}		{
-					ECHO;
-				}
-{decimal}		{
-					ECHO;
-				}
-{decimalfail}	{
-					/* throw back the .., and treat as integer */
-					yyless(yyleng-2);
-					ECHO;
-				}
-{real}			{
-					ECHO;
-				}
-{realfail1}		{
-					/*
-					 * throw back the [Ee], and treat as {decimal}.  Note
-					 * that it is possible the input is actually {integer},
-					 * but since this case will almost certainly lead to a
-					 * syntax error anyway, we don't bother to distinguish.
-					 */
-					yyless(yyleng-1);
-					ECHO;
-				}
-{realfail2}		{
-					/* throw back the [Ee][+-], and proceed as above */
-					yyless(yyleng-2);
-					ECHO;
-				}
-
-
-{identifier}	{
-					ECHO;
-				}
-
-{other}			{
-					ECHO;
-				}
-
-
-	/*
-	 * Everything from here down is psql-specific.
-	 */
-
-<<EOF>>			{
-					StackElem  *stackelem = cur_state->buffer_stack;
-
-					if (stackelem == NULL)
-						return LEXRES_EOL; /* end of input reached */
-
-					/*
-					 * We were expanding a variable, so pop the inclusion
-					 * stack and keep lexing
-					 */
-					pop_buffer_stack(cur_state);
-
-					stackelem = cur_state->buffer_stack;
-					if (stackelem != NULL)
-					{
-						yy_switch_to_buffer(stackelem->buf);
-						cur_state->curline = stackelem->bufstring;
-						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
-					}
-					else
-					{
-						yy_switch_to_buffer(cur_state->scanbufhandle);
-						cur_state->curline = cur_state->scanbuf;
-						cur_state->refline = cur_state->scanline;
-					}
-				}
-
-	/*
-	 * Exclusive lexer states to handle backslash command lexing
-	 */
-
-<xslashcmd>{
-	/* command name ends at whitespace or backslash; eat all else */
-
-{space}|"\\"	{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashargstart>{
-	/*
-	 * Discard any whitespace before argument, then go to xslasharg state.
-	 * An exception is that "|" is only special at start of argument, so we
-	 * check for it here.
-	 */
-
-{space}+		{ }
-
-"|"				{
-					if (option_type == OT_FILEPIPE)
-					{
-						/* treat like whole-string case */
-						ECHO;
-						BEGIN(xslashwholeline);
-					}
-					else
-					{
-						/* vertical bar is not special otherwise */
-						yyless(0);
-						BEGIN(xslasharg);
-					}
-				}
-
-{other}			{
-					yyless(0);
-					BEGIN(xslasharg);
-				}
-
-}
-
-<xslasharg>{
-	/*
-	 * Default processing of text in a slash command's argument.
-	 *
-	 * Note: unquoted_option_chars counts the number of characters at the
-	 * end of the argument that were not subject to any form of quoting.
-	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
-	 */
-
-{space}|"\\"	{
-					/*
-					 * Unquoted space is end of arg; do not eat.  Likewise
-					 * backslash is end of command or next command, do not eat
-					 *
-					 * XXX this means we can't conveniently accept options
-					 * that include unquoted backslashes; therefore, option
-					 * processing that encourages use of backslashes is rather
-					 * broken.
-					 */
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{quote}			{
-					*option_quote = '\'';
-					unquoted_option_chars = 0;
-					BEGIN(xslashquote);
-				}
-
-"`"				{
-					backtick_start_offset = output_buf->len;
-					*option_quote = '`';
-					unquoted_option_chars = 0;
-					BEGIN(xslashbackquote);
-				}
-
-{dquote}		{
-					ECHO;
-					*option_quote = '"';
-					unquoted_option_chars = 0;
-					BEGIN(xslashdquote);
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						char   *varname;
-						const char *value;
-
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
-
-						/*
-						 * The variable value is just emitted without any
-						 * further examination.  This is consistent with the
-						 * pre-8.0 code behavior, if not with the way that
-						 * variables are handled outside backslash commands.
-						 * Note that we needn't guard against recursion here.
-						 */
-						if (value)
-							appendPQExpBufferStr(output_buf, value);
-						else
-							ECHO;
-
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}+'	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(false);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-
-:\"{variable_char}+\"	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(true);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-{other}			{
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-}
-
-<xslashquote>{
-	/*
-	 * single-quoted text: copy literally except for '' and backslash
-	 * sequences
-	 */
-
-{quote}			{ BEGIN(xslasharg); }
-
-{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
-
-"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
-"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
-"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
-"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
-"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
-
-{xeoctesc}		{
-					/* octal case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 1, NULL, 8));
-				}
-
-{xehexesc}		{
-					/* hex case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 2, NULL, 16));
-				}
-
-"\\".			{ emit(yytext + 1, 1); }
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashbackquote>{
-	/*
-	 * backticked text: copy everything until next backquote, then evaluate.
-	 *
-	 * XXX Possible future behavioral change: substitute for :VARIABLE?
-	 */
-
-"`"				{
-					/* In NO_EVAL mode, don't evaluate the command */
-					if (option_type != OT_NO_EVAL)
-						evaluate_backtick();
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashdquote>{
-	/* double-quoted text: copy verbatim, including the double quotes */
-
-{dquote}		{
-					ECHO;
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashwholeline>{
-	/* copy everything until end of input line */
-	/* but suppress leading whitespace */
-
-{space}+		{
-					if (output_buf->len > 0)
-						ECHO;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashend>{
-	/* at end of command, eat a double backslash, but not anything else */
-
-"\\\\"			{ return LEXRES_OK; }
-
-{other}|\n		{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-}
-
-%%
-
-/*
- * Create a lexer working state struct.
- */
-PsqlScanState
-psql_scan_create(void)
-{
-	PsqlScanState state;
-
-	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
-
-	psql_scan_reset(state);
-
-	return state;
-}
-
-/*
- * Destroy a lexer working state struct, releasing all resources.
- */
-void
-psql_scan_destroy(PsqlScanState state)
-{
-	psql_scan_finish(state);
-
-	psql_scan_reset(state);
-
-	free(state);
-}
-
-/*
- * Set up to perform lexing of the given input line.
- *
- * The text at *line, extending for line_len bytes, will be scanned by
- * subsequent calls to the psql_scan routines.  psql_scan_finish should
- * be called when scanning is complete.  Note that the lexer retains
- * a pointer to the storage at *line --- this string must not be altered
- * or freed until after psql_scan_finish is called.
- */
-void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
-{
-	/* Mustn't be scanning already */
-	Assert(state->scanbufhandle == NULL);
-	Assert(state->buffer_stack == NULL);
-
-	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
-
-	/* needed for prepare_buffer */
-	cur_state = state;
-
-	/* Set up flex input buffer with appropriate translation and padding */
-	state->scanbufhandle = prepare_buffer(line, line_len,
-										  &state->scanbuf);
-	state->scanline = line;
-
-	/* Set lookaside data in case we have to map unsafe encoding */
-	state->curline = state->scanbuf;
-	state->refline = state->scanline;
-}
-
-/*
- * Do lexical analysis of SQL command text.
- *
- * The text previously passed to psql_scan_setup is scanned, and appended
- * (possibly with transformation) to query_buf.
- *
- * The return value indicates the condition that stopped scanning:
- *
- * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
- * transferred to query_buf.)  The command accumulated in query_buf should
- * be executed, then clear query_buf and call again to scan the remainder
- * of the line.
- *
- * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
- * Any previous data on the line has been transferred to query_buf.
- * The caller will typically next call psql_scan_slash_command(),
- * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
- *
- * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
- * incomplete SQL command.  *prompt is set to the appropriate prompt type.
- *
- * PSCAN_EOL: the end of the line was reached, and there is no lexical
- * reason to consider the command incomplete.  The caller may or may not
- * choose to send it.  *prompt is set to the appropriate prompt type if
- * the caller chooses to collect more input.
- *
- * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
- * be called next, then the cycle may be repeated with a fresh input line.
- *
- * In all cases, *prompt is set to an appropriate prompt type code for the
- * next line-input operation.
- */
-PsqlScanResult
-psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt)
-{
-	PsqlScanResult result;
-	int			lexresult;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = query_buf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(state->start_state);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/* Update static vars back to the state struct */
-	state->start_state = YY_START;
-
-	/*
-	 * Check termination state and return appropriate result info.
-	 */
-	switch (lexresult)
-	{
-		case LEXRES_EOL:		/* end of input */
-			switch (state->start_state)
-			{
-				/* This switch must cover all non-slash-command states. */
-				case INITIAL:
-				case xuiend:	/* we treat these like INITIAL */
-				case xusend:
-					if (state->paren_depth > 0)
-					{
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_PAREN;
-					}
-					else if (query_buf->len > 0)
-					{
-						result = PSCAN_EOL;
-						*prompt = PROMPT_CONTINUE;
-					}
-					else
-					{
-						/* never bother to send an empty buffer */
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_READY;
-					}
-					break;
-				case xb:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xc:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_COMMENT;
-					break;
-				case xd:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xh:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xe:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xdolq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOLLARQUOTE;
-					break;
-				case xui:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xus:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				default:
-					/* can't get here */
-					fprintf(stderr, "invalid YY_START\n");
-					exit(1);
-			}
-			break;
-		case LEXRES_SEMI:		/* semicolon */
-			result = PSCAN_SEMICOLON;
-			*prompt = PROMPT_READY;
-			break;
-		case LEXRES_BACKSLASH:	/* backslash */
-			result = PSCAN_BACKSLASH;
-			*prompt = PROMPT_READY;
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid yylex result\n");
-			exit(1);
-	}
-
-	return result;
-}
-
-/*
- * Clean up after scanning a string.  This flushes any unread input and
- * releases resources (but not the PsqlScanState itself).  Note however
- * that this does not reset the lexer scan state; that can be done by
- * psql_scan_reset(), which is an orthogonal operation.
- *
- * It is legal to call this when not scanning anything (makes it easier
- * to deal with error recovery).
- */
-void
-psql_scan_finish(PsqlScanState state)
-{
-	/* Drop any incomplete variable expansions. */
-	while (state->buffer_stack != NULL)
-		pop_buffer_stack(state);
-
-	/* Done with the outer scan buffer, too */
-	if (state->scanbufhandle)
-		yy_delete_buffer(state->scanbufhandle);
-	state->scanbufhandle = NULL;
-	if (state->scanbuf)
-		free(state->scanbuf);
-	state->scanbuf = NULL;
-}
-
-/*
- * Reset lexer scanning state to start conditions.  This is appropriate
- * for executing \r psql commands (or any other time that we discard the
- * prior contents of query_buf).  It is not, however, necessary to do this
- * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
- * PSCAN_EOL scan result, because the scan state must be INITIAL when those
- * conditions are returned.
- *
- * Note that this is unrelated to flushing unread input; that task is
- * done by psql_scan_finish().
- */
-void
-psql_scan_reset(PsqlScanState state)
-{
-	state->start_state = INITIAL;
-	state->paren_depth = 0;
-	state->xcdepth = 0;			/* not really necessary */
-	if (state->dolqstart)
-		free(state->dolqstart);
-	state->dolqstart = NULL;
-}
-
-/*
- * Return true if lexer is currently in an "inside quotes" state.
- *
- * This is pretty grotty but is needed to preserve the old behavior
- * that mainloop.c drops blank lines not inside quotes without even
- * echoing them.
- */
-bool
-psql_scan_in_quote(PsqlScanState state)
-{
-	return state->start_state != INITIAL;
-}
-
-/*
- * Scan the command name of a psql backslash command.  This should be called
- * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
- * has been consumed through the leading backslash.
- *
- * The return value is a malloc'd copy of the command name, as parsed off
- * from the input.
- */
-char *
-psql_scan_slash_command(PsqlScanState state)
-{
-	PQExpBufferData mybuf;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashcmd);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-
-	return mybuf.data;
-}
-
-/*
- * Parse off the next argument for a backslash command, and return it as a
- * malloc'd string.  If there are no more arguments, returns NULL.
- *
- * type tells what processing, if any, to perform on the option string;
- * for example, if it's a SQL identifier, we want to downcase any unquoted
- * letters.
- *
- * if quote is not NULL, *quote is set to 0 if no quoting was found, else
- * the last quote symbol used in the argument.
- *
- * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
- * be taken as part of the option string will be stripped.
- *
- * NOTE: the only possible syntax errors for backslash options are unmatched
- * quotes, which are detected when we run out of input.  Therefore, on a
- * syntax error we just throw away the string and return NULL; there is no
- * need to worry about flushing remaining input.
- */
-char *
-psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon)
-{
-	PQExpBufferData mybuf;
-	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
-	char		local_quote;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	if (quote == NULL)
-		quote = &local_quote;
-	*quote = 0;
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-	option_type = type;
-	option_quote = quote;
-	unquoted_option_chars = 0;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	if (type == OT_WHOLE_LINE)
-		BEGIN(xslashwholeline);
-	else
-		BEGIN(xslashargstart);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/*
-	 * Check the lex result: we should have gotten back either LEXRES_OK
-	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
-	 * a quoted string, as indicated by YY_START, EOL is an error.
-	 */
-	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
-
-	switch (YY_START)
-	{
-		case xslashargstart:
-			/* empty arg */
-			break;
-		case xslasharg:
-			/* Strip any unquoted trailing semi-colons if requested */
-			if (semicolon)
-			{
-				while (unquoted_option_chars-- > 0 &&
-					   mybuf.len > 0 &&
-					   mybuf.data[mybuf.len - 1] == ';')
-				{
-					mybuf.data[--mybuf.len] = '\0';
-				}
-			}
-
-			/*
-			 * If SQL identifier processing was requested, then we strip out
-			 * excess double quotes and downcase unquoted letters.
-			 * Doubled double-quotes become output double-quotes, per spec.
-			 *
-			 * Note that a string like FOO"BAR"BAZ will be converted to
-			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
-			 * which would have us parse it as several identifiers.  But
-			 * for psql's purposes, we want a string like "foo"."bar" to
-			 * be treated as one option, so there's little choice.
-			 */
-			if (type == OT_SQLID || type == OT_SQLIDHACK)
-			{
-				bool		inquotes = false;
-				char	   *cp = mybuf.data;
-
-				while (*cp)
-				{
-					if (*cp == '"')
-					{
-						if (inquotes && cp[1] == '"')
-						{
-							/* Keep the first quote, remove the second */
-							cp++;
-						}
-						inquotes = !inquotes;
-						/* Collapse out quote at *cp */
-						memmove(cp, cp + 1, strlen(cp));
-						mybuf.len--;
-						/* do not advance cp */
-					}
-					else
-					{
-						if (!inquotes && type == OT_SQLID)
-							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
-					}
-				}
-			}
-			break;
-		case xslashquote:
-		case xslashbackquote:
-		case xslashdquote:
-			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
-			termPQExpBuffer(&mybuf);
-			return NULL;
-		case xslashwholeline:
-			/* always okay */
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid YY_START\n");
-			exit(1);
-	}
-
-	/*
-	 * An unquoted empty argument isn't possible unless we are at end of
-	 * command.  Return NULL instead.
-	 */
-	if (mybuf.len == 0 && *quote == 0)
-	{
-		termPQExpBuffer(&mybuf);
-		return NULL;
-	}
-
-	/* Else return the completed string. */
-	return mybuf.data;
-}
-
-/*
- * Eat up any unused \\ to complete a backslash command.
- */
-void
-psql_scan_slash_command_end(PsqlScanState state)
-{
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = NULL;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashend);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-}
-
-/*
- * Evaluate a backticked substring of a slash command's argument.
- *
- * The portion of output_buf starting at backtick_start_offset is evaluated
- * as a shell command and then replaced by the command's output.
- */
-static void
-evaluate_backtick(void)
-{
-	char	   *cmd = output_buf->data + backtick_start_offset;
-	PQExpBufferData cmd_output;
-	FILE	   *fd;
-	bool		error = false;
-	char		buf[512];
-	size_t		result;
-
-	initPQExpBuffer(&cmd_output);
-
-	fd = popen(cmd, PG_BINARY_R);
-	if (!fd)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (!error)
-	{
-		do
-		{
-			result = fread(buf, 1, sizeof(buf), fd);
-			if (ferror(fd))
-			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
-				error = true;
-				break;
-			}
-			appendBinaryPQExpBuffer(&cmd_output, buf, result);
-		} while (!feof(fd));
-	}
-
-	if (fd && pclose(fd) == -1)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (PQExpBufferDataBroken(cmd_output))
-	{
-		psql_error("%s: out of memory\n", cmd);
-		error = true;
-	}
-
-	/* Now done with cmd, delete it from output_buf */
-	output_buf->len = backtick_start_offset;
-	output_buf->data[output_buf->len] = '\0';
-
-	/* If no error, transfer result to output_buf */
-	if (!error)
-	{
-		/* strip any trailing newline */
-		if (cmd_output.len > 0 &&
-			cmd_output.data[cmd_output.len - 1] == '\n')
-			cmd_output.len--;
-		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
-	}
-
-	termPQExpBuffer(&cmd_output);
-}
-
-/*
- * Push the given string onto the stack of stuff to scan.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static void
-push_new_buffer(const char *newstr, const char *varname)
-{
-	StackElem  *stackelem;
-
-	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
-
-	/*
-	 * In current usage, the passed varname points at the current flex
-	 * input buffer; we must copy it before calling prepare_buffer()
-	 * because that will change the buffer state.
-	 */
-	stackelem->varname = varname ? pg_strdup(varname) : NULL;
-
-	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
-									&stackelem->bufstring);
-	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
-	{
-		stackelem->origstring = NULL;
-		cur_state->refline = stackelem->bufstring;
-	}
-	else
-	{
-		stackelem->origstring = pg_strdup(newstr);
-		cur_state->refline = stackelem->origstring;
-	}
-	stackelem->next = cur_state->buffer_stack;
-	cur_state->buffer_stack = stackelem;
-}
-
-/*
- * Pop the topmost buffer stack item (there must be one!)
- *
- * NB: after this, the flex input state is unspecified; caller must
- * switch to an appropriate buffer to continue lexing.
- */
-static void
-pop_buffer_stack(PsqlScanState state)
-{
-	StackElem  *stackelem = state->buffer_stack;
-
-	state->buffer_stack = stackelem->next;
-	yy_delete_buffer(stackelem->buf);
-	free(stackelem->bufstring);
-	if (stackelem->origstring)
-		free(stackelem->origstring);
-	if (stackelem->varname)
-		free(stackelem->varname);
-	free(stackelem);
-}
-
-/*
- * Check if specified variable name is the source for any string
- * currently being scanned
- */
-static bool
-var_is_current_source(PsqlScanState state, const char *varname)
-{
-	StackElem  *stackelem;
-
-	for (stackelem = state->buffer_stack;
-		 stackelem != NULL;
-		 stackelem = stackelem->next)
-	{
-		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
-			return true;
-	}
-	return false;
-}
-
-/*
- * Set up a flex input buffer to scan the given data.  We always make a
- * copy of the data.  If working in an unsafe encoding, the copy has
- * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static YY_BUFFER_STATE
-prepare_buffer(const char *txt, int len, char **txtcopy)
-{
-	char	   *newtxt;
-
-	/* Flex wants two \0 characters after the actual data */
-	newtxt = pg_malloc(len + 2);
-	*txtcopy = newtxt;
-	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
-
-	if (cur_state->safe_encoding)
-		memcpy(newtxt, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		int		i = 0;
-
-		while (i < len)
-		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
-
-			/* first byte should always be okay... */
-			newtxt[i] = txt[i];
-			i++;
-			while (--thislen > 0 && i < len)
-				newtxt[i++] = (char) 0xFF;
-		}
-	}
-
-	return yy_scan_buffer(newtxt, len + 2);
-}
-
-/*
- * emit() --- body for ECHO macro
- *
- * NB: this must be used for ALL and ONLY the text copied from the flex
- * input data.  If you pass it something that is not part of the yytext
- * string, you are making a mistake.  Internally generated text can be
- * appended directly to output_buf.
- */
-static void
-emit(const char *txt, int len)
-{
-	if (cur_state->safe_encoding)
-		appendBinaryPQExpBuffer(output_buf, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			appendPQExpBufferChar(output_buf, ch);
-		}
-	}
-}
-
-/*
- * extract_substring --- fetch the true value of (part of) the current token
- *
- * This is like emit(), except that the data is returned as a malloc'd string
- * rather than being pushed directly to output_buf.
- */
-static char *
-extract_substring(const char *txt, int len)
-{
-	char	   *result = (char *) pg_malloc(len + 1);
-
-	if (cur_state->safe_encoding)
-		memcpy(result, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			result[i] = ch;
-		}
-	}
-	result[len] = '\0';
-	return result;
-}
-
-/*
- * escape_variable --- process :'VARIABLE' or :"VARIABLE"
- *
- * If the variable name is found, escape its value using the appropriate
- * quoting method and emit the value to output_buf.  (Since the result is
- * surely quoted, there is never any reason to rescan it.)  If we don't
- * find the variable or the escaping function fails, emit the token as-is.
- */
-static void
-escape_variable(bool as_ident)
-{
-	char	   *varname;
-	const char *value;
-
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
-
-	/* Escaping. */
-	if (value)
-	{
-		if (!pset.db)
-			psql_error("can't escape without active connection\n");
-		else
-		{
-			char   *escaped_value;
-
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
-		}
-	}
-
-	/*
-	 * If we reach this point, some kind of error has occurred.  Emit the
-	 * original text into the output buffer.
-	 */
-	emit(yytext, yyleng);
-}
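A note on the prepare_buffer()/emit() pair that this hunk moves out of psqlscan.l: the 0xFF masking can be tried in isolation. The following is only a minimal sketch, not the psql code. demo_mblen() is a made-up stand-in for PQmblen() that treats any byte >= 0x80 as the start of a two-byte character, and demo_mask()/demo_unmask() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PQmblen(): a byte >= 0x80 starts a 2-byte char. */
static int
demo_mblen(const char *s)
{
	return ((unsigned char) *s >= 0x80) ? 2 : 1;
}

/*
 * Copy txt, replacing every non-first byte of a multibyte character by 0xFF,
 * so that a lexer never mistakes a trailing byte for an ASCII character.
 */
static char *
demo_mask(const char *txt, int len)
{
	char	   *copy = malloc(len + 1);
	int			i = 0;

	assert(copy != NULL);
	while (i < len)
	{
		int			thislen = demo_mblen(txt + i);

		/* the first byte is always kept as-is */
		copy[i] = txt[i];
		i++;
		while (--thislen > 0 && i < len)
			copy[i++] = (char) 0xFF;
	}
	copy[len] = '\0';
	return copy;
}

/* Rebuild the true text: wherever masked holds 0xFF, take the original byte. */
static void
demo_unmask(const char *masked, const char *orig, int len, char *out)
{
	int			i;

	for (i = 0; i < len; i++)
		out[i] = (masked[i] == (char) 0xFF) ? orig[i] : masked[i];
	out[len] = '\0';
}
```

With the input bytes 'a', 0x81, '\\', 'b', the masked copy contains no backslash at all, so a rule matching "\\" cannot fire on the second byte of the character, and demo_unmask() gives back the original four bytes.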
diff --git a/src/bin/psql/psqlscan_int.h b/src/bin/psql/psqlscan_int.h
new file mode 100644
index 0000000..0fb04c7
--- /dev/null
+++ b/src/bin/psql/psqlscan_int.h
@@ -0,0 +1,87 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_int.h
+ */
+#ifndef PSQLSCAN_INT_H
+#define PSQLSCAN_INT_H
+
+/* Abstract type for lexer's internal state */
+typedef struct PsqlScanStateData *PsqlScanState;
+
+/* Return values from yylex() */
+#define LEXRES_EOL			0	/* end of input */
+#define LEXRES_SEMI			1	/* command-terminating semicolon found */
+#define LEXRES_BACKSLASH	2	/* backslash command start */
+#define LEXRES_OK			3	/* OK completion of backslash argument */
+
+/*
+ * We use a stack of flex buffers to handle substitution of psql variables.
+ * Each stacked buffer contains the as-yet-unread text from one psql variable.
+ * When we pop the stack all the way, we resume reading from the outer buffer
+ * identified by scanbufhandle.
+ */
+typedef struct StackElem
+{
+	YY_BUFFER_STATE buf;		/* flex input control structure */
+	char	   *bufstring;		/* data actually being scanned by flex */
+	char	   *origstring;		/* copy of original data, if needed */
+	char	   *varname;		/* name of variable providing data, or NULL */
+	struct StackElem *next;
+} StackElem;
+
+/*
+ * All working state of the lexer must be stored in PsqlScanStateData
+ * between calls.  This allows us to have multiple open lexer operations,
+ * which is needed for nested include files.  The lexer itself is not
+ * recursive, but it must be re-entrant.
+ */
+typedef struct PsqlScanStateData
+{
+	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
+	/*
+	 * These variables always refer to the outer buffer, never to any
+	 * stacked variable-expansion buffer.
+	 */
+	YY_BUFFER_STATE scanbufhandle;
+	char	   *scanbuf;		/* start of outer-level input buffer */
+	const char *scanline;		/* current input line at outer level */
+
+	/* safe_encoding, curline, refline are used by scan_emit() to replace FFs */
+	PGconn	   *db;				/* active connection */
+	int			encoding;		/* encoding being used now */
+	bool		safe_encoding;	/* is current encoding "safe"? */
+	const char *curline;		/* actual flex input string for cur buf */
+	const char *refline;		/* original data for cur buffer */
+	int			curpos;			/* current position in curline  */
+	VariableSpace vars;			/* "shell variable" repository */
+
+	/*
+	 * All this state lives across successive input lines, until explicitly
+	 * reset by psql_scan_reset.
+	 */
+	int			start_state;	/* saved YY_START */
+	int			paren_depth;	/* depth of nesting in parentheses */
+	int			xcdepth;		/* depth of nesting in slash-star comments */
+	char	   *dolqstart;		/* current $foo$ quote start string */
+
+	/* Scan, cleanup and reset function for the lexer for this scan state */
+	void	(*finish)(PsqlScanState state);
+	void	(*reset)(PsqlScanState state);
+	YY_BUFFER_STATE (*my_yy_scan_buffer)(char *base, yy_size_t size);
+} PsqlScanStateData;
+
+extern PQExpBuffer output_buf;	/* current output buffer */
+extern void psql_scan_switch_lexer(PsqlScanState state);
+extern char *extract_substring(const char *txt, int len);
+extern void escape_variable(bool as_ident);
+extern void push_new_buffer(const char *newstr, const char *varname);
+extern void pop_buffer_stack(PsqlScanState state);
+extern bool var_is_current_source(PsqlScanState state, const char *varname);
+extern void scan_emit(const char *txt, int len);
+extern YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
+									  char **txtcopy);
+
+#endif   /* PSQLSCAN_INT_H */
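For readers not familiar with the buffer stack declared in this header: the varname field is what lets the lexer refuse to expand a variable that is already being expanded. A reduced sketch of that idea follows, with the flex buffers left out entirely. DemoElem and the demo_* functions are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Reduced stack element: only the name of the variable providing the data. */
typedef struct DemoElem
{
	char	   *varname;		/* name of variable, or NULL */
	struct DemoElem *next;
} DemoElem;

/* Push an entry; the name is copied because the caller's buffer may change. */
static void
demo_push(DemoElem **stack, const char *varname)
{
	DemoElem   *elem = malloc(sizeof(DemoElem));

	assert(elem != NULL);
	elem->varname = NULL;
	if (varname)
	{
		size_t		n = strlen(varname) + 1;

		elem->varname = malloc(n);
		assert(elem->varname != NULL);
		memcpy(elem->varname, varname, n);
	}
	elem->next = *stack;
	*stack = elem;
}

/* Pop the topmost entry (there must be one). */
static void
demo_pop(DemoElem **stack)
{
	DemoElem   *elem = *stack;

	assert(elem != NULL);
	*stack = elem->next;
	free(elem->varname);
	free(elem);
}

/* Is varname the source of any buffer currently on the stack? */
static bool
demo_is_current_source(const DemoElem *stack, const char *varname)
{
	const DemoElem *elem;

	for (elem = stack; elem != NULL; elem = elem->next)
	{
		if (elem->varname && strcmp(elem->varname, varname) == 0)
			return true;
	}
	return false;
}
```

A caller would check demo_is_current_source() before pushing: if the name is already on the stack, expanding it again would recurse forever, so the token is emitted as-is instead.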
diff --git a/src/bin/psql/psqlscan_slash.c b/src/bin/psql/psqlscan_slash.c
new file mode 100644
index 0000000..223bde4
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.c
@@ -0,0 +1,19 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.c
+ *
+ */
+
+/*
+ * psqlscan_slashbody.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system include
+ * files, else things tend to break on platforms that have multiple
+ * infrastructures for stdio.h and so on.  flex is absolutely uncooperative
+ * about that, so we can't compile psqlscan_slashbody.c on its own.
+ */
+#include "common.h"
+#include "psqlscan.h"
+#include "psqlscan_slashbody.c"
diff --git a/src/bin/psql/psqlscan_slash.h b/src/bin/psql/psqlscan_slash.h
new file mode 100644
index 0000000..71acbfb
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.h
@@ -0,0 +1,31 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.h
+ */
+#ifndef PSQLSCAN_SLASH_H
+#define PSQLSCAN_SLASH_H
+
+/* Different ways for scan_slash_option to handle parameter words */
+enum slash_option_type
+{
+	OT_NORMAL,					/* normal case */
+	OT_SQLID,					/* treat as SQL identifier */
+	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
+	OT_FILEPIPE,				/* it's a filename or pipe */
+	OT_WHOLE_LINE,				/* just snarf the rest of the line */
+	OT_NO_EVAL					/* no expansion of backticks or variables */
+};
+
+extern char *psql_scan_slash_command(PsqlScanState state);
+
+extern char *psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon);
+
+extern void psql_scan_slash_command_end(PsqlScanState state);
+
+#endif   /* PSQLSCAN_SLASH_H */
diff --git a/src/bin/psql/psqlscan_slashbody.l b/src/bin/psql/psqlscan_slashbody.l
new file mode 100644
index 0000000..06c218b
--- /dev/null
+++ b/src/bin/psql/psqlscan_slashbody.l
@@ -0,0 +1,757 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscan_slashbody.l
+ *	  lexical scanner for slash commands of psql
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscan_slashbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+#include "psqlscan_slash.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+
+/* these variables do not need to be saved across calls */
+static enum slash_option_type option_type;
+static char *option_quote;
+static int	unquoted_option_chars;
+static int	backtick_start_offset;
+
+static void evaluate_backtick(void);
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless */
+#define my_yyless(n) cur_state->curpos -= (yyleng - (n)); yyless(n)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/* Exclusive states for psql only: lex backslash commands */
+%x xslashargstart
+%x xslasharg
+%x xslashquote
+%x xslashbackquote
+%x xslashdquote
+%x xslashwholeline
+%x xslashend
+
+space			[ \t\n\r\f]
+quote			'
+
+/* Quoted string that allows backslash escapes */
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqdouble		{quote}{quote}
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+%%
+	/*
+	 * Exclusive lexer states to handle backslash command lexing
+	 */
+
+{
+	/* command name ends at whitespace or backslash; eat all else */
+
+{space}|"\\"	{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{other}			{ ECHO;}
+
+}
+
+<xslashargstart>{
+	/*
+	 * Discard any whitespace before argument, then go to xslasharg state.
+	 * An exception is that "|" is only special at start of argument, so we
+	 * check for it here.
+	 */
+
+{space}+		{ }
+
+"|"				{
+					if (option_type == OT_FILEPIPE)
+					{
+						/* treat like whole-string case */
+						ECHO;
+						BEGIN(xslashwholeline);
+					}
+					else
+					{
+						/* vertical bar is not special otherwise */
+						my_yyless(0);
+						BEGIN(xslasharg);
+					}
+				}
+
+{other}			{
+					my_yyless(0);
+					BEGIN(xslasharg);
+				}
+
+}
+
+<xslasharg>{
+	/*
+	 * Default processing of text in a slash command's argument.
+	 *
+	 * Note: unquoted_option_chars counts the number of characters at the
+	 * end of the argument that were not subject to any form of quoting.
+	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
+	 */
+
+{space}|"\\"	{
+					/*
+					 * Unquoted space is end of arg; do not eat.  Likewise
+					 * backslash is end of command or next command, do not eat
+					 *
+					 * XXX this means we can't conveniently accept options
+					 * that include unquoted backslashes; therefore, option
+					 * processing that encourages use of backslashes is rather
+					 * broken.
+					 */
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{quote}			{
+					*option_quote = '\'';
+					unquoted_option_chars = 0;
+					BEGIN(xslashquote);
+				}
+
+"`"				{
+					backtick_start_offset = output_buf->len;
+					*option_quote = '`';
+					unquoted_option_chars = 0;
+					BEGIN(xslashbackquote);
+				}
+
+{dquote}		{
+					ECHO;
+					*option_quote = '"';
+					unquoted_option_chars = 0;
+					BEGIN(xslashdquote);
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						char   *varname;
+						const char *value = NULL;
+
+						if (cur_state->vars)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = GetVariable(cur_state->vars, varname);
+							free(varname);
+						}
+
+						/*
+						 * The variable value is just emitted without any
+						 * further examination.  This is consistent with the
+						 * pre-8.0 code behavior, if not with the way that
+						 * variables are handled outside backslash commands.
+						 * Note that we needn't guard against recursion here.
+						 */
+						if (value)
+							appendPQExpBufferStr(output_buf, value);
+						else
+							ECHO;
+
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}+'	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(false);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+
+:\"{variable_char}+\"	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(true);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+{other}			{
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+}
+
+<xslashquote>{
+	/*
+	 * single-quoted text: copy literally except for '' and backslash
+	 * sequences
+	 */
+
+{quote}			{ BEGIN(xslasharg); }
+
+{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
+
+"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
+"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
+"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
+"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
+"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
+
+{xeoctesc}		{
+					/* octal case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 1, NULL, 8));
+				}
+
+{xehexesc}		{
+					/* hex case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 2, NULL, 16));
+				}
+
+"\\".			{ scan_emit(yytext + 1, 1); }
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashbackquote>{
+	/*
+	 * backticked text: copy everything until next backquote, then evaluate.
+	 *
+	 * XXX Possible future behavioral change: substitute for :VARIABLE?
+	 */
+
+"`"				{
+					/* In NO_EVAL mode, don't evaluate the command */
+					if (option_type != OT_NO_EVAL)
+						evaluate_backtick();
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashdquote>{
+	/* double-quoted text: copy verbatim, including the double quotes */
+
+{dquote}		{
+					ECHO;
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashwholeline>{
+	/* copy everything until end of input line */
+	/* but suppress leading whitespace */
+
+{space}+		{
+					if (output_buf->len > 0)
+						ECHO;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashend>{
+	/* at end of command, eat a double backslash, but not anything else */
+
+"\\\\"			{ return LEXRES_OK; }
+
+{other}|\n		{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+}
+
+%%
+
+static void psql_scan_slash_command_finish(PsqlScanState state);
+static void psql_scan_slash_command_reset(PsqlScanState state);
+
+static void
+psql_scan_slash_command_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &psql_scan_slash_command_finish;
+	state->reset = &psql_scan_slash_command_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+static void
+psql_scan_slash_command_setup(PsqlScanState state,
+							  const char *line, int line_len,
+							  PGconn *db, VariableSpace vars, int encoding)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* Do we need to hack the character set encoding? */
+	state->encoding = encoding;
+	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+
+	state->vars = vars;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Create a new scanning state for this lexer that starts parsing at the
+ * current position of the given scanning state, which belongs to another
+ * lexer.  The given state is destroyed.
+ *
+ * Note: This function cannot access yy* functions and variables of the given
+ * state because they belong to a different lexer.
+ */
+static void
+psql_scan_slash_command_switch_lexer(PsqlScanState state)
+{
+	const char *newscanline = state->scanline + state->curpos;
+	PGconn		   *db = state->db;
+	VariableSpace	vars = state->vars;
+	int				encoding = state->encoding;
+
+	psql_scan_slash_command_initialize(state);
+	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline),
+								  db, vars, encoding);
+}
+
+/*
+ * Scan the command name of a psql backslash command.  This should be called
+ * after psql_scan() on the main lexer returns PSCAN_BACKSLASH.  It is assumed
+ * that the input has been consumed through the leading backslash.
+ *
+ * The return value is a malloc'd copy of the command name, as parsed off
+ * from the input.
+ */
+char *
+psql_scan_slash_command(PsqlScanState state)
+{
+	PQExpBufferData mybuf;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	psql_scan_slash_command_switch_lexer(state);
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(INITIAL);
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+
+	return mybuf.data;
+}
+
+/*
+ * Parse off the next argument for a backslash command, and return it as a
+ * malloc'd string.  If there are no more arguments, returns NULL.
+ *
+ * type tells what processing, if any, to perform on the option string;
+ * for example, if it's a SQL identifier, we want to downcase any unquoted
+ * letters.
+ *
+ * if quote is not NULL, *quote is set to 0 if no quoting was found, else
+ * the last quote symbol used in the argument.
+ *
+ * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
+ * be taken as part of the option string will be stripped.
+ *
+ * NOTE: the only possible syntax errors for backslash options are unmatched
+ * quotes, which are detected when we run out of input.  Therefore, on a
+ * syntax error we just throw away the string and return NULL; there is no
+ * need to worry about flushing remaining input.
+ */
+char *
+psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon)
+{
+	PQExpBufferData mybuf;
+	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
+	char		local_quote;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	if (quote == NULL)
+		quote = &local_quote;
+	*quote = 0;
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+	option_type = type;
+	option_quote = quote;
+	unquoted_option_chars = 0;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	if (type == OT_WHOLE_LINE)
+		BEGIN(xslashwholeline);
+	else
+		BEGIN(xslashargstart);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/*
+	 * Check the lex result: we should have gotten back either LEXRES_OK
+	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
+	 * a quoted string, as indicated by YY_START, EOL is an error.
+	 */
+	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
+
+	switch (YY_START)
+	{
+		case xslashargstart:
+			/* empty arg */
+			break;
+		case xslasharg:
+			/* Strip any unquoted trailing semi-colons if requested */
+			if (semicolon)
+			{
+				while (unquoted_option_chars-- > 0 &&
+					   mybuf.len > 0 &&
+					   mybuf.data[mybuf.len - 1] == ';')
+				{
+					mybuf.data[--mybuf.len] = '\0';
+				}
+			}
+
+			/*
+			 * If SQL identifier processing was requested, then we strip out
+			 * excess double quotes and downcase unquoted letters.
+			 * Doubled double-quotes become output double-quotes, per spec.
+			 *
+			 * Note that a string like FOO"BAR"BAZ will be converted to
+			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
+			 * which would have us parse it as several identifiers.  But
+			 * for psql's purposes, we want a string like "foo"."bar" to
+			 * be treated as one option, so there's little choice.
+			 */
+			if (type == OT_SQLID || type == OT_SQLIDHACK)
+			{
+				bool		inquotes = false;
+				char	   *cp = mybuf.data;
+
+				while (*cp)
+				{
+					if (*cp == '"')
+					{
+						if (inquotes && cp[1] == '"')
+						{
+							/* Keep the first quote, remove the second */
+							cp++;
+						}
+						inquotes = !inquotes;
+						/* Collapse out quote at *cp */
+						memmove(cp, cp + 1, strlen(cp));
+						mybuf.len--;
+						/* do not advance cp */
+					}
+					else
+					{
+						if (!inquotes && type == OT_SQLID)
+							*cp = pg_tolower((unsigned char) *cp);
+						cp += PQmblen(cp, cur_state->encoding);
+					}
+				}
+			}
+			break;
+		case xslashquote:
+		case xslashbackquote:
+		case xslashdquote:
+			/* must have hit EOL inside quotes */
+			psql_error("unterminated quoted string\n");
+			termPQExpBuffer(&mybuf);
+			return NULL;
+		case xslashwholeline:
+			/* always okay */
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid YY_START\n");
+			exit(1);
+	}
+
+	/*
+	 * An unquoted empty argument isn't possible unless we are at end of
+	 * command.  Return NULL instead.
+	 */
+	if (mybuf.len == 0 && *quote == 0)
+	{
+		termPQExpBuffer(&mybuf);
+		return NULL;
+	}
+
+	/* Else return the completed string. */
+	return mybuf.data;
+}
+
+/*
+ * Eat up any unused \\ to complete a backslash command.
+ */
+void
+psql_scan_slash_command_end(PsqlScanState state)
+{
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = NULL;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashend);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+	psql_scan_switch_lexer(state);
+}
+
+/*
+ * Evaluate a backticked substring of a slash command's argument.
+ *
+ * The portion of output_buf starting at backtick_start_offset is evaluated
+ * as a shell command and then replaced by the command's output.
+ */
+static void
+evaluate_backtick(void)
+{
+	char	   *cmd = output_buf->data + backtick_start_offset;
+	PQExpBufferData cmd_output;
+	FILE	   *fd;
+	bool		error = false;
+	char		buf[512];
+	size_t		result;
+
+	initPQExpBuffer(&cmd_output);
+
+	fd = popen(cmd, PG_BINARY_R);
+	if (!fd)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (!error)
+	{
+		do
+		{
+			result = fread(buf, 1, sizeof(buf), fd);
+			if (ferror(fd))
+			{
+				psql_error("%s: %s\n", cmd, strerror(errno));
+				error = true;
+				break;
+			}
+			appendBinaryPQExpBuffer(&cmd_output, buf, result);
+		} while (!feof(fd));
+	}
+
+	if (fd && pclose(fd) == -1)
+	{
+		psql_error("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (PQExpBufferDataBroken(cmd_output))
+	{
+		psql_error("%s: out of memory\n", cmd);
+		error = true;
+	}
+
+	/* Now done with cmd, delete it from output_buf */
+	output_buf->len = backtick_start_offset;
+	output_buf->data[output_buf->len] = '\0';
+
+	/* If no error, transfer result to output_buf */
+	if (!error)
+	{
+		/* strip any trailing newline */
+		if (cmd_output.len > 0 &&
+			cmd_output.data[cmd_output.len - 1] == '\n')
+			cmd_output.len--;
+		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
+	}
+
+	termPQExpBuffer(&cmd_output);
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+psql_scan_slash_command_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yys_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+psql_scan_slash_command_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
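The OT_SQLID branch of psql_scan_slash_option() is easier to follow with a standalone version of the loop. The sketch below mirrors it under two simplifying assumptions: characters are single bytes (so it advances by one instead of calling PQmblen()), and plain tolower() replaces pg_tolower(). demo_dequote_downcase() is a hypothetical name. The doubled-quote branch is reproduced as written in the patch and is not exercised here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Strip double quotes from s in place and downcase the letters that were
 * outside quotes.  Single-byte characters only.
 */
static void
demo_dequote_downcase(char *s)
{
	bool		inquotes = false;
	char	   *cp = s;

	assert(s != NULL);
	while (*cp)
	{
		if (*cp == '"')
		{
			if (inquotes && cp[1] == '"')
			{
				/* Keep the first quote, remove the second */
				cp++;
			}
			inquotes = !inquotes;
			/* Collapse out the quote at *cp; this also moves the '\0' */
			memmove(cp, cp + 1, strlen(cp));
			/* do not advance cp */
		}
		else
		{
			if (!inquotes)
				*cp = (char) tolower((unsigned char) *cp);
			cp++;
		}
	}
}
```

Called on FOO"BAR"BAZ this yields fooBARbaz, matching the comment in the patch, and "Foo"."Bar" becomes Foo.Bar, which is why a qualified name can be passed as one option.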
diff --git a/src/bin/psql/psqlscanbody.l b/src/bin/psql/psqlscanbody.l
new file mode 100644
index 0000000..cf5755c
--- /dev/null
+++ b/src/bin/psql/psqlscanbody.l
@@ -0,0 +1,1431 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscanbody.l
+ *	  lexical scanner for psql
+ *
+ * This code is mainly needed to determine where the end of a SQL statement
+ * is: we are looking for semicolons that are not within quotes, comments,
+ * or parentheses.  The most reliable way to handle this is to borrow the
+ * backend's flex lexer rules, lock, stock, and barrel.  The rules below
+ * are (except for a few) the same as the backend's, but their actions are
+ * just ECHO whereas the backend's actions generally do other things.
+ *
+ * XXX The rules in this file must be kept in sync with the backend lexer!!!
+ *
+ * XXX Avoid creating backtracking cases --- see the backend lexer for info.
+ *
+ * The most difficult aspect of this code is that we need to work in multibyte
+ * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
+ * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
+ * all our lexing rules treat all high-bit-set characters alike, we don't
+ * really need to care whether such a byte is part of a sequence or not.
+ * In an "unsafe" encoding, we still expect the first byte of a multibyte
+ * sequence to be >= 0x80, but later bytes might not be.  If we scan such
+ * a sequence as-is, the lexing rules could easily be fooled into matching
+ * such bytes to ordinary ASCII characters.  Our solution for this is to
+ * substitute 0xFF for each non-first byte within the data presented to flex.
+ * The flex rules will then pass the FF's through unmolested.  The scan_emit()
+ * subroutine is responsible for looking back to the original string and
+ * replacing FF's with the corresponding original bytes.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscanbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+
+PQExpBuffer output_buf;	/* current output buffer */
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless */
+#define my_yyless(n) cur_state->curpos -= (yyleng - (n)); yyless(n)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/*
+ * OK, here is a short description of lex/flex rules behavior.
+ * The longest pattern which matches an input string is always chosen.
+ * For equal-length patterns, the first occurring in the rules list is chosen.
+ * INITIAL is the starting state, to which all non-conditional rules apply.
+ * Exclusive states change parsing rules while the state is active.  When in
+ * an exclusive state, only those rules defined for that state apply.
+ *
+ * We use exclusive states for quoted strings, extended comments,
+ * and to eliminate parsing troubles for numeric strings.
+ * Exclusive states:
+ *  <xb> bit string literal
+ *  <xc> extended C-style comments
+ *  <xd> delimited identifiers (double-quoted identifiers)
+ *  <xh> hexadecimal numeric string
+ *  <xq> standard quoted strings
+ *  <xe> extended quoted strings (support backslash escape sequences)
+ *  <xdolq> $foo$ quoted strings
+ *  <xui> quoted identifier with Unicode escapes
+ *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
+ *  <xus> quoted string with Unicode escapes
+ *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
+ *
+ * Note: we intentionally don't mimic the backend's <xeu> state; we have
+ * no need to distinguish it from <xe> state, and no good way to get out
+ * of it in error cases.  The backend just throws yyerror() in those
+ * cases, but that's not an option here.
+ */
+
+%x xb
+%x xc
+%x xd
+%x xh
+%x xe
+%x xq
+%x xdolq
+%x xui
+%x xuiend
+%x xus
+%x xusend
+
+/*
+ * In order to make the world safe for Windows and Mac clients as well as
+ * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
+ * sequence will be seen as two successive newlines, but that doesn't cause
+ * any problems.  Comments that start with -- and extend to the next
+ * newline are treated as equivalent to a single whitespace character.
+ *
+ * NOTE a fine point: if there is no newline following --, we will absorb
+ * everything to the end of the input as a comment.  This is correct.  Older
+ * versions of Postgres failed to recognize -- as a comment if the input
+ * did not end with a newline.
+ *
+ * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ *
+ * XXX if you change the set of whitespace characters, fix scanner_isspace()
+ * to agree, and see also the plpgsql lexer.
+ */
+
+space			[ \t\n\r\f]
+horiz_space		[ \t\f]
+newline			[\n\r]
+non_newline		[^\n\r]
+
+comment			("--"{non_newline}*)
+
+whitespace		({space}+|{comment})
+
+/*
+ * SQL requires at least one newline in the whitespace separating
+ * string literals that are to be concatenated.  Silly, but who are we
+ * to argue?  Note that {whitespace_with_newline} should not have * after
+ * it, whereas {whitespace} should generally have a * after it...
+ */
+
+special_whitespace		({space}+|{comment}{newline})
+horiz_whitespace		({horiz_space}|{comment})
+whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+
+/*
+ * To ensure that {quotecontinue} can be scanned without having to back up
+ * if the full pattern isn't matched, we include trailing whitespace in
+ * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
+ * except for {quote} followed by whitespace and just one "-" (not two,
+ * which would start a {comment}).  To cover that we have {quotefail}.
+ * The actions for {quotestop} and {quotefail} must throw back characters
+ * beyond the quote proper.
+ */
+quote			'
+quotestop		{quote}{whitespace}*
+quotecontinue	{quote}{whitespace_with_newline}{quote}
+quotefail		{quote}{whitespace}*"-"
+
+/* Bit string
+ * It is tempting to scan the string for only those characters
+ * which are allowed. However, this leads to silently swallowed
+ * characters if illegal characters are included in the string.
+ * For example, if xbinside is [01] then B'ABCD' is interpreted
+ * as a zero-length string, and the ABCD' is lost!
+ * Better to pass the string forward and let the input routines
+ * validate the contents.
+ */
+xbstart			[bB]{quote}
+xbinside		[^']*
+
+/* Hexadecimal number */
+xhstart			[xX]{quote}
+xhinside		[^']*
+
+/* National character */
+xnstart			[nN]{quote}
+
+/* Quoted string that allows backslash escapes */
+xestart			[eE]{quote}
+xeinside		[^\\']+
+xeescape		[\\][^0-7]
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
+xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqstart			{quote}
+xqdouble		{quote}{quote}
+xqinside		[^']+
+
+/* $foo$ style quotes ("dollar quoting")
+ * The quoted string starts with $foo$ where "foo" is an optional string
+ * in the form of an identifier, except that it may not contain "$",
+ * and extends to the first occurrence of an identical string.
+ * There is *no* processing of the quoted text.
+ *
+ * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
+ * fails to match its trailing "$".
+ */
+dolq_start		[A-Za-z\200-\377_]
+dolq_cont		[A-Za-z\200-\377_0-9]
+dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
+dolqfailed		\${dolq_start}{dolq_cont}*
+dolqinside		[^$]+
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+xdstart			{dquote}
+xdstop			{dquote}
+xddouble		{dquote}{dquote}
+xdinside		[^"]+
+
+/* Unicode escapes */
+uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
+/* error rule to avoid backup */
+uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
+
+/* Quoted identifier with Unicode escapes */
+xuistart		[uU]&{dquote}
+
+/* Quoted string with Unicode escapes */
+xusstart		[uU]&{quote}
+
+/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
+xustop1		{uescapefail}?
+xustop2		{uescape}
+
+/* error rule to avoid backup */
+xufailed		[uU]&
+
+
+/* C-style comments
+ *
+ * The "extended comment" syntax closely resembles allowable operator syntax.
+ * The tricky part here is to get lex to recognize a string starting with
+ * slash-star as a comment, when interpreting it as an operator would produce
+ * a longer match --- remember lex will prefer a longer match!  Also, if we
+ * have something like plus-slash-star, lex will think this is a 3-character
+ * operator whereas we want to see it as a + operator and a comment start.
+ * The solution is two-fold:
+ * 1. append {op_chars}* to xcstart so that it matches as much text as
+ *    {operator} would. Then the tie-breaker (first matching rule of same
+ *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
+ *    in case it contains a star-slash that should terminate the comment.
+ * 2. In the operator rule, check for slash-star within the operator, and
+ *    if found throw it back with yyless().  This handles the plus-slash-star
+ *    problem.
+ * Dash-dash comments have similar interactions with the operator rule.
+ */
+xcstart			\/\*{op_chars}*
+xcstop			\*+\/
+xcinside		[^*/]+
+
+digit			[0-9]
+ident_start		[A-Za-z\200-\377_]
+ident_cont		[A-Za-z\200-\377_0-9\$]
+
+identifier		{ident_start}{ident_cont}*
+
+/* Assorted special-case operators and operator-like tokens */
+typecast		"::"
+dot_dot			\.\.
+colon_equals	":="
+equals_greater	"=>"
+less_equals		"<="
+greater_equals	">="
+less_greater	"<>"
+not_equals		"!="
+
+/*
+ * "self" is the set of chars that should be returned as single-character
+ * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
+ * which can be one or more characters long (but if a single-char token
+ * appears in the "self" set, it is not to be returned as an Op).  Note
+ * that the sets overlap, but each has some chars that are not in the other.
+ *
+ * If you change either set, adjust the character lists appearing in the
+ * rule for "operator"!
+ */
+self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
+operator		{op_chars}+
+
+/* we no longer allow unary minus in numbers.
+ * instead we pass it separately to parser. there it gets
+ * coerced via doNegate() -- Leon aug 20 1999
+ *
+ * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
+ *
+ * {realfail1} and {realfail2} are added to prevent the need for scanner
+ * backup when the {real} rule fails to match completely.
+ */
+
+integer			{digit}+
+decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+decimalfail		{digit}+\.\.
+real			({integer}|{decimal})[Ee][-+]?{digit}+
+realfail1		({integer}|{decimal})[Ee]
+realfail2		({integer}|{decimal})[Ee][-+]
+
+param			\${integer}
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+/*
+ * Dollar quoted strings are totally opaque, and no escaping is done on them.
+ * Other quoted strings must allow some special characters such as single-quote
+ *  and newline.
+ * Embedded single-quotes are implemented both in the SQL standard
+ *  style of two adjacent single quotes "''" and in the Postgres/Java style
+ *  of escaped-quote "\'".
+ * Other embedded escaped characters are matched explicitly and the leading
+ *  backslash is dropped from the string.
+ * Note that xcstart must appear before operator, as explained above!
+ *  Also whitespace (comment) must appear before operator.
+ */
+
+%%
+
+{whitespace}	{
+					/*
+					 * Note that the whitespace rule includes both true
+					 * whitespace and single-line ("--" style) comments.
+					 * We suppress whitespace at the start of the query
+					 * buffer.  We also suppress all single-line comments,
+					 * which is pretty dubious but is the historical
+					 * behavior.
+					 */
+					if (!(output_buf->len == 0 || yytext[0] == '-'))
+						ECHO;
+				}
+
+{xcstart}		{
+					cur_state->xcdepth = 0;
+					BEGIN(xc);
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstart}	{
+					cur_state->xcdepth++;
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstop}	{
+					if (cur_state->xcdepth <= 0)
+					{
+						BEGIN(INITIAL);
+					}
+					else
+						cur_state->xcdepth--;
+					ECHO;
+				}
+
+<xc>{xcinside}	{
+					ECHO;
+				}
+
+<xc>{op_chars}	{
+					ECHO;
+				}
+
+<xc>\*+			{
+					ECHO;
+				}
+
+{xbstart}		{
+					BEGIN(xb);
+					ECHO;
+				}
+<xb>{quotestop}	|
+<xb>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xh>{xhinside}	|
+<xb>{xbinside}	{
+					ECHO;
+				}
+<xh>{quotecontinue}	|
+<xb>{quotecontinue}	{
+					ECHO;
+				}
+
+{xhstart}		{
+					/* Hexadecimal bit type.
+					 * At some point we should simply pass the string
+					 * forward to the parser and label it there.
+					 * In the meantime, place a leading "x" on the string
+					 * to mark it for the input routine as a hex string.
+					 */
+					BEGIN(xh);
+					ECHO;
+				}
+<xh>{quotestop}	|
+<xh>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+
+{xnstart}		{
+					my_yyless(1);				/* eat only 'n' this time */
+					ECHO;
+				}
+
+{xqstart}		{
+					if (standard_strings())
+						BEGIN(xq);
+					else
+						BEGIN(xe);
+					ECHO;
+				}
+{xestart}		{
+					BEGIN(xe);
+					ECHO;
+				}
+{xusstart}		{
+					BEGIN(xus);
+					ECHO;
+				}
+<xq,xe>{quotestop}	|
+<xq,xe>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xus>{quotestop} |
+<xus>{quotefail} {
+					my_yyless(1);
+					BEGIN(xusend);
+					ECHO;
+				}
+<xusend>{whitespace} {
+					ECHO;
+				}
+<xusend>{other} |
+<xusend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xusend>{xustop2} {
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xq,xe,xus>{xqdouble} {
+					ECHO;
+				}
+<xq,xus>{xqinside}  {
+					ECHO;
+				}
+<xe>{xeinside}  {
+					ECHO;
+				}
+<xe>{xeunicode} {
+					ECHO;
+				}
+<xe>{xeunicodefail}	{
+					ECHO;
+				}
+<xe>{xeescape}  {
+					ECHO;
+				}
+<xe>{xeoctesc}  {
+					ECHO;
+				}
+<xe>{xehexesc}  {
+					ECHO;
+				}
+<xq,xe,xus>{quotecontinue} {
+					ECHO;
+				}
+<xe>.			{
+					/* This is only needed for \ just before EOF */
+					ECHO;
+				}
+
+{dolqdelim}		{
+					cur_state->dolqstart = pg_strdup(yytext);
+					BEGIN(xdolq);
+					ECHO;
+				}
+{dolqfailed}	{
+					/* throw back all but the initial "$" */
+					my_yyless(1);
+					ECHO;
+				}
+<xdolq>{dolqdelim} {
+					if (strcmp(yytext, cur_state->dolqstart) == 0)
+					{
+						free(cur_state->dolqstart);
+						cur_state->dolqstart = NULL;
+						BEGIN(INITIAL);
+					}
+					else
+					{
+						/*
+						 * When we fail to match $...$ to dolqstart, transfer
+						 * the $... part to the output, but put back the final
+						 * $ for rescanning.  Consider $delim$...$junk$delim$
+						 */
+						my_yyless(yyleng-1);
+					}
+					ECHO;
+				}
+<xdolq>{dolqinside} {
+					ECHO;
+				}
+<xdolq>{dolqfailed} {
+					ECHO;
+				}
+<xdolq>.		{
+					/* This is only needed for $ inside the quoted text */
+					ECHO;
+				}
+
+{xdstart}		{
+					BEGIN(xd);
+					ECHO;
+				}
+{xuistart}		{
+					BEGIN(xui);
+					ECHO;
+				}
+<xd>{xdstop}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xui>{dquote} {
+					my_yyless(1);
+					BEGIN(xuiend);
+					ECHO;
+				}
+<xuiend>{whitespace} {
+					ECHO;
+				}
+<xuiend>{other} |
+<xuiend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xuiend>{xustop2}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xd,xui>{xddouble}	{
+					ECHO;
+				}
+<xd,xui>{xdinside}	{
+					ECHO;
+				}
+
+{xufailed}	{
+					/* throw back all but the initial u/U */
+					my_yyless(1);
+					ECHO;
+				}
+
+{typecast}		{
+					ECHO;
+				}
+
+{dot_dot}		{
+					ECHO;
+				}
+
+{colon_equals}	{
+					ECHO;
+				}
+
+{equals_greater} {
+					ECHO;
+				}
+
+{less_equals}	{
+					ECHO;
+				}
+
+{greater_equals} {
+					ECHO;
+				}
+
+{less_greater}	{
+					ECHO;
+				}
+
+{not_equals}	{
+					ECHO;
+				}
+
+	/*
+	 * These rules are specific to psql --- they implement parenthesis
+	 * counting and detection of command-ending semicolon.  These must
+	 * appear before the {self} rule so that they take precedence over it.
+	 */
+
+"("				{
+					cur_state->paren_depth++;
+					ECHO;
+				}
+
+")"				{
+					if (cur_state->paren_depth > 0)
+						cur_state->paren_depth--;
+					ECHO;
+				}
+
+";"				{
+					ECHO;
+					if (cur_state->paren_depth == 0)
+					{
+						/* Terminate lexing temporarily */
+						return LEXRES_SEMI;
+					}
+				}
+
+	/*
+	 * psql-specific rules to handle backslash commands and variable
+	 * substitution.  We want these before {self}, also.
+	 */
+
+"\\"[;:]		{
+					/* Force a semicolon or colon into the query buffer */
+					scan_emit(yytext + 1, 1);
+				}
+
+"\\"			{
+					/* Terminate lexing temporarily */
+					return LEXRES_BACKSLASH;
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					char   *varname = NULL;
+					const char *value = NULL;
+
+					if (cur_state->vars)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = GetVariable(cur_state->vars, varname);
+					}
+
+					if (value)
+					{
+						/* It is a variable, check for recursion */
+						if (var_is_current_source(cur_state, varname))
+						{
+							/* Recursive expansion --- don't go there */
+							psql_error("skipping recursive expansion of variable \"%s\"\n",
+									   varname);
+							/* Instead copy the string as is */
+							ECHO;
+						}
+						else
+						{
+							/* OK, perform substitution */
+							push_new_buffer(value, varname);
+							/* yy_scan_string already made buffer active */
+						}
+					}
+					else
+					{
+						/*
+						 * if the variable doesn't exist we'll copy the
+						 * string as is
+						 */
+						ECHO;
+					}
+
+					if (varname)
+						free(varname);
+				}
+
+:'{variable_char}+'	{
+					escape_variable(false);
+				}
+
+:\"{variable_char}+\"	{
+					escape_variable(true);
+				}
+
+	/*
+	 * These rules just avoid the need for scanner backup if one of the
+	 * two rules above fails to match completely.
+	 */
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+	/*
+	 * Back to backend-compatible rules.
+	 */
+
+{self}			{
+					ECHO;
+				}
+
+{operator}		{
+					/*
+					 * Check for embedded slash-star or dash-dash; those
+					 * are comment starts, so operator must stop there.
+					 * Note that slash-star or dash-dash at the first
+					 * character will match a prior rule, not this one.
+					 */
+					int		nchars = yyleng;
+					char   *slashstar = strstr(yytext, "/*");
+					char   *dashdash = strstr(yytext, "--");
+
+					if (slashstar && dashdash)
+					{
+						/* if both appear, take the first one */
+						if (slashstar > dashdash)
+							slashstar = dashdash;
+					}
+					else if (!slashstar)
+						slashstar = dashdash;
+					if (slashstar)
+						nchars = slashstar - yytext;
+
+					/*
+					 * For SQL compatibility, '+' and '-' cannot be the
+					 * last char of a multi-char operator unless the operator
+					 * contains chars that are not in SQL operators.
+					 * The idea is to lex '=-' as two operators, but not
+					 * to forbid operator names like '?-' that could not be
+					 * sequences of SQL operators.
+					 */
+					while (nchars > 1 &&
+						   (yytext[nchars-1] == '+' ||
+							yytext[nchars-1] == '-'))
+					{
+						int		ic;
+
+						for (ic = nchars-2; ic >= 0; ic--)
+						{
+							if (strchr("~!@#^&|`?%", yytext[ic]))
+								break;
+						}
+						if (ic >= 0)
+							break; /* found a char that makes it OK */
+						nchars--; /* else remove the +/-, and check again */
+					}
+
+					if (nchars < yyleng)
+					{
+						/* Strip the unwanted chars from the token */
+						my_yyless(nchars);
+					}
+					ECHO;
+				}
+
+{param}			{
+					ECHO;
+				}
+
+{integer}		{
+					ECHO;
+				}
+{decimal}		{
+					ECHO;
+				}
+{decimalfail}	{
+					/* throw back the .., and treat as integer */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+{real}			{
+					ECHO;
+				}
+{realfail1}		{
+					/*
+					 * throw back the [Ee], and treat as {decimal}.  Note
+					 * that it is possible the input is actually {integer},
+					 * but since this case will almost certainly lead to a
+					 * syntax error anyway, we don't bother to distinguish.
+					 */
+					my_yyless(yyleng-1);
+					ECHO;
+				}
+{realfail2}		{
+					/* throw back the [Ee][+-], and proceed as above */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+
+
+{identifier}	{
+					ECHO;
+				}
+
+{other}			{
+					ECHO;
+				}
+
+
+	/*
+	 * Everything from here down is psql-specific.
+	 */
+
+<<EOF>>			{
+					StackElem  *stackelem = cur_state->buffer_stack;
+
+					if (stackelem == NULL)
+						return LEXRES_EOL; /* end of input reached */
+
+					/*
+					 * We were expanding a variable, so pop the inclusion
+					 * stack and keep lexing
+					 */
+					pop_buffer_stack(cur_state);
+
+					stackelem = cur_state->buffer_stack;
+					if (stackelem != NULL)
+					{
+						yy_switch_to_buffer(stackelem->buf);
+						cur_state->curline = stackelem->bufstring;
+						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
+					}
+					else
+					{
+						yy_switch_to_buffer(cur_state->scanbufhandle);
+						cur_state->curline = cur_state->scanbuf;
+						cur_state->refline = cur_state->scanline;
+					}
+				}
+%%
+
+static void my_psql_scan_finish(PsqlScanState state);
+static void my_psql_scan_reset(PsqlScanState state);
+
+static void
+psql_scan_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &my_psql_scan_finish;
+	state->reset = &my_psql_scan_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Create a lexer working state struct.
+ */
+PsqlScanState
+psql_scan_create(void)
+{
+	PsqlScanState state;
+
+	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
+	psql_scan_initialize(state);
+
+	return state;
+}
+
+/*
+ * Destroy a lexer working state struct, releasing all resources.
+ */
+void
+psql_scan_destroy(PsqlScanState state)
+{
+	psql_scan_finish(state);
+
+	psql_scan_reset(state);
+
+	free(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+void
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PGconn *db, VariableSpace vars, int encoding)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* Do we need to hack the character set encoding? */
+	state->encoding = encoding;
+	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+
+	state->vars = vars;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Redirect functions for indirect calls.  These functions may be called on
+ * the scan state of other lexers.
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	if (state->finish)
+		state->finish(state);
+}
+
+void
+psql_scan_reset(PsqlScanState state)
+{
+	if (state->reset)
+		state->reset(state);
+}
+
+
+/*
+ * Do lexical analysis of SQL command text.
+ *
+ * The text previously passed to psql_scan_setup is scanned, and appended
+ * (possibly with transformation) to query_buf.
+ *
+ * The return value indicates the condition that stopped scanning:
+ *
+ * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
+ * transferred to query_buf.)  The command accumulated in query_buf should
+ * be executed, then clear query_buf and call again to scan the remainder
+ * of the line.
+ *
+ * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
+ * Any previous data on the line has been transferred to query_buf.
+ * The caller will typically next call psql_scan_slash_command(),
+ * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
+ *
+ * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
+ * incomplete SQL command.  *prompt is set to the appropriate prompt type.
+ *
+ * PSCAN_EOL: the end of the line was reached, and there is no lexical
+ * reason to consider the command incomplete.  The caller may or may not
+ * choose to send it.  *prompt is set to the appropriate prompt type if
+ * the caller chooses to collect more input.
+ *
+ * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
+ * be called next, then the cycle may be repeated with a fresh input line.
+ *
+ * In all cases, *prompt is set to an appropriate prompt type code for the
+ * next line-input operation.
+ */
+PsqlScanResult
+psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt)
+{
+	PsqlScanResult result;
+	int			lexresult;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = query_buf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(state->start_state);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/* Update static vars back to the state struct */
+	state->start_state = YY_START;
+
+	/*
+	 * Check termination state and return appropriate result info.
+	 */
+	switch (lexresult)
+	{
+		case LEXRES_EOL:		/* end of input */
+			switch (state->start_state)
+			{
+				/* This switch must cover all non-slash-command states. */
+				case INITIAL:
+				case xuiend:	/* we treat these like INITIAL */
+				case xusend:
+					if (state->paren_depth > 0)
+					{
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_PAREN;
+					}
+					else if (query_buf->len > 0)
+					{
+						result = PSCAN_EOL;
+						*prompt = PROMPT_CONTINUE;
+					}
+					else
+					{
+						/* never bother to send an empty buffer */
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_READY;
+					}
+					break;
+				case xb:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xc:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_COMMENT;
+					break;
+				case xd:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xh:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xe:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xdolq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOLLARQUOTE;
+					break;
+				case xui:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xus:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				default:
+					/* can't get here */
+					fprintf(stderr, "invalid YY_START\n");
+					exit(1);
+			}
+			break;
+		case LEXRES_SEMI:		/* semicolon */
+			result = PSCAN_SEMICOLON;
+			*prompt = PROMPT_READY;
+			break;
+		case LEXRES_BACKSLASH:	/* backslash */
+			result = PSCAN_BACKSLASH;
+			*prompt = PROMPT_READY;
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid yylex result\n");
+			exit(1);
+	}
+
+	return result;
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+my_psql_scan_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yy_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Create a new scanning state for this lexer that resumes parsing from the
+ * current position of the given scanning state, which belongs to another
+ * lexer.  The given state is destroyed.
+ *
+ * Note: This function cannot access the yy* functions and variables of the
+ * given state because they belong to a different lexer.
+ */
+void
+psql_scan_switch_lexer(PsqlScanState state)
+{
+	const char	   *newscanline = state->scanline + state->curpos;
+	PGconn		   *db = state->db;
+	VariableSpace	vars = state->vars;
+	int				encoding = state->encoding;
+
+	psql_scan_initialize(state);
+	psql_scan_setup(state, newscanline, strlen(newscanline),
+					db, vars, encoding);
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+my_psql_scan_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
+/*
+ * Return true if lexer is currently in an "inside quotes" state.
+ *
+ * This is pretty grotty but is needed to preserve the old behavior
+ * that mainloop.c drops blank lines not inside quotes without even
+ * echoing them.
+ */
+bool
+psql_scan_in_quote(PsqlScanState state)
+{
+	return state->start_state != INITIAL;
+}
+
+/*
+ * Push the given string onto the stack of stuff to scan.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+void
+push_new_buffer(const char *newstr, const char *varname)
+{
+	StackElem  *stackelem;
+
+	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
+
+	/*
+	 * In current usage, the passed varname points at the current flex
+	 * input buffer; we must copy it before calling prepare_buffer()
+	 * because that will change the buffer state.
+	 */
+	stackelem->varname = varname ? pg_strdup(varname) : NULL;
+
+	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
+									&stackelem->bufstring);
+	cur_state->curline = stackelem->bufstring;
+	if (cur_state->safe_encoding)
+	{
+		stackelem->origstring = NULL;
+		cur_state->refline = stackelem->bufstring;
+	}
+	else
+	{
+		stackelem->origstring = pg_strdup(newstr);
+		cur_state->refline = stackelem->origstring;
+	}
+	stackelem->next = cur_state->buffer_stack;
+	cur_state->buffer_stack = stackelem;
+}
+
+/*
+ * Pop the topmost buffer stack item (there must be one!)
+ *
+ * NB: after this, the flex input state is unspecified; caller must
+ * switch to an appropriate buffer to continue lexing.
+ */
+void
+pop_buffer_stack(PsqlScanState state)
+{
+	StackElem  *stackelem = state->buffer_stack;
+
+	state->buffer_stack = stackelem->next;
+	yy_delete_buffer(stackelem->buf);
+	free(stackelem->bufstring);
+	if (stackelem->origstring)
+		free(stackelem->origstring);
+	if (stackelem->varname)
+		free(stackelem->varname);
+	free(stackelem);
+}
+
+/*
+ * Check if specified variable name is the source for any string
+ * currently being scanned
+ */
+bool
+var_is_current_source(PsqlScanState state, const char *varname)
+{
+	StackElem  *stackelem;
+
+	for (stackelem = state->buffer_stack;
+		 stackelem != NULL;
+		 stackelem = stackelem->next)
+	{
+		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Set up a flex input buffer to scan the given data.  We always make a
+ * copy of the data.  If working in an unsafe encoding, the copy has
+ * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+YY_BUFFER_STATE
+prepare_buffer(const char *txt, int len, char **txtcopy)
+{
+	char	   *newtxt;
+
+	/* Flex wants two \0 characters after the actual data */
+	newtxt = pg_malloc(len + 2);
+	*txtcopy = newtxt;
+	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
+
+	if (cur_state->safe_encoding)
+		memcpy(newtxt, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		int		i = 0;
+
+		while (i < len)
+		{
+			int		thislen = PQmblen(txt + i, cur_state->encoding);
+
+			/* first byte should always be okay... */
+			newtxt[i] = txt[i];
+			i++;
+			while (--thislen > 0 && i < len)
+				newtxt[i++] = (char) 0xFF;
+		}
+	}
+
+	return cur_state->my_yy_scan_buffer(newtxt, len + 2);
+}
+
+/*
+ * scan_emit() --- body for ECHO macro
+ *
+ * NB: this must be used for ALL and ONLY the text copied from the flex
+ * input data.  If you pass it something that is not part of the yytext
+ * string, you are making a mistake.  Internally generated text can be
+ * appended directly to output_buf.
+ */
+void
+scan_emit(const char *txt, int len)
+{
+	if (cur_state->safe_encoding)
+		appendBinaryPQExpBuffer(output_buf, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			appendPQExpBufferChar(output_buf, ch);
+		}
+	}
+}
+
+/*
+ * extract_substring --- fetch the true value of (part of) the current token
+ *
+ * This is like scan_emit(), except that the data is returned as a malloc'd
+ * string rather than being pushed directly to output_buf.
+ */
+char *
+extract_substring(const char *txt, int len)
+{
+	char	   *result = (char *) pg_malloc(len + 1);
+
+	if (cur_state->safe_encoding)
+		memcpy(result, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			result[i] = ch;
+		}
+	}
+	result[len] = '\0';
+	return result;
+}
+
+/*
+ * escape_variable --- process :'VARIABLE' or :"VARIABLE"
+ *
+ * If the variable name is found, escape its value using the appropriate
+ * quoting method and emit the value to output_buf.  (Since the result is
+ * surely quoted, there is never any reason to rescan it.)  If we don't
+ * find the variable or the escaping function fails, emit the token as-is.
+ */
+void
+escape_variable(bool as_ident)
+{
+	char	   *varname;
+	const char *value;
+
+	/* Variable lookup if possible. */
+	if (cur_state->vars && cur_state->db)
+	{
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = GetVariable(cur_state->vars, varname);
+		free(varname);
+	}
+
+	/* Escaping. */
+	if (value)
+	{
+		if (!cur_state->db)
+			psql_error("can't escape without active connection\n");
+		else
+		{
+			char   *escaped_value;
+
+			if (as_ident)
+				escaped_value =
+					PQescapeIdentifier(cur_state->db, value, strlen(value));
+			else
+				escaped_value =
+					PQescapeLiteral(cur_state->db, value, strlen(value));
+
+			if (escaped_value == NULL)
+			{
+				const char *error = PQerrorMessage(cur_state->db);
+
+				psql_error("%s", error);
+			}
+			else
+			{
+				appendPQExpBufferStr(output_buf, escaped_value);
+				PQfreemem(escaped_value);
+				return;
+			}
+		}
+	}
+
+	/*
+	 * If we reach this point, some kind of error has occurred.  Emit the
+	 * original text into the output buffer.
+	 */
+	scan_emit(yytext, yyleng);
+}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 6916f6f..5461763 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -338,8 +338,8 @@ main(int argc, char *argv[])
 
 				scan_state = psql_scan_create();
 				psql_scan_setup(scan_state,
-								cell->val,
-								strlen(cell->val));
+								cell->val, strlen(cell->val),
+								pset.db, pset.vars, pset.encoding);
 
 				successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 					? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1
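
For readers following prepare_buffer() and scan_emit() in the patch above, here is a minimal sketch of the masking idea, not the patch's actual code. In an "unsafe" encoding, the trailing byte of a multibyte character can look like an ASCII character that matters to the lexer (for example 0x5C, a backslash, in encodings such as Shift-JIS). The copy given to flex therefore has trailing bytes replaced by 0xFF, and the real bytes are taken back from the original text when output is emitted. The names toy_mblen, mask_multibyte, unmask_into and demo_masking are hypothetical, and toy_mblen is a made-up two-byte encoding rule standing in for PQmblen().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical length function for a toy encoding: a byte with the high
 * bit set starts a two-byte character, anything else is one byte.
 */
int
toy_mblen(const char *s)
{
	return ((unsigned char) s[0] & 0x80) ? 2 : 1;
}

/*
 * Copy txt, replacing every trailing byte of a multibyte character with
 * 0xFF, in the manner of prepare_buffer().  Returns a malloc'd,
 * NUL-terminated buffer, or NULL on allocation failure.
 */
char *
mask_multibyte(const char *txt, int len, int (*mblen_fn) (const char *))
{
	char	   *copy = malloc(len + 1);
	int			i = 0;

	if (copy == NULL)
		return NULL;

	while (i < len)
	{
		int			thislen = mblen_fn(txt + i);

		/* the first byte is copied unchanged */
		copy[i] = txt[i];
		i++;
		while (--thislen > 0 && i < len)
			copy[i++] = (char) 0xFF;
	}
	copy[len] = '\0';
	return copy;
}

/*
 * Rebuild the true text in the manner of scan_emit(): wherever the masked
 * copy holds 0xFF, take the byte from the original (reference) text.
 */
void
unmask_into(char *dst, const char *masked, const char *reference, int len)
{
	int			i;

	for (i = 0; i < len; i++)
	{
		char		ch = masked[i];

		if (ch == (char) 0xFF)
			ch = reference[i];
		dst[i] = ch;
	}
	dst[len] = '\0';
}

/* Round-trip check; returns 0 on success, 1 on allocation failure. */
int
demo_masking(void)
{
	/* 'a', a two-byte character whose second byte is a backslash, 'b' */
	const char	original[] = {'a', (char) 0x83, '\\', 'b', '\0'};
	const int	len = 4;
	char		restored[5];
	char	   *masked = mask_multibyte(original, len, toy_mblen);

	if (masked == NULL)
		return 1;

	/* the lexer never sees the embedded backslash */
	assert(masked[2] == (char) 0xFF);
	assert(memchr(masked, '\\', len) == NULL);

	/* the emitted text is identical to the input */
	unmask_into(restored, masked, original, len);
	assert(memcmp(restored, original, len) == 0);

	free(masked);
	return 0;
}
```

Calling demo_masking() exercises the round trip. The real code also keeps both the masked and the original string on the buffer stack, which this sketch leaves out.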

0002-Change-the-access-method-to-shell-variables.patch (text/x-patch; charset=us-ascii)
From 7849ccf2a44b1ccc1b431cc4fd82b7fbe1616329 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 6 Jan 2016 15:27:43 +0900
Subject: [PATCH 2/5] Change the access method to shell variables

Access shell variables via a callback function so that the lexer no
longer needs to be aware of VariableSpace.
---
 src/bin/psql/common.c             |  7 +++++++
 src/bin/psql/common.h             |  2 ++
 src/bin/psql/mainloop.c           |  4 ++--
 src/bin/psql/psqlscan.h           |  6 ++++--
 src/bin/psql/psqlscan_int.h       |  2 +-
 src/bin/psql/psqlscan_slashbody.l | 14 +++++++-------
 src/bin/psql/psqlscanbody.l       | 16 ++++++++--------
 src/bin/psql/startup.c            |  2 +-
 8 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2cb2e9b..1216d1e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1901,3 +1901,10 @@ recognized_connection_string(const char *connstr)
 {
 	return uri_prefix_length(connstr) != 0 || strchr(connstr, '=') != NULL;
 }
+
+/* Access callback to "shell variables" for lexer */
+const char *
+get_variable(const char *name)
+{
+	return GetVariable(pset.vars, name);
+}
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 6ba3f44..636331e 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -49,4 +49,6 @@ extern void expand_tilde(char **filename);
 
 extern bool recognized_connection_string(const char *connstr);
 
+extern const char *get_variable(const char *name);
+
 #endif   /* COMMON_H */
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index 947caff..76b2f78 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -234,7 +234,7 @@ MainLoop(FILE *source)
 		 * Parse line, looking for command separators.
 		 */
 		psql_scan_setup(scan_state, line, strlen(line),
-						pset.db, pset.vars, pset.encoding);
+						pset.db, &get_variable, pset.encoding);
 		success = true;
 		line_saved_in_history = false;
 
@@ -375,7 +375,7 @@ MainLoop(FILE *source)
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
 					psql_scan_setup(scan_state, line, strlen(line),
-									pset.db, pset.vars, pset.encoding);
+									pset.db, &get_variable, pset.encoding);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 7615df2..651f074 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -11,11 +11,13 @@
 #include "pqexpbuffer.h"
 
 #include "prompt.h"
-#include "variables.h"
 
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
 
+/* Callback type for retrieve variables */
+typedef const char *(*VariableGetter)(const char *);
+
 /* Termination states for psql_scan() */
 typedef enum
 {
@@ -29,7 +31,7 @@ extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
 extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
-							PGconn *db, VariableSpace vars, int encoding);
+							PGconn *db, VariableGetter getter, int encoding);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
diff --git a/src/bin/psql/psqlscan_int.h b/src/bin/psql/psqlscan_int.h
index 0fb04c7..6f9fa13 100644
--- a/src/bin/psql/psqlscan_int.h
+++ b/src/bin/psql/psqlscan_int.h
@@ -56,7 +56,7 @@ typedef struct PsqlScanStateData
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
 	int			curpos;			/* current position in curline  */
-	VariableSpace vars;			/* "shell variable" repository */
+	VariableGetter get_variable; /* access to the "shell variable" repository */
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
diff --git a/src/bin/psql/psqlscan_slashbody.l b/src/bin/psql/psqlscan_slashbody.l
index 06c218b..ca0870e 100644
--- a/src/bin/psql/psqlscan_slashbody.l
+++ b/src/bin/psql/psqlscan_slashbody.l
@@ -187,10 +187,10 @@ other			.
 						char   *varname;
 						const char *value;
 
-						if (cur_state->vars)
+						if (cur_state->get_variable)
 						{
 							varname = extract_substring(yytext + 1, yyleng - 1);
-							value = GetVariable(cur_state->vars, varname);
+							value = cur_state->get_variable(varname);
 							free(varname);
 						}
 
@@ -372,8 +372,8 @@ psql_scan_slash_command_initialize(PsqlScanState state)
  */
 static void
 psql_scan_slash_command_setup(PsqlScanState state,
-							  const char *line, int line_len,
-							  PGconn *db, VariableSpace vars, int encoding)
+						  const char *line, int line_len,
+						  PGconn *db, VariableGetter vargetter, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
@@ -383,7 +383,7 @@ psql_scan_slash_command_setup(PsqlScanState state,
 	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
-	state->vars = vars;
+	state->get_variable = vargetter;
 
 	/* needed for prepare_buffer */
 	cur_state = state;
@@ -412,12 +412,12 @@ psql_scan_slash_command_switch_lexer(PsqlScanState state)
 {
 	const char *newscanline = state->scanline + state->curpos;
 	PGconn		   *db = state->db;
-	VariableSpace	vars = state->vars;
+	VariableGetter	vargetter = state->get_variable;
 	int				encoding = state->encoding;
 
 	psql_scan_slash_command_initialize(state);
 	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline),
-								  db, vars, encoding);
+								  db, vargetter, encoding);
 }
 
 /*
diff --git a/src/bin/psql/psqlscanbody.l b/src/bin/psql/psqlscanbody.l
index cf5755c..4418a5c 100644
--- a/src/bin/psql/psqlscanbody.l
+++ b/src/bin/psql/psqlscanbody.l
@@ -658,10 +658,10 @@ other			.
 					char   *varname = NULL;
 					const char *value = NULL;
 
-					if (cur_state->vars)
+					if (cur_state->get_variable)
 					{
 						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(cur_state->vars, varname);
+						value = cur_state->get_variable(varname);
 					}
 
 					if (value)
@@ -911,7 +911,7 @@ psql_scan_destroy(PsqlScanState state)
  */
 void
 psql_scan_setup(PsqlScanState state, const char *line, int line_len,
-				PGconn *db, VariableSpace vars, int encoding)
+				PGconn *db, VariableGetter vargetter, int encoding)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
@@ -921,7 +921,7 @@ psql_scan_setup(PsqlScanState state, const char *line, int line_len,
 	state->encoding = encoding;
 	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
 
-	state->vars = vars;
+	state->get_variable = vargetter;
 
 	/* needed for prepare_buffer */
 	cur_state = state;
@@ -1142,12 +1142,12 @@ psql_scan_switch_lexer(PsqlScanState state)
 {
 	const char	   *newscanline = state->scanline + state->curpos;
 	PGconn		   *db = state->db;
-	VariableSpace	vars = state->vars;
+	VariableGetter	vargetter = state->get_variable;
 	int				encoding = state->encoding;
 
 	psql_scan_initialize(state);
 	psql_scan_setup(state, newscanline, strlen(newscanline),
-					db, vars, encoding);
+					db, vargetter, encoding);
 }
 
 /*
@@ -1385,10 +1385,10 @@ escape_variable(bool as_ident)
 	const char *value;
 
 	/* Variable lookup if possible. */
-	if (cur_state->vars && cur_state->db)
+	if (cur_state->get_variable)
 	{
 		varname = extract_substring(yytext + 2, yyleng - 3);
-		value = GetVariable(cur_state->vars, varname);
+		value = cur_state->get_variable(varname);
 		free(varname);
 	}
 
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 5461763..553efc0 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -339,7 +339,7 @@ main(int argc, char *argv[])
 				scan_state = psql_scan_create();
 				psql_scan_setup(scan_state,
 								cell->val, strlen(cell->val),
-								pset.db, pset.vars, pset.encoding);
+								pset.db, get_variable, pset.encoding);
 
 				successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 					? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1
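
The effect of patch 0002 can be illustrated with a small standalone sketch, under the assumption that the lexer only ever needs a name-to-value lookup. The VariableGetter typedef has the same shape as the one added to psqlscan.h; everything else (ToyScanState, toy_get_variable, toy_lookup, demo_getter) is hypothetical and only mirrors the `if (cur_state->get_variable)` guard used in the lexer rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the VariableGetter typedef introduced by the patch */
typedef const char *(*VariableGetter) (const char *);

/* Hypothetical stand-in for the lexer state; only the getter matters here */
typedef struct ToyScanState
{
	VariableGetter get_variable;	/* may be NULL: no substitution */
} ToyScanState;

/* Hypothetical variable store owned by the caller, not by the lexer */
const char *
toy_get_variable(const char *name)
{
	if (strcmp(name, "scale") == 0)
		return "10";
	return NULL;				/* unknown variable */
}

/*
 * Lexer-side lookup.  Returns NULL when no getter is installed or the
 * variable is unknown; the caller would then emit the token as-is.
 */
const char *
toy_lookup(const ToyScanState *state, const char *name)
{
	if (state->get_variable == NULL)
		return NULL;
	return state->get_variable(name);
}

/* Returns 0 when all lookups behave as described above. */
int
demo_getter(void)
{
	ToyScanState with_vars = {toy_get_variable};
	ToyScanState without_vars = {NULL};
	const char *value = toy_lookup(&with_vars, "scale");

	assert(value != NULL && strcmp(value, "10") == 0);
	assert(toy_lookup(&with_vars, "nosuchvar") == NULL);
	assert(toy_lookup(&without_vars, "scale") == NULL);
	return 0;
}
```

The point of the indirection is that a second client of the lexer can install its own getter, or none at all, without linking against psql's variables.c.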

0003-Detach-common.c-from-psqlscan.patch (text/x-patch; charset=us-ascii)
From ea1ddcdec4462a96fd192e67908afc42e6cea6cb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 6 Jan 2016 15:31:27 +0900
Subject: [PATCH 3/5] Detach common.c from psqlscan

Call standard_strings() and psql_error() via callback functions so
that psqlscan.l no longer depends on common.c. They are bundled with
the get_variable() callback in one struct, since there are now four
callback functions.
---
 src/bin/psql/common.c             |  51 ++++++++++++++++-
 src/bin/psql/common.h             |   8 ++-
 src/bin/psql/mainloop.c           |  10 ++--
 src/bin/psql/psqlscan.c           |   2 +-
 src/bin/psql/psqlscan.h           |  14 ++++-
 src/bin/psql/psqlscan_int.h       |   8 +--
 src/bin/psql/psqlscan_slash.c     |   2 +-
 src/bin/psql/psqlscan_slashbody.l |  50 ++++++++++-------
 src/bin/psql/psqlscanbody.l       | 113 ++++++++++++++++++++------------------
 src/bin/psql/startup.c            |   9 ++-
 10 files changed, 173 insertions(+), 94 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 1216d1e..79eb04e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -30,6 +30,9 @@ static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);
 
+PsqlScanCallbacks psqlscan_callbacks =
+{&get_variable, &psql_mblen, &standard_strings, &psql_error};
+
 
 /*
  * openQueryOutputFile --- attempt to open a query output file
@@ -1904,7 +1907,51 @@ recognized_connection_string(const char *connstr)
 
 /* Access callback to "shell variables" for lexer */
 const char *
-get_variable(const char *name)
+get_variable(const char *name, bool escape, bool as_ident,
+			 void (**free_func)(void *))
 {
-	return GetVariable(pset.vars, name);
+	const char *value;
+	char   *escaped_value;
+
+	*free_func = NULL;
+
+	value = GetVariable(pset.vars, name);
+
+	if (!escape)
+		return value;
+
+	/* Escaping. */
+
+	if (!value)
+		return NULL;
+
+	if (!pset.db)
+	{
+		psql_error("can't escape without active connection\n");
+		return NULL;
+	}
+
+	if (as_ident)
+		escaped_value =
+			PQescapeIdentifier(pset.db, value, strlen(value));
+	else
+		escaped_value =
+			PQescapeLiteral(pset.db, value, strlen(value));
+
+	if (escaped_value == NULL)
+	{
+		const char *error = PQerrorMessage(pset.db);
+
+		psql_error("%s", error);
+		return NULL;
+	}
+
+	*free_func = &PQfreemem;
+	return escaped_value;
+}
+
+int
+psql_mblen(const char *s)
+{
+	return PQmblen(s, pset.encoding);
 }
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 636331e..686503a 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -13,6 +13,7 @@
 #include "libpq-fe.h"
 
 #include "print.h"
+#include "psqlscan.h"
 
 #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
 
@@ -29,6 +30,8 @@ extern sigjmp_buf sigint_interrupt_jmp;
 
 extern volatile bool cancel_pressed;
 
+extern PsqlScanCallbacks psqlscan_callbacks;
+
 /* Note: cancel_pressed is defined in print.c, see that file for reasons */
 
 extern void setup_cancel_handler(void);
@@ -49,6 +52,9 @@ extern void expand_tilde(char **filename);
 
 extern bool recognized_connection_string(const char *connstr);
 
-extern const char *get_variable(const char *name);
+extern const char *get_variable(const char *name, bool escape, bool as_ident,
+								void (**free_func)(void *));
+
+extern int psql_mblen(const char *s);
 
 #endif   /* COMMON_H */
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index 76b2f78..13424be 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -16,7 +16,6 @@
 
 #include "mb/pg_wchar.h"
 
-
 /*
  * Main processing loop for reading lines of input
  *	and sending them to the backend.
@@ -233,8 +232,11 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line),
-						pset.db, &get_variable, pset.encoding);
+		/* set enc_mblen according to the encoding */
+		psqlscan_callbacks.enc_mblen =
+			(pg_valid_server_encoding_id(pset.encoding) ? NULL : &psql_mblen);
+
+		psql_scan_setup(scan_state, line, strlen(line),	&psqlscan_callbacks);
 		success = true;
 		line_saved_in_history = false;
 
@@ -375,7 +377,7 @@ MainLoop(FILE *source)
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
 					psql_scan_setup(scan_state, line, strlen(line),
-									pset.db, &get_variable, pset.encoding);
+									&psqlscan_callbacks);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
diff --git a/src/bin/psql/psqlscan.c b/src/bin/psql/psqlscan.c
index 7f09fa3..de7f746 100644
--- a/src/bin/psql/psqlscan.c
+++ b/src/bin/psql/psqlscan.c
@@ -14,5 +14,5 @@
  * multiple infrastructures for stdio.h and so on.  flex is absolutely
  * uncooperative about that, so we can't compile psqlscan.c on its own.
  */
-#include "common.h"
+#include "postgres_fe.h"
 #include "psqlscanbody.c"
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 651f074..322edd3 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -15,8 +15,16 @@
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
 
-/* Callback type for retrieve variables */
-typedef const char *(*VariableGetter)(const char *);
+typedef struct PsqlScanCallbacks
+{
+	const char *(*get_variable)(const char *, bool escape, bool as_ident,
+								void (**free_fn)(void *));
+	/* enc_mblen is needed only if encoding is not safe */
+	int	 (*enc_mblen)(const char *);
+	bool (*standard_strings)(void); /* standard_conforming_strings */
+	void (*error_out)(const char *fmt, ...) /* write error message */
+		pg_attribute_printf(1, 2);
+} PsqlScanCallbacks;
 
 /* Termination states for psql_scan() */
 typedef enum
@@ -31,7 +39,7 @@ extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
 extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
-							PGconn *db, VariableGetter getter, int encoding);
+							PsqlScanCallbacks *callbacks);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
diff --git a/src/bin/psql/psqlscan_int.h b/src/bin/psql/psqlscan_int.h
index 6f9fa13..fb6b036 100644
--- a/src/bin/psql/psqlscan_int.h
+++ b/src/bin/psql/psqlscan_int.h
@@ -49,14 +49,12 @@ typedef struct PsqlScanStateData
 	char	   *scanbuf;		/* start of outer-level input buffer */
 	const char *scanline;		/* current input line at outer level */
 
-	/* safe_encoding, curline, refline are used by scan_emit() to replace FFs */
-	PGconn	   *db;				/* active connection */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
 	const char *curline;		/* actual flex input string for cur buf */
 	const char *refline;		/* original data for cur buffer */
 	int			curpos;			/* current position in curline  */
-	VariableGetter get_variable; /* access to the "shell variable" repository */
+
+	PsqlScanCallbacks cb;		/* callback given from outside */
+
 
 	/*
 	 * All this state lives across successive input lines, until explicitly
diff --git a/src/bin/psql/psqlscan_slash.c b/src/bin/psql/psqlscan_slash.c
index 223bde4..bf8c0f3 100644
--- a/src/bin/psql/psqlscan_slash.c
+++ b/src/bin/psql/psqlscan_slash.c
@@ -14,6 +14,6 @@
  * infrastructures for stdio.h and so on.  flex is absolutely uncooperative
  * about that, so we can't compile psqlscan.c on its own.
  */
-#include "common.h"
+#include "postgres_fe.h"
 #include "psqlscan.h"
 #include "psqlscan_slashbody.c"
diff --git a/src/bin/psql/psqlscan_slashbody.l b/src/bin/psql/psqlscan_slashbody.l
index ca0870e..7f2edfd 100644
--- a/src/bin/psql/psqlscan_slashbody.l
+++ b/src/bin/psql/psqlscan_slashbody.l
@@ -36,6 +36,7 @@ static void evaluate_backtick(void);
 /* Track where lexer parsed up to */
 #define YY_USER_ACTION cur_state->curpos += yyleng;
 
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
 %}
 
 %option 8bit
@@ -184,13 +185,15 @@ other			.
 						ECHO;
 					else
 					{
-						char   *varname;
+						char	   *varname;
 						const char *value;
+						void	  (*free_fn)(void *) = NULL;
 
-						if (cur_state->get_variable)
+						if (cur_state->cb.get_variable)
 						{
 							varname = extract_substring(yytext + 1, yyleng - 1);
-							value = cur_state->get_variable(varname);
+							value = cur_state->cb.get_variable(varname,
+												   false, false, &free_fn);
 							free(varname);
 						}
 
@@ -202,7 +205,11 @@ other			.
 						 * Note that we needn't guard against recursion here.
 						 */
 						if (value)
+						{
 							appendPQExpBufferStr(output_buf, value);
+							if (free_fn)
+								free_fn((void*)value);
+						}
 						else
 							ECHO;
 
@@ -372,18 +379,19 @@ psql_scan_slash_command_initialize(PsqlScanState state)
  */
 static void
 psql_scan_slash_command_setup(PsqlScanState state,
-						  const char *line, int line_len,
-						  PGconn *db, VariableGetter vargetter, int encoding)
+							  const char *line, int line_len,
+							  PsqlScanCallbacks *cb)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
+	Assert(cb->error_out != NULL);
 
-	/* Do we need to hack the character set encoding? */
-	state->encoding = encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
-
-	state->get_variable = vargetter;
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	state->cb.enc_mblen = cb->enc_mblen;
+	state->cb.standard_strings = cb->standard_strings;
+	state->cb.error_out = cb->error_out;
 
 	/* needed for prepare_buffer */
 	cur_state = state;
@@ -411,13 +419,10 @@ static void
 psql_scan_slash_command_switch_lexer(PsqlScanState state)
 {
 	const char *newscanline = state->scanline + state->curpos;
-	PGconn		   *db = state->db;
-	VariableGetter	vargetter = state->get_variable;
-	int				encoding = state->encoding;
+	PsqlScanCallbacks cb = state->cb;
 
 	psql_scan_slash_command_initialize(state);
-	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline),
-								  db, vargetter, encoding);
+	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline), &cb);
 }
 
 /*
@@ -577,7 +582,10 @@ psql_scan_slash_option(PsqlScanState state,
 					{
 						if (!inquotes && type == OT_SQLID)
 							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, cur_state->encoding);
+						if (ENC_IS_SAFE(cur_state))
+							cp += strlen(cp);
+						else
+							cp += cur_state->cb.enc_mblen(cp);
 					}
 				}
 			}
@@ -586,7 +594,7 @@ psql_scan_slash_option(PsqlScanState state,
 		case xslashbackquote:
 		case xslashdquote:
 			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
+			cur_state->cb.error_out("unterminated quoted string\n");
 			termPQExpBuffer(&mybuf);
 			return NULL;
 		case xslashwholeline:
@@ -660,7 +668,7 @@ evaluate_backtick(void)
 	fd = popen(cmd, PG_BINARY_R);
 	if (!fd)
 	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
 		error = true;
 	}
 
@@ -671,7 +679,7 @@ evaluate_backtick(void)
 			result = fread(buf, 1, sizeof(buf), fd);
 			if (ferror(fd))
 			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
+				cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
 				error = true;
 				break;
 			}
@@ -681,13 +689,13 @@ evaluate_backtick(void)
 
 	if (fd && pclose(fd) == -1)
 	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
 		error = true;
 	}
 
 	if (PQExpBufferDataBroken(cmd_output))
 	{
-		psql_error("%s: out of memory\n", cmd);
+		cur_state->cb.error_out("%s: out of memory\n", cmd);
 		error = true;
 	}
 
diff --git a/src/bin/psql/psqlscanbody.l b/src/bin/psql/psqlscanbody.l
index 4418a5c..4de29ec 100644
--- a/src/bin/psql/psqlscanbody.l
+++ b/src/bin/psql/psqlscanbody.l
@@ -54,6 +54,7 @@ PQExpBuffer output_buf;	/* current output buffer */
 /* Track where lexer parsed up to */
 #define YY_USER_ACTION cur_state->curpos += yyleng;
 
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
 %}
 
 %option 8bit
@@ -427,7 +428,7 @@ other			.
 				}
 
 {xqstart}		{
-					if (standard_strings())
+					if (cur_state->cb.standard_strings())
 						BEGIN(xq);
 					else
 						BEGIN(xe);
@@ -655,13 +656,15 @@ other			.
 
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname = NULL;
+					char	   *varname = NULL;
 					const char *value = NULL;
+					void	  (*free_fn)(void *) = NULL;
 
-					if (cur_state->get_variable)
+					if (cur_state->cb.get_variable)
 					{
 						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = cur_state->get_variable(varname);
+						value = cur_state->cb.get_variable(varname,
+ 									false, false, &free_fn);
 					}
 
 					if (value)
@@ -670,7 +673,7 @@ other			.
 						if (var_is_current_source(cur_state, varname))
 						{
 							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
+							cur_state->cb.error_out("skipping recursive expansion of variable \"%s\"\n",
 									   varname);
 							/* Instead copy the string as is */
 							ECHO;
@@ -681,6 +684,8 @@ other			.
 							push_new_buffer(value, varname);
 							/* yy_scan_string already made buffer active */
 						}
+						if (free_fn)
+							free_fn((void*)value);
 					}
 					else
 					{
@@ -860,6 +865,9 @@ other			.
 
 static void my_psql_scan_finish(PsqlScanState state);
 static void my_psql_scan_reset(PsqlScanState state);
+static void psql_error_errout(const char *fmt, ...)
+	__attribute__ ((format (printf, 1, 2)));
+static bool psql_standard_strings(void);
 
 static void
 psql_scan_initialize(PsqlScanState state)
@@ -911,17 +919,25 @@ psql_scan_destroy(PsqlScanState state)
  */
 void
 psql_scan_setup(PsqlScanState state, const char *line, int line_len,
-				PGconn *db, VariableGetter vargetter, int encoding)
+				PsqlScanCallbacks *cb)
 {
 	/* Mustn't be scanning already */
 	Assert(state->scanbufhandle == NULL);
 	Assert(state->buffer_stack == NULL);
 
-	/* Do we need to hack the character set encoding? */
-	state->encoding = encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	if (cb->standard_strings)
+		state->cb.standard_strings = cb->standard_strings;
+	else
+		state->cb.standard_strings = &psql_standard_strings;
 
-	state->get_variable = vargetter;
+	state->cb.enc_mblen = cb->enc_mblen;
+
+	if (cb->error_out)
+		state->cb.error_out = cb->error_out;
+	else
+		state->cb.error_out = &psql_error_errout;
 
 	/* needed for prepare_buffer */
 	cur_state = state;
@@ -1141,13 +1157,10 @@ void
 psql_scan_switch_lexer(PsqlScanState state)
 {
 	const char	   *newscanline = state->scanline + state->curpos;
-	PGconn		   *db = state->db;
-	VariableGetter	vargetter = state->get_variable;
-	int				encoding = state->encoding;
+	PsqlScanCallbacks cb = state->cb;
 
 	psql_scan_initialize(state);
-	psql_scan_setup(state, newscanline, strlen(newscanline),
-					db, vargetter, encoding);
+	psql_scan_setup(state, newscanline, strlen(newscanline), &cb);
 }
 
 /*
@@ -1209,7 +1222,7 @@ push_new_buffer(const char *newstr, const char *varname)
 	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
 									&stackelem->bufstring);
 	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 	{
 		stackelem->origstring = NULL;
 		cur_state->refline = stackelem->bufstring;
@@ -1282,7 +1295,7 @@ prepare_buffer(const char *txt, int len, char **txtcopy)
 	*txtcopy = newtxt;
 	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(newtxt, txt, len);
 	else
 	{
@@ -1291,7 +1304,7 @@ prepare_buffer(const char *txt, int len, char **txtcopy)
 
 		while (i < len)
 		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
+			int		thislen = cur_state->cb.enc_mblen(txt + i);
 
 			/* first byte should always be okay... */
 			newtxt[i] = txt[i];
@@ -1315,7 +1328,7 @@ prepare_buffer(const char *txt, int len, char **txtcopy)
 void
 scan_emit(const char *txt, int len)
 {
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		appendBinaryPQExpBuffer(output_buf, txt, len);
 	else
 	{
@@ -1347,7 +1360,7 @@ extract_substring(const char *txt, int len)
 {
 	char	   *result = (char *) pg_malloc(len + 1);
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(result, txt, len);
 	else
 	{
@@ -1381,45 +1394,23 @@ extract_substring(const char *txt, int len)
 void
 escape_variable(bool as_ident)
 {
-	char	   *varname;
-	const char *value;
-
 	/* Variable lookup if possible. */
-	if (cur_state->get_variable)
+	if (cur_state->cb.get_variable)
 	{
+		char		*varname;
+		const char  *value;
+		void	   (*free_fn)(void *);
+
 		varname = extract_substring(yytext + 2, yyleng - 3);
-		value = cur_state->get_variable(varname);
+		value = cur_state->cb.get_variable(varname, true, as_ident, &free_fn);
 		free(varname);
-	}
 
-	/* Escaping. */
-	if (value)
-	{
-		if (!cur_state->db)
-			psql_error("can't escape without active connection\n");
-		else
+		if (value)
 		{
-			char   *escaped_value;
-
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(cur_state->db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(cur_state->db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(cur_state->db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
+			appendPQExpBufferStr(output_buf, value);
+			if (free_fn)
+				free_fn((void*)value);
+			return;
 		}
 	}
 
@@ -1429,3 +1420,19 @@ escape_variable(bool as_ident)
 	 */
 	scan_emit(yytext, yyleng);
 }
+
+/* Default error output function */
+static void psql_error_errout(const char *fmt, ...)
+{
+	va_list	ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, _(fmt), ap);
+	va_end(ap);
+}
+
+/* Default function to check standard_conforming_strings */
+static bool psql_standard_strings(void)
+{
+	return false;
+}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 553efc0..47e9077 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -337,9 +337,12 @@ main(int argc, char *argv[])
 					puts(cell->val);
 
 				scan_state = psql_scan_create();
-				psql_scan_setup(scan_state,
-								cell->val, strlen(cell->val),
-								pset.db, get_variable, pset.encoding);
+				/* set enc_mblen according to the encoding */
+				psqlscan_callbacks.enc_mblen =
+					(pg_valid_server_encoding_id(pset.encoding) ?
+					 NULL : &psql_mblen);
+				psql_scan_setup(scan_state,	cell->val, strlen(cell->val),
+								&psqlscan_callbacks);
 
 				successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 					? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1
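
The ownership convention behind the new get_variable() callback in patch 0003 can be sketched as follows: the callback reports through an out-parameter whether the returned string must be freed, and with which function. This is an illustration under stated assumptions, not the patch's code. toy_get_variable_ex, copy_variable and demo_free_fn are hypothetical names, the as_ident flag is omitted, and the quote-doubling below is only a placeholder for PQescapeLiteral(); it does not reproduce libpq's escaping rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef void (*FreeFn) (void *);

/*
 * Hypothetical getter.  Without escaping it returns a borrowed string and
 * leaves *free_fn NULL; with escaping it returns a malloc'd string and
 * sets *free_fn so the caller knows how to release it.
 */
const char *
toy_get_variable_ex(const char *name, bool escape, FreeFn *free_fn)
{
	const char *value;
	char	   *quoted;
	char	   *p;
	size_t		i;
	size_t		n;

	*free_fn = NULL;

	value = (strcmp(name, "who") == 0) ? "O'Neil" : NULL;
	if (value == NULL || !escape)
		return value;

	/* worst case: every character doubled, plus two quotes and a NUL */
	n = strlen(value);
	quoted = malloc(2 * n + 3);
	if (quoted == NULL)
		return NULL;

	p = quoted;
	*p++ = '\'';
	for (i = 0; i < n; i++)
	{
		if (value[i] == '\'')
			*p++ = '\'';		/* double embedded quotes */
		*p++ = value[i];
	}
	*p++ = '\'';
	*p = '\0';

	*free_fn = free;
	return quoted;
}

/*
 * Caller side, in the manner of escape_variable(): use the value, then
 * free it only if the getter asked for that.  Returns 0 on success.
 */
int
copy_variable(char *out, size_t outsz, const char *name, bool escape)
{
	FreeFn		free_fn = NULL;
	const char *value = toy_get_variable_ex(name, escape, &free_fn);
	int			rc = -1;

	if (value == NULL)
		return -1;

	if (strlen(value) < outsz)
	{
		strcpy(out, value);
		rc = 0;
	}
	if (free_fn)
		free_fn((void *) value);
	return rc;
}

/* Returns 0 when both the borrowed and the allocated path work. */
int
demo_free_fn(void)
{
	char		buf[32];

	assert(copy_variable(buf, sizeof(buf), "who", false) == 0);
	assert(strcmp(buf, "O'Neil") == 0);

	assert(copy_variable(buf, sizeof(buf), "who", true) == 0);
	assert(strcmp(buf, "'O''Neil'") == 0);

	assert(copy_variable(buf, sizeof(buf), "nosuchvar", true) == -1);
	return 0;
}
```

Returning the deallocator alongside the value lets the lexer stay ignorant of where the string came from, which is what allows the PQfreemem() call to move out of psqlscanbody.l.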

0004-pgbench-uses-common-frontend-SQL-parser.patch (text/x-patch; charset=us-ascii)
From 8adb84017a87930b9b7f83273a9e4fbe9af8ee78 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 6 Jan 2016 15:45:58 +0900
Subject: [PATCH 4/5] pgbench uses common frontend SQL parser

Make pgbench use the common frontend SQL parser instead of its
homegrown parser.
---
 src/bin/pgbench/Makefile  |   5 +-
 src/bin/pgbench/pgbench.c | 537 ++++++++++++++++++++++++++++++----------------
 2 files changed, 356 insertions(+), 186 deletions(-)

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..b97556c 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -5,11 +5,12 @@ PGAPPICON = win32
 
 subdir = src/bin/pgbench
 top_builddir = ../../..
+psqldir = ../psql
 include $(top_builddir)/src/Makefile.global
 
-OBJS = pgbench.o exprparse.o $(WIN32RES)
+OBJS = pgbench.o exprparse.o $(psqldir)/psqlscan.o $(WIN32RES)
 
-override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(psqldir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
 override CFLAGS += $(PTHREAD_CFLAGS)
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 9e422c5..0b455e7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -265,7 +266,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -274,6 +275,7 @@ typedef struct
 	char	   *argv[MAX_ARGS]; /* command word list */
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
+	struct Command_t *next;		/* more command if any, for multistatements */
 } Command;
 
 typedef struct
@@ -296,6 +298,21 @@ typedef struct
 	double		sum2_lag;		/* sum(lag*lag) */
 } AggVals;
 
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static Command **sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -402,6 +419,9 @@ usage(void)
 		   progname, progname);
 }
 
+PsqlScanCallbacks pgbench_scan_callbacks =
+{NULL, NULL, NULL};
+
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
@@ -2233,218 +2253,348 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string, replacing newlines with spaces and collapsing whitespace */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines with spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace(*p) && isspace(*(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
-		return NULL;
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;	/* not a backslash command */
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace(*p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
+ 		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 
-	if (*p == '\\')
+	j = 0;
+	tok = strtok(++p, delim);
+
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
+
+	while (tok != NULL)
 	{
-		int			max_args = -1;
-
-		my_commands->type = META_COMMAND;
-
-		j = 0;
-		tok = strtok(++p, delim);
-
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
-
-		while (tok != NULL)
-		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
-		}
-
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*
-			 * parsing:
-			 *   \setrandom variable min max [uniform]
-			 *   \setrandom variable min max (gaussian|exponential) parameter
-			 */
-
-			if (my_commands->argc < 4)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
-			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-					 "missing parameter", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
-			}
-			else	/* cannot parse, unexpected arguments */
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
-			}
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
-		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
-
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
-
-			my_commands->expr = expr_parse_result;
-
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
-		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
-
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
-
-			if (my_commands->argc >= 3)
-			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
-			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
-		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
-		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
-		}
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
 		else
+			tok = strtok(NULL, delim);
+	}
+	parse_reset_outbuf(proc_state);
+
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*
+		 * parsing:
+		 *   \setrandom variable min max [uniform]
+		 *   \setrandom variable min max (gaussian|exponential) parameter
+		 */
+
+		if (my_commands->argc < 4)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing arguments", NULL, -1);
+		}
+
+		/* argc >= 4 */
+
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
+			{
+				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+							 "missing parameter", my_commands->argv[4], -1);
+			}
+			else if (my_commands->argc > 6)
+			{
+				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
+			}
+		}
+		else	/* cannot parse, unexpected arguments */
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
+
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
+
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
+
+		my_commands->expr = expr_parse_result;
+
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
+
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.  We
+		 * don't have to terminate the number argument with null because it
+		 * will be parsed with atoi, which ignores trailing non-digit
+		 * characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
+
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
+			{
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
+			}
+		}
+
+		if (my_commands->argc >= 3)
+		{
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
+			{
+				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
+			}
+		}
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+	return my_commands;
+}
+
+/* Parse an input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation of the backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
+	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), &pgbench_scan_callbacks);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Put this command at the end of returning command chain */
+		if (!retcomd)
+			retcomd = command;
+		else
+		{
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
 	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n",
+				source, lineno);
+		exit(1);
+	}
+
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
 
-	return my_commands;
+	return retcomd;
 }
 
+
 /*
  * Read a line from fd, and return it in a malloc'd buffer.
  * Return NULL at EOF.
@@ -2499,6 +2649,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	if (num_files >= MAX_FILES)
 	{
@@ -2519,26 +2670,38 @@ process_file(char *filename)
 		return false;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 
 		free(buf);
 
 		if (command == NULL)
-			continue;
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. In those cases, immediately ask for the next
+			 * line.
+			 */
+ 			continue;
+		}
 
-		my_commands[index] = command;
-		index++;
-
-		if (index >= alloc_num)
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
+		
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
 			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
@@ -2546,6 +2709,8 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	sql_files[num_files++] = my_commands;
@@ -2563,6 +2728,7 @@ process_builtin(char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2589,10 +2755,12 @@ process_builtin(char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2604,6 +2772,7 @@ process_builtin(char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
-- 
1.8.3.1
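
For readers following the backslash-continuation handling in the patch above, here is a minimal standalone sketch of the same idea: physical lines ending in a backslash are accumulated, and a logical line is reported once a line without a trailing backslash arrives. The names LineJoiner, joiner_reset, joiner_feed and JOIN_BUF_SIZE are hypothetical and exist only for this illustration. The fixed-size buffer is an assumption of the sketch (the patch uses a PQExpBuffer), and the empty-line guard is a choice made here, not something taken from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define JOIN_BUF_SIZE 1024      /* arbitrary limit, assumption of this sketch */

/* Hypothetical accumulator for one logical line */
typedef struct LineJoiner
{
    char    buf[JOIN_BUF_SIZE]; /* text accumulated so far */
    size_t  len;                /* bytes used in buf, excluding the NUL */
} LineJoiner;

static void
joiner_reset(LineJoiner *j)
{
    j->len = 0;
    j->buf[0] = '\0';
}

/*
 * Feed one physical line.  Returns true when a logical line is complete
 * (its text is then in j->buf), or false when the line ended with a
 * backslash and more input is needed.
 */
static bool
joiner_feed(LineJoiner *j, const char *line)
{
    size_t  n = strcspn(line, "\n");    /* length without the newline */
    bool    continued = (n > 0 && line[n - 1] == '\\');

    if (continued)
        n--;                    /* drop the trailing backslash */

    /* no dynamic growth in this sketch: room for text, a space and NUL */
    assert(j->len + n + 2 <= JOIN_BUF_SIZE);
    memcpy(j->buf + j->len, line, n);
    j->len += n;

    /* keep words on adjacent physical lines separated */
    if (continued && (n == 0 || !isspace((unsigned char) line[n - 1])))
        j->buf[j->len++] = ' ';

    j->buf[j->len] = '\0';
    return !continued;
}
```

A caller would invoke joiner_reset once, pass each line read from the script to joiner_feed, and tokenize j->buf whenever it returns true before resetting again.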

0005-Change-the-way-to-hold-command-list.patch (text/x-patch; charset=us-ascii)
From 0ada27f20c4519617b2c805857f9d955cd65a054 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 7 Jan 2016 15:54:19 +0900
Subject: [PATCH 5/5] Change the way to hold command list.

Currently, commands for SQL statements are generated as a linked list
but then stored in and accessed through an array. This patch stores and
accesses them as a linked list throughout.
---
 src/bin/pgbench/pgbench.c | 168 +++++++++++++++++++++-------------------------
 1 file changed, 76 insertions(+), 92 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 0b455e7..989ab77 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -192,16 +192,29 @@ typedef struct
 
 #define MAX_FILES		128		/* max number of SQL script files allowed */
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
+#define MAX_ARGS		10
 
 /*
  * structures used in custom query mode
  */
 
+typedef struct Command_t
+{
+	char	   *line;			/* full text of command line */
+	int			command_num;	/* unique index of this Command struct */
+	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
+	int			argc;			/* number of command words */
+	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
+	PgBenchExpr *expr;			/* parsed expression */
+	struct Command_t *next;		/* more command if any, for multistatements */
+} Command;
+
 typedef struct
 {
 	PGconn	   *con;			/* connection handle to DB */
 	int			id;				/* client No. */
-	int			state;			/* state No. */
+	Command	   *curr;			/* current command */
 	int			listen;			/* 0 indicates that an async query has been
 								 * sent */
 	int			sleeping;		/* 1 indicates that the client is napping */
@@ -253,7 +266,6 @@ typedef struct
  */
 #define SQL_COMMAND		1
 #define META_COMMAND	2
-#define MAX_ARGS		10
 
 typedef enum QueryMode
 {
@@ -266,18 +278,6 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct Command_t
-{
-	char	   *line;			/* full text of command line */
-	int			command_num;	/* unique index of this Command struct */
-	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
-	int			argc;			/* number of command words */
-	char	   *argv[MAX_ARGS]; /* command word list */
-	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
-	PgBenchExpr *expr;			/* parsed expression */
-	struct Command_t *next;		/* more command if any, for multistatements */
-} Command;
-
 typedef struct
 {
 
@@ -313,7 +313,7 @@ typedef struct ParseInfo
 } ParseInfoData;
 typedef ParseInfoData *ParseInfo;
 
-static Command **sql_files[MAX_FILES];	/* SQL script files */
+static Command *sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -1149,7 +1149,7 @@ static bool
 doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVals *agg)
 {
 	PGresult   *res;
-	Command   **commands;
+	Command    *commands;
 	bool		trans_needs_throttle = false;
 	instr_time	now;
 
@@ -1246,13 +1246,14 @@ top:
 
 	if (st->listen)
 	{							/* are we receiver? */
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			if (debug)
 				fprintf(stderr, "client %d receiving\n", st->id);
 			if (!PQconsumeInput(st->con))
 			{					/* there's something wrong */
-				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->state);
+				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n",
+						st->id,	st->curr->command_num);
 				return clientDone(st, false);
 			}
 			if (PQisBusy(st->con))
@@ -1265,7 +1266,7 @@ top:
 		 */
 		if (is_latencies)
 		{
-			int			cnum = commands[st->state]->command_num;
+			int			cnum = st->curr->command_num;
 
 			if (INSTR_TIME_IS_ZERO(now))
 				INSTR_TIME_SET_CURRENT(now);
@@ -1275,7 +1276,7 @@ top:
 		}
 
 		/* transaction finished: calculate latency and log the transaction */
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			/* only calculate latency if an option is used that needs it */
 			if (progress || throttle_delay || latency_limit)
@@ -1308,7 +1309,7 @@ top:
 				doLog(thread, st, logfile, &now, agg, false);
 		}
 
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			/*
 			 * Read and discard the query result; note this is not included in
@@ -1322,7 +1323,8 @@ top:
 					break;		/* OK */
 				default:
 					fprintf(stderr, "client %d aborted in state %d: %s",
-							st->id, st->state, PQerrorMessage(st->con));
+							st->id, st->curr->command_num,
+							PQerrorMessage(st->con));
 					PQclear(res);
 					return clientDone(st, false);
 			}
@@ -1330,7 +1332,7 @@ top:
 			discard_response(st);
 		}
 
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (is_connect)
 			{
@@ -1344,12 +1346,12 @@ top:
 		}
 
 		/* increment state counter */
-		st->state++;
-		if (commands[st->state] == NULL)
+		st->curr = st->curr->next;
+		if (st->curr == NULL)
 		{
-			st->state = 0;
 			st->use_file = (int) getrand(thread, 0, num_files - 1);
 			commands = sql_files[st->use_file];
+			st->curr = commands;
 			st->is_throttled = false;
 
 			/*
@@ -1392,7 +1394,8 @@ top:
 	}
 
 	/* Record transaction start time under logging, progress or throttling */
-	if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0)
+	if ((logfile || progress || throttle_delay || latency_limit) &&
+		st->curr == commands)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1408,9 +1411,9 @@ top:
 	if (is_latencies)
 		INSTR_TIME_SET_CURRENT(st->stmt_begin);
 
-	if (commands[st->state]->type == SQL_COMMAND)
+	if (st->curr->type == SQL_COMMAND)
 	{
-		const Command *command = commands[st->state];
+		const Command *command = st->curr;
 		int			r;
 
 		if (querymode == QUERY_SIMPLE)
@@ -1444,18 +1447,19 @@ top:
 
 			if (!st->prepared[st->use_file])
 			{
-				int			j;
+				int			j = 0;
+				Command		*pcom = commands;
 
-				for (j = 0; commands[j] != NULL; j++)
+				for (; pcom ; pcom = pcom->next, j++)
 				{
 					PGresult   *res;
 					char		name[MAX_PREPARE_NAME];
 
-					if (commands[j]->type != SQL_COMMAND)
+					if (pcom->type != SQL_COMMAND)
 						continue;
 					preparedStatementName(name, st->use_file, j);
 					res = PQprepare(st->con, name,
-						  commands[j]->argv[0], commands[j]->argc - 1, NULL);
+						  pcom->argv[0], pcom->argc - 1, NULL);
 					if (PQresultStatus(res) != PGRES_COMMAND_OK)
 						fprintf(stderr, "%s", PQerrorMessage(st->con));
 					PQclear(res);
@@ -1464,7 +1468,7 @@ top:
 			}
 
 			getQueryParams(st, command, params);
-			preparedStatementName(name, st->use_file, st->state);
+			preparedStatementName(name, st->use_file, st->curr->command_num);
 
 			if (debug)
 				fprintf(stderr, "client %d sending %s\n", st->id, name);
@@ -1484,11 +1488,11 @@ top:
 		else
 			st->listen = 1;		/* flags that should be listened */
 	}
-	else if (commands[st->state]->type == META_COMMAND)
+	else if (st->curr->type == META_COMMAND)
 	{
-		int			argc = commands[st->state]->argc,
+		int			argc = st->curr->argc,
 					i;
-		char	  **argv = commands[st->state]->argv;
+		char	  **argv = st->curr->argv;
 
 		if (debug)
 		{
@@ -1638,7 +1642,7 @@ top:
 		else if (pg_strcasecmp(argv[0], "set") == 0)
 		{
 			char		res[64];
-			PgBenchExpr *expr = commands[st->state]->expr;
+			PgBenchExpr *expr = st->curr->expr;
 			int64		result;
 
 			if (!evaluateExpr(st, expr, &result))
@@ -2641,14 +2645,11 @@ read_line_from_file(FILE *fd)
 static int
 process_file(char *filename)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
 	FILE	   *fd;
-	int			lineno,
-				index;
+	int			lineno;
 	char	   *buf;
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
 	if (num_files >= MAX_FILES)
@@ -2657,23 +2658,18 @@ process_file(char *filename)
 		exit(1);
 	}
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	if (strcmp(filename, "-") == 0)
 		fd = stdin;
 	else if ((fd = fopen(filename, "r")) == NULL)
 	{
 		fprintf(stderr, "could not open file \"%s\": %s\n",
 				filename, strerror(errno));
-		pg_free(my_commands);
 		return false;
 	}
 
 	proc_state->mode = PS_IDLE;
 
 	lineno = 0;
-	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
@@ -2695,46 +2691,37 @@ process_file(char *filename)
  			continue;
 		}
 
-		while (command)
-		{
-			my_commands[index++] = command;
-			command = command->next;
-		}
+		/* Append new commands at the end of the list */
+		if (my_commands_tail)
+			my_commands_tail->next = command;
+		else
+			my_commands = my_commands_tail = command;
 		
-		if (index > alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
-		}
+		/* Seek to the tail of the list */
+		while (my_commands_tail->next)
+			my_commands_tail = my_commands_tail->next;
 	}
 	fclose(fd);
 
 	parse_finish_scan(proc_state);
 
-	my_commands[index] = NULL;
+	my_commands_tail->next = NULL;
 
 	sql_files[num_files++] = my_commands;
 
 	return true;
 }
 
-static Command **
+static Command *
 process_builtin(char *tb, const char *source)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
-	int			lineno,
-				index;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
+	int			lineno;
 	char		buf[BUFSIZ];
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	lineno = 0;
-	index = 0;
 
 	for (;;)
 	{
@@ -2759,19 +2746,17 @@ process_builtin(char *tb, const char *source)
 		if (command == NULL)
 			continue;
 
-		/* builtin doesn't need multistatements */
+		/* For simplicity, builtin scripts must not contain multistatements */
 		Assert(command->next == NULL);
-		my_commands[index] = command;
-		index++;
-
-		if (index >= alloc_num)
+		if (my_commands_tail)
 		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+			my_commands_tail->next = command;
+			my_commands_tail = command;
 		}
+		else
+			my_commands = my_commands_tail = command;
 	}
 
-	my_commands[index] = NULL;
 	parse_finish_scan(proc_state);
 
 	return my_commands;
@@ -2876,16 +2861,16 @@ printResults(int ttype, int64 normal_xacts, int nclients,
 
 		for (i = 0; i < num_files; i++)
 		{
-			Command   **commands;
+			Command   *command;
 
 			if (num_files > 1)
 				printf("statement latencies in milliseconds, file %d:\n", i + 1);
 			else
 				printf("statement latencies in milliseconds:\n");
 
-			for (commands = sql_files[i]; *commands != NULL; commands++)
+			for (command = sql_files[i]; command ;
+				 command=command->next)
 			{
-				Command    *command = *commands;
 				int			cnum = command->command_num;
 				double		total_time;
 				instr_time	total_exec_elapsed;
@@ -3166,7 +3151,7 @@ main(int argc, char **argv)
 				benchmarking_option_set = true;
 				ttype = 3;
 				filename = pg_strdup(optarg);
-				if (process_file(filename) == false || *sql_files[num_files - 1] == NULL)
+				if (process_file(filename) == false || sql_files[num_files - 1] == NULL)
 					exit(1);
 				break;
 			case 'D':
@@ -3752,17 +3737,18 @@ threadRun(void *arg)
 	for (i = 0; i < nstate; i++)
 	{
 		CState	   *st = &state[i];
-		Command   **commands = sql_files[st->use_file];
 		int			prev_ecnt = st->ecnt;
 
 		st->use_file = getrand(thread, 0, num_files - 1);
+		st->curr = sql_files[st->use_file];
+
 		if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
 			remains--;			/* I've aborted */
 
-		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+		if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 		{
 			fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-					i, st->state);
+					i, st->curr->command_num);
 			remains--;			/* I've aborted */
 			PQfinish(st->con);
 			st->con = NULL;
@@ -3783,7 +3769,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_files[st->use_file];
 			int			sock;
 
 			if (st->con == NULL)
@@ -3819,7 +3804,7 @@ threadRun(void *arg)
 						min_usec = this_usec;
 				}
 			}
-			else if (commands[st->state]->type == META_COMMAND)
+			else if (st->curr->type == META_COMMAND)
 			{
 				min_usec = 0;	/* the connection is ready to run */
 				break;
@@ -3889,20 +3874,19 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_files[st->use_file];
 			int			prev_ecnt = st->ecnt;
 
 			if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask)
-							|| commands[st->state]->type == META_COMMAND))
+							|| st->curr->type == META_COMMAND))
 			{
 				if (!doCustom(thread, st, &thread->conn_time, logfile, &aggs))
 					remains--;	/* I've aborted */
 			}
 
-			if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+			if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 			{
 				fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-						i, st->state);
+						i, st->curr->command_num);
 				remains--;		/* I've aborted */
 				PQfinish(st->con);
 				st->con = NULL;
-- 
1.8.3.1
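
The list handling in the patch above (appending a chain of commands, then moving the tail pointer to the end of that chain) can be sketched in isolation as follows. Node, NodeList, list_append_chain and list_length are hypothetical names for this illustration only; the real code works on Command structs and keeps the head and tail in local variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Command struct: only the link matters here */
typedef struct Node
{
    int          id;
    struct Node *next;
} Node;

/* Hypothetical list header keeping both ends of the list */
typedef struct NodeList
{
    Node *head;                 /* first node, or NULL if the list is empty */
    Node *tail;                 /* last node, or NULL if the list is empty */
} NodeList;

/* Append a NULL-terminated chain of one or more nodes to the list. */
static void
list_append_chain(NodeList *list, Node *chain)
{
    assert(chain != NULL);

    if (list->tail)
        list->tail->next = chain;
    else
        list->head = chain;

    /* advance the tail to the last node of the appended chain */
    list->tail = chain;
    while (list->tail->next)
        list->tail = list->tail->next;
}

/* Count nodes by following next pointers from the head. */
static int
list_length(const NodeList *list)
{
    int n = 0;

    for (const Node *p = list->head; p != NULL; p = p->next)
        n++;
    return n;
}
```

Because every appended chain is already NULL-terminated, an empty list is simply a NULL head, and nothing has to be written through the tail pointer after the last append.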

#41 Michael Paquier
michael.paquier@gmail.com
In reply to: Kyotaro HORIGUCHI (#40)
Re: pgbench - allow backslash-continuations in custom scripts

On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Finally, PsqlScanState has four callback functions and all pgbench
needs to do to use it is to set all of them to NULL and link the
object file in the psql directory. No link switches or ifdefs are
necessary.

Am I missing something? This patch is not registered in the CF app.
Horiguchi-san, if you expect feedback, it would be good to get it
there.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#42Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Michael Paquier (#41)
Re: pgbench - allow backslash-continuations in custom scripts

Mmm. I believed that this was already in the CF app.

At Tue, 19 Jan 2016 15:41:54 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqQFnORG=LjDiGgD_hy_M00scx1ihn89QHx_1B9+3Vz7tQ@mail.gmail.com>

On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Finally, PsqlScanState has four callback functions and all pgbench
needs to do to use it is to set all of them to NULL and link the
object file in the psql directory. No link switches or ifdefs are
necessary.

Am I missing something? This patch is not registered in the CF app.
Horiguchi-san, if you expect feedback, it would be good to get it
there.

Thank you very much, Michael, but the CF app doesn't allow me to
register a new one. Filling in the Description field with "pgbench -
allow backslash-continuations in custom scripts", choosing a
topic and then clicking "Find thread" shows nothing. Filling in the
search text field of the "Attach thread" dialogue with the description
or giving the exact message-id gave me nothing to choose from.

Maybe I should repost the patch so that "Attach thread" can
find it as a "recent" email?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#43Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Kyotaro HORIGUCHI (#42)
Re: pgbench - allow backslash-continuations in custom scripts

Hello Kyotaro-san,

Thank you very much, Michael, but the CF app doesn't allow me to
register a new one. Filling in the Description field with "pgbench -
allow backslash-continuations in custom scripts", choosing a
topic and then clicking "Find thread" shows nothing. Filling in the
search text field of the "Attach thread" dialogue with the description
or giving the exact message-id gave me nothing to choose from.

Strange.

You could try taking the old entry and selecting state "move to next CF"?

--
Fabien.


#44Michael Paquier
michael.paquier@gmail.com
In reply to: Kyotaro HORIGUCHI (#42)
Re: pgbench - allow backslash-continuations in custom scripts

On Tue, Jan 26, 2016 at 6:51 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Mmm. I believed that this was already in the CF app.

At Tue, 19 Jan 2016 15:41:54 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqQFnORG=LjDiGgD_hy_M00scx1ihn89QHx_1B9+3Vz7tQ@mail.gmail.com>

On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Finally, PsqlScanState has four callback functions and all pgbench
needs to do to use it is to set all of them to NULL and link the
object file in the psql directory. No link switches or ifdefs are
necessary.

Am I missing something? This patch is not registered in the CF app.
Horiguchi-san, if you expect feedback, it would be good to get it
there.

Thank you very much, Michael, but the CF app doesn't allow me to
register a new one.

That's perhaps a bit late I am afraid for this CF, we are at the end
of January already...

Filling in the Description field with "pgbench -
allow backslash-continuations in custom scripts", choosing a
topic and then clicking "Find thread" shows nothing. Filling in the
search text field of the "Attach thread" dialogue with the description
or giving the exact message-id gave me nothing to choose from.

Really? That's because the patch is marked as returned with feedback here:
https://commitfest.postgresql.org/7/319/

Maybe I should repost the patch so that "Attach thread" can
find it as a "recent" email?

What if you just add it to next CF with a new entry? You are actually
proposing an entirely new patch.
--
Michael


#45Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Michael Paquier (#41)
Re: pgbench - allow backslash-continuations in custom scripts

Hello, thank you, Fabien, Michael.

# Though I have been almost inactive for the last month...

At Tue, 26 Jan 2016 13:53:33 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1601261352210.6482@sto>

Hello Kyotaro-san,

Thank you very much, Michael, but the CF app doesn't allow me to
register a new one. Filling in the Description field with "pgbench -
allow backslash-continuations in custom scripts", choosing a
topic and then clicking "Find thread" shows nothing. Filling in the
search text field of the "Attach thread" dialogue with the description
or giving the exact message-id gave me nothing to choose from.

Strange.

You could try taking the old entry and selecting state "move to next
CF"?

Hmm. The state of the old entry in CF2015-11 is already "Move*d*
to next CF" and it is not found in CF2016-01, as far as I saw.

At Tue, 26 Jan 2016 22:21:49 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqRtaz2nGb-7wQJ+w1-rFyFFxkruesNBM3RcPCgXaCoSmQ@mail.gmail.com>

Filling in the Description field with "pgbench -
allow backslash-continuations in custom scripts", choosing a
topic and then clicking "Find thread" shows nothing. Filling in the
search text field of the "Attach thread" dialogue with the description
or giving the exact message-id gave me nothing to choose from.

Really? That's because the patch is marked as returned with feedback here:
https://commitfest.postgresql.org/7/319/

Ah, now I have many candidates in the "Attach thread" dialog. That
would have been a temporary symptom of some kind of CF-season
maintenance.

Maybe I should repost the patch so that "Attach thread" can
find it as a "recent" email?

What if you just add it to next CF with a new entry? You are actually
proposing an entirely new patch.

So, I could finally register an entry for CF2016-3.
Thank you all for the suggestions.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#46Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro HORIGUCHI (#40)
Re: pgbench - allow backslash-continuations in custom scripts

On Thu, Jan 7, 2016 at 3:36 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

- 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch

This diff looks a bit large but most of it is cut-and-paste
work and the substantial change is rather small.

This refactors psqlscan.l into two .l files. The additional
psqlscan_slash.l is a bit tricky in that it recreates
scan_state on each transition to and from psqlscan.l.

I've looked at this patch a few times now but find it rather hard to
verify. I am wondering if you would be willing to separate 0001 into
subpatches. For example, maybe there could be one or two patches that
ONLY move code around and then one or more patches that make the
changes to that code. Right now, for example, psql_scan_setup() is
getting three additional arguments, but it's also being moved to a
different file. Perhaps those two things could be done one at a time.

I also think this patch could really benefit from a detailed set of
submission notes that specifically lay out what each change is and
why it was made. For instance, I see that psqlscan.l used yyless() while
psqlscanbody.l uses a new my_yyless() you've defined. There is
probably a great reason for that and I'm sure if I stare at this for
long enough I can figure out what that reason is, but it would be
better if you had a list of bullet points explaining what was changed
and why.

I would really like to see this patch committed; my problem is that I
don't have enough braincells to be sure that it's correct in the
present form.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#47Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#46)
Re: pgbench - allow backslash-continuations in custom scripts

Hello, thank you for reviewing this.

On Thu, Jan 7, 2016 at 3:36 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

- 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch

This diff looks a bit large but most of it is cut-and-paste
work and the substantial change is rather small.

This refactors psqlscan.l into two .l files. The additional
psqlscan_slash.l is a bit tricky in that it recreates
scan_state on each transition to and from psqlscan.l.

I've looked at this patch a few times now but find it rather hard to
verify. I am wondering if you would be willing to separate 0001 into
subpatches. For example, maybe there could be one or two patches that
ONLY move code around and then one or more patches that make the
changes to that code. Right now, for example, psql_scan_setup() is
getting three additional arguments, but it's also being moved to a
different file. Perhaps those two things could be done one at a time.

I tried to split it into patches along what I thought were
meaningful steps, but I'll rearrange them if they are not easy to read.

I also think this patch could really benefit from a detailed set of
submission notes that specifically lay out what each change is and
why it was made. For instance, I see that psqlscan.l used yyless() while
psqlscanbody.l uses a new my_yyless() you've defined. There is
probably a great reason for that and I'm sure if I stare at this for
long enough I can figure out what that reason is, but it would be
better if you had a list of bullet points explaining what was changed
and why.

I'm sorry, but I didn't understand what 'submission notes' means
exactly. Is it a precise description in source comments, or the
commit message of git-commit?

I would really like to see this patch committed; my problem is that I
don't have enough braincells to be sure that it's correct in the
present form.

Thank you. I'll send the rearranged patch soon.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#48Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro HORIGUCHI (#47)
Re: pgbench - allow backslash-continuations in custom scripts

On Mon, Feb 15, 2016 at 1:04 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

I'm sorry, but I didn't understand what 'submission notes' means
exactly. Is it a precise description in source comments, or the
commit message of git-commit?

Write a detailed email explaining each change that is part of the
patch and why it is there. Attach the patch to that same email.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#49Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#48)
5 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

Hello,

At Tue, 16 Feb 2016 08:05:10 -0500, Robert Haas
<robertmhaas@gmail.com> wrote in
<CA+TgmoavXzXVV_k-89SbgMKB-Eyp+RpSKu_0tPGqx_ceEk=kCQ@mail.gmail.com>

On Mon, Feb 15, 2016 at 1:04 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

I'm sorry, but I didn't understand what 'submission notes' means
exactly. Is it a precise description in source comments, or the
commit message of git-commit?

Write a detailed email explaining each change that is part of the
patch and why it is there. Attach the patch to that same email.

Sorry for the silly question. I'll try to describe this in
detail. I hope you are patient enough to read my clumsy (but a
bit long) text.

First, I rebased the previous patch set and merged three of
them. Now there are three patches.

1. Making the SQL parser part of psqlscan independent of psql.

This moves psql's backslash command handling out of the original
psqlscan.l. The psql internals that the parser used to access
directly are now reached through a set of callback functions,
which can all be NULL for users other than psql.

2. Making pgbench use the new psqlscan parser.

3. Changing the way SQL/META commands are held from an array to
a linked list.

Patch #2 introduces a linked list to store SQL multistatements,
but the caller immediately moves the elements into an array. This
patch switches the storage to a linked list throughout.

I'll explain each patch in detail.

1. Make SQL parser part of psqlscan independent from psql.

The new file psqlscan_slashbody.l is the part moved out of
psqlscan.l. The prefix of its flex symbols is changed to "yys".

psqlscan.c and psqlscan_slash.c are container C sources that do
what was previously done by mainloop.c, namely #include the
generated lexer. This lets other bin/ programs link psqlscan.o
directly, without compiling the lexer themselves.

1.1. The psqlscanbody.l

It is the SQL part of the old psqlscan.l, but the differences
between them are hard to see in the patch, so I attached the diff
between them as "psqlscanbody.l.diff" for convenience.

1.1.1. Switching between two lexers.

The SQL parser and the metacommand parser must be able to call
each other alternately, but yyparse() cannot start parsing from an
intermediate position of the input text. So I provided functions
that rebuild a parser state from the input text beginning just
after the position already parsed. psql_scan_switch_lexer() and
psql_scan_slash_command_switch_lexer() do that for psqlscanbody.l
and psqlscan_slashbody.l respectively.

I found no variable that tells how many characters the parser has
consumed, so I added a counter for that purpose,
PsqlScanState.curpos, and keep it pointing just after the last
character already parsed. Advancing it with YY_USER_ACTION is a
rather common technique and I did so, but the parser occasionally
steps back using yyless() when it reads certain sequences. Hence
yyless() should decrement .curpos, but flex has no hook point for
that purpose. I defined a my_yyless() macro (the name probably
needs to change) to do this.
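
The bookkeeping described above can be sketched outside flex as
follows. This is an illustration only, not code from the patch:
ScanPos, consume() and step_back() are hypothetical names standing in
for PsqlScanState.curpos, the YY_USER_ACTION hook and the my_yyless()
macro respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the position-tracking part of the scan state. */
typedef struct ScanPos
{
    size_t      curpos;     /* offset just after the last consumed byte */
} ScanPos;

/* What a YY_USER_ACTION hook would do: advance by the matched length. */
static void
consume(ScanPos *pos, size_t match_len)
{
    pos->curpos += match_len;
}

/*
 * What a yyless(keep) wrapper would do: the lexer keeps only the first
 * 'keep' bytes of a match of length 'match_len' and pushes the rest back
 * to the input, so the counter must move back by the difference.
 */
static void
step_back(ScanPos *pos, size_t match_len, size_t keep)
{
    assert(keep <= match_len && match_len - keep <= pos->curpos);
    pos->curpos -= match_len - keep;
}
```

With this arrangement, a three-byte match of which only one byte is
kept leaves the counter one byte past the previous position, which is
the offset from which a rebuilt lexer would have to resume.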

1.1.2. Detaching psqlscan from psql stuff

psqlscan.l depended on psql through pset.vars, pset.encoding,
pset.db and psql_error(). I have modified the parser to access
them via callback functions. psql_scan_setup() now has a new
fourth parameter, PsqlScanCallbacks, to receive them. Two of the
callbacks, standard_strings and error_out, need no detailed
explanation, but enc_mblen and get_variable do.

pset.encoding was used only to check whether PQmblen() is
applicable and then to be passed to PQmblen(). So I provided the
callback enc_mblen, which needs to be given only if the encoding
is not safe. The other one, get_variable(), is essentially a
wrapper around GetVariable(). GetVariable() was called directly in
the lexer definitions and indirectly via escape_variable() in
psqlscan.l, but escape_variable() accessed pset.db only to check
for an active connection. I could have passed that through another
callback, but instead I moved the part of the function that
accesses pset.db out of the lexer, because it is accessed nowhere
else.

Finally, I defined the callbacks in common.c.
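
To illustrate how a callback table whose entries may be NULL can be
consumed, here is a minimal sketch. It is not the patch's code:
DemoCallbacks is a reduced, hypothetical model of PsqlScanCallbacks,
and the fallback behaviours chosen for the NULL case (no variable
substitution, one byte per character) are assumptions made for the
example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced callback table modelled on PsqlScanCallbacks. */
typedef struct DemoCallbacks
{
    const char *(*get_variable) (const char *name);  /* may be NULL */
    int         (*enc_mblen) (const char *s);        /* may be NULL */
} DemoCallbacks;

/* Variable lookup: with no callback, no substitution is ever performed. */
static const char *
demo_lookup(const DemoCallbacks *cb, const char *name)
{
    if (cb == NULL || cb->get_variable == NULL)
        return NULL;
    return cb->get_variable(name);
}

/* Character length: with no callback, every byte counts as one character. */
static int
demo_char_len(const DemoCallbacks *cb, const char *s)
{
    if (cb == NULL || cb->enc_mblen == NULL)
        return 1;
    return cb->enc_mblen(s);
}

/* Example callback a psql-like caller could supply (illustrative only). */
static const char *
demo_get_variable(const char *name)
{
    return strcmp(name, "scale") == 0 ? "10" : NULL;
}
```

A caller such as pgbench would pass a table with every member NULL,
while a psql-like caller would fill in the members it can serve.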

1.2. The reason for the names psqlscanbody.l and psqlscan_slashbody.l.

psqlscan.l ended up being #include'd as psqlscan.c by mainloop.c.
The reason is that postgres_fe.h must be read before psqlscan.c on
some platforms, according to the comment at the end of mainloop.c.
But that is an annoyance when using the lexer from other bin/
programs. Therefore, I provided dedicated .c files that do the
#include for the two generated lexer .c files. So that the object
file linked from outside psql is still named psqlscan.o, I renamed
the *.l files to *body.l.

1.3 The psqlscan_int.h file

As the new psqlscan.o is used from outside psql, psqlscan.h
should contain only the definitions needed for that.
psqlscan_int.h defines things that are used by the lexers
themselves but are not needed in order to call them.

1.4 Other files

The other files not mentioned so far (Makefile, common.h,
psqlscan_slash.h and startup.c) should need no explanation.

2. Making pgbench use the new psqlscan parser.

With patch #2, pgbench.c gets two major modifications: splitting
process_commands(), and adding the backslash-continuation feature
to the newly added function process_backslash_commands().

2.1. process_commands() has been split into two functions

The function process_commands() has been split into
process_commands() and the new function
process_backslash_commands(). The former has been changed to use
psqlscan. In contrast to psql, pgbench first checks in
process_commands() whether the input at hand is a backslash
command, and parses it using psqlscan only if it is not.
process_backslash_commands() is cut out of the original
process_commands() and modified to handle backslash
continuation. Both functions read multiple lines within one
context, so the processing context is created by the caller
(i.e. process_file) and kept across all input lines.

2.2 backslash continuation in process_backslash_commands().

The loop over input lines in the old process_commands() is
refactored to handle SQL statements and backslash commands
separately. Most of the new process_backslash_commands() is
almost the same as the corresponding part of the old
process_commands() except for indentation. "git diff -b --patience"
gives a far less noisy diff, so I attached it as
"pgbench.c.patient.diff" for convenience.
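
For readers unfamiliar with the idea, backslash continuation can be
sketched as below. This is a minimal illustration under assumptions,
not the logic of process_backslash_commands(): join_continuations() is
a hypothetical helper, and it assumes that a backslash immediately
followed by a newline is simply dropped together with the newline.

```c
#include <stddef.h>

/*
 * Copy 'in' to 'out', dropping every backslash that is immediately
 * followed by a newline (and the newline itself).  Returns the number of
 * bytes written, not counting the terminating NUL, or -1 if 'out' is too
 * small.  Hypothetical helper, not taken from the patch.
 */
static long
join_continuations(const char *in, char *out, size_t outsz)
{
    size_t      n = 0;

    while (*in != '\0')
    {
        if (in[0] == '\\' && in[1] == '\n')
        {
            in += 2;            /* skip the continuation marker */
            continue;
        }
        if (n + 1 >= outsz)
            return -1;          /* no room for this byte plus the NUL */
        out[n++] = *in++;
    }
    if (outsz == 0)
        return -1;
    out[n] = '\0';
    return (long) n;
}
```

A meta command spread over three physical lines is thus handed to the
command parser as a single logical line; a backslash that is not
followed by a newline, such as the one that starts the meta command,
is left untouched.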

3. Changing the way SQL/META commands are held from an array to
a linked list.

Patch #2 adds a new member "next" to Command_t. It is
necessary to handle SQL multistatements. But the caller
process_file holds the commands in an array of Command_t. This is
apparently not a desirable form. Since the linked list seems
simpler for this usage (to me), I decided to unify them into the
linked list. Most of this patch is rather simple one-by-one
replacement.
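
The shape of such a list can be sketched as follows. This is an
illustration under assumptions rather than the patch's data structure:
DemoCommand is a reduced, hypothetical node (only "next" and
"command_num" correspond to names that appear in the patch), and
wrapping back to the head after the last command is an assumption made
for the example.

```c
#include <stdlib.h>

/* Hypothetical, reduced command node; the real struct has more fields. */
typedef struct DemoCommand
{
    int         command_num;    /* position within the script, from 0 */
    int         is_meta;        /* nonzero for a backslash command */
    struct DemoCommand *next;   /* following command, or NULL at the end */
} DemoCommand;

/*
 * Append a new command after 'tail' (or start a list if 'tail' is NULL)
 * and return the new node, or NULL on allocation failure.
 */
static DemoCommand *
demo_append(DemoCommand *tail, int is_meta)
{
    DemoCommand *cmd = malloc(sizeof(DemoCommand));

    if (cmd == NULL)
        return NULL;
    cmd->command_num = (tail != NULL) ? tail->command_num + 1 : 0;
    cmd->is_meta = is_meta;
    cmd->next = NULL;
    if (tail != NULL)
        tail->next = cmd;
    return cmd;
}

/* Advance a client's cursor; wrap to 'head' after the last command. */
static DemoCommand *
demo_advance(DemoCommand *head, DemoCommand *curr)
{
    return (curr->next != NULL) ? curr->next : head;
}

/* Free every node of the list. */
static void
demo_free(DemoCommand *head)
{
    while (head != NULL)
    {
        DemoCommand *next = head->next;

        free(head);
        head = next;
    }
}
```

With a per-client cursor of this kind, a test such as
"curr->type == META_COMMAND" needs no array index, which matches the
way the diff replaces commands[st->state] with st->curr.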

I hope this submission note makes sense and this patch becomes
easy to read.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Make-SQL-parser-part-of-psqlscan-independent-from-ps.patch (text/x-patch; charset=us-ascii)
From c93e180e1a2bab8227be980e5d2a7584a55aef05 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 23 Jul 2015 20:44:37 +0900
Subject: [PATCH 1/3] Make SQL parser part of psqlscan independent from psql.

Moved psql's baskslsh command stuff out of original psqlscan.l and
some psql stuff the parser directly looked are used via a set of
callback functions, which can be all NULL for usages from other than
psql.
---
 src/bin/psql/Makefile             |   14 +-
 src/bin/psql/command.c            |    1 +
 src/bin/psql/common.c             |   54 +
 src/bin/psql/common.h             |    8 +
 src/bin/psql/mainloop.c           |   20 +-
 src/bin/psql/psqlscan.c           |   18 +
 src/bin/psql/psqlscan.h           |   42 +-
 src/bin/psql/psqlscan.l           | 1988 -------------------------------------
 src/bin/psql/psqlscan_int.h       |   84 ++
 src/bin/psql/psqlscan_slash.c     |   19 +
 src/bin/psql/psqlscan_slash.h     |   31 +
 src/bin/psql/psqlscan_slashbody.l |  766 ++++++++++++++
 src/bin/psql/psqlscanbody.l       | 1438 +++++++++++++++++++++++++++
 src/bin/psql/startup.c            |    9 +-
 14 files changed, 2454 insertions(+), 2038 deletions(-)
 create mode 100644 src/bin/psql/psqlscan.c
 delete mode 100644 src/bin/psql/psqlscan.l
 create mode 100644 src/bin/psql/psqlscan_int.h
 create mode 100644 src/bin/psql/psqlscan_slash.c
 create mode 100644 src/bin/psql/psqlscan_slash.h
 create mode 100644 src/bin/psql/psqlscan_slashbody.l
 create mode 100644 src/bin/psql/psqlscanbody.l

diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 66e14fb..05a60fe 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o psqlscan.o psqlscan_slash.o\
 	$(WIN32RES)
 
 
@@ -44,13 +44,13 @@ sql_help.c: sql_help.h ;
 sql_help.h: create_help.pl $(wildcard $(REFDOCDIR)/*.sgml)
 	$(PERL) $< $(REFDOCDIR) $*
 
-# psqlscan is compiled as part of mainloop
-mainloop.o: psqlscan.c
+psqlscan.o: psqlscan.c psqlscanbody.c common.h psqlscan_int.h
+psqlscan_slash.o: psqlscan_slash.c psqlscan_slashbody.c common.h psqlscan_int.h
 
-psqlscan.c: FLEXFLAGS = -Cfe -p -p
-psqlscan.c: FLEX_NO_BACKUP=yes
+psqlscanbody.c psqlscan_slashbody.c: FLEXFLAGS = -Cfe -p -p
+psqlscanbody.c psqlscan_slashbody.c: FLEX_NO_BACKUP=yes
 
-distprep: sql_help.h psqlscan.c
+distprep: sql_help.h psqlscanbody.c psqlscan_slashbody.c
 
 install: all installdirs
 	$(INSTALL_PROGRAM) psql$(X) '$(DESTDIR)$(bindir)/psql$(X)'
@@ -67,4 +67,4 @@ clean distclean:
 	rm -f psql$(X) $(OBJS) dumputils.c keywords.c kwlookup.c lex.backup
 
 maintainer-clean: distclean
-	rm -f sql_help.h sql_help.c psqlscan.c
+	rm -f sql_help.h sql_help.c psqlscanbody.c psqlscan_slashbody.c
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9750a5b..e42fca7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include "mainloop.h"
 #include "print.h"
 #include "psqlscan.h"
+#include "psqlscan_slash.h"
 #include "settings.h"
 #include "variables.h"
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2cb2e9b..79eb04e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -30,6 +30,9 @@ static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);
 
+PsqlScanCallbacks psqlscan_callbacks =
+{&get_variable, &psql_mblen, &standard_strings, &psql_error};
+
 
 /*
  * openQueryOutputFile --- attempt to open a query output file
@@ -1901,3 +1904,54 @@ recognized_connection_string(const char *connstr)
 {
 	return uri_prefix_length(connstr) != 0 || strchr(connstr, '=') != NULL;
 }
+
+/* Access callback to "shell variables" for lexer */
+const char *
+get_variable(const char *name, bool escape, bool as_ident,
+			 void (**free_func)(void *))
+{
+	const char *value;
+	char   *escaped_value;
+
+	*free_func = NULL;
+
+	value = GetVariable(pset.vars, name);
+
+	if (!escape)
+		return value;
+
+	/* Escaping. */
+
+	if (!value)
+		return NULL;
+
+	if (!pset.db)
+	{
+		psql_error("can't escape without active connection\n");
+		return NULL;
+	}
+
+	if (as_ident)
+		escaped_value =
+			PQescapeIdentifier(pset.db, value, strlen(value));
+	else
+		escaped_value =
+			PQescapeLiteral(pset.db, value, strlen(value));
+
+	if (escaped_value == NULL)
+	{
+		const char *error = PQerrorMessage(pset.db);
+
+		psql_error("%s", error);
+		return NULL;
+	}
+
+	*free_func = &PQfreemem;
+	return escaped_value;
+}
+
+int
+psql_mblen(const char *s)
+{
+	return PQmblen(s, pset.encoding);
+}
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 6ba3f44..686503a 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -13,6 +13,7 @@
 #include "libpq-fe.h"
 
 #include "print.h"
+#include "psqlscan.h"
 
 #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
 
@@ -29,6 +30,8 @@ extern sigjmp_buf sigint_interrupt_jmp;
 
 extern volatile bool cancel_pressed;
 
+extern PsqlScanCallbacks psqlscan_callbacks;
+
 /* Note: cancel_pressed is defined in print.c, see that file for reasons */
 
 extern void setup_cancel_handler(void);
@@ -49,4 +52,9 @@ extern void expand_tilde(char **filename);
 
 extern bool recognized_connection_string(const char *connstr);
 
+extern const char *get_variable(const char *name, bool escape, bool as_ident,
+								void (**free_func)(void *));
+
+extern int psql_mblen(const char *s);
+
 #endif   /* COMMON_H */
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index dadbd29..13424be 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -16,7 +16,6 @@
 
 #include "mb/pg_wchar.h"
 
-
 /*
  * Main processing loop for reading lines of input
  *	and sending them to the backend.
@@ -233,7 +232,11 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		/* set enc_mblen according to the encoding */
+		psqlscan_callbacks.enc_mblen =
+			(pg_valid_server_encoding_id(pset.encoding) ? NULL : &psql_mblen);
+
+		psql_scan_setup(scan_state, line, strlen(line),	&psqlscan_callbacks);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +376,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									&psqlscan_callbacks);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
@@ -450,13 +454,3 @@ MainLoop(FILE *source)
 
 	return successResult;
 }	/* MainLoop() */
-
-
-/*
- * psqlscan.c is #include'd here instead of being compiled on its own.
- * This is because we need postgres_fe.h to be read before any system
- * include files, else things tend to break on platforms that have
- * multiple infrastructures for stdio.h and so on.  flex is absolutely
- * uncooperative about that, so we can't compile psqlscan.c on its own.
- */
-#include "psqlscan.c"
diff --git a/src/bin/psql/psqlscan.c b/src/bin/psql/psqlscan.c
new file mode 100644
index 0000000..de7f746
--- /dev/null
+++ b/src/bin/psql/psqlscan.c
@@ -0,0 +1,18 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan.c
+ *
+ */
+
+/*
+ * psqlscanbody.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system
+ * include files, else things tend to break on platforms that have
+ * multiple infrastructures for stdio.h and so on.  flex is absolutely
+ * uncooperative about that, so we can't compile psqlscan.c on its own.
+ */
+#include "postgres_fe.h"
+#include "psqlscanbody.c"
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 674ba69..322edd3 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -12,10 +12,20 @@
 
 #include "prompt.h"
 
-
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
 
+typedef struct PsqlScanCallbacks
+{
+	const char *(*get_variable)(const char *, bool escape, bool as_ident,
+								void (**free_fn)(void *));
+	/* enc_mblen is needed only if encoding is not safe */
+	int	 (*enc_mblen)(const char *);
+	bool (*standard_strings)(void); /* standard_conforming_strings */
+	void (*error_out)(const char *fmt, ...) /* write error message */
+		pg_attribute_printf(1, 2);
+} PsqlScanCallbacks;
+
 /* Termination states for psql_scan() */
 typedef enum
 {
@@ -25,40 +35,18 @@ typedef enum
 	PSCAN_EOL					/* end of line, SQL possibly complete */
 } PsqlScanResult;
 
-/* Different ways for scan_slash_option to handle parameter words */
-enum slash_option_type
-{
-	OT_NORMAL,					/* normal case */
-	OT_SQLID,					/* treat as SQL identifier */
-	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
-	OT_FILEPIPE,				/* it's a filename or pipe */
-	OT_WHOLE_LINE,				/* just snarf the rest of the line */
-	OT_NO_EVAL					/* no expansion of backticks or variables */
-};
-
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PsqlScanCallbacks *callbacks);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt);
+								PQExpBuffer query_buf,
+								promptStatus_t *prompt);
 
 extern void psql_scan_reset(PsqlScanState state);
-
 extern bool psql_scan_in_quote(PsqlScanState state);
 
-extern char *psql_scan_slash_command(PsqlScanState state);
-
-extern char *psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon);
-
-extern void psql_scan_slash_command_end(PsqlScanState state);
-
 #endif   /* PSQLSCAN_H */
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
deleted file mode 100644
index bbe0172..0000000
--- a/src/bin/psql/psqlscan.l
+++ /dev/null
@@ -1,1988 +0,0 @@
-%{
-/*-------------------------------------------------------------------------
- *
- * psqlscan.l
- *	  lexical scanner for psql
- *
- * This code is mainly needed to determine where the end of a SQL statement
- * is: we are looking for semicolons that are not within quotes, comments,
- * or parentheses.  The most reliable way to handle this is to borrow the
- * backend's flex lexer rules, lock, stock, and barrel.  The rules below
- * are (except for a few) the same as the backend's, but their actions are
- * just ECHO whereas the backend's actions generally do other things.
- *
- * XXX The rules in this file must be kept in sync with the backend lexer!!!
- *
- * XXX Avoid creating backtracking cases --- see the backend lexer for info.
- *
- * The most difficult aspect of this code is that we need to work in multibyte
- * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
- * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
- * all our lexing rules treat all high-bit-set characters alike, we don't
- * really need to care whether such a byte is part of a sequence or not.
- * In an "unsafe" encoding, we still expect the first byte of a multibyte
- * sequence to be >= 0x80, but later bytes might not be.  If we scan such
- * a sequence as-is, the lexing rules could easily be fooled into matching
- * such bytes to ordinary ASCII characters.  Our solution for this is to
- * substitute 0xFF for each non-first byte within the data presented to flex.
- * The flex rules will then pass the FF's through unmolested.  The emit()
- * subroutine is responsible for looking back to the original string and
- * replacing FF's with the corresponding original bytes.
- *
- * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- *	  src/bin/psql/psqlscan.l
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres_fe.h"
-
-#include "psqlscan.h"
-
-#include <ctype.h>
-
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
-	YY_BUFFER_STATE buf;		/* flex input control structure */
-	char	   *bufstring;		/* data actually being scanned by flex */
-	char	   *origstring;		/* copy of original data, if needed */
-	char	   *varname;		/* name of variable providing data, or NULL */
-	struct StackElem *next;
-} StackElem;
-
-/*
- * All working state of the lexer must be stored in PsqlScanStateData
- * between calls.  This allows us to have multiple open lexer operations,
- * which is needed for nested include files.  The lexer itself is not
- * recursive, but it must be re-entrant.
- */
-typedef struct PsqlScanStateData
-{
-	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
-	/*
-	 * These variables always refer to the outer buffer, never to any
-	 * stacked variable-expansion buffer.
-	 */
-	YY_BUFFER_STATE scanbufhandle;
-	char	   *scanbuf;		/* start of outer-level input buffer */
-	const char *scanline;		/* current input line at outer level */
-
-	/* safe_encoding, curline, refline are used by emit() to replace FFs */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
-	const char *curline;		/* actual flex input string for cur buf */
-	const char *refline;		/* original data for cur buffer */
-
-	/*
-	 * All this state lives across successive input lines, until explicitly
-	 * reset by psql_scan_reset.
-	 */
-	int			start_state;	/* saved YY_START */
-	int			paren_depth;	/* depth of nesting in parentheses */
-	int			xcdepth;		/* depth of nesting in slash-star comments */
-	char	   *dolqstart;		/* current $foo$ quote start string */
-} PsqlScanStateData;
-
-static PsqlScanState cur_state;	/* current state while active */
-
-static PQExpBuffer output_buf;	/* current output buffer */
-
-/* these variables do not need to be saved across calls */
-static enum slash_option_type option_type;
-static char *option_quote;
-static int	unquoted_option_chars;
-static int	backtick_start_offset;
-
-
-/* Return values from yylex() */
-#define LEXRES_EOL			0	/* end of input */
-#define LEXRES_SEMI			1	/* command-terminating semicolon found */
-#define LEXRES_BACKSLASH	2	/* backslash command start */
-#define LEXRES_OK			3	/* OK completion of backslash argument */
-
-
-static void evaluate_backtick(void);
-static void push_new_buffer(const char *newstr, const char *varname);
-static void pop_buffer_stack(PsqlScanState state);
-static bool var_is_current_source(PsqlScanState state, const char *varname);
-static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
-									  char **txtcopy);
-static void emit(const char *txt, int len);
-static char *extract_substring(const char *txt, int len);
-static void escape_variable(bool as_ident);
-
-#define ECHO emit(yytext, yyleng)
-
-%}
-
-%option 8bit
-%option never-interactive
-%option nodefault
-%option noinput
-%option nounput
-%option noyywrap
-%option warn
-
-/*
- * All of the following definitions and rules should exactly match
- * src/backend/parser/scan.l so far as the flex patterns are concerned.
- * The rule bodies are just ECHO as opposed to what the backend does,
- * however.  (But be sure to duplicate code that affects the lexing process,
- * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
- * scan.l has a separate one for each exclusive state.
- */
-
-/*
- * OK, here is a short description of lex/flex rules behavior.
- * The longest pattern which matches an input string is always chosen.
- * For equal-length patterns, the first occurring in the rules list is chosen.
- * INITIAL is the starting state, to which all non-conditional rules apply.
- * Exclusive states change parsing rules while the state is active.  When in
- * an exclusive state, only those rules defined for that state apply.
- *
- * We use exclusive states for quoted strings, extended comments,
- * and to eliminate parsing troubles for numeric strings.
- * Exclusive states:
- *  <xb> bit string literal
- *  <xc> extended C-style comments
- *  <xd> delimited identifiers (double-quoted identifiers)
- *  <xh> hexadecimal numeric string
- *  <xq> standard quoted strings
- *  <xe> extended quoted strings (support backslash escape sequences)
- *  <xdolq> $foo$ quoted strings
- *  <xui> quoted identifier with Unicode escapes
- *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
- *  <xus> quoted string with Unicode escapes
- *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
- *
- * Note: we intentionally don't mimic the backend's <xeu> state; we have
- * no need to distinguish it from <xe> state, and no good way to get out
- * of it in error cases.  The backend just throws yyerror() in those
- * cases, but that's not an option here.
- */
-
-%x xb
-%x xc
-%x xd
-%x xh
-%x xe
-%x xq
-%x xdolq
-%x xui
-%x xuiend
-%x xus
-%x xusend
-/* Additional exclusive states for psql only: lex backslash commands */
-%x xslashcmd
-%x xslashargstart
-%x xslasharg
-%x xslashquote
-%x xslashbackquote
-%x xslashdquote
-%x xslashwholeline
-%x xslashend
-
-/*
- * In order to make the world safe for Windows and Mac clients as well as
- * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
- * sequence will be seen as two successive newlines, but that doesn't cause
- * any problems.  Comments that start with -- and extend to the next
- * newline are treated as equivalent to a single whitespace character.
- *
- * NOTE a fine point: if there is no newline following --, we will absorb
- * everything to the end of the input as a comment.  This is correct.  Older
- * versions of Postgres failed to recognize -- as a comment if the input
- * did not end with a newline.
- *
- * XXX perhaps \f (formfeed) should be treated as a newline as well?
- *
- * XXX if you change the set of whitespace characters, fix scanner_isspace()
- * to agree, and see also the plpgsql lexer.
- */
-
-space			[ \t\n\r\f]
-horiz_space		[ \t\f]
-newline			[\n\r]
-non_newline		[^\n\r]
-
-comment			("--"{non_newline}*)
-
-whitespace		({space}+|{comment})
-
-/*
- * SQL requires at least one newline in the whitespace separating
- * string literals that are to be concatenated.  Silly, but who are we
- * to argue?  Note that {whitespace_with_newline} should not have * after
- * it, whereas {whitespace} should generally have a * after it...
- */
-
-special_whitespace		({space}+|{comment}{newline})
-horiz_whitespace		({horiz_space}|{comment})
-whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
-
-/*
- * To ensure that {quotecontinue} can be scanned without having to back up
- * if the full pattern isn't matched, we include trailing whitespace in
- * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
- * except for {quote} followed by whitespace and just one "-" (not two,
- * which would start a {comment}).  To cover that we have {quotefail}.
- * The actions for {quotestop} and {quotefail} must throw back characters
- * beyond the quote proper.
- */
-quote			'
-quotestop		{quote}{whitespace}*
-quotecontinue	{quote}{whitespace_with_newline}{quote}
-quotefail		{quote}{whitespace}*"-"
-
-/* Bit string
- * It is tempting to scan the string for only those characters
- * which are allowed. However, this leads to silently swallowed
- * characters if illegal characters are included in the string.
- * For example, if xbinside is [01] then B'ABCD' is interpreted
- * as a zero-length string, and the ABCD' is lost!
- * Better to pass the string forward and let the input routines
- * validate the contents.
- */
-xbstart			[bB]{quote}
-xbinside		[^']*
-
-/* Hexadecimal number */
-xhstart			[xX]{quote}
-xhinside		[^']*
-
-/* National character */
-xnstart			[nN]{quote}
-
-/* Quoted string that allows backslash escapes */
-xestart			[eE]{quote}
-xeinside		[^\\']+
-xeescape		[\\][^0-7]
-xeoctesc		[\\][0-7]{1,3}
-xehexesc		[\\]x[0-9A-Fa-f]{1,2}
-xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
-xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
-
-/* Extended quote
- * xqdouble implements embedded quote, ''''
- */
-xqstart			{quote}
-xqdouble		{quote}{quote}
-xqinside		[^']+
-
-/* $foo$ style quotes ("dollar quoting")
- * The quoted string starts with $foo$ where "foo" is an optional string
- * in the form of an identifier, except that it may not contain "$",
- * and extends to the first occurrence of an identical string.
- * There is *no* processing of the quoted text.
- *
- * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
- * fails to match its trailing "$".
- */
-dolq_start		[A-Za-z\200-\377_]
-dolq_cont		[A-Za-z\200-\377_0-9]
-dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
-dolqfailed		\${dolq_start}{dolq_cont}*
-dolqinside		[^$]+
-
-/* Double quote
- * Allows embedded spaces and other special characters into identifiers.
- */
-dquote			\"
-xdstart			{dquote}
-xdstop			{dquote}
-xddouble		{dquote}{dquote}
-xdinside		[^"]+
-
-/* Unicode escapes */
-uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
-/* error rule to avoid backup */
-uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
-
-/* Quoted identifier with Unicode escapes */
-xuistart		[uU]&{dquote}
-
-/* Quoted string with Unicode escapes */
-xusstart		[uU]&{quote}
-
-/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
-xustop1		{uescapefail}?
-xustop2		{uescape}
-
-/* error rule to avoid backup */
-xufailed		[uU]&
-
-
-/* C-style comments
- *
- * The "extended comment" syntax closely resembles allowable operator syntax.
- * The tricky part here is to get lex to recognize a string starting with
- * slash-star as a comment, when interpreting it as an operator would produce
- * a longer match --- remember lex will prefer a longer match!  Also, if we
- * have something like plus-slash-star, lex will think this is a 3-character
- * operator whereas we want to see it as a + operator and a comment start.
- * The solution is two-fold:
- * 1. append {op_chars}* to xcstart so that it matches as much text as
- *    {operator} would. Then the tie-breaker (first matching rule of same
- *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
- *    in case it contains a star-slash that should terminate the comment.
- * 2. In the operator rule, check for slash-star within the operator, and
- *    if found throw it back with yyless().  This handles the plus-slash-star
- *    problem.
- * Dash-dash comments have similar interactions with the operator rule.
- */
-xcstart			\/\*{op_chars}*
-xcstop			\*+\/
-xcinside		[^*/]+
-
-digit			[0-9]
-ident_start		[A-Za-z\200-\377_]
-ident_cont		[A-Za-z\200-\377_0-9\$]
-
-identifier		{ident_start}{ident_cont}*
-
-/* Assorted special-case operators and operator-like tokens */
-typecast		"::"
-dot_dot			\.\.
-colon_equals	":="
-equals_greater	"=>"
-less_equals		"<="
-greater_equals	">="
-less_greater	"<>"
-not_equals		"!="
-
-/*
- * "self" is the set of chars that should be returned as single-character
- * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
- * which can be one or more characters long (but if a single-char token
- * appears in the "self" set, it is not to be returned as an Op).  Note
- * that the sets overlap, but each has some chars that are not in the other.
- *
- * If you change either set, adjust the character lists appearing in the
- * rule for "operator"!
- */
-self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
-op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
-operator		{op_chars}+
-
-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
- *
- * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
- *
- * {realfail1} and {realfail2} are added to prevent the need for scanner
- * backup when the {real} rule fails to match completely.
- */
-
-integer			{digit}+
-decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
-decimalfail		{digit}+\.\.
-real			({integer}|{decimal})[Ee][-+]?{digit}+
-realfail1		({integer}|{decimal})[Ee]
-realfail2		({integer}|{decimal})[Ee][-+]
-
-param			\${integer}
-
-/* psql-specific: characters allowed in variable names */
-variable_char	[A-Za-z\200-\377_0-9]
-
-other			.
-
-/*
- * Dollar quoted strings are totally opaque, and no escaping is done on them.
- * Other quoted strings must allow some special characters such as single-quote
- *  and newline.
- * Embedded single-quotes are implemented both in the SQL standard
- *  style of two adjacent single quotes "''" and in the Postgres/Java style
- *  of escaped-quote "\'".
- * Other embedded escaped characters are matched explicitly and the leading
- *  backslash is dropped from the string.
- * Note that xcstart must appear before operator, as explained above!
- *  Also whitespace (comment) must appear before operator.
- */
-
-%%
-
-{whitespace}	{
-					/*
-					 * Note that the whitespace rule includes both true
-					 * whitespace and single-line ("--" style) comments.
-					 * We suppress whitespace at the start of the query
-					 * buffer.  We also suppress all single-line comments,
-					 * which is pretty dubious but is the historical
-					 * behavior.
-					 */
-					if (!(output_buf->len == 0 || yytext[0] == '-'))
-						ECHO;
-				}
-
-{xcstart}		{
-					cur_state->xcdepth = 0;
-					BEGIN(xc);
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstart}	{
-					cur_state->xcdepth++;
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstop}	{
-					if (cur_state->xcdepth <= 0)
-					{
-						BEGIN(INITIAL);
-					}
-					else
-						cur_state->xcdepth--;
-					ECHO;
-				}
-
-<xc>{xcinside}	{
-					ECHO;
-				}
-
-<xc>{op_chars}	{
-					ECHO;
-				}
-
-<xc>\*+			{
-					ECHO;
-				}
-
-{xbstart}		{
-					BEGIN(xb);
-					ECHO;
-				}
-<xb>{quotestop}	|
-<xb>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xh>{xhinside}	|
-<xb>{xbinside}	{
-					ECHO;
-				}
-<xh>{quotecontinue}	|
-<xb>{quotecontinue}	{
-					ECHO;
-				}
-
-{xhstart}		{
-					/* Hexadecimal bit type.
-					 * At some point we should simply pass the string
-					 * forward to the parser and label it there.
-					 * In the meantime, place a leading "x" on the string
-					 * to mark it for the input routine as a hex string.
-					 */
-					BEGIN(xh);
-					ECHO;
-				}
-<xh>{quotestop}	|
-<xh>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-
-{xnstart}		{
-					yyless(1);				/* eat only 'n' this time */
-					ECHO;
-				}
-
-{xqstart}		{
-					if (standard_strings())
-						BEGIN(xq);
-					else
-						BEGIN(xe);
-					ECHO;
-				}
-{xestart}		{
-					BEGIN(xe);
-					ECHO;
-				}
-{xusstart}		{
-					BEGIN(xus);
-					ECHO;
-				}
-<xq,xe>{quotestop}	|
-<xq,xe>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xus>{quotestop} |
-<xus>{quotefail} {
-					yyless(1);
-					BEGIN(xusend);
-					ECHO;
-				}
-<xusend>{whitespace} {
-					ECHO;
-				}
-<xusend>{other} |
-<xusend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xusend>{xustop2} {
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xq,xe,xus>{xqdouble} {
-					ECHO;
-				}
-<xq,xus>{xqinside}  {
-					ECHO;
-				}
-<xe>{xeinside}  {
-					ECHO;
-				}
-<xe>{xeunicode} {
-					ECHO;
-				}
-<xe>{xeunicodefail}	{
-					ECHO;
-				}
-<xe>{xeescape}  {
-					ECHO;
-				}
-<xe>{xeoctesc}  {
-					ECHO;
-				}
-<xe>{xehexesc}  {
-					ECHO;
-				}
-<xq,xe,xus>{quotecontinue} {
-					ECHO;
-				}
-<xe>.			{
-					/* This is only needed for \ just before EOF */
-					ECHO;
-				}
-
-{dolqdelim}		{
-					cur_state->dolqstart = pg_strdup(yytext);
-					BEGIN(xdolq);
-					ECHO;
-				}
-{dolqfailed}	{
-					/* throw back all but the initial "$" */
-					yyless(1);
-					ECHO;
-				}
-<xdolq>{dolqdelim} {
-					if (strcmp(yytext, cur_state->dolqstart) == 0)
-					{
-						free(cur_state->dolqstart);
-						cur_state->dolqstart = NULL;
-						BEGIN(INITIAL);
-					}
-					else
-					{
-						/*
-						 * When we fail to match $...$ to dolqstart, transfer
-						 * the $... part to the output, but put back the final
-						 * $ for rescanning.  Consider $delim$...$junk$delim$
-						 */
-						yyless(yyleng-1);
-					}
-					ECHO;
-				}
-<xdolq>{dolqinside} {
-					ECHO;
-				}
-<xdolq>{dolqfailed} {
-					ECHO;
-				}
-<xdolq>.		{
-					/* This is only needed for $ inside the quoted text */
-					ECHO;
-				}
-
-{xdstart}		{
-					BEGIN(xd);
-					ECHO;
-				}
-{xuistart}		{
-					BEGIN(xui);
-					ECHO;
-				}
-<xd>{xdstop}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xui>{dquote} {
-					yyless(1);
-					BEGIN(xuiend);
-					ECHO;
-				}
-<xuiend>{whitespace} {
-					ECHO;
-				}
-<xuiend>{other} |
-<xuiend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xuiend>{xustop2}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xd,xui>{xddouble}	{
-					ECHO;
-				}
-<xd,xui>{xdinside}	{
-					ECHO;
-				}
-
-{xufailed}	{
-					/* throw back all but the initial u/U */
-					yyless(1);
-					ECHO;
-				}
-
-{typecast}		{
-					ECHO;
-				}
-
-{dot_dot}		{
-					ECHO;
-				}
-
-{colon_equals}	{
-					ECHO;
-				}
-
-{equals_greater} {
-					ECHO;
-				}
-
-{less_equals}	{
-					ECHO;
-				}
-
-{greater_equals} {
-					ECHO;
-				}
-
-{less_greater}	{
-					ECHO;
-				}
-
-{not_equals}	{
-					ECHO;
-				}
-
-	/*
-	 * These rules are specific to psql --- they implement parenthesis
-	 * counting and detection of command-ending semicolon.  These must
-	 * appear before the {self} rule so that they take precedence over it.
-	 */
-
-"("				{
-					cur_state->paren_depth++;
-					ECHO;
-				}
-
-")"				{
-					if (cur_state->paren_depth > 0)
-						cur_state->paren_depth--;
-					ECHO;
-				}
-
-";"				{
-					ECHO;
-					if (cur_state->paren_depth == 0)
-					{
-						/* Terminate lexing temporarily */
-						return LEXRES_SEMI;
-					}
-				}
-
-	/*
-	 * psql-specific rules to handle backslash commands and variable
-	 * substitution.  We want these before {self}, also.
-	 */
-
-"\\"[;:]		{
-					/* Force a semicolon or colon into the query buffer */
-					emit(yytext + 1, 1);
-				}
-
-"\\"			{
-					/* Terminate lexing temporarily */
-					return LEXRES_BACKSLASH;
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
-
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
-
-					if (value)
-					{
-						/* It is a variable, check for recursion */
-						if (var_is_current_source(cur_state, varname))
-						{
-							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
-									   varname);
-							/* Instead copy the string as is */
-							ECHO;
-						}
-						else
-						{
-							/* OK, perform substitution */
-							push_new_buffer(value, varname);
-							/* yy_scan_string already made buffer active */
-						}
-					}
-					else
-					{
-						/*
-						 * if the variable doesn't exist we'll copy the
-						 * string as is
-						 */
-						ECHO;
-					}
-
-					free(varname);
-				}
-
-:'{variable_char}+'	{
-					escape_variable(false);
-				}
-
-:\"{variable_char}+\"	{
-					escape_variable(true);
-				}
-
-	/*
-	 * These rules just avoid the need for scanner backup if one of the
-	 * two rules above fails to match completely.
-	 */
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-	/*
-	 * Back to backend-compatible rules.
-	 */
-
-{self}			{
-					ECHO;
-				}
-
-{operator}		{
-					/*
-					 * Check for embedded slash-star or dash-dash; those
-					 * are comment starts, so operator must stop there.
-					 * Note that slash-star or dash-dash at the first
-					 * character will match a prior rule, not this one.
-					 */
-					int		nchars = yyleng;
-					char   *slashstar = strstr(yytext, "/*");
-					char   *dashdash = strstr(yytext, "--");
-
-					if (slashstar && dashdash)
-					{
-						/* if both appear, take the first one */
-						if (slashstar > dashdash)
-							slashstar = dashdash;
-					}
-					else if (!slashstar)
-						slashstar = dashdash;
-					if (slashstar)
-						nchars = slashstar - yytext;
-
-					/*
-					 * For SQL compatibility, '+' and '-' cannot be the
-					 * last char of a multi-char operator unless the operator
-					 * contains chars that are not in SQL operators.
-					 * The idea is to lex '=-' as two operators, but not
-					 * to forbid operator names like '?-' that could not be
-					 * sequences of SQL operators.
-					 */
-					while (nchars > 1 &&
-						   (yytext[nchars-1] == '+' ||
-							yytext[nchars-1] == '-'))
-					{
-						int		ic;
-
-						for (ic = nchars-2; ic >= 0; ic--)
-						{
-							if (strchr("~!@#^&|`?%", yytext[ic]))
-								break;
-						}
-						if (ic >= 0)
-							break; /* found a char that makes it OK */
-						nchars--; /* else remove the +/-, and check again */
-					}
-
-					if (nchars < yyleng)
-					{
-						/* Strip the unwanted chars from the token */
-						yyless(nchars);
-					}
-					ECHO;
-				}
-
-{param}			{
-					ECHO;
-				}
-
-{integer}		{
-					ECHO;
-				}
-{decimal}		{
-					ECHO;
-				}
-{decimalfail}	{
-					/* throw back the .., and treat as integer */
-					yyless(yyleng-2);
-					ECHO;
-				}
-{real}			{
-					ECHO;
-				}
-{realfail1}		{
-					/*
-					 * throw back the [Ee], and treat as {decimal}.  Note
-					 * that it is possible the input is actually {integer},
-					 * but since this case will almost certainly lead to a
-					 * syntax error anyway, we don't bother to distinguish.
-					 */
-					yyless(yyleng-1);
-					ECHO;
-				}
-{realfail2}		{
-					/* throw back the [Ee][+-], and proceed as above */
-					yyless(yyleng-2);
-					ECHO;
-				}
-
-
-{identifier}	{
-					ECHO;
-				}
-
-{other}			{
-					ECHO;
-				}
-
-
-	/*
-	 * Everything from here down is psql-specific.
-	 */
-
-<<EOF>>			{
-					StackElem  *stackelem = cur_state->buffer_stack;
-
-					if (stackelem == NULL)
-						return LEXRES_EOL; /* end of input reached */
-
-					/*
-					 * We were expanding a variable, so pop the inclusion
-					 * stack and keep lexing
-					 */
-					pop_buffer_stack(cur_state);
-
-					stackelem = cur_state->buffer_stack;
-					if (stackelem != NULL)
-					{
-						yy_switch_to_buffer(stackelem->buf);
-						cur_state->curline = stackelem->bufstring;
-						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
-					}
-					else
-					{
-						yy_switch_to_buffer(cur_state->scanbufhandle);
-						cur_state->curline = cur_state->scanbuf;
-						cur_state->refline = cur_state->scanline;
-					}
-				}
-
-	/*
-	 * Exclusive lexer states to handle backslash command lexing
-	 */
-
-<xslashcmd>{
-	/* command name ends at whitespace or backslash; eat all else */
-
-{space}|"\\"	{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashargstart>{
-	/*
-	 * Discard any whitespace before argument, then go to xslasharg state.
-	 * An exception is that "|" is only special at start of argument, so we
-	 * check for it here.
-	 */
-
-{space}+		{ }
-
-"|"				{
-					if (option_type == OT_FILEPIPE)
-					{
-						/* treat like whole-string case */
-						ECHO;
-						BEGIN(xslashwholeline);
-					}
-					else
-					{
-						/* vertical bar is not special otherwise */
-						yyless(0);
-						BEGIN(xslasharg);
-					}
-				}
-
-{other}			{
-					yyless(0);
-					BEGIN(xslasharg);
-				}
-
-}
-
-<xslasharg>{
-	/*
-	 * Default processing of text in a slash command's argument.
-	 *
-	 * Note: unquoted_option_chars counts the number of characters at the
-	 * end of the argument that were not subject to any form of quoting.
-	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
-	 */
-
-{space}|"\\"	{
-					/*
-					 * Unquoted space is end of arg; do not eat.  Likewise
-					 * backslash is end of command or next command, do not eat
-					 *
-					 * XXX this means we can't conveniently accept options
-					 * that include unquoted backslashes; therefore, option
-					 * processing that encourages use of backslashes is rather
-					 * broken.
-					 */
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{quote}			{
-					*option_quote = '\'';
-					unquoted_option_chars = 0;
-					BEGIN(xslashquote);
-				}
-
-"`"				{
-					backtick_start_offset = output_buf->len;
-					*option_quote = '`';
-					unquoted_option_chars = 0;
-					BEGIN(xslashbackquote);
-				}
-
-{dquote}		{
-					ECHO;
-					*option_quote = '"';
-					unquoted_option_chars = 0;
-					BEGIN(xslashdquote);
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						char   *varname;
-						const char *value;
-
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
-
-						/*
-						 * The variable value is just emitted without any
-						 * further examination.  This is consistent with the
-						 * pre-8.0 code behavior, if not with the way that
-						 * variables are handled outside backslash commands.
-						 * Note that we needn't guard against recursion here.
-						 */
-						if (value)
-							appendPQExpBufferStr(output_buf, value);
-						else
-							ECHO;
-
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}+'	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(false);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-
-:\"{variable_char}+\"	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(true);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-{other}			{
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-}
-
-<xslashquote>{
-	/*
-	 * single-quoted text: copy literally except for '' and backslash
-	 * sequences
-	 */
-
-{quote}			{ BEGIN(xslasharg); }
-
-{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
-
-"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
-"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
-"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
-"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
-"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
-
-{xeoctesc}		{
-					/* octal case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 1, NULL, 8));
-				}
-
-{xehexesc}		{
-					/* hex case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 2, NULL, 16));
-				}
-
-"\\".			{ emit(yytext + 1, 1); }
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashbackquote>{
-	/*
-	 * backticked text: copy everything until next backquote, then evaluate.
-	 *
-	 * XXX Possible future behavioral change: substitute for :VARIABLE?
-	 */
-
-"`"				{
-					/* In NO_EVAL mode, don't evaluate the command */
-					if (option_type != OT_NO_EVAL)
-						evaluate_backtick();
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashdquote>{
-	/* double-quoted text: copy verbatim, including the double quotes */
-
-{dquote}		{
-					ECHO;
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashwholeline>{
-	/* copy everything until end of input line */
-	/* but suppress leading whitespace */
-
-{space}+		{
-					if (output_buf->len > 0)
-						ECHO;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashend>{
-	/* at end of command, eat a double backslash, but not anything else */
-
-"\\\\"			{ return LEXRES_OK; }
-
-{other}|\n		{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-}
-
-%%
-
-/*
- * Create a lexer working state struct.
- */
-PsqlScanState
-psql_scan_create(void)
-{
-	PsqlScanState state;
-
-	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
-
-	psql_scan_reset(state);
-
-	return state;
-}
-
-/*
- * Destroy a lexer working state struct, releasing all resources.
- */
-void
-psql_scan_destroy(PsqlScanState state)
-{
-	psql_scan_finish(state);
-
-	psql_scan_reset(state);
-
-	free(state);
-}
-
-/*
- * Set up to perform lexing of the given input line.
- *
- * The text at *line, extending for line_len bytes, will be scanned by
- * subsequent calls to the psql_scan routines.  psql_scan_finish should
- * be called when scanning is complete.  Note that the lexer retains
- * a pointer to the storage at *line --- this string must not be altered
- * or freed until after psql_scan_finish is called.
- */
-void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
-{
-	/* Mustn't be scanning already */
-	Assert(state->scanbufhandle == NULL);
-	Assert(state->buffer_stack == NULL);
-
-	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
-
-	/* needed for prepare_buffer */
-	cur_state = state;
-
-	/* Set up flex input buffer with appropriate translation and padding */
-	state->scanbufhandle = prepare_buffer(line, line_len,
-										  &state->scanbuf);
-	state->scanline = line;
-
-	/* Set lookaside data in case we have to map unsafe encoding */
-	state->curline = state->scanbuf;
-	state->refline = state->scanline;
-}
-
-/*
- * Do lexical analysis of SQL command text.
- *
- * The text previously passed to psql_scan_setup is scanned, and appended
- * (possibly with transformation) to query_buf.
- *
- * The return value indicates the condition that stopped scanning:
- *
- * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
- * transferred to query_buf.)  The command accumulated in query_buf should
- * be executed, then clear query_buf and call again to scan the remainder
- * of the line.
- *
- * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
- * Any previous data on the line has been transferred to query_buf.
- * The caller will typically next call psql_scan_slash_command(),
- * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
- *
- * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
- * incomplete SQL command.  *prompt is set to the appropriate prompt type.
- *
- * PSCAN_EOL: the end of the line was reached, and there is no lexical
- * reason to consider the command incomplete.  The caller may or may not
- * choose to send it.  *prompt is set to the appropriate prompt type if
- * the caller chooses to collect more input.
- *
- * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
- * be called next, then the cycle may be repeated with a fresh input line.
- *
- * In all cases, *prompt is set to an appropriate prompt type code for the
- * next line-input operation.
- */
-PsqlScanResult
-psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt)
-{
-	PsqlScanResult result;
-	int			lexresult;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = query_buf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(state->start_state);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/* Update static vars back to the state struct */
-	state->start_state = YY_START;
-
-	/*
-	 * Check termination state and return appropriate result info.
-	 */
-	switch (lexresult)
-	{
-		case LEXRES_EOL:		/* end of input */
-			switch (state->start_state)
-			{
-				/* This switch must cover all non-slash-command states. */
-				case INITIAL:
-				case xuiend:	/* we treat these like INITIAL */
-				case xusend:
-					if (state->paren_depth > 0)
-					{
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_PAREN;
-					}
-					else if (query_buf->len > 0)
-					{
-						result = PSCAN_EOL;
-						*prompt = PROMPT_CONTINUE;
-					}
-					else
-					{
-						/* never bother to send an empty buffer */
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_READY;
-					}
-					break;
-				case xb:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xc:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_COMMENT;
-					break;
-				case xd:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xh:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xe:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xdolq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOLLARQUOTE;
-					break;
-				case xui:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xus:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				default:
-					/* can't get here */
-					fprintf(stderr, "invalid YY_START\n");
-					exit(1);
-			}
-			break;
-		case LEXRES_SEMI:		/* semicolon */
-			result = PSCAN_SEMICOLON;
-			*prompt = PROMPT_READY;
-			break;
-		case LEXRES_BACKSLASH:	/* backslash */
-			result = PSCAN_BACKSLASH;
-			*prompt = PROMPT_READY;
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid yylex result\n");
-			exit(1);
-	}
-
-	return result;
-}
-
-/*
- * Clean up after scanning a string.  This flushes any unread input and
- * releases resources (but not the PsqlScanState itself).  Note however
- * that this does not reset the lexer scan state; that can be done by
- * psql_scan_reset(), which is an orthogonal operation.
- *
- * It is legal to call this when not scanning anything (makes it easier
- * to deal with error recovery).
- */
-void
-psql_scan_finish(PsqlScanState state)
-{
-	/* Drop any incomplete variable expansions. */
-	while (state->buffer_stack != NULL)
-		pop_buffer_stack(state);
-
-	/* Done with the outer scan buffer, too */
-	if (state->scanbufhandle)
-		yy_delete_buffer(state->scanbufhandle);
-	state->scanbufhandle = NULL;
-	if (state->scanbuf)
-		free(state->scanbuf);
-	state->scanbuf = NULL;
-}
-
-/*
- * Reset lexer scanning state to start conditions.  This is appropriate
- * for executing \r psql commands (or any other time that we discard the
- * prior contents of query_buf).  It is not, however, necessary to do this
- * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
- * PSCAN_EOL scan result, because the scan state must be INITIAL when those
- * conditions are returned.
- *
- * Note that this is unrelated to flushing unread input; that task is
- * done by psql_scan_finish().
- */
-void
-psql_scan_reset(PsqlScanState state)
-{
-	state->start_state = INITIAL;
-	state->paren_depth = 0;
-	state->xcdepth = 0;			/* not really necessary */
-	if (state->dolqstart)
-		free(state->dolqstart);
-	state->dolqstart = NULL;
-}
-
-/*
- * Return true if lexer is currently in an "inside quotes" state.
- *
- * This is pretty grotty but is needed to preserve the old behavior
- * that mainloop.c drops blank lines not inside quotes without even
- * echoing them.
- */
-bool
-psql_scan_in_quote(PsqlScanState state)
-{
-	return state->start_state != INITIAL;
-}
-
-/*
- * Scan the command name of a psql backslash command.  This should be called
- * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
- * has been consumed through the leading backslash.
- *
- * The return value is a malloc'd copy of the command name, as parsed off
- * from the input.
- */
-char *
-psql_scan_slash_command(PsqlScanState state)
-{
-	PQExpBufferData mybuf;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashcmd);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-
-	return mybuf.data;
-}
-
-/*
- * Parse off the next argument for a backslash command, and return it as a
- * malloc'd string.  If there are no more arguments, returns NULL.
- *
- * type tells what processing, if any, to perform on the option string;
- * for example, if it's a SQL identifier, we want to downcase any unquoted
- * letters.
- *
- * if quote is not NULL, *quote is set to 0 if no quoting was found, else
- * the last quote symbol used in the argument.
- *
- * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
- * be taken as part of the option string will be stripped.
- *
- * NOTE: the only possible syntax errors for backslash options are unmatched
- * quotes, which are detected when we run out of input.  Therefore, on a
- * syntax error we just throw away the string and return NULL; there is no
- * need to worry about flushing remaining input.
- */
-char *
-psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon)
-{
-	PQExpBufferData mybuf;
-	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
-	char		local_quote;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	if (quote == NULL)
-		quote = &local_quote;
-	*quote = 0;
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-	option_type = type;
-	option_quote = quote;
-	unquoted_option_chars = 0;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	if (type == OT_WHOLE_LINE)
-		BEGIN(xslashwholeline);
-	else
-		BEGIN(xslashargstart);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/*
-	 * Check the lex result: we should have gotten back either LEXRES_OK
-	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
-	 * a quoted string, as indicated by YY_START, EOL is an error.
-	 */
-	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
-
-	switch (YY_START)
-	{
-		case xslashargstart:
-			/* empty arg */
-			break;
-		case xslasharg:
-			/* Strip any unquoted trailing semi-colons if requested */
-			if (semicolon)
-			{
-				while (unquoted_option_chars-- > 0 &&
-					   mybuf.len > 0 &&
-					   mybuf.data[mybuf.len - 1] == ';')
-				{
-					mybuf.data[--mybuf.len] = '\0';
-				}
-			}
-
-			/*
-			 * If SQL identifier processing was requested, then we strip out
-			 * excess double quotes and downcase unquoted letters.
-			 * Doubled double-quotes become output double-quotes, per spec.
-			 *
-			 * Note that a string like FOO"BAR"BAZ will be converted to
-			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
-			 * which would have us parse it as several identifiers.  But
-			 * for psql's purposes, we want a string like "foo"."bar" to
-			 * be treated as one option, so there's little choice.
-			 */
-			if (type == OT_SQLID || type == OT_SQLIDHACK)
-			{
-				bool		inquotes = false;
-				char	   *cp = mybuf.data;
-
-				while (*cp)
-				{
-					if (*cp == '"')
-					{
-						if (inquotes && cp[1] == '"')
-						{
-							/* Keep the first quote, remove the second */
-							cp++;
-						}
-						inquotes = !inquotes;
-						/* Collapse out quote at *cp */
-						memmove(cp, cp + 1, strlen(cp));
-						mybuf.len--;
-						/* do not advance cp */
-					}
-					else
-					{
-						if (!inquotes && type == OT_SQLID)
-							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
-					}
-				}
-			}
-			break;
-		case xslashquote:
-		case xslashbackquote:
-		case xslashdquote:
-			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
-			termPQExpBuffer(&mybuf);
-			return NULL;
-		case xslashwholeline:
-			/* always okay */
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid YY_START\n");
-			exit(1);
-	}
-
-	/*
-	 * An unquoted empty argument isn't possible unless we are at end of
-	 * command.  Return NULL instead.
-	 */
-	if (mybuf.len == 0 && *quote == 0)
-	{
-		termPQExpBuffer(&mybuf);
-		return NULL;
-	}
-
-	/* Else return the completed string. */
-	return mybuf.data;
-}
-
-/*
- * Eat up any unused \\ to complete a backslash command.
- */
-void
-psql_scan_slash_command_end(PsqlScanState state)
-{
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = NULL;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashend);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-}
-
-/*
- * Evaluate a backticked substring of a slash command's argument.
- *
- * The portion of output_buf starting at backtick_start_offset is evaluated
- * as a shell command and then replaced by the command's output.
- */
-static void
-evaluate_backtick(void)
-{
-	char	   *cmd = output_buf->data + backtick_start_offset;
-	PQExpBufferData cmd_output;
-	FILE	   *fd;
-	bool		error = false;
-	char		buf[512];
-	size_t		result;
-
-	initPQExpBuffer(&cmd_output);
-
-	fd = popen(cmd, PG_BINARY_R);
-	if (!fd)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (!error)
-	{
-		do
-		{
-			result = fread(buf, 1, sizeof(buf), fd);
-			if (ferror(fd))
-			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
-				error = true;
-				break;
-			}
-			appendBinaryPQExpBuffer(&cmd_output, buf, result);
-		} while (!feof(fd));
-	}
-
-	if (fd && pclose(fd) == -1)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (PQExpBufferDataBroken(cmd_output))
-	{
-		psql_error("%s: out of memory\n", cmd);
-		error = true;
-	}
-
-	/* Now done with cmd, delete it from output_buf */
-	output_buf->len = backtick_start_offset;
-	output_buf->data[output_buf->len] = '\0';
-
-	/* If no error, transfer result to output_buf */
-	if (!error)
-	{
-		/* strip any trailing newline */
-		if (cmd_output.len > 0 &&
-			cmd_output.data[cmd_output.len - 1] == '\n')
-			cmd_output.len--;
-		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
-	}
-
-	termPQExpBuffer(&cmd_output);
-}
-
-/*
- * Push the given string onto the stack of stuff to scan.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static void
-push_new_buffer(const char *newstr, const char *varname)
-{
-	StackElem  *stackelem;
-
-	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
-
-	/*
-	 * In current usage, the passed varname points at the current flex
-	 * input buffer; we must copy it before calling prepare_buffer()
-	 * because that will change the buffer state.
-	 */
-	stackelem->varname = varname ? pg_strdup(varname) : NULL;
-
-	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
-									&stackelem->bufstring);
-	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
-	{
-		stackelem->origstring = NULL;
-		cur_state->refline = stackelem->bufstring;
-	}
-	else
-	{
-		stackelem->origstring = pg_strdup(newstr);
-		cur_state->refline = stackelem->origstring;
-	}
-	stackelem->next = cur_state->buffer_stack;
-	cur_state->buffer_stack = stackelem;
-}
-
-/*
- * Pop the topmost buffer stack item (there must be one!)
- *
- * NB: after this, the flex input state is unspecified; caller must
- * switch to an appropriate buffer to continue lexing.
- */
-static void
-pop_buffer_stack(PsqlScanState state)
-{
-	StackElem  *stackelem = state->buffer_stack;
-
-	state->buffer_stack = stackelem->next;
-	yy_delete_buffer(stackelem->buf);
-	free(stackelem->bufstring);
-	if (stackelem->origstring)
-		free(stackelem->origstring);
-	if (stackelem->varname)
-		free(stackelem->varname);
-	free(stackelem);
-}
-
-/*
- * Check if specified variable name is the source for any string
- * currently being scanned
- */
-static bool
-var_is_current_source(PsqlScanState state, const char *varname)
-{
-	StackElem  *stackelem;
-
-	for (stackelem = state->buffer_stack;
-		 stackelem != NULL;
-		 stackelem = stackelem->next)
-	{
-		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
-			return true;
-	}
-	return false;
-}
-
-/*
- * Set up a flex input buffer to scan the given data.  We always make a
- * copy of the data.  If working in an unsafe encoding, the copy has
- * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static YY_BUFFER_STATE
-prepare_buffer(const char *txt, int len, char **txtcopy)
-{
-	char	   *newtxt;
-
-	/* Flex wants two \0 characters after the actual data */
-	newtxt = pg_malloc(len + 2);
-	*txtcopy = newtxt;
-	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
-
-	if (cur_state->safe_encoding)
-		memcpy(newtxt, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		int		i = 0;
-
-		while (i < len)
-		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
-
-			/* first byte should always be okay... */
-			newtxt[i] = txt[i];
-			i++;
-			while (--thislen > 0 && i < len)
-				newtxt[i++] = (char) 0xFF;
-		}
-	}
-
-	return yy_scan_buffer(newtxt, len + 2);
-}
-
-/*
- * emit() --- body for ECHO macro
- *
- * NB: this must be used for ALL and ONLY the text copied from the flex
- * input data.  If you pass it something that is not part of the yytext
- * string, you are making a mistake.  Internally generated text can be
- * appended directly to output_buf.
- */
-static void
-emit(const char *txt, int len)
-{
-	if (cur_state->safe_encoding)
-		appendBinaryPQExpBuffer(output_buf, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			appendPQExpBufferChar(output_buf, ch);
-		}
-	}
-}
-
-/*
- * extract_substring --- fetch the true value of (part of) the current token
- *
- * This is like emit(), except that the data is returned as a malloc'd string
- * rather than being pushed directly to output_buf.
- */
-static char *
-extract_substring(const char *txt, int len)
-{
-	char	   *result = (char *) pg_malloc(len + 1);
-
-	if (cur_state->safe_encoding)
-		memcpy(result, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			result[i] = ch;
-		}
-	}
-	result[len] = '\0';
-	return result;
-}
-
-/*
- * escape_variable --- process :'VARIABLE' or :"VARIABLE"
- *
- * If the variable name is found, escape its value using the appropriate
- * quoting method and emit the value to output_buf.  (Since the result is
- * surely quoted, there is never any reason to rescan it.)  If we don't
- * find the variable or the escaping function fails, emit the token as-is.
- */
-static void
-escape_variable(bool as_ident)
-{
-	char	   *varname;
-	const char *value;
-
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
-
-	/* Escaping. */
-	if (value)
-	{
-		if (!pset.db)
-			psql_error("can't escape without active connection\n");
-		else
-		{
-			char   *escaped_value;
-
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
-		}
-	}
-
-	/*
-	 * If we reach this point, some kind of error has occurred.  Emit the
-	 * original text into the output buffer.
-	 */
-	emit(yytext, yyleng);
-}
diff --git a/src/bin/psql/psqlscan_int.h b/src/bin/psql/psqlscan_int.h
new file mode 100644
index 0000000..cf3b688
--- /dev/null
+++ b/src/bin/psql/psqlscan_int.h
@@ -0,0 +1,84 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_int.h
+ */
+#ifndef PSQLSCAN_INT_H
+#define PSQLSCAN_INT_H
+
+/* Abstract type for lexer's internal state */
+typedef struct PsqlScanStateData *PsqlScanState;
+
+/* Return values from yylex() */
+#define LEXRES_EOL			0	/* end of input */
+#define LEXRES_SEMI			1	/* command-terminating semicolon found */
+#define LEXRES_BACKSLASH	2	/* backslash command start */
+#define LEXRES_OK			3	/* OK completion of backslash argument */
+
+/*
+ * We use a stack of flex buffers to handle substitution of psql variables.
+ * Each stacked buffer contains the as-yet-unread text from one psql variable.
+ * When we pop the stack all the way, we resume reading from the outer buffer
+ * identified by scanbufhandle.
+ */
+typedef struct StackElem
+{
+	YY_BUFFER_STATE buf;		/* flex input control structure */
+	char	   *bufstring;		/* data actually being scanned by flex */
+	char	   *origstring;		/* copy of original data, if needed */
+	char	   *varname;		/* name of variable providing data, or NULL */
+	struct StackElem *next;
+} StackElem;
+
+/*
+ * All working state of the lexer must be stored in PsqlScanStateData
+ * between calls.  This allows us to have multiple open lexer operations,
+ * which is needed for nested include files.  The lexer itself is not
+ * recursive, but it must be re-entrant.
+ */
+typedef struct PsqlScanStateData
+{
+	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
+	/*
+	 * These variables always refer to the outer buffer, never to any
+	 * stacked variable-expansion buffer.
+	 */
+	YY_BUFFER_STATE scanbufhandle;
+	char	   *scanbuf;		/* start of outer-level input buffer */
+	const char *scanline;		/* current input line at outer level */
+
+	const char *curline;		/* actual flex input string for cur buf */
+	const char *refline;		/* original data for cur buffer */
+	int			curpos;			/* current position in curline */
+
+	PsqlScanCallbacks cb;		/* callback given from outside */
+
+
+	/*
+	 * All this state lives across successive input lines, until explicitly
+	 * reset by psql_scan_reset.
+	 */
+	int			start_state;	/* saved YY_START */
+	int			paren_depth;	/* depth of nesting in parentheses */
+	int			xcdepth;		/* depth of nesting in slash-star comments */
+	char	   *dolqstart;		/* current $foo$ quote start string */
+
+	/* Cleanup, reset and buffer-scan functions of the lexer owning this state */
+	void	(*finish)(PsqlScanState state);
+	void	(*reset)(PsqlScanState state);
+	YY_BUFFER_STATE (*my_yy_scan_buffer)(char *base, yy_size_t size);
+} PsqlScanStateData;
+
+extern void psql_scan_switch_lexer(PsqlScanState state);
+extern char *extract_substring(const char *txt, int len);
+extern void escape_variable(bool as_ident);
+extern void push_new_buffer(const char *newstr, const char *varname);
+extern void pop_buffer_stack(PsqlScanState state);
+extern bool var_is_current_source(PsqlScanState state, const char *varname);
+extern void scan_emit(const char *txt, int len);
+extern YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
+									  char **txtcopy);
+
+#endif   /* PSQLSCAN_INT_H */
diff --git a/src/bin/psql/psqlscan_slash.c b/src/bin/psql/psqlscan_slash.c
new file mode 100644
index 0000000..bf8c0f3
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.c
@@ -0,0 +1,19 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.c
+ *
+ */
+
+/*
+ * psqlscan_slashbody.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system include
+ * files, else things tend to break on platforms that have multiple
+ * infrastructures for stdio.h and so on.  flex is absolutely uncooperative
+ * about that, so we can't compile psqlscan_slashbody.c on its own.
+ */
+#include "postgres_fe.h"
+#include "psqlscan.h"
+#include "psqlscan_slashbody.c"
diff --git a/src/bin/psql/psqlscan_slash.h b/src/bin/psql/psqlscan_slash.h
new file mode 100644
index 0000000..71acbfb
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.h
@@ -0,0 +1,31 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.h
+ */
+#ifndef PSQLSCAN_SLASH_H
+#define PSQLSCAN_SLASH_H
+
+/* Different ways for scan_slash_option to handle parameter words */
+enum slash_option_type
+{
+	OT_NORMAL,					/* normal case */
+	OT_SQLID,					/* treat as SQL identifier */
+	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
+	OT_FILEPIPE,				/* it's a filename or pipe */
+	OT_WHOLE_LINE,				/* just snarf the rest of the line */
+	OT_NO_EVAL					/* no expansion of backticks or variables */
+};
+
+extern char *psql_scan_slash_command(PsqlScanState state);
+
+extern char *psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon);
+
+extern void psql_scan_slash_command_end(PsqlScanState state);
+
+#endif   /* PSQLSCAN_SLASH_H */
diff --git a/src/bin/psql/psqlscan_slashbody.l b/src/bin/psql/psqlscan_slashbody.l
new file mode 100644
index 0000000..ae51d3f
--- /dev/null
+++ b/src/bin/psql/psqlscan_slashbody.l
@@ -0,0 +1,766 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscan_slashbody.l
+ *	  lexical scanner for slash commands of psql
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscan_slashbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+#include "psqlscan_slash.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+static PQExpBuffer output_buf;	/* current output buffer */
+
+/* these variables do not need to be saved across calls */
+static enum slash_option_type option_type;
+static char *option_quote;
+static int	unquoted_option_chars;
+static int	backtick_start_offset;
+
+static void evaluate_backtick(void);
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless */
+#define my_yyless(n) do { cur_state->curpos -= (yyleng - (n)); yyless(n); } while (0)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+%option prefix="yys"
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/* Exclusive states for psql only: lex backslash commands */
+%x xslashargstart
+%x xslasharg
+%x xslashquote
+%x xslashbackquote
+%x xslashdquote
+%x xslashwholeline
+%x xslashend
+
+space			[ \t\n\r\f]
+quote			'
+
+/* Quoted string that allows backslash escapes */
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqdouble		{quote}{quote}
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+%%
+	/*
+	 * Exclusive lexer states to handle backslash command lexing
+	 */
+
+{
+	/* command name ends at whitespace or backslash; eat all else */
+
+{space}|"\\"	{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{other}			{ ECHO;}
+
+}
+
+<xslashargstart>{
+	/*
+	 * Discard any whitespace before argument, then go to xslasharg state.
+	 * An exception is that "|" is only special at start of argument, so we
+	 * check for it here.
+	 */
+
+{space}+		{ }
+
+"|"				{
+					if (option_type == OT_FILEPIPE)
+					{
+						/* treat like whole-string case */
+						ECHO;
+						BEGIN(xslashwholeline);
+					}
+					else
+					{
+						/* vertical bar is not special otherwise */
+						my_yyless(0);
+						BEGIN(xslasharg);
+					}
+				}
+
+{other}			{
+					my_yyless(0);
+					BEGIN(xslasharg);
+				}
+
+}
+
+<xslasharg>{
+	/*
+	 * Default processing of text in a slash command's argument.
+	 *
+	 * Note: unquoted_option_chars counts the number of characters at the
+	 * end of the argument that were not subject to any form of quoting.
+	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
+	 */
+
+{space}|"\\"	{
+					/*
+					 * Unquoted space is end of arg; do not eat.  Likewise
+					 * backslash is end of command or next command, do not eat
+					 *
+					 * XXX this means we can't conveniently accept options
+					 * that include unquoted backslashes; therefore, option
+					 * processing that encourages use of backslashes is rather
+					 * broken.
+					 */
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{quote}			{
+					*option_quote = '\'';
+					unquoted_option_chars = 0;
+					BEGIN(xslashquote);
+				}
+
+"`"				{
+					backtick_start_offset = output_buf->len;
+					*option_quote = '`';
+					unquoted_option_chars = 0;
+					BEGIN(xslashbackquote);
+				}
+
+{dquote}		{
+					ECHO;
+					*option_quote = '"';
+					unquoted_option_chars = 0;
+					BEGIN(xslashdquote);
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						char	   *varname;
+						const char *value = NULL;
+						void	  (*free_fn)(void *) = NULL;
+
+						if (cur_state->cb.get_variable)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = cur_state->cb.get_variable(varname,
+												   false, false, &free_fn);
+							free(varname);
+						}
+
+						/*
+						 * The variable value is just emitted without any
+						 * further examination.  This is consistent with the
+						 * pre-8.0 code behavior, if not with the way that
+						 * variables are handled outside backslash commands.
+						 * Note that we needn't guard against recursion here.
+						 */
+						if (value)
+						{
+							appendPQExpBufferStr(output_buf, value);
+							if (free_fn)
+								free_fn((void *) value);
+						}
+						else
+							ECHO;
+
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}+'	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(false);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+
+:\"{variable_char}+\"	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(true);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+{other}			{
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+}
+
+<xslashquote>{
+	/*
+	 * single-quoted text: copy literally except for '' and backslash
+	 * sequences
+	 */
+
+{quote}			{ BEGIN(xslasharg); }
+
+{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
+
+"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
+"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
+"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
+"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
+"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
+
+{xeoctesc}		{
+					/* octal case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 1, NULL, 8));
+				}
+
+{xehexesc}		{
+					/* hex case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 2, NULL, 16));
+				}
+
+"\\".			{ scan_emit(yytext + 1, 1); }
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashbackquote>{
+	/*
+	 * backticked text: copy everything until next backquote, then evaluate.
+	 *
+	 * XXX Possible future behavioral change: substitute for :VARIABLE?
+	 */
+
+"`"				{
+					/* In NO_EVAL mode, don't evaluate the command */
+					if (option_type != OT_NO_EVAL)
+						evaluate_backtick();
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashdquote>{
+	/* double-quoted text: copy verbatim, including the double quotes */
+
+{dquote}		{
+					ECHO;
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashwholeline>{
+	/* copy everything until end of input line */
+	/* but suppress leading whitespace */
+
+{space}+		{
+					if (output_buf->len > 0)
+						ECHO;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashend>{
+	/* at end of command, eat a double backslash, but not anything else */
+
+"\\\\"			{ return LEXRES_OK; }
+
+{other}|\n		{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+}
+
+%%
+
+static void psql_scan_slash_command_finish(PsqlScanState state);
+static void psql_scan_slash_command_reset(PsqlScanState state);
+
+static void
+psql_scan_slash_command_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &psql_scan_slash_command_finish;
+	state->reset = &psql_scan_slash_command_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+static void
+psql_scan_slash_command_setup(PsqlScanState state,
+							  const char *line, int line_len,
+							  PsqlScanCallbacks *cb)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+	Assert(cb->error_out != NULL);
+
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	state->cb.enc_mblen = cb->enc_mblen;
+	state->cb.standard_strings = cb->standard_strings;
+	state->cb.error_out = cb->error_out;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Create a new scanning state for this lexer that continues parsing from the
+ * current position of the given scanning state, which belongs to another
+ * lexer.  The given state is destroyed.
+ *
+ * Note: this function cannot use the yy* functions and variables of the given
+ * state, because they belong to a different lexer.
+ */
+static void
+psql_scan_slash_command_switch_lexer(PsqlScanState state)
+{
+	const char *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_slash_command_initialize(state);
+	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
+ * Scan the command name of a psql backslash command.  This should be called
+ * after psql_scan() on the main lexer returns PSCAN_BACKSLASH.  It is assumed
+ * that the input has been consumed through the leading backslash.
+ *
+ * The return value is a malloc'd copy of the command name, as parsed off
+ * from the input.
+ */
+char *
+psql_scan_slash_command(PsqlScanState state)
+{
+	PQExpBufferData mybuf;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	psql_scan_slash_command_switch_lexer(state);
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(INITIAL);
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+
+	return mybuf.data;
+}
+
+/*
+ * Parse off the next argument for a backslash command, and return it as a
+ * malloc'd string.  If there are no more arguments, returns NULL.
+ *
+ * type tells what processing, if any, to perform on the option string;
+ * for example, if it's a SQL identifier, we want to downcase any unquoted
+ * letters.
+ *
+ * if quote is not NULL, *quote is set to 0 if no quoting was found, else
+ * the last quote symbol used in the argument.
+ *
+ * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
+ * be taken as part of the option string will be stripped.
+ *
+ * NOTE: the only possible syntax errors for backslash options are unmatched
+ * quotes, which are detected when we run out of input.  Therefore, on a
+ * syntax error we just throw away the string and return NULL; there is no
+ * need to worry about flushing remaining input.
+ */
+char *
+psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon)
+{
+	PQExpBufferData mybuf;
+	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
+	char		local_quote;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	if (quote == NULL)
+		quote = &local_quote;
+	*quote = 0;
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+	option_type = type;
+	option_quote = quote;
+	unquoted_option_chars = 0;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	if (type == OT_WHOLE_LINE)
+		BEGIN(xslashwholeline);
+	else
+		BEGIN(xslashargstart);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/*
+	 * Check the lex result: we should have gotten back either LEXRES_OK
+	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
+	 * a quoted string, as indicated by YY_START, EOL is an error.
+	 */
+	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
+
+	switch (YY_START)
+	{
+		case xslashargstart:
+			/* empty arg */
+			break;
+		case xslasharg:
+			/* Strip any unquoted trailing semi-colons if requested */
+			if (semicolon)
+			{
+				while (unquoted_option_chars-- > 0 &&
+					   mybuf.len > 0 &&
+					   mybuf.data[mybuf.len - 1] == ';')
+				{
+					mybuf.data[--mybuf.len] = '\0';
+				}
+			}
+
+			/*
+			 * If SQL identifier processing was requested, then we strip out
+			 * excess double quotes and downcase unquoted letters.
+			 * Doubled double-quotes become output double-quotes, per spec.
+			 *
+			 * Note that a string like FOO"BAR"BAZ will be converted to
+			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
+			 * which would have us parse it as several identifiers.  But
+			 * for psql's purposes, we want a string like "foo"."bar" to
+			 * be treated as one option, so there's little choice.
+			 */
+			if (type == OT_SQLID || type == OT_SQLIDHACK)
+			{
+				bool		inquotes = false;
+				char	   *cp = mybuf.data;
+
+				while (*cp)
+				{
+					if (*cp == '"')
+					{
+						if (inquotes && cp[1] == '"')
+						{
+							/* Keep the first quote, remove the second */
+							cp++;
+						}
+						else
+							inquotes = !inquotes;
+						/* Collapse out quote at *cp */
+						memmove(cp, cp + 1, strlen(cp));
+						mybuf.len--;
+						/* do not advance cp */
+					}
+					else
+					{
+						if (!inquotes && type == OT_SQLID)
+							*cp = pg_tolower((unsigned char) *cp);
+						if (ENC_IS_SAFE(cur_state))
+							cp++;		/* safe encoding: advance one byte, not to end of string */
+						else
+							cp += cur_state->cb.enc_mblen(cp);
+					}
+				}
+			}
+			break;
+		case xslashquote:
+		case xslashbackquote:
+		case xslashdquote:
+			/* must have hit EOL inside quotes */
+			cur_state->cb.error_out("unterminated quoted string\n");
+			termPQExpBuffer(&mybuf);
+			return NULL;
+		case xslashwholeline:
+			/* always okay */
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid YY_START\n");
+			exit(1);
+	}
+
+	/*
+	 * An unquoted empty argument isn't possible unless we are at end of
+	 * command.  Return NULL instead.
+	 */
+	if (mybuf.len == 0 && *quote == 0)
+	{
+		termPQExpBuffer(&mybuf);
+		return NULL;
+	}
+
+	/* Else return the completed string. */
+	return mybuf.data;
+}
+
+/*
+ * Eat up any unused \\ to complete a backslash command.
+ */
+void
+psql_scan_slash_command_end(PsqlScanState state)
+{
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = NULL;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashend);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+	psql_scan_switch_lexer(state);
+}
+
+/*
+ * Evaluate a backticked substring of a slash command's argument.
+ *
+ * The portion of output_buf starting at backtick_start_offset is evaluated
+ * as a shell command and then replaced by the command's output.
+ */
+static void
+evaluate_backtick(void)
+{
+	char	   *cmd = output_buf->data + backtick_start_offset;
+	PQExpBufferData cmd_output;
+	FILE	   *fd;
+	bool		error = false;
+	char		buf[512];
+	size_t		result;
+
+	initPQExpBuffer(&cmd_output);
+
+	fd = popen(cmd, PG_BINARY_R);
+	if (!fd)
+	{
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (!error)
+	{
+		do
+		{
+			result = fread(buf, 1, sizeof(buf), fd);
+			if (ferror(fd))
+			{
+				cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+				error = true;
+				break;
+			}
+			appendBinaryPQExpBuffer(&cmd_output, buf, result);
+		} while (!feof(fd));
+	}
+
+	if (fd && pclose(fd) == -1)
+	{
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (PQExpBufferDataBroken(cmd_output))
+	{
+		cur_state->cb.error_out("%s: out of memory\n", cmd);
+		error = true;
+	}
+
+	/* Now done with cmd, delete it from output_buf */
+	output_buf->len = backtick_start_offset;
+	output_buf->data[output_buf->len] = '\0';
+
+	/* If no error, transfer result to output_buf */
+	if (!error)
+	{
+		/* strip any trailing newline */
+		if (cmd_output.len > 0 &&
+			cmd_output.data[cmd_output.len - 1] == '\n')
+			cmd_output.len--;
+		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
+	}
+
+	termPQExpBuffer(&cmd_output);
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+psql_scan_slash_command_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yys_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+psql_scan_slash_command_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
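For readers following the OT_SQLID branch of psql_scan_slash_option() in the file above, here is a minimal stand-alone sketch of the dequote-and-downcase step. It is not part of the patch: the function name dequote_and_downcase is hypothetical, it assumes a single-byte ASCII-compatible encoding (so it steps one byte at a time and uses tolower() where the patch uses pg_tolower() and the enc_mblen callback), and it toggles the quote state only on a lone quote, so that a doubled quote inside quotes yields one literal quote.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: strip double quotes from
 * an identifier in place and downcase the unquoted parts.
 * Assumes a single-byte, ASCII-compatible encoding.
 */
static void
dequote_and_downcase(char *s)
{
	bool		inquotes = false;
	char	   *cp = s;

	while (*cp)
	{
		if (*cp == '"')
		{
			if (inquotes && cp[1] == '"')
				cp++;			/* keep the first quote, remove the second */
			else
				inquotes = !inquotes;	/* a lone quote opens or closes */

			/* collapse out the quote at *cp; this moves the NUL as well */
			memmove(cp, cp + 1, strlen(cp));
			/* do not advance cp: it already points at the next byte */
		}
		else
		{
			if (!inquotes)
				*cp = (char) tolower((unsigned char) *cp);
			cp++;
		}
	}
}
```

Under these assumptions, FOO"BAR"BAZ becomes fooBARbaz, as the comment in the patch describes, and "foo"."Bar" becomes foo.Bar, which is why the whole string can be treated as one option.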
diff --git a/src/bin/psql/psqlscanbody.l b/src/bin/psql/psqlscanbody.l
new file mode 100644
index 0000000..546fa12
--- /dev/null
+++ b/src/bin/psql/psqlscanbody.l
@@ -0,0 +1,1438 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscanbody.l
+ *	  lexical scanner for psql
+ *
+ * This code is mainly needed to determine where the end of a SQL statement
+ * is: we are looking for semicolons that are not within quotes, comments,
+ * or parentheses.  The most reliable way to handle this is to borrow the
+ * backend's flex lexer rules, lock, stock, and barrel.  The rules below
+ * are (except for a few) the same as the backend's, but their actions are
+ * just ECHO whereas the backend's actions generally do other things.
+ *
+ * XXX The rules in this file must be kept in sync with the backend lexer!!!
+ *
+ * XXX Avoid creating backtracking cases --- see the backend lexer for info.
+ *
+ * The most difficult aspect of this code is that we need to work in multibyte
+ * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
+ * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
+ * all our lexing rules treat all high-bit-set characters alike, we don't
+ * really need to care whether such a byte is part of a sequence or not.
+ * In an "unsafe" encoding, we still expect the first byte of a multibyte
+ * sequence to be >= 0x80, but later bytes might not be.  If we scan such
+ * a sequence as-is, the lexing rules could easily be fooled into matching
+ * such bytes to ordinary ASCII characters.  Our solution for this is to
+ * substitute 0xFF for each non-first byte within the data presented to flex.
+ * The flex rules will then pass the FF's through unmolested.  The emit()
+ * subroutine is responsible for looking back to the original string and
+ * replacing FF's with the corresponding original bytes.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscanbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+
+static PQExpBuffer output_buf;	/* current output buffer */
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless; wrapped so that it acts as a single statement */
+#define my_yyless(n) do { cur_state->curpos -= (yyleng - (n)); yyless(n); } while (0)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/*
+ * OK, here is a short description of lex/flex rules behavior.
+ * The longest pattern which matches an input string is always chosen.
+ * For equal-length patterns, the first occurring in the rules list is chosen.
+ * INITIAL is the starting state, to which all non-conditional rules apply.
+ * Exclusive states change parsing rules while the state is active.  When in
+ * an exclusive state, only those rules defined for that state apply.
+ *
+ * We use exclusive states for quoted strings, extended comments,
+ * and to eliminate parsing troubles for numeric strings.
+ * Exclusive states:
+ *  <xb> bit string literal
+ *  <xc> extended C-style comments
+ *  <xd> delimited identifiers (double-quoted identifiers)
+ *  <xh> hexadecimal numeric string
+ *  <xq> standard quoted strings
+ *  <xe> extended quoted strings (support backslash escape sequences)
+ *  <xdolq> $foo$ quoted strings
+ *  <xui> quoted identifier with Unicode escapes
+ *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
+ *  <xus> quoted string with Unicode escapes
+ *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
+ *
+ * Note: we intentionally don't mimic the backend's <xeu> state; we have
+ * no need to distinguish it from <xe> state, and no good way to get out
+ * of it in error cases.  The backend just throws yyerror() in those
+ * cases, but that's not an option here.
+ */
+
+%x xb
+%x xc
+%x xd
+%x xh
+%x xe
+%x xq
+%x xdolq
+%x xui
+%x xuiend
+%x xus
+%x xusend
+
+/*
+ * In order to make the world safe for Windows and Mac clients as well as
+ * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
+ * sequence will be seen as two successive newlines, but that doesn't cause
+ * any problems.  Comments that start with -- and extend to the next
+ * newline are treated as equivalent to a single whitespace character.
+ *
+ * NOTE a fine point: if there is no newline following --, we will absorb
+ * everything to the end of the input as a comment.  This is correct.  Older
+ * versions of Postgres failed to recognize -- as a comment if the input
+ * did not end with a newline.
+ *
+ * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ *
+ * XXX if you change the set of whitespace characters, fix scanner_isspace()
+ * to agree, and see also the plpgsql lexer.
+ */
+
+space			[ \t\n\r\f]
+horiz_space		[ \t\f]
+newline			[\n\r]
+non_newline		[^\n\r]
+
+comment			("--"{non_newline}*)
+
+whitespace		({space}+|{comment})
+
+/*
+ * SQL requires at least one newline in the whitespace separating
+ * string literals that are to be concatenated.  Silly, but who are we
+ * to argue?  Note that {whitespace_with_newline} should not have * after
+ * it, whereas {whitespace} should generally have a * after it...
+ */
+
+special_whitespace		({space}+|{comment}{newline})
+horiz_whitespace		({horiz_space}|{comment})
+whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+
+/*
+ * To ensure that {quotecontinue} can be scanned without having to back up
+ * if the full pattern isn't matched, we include trailing whitespace in
+ * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
+ * except for {quote} followed by whitespace and just one "-" (not two,
+ * which would start a {comment}).  To cover that we have {quotefail}.
+ * The actions for {quotestop} and {quotefail} must throw back characters
+ * beyond the quote proper.
+ */
+quote			'
+quotestop		{quote}{whitespace}*
+quotecontinue	{quote}{whitespace_with_newline}{quote}
+quotefail		{quote}{whitespace}*"-"
+
+/* Bit string
+ * It is tempting to scan the string for only those characters
+ * which are allowed. However, this leads to silently swallowed
+ * characters if illegal characters are included in the string.
+ * For example, if xbinside is [01] then B'ABCD' is interpreted
+ * as a zero-length string, and the ABCD' is lost!
+ * Better to pass the string forward and let the input routines
+ * validate the contents.
+ */
+xbstart			[bB]{quote}
+xbinside		[^']*
+
+/* Hexadecimal number */
+xhstart			[xX]{quote}
+xhinside		[^']*
+
+/* National character */
+xnstart			[nN]{quote}
+
+/* Quoted string that allows backslash escapes */
+xestart			[eE]{quote}
+xeinside		[^\\']+
+xeescape		[\\][^0-7]
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
+xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqstart			{quote}
+xqdouble		{quote}{quote}
+xqinside		[^']+
+
+/* $foo$ style quotes ("dollar quoting")
+ * The quoted string starts with $foo$ where "foo" is an optional string
+ * in the form of an identifier, except that it may not contain "$",
+ * and extends to the first occurrence of an identical string.
+ * There is *no* processing of the quoted text.
+ *
+ * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
+ * fails to match its trailing "$".
+ */
+dolq_start		[A-Za-z\200-\377_]
+dolq_cont		[A-Za-z\200-\377_0-9]
+dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
+dolqfailed		\${dolq_start}{dolq_cont}*
+dolqinside		[^$]+
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+xdstart			{dquote}
+xdstop			{dquote}
+xddouble		{dquote}{dquote}
+xdinside		[^"]+
+
+/* Unicode escapes */
+uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
+/* error rule to avoid backup */
+uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
+
+/* Quoted identifier with Unicode escapes */
+xuistart		[uU]&{dquote}
+
+/* Quoted string with Unicode escapes */
+xusstart		[uU]&{quote}
+
+/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
+xustop1		{uescapefail}?
+xustop2		{uescape}
+
+/* error rule to avoid backup */
+xufailed		[uU]&
+
+
+/* C-style comments
+ *
+ * The "extended comment" syntax closely resembles allowable operator syntax.
+ * The tricky part here is to get lex to recognize a string starting with
+ * slash-star as a comment, when interpreting it as an operator would produce
+ * a longer match --- remember lex will prefer a longer match!  Also, if we
+ * have something like plus-slash-star, lex will think this is a 3-character
+ * operator whereas we want to see it as a + operator and a comment start.
+ * The solution is two-fold:
+ * 1. append {op_chars}* to xcstart so that it matches as much text as
+ *    {operator} would. Then the tie-breaker (first matching rule of same
+ *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
+ *    in case it contains a star-slash that should terminate the comment.
+ * 2. In the operator rule, check for slash-star within the operator, and
+ *    if found throw it back with yyless().  This handles the plus-slash-star
+ *    problem.
+ * Dash-dash comments have similar interactions with the operator rule.
+ */
+xcstart			\/\*{op_chars}*
+xcstop			\*+\/
+xcinside		[^*/]+
+
+digit			[0-9]
+ident_start		[A-Za-z\200-\377_]
+ident_cont		[A-Za-z\200-\377_0-9\$]
+
+identifier		{ident_start}{ident_cont}*
+
+/* Assorted special-case operators and operator-like tokens */
+typecast		"::"
+dot_dot			\.\.
+colon_equals	":="
+equals_greater	"=>"
+less_equals		"<="
+greater_equals	">="
+less_greater	"<>"
+not_equals		"!="
+
+/*
+ * "self" is the set of chars that should be returned as single-character
+ * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
+ * which can be one or more characters long (but if a single-char token
+ * appears in the "self" set, it is not to be returned as an Op).  Note
+ * that the sets overlap, but each has some chars that are not in the other.
+ *
+ * If you change either set, adjust the character lists appearing in the
+ * rule for "operator"!
+ */
+self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
+operator		{op_chars}+
+
+/* we no longer allow unary minus in numbers.
+ * instead we pass it separately to parser. there it gets
+ * coerced via doNegate() -- Leon aug 20 1999
+ *
+ * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
+ *
+ * {realfail1} and {realfail2} are added to prevent the need for scanner
+ * backup when the {real} rule fails to match completely.
+ */
+
+integer			{digit}+
+decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+decimalfail		{digit}+\.\.
+real			({integer}|{decimal})[Ee][-+]?{digit}+
+realfail1		({integer}|{decimal})[Ee]
+realfail2		({integer}|{decimal})[Ee][-+]
+
+param			\${integer}
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+/*
+ * Dollar quoted strings are totally opaque, and no escaping is done on them.
+ * Other quoted strings must allow some special characters such as single-quote
+ *  and newline.
+ * Embedded single-quotes are implemented both in the SQL standard
+ *  style of two adjacent single quotes "''" and in the Postgres/Java style
+ *  of escaped-quote "\'".
+ * Other embedded escaped characters are matched explicitly and the leading
+ *  backslash is dropped from the string.
+ * Note that xcstart must appear before operator, as explained above!
+ *  Also whitespace (comment) must appear before operator.
+ */
+
+%%
+
+{whitespace}	{
+					/*
+					 * Note that the whitespace rule includes both true
+					 * whitespace and single-line ("--" style) comments.
+					 * We suppress whitespace at the start of the query
+					 * buffer.  We also suppress all single-line comments,
+					 * which is pretty dubious but is the historical
+					 * behavior.
+					 */
+					if (!(output_buf->len == 0 || yytext[0] == '-'))
+						ECHO;
+				}
+
+{xcstart}		{
+					cur_state->xcdepth = 0;
+					BEGIN(xc);
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstart}	{
+					cur_state->xcdepth++;
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstop}	{
+					if (cur_state->xcdepth <= 0)
+					{
+						BEGIN(INITIAL);
+					}
+					else
+						cur_state->xcdepth--;
+					ECHO;
+				}
+
+<xc>{xcinside}	{
+					ECHO;
+				}
+
+<xc>{op_chars}	{
+					ECHO;
+				}
+
+<xc>\*+			{
+					ECHO;
+				}
+
+{xbstart}		{
+					BEGIN(xb);
+					ECHO;
+				}
+<xb>{quotestop}	|
+<xb>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xh>{xhinside}	|
+<xb>{xbinside}	{
+					ECHO;
+				}
+<xh>{quotecontinue}	|
+<xb>{quotecontinue}	{
+					ECHO;
+				}
+
+{xhstart}		{
+					/* Hexadecimal bit type.
+					 * At some point we should simply pass the string
+					 * forward to the parser and label it there.
+					 * In the meantime, place a leading "x" on the string
+					 * to mark it for the input routine as a hex string.
+					 */
+					BEGIN(xh);
+					ECHO;
+				}
+<xh>{quotestop}	|
+<xh>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+
+{xnstart}		{
+					my_yyless(1);				/* eat only 'n' this time */
+					ECHO;
+				}
+
+{xqstart}		{
+					if (cur_state->cb.standard_strings())
+						BEGIN(xq);
+					else
+						BEGIN(xe);
+					ECHO;
+				}
+{xestart}		{
+					BEGIN(xe);
+					ECHO;
+				}
+{xusstart}		{
+					BEGIN(xus);
+					ECHO;
+				}
+<xq,xe>{quotestop}	|
+<xq,xe>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xus>{quotestop} |
+<xus>{quotefail} {
+					my_yyless(1);
+					BEGIN(xusend);
+					ECHO;
+				}
+<xusend>{whitespace} {
+					ECHO;
+				}
+<xusend>{other} |
+<xusend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xusend>{xustop2} {
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xq,xe,xus>{xqdouble} {
+					ECHO;
+				}
+<xq,xus>{xqinside}  {
+					ECHO;
+				}
+<xe>{xeinside}  {
+					ECHO;
+				}
+<xe>{xeunicode} {
+					ECHO;
+				}
+<xe>{xeunicodefail}	{
+					ECHO;
+				}
+<xe>{xeescape}  {
+					ECHO;
+				}
+<xe>{xeoctesc}  {
+					ECHO;
+				}
+<xe>{xehexesc}  {
+					ECHO;
+				}
+<xq,xe,xus>{quotecontinue} {
+					ECHO;
+				}
+<xe>.			{
+					/* This is only needed for \ just before EOF */
+					ECHO;
+				}
+
+{dolqdelim}		{
+					cur_state->dolqstart = pg_strdup(yytext);
+					BEGIN(xdolq);
+					ECHO;
+				}
+{dolqfailed}	{
+					/* throw back all but the initial "$" */
+					my_yyless(1);
+					ECHO;
+				}
+<xdolq>{dolqdelim} {
+					if (strcmp(yytext, cur_state->dolqstart) == 0)
+					{
+						free(cur_state->dolqstart);
+						cur_state->dolqstart = NULL;
+						BEGIN(INITIAL);
+					}
+					else
+					{
+						/*
+						 * When we fail to match $...$ to dolqstart, transfer
+						 * the $... part to the output, but put back the final
+						 * $ for rescanning.  Consider $delim$...$junk$delim$
+						 */
+						my_yyless(yyleng-1);
+					}
+					ECHO;
+				}
+<xdolq>{dolqinside} {
+					ECHO;
+				}
+<xdolq>{dolqfailed} {
+					ECHO;
+				}
+<xdolq>.		{
+					/* This is only needed for $ inside the quoted text */
+					ECHO;
+				}
+
+{xdstart}		{
+					BEGIN(xd);
+					ECHO;
+				}
+{xuistart}		{
+					BEGIN(xui);
+					ECHO;
+				}
+<xd>{xdstop}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xui>{dquote} {
+					my_yyless(1);
+					BEGIN(xuiend);
+					ECHO;
+				}
+<xuiend>{whitespace} {
+					ECHO;
+				}
+<xuiend>{other} |
+<xuiend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xuiend>{xustop2}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xd,xui>{xddouble}	{
+					ECHO;
+				}
+<xd,xui>{xdinside}	{
+					ECHO;
+				}
+
+{xufailed}	{
+					/* throw back all but the initial u/U */
+					my_yyless(1);
+					ECHO;
+				}
+
+{typecast}		{
+					ECHO;
+				}
+
+{dot_dot}		{
+					ECHO;
+				}
+
+{colon_equals}	{
+					ECHO;
+				}
+
+{equals_greater} {
+					ECHO;
+				}
+
+{less_equals}	{
+					ECHO;
+				}
+
+{greater_equals} {
+					ECHO;
+				}
+
+{less_greater}	{
+					ECHO;
+				}
+
+{not_equals}	{
+					ECHO;
+				}
+
+	/*
+	 * These rules are specific to psql --- they implement parenthesis
+	 * counting and detection of command-ending semicolon.  These must
+	 * appear before the {self} rule so that they take precedence over it.
+	 */
+
+"("				{
+					cur_state->paren_depth++;
+					ECHO;
+				}
+
+")"				{
+					if (cur_state->paren_depth > 0)
+						cur_state->paren_depth--;
+					ECHO;
+				}
+
+";"				{
+					ECHO;
+					if (cur_state->paren_depth == 0)
+					{
+						/* Terminate lexing temporarily */
+						return LEXRES_SEMI;
+					}
+				}
+
+	/*
+	 * psql-specific rules to handle backslash commands and variable
+	 * substitution.  We want these before {self}, also.
+	 */
+
+"\\"[;:]		{
+					/* Force a semicolon or colon into the query buffer */
+					scan_emit(yytext + 1, 1);
+				}
+
+"\\"			{
+					/* Terminate lexing temporarily */
+					return LEXRES_BACKSLASH;
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					char	   *varname = NULL;
+					const char *value = NULL;
+					void	  (*free_fn)(void *) = NULL;
+
+					if (cur_state->cb.get_variable)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = cur_state->cb.get_variable(varname,
+														   false, false, &free_fn);
+					}
+
+					if (value)
+					{
+						/* It is a variable, check for recursion */
+						if (var_is_current_source(cur_state, varname))
+						{
+							/* Recursive expansion --- don't go there */
+							cur_state->cb.error_out("skipping recursive expansion of variable \"%s\"\n",
+									   varname);
+							/* Instead copy the string as is */
+							ECHO;
+						}
+						else
+						{
+							/* OK, perform substitution */
+							push_new_buffer(value, varname);
+							/* yy_scan_string already made buffer active */
+						}
+						if (free_fn)
+							free_fn((void*)value);
+					}
+					else
+					{
+						/*
+						 * if the variable doesn't exist we'll copy the
+						 * string as is
+						 */
+						ECHO;
+					}
+
+					if (varname)
+						free(varname);
+				}
+
+:'{variable_char}+'	{
+					escape_variable(false);
+				}
+
+:\"{variable_char}+\"	{
+					escape_variable(true);
+				}
+
+	/*
+	 * These rules just avoid the need for scanner backup if one of the
+	 * two rules above fails to match completely.
+	 */
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+	/*
+	 * Back to backend-compatible rules.
+	 */
+
+{self}			{
+					ECHO;
+				}
+
+{operator}		{
+					/*
+					 * Check for embedded slash-star or dash-dash; those
+					 * are comment starts, so operator must stop there.
+					 * Note that slash-star or dash-dash at the first
+					 * character will match a prior rule, not this one.
+					 */
+					int		nchars = yyleng;
+					char   *slashstar = strstr(yytext, "/*");
+					char   *dashdash = strstr(yytext, "--");
+
+					if (slashstar && dashdash)
+					{
+						/* if both appear, take the first one */
+						if (slashstar > dashdash)
+							slashstar = dashdash;
+					}
+					else if (!slashstar)
+						slashstar = dashdash;
+					if (slashstar)
+						nchars = slashstar - yytext;
+
+					/*
+					 * For SQL compatibility, '+' and '-' cannot be the
+					 * last char of a multi-char operator unless the operator
+					 * contains chars that are not in SQL operators.
+					 * The idea is to lex '=-' as two operators, but not
+					 * to forbid operator names like '?-' that could not be
+					 * sequences of SQL operators.
+					 */
+					while (nchars > 1 &&
+						   (yytext[nchars-1] == '+' ||
+							yytext[nchars-1] == '-'))
+					{
+						int		ic;
+
+						for (ic = nchars-2; ic >= 0; ic--)
+						{
+							if (strchr("~!@#^&|`?%", yytext[ic]))
+								break;
+						}
+						if (ic >= 0)
+							break; /* found a char that makes it OK */
+						nchars--; /* else remove the +/-, and check again */
+					}
+
+					if (nchars < yyleng)
+					{
+						/* Strip the unwanted chars from the token */
+						my_yyless(nchars);
+					}
+					ECHO;
+				}
+
+{param}			{
+					ECHO;
+				}
+
+{integer}		{
+					ECHO;
+				}
+{decimal}		{
+					ECHO;
+				}
+{decimalfail}	{
+					/* throw back the .., and treat as integer */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+{real}			{
+					ECHO;
+				}
+{realfail1}		{
+					/*
+					 * throw back the [Ee], and treat as {decimal}.  Note
+					 * that it is possible the input is actually {integer},
+					 * but since this case will almost certainly lead to a
+					 * syntax error anyway, we don't bother to distinguish.
+					 */
+					my_yyless(yyleng-1);
+					ECHO;
+				}
+{realfail2}		{
+					/* throw back the [Ee][+-], and proceed as above */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+
+
+{identifier}	{
+					ECHO;
+				}
+
+{other}			{
+					ECHO;
+				}
+
+
+	/*
+	 * Everything from here down is psql-specific.
+	 */
+
+<<EOF>>			{
+					StackElem  *stackelem = cur_state->buffer_stack;
+
+					if (stackelem == NULL)
+						return LEXRES_EOL; /* end of input reached */
+
+					/*
+					 * We were expanding a variable, so pop the inclusion
+					 * stack and keep lexing
+					 */
+					pop_buffer_stack(cur_state);
+
+					stackelem = cur_state->buffer_stack;
+					if (stackelem != NULL)
+					{
+						yy_switch_to_buffer(stackelem->buf);
+						cur_state->curline = stackelem->bufstring;
+						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
+					}
+					else
+					{
+						yy_switch_to_buffer(cur_state->scanbufhandle);
+						cur_state->curline = cur_state->scanbuf;
+						cur_state->refline = cur_state->scanline;
+					}
+				}
+%%
+
+static void my_psql_scan_finish(PsqlScanState state);
+static void my_psql_scan_reset(PsqlScanState state);
+static void psql_error_errout(const char *fmt, ...)
+	__attribute__ ((format (printf, 1, 2)));
+static bool psql_standard_strings(void);
+
+static void
+psql_scan_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &my_psql_scan_finish;
+	state->reset = &my_psql_scan_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Create a lexer working state struct.
+ */
+PsqlScanState
+psql_scan_create(void)
+{
+	PsqlScanState state;
+
+	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
+	psql_scan_initialize(state);
+
+	return state;
+}
+
+/*
+ * Destroy a lexer working state struct, releasing all resources.
+ */
+void
+psql_scan_destroy(PsqlScanState state)
+{
+	psql_scan_finish(state);
+
+	psql_scan_reset(state);
+
+	free(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+void
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PsqlScanCallbacks *cb)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	if (cb->standard_strings)
+		state->cb.standard_strings = cb->standard_strings;
+	else
+		state->cb.standard_strings = &psql_standard_strings;
+
+	state->cb.enc_mblen = cb->enc_mblen;
+
+	if (cb->error_out)
+		state->cb.error_out = cb->error_out;
+	else
+		state->cb.error_out = &psql_error_errout;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Redirect functions for indirect calls.  These functions may be called on
+ * the scan state of another lexer.
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	if (state->finish)
+		state->finish(state);
+}
+
+void
+psql_scan_reset(PsqlScanState state)
+{
+	if (state->reset)
+		state->reset(state);
+}
+
+
+/*
+ * Do lexical analysis of SQL command text.
+ *
+ * The text previously passed to psql_scan_setup is scanned, and appended
+ * (possibly with transformation) to query_buf.
+ *
+ * The return value indicates the condition that stopped scanning:
+ *
+ * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
+ * transferred to query_buf.)  The command accumulated in query_buf should
+ * be executed, then clear query_buf and call again to scan the remainder
+ * of the line.
+ *
+ * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
+ * Any previous data on the line has been transferred to query_buf.
+ * The caller will typically next call psql_scan_slash_command(),
+ * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
+ *
+ * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
+ * incomplete SQL command.  *prompt is set to the appropriate prompt type.
+ *
+ * PSCAN_EOL: the end of the line was reached, and there is no lexical
+ * reason to consider the command incomplete.  The caller may or may not
+ * choose to send it.  *prompt is set to the appropriate prompt type if
+ * the caller chooses to collect more input.
+ *
+ * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
+ * be called next, then the cycle may be repeated with a fresh input line.
+ *
+ * In all cases, *prompt is set to an appropriate prompt type code for the
+ * next line-input operation.
+ */
+PsqlScanResult
+psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt)
+{
+	PsqlScanResult result;
+	int			lexresult;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = query_buf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(state->start_state);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/* Update static vars back to the state struct */
+	state->start_state = YY_START;
+
+	/*
+	 * Check termination state and return appropriate result info.
+	 */
+	switch (lexresult)
+	{
+		case LEXRES_EOL:		/* end of input */
+			switch (state->start_state)
+			{
+				/* This switch must cover all non-slash-command states. */
+				case INITIAL:
+				case xuiend:	/* we treat these like INITIAL */
+				case xusend:
+					if (state->paren_depth > 0)
+					{
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_PAREN;
+					}
+					else if (query_buf->len > 0)
+					{
+						result = PSCAN_EOL;
+						*prompt = PROMPT_CONTINUE;
+					}
+					else
+					{
+						/* never bother to send an empty buffer */
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_READY;
+					}
+					break;
+				case xb:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xc:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_COMMENT;
+					break;
+				case xd:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xh:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xe:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xdolq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOLLARQUOTE;
+					break;
+				case xui:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xus:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				default:
+					/* can't get here */
+					fprintf(stderr, "invalid YY_START\n");
+					exit(1);
+			}
+			break;
+		case LEXRES_SEMI:		/* semicolon */
+			result = PSCAN_SEMICOLON;
+			*prompt = PROMPT_READY;
+			break;
+		case LEXRES_BACKSLASH:	/* backslash */
+			result = PSCAN_BACKSLASH;
+			*prompt = PROMPT_READY;
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid yylex result\n");
+			exit(1);
+	}
+
+	return result;
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+my_psql_scan_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yy_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Create a new scanning state for this lexer that parses from the current
+ * position of the given scanning state of another lexer.  The given state
+ * is destroyed.
+ *
+ * Note: This function cannot access the yy* functions and variables of the
+ * given state because they belong to a different lexer.
+ */
+void
+psql_scan_switch_lexer(PsqlScanState state)
+{
+	const char	   *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_initialize(state);
+	psql_scan_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+my_psql_scan_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
+/*
+ * Return true if lexer is currently in an "inside quotes" state.
+ *
+ * This is pretty grotty but is needed to preserve the old behavior
+ * that mainloop.c drops blank lines not inside quotes without even
+ * echoing them.
+ */
+bool
+psql_scan_in_quote(PsqlScanState state)
+{
+	return state->start_state != INITIAL;
+}
+
+/*
+ * Push the given string onto the stack of stuff to scan.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+void
+push_new_buffer(const char *newstr, const char *varname)
+{
+	StackElem  *stackelem;
+
+	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
+
+	/*
+	 * In current usage, the passed varname points at the current flex
+	 * input buffer; we must copy it before calling prepare_buffer()
+	 * because that will change the buffer state.
+	 */
+	stackelem->varname = varname ? pg_strdup(varname) : NULL;
+
+	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
+									&stackelem->bufstring);
+	cur_state->curline = stackelem->bufstring;
+	if (ENC_IS_SAFE(cur_state))
+	{
+		stackelem->origstring = NULL;
+		cur_state->refline = stackelem->bufstring;
+	}
+	else
+	{
+		stackelem->origstring = pg_strdup(newstr);
+		cur_state->refline = stackelem->origstring;
+	}
+	stackelem->next = cur_state->buffer_stack;
+	cur_state->buffer_stack = stackelem;
+}
+
+/*
+ * Pop the topmost buffer stack item (there must be one!)
+ *
+ * NB: after this, the flex input state is unspecified; caller must
+ * switch to an appropriate buffer to continue lexing.
+ */
+void
+pop_buffer_stack(PsqlScanState state)
+{
+	StackElem  *stackelem = state->buffer_stack;
+
+	state->buffer_stack = stackelem->next;
+	yy_delete_buffer(stackelem->buf);
+	free(stackelem->bufstring);
+	if (stackelem->origstring)
+		free(stackelem->origstring);
+	if (stackelem->varname)
+		free(stackelem->varname);
+	free(stackelem);
+}
+
+/*
+ * Check if specified variable name is the source for any string
+ * currently being scanned
+ */
+bool
+var_is_current_source(PsqlScanState state, const char *varname)
+{
+	StackElem  *stackelem;
+
+	for (stackelem = state->buffer_stack;
+		 stackelem != NULL;
+		 stackelem = stackelem->next)
+	{
+		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Set up a flex input buffer to scan the given data.  We always make a
+ * copy of the data.  If working in an unsafe encoding, the copy has
+ * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+YY_BUFFER_STATE
+prepare_buffer(const char *txt, int len, char **txtcopy)
+{
+	char	   *newtxt;
+
+	/* Flex wants two \0 characters after the actual data */
+	newtxt = pg_malloc(len + 2);
+	*txtcopy = newtxt;
+	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
+
+	if (ENC_IS_SAFE(cur_state))
+		memcpy(newtxt, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		int		i = 0;
+
+		while (i < len)
+		{
+			int		thislen = cur_state->cb.enc_mblen(txt + i);
+
+			/* first byte should always be okay... */
+			newtxt[i] = txt[i];
+			i++;
+			while (--thislen > 0 && i < len)
+				newtxt[i++] = (char) 0xFF;
+		}
+	}
+
+	return cur_state->my_yy_scan_buffer(newtxt, len + 2);
+}
+
+/*
+ * scan_emit() --- body for ECHO macro
+ *
+ * NB: this must be used for ALL and ONLY the text copied from the flex
+ * input data.  If you pass it something that is not part of the yytext
+ * string, you are making a mistake.  Internally generated text can be
+ * appended directly to output_buf.
+ */
+void
+scan_emit(const char *txt, int len)
+{
+	if (ENC_IS_SAFE(cur_state))
+		appendBinaryPQExpBuffer(output_buf, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			appendPQExpBufferChar(output_buf, ch);
+		}
+	}
+}
+
+/*
+ * extract_substring --- fetch the true value of (part of) the current token
+ *
+ * This is like scan_emit(), except that the data is returned as a malloc'd
+ * string rather than being pushed directly to output_buf.
+ */
+char *
+extract_substring(const char *txt, int len)
+{
+	char	   *result = (char *) pg_malloc(len + 1);
+
+	if (ENC_IS_SAFE(cur_state))
+		memcpy(result, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			result[i] = ch;
+		}
+	}
+	result[len] = '\0';
+	return result;
+}
+
+/*
+ * escape_variable --- process :'VARIABLE' or :"VARIABLE"
+ *
+ * If the variable name is found, escape its value using the appropriate
+ * quoting method and emit the value to output_buf.  (Since the result is
+ * surely quoted, there is never any reason to rescan it.)  If we don't
+ * find the variable or the escaping function fails, emit the token as-is.
+ */
+void
+escape_variable(bool as_ident)
+{
+	/* Variable lookup if possible. */
+	if (cur_state->cb.get_variable)
+	{
+		char		*varname;
+		const char  *value;
+		void	   (*free_fn)(void *);
+
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = cur_state->cb.get_variable(varname, true, as_ident, &free_fn);
+		free(varname);
+
+		if (value)
+		{
+			appendPQExpBufferStr(output_buf, value);
+			if (free_fn)
+				free_fn((void*)value);
+			return;
+		}
+	}
+
+	/*
+	 * If we reach this point, some kind of error has occurred.  Emit the
+	 * original text into the output buffer.
+	 */
+	scan_emit(yytext, yyleng);
+}
+
+/* Default error output function */
+static void psql_error_errout(const char *fmt, ...)
+{
+	va_list	ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, _(fmt), ap);
+	va_end(ap);
+}
+
+/* Default function to check standard_conforming_strings */
+static bool psql_standard_strings(void)
+{
+	return false;
+}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 6916f6f..47e9077 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -337,9 +337,12 @@ main(int argc, char *argv[])
 					puts(cell->val);
 
 				scan_state = psql_scan_create();
-				psql_scan_setup(scan_state,
-								cell->val,
-								strlen(cell->val));
+				/* set enc_mblen according to the encoding */
+				psqlscan_callbacks.enc_mblen =
+					(pg_valid_server_encoding_id(pset.encoding) ?
+					 NULL : &psql_mblen);
+				psql_scan_setup(scan_state,	cell->val, strlen(cell->val),
+								&psqlscan_callbacks);
 
 				successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 					? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1
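
The unsafe-encoding handling described in the comments on prepare_buffer() and scan_emit() in the patch above can be tried in isolation. What follows is only a minimal sketch, not the patch's code: demo_mblen(), mask_multibyte() and unmask_range() are hypothetical names, and a UTF-8 length rule stands in for the real enc_mblen callback purely because it is short to write. The idea shown is the same, though: the lexer sees a copy in which the trailing bytes of each multibyte character are 0xFF, and the true bytes are restored from the original string when text is emitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an enc_mblen callback (UTF-8 lead-byte rule). */
static int
demo_mblen(const char *s)
{
    unsigned char c = (unsigned char) *s;

    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Copy txt, replacing every non-first byte of a multibyte character
 * with 0xFF.  The caller frees the result.
 */
static char *
mask_multibyte(const char *txt, size_t len)
{
    char   *copy = malloc(len + 1);
    size_t  i = 0;

    assert(copy != NULL);
    while (i < len)
    {
        int     thislen = demo_mblen(txt + i);

        copy[i] = txt[i];       /* the first byte is kept as-is */
        i++;
        while (--thislen > 0 && i < len)
            copy[i++] = (char) 0xFF;
    }
    copy[len] = '\0';
    return copy;
}

/*
 * Return len bytes of the masked text starting at offset off, with each
 * 0xFF byte replaced by the byte at the same offset in the original.
 * The caller frees the result.
 */
static char *
unmask_range(const char *masked, const char *orig, size_t off, size_t len)
{
    char   *result = malloc(len + 1);
    size_t  i;

    assert(result != NULL);
    for (i = 0; i < len; i++)
    {
        char    ch = masked[off + i];

        if (ch == (char) 0xFF)
            ch = orig[off + i];
        result[i] = ch;
    }
    result[len] = '\0';
    return result;
}
```

Masking keeps a trailing byte that happens to equal a quote or backslash from being mistaken for one by the lexer rules, and because the masked copy has the same length as the original, offsets into one are valid in the other.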

0002-pgbench-uses-common-frontend-SQL-parser.patch (text/x-patch; charset=us-ascii)
From 438f2b0a6e885758800695335c0904dc7502a7fe Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 6 Jan 2016 15:45:58 +0900
Subject: [PATCH 2/3] pgbench uses common frontend SQL parser

Make pgbench use the common frontend SQL parser instead of its
homegrown parser.
---
 src/bin/pgbench/Makefile  |   8 +-
 src/bin/pgbench/pgbench.c | 476 +++++++++++++++++++++++++++++++---------------
 2 files changed, 329 insertions(+), 155 deletions(-)
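
Note for readers: the practical effect of this change is that a statement ends at a semicolon rather than at a newline, so the caller keeps feeding lines and asks the scanner again after each completed statement. The following is a rough, self-contained sketch of that calling pattern under heavy simplification. DemoAccum, demo_reset() and demo_scan() are hypothetical names, and the toy scanner only tracks single-quoted literals, whereas the real psqlscan lexer also handles comments, dollar quoting, parentheses and more.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the scan state plus its output buffer. */
typedef struct
{
    char    buf[256];           /* statement text collected so far */
    size_t  len;
    bool    in_quote;           /* currently inside a '...' literal? */
} DemoAccum;

/* Forget any collected text and return to the initial state. */
static void
demo_reset(DemoAccum *acc)
{
    memset(acc, 0, sizeof(*acc));
}

/*
 * Scan the input at *pos, appending to acc->buf.
 *
 * Returns true when a semicolon outside quotes ends a statement; acc->buf
 * then holds that statement and *pos points just past the semicolon.
 * Returns false when the end of the input line is reached first; whatever
 * was collected stays in acc->buf so the next line can continue it.
 */
static bool
demo_scan(DemoAccum *acc, const char **pos)
{
    const char *p;

    for (p = *pos; *p != '\0'; p++)
    {
        /* skip whitespace between statements */
        if (acc->len == 0 && isspace((unsigned char) *p))
            continue;

        assert(acc->len + 1 < sizeof(acc->buf));
        acc->buf[acc->len++] = *p;
        acc->buf[acc->len] = '\0';

        if (*p == '\'')
            acc->in_quote = !acc->in_quote;
        else if (*p == ';' && !acc->in_quote)
        {
            *pos = p + 1;
            return true;
        }
    }
    *pos = p;
    return false;
}
```

A caller would loop on demo_scan() for each input line, handle the statement and call demo_reset() whenever it returns true, and move on to the next line when it returns false, which is roughly the shape of the next_command loop in process_commands() below.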

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 18fdf58..afb48a6 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -5,11 +5,12 @@ PGAPPICON = win32
 
 subdir = src/bin/pgbench
 top_builddir = ../../..
+psqldir = ../psql
 include $(top_builddir)/src/Makefile.global
 
-OBJS = pgbench.o exprparse.o $(WIN32RES)
+OBJS = pgbench.o exprparse.o $(psqldir)/psqlscan.o $(WIN32RES)
 
-override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(psqldir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
 override CFLAGS += $(PTHREAD_CFLAGS)
@@ -24,6 +25,9 @@ pgbench: $(OBJS) | submake-libpq submake-libpgport
 # exprscan is compiled as part of exprparse
 exprparse.o: exprscan.c
 
+$(psqldir)/psqlscan.o:
+	$(MAKE) -C $(psqldir) psqlscan.o
+
 distprep: exprparse.c exprscan.c
 
 install: all installdirs
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 596d112..8793fd2 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -285,7 +286,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -295,6 +296,7 @@ typedef struct
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* next command, if any, for multistatements */
 } Command;
 
 static struct
@@ -303,6 +305,22 @@ static struct
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
+
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -430,6 +448,9 @@ usage(void)
 		   progname, progname);
 }
 
+PsqlScanCallbacks pgbench_scan_callbacks =
+{NULL, NULL, NULL};
+
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
@@ -2287,216 +2308,346 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string after removing newlines and collapsing whitespace */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines with spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace((unsigned char) *p) && isspace((unsigned char) *(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
-		return NULL;
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;	/* not a backslash command */
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace((unsigned char) *p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
+ 		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 	initSimpleStats(&my_commands->stats);
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
+	j = 0;
+	tok = strtok(++p, delim);
 
-		my_commands->type = META_COMMAND;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - proc_state->outbuf->data + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
+	parse_reset_outbuf(proc_state);
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*--------
+		 * parsing:
+		 *	 \setrandom variable min max [uniform]
+		 *	 \setrandom variable min max (gaussian|exponential) parameter
+		 */
 
-		while (tok != NULL)
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*--------
-			 * parsing:
-			 *	 \setrandom variable min max [uniform]
-			 *	 \setrandom variable min max (gaussian|exponential) parameter
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
-			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							  "missing parameter", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
+							 "missing parameter", my_commands->argv[4], -1);
 			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.
+		 * We don't have to terminate the number argument with null
+		 * because it will be parsed with atoi, which ignores trailing
+		 * non-digit characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
+
+/* Parse an input line; return non-NULL if it completes one or more commands. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation of the backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
+	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), &pgbench_scan_callbacks);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Append this command to the end of the returned command chain */
+		if (!retcomd)
+			retcomd = command;
+		else
+		{
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
+ 	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n",
+				source, lineno);
+		exit(1);
 	}
 
-	return my_commands;
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
+
+	return retcomd;
 }
 
 /*
@@ -2557,6 +2708,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2571,26 +2723,38 @@ process_file(char *filename)
 		return NULL;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command    *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 
 		free(buf);
 
 		if (command == NULL)
-			continue;
-
-		my_commands[index] = command;
-		index++;
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. In either case, immediately ask for the
+			 * next line.
+			 */
+ 			continue;
+		}
 
-		if (index >= alloc_num)
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
+		
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
 			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
@@ -2598,6 +2762,8 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	return my_commands;
@@ -2613,6 +2779,7 @@ process_builtin(const char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2639,10 +2806,12 @@ process_builtin(const char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2654,6 +2823,7 @@ process_builtin(const char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
-- 
1.8.3.1
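
The process_file loop in the patch above copies a chain of commands into a fixed-step array and checks the capacity only after the whole chain has been written. As a minimal sketch of one alternative, the capacity check can sit next to each write, always leaving room for the trailing NULL. The names Cmd, CmdArray and cmd_array_append_chain are hypothetical and do not appear in pgbench; plain realloc stands in for pg_realloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for pgbench's Command; only the link matters here. */
typedef struct Cmd
{
	int			num;
	struct Cmd *next;
} Cmd;

/* Hypothetical NULL-terminated, growable array of command pointers. */
typedef struct
{
	Cmd		  **items;
	size_t		len;			/* number of stored commands */
	size_t		cap;			/* allocated slots */
} CmdArray;

/*
 * Append every node of "chain" to the array.  The array is grown before
 * each write so that one slot is always left for the NULL terminator.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
static int
cmd_array_append_chain(CmdArray *a, Cmd *chain)
{
	for (; chain != NULL; chain = chain->next)
	{
		if (a->len + 2 > a->cap)	/* new item plus terminator */
		{
			size_t		ncap = a->cap ? a->cap * 2 : 4;
			Cmd		  **p = realloc(a->items, ncap * sizeof(Cmd *));

			if (p == NULL)
				return -1;
			a->items = p;
			a->cap = ncap;
		}
		a->items[a->len++] = chain;
	}

	if (a->items != NULL)
		a->items[a->len] = NULL;
	return 0;
}
```

With this shape, a single input line that yields several statements cannot write past the end of the array, whatever the chain length.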

0003-Change-the-way-to-hold-command-list.patch (text/x-patch; charset=us-ascii)
From 0bab31f3f1dbddc3006b05c9d6cb3ca9d4ee1b54 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 7 Jan 2016 15:54:19 +0900
Subject: [PATCH 3/3] Change the way to hold command list.

Currently, commands for SQL statements are generated as a linked list
but are then stored in, and accessed through, an array. This patch
unifies the storage so that a linked list is used throughout.
---
 src/bin/pgbench/pgbench.c | 189 +++++++++++++++++++++-------------------------
 1 file changed, 85 insertions(+), 104 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 8793fd2..e934e87 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -191,6 +191,7 @@ typedef struct
 
 #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
+#define MAX_ARGS		10
 
 /*
  * Simple data structure to keep stats about something.
@@ -222,13 +223,29 @@ typedef struct StatsData
 } StatsData;
 
 /*
+ * Structure for individual command
+ */
+typedef struct Command_t
+{
+	char	   *line;			/* full text of command line */
+	int			command_num;	/* unique index of this Command struct */
+	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
+	int			argc;			/* number of command words */
+	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
+	PgBenchExpr *expr;			/* parsed expression */
+	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* more command if any, for multistatements */
+} Command;
+
+/*
  * Connection state
  */
 typedef struct
 {
 	PGconn	   *con;			/* connection handle to DB */
 	int			id;				/* client No. */
-	int			state;			/* state No. */
+	Command	   *curr;			/* current command */
 	bool		listen;			/* whether an async query has been sent */
 	bool		is_throttled;	/* whether transaction throttling is done */
 	bool		sleeping;		/* whether the client is napping */
@@ -273,7 +290,6 @@ typedef struct
  */
 #define SQL_COMMAND		1
 #define META_COMMAND	2
-#define MAX_ARGS		10
 
 typedef enum QueryMode
 {
@@ -286,23 +302,10 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct Command_t
-{
-	char	   *line;			/* full text of command line */
-	int			command_num;	/* unique index of this Command struct */
-	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
-	int			argc;			/* number of command words */
-	char	   *argv[MAX_ARGS]; /* command word list */
-	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
-	PgBenchExpr *expr;			/* parsed expression */
-	SimpleStats stats;			/* time spent in this command */
-	struct Command_t *next;		/* more command if any, for multistatements */
-} Command;
-
 static struct
 {
 	const char *name;
-	Command   **commands;
+	Command   *commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 
@@ -1271,7 +1274,7 @@ static bool
 doCustom(TState *thread, CState *st, StatsData *agg)
 {
 	PGresult   *res;
-	Command   **commands;
+	Command    *commands;
 	bool		trans_needs_throttle = false;
 	instr_time	now;
 
@@ -1351,13 +1354,13 @@ top:
 
 	if (st->listen)
 	{							/* are we receiver? */
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			if (debug)
 				fprintf(stderr, "client %d receiving\n", st->id);
 			if (!PQconsumeInput(st->con))
 			{					/* there's something wrong */
-				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->state);
+				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->curr->command_num);
 				return clientDone(st, false);
 			}
 			if (PQisBusy(st->con))
@@ -1374,13 +1377,13 @@ top:
 				INSTR_TIME_SET_CURRENT(now);
 
 			/* XXX could use a mutex here, but we choose not to */
-			addToSimpleStats(&commands[st->state]->stats,
+			addToSimpleStats(&st->curr->stats,
 							 INSTR_TIME_GET_DOUBLE(now) -
 							 INSTR_TIME_GET_DOUBLE(st->stmt_begin));
 		}
 
 		/* transaction finished: calculate latency and log the transaction */
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (progress || throttle_delay || latency_limit ||
 				per_script_stats || use_log)
@@ -1389,7 +1392,7 @@ top:
 				thread->stats.cnt++;
 		}
 
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			/*
 			 * Read and discard the query result; note this is not included in
@@ -1403,7 +1406,8 @@ top:
 					break;		/* OK */
 				default:
 					fprintf(stderr, "client %d aborted in state %d: %s",
-							st->id, st->state, PQerrorMessage(st->con));
+							st->id, st->curr->command_num,
+							PQerrorMessage(st->con));
 					PQclear(res);
 					return clientDone(st, false);
 			}
@@ -1411,7 +1415,7 @@ top:
 			discard_response(st);
 		}
 
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (is_connect)
 			{
@@ -1424,16 +1428,16 @@ top:
 				return clientDone(st, true);	/* exit success */
 		}
 
-		/* increment state counter */
-		st->state++;
-		if (commands[st->state] == NULL)
+		/* move to the next state */
+		st->curr = st->curr->next;
+		if (st->curr == NULL)
 		{
-			st->state = 0;
 			st->use_file = chooseScript(thread);
 			commands = sql_script[st->use_file].commands;
 			if (debug)
 				fprintf(stderr, "client %d executing script \"%s\"\n", st->id,
 						sql_script[st->use_file].name);
+			st->curr = commands;
 			st->is_throttled = false;
 
 			/*
@@ -1477,7 +1481,7 @@ top:
 
 	/* Record transaction start time under logging, progress or throttling */
 	if ((use_log || progress || throttle_delay || latency_limit ||
-		 per_script_stats) && st->state == 0)
+		 per_script_stats) && st->curr == commands)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1493,9 +1497,9 @@ top:
 	if (is_latencies)
 		INSTR_TIME_SET_CURRENT(st->stmt_begin);
 
-	if (commands[st->state]->type == SQL_COMMAND)
+	if (st->curr->type == SQL_COMMAND)
 	{
-		const Command *command = commands[st->state];
+		const Command *command = st->curr;
 		int			r;
 
 		if (querymode == QUERY_SIMPLE)
@@ -1529,18 +1533,19 @@ top:
 
 			if (!st->prepared[st->use_file])
 			{
-				int			j;
+				int			j = 0;
+				Command		*pcom = commands;
 
-				for (j = 0; commands[j] != NULL; j++)
+				for (; pcom ; pcom = pcom->next, j++)
 				{
 					PGresult   *res;
 					char		name[MAX_PREPARE_NAME];
 
-					if (commands[j]->type != SQL_COMMAND)
+					if (pcom->type != SQL_COMMAND)
 						continue;
 					preparedStatementName(name, st->use_file, j);
 					res = PQprepare(st->con, name,
-						  commands[j]->argv[0], commands[j]->argc - 1, NULL);
+						  pcom->argv[0], pcom->argc - 1, NULL);
 					if (PQresultStatus(res) != PGRES_COMMAND_OK)
 						fprintf(stderr, "%s", PQerrorMessage(st->con));
 					PQclear(res);
@@ -1549,7 +1554,7 @@ top:
 			}
 
 			getQueryParams(st, command, params);
-			preparedStatementName(name, st->use_file, st->state);
+			preparedStatementName(name, st->use_file, st->curr->command_num);
 
 			if (debug)
 				fprintf(stderr, "client %d sending %s\n", st->id, name);
@@ -1569,11 +1574,11 @@ top:
 		else
 			st->listen = true;	/* flags that should be listened */
 	}
-	else if (commands[st->state]->type == META_COMMAND)
+	else if (st->curr->type == META_COMMAND)
 	{
-		int			argc = commands[st->state]->argc,
+		int			argc = st->curr->argc,
 					i;
-		char	  **argv = commands[st->state]->argv;
+		char	  **argv = st->curr->argv;
 
 		if (debug)
 		{
@@ -1723,7 +1728,7 @@ top:
 		else if (pg_strcasecmp(argv[0], "set") == 0)
 		{
 			char		res[64];
-			PgBenchExpr *expr = commands[st->state]->expr;
+			PgBenchExpr *expr = st->curr->expr;
 			int64		result;
 
 			if (!evaluateExpr(st, expr, &result))
@@ -2697,36 +2702,28 @@ read_line_from_file(FILE *fd)
  * Given a file name, read it and return the array of Commands contained
  * therein.  "-" means to read stdin.
  */
-static Command **
+static Command *
 process_file(char *filename)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
 	FILE	   *fd;
-	int			lineno,
-				index;
+	int			lineno;
 	char	   *buf;
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	if (strcmp(filename, "-") == 0)
 		fd = stdin;
 	else if ((fd = fopen(filename, "r")) == NULL)
 	{
 		fprintf(stderr, "could not open file \"%s\": %s\n",
 				filename, strerror(errno));
-		pg_free(my_commands);
 		return NULL;
 	}
 
 	proc_state->mode = PS_IDLE;
 
 	lineno = 0;
-	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
@@ -2748,44 +2745,35 @@ process_file(char *filename)
  			continue;
 		}
 
-		while (command)
-		{
-			my_commands[index++] = command;
-			command = command->next;
-		}
-		
-		if (index > alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
-		}
+		/* Append new commands at the end of the list */
+		if (my_commands_tail)
+			my_commands_tail->next = command;
+		else
+			my_commands = my_commands_tail = command;
+
+		/* Seek to the tail of the list */
+		while (my_commands_tail->next)
+			my_commands_tail = my_commands_tail->next;
 	}
 	fclose(fd);
 
 	parse_finish_scan(proc_state);
 
-	my_commands[index] = NULL;
+	if (my_commands_tail) my_commands_tail->next = NULL;
 
 	return my_commands;
 }
 
-static Command **
+static Command *
 process_builtin(const char *tb, const char *source)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
-	int			lineno,
-				index;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
+	int			lineno;
 	char		buf[BUFSIZ];
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	lineno = 0;
-	index = 0;
 
 	for (;;)
 	{
@@ -2810,19 +2798,17 @@ process_builtin(const char *tb, const char *source)
 		if (command == NULL)
 			continue;
 
-		/* builtin doesn't need multistatements */
+		/* For simplicity, builtin scripts must not contain multistatements */
 		Assert(command->next == NULL);
-		my_commands[index] = command;
-		index++;
-
-		if (index >= alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+		if (my_commands_tail)
+ 		{
+			my_commands_tail->next = command;
+			my_commands_tail = command;
 		}
+		else
+			my_commands = my_commands_tail = command;
 	}
 
-	my_commands[index] = NULL;
 	parse_finish_scan(proc_state);
 
 	return my_commands;
@@ -2860,10 +2846,9 @@ findBuiltin(const char *name, char **desc)
 }
 
 static void
-addScript(const char *name, Command **commands)
+addScript(const char *name, Command *commands)
 {
-	if (commands == NULL ||
-		commands[0] == NULL)
+	if (commands == NULL)
 	{
 		fprintf(stderr, "empty command list for script \"%s\"\n", name);
 		exit(1);
@@ -2986,17 +2971,16 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 			/* Report per-command latencies */
 			if (is_latencies)
 			{
-				Command   **commands;
+				Command   *command;
 
 				printf(" - statement latencies in milliseconds:\n");
 
-				for (commands = sql_script[i].commands;
-					 *commands != NULL;
-					 commands++)
+				for (command = sql_script[i].commands;
+					 command != NULL;
+					 command = command->next)
 					printf("   %11.3f  %s\n",
-						   1000.0 * (*commands)->stats.sum /
-						   (*commands)->stats.count,
-						   (*commands)->line);
+						   1000.0 * command->stats.sum / command->stats.count,
+						   command->line);
 			}
 		}
 	}
@@ -3797,20 +3781,19 @@ threadRun(void *arg)
 	{
 		CState	   *st = &state[i];
 		int			prev_ecnt = st->ecnt;
-		Command   **commands;
 
 		st->use_file = chooseScript(thread);
-		commands = sql_script[st->use_file].commands;
+		st->curr = sql_script[st->use_file].commands;
 		if (debug)
 			fprintf(stderr, "client %d executing script \"%s\"\n", st->id,
 					sql_script[st->use_file].name);
 		if (!doCustom(thread, st, &aggs))
 			remains--;			/* I've aborted */
 
-		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+		if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 		{
 			fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-					i, st->state);
+					i, st->curr->command_num);
 			remains--;			/* I've aborted */
 			PQfinish(st->con);
 			st->con = NULL;
@@ -3831,7 +3814,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_script[st->use_file].commands;
 			int			sock;
 
 			if (st->con == NULL)
@@ -3867,7 +3849,7 @@ threadRun(void *arg)
 						min_usec = this_usec;
 				}
 			}
-			else if (commands[st->state]->type == META_COMMAND)
+			else if (st->curr->type == META_COMMAND)
 			{
 				min_usec = 0;	/* the connection is ready to run */
 				break;
@@ -3937,7 +3919,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_script[st->use_file].commands;
 			int			prev_ecnt = st->ecnt;
 
 			if (st->con)
@@ -3950,17 +3931,17 @@ threadRun(void *arg)
 					goto done;
 				}
 				if (FD_ISSET(sock, &input_mask) ||
-					commands[st->state]->type == META_COMMAND)
+					st->curr->type == META_COMMAND)
 				{
 					if (!doCustom(thread, st, &aggs))
 						remains--;		/* I've aborted */
 				}
 			}
 
-			if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+			if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 			{
 				fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-						i, st->state);
+						i, st->curr->command_num);
 				remains--;		/* I've aborted */
 				PQfinish(st->con);
 				st->con = NULL;
-- 
1.8.3.1
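
The list handling that the 0003 patch introduces in process_file and process_builtin can be sketched in isolation as follows. This is only an illustration under assumptions: Cmd, CmdList, cmd_list_append and cmd_list_length are hypothetical names, not pgbench identifiers. The sketch keeps a head and a tail pointer, appends a whole chain at once, and tolerates an empty list or an empty chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pgbench's Command struct. */
typedef struct Cmd
{
	int			command_num;
	struct Cmd *next;
} Cmd;

/* Hypothetical list header holding both ends of the command list. */
typedef struct
{
	Cmd		   *head;
	Cmd		   *tail;
} CmdList;

/* Append a (possibly multi-element) chain to the end of the list. */
static void
cmd_list_append(CmdList *list, Cmd *chain)
{
	if (chain == NULL)
		return;					/* nothing terminated on this line */

	if (list->tail != NULL)
		list->tail->next = chain;
	else
		list->head = chain;

	/* Seek to the last node of the appended chain. */
	list->tail = chain;
	while (list->tail->next != NULL)
		list->tail = list->tail->next;
}

/* Walk the list the way printResults walks commands via ->next. */
static int
cmd_list_length(const CmdList *list)
{
	int			n = 0;

	for (const Cmd *c = list->head; c != NULL; c = c->next)
		n++;
	return n;
}
```

Because the tail is only dereferenced after a non-NULL chain has been linked in, a script with no commands simply leaves head as NULL, which a caller such as addScript can then report as an empty command list.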

psqlscanbody.l.diff (text/x-patch; charset=us-ascii)
--- psqlscan.l	2016-02-18 16:50:19.140495106 +0900
+++ psqlscanbody.l	2016-02-18 16:48:23.263579135 +0900
@@ -38,93 +38,22 @@
  *-------------------------------------------------------------------------
  */
-#include "postgres_fe.h"
-
 #include "psqlscan.h"
+#include "psqlscan_int.h"
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
-	YY_BUFFER_STATE buf;		/* flex input control structure */
-	char	   *bufstring;		/* data actually being scanned by flex */
-	char	   *origstring;		/* copy of original data, if needed */
-	char	   *varname;		/* name of variable providing data, or NULL */
-	struct StackElem *next;
-} StackElem;
-
-/*
- * All working state of the lexer must be stored in PsqlScanStateData
- * between calls.  This allows us to have multiple open lexer operations,
- * which is needed for nested include files.  The lexer itself is not
- * recursive, but it must be re-entrant.
- */
-typedef struct PsqlScanStateData
-{
-	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
-	/*
-	 * These variables always refer to the outer buffer, never to any
-	 * stacked variable-expansion buffer.
-	 */
-	YY_BUFFER_STATE scanbufhandle;
-	char	   *scanbuf;		/* start of outer-level input buffer */
-	const char *scanline;		/* current input line at outer level */
-
-	/* safe_encoding, curline, refline are used by emit() to replace FFs */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
-	const char *curline;		/* actual flex input string for cur buf */
-	const char *refline;		/* original data for cur buffer */
-
-	/*
-	 * All this state lives across successive input lines, until explicitly
-	 * reset by psql_scan_reset.
-	 */
-	int			start_state;	/* saved YY_START */
-	int			paren_depth;	/* depth of nesting in parentheses */
-	int			xcdepth;		/* depth of nesting in slash-star comments */
-	char	   *dolqstart;		/* current $foo$ quote start string */
-} PsqlScanStateData;
-
 static PsqlScanState cur_state;	/* current state while active */
 
-static PQExpBuffer output_buf;	/* current output buffer */
+PQExpBuffer output_buf;	/* current output buffer */
+
+#define ECHO scan_emit(yytext, yyleng)
 
-/* these variables do not need to be saved across calls */
-static enum slash_option_type option_type;
-static char *option_quote;
-static int	unquoted_option_chars;
-static int	backtick_start_offset;
-
-
-/* Return values from yylex() */
-#define LEXRES_EOL			0	/* end of input */
-#define LEXRES_SEMI			1	/* command-terminating semicolon found */
-#define LEXRES_BACKSLASH	2	/* backslash command start */
-#define LEXRES_OK			3	/* OK completion of backslash argument */
-
-
-static void evaluate_backtick(void);
-static void push_new_buffer(const char *newstr, const char *varname);
-static void pop_buffer_stack(PsqlScanState state);
-static bool var_is_current_source(PsqlScanState state, const char *varname);
-static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
-									  char **txtcopy);
-static void emit(const char *txt, int len);
-static char *extract_substring(const char *txt, int len);
-static void escape_variable(bool as_ident);
+/* Adjust curpos on yyless */
+#define my_yyless(n) cur_state->curpos -= (yyleng - (n)); yyless(n)
 
-#define ECHO emit(yytext, yyleng)
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
 
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
 %}
 
@@ -186,13 +115,4 @@
 %x xus
 %x xusend
-/* Additional exclusive states for psql only: lex backslash commands */
-%x xslashcmd
-%x xslashargstart
-%x xslasharg
-%x xslashquote
-%x xslashbackquote
-%x xslashdquote
-%x xslashwholeline
-%x xslashend
 
 /*
@@ -434,5 +354,5 @@
 					BEGIN(xc);
 					/* Put back any characters past slash-star; see above */
-					yyless(2);
+					my_yyless(2);
 					ECHO;
 				}
@@ -441,5 +361,5 @@
 					cur_state->xcdepth++;
 					/* Put back any characters past slash-star; see above */
-					yyless(2);
+					my_yyless(2);
 					ECHO;
 				}
@@ -473,5 +393,5 @@
 <xb>{quotestop}	|
 <xb>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -498,5 +418,5 @@
 <xh>{quotestop}	|
 <xh>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -504,10 +424,10 @@
 
 {xnstart}		{
-					yyless(1);				/* eat only 'n' this time */
+					my_yyless(1);				/* eat only 'n' this time */
 					ECHO;
 				}
 
 {xqstart}		{
-					if (standard_strings())
+					if (cur_state->cb.standard_strings())
 						BEGIN(xq);
 					else
@@ -525,5 +445,5 @@
 <xq,xe>{quotestop}	|
 <xq,xe>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -531,5 +451,5 @@
 <xus>{quotestop} |
 <xus>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(xusend);
 					ECHO;
@@ -540,5 +460,5 @@
 <xusend>{other} |
 <xusend>{xustop1} {
-					yyless(0);
+					my_yyless(0);
 					BEGIN(INITIAL);
 					ECHO;
@@ -587,5 +507,5 @@
 {dolqfailed}	{
 					/* throw back all but the initial "$" */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -604,5 +524,5 @@
 						 * $ for rescanning.  Consider $delim$...$junk$delim$
 						 */
-						yyless(yyleng-1);
+						my_yyless(yyleng-1);
 					}
 					ECHO;
@@ -632,5 +552,5 @@
 				}
 <xui>{dquote} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(xuiend);
 					ECHO;
@@ -641,5 +561,5 @@
 <xuiend>{other} |
 <xuiend>{xustop1} {
-					yyless(0);
+					my_yyless(0);
 					BEGIN(INITIAL);
 					ECHO;
@@ -658,5 +578,5 @@
 {xufailed}	{
 					/* throw back all but the initial u/U */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -727,5 +647,5 @@
 "\\"[;:]		{
 					/* Force a semicolon or colon into the query buffer */
-					emit(yytext + 1, 1);
+					scan_emit(yytext + 1, 1);
 				}
 
@@ -737,9 +657,14 @@
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char	   *varname = NULL;
+					const char *value = NULL;
+					void	  (*free_fn)(void *) = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->cb.get_variable)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = cur_state->cb.get_variable(varname,
+ 									false, false, &free_fn);
+					}
 
 					if (value)
@@ -749,5 +674,5 @@
 						{
 							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
+							cur_state->cb.error_out("skipping recursive expansion of variable \"%s\"\n",
 									   varname);
 							/* Instead copy the string as is */
@@ -760,4 +685,6 @@
 							/* yy_scan_string already made buffer active */
 						}
+						if (free_fn)
+							free_fn((void*)value);
 					}
 					else
@@ -770,5 +697,6 @@
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
@@ -788,5 +716,5 @@
 :'{variable_char}*	{
 					/* Throw back everything but the colon */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -794,5 +722,5 @@
 :\"{variable_char}*	{
 					/* Throw back everything but the colon */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -855,5 +783,5 @@
 					{
 						/* Strip the unwanted chars from the token */
-						yyless(nchars);
+						my_yyless(nchars);
 					}
 					ECHO;
@@ -872,5 +800,5 @@
 {decimalfail}	{
 					/* throw back the .., and treat as integer */
-					yyless(yyleng-2);
+					my_yyless(yyleng-2);
 					ECHO;
 				}
@@ -885,10 +813,10 @@
 					 * syntax error anyway, we don't bother to distinguish.
 					 */
-					yyless(yyleng-1);
+					my_yyless(yyleng-1);
 					ECHO;
 				}
 {realfail2}		{
 					/* throw back the [Ee][+-], and proceed as above */
-					yyless(yyleng-2);
+					my_yyless(yyleng-2);
 					ECHO;
 				}
@@ -934,260 +862,24 @@
 					}
 				}
+%%
 
-	/*
-	 * Exclusive lexer states to handle backslash command lexing
-	 */
-
-<xslashcmd>{
-	/* command name ends at whitespace or backslash; eat all else */
-
-{space}|"\\"	{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashargstart>{
-	/*
-	 * Discard any whitespace before argument, then go to xslasharg state.
-	 * An exception is that "|" is only special at start of argument, so we
-	 * check for it here.
-	 */
-
-{space}+		{ }
-
-"|"				{
-					if (option_type == OT_FILEPIPE)
-					{
-						/* treat like whole-string case */
-						ECHO;
-						BEGIN(xslashwholeline);
-					}
-					else
-					{
-						/* vertical bar is not special otherwise */
-						yyless(0);
-						BEGIN(xslasharg);
-					}
-				}
-
-{other}			{
-					yyless(0);
-					BEGIN(xslasharg);
-				}
-
-}
-
-<xslasharg>{
-	/*
-	 * Default processing of text in a slash command's argument.
-	 *
-	 * Note: unquoted_option_chars counts the number of characters at the
-	 * end of the argument that were not subject to any form of quoting.
-	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
-	 */
-
-{space}|"\\"	{
-					/*
-					 * Unquoted space is end of arg; do not eat.  Likewise
-					 * backslash is end of command or next command, do not eat
-					 *
-					 * XXX this means we can't conveniently accept options
-					 * that include unquoted backslashes; therefore, option
-					 * processing that encourages use of backslashes is rather
-					 * broken.
-					 */
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{quote}			{
-					*option_quote = '\'';
-					unquoted_option_chars = 0;
-					BEGIN(xslashquote);
-				}
-
-"`"				{
-					backtick_start_offset = output_buf->len;
-					*option_quote = '`';
-					unquoted_option_chars = 0;
-					BEGIN(xslashbackquote);
-				}
-
-{dquote}		{
-					ECHO;
-					*option_quote = '"';
-					unquoted_option_chars = 0;
-					BEGIN(xslashdquote);
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						char   *varname;
-						const char *value;
-
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
-
-						/*
-						 * The variable value is just emitted without any
-						 * further examination.  This is consistent with the
-						 * pre-8.0 code behavior, if not with the way that
-						 * variables are handled outside backslash commands.
-						 * Note that we needn't guard against recursion here.
-						 */
-						if (value)
-							appendPQExpBufferStr(output_buf, value);
-						else
-							ECHO;
-
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}+'	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(false);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-
-:\"{variable_char}+\"	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(true);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-{other}			{
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-}
-
-<xslashquote>{
-	/*
-	 * single-quoted text: copy literally except for '' and backslash
-	 * sequences
-	 */
-
-{quote}			{ BEGIN(xslasharg); }
-
-{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
-
-"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
-"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
-"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
-"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
-"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
-
-{xeoctesc}		{
-					/* octal case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 1, NULL, 8));
-				}
-
-{xehexesc}		{
-					/* hex case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 2, NULL, 16));
-				}
-
-"\\".			{ emit(yytext + 1, 1); }
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashbackquote>{
-	/*
-	 * backticked text: copy everything until next backquote, then evaluate.
-	 *
-	 * XXX Possible future behavioral change: substitute for :VARIABLE?
-	 */
-
-"`"				{
-					/* In NO_EVAL mode, don't evaluate the command */
-					if (option_type != OT_NO_EVAL)
-						evaluate_backtick();
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashdquote>{
-	/* double-quoted text: copy verbatim, including the double quotes */
-
-{dquote}		{
-					ECHO;
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashwholeline>{
-	/* copy everything until end of input line */
-	/* but suppress leading whitespace */
-
-{space}+		{
-					if (output_buf->len > 0)
-						ECHO;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashend>{
-	/* at end of command, eat a double backslash, but not anything else */
-
-"\\\\"			{ return LEXRES_OK; }
-
-{other}|\n		{
-					yyless(0);
-					return LEXRES_OK;
-				}
+static void my_psql_scan_finish(PsqlScanState state);
+static void my_psql_scan_reset(PsqlScanState state);
+static void psql_error_errout(const char *fmt, ...)
+	__attribute__ ((format (printf, 1, 2)));
+static bool psql_standard_strings(void);
 
+static void
+psql_scan_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &my_psql_scan_finish;
+	state->reset = &my_psql_scan_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
 }
 
-%%
-
 /*
  * Create a lexer working state struct.
@@ -1199,6 +891,5 @@
 
 	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
-
-	psql_scan_reset(state);
+	psql_scan_initialize(state);
 
 	return state;
@@ -1228,6 +919,6 @@
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PsqlScanCallbacks *cb)
 {
 	/* Mustn't be scanning already */
@@ -1235,7 +926,17 @@
 	Assert(state->buffer_stack == NULL);
 
-	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	if (cb->standard_strings)
+		state->cb.standard_strings = cb->standard_strings;
+	else
+		state->cb.standard_strings = &psql_standard_strings;
+
+	state->cb.enc_mblen = cb->enc_mblen;
+
+	if (cb->error_out)
+		state->cb.error_out = cb->error_out;
+	else
+		state->cb.error_out = &psql_error_errout;
 
 	/* needed for prepare_buffer */
@@ -1246,4 +947,5 @@
 										  &state->scanbuf);
 	state->scanline = line;
+	state->curpos = 0;
 
 	/* Set lookaside data in case we have to map unsafe encoding */
@@ -1253,4 +955,23 @@
 
 /*
+ * Redirect functions for indirect calls. These functions may be called for
+ * scan state of other lexers.
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	if (state->finish)
+		state->finish(state);
+}
+
+void
+psql_scan_reset(PsqlScanState state)
+{
+	if (state->reset)
+		state->reset(state);
+}
+
+
+/*
  * Do lexical analysis of SQL command text.
  *
@@ -1409,6 +1130,6 @@
  * to deal with error recovery).
  */
-void
-psql_scan_finish(PsqlScanState state)
+static void
+my_psql_scan_finish(PsqlScanState state)
 {
 	/* Drop any incomplete variable expansions. */
@@ -1426,4 +1147,22 @@
 
 /*
+ * Create new lexer scanning state for this lexer which parses from the current
+ * position of the given scanning state for another lexer. The given state is
+ * destroyed.
+ * 
+ * Note: This function cannot access yy* functions and variables of the given
+ * state because they belong to a different lexer.
+ */
+void
+psql_scan_switch_lexer(PsqlScanState state)
+{
+	const char	   *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_initialize(state);
+	psql_scan_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
  * Reset lexer scanning state to start conditions.  This is appropriate
  * for executing \r psql commands (or any other time that we discard the
@@ -1436,6 +1175,6 @@
  * done by psql_scan_finish().
  */
-void
-psql_scan_reset(PsqlScanState state)
+static void
+my_psql_scan_reset(PsqlScanState state)
 {
 	state->start_state = INITIAL;
@@ -1461,290 +1200,4 @@
 
 /*
- * Scan the command name of a psql backslash command.  This should be called
- * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
- * has been consumed through the leading backslash.
- *
- * The return value is a malloc'd copy of the command name, as parsed off
- * from the input.
- */
-char *
-psql_scan_slash_command(PsqlScanState state)
-{
-	PQExpBufferData mybuf;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashcmd);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-
-	return mybuf.data;
-}
-
-/*
- * Parse off the next argument for a backslash command, and return it as a
- * malloc'd string.  If there are no more arguments, returns NULL.
- *
- * type tells what processing, if any, to perform on the option string;
- * for example, if it's a SQL identifier, we want to downcase any unquoted
- * letters.
- *
- * if quote is not NULL, *quote is set to 0 if no quoting was found, else
- * the last quote symbol used in the argument.
- *
- * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
- * be taken as part of the option string will be stripped.
- *
- * NOTE: the only possible syntax errors for backslash options are unmatched
- * quotes, which are detected when we run out of input.  Therefore, on a
- * syntax error we just throw away the string and return NULL; there is no
- * need to worry about flushing remaining input.
- */
-char *
-psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon)
-{
-	PQExpBufferData mybuf;
-	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
-	char		local_quote;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	if (quote == NULL)
-		quote = &local_quote;
-	*quote = 0;
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-	option_type = type;
-	option_quote = quote;
-	unquoted_option_chars = 0;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	if (type == OT_WHOLE_LINE)
-		BEGIN(xslashwholeline);
-	else
-		BEGIN(xslashargstart);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/*
-	 * Check the lex result: we should have gotten back either LEXRES_OK
-	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
-	 * a quoted string, as indicated by YY_START, EOL is an error.
-	 */
-	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
-
-	switch (YY_START)
-	{
-		case xslashargstart:
-			/* empty arg */
-			break;
-		case xslasharg:
-			/* Strip any unquoted trailing semi-colons if requested */
-			if (semicolon)
-			{
-				while (unquoted_option_chars-- > 0 &&
-					   mybuf.len > 0 &&
-					   mybuf.data[mybuf.len - 1] == ';')
-				{
-					mybuf.data[--mybuf.len] = '\0';
-				}
-			}
-
-			/*
-			 * If SQL identifier processing was requested, then we strip out
-			 * excess double quotes and downcase unquoted letters.
-			 * Doubled double-quotes become output double-quotes, per spec.
-			 *
-			 * Note that a string like FOO"BAR"BAZ will be converted to
-			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
-			 * which would have us parse it as several identifiers.  But
-			 * for psql's purposes, we want a string like "foo"."bar" to
-			 * be treated as one option, so there's little choice.
-			 */
-			if (type == OT_SQLID || type == OT_SQLIDHACK)
-			{
-				bool		inquotes = false;
-				char	   *cp = mybuf.data;
-
-				while (*cp)
-				{
-					if (*cp == '"')
-					{
-						if (inquotes && cp[1] == '"')
-						{
-							/* Keep the first quote, remove the second */
-							cp++;
-						}
-						inquotes = !inquotes;
-						/* Collapse out quote at *cp */
-						memmove(cp, cp + 1, strlen(cp));
-						mybuf.len--;
-						/* do not advance cp */
-					}
-					else
-					{
-						if (!inquotes && type == OT_SQLID)
-							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
-					}
-				}
-			}
-			break;
-		case xslashquote:
-		case xslashbackquote:
-		case xslashdquote:
-			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
-			termPQExpBuffer(&mybuf);
-			return NULL;
-		case xslashwholeline:
-			/* always okay */
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid YY_START\n");
-			exit(1);
-	}
-
-	/*
-	 * An unquoted empty argument isn't possible unless we are at end of
-	 * command.  Return NULL instead.
-	 */
-	if (mybuf.len == 0 && *quote == 0)
-	{
-		termPQExpBuffer(&mybuf);
-		return NULL;
-	}
-
-	/* Else return the completed string. */
-	return mybuf.data;
-}
-
-/*
- * Eat up any unused \\ to complete a backslash command.
- */
-void
-psql_scan_slash_command_end(PsqlScanState state)
-{
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = NULL;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashend);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-}
-
-/*
- * Evaluate a backticked substring of a slash command's argument.
- *
- * The portion of output_buf starting at backtick_start_offset is evaluated
- * as a shell command and then replaced by the command's output.
- */
-static void
-evaluate_backtick(void)
-{
-	char	   *cmd = output_buf->data + backtick_start_offset;
-	PQExpBufferData cmd_output;
-	FILE	   *fd;
-	bool		error = false;
-	char		buf[512];
-	size_t		result;
-
-	initPQExpBuffer(&cmd_output);
-
-	fd = popen(cmd, PG_BINARY_R);
-	if (!fd)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (!error)
-	{
-		do
-		{
-			result = fread(buf, 1, sizeof(buf), fd);
-			if (ferror(fd))
-			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
-				error = true;
-				break;
-			}
-			appendBinaryPQExpBuffer(&cmd_output, buf, result);
-		} while (!feof(fd));
-	}
-
-	if (fd && pclose(fd) == -1)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (PQExpBufferDataBroken(cmd_output))
-	{
-		psql_error("%s: out of memory\n", cmd);
-		error = true;
-	}
-
-	/* Now done with cmd, delete it from output_buf */
-	output_buf->len = backtick_start_offset;
-	output_buf->data[output_buf->len] = '\0';
-
-	/* If no error, transfer result to output_buf */
-	if (!error)
-	{
-		/* strip any trailing newline */
-		if (cmd_output.len > 0 &&
-			cmd_output.data[cmd_output.len - 1] == '\n')
-			cmd_output.len--;
-		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
-	}
-
-	termPQExpBuffer(&cmd_output);
-}
-
-/*
  * Push the given string onto the stack of stuff to scan.
  *
@@ -1753,5 +1206,5 @@
  * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
  */
-static void
+void
 push_new_buffer(const char *newstr, const char *varname)
 {
@@ -1770,5 +1223,5 @@
 									&stackelem->bufstring);
 	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 	{
 		stackelem->origstring = NULL;
@@ -1790,5 +1243,5 @@
  * switch to an appropriate buffer to continue lexing.
  */
-static void
+void
 pop_buffer_stack(PsqlScanState state)
 {
@@ -1809,5 +1262,5 @@
  * currently being scanned
  */
-static bool
+bool
 var_is_current_source(PsqlScanState state, const char *varname)
 {
@@ -1833,5 +1286,5 @@
  * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
  */
-static YY_BUFFER_STATE
+YY_BUFFER_STATE
 prepare_buffer(const char *txt, int len, char **txtcopy)
 {
@@ -1843,5 +1296,5 @@
 	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(newtxt, txt, len);
 	else
@@ -1852,5 +1305,5 @@
 		while (i < len)
 		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
+			int		thislen = cur_state->cb.enc_mblen(txt + i);
 
 			/* first byte should always be okay... */
@@ -1862,9 +1315,9 @@
 	}
 
-	return yy_scan_buffer(newtxt, len + 2);
+	return cur_state->my_yy_scan_buffer(newtxt, len + 2);
 }
 
 /*
- * emit() --- body for ECHO macro
+ * scan_emit() --- body for ECHO macro
  *
  * NB: this must be used for ALL and ONLY the text copied from the flex
@@ -1873,8 +1326,8 @@
  * appended directly to output_buf.
  */
-static void
-emit(const char *txt, int len)
+void
+scan_emit(const char *txt, int len)
 {
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		appendBinaryPQExpBuffer(output_buf, txt, len);
 	else
@@ -1900,13 +1353,13 @@
  * extract_substring --- fetch the true value of (part of) the current token
  *
- * This is like emit(), except that the data is returned as a malloc'd string
- * rather than being pushed directly to output_buf.
+ * This is like scan_emit(), except that the data is returned as a malloc'd
+ * string rather than being pushed directly to output_buf.
  */
-static char *
+char *
 extract_substring(const char *txt, int len)
 {
 	char	   *result = (char *) pg_malloc(len + 1);
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(result, txt, len);
 	else
@@ -1939,43 +1392,24 @@
  * find the variable or the escaping function fails, emit the token as-is.
  */
-static void
+void
 escape_variable(bool as_ident)
 {
-	char	   *varname;
-	const char *value;
-
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
-
-	/* Escaping. */
-	if (value)
+	/* Variable lookup if possible. */
+	if (cur_state->cb.get_variable)
 	{
-		if (!pset.db)
-			psql_error("can't escape without active connection\n");
-		else
-		{
-			char   *escaped_value;
+		char		*varname;
+		const char  *value;
+		void	   (*free_fn)(void *);
+
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = cur_state->cb.get_variable(varname, true, as_ident, &free_fn);
+		free(varname);
 
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
+		if (value)
+		{
+			appendPQExpBufferStr(output_buf, value);
+			if (free_fn)
+				free_fn((void*)value);
+			return;
 		}
 	}
@@ -1985,4 +1419,20 @@
 	 * original text into the output buffer.
 	 */
-	emit(yytext, yyleng);
+	scan_emit(yytext, yyleng);
+}
+
+/* Default error output function */
+static void psql_error_errout(const char *fmt, ...)
+{
+	va_list	ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, _(fmt), ap);
+	va_end(ap);
+}
+
+/* Default function to check standard_conforming_strings */
+static bool psql_standard_strings(void)
+{
+	return false;
 }
pgbench.c.patient.diff (text/x-patch; charset=us-ascii)
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 596d112..8793fd2 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -285,7 +286,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -295,6 +296,7 @@ typedef struct
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* more command if any, for multistatements */
 } Command;
 
 static struct
@@ -303,6 +305,22 @@ static struct
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
+
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -430,6 +448,9 @@ usage(void)
 		   progname, progname);
 }
 
+PsqlScanCallbacks pgbench_scan_callbacks =
+{NULL, NULL, NULL};
+
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
@@ -2287,43 +2308,109 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string after removing newlines and collapsing whitespaces */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines into spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace(*p) && isspace(*(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;	/* not a backslash command */
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace(*p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
  		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 	initSimpleStats(&my_commands->stats);
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
-
-		my_commands->type = META_COMMAND;
-
 	j = 0;
 	tok = strtok(++p, delim);
 
@@ -2340,6 +2427,7 @@ process_commands(char *buf, const char *source, const int lineno)
 		else
 			tok = strtok(NULL, delim);
 	}
+	parse_reset_outbuf(proc_state);
 
 	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
 	{
@@ -2475,28 +2563,91 @@ process_commands(char *buf, const char *source, const int lineno)
 		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
 					 "invalid command", NULL, -1);
 	}
+
+	return my_commands;
+}
+
+/* Parse an input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation of the backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
 	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), &pgbench_scan_callbacks);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Put this command at the end of returning command chain */
+		if (!retcomd)
+			retcomd = command;
 		else
 		{
-		my_commands->type = SQL_COMMAND;
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
 		case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
 			break;
 		case QUERY_EXTENDED:
 		case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
 			break;
 		default:
 			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
  	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n",
+				source, lineno);
+		exit(1);
+	}
+
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
 
-	return my_commands;
+	return retcomd;
 }
 
 /*
@@ -2557,6 +2708,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2571,26 +2723,38 @@ process_file(char *filename)
 		return NULL;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command    *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. Immediately ask for the next line for the
+			 * cases.
+			 */
  			continue;
+		}
 
-		my_commands[index] = command;
-		index++;
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
 		
-		if (index >= alloc_num)
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
 			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
@@ -2598,6 +2762,8 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	return my_commands;
@@ -2613,6 +2779,7 @@ process_builtin(const char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2639,10 +2806,12 @@ process_builtin(const char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2654,6 +2823,7 @@ process_builtin(const char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
#50David Steele
david@pgmasters.net
In reply to: Kyotaro HORIGUCHI (#49)
Re: pgbench - allow backslash-continuations in custom scripts

On 2/18/16 6:54 AM, Kyotaro HORIGUCHI wrote:

First, I rebased the previous patch set and merged three of
the patches. The set now consists of three patches.

1. Making SQL parser part of psqlscan independent from psql.

Moved psql's backslash command stuff out of the original psqlscan.l.
The psql facilities that the parser used to access directly are now
reached through a set of callback functions, all of which can be
NULL for callers other than psql.

2. Making pgbench use the new psqlscan parser.

3. Changing the way SQL/META commands are held from an array to a
linked list.

Patch #2 introduced a linked list to store SQL multistatements, but
the caller immediately moves the elements into an array. This
patch switches the storage entirely to a linked list.
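
The callback arrangement described in item 1 can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: ScanCallbacks, ScanState, scan_setup, scan_expand and demo_get_variable are hypothetical names, and only two of the callbacks are modelled. It assumes, as the patch appears to do, that a missing variable-lookup callback leaves the token unexpanded and that a missing standard_strings callback falls back to a default returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the patch's PsqlScanCallbacks. */
typedef struct ScanCallbacks
{
	const char *(*get_variable) (const char *name);	/* may be NULL */
	bool		(*standard_strings) (void);			/* may be NULL */
} ScanCallbacks;

typedef struct ScanState
{
	ScanCallbacks cb;
} ScanState;

/* Default used when the caller supplies no standard_strings callback. */
static bool
default_standard_strings(void)
{
	return false;
}

/* Copy the caller's callbacks, substituting defaults where needed. */
void
scan_setup(ScanState *state, const ScanCallbacks *cb)
{
	assert(state != NULL && cb != NULL);

	state->cb.get_variable = cb->get_variable;
	state->cb.standard_strings =
		cb->standard_strings ? cb->standard_strings : default_standard_strings;
}

/*
 * Expand a ":name" token through the lookup callback if one is installed;
 * otherwise (or if the variable is unknown) return the token unchanged.
 */
const char *
scan_expand(const ScanState *state, const char *token)
{
	if (state->cb.get_variable != NULL && token[0] == ':')
	{
		const char *value = state->cb.get_variable(token + 1);

		if (value != NULL)
			return value;
	}
	return token;
}

/* Example lookup that a psql-like client might install. */
const char *
demo_get_variable(const char *name)
{
	return strcmp(name, "scale") == 0 ? "10" : NULL;
}
```

A caller such as pgbench would pass a structure whose members are all NULL, so variable references pass through the lexer untouched, while a psql-like caller would install its own lookup function.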

Any takers to review this updated patch?

--
-David
david@pgmasters.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#51Fabien COELHO
coelho@cri.ensmp.fr
In reply to: David Steele (#50)
Re: pgbench - allow backslash-continuations in custom scripts

Hello David,

Any takers to review this updated patch?

I intend to have a look at it, I had a look at a previous instance, but
I'm ok if someone wants to proceed.

--
Fabien.


#52David Steele
david@pgmasters.net
In reply to: Fabien COELHO (#51)
Re: pgbench - allow backslash-continuations in custom scripts

Hi Fabien,

On 3/14/16 3:27 PM, Fabien COELHO wrote:

Any takers to review this updated patch?

I intend to have a look at it, I had a look at a previous instance, but
I'm ok if someone wants to proceed.

There's not exactly a long line of reviewers at the moment so if you
could do a followup review that would be great.

Thanks,
--
-David
david@pgmasters.net

#53Fabien COELHO
coelho@cri.ensmp.fr
In reply to: David Steele (#52)
Re: pgbench - allow backslash-continuations in custom scripts

I intend to have a look at it, I had a look at a previous instance, but
I'm ok if someone wants to proceed.

There's not exactly a long line of reviewers at the moment so if you
could do a followup review that would be great.

Ok. It is in the queue, not for right now, though.

--
Fabien.


#54Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Fabien COELHO (#53)
5 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

At Tue, 15 Mar 2016 14:55:52 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in <alpine.DEB.2.10.1603151455170.23831@sto>

I intend to have a look at it, I had a look at a previous instance, but
I'm ok if someone wants to proceed.

There's not exactly a long line of reviewers at the moment so if you
could do a followup review that would be great.

Ok. It is in the queue, not for right now, though.

Thank you.

This patchset needs "make maintainer-clean" before applying
because it adds src/bin/psql/psqlscan.c, which is currently
generated by flex. All of the previous patches still applied, but
with many offsets, so I rebased them. The two subsidiary diffs are
the same as in the previous patch set.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Make-SQL-parser-part-of-psqlscan-independent-from-ps.patch (text/x-patch; charset=us-ascii)
From c61dcfefda2274479418786429145c3bd9b6821a Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 16 Mar 2016 12:05:04 +0900
Subject: [PATCH 1/3] Make SQL parser part of psqlscan independent from psql.

Moved psql's backslash command stuff out of the original psqlscan.l.
The psql facilities that the parser used to access directly are now
reached through a set of callback functions, all of which can be NULL
for callers other than psql.
---
 src/bin/psql/Makefile             |   14 +-
 src/bin/psql/command.c            |    1 +
 src/bin/psql/common.c             |   54 +
 src/bin/psql/common.h             |    8 +
 src/bin/psql/mainloop.c           |   20 +-
 src/bin/psql/psqlscan.h           |   42 +-
 src/bin/psql/psqlscan.l           | 1988 -------------------------------------
 src/bin/psql/psqlscan_int.h       |   84 ++
 src/bin/psql/psqlscan_slash.c     |   19 +
 src/bin/psql/psqlscan_slash.h     |   31 +
 src/bin/psql/psqlscan_slashbody.l |  766 ++++++++++++++
 src/bin/psql/psqlscanbody.l       | 1438 +++++++++++++++++++++++++++
 src/bin/psql/startup.c            |    9 +-
 13 files changed, 2436 insertions(+), 2038 deletions(-)
 delete mode 100644 src/bin/psql/psqlscan.l
 create mode 100644 src/bin/psql/psqlscan_int.h
 create mode 100644 src/bin/psql/psqlscan_slash.c
 create mode 100644 src/bin/psql/psqlscan_slash.h
 create mode 100644 src/bin/psql/psqlscan_slashbody.l
 create mode 100644 src/bin/psql/psqlscanbody.l

diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile
index 66e14fb..05a60fe 100644
--- a/src/bin/psql/Makefile
+++ b/src/bin/psql/Makefile
@@ -23,7 +23,7 @@ override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(top_srcdir)/src/bin/p
 OBJS=	command.o common.o help.o input.o stringutils.o mainloop.o copy.o \
 	startup.o prompt.o variables.o large_obj.o print.o describe.o \
 	tab-complete.o mbprint.o dumputils.o keywords.o kwlookup.o \
-	sql_help.o \
+	sql_help.o psqlscan.o psqlscan_slash.o\
 	$(WIN32RES)
 
 
@@ -44,13 +44,13 @@ sql_help.c: sql_help.h ;
 sql_help.h: create_help.pl $(wildcard $(REFDOCDIR)/*.sgml)
 	$(PERL) $< $(REFDOCDIR) $*
 
-# psqlscan is compiled as part of mainloop
-mainloop.o: psqlscan.c
+psqlscan.o: psqlscan.c psqlscanbody.c common.h psqlscan_int.h
+psqlscan_slash.o: psqlscan_slash.c psqlscan_slashbody.c common.h psqlscan_int.h
 
-psqlscan.c: FLEXFLAGS = -Cfe -p -p
-psqlscan.c: FLEX_NO_BACKUP=yes
+psqlscanbody.c psqlscan_slashbody.c: FLEXFLAGS = -Cfe -p -p
+psqlscanbody.c psqlscan_slashbody.c: FLEX_NO_BACKUP=yes
 
-distprep: sql_help.h psqlscan.c
+distprep: sql_help.h psqlscanbody.c psqlscan_slashbody.c
 
 install: all installdirs
 	$(INSTALL_PROGRAM) psql$(X) '$(DESTDIR)$(bindir)/psql$(X)'
@@ -67,4 +67,4 @@ clean distclean:
 	rm -f psql$(X) $(OBJS) dumputils.c keywords.c kwlookup.c lex.backup
 
 maintainer-clean: distclean
-	rm -f sql_help.h sql_help.c psqlscan.c
+	rm -f sql_help.h sql_help.c psqlscanbody.c psqlscan_slashbody.c
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9750a5b..e42fca7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -46,6 +46,7 @@
 #include "mainloop.h"
 #include "print.h"
 #include "psqlscan.h"
+#include "psqlscan_slash.h"
 #include "settings.h"
 #include "variables.h"
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 2cb2e9b..79eb04e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -30,6 +30,9 @@ static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);
 
+PsqlScanCallbacks psqlscan_callbacks =
+{&get_variable, &psql_mblen, &standard_strings, &psql_error};
+
 
 /*
  * openQueryOutputFile --- attempt to open a query output file
@@ -1901,3 +1904,54 @@ recognized_connection_string(const char *connstr)
 {
 	return uri_prefix_length(connstr) != 0 || strchr(connstr, '=') != NULL;
 }
+
+/* Access callback to "shell variables" for lexer */
+const char *
+get_variable(const char *name, bool escape, bool as_ident,
+			 void (**free_func)(void *))
+{
+	const char *value;
+	char   *escaped_value;
+
+	*free_func = NULL;
+
+	value = GetVariable(pset.vars, name);
+
+	if (!escape)
+		return value;
+
+	/* Escaping. */
+
+	if (!value)
+		return NULL;
+
+	if (!pset.db)
+	{
+		psql_error("can't escape without active connection\n");
+		return NULL;
+	}
+
+	if (as_ident)
+		escaped_value =
+			PQescapeIdentifier(pset.db, value, strlen(value));
+	else
+		escaped_value =
+			PQescapeLiteral(pset.db, value, strlen(value));
+
+	if (escaped_value == NULL)
+	{
+		const char *error = PQerrorMessage(pset.db);
+
+		psql_error("%s", error);
+		return NULL;
+	}
+
+	*free_func = &PQfreemem;
+	return escaped_value;
+}
+
+int
+psql_mblen(const char *s)
+{
+	return PQmblen(s, pset.encoding);
+}
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 6ba3f44..686503a 100644
--- a/src/bin/psql/common.h
+++ b/src/bin/psql/common.h
@@ -13,6 +13,7 @@
 #include "libpq-fe.h"
 
 #include "print.h"
+#include "psqlscan.h"
 
 #define atooid(x)  ((Oid) strtoul((x), NULL, 10))
 
@@ -29,6 +30,8 @@ extern sigjmp_buf sigint_interrupt_jmp;
 
 extern volatile bool cancel_pressed;
 
+extern PsqlScanCallbacks psqlscan_callbacks;
+
 /* Note: cancel_pressed is defined in print.c, see that file for reasons */
 
 extern void setup_cancel_handler(void);
@@ -49,4 +52,9 @@ extern void expand_tilde(char **filename);
 
 extern bool recognized_connection_string(const char *connstr);
 
+extern const char *get_variable(const char *name, bool escape, bool as_ident,
+								void (**free_func)(void *));
+
+extern int psql_mblen(const char *s);
+
 #endif   /* COMMON_H */
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index dadbd29..13424be 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -16,7 +16,6 @@
 
 #include "mb/pg_wchar.h"
 
-
 /*
  * Main processing loop for reading lines of input
  *	and sending them to the backend.
@@ -233,7 +232,11 @@ MainLoop(FILE *source)
 		/*
 		 * Parse line, looking for command separators.
 		 */
-		psql_scan_setup(scan_state, line, strlen(line));
+		/* set enc_mblen according to the encoding */
+		psqlscan_callbacks.enc_mblen =
+			(pg_valid_server_encoding_id(pset.encoding) ? NULL : &psql_mblen);
+
+		psql_scan_setup(scan_state, line, strlen(line), &psqlscan_callbacks);
 		success = true;
 		line_saved_in_history = false;
 
@@ -373,7 +376,8 @@ MainLoop(FILE *source)
 					resetPQExpBuffer(query_buf);
 					/* reset parsing state since we are rescanning whole line */
 					psql_scan_reset(scan_state);
-					psql_scan_setup(scan_state, line, strlen(line));
+					psql_scan_setup(scan_state, line, strlen(line),
+									&psqlscan_callbacks);
 					line_saved_in_history = false;
 					prompt_status = PROMPT_READY;
 				}
@@ -450,13 +454,3 @@ MainLoop(FILE *source)
 
 	return successResult;
 }	/* MainLoop() */
-
-
-/*
- * psqlscan.c is #include'd here instead of being compiled on its own.
- * This is because we need postgres_fe.h to be read before any system
- * include files, else things tend to break on platforms that have
- * multiple infrastructures for stdio.h and so on.  flex is absolutely
- * uncooperative about that, so we can't compile psqlscan.c on its own.
- */
-#include "psqlscan.c"
diff --git a/src/bin/psql/psqlscan.h b/src/bin/psql/psqlscan.h
index 674ba69..322edd3 100644
--- a/src/bin/psql/psqlscan.h
+++ b/src/bin/psql/psqlscan.h
@@ -12,10 +12,20 @@
 
 #include "prompt.h"
 
-
 /* Abstract type for lexer's internal state */
 typedef struct PsqlScanStateData *PsqlScanState;
 
+typedef struct PsqlScanCallbacks
+{
+	const char *(*get_variable)(const char *, bool escape, bool as_ident,
+								void (**free_fn)(void *));
+	/* enc_mblen is needed only if encoding is not safe */
+	int	 (*enc_mblen)(const char *);
+	bool (*standard_strings)(void); /* standard_conforming_strings */
+	void (*error_out)(const char *fmt, ...) /* write error message */
+		pg_attribute_printf(1, 2);
+} PsqlScanCallbacks;
+
 /* Termination states for psql_scan() */
 typedef enum
 {
@@ -25,40 +35,18 @@ typedef enum
 	PSCAN_EOL					/* end of line, SQL possibly complete */
 } PsqlScanResult;
 
-/* Different ways for scan_slash_option to handle parameter words */
-enum slash_option_type
-{
-	OT_NORMAL,					/* normal case */
-	OT_SQLID,					/* treat as SQL identifier */
-	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
-	OT_FILEPIPE,				/* it's a filename or pipe */
-	OT_WHOLE_LINE,				/* just snarf the rest of the line */
-	OT_NO_EVAL					/* no expansion of backticks or variables */
-};
-
-
 extern PsqlScanState psql_scan_create(void);
 extern void psql_scan_destroy(PsqlScanState state);
 
-extern void psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len);
+extern void psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+							PsqlScanCallbacks *callbacks);
 extern void psql_scan_finish(PsqlScanState state);
 
 extern PsqlScanResult psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt);
+								PQExpBuffer query_buf,
+								promptStatus_t *prompt);
 
 extern void psql_scan_reset(PsqlScanState state);
-
 extern bool psql_scan_in_quote(PsqlScanState state);
 
-extern char *psql_scan_slash_command(PsqlScanState state);
-
-extern char *psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon);
-
-extern void psql_scan_slash_command_end(PsqlScanState state);
-
 #endif   /* PSQLSCAN_H */
diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
deleted file mode 100644
index bbe0172..0000000
--- a/src/bin/psql/psqlscan.l
+++ /dev/null
@@ -1,1988 +0,0 @@
-%{
-/*-------------------------------------------------------------------------
- *
- * psqlscan.l
- *	  lexical scanner for psql
- *
- * This code is mainly needed to determine where the end of a SQL statement
- * is: we are looking for semicolons that are not within quotes, comments,
- * or parentheses.  The most reliable way to handle this is to borrow the
- * backend's flex lexer rules, lock, stock, and barrel.  The rules below
- * are (except for a few) the same as the backend's, but their actions are
- * just ECHO whereas the backend's actions generally do other things.
- *
- * XXX The rules in this file must be kept in sync with the backend lexer!!!
- *
- * XXX Avoid creating backtracking cases --- see the backend lexer for info.
- *
- * The most difficult aspect of this code is that we need to work in multibyte
- * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
- * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
- * all our lexing rules treat all high-bit-set characters alike, we don't
- * really need to care whether such a byte is part of a sequence or not.
- * In an "unsafe" encoding, we still expect the first byte of a multibyte
- * sequence to be >= 0x80, but later bytes might not be.  If we scan such
- * a sequence as-is, the lexing rules could easily be fooled into matching
- * such bytes to ordinary ASCII characters.  Our solution for this is to
- * substitute 0xFF for each non-first byte within the data presented to flex.
- * The flex rules will then pass the FF's through unmolested.  The emit()
- * subroutine is responsible for looking back to the original string and
- * replacing FF's with the corresponding original bytes.
- *
- * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * IDENTIFICATION
- *	  src/bin/psql/psqlscan.l
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres_fe.h"
-
-#include "psqlscan.h"
-
-#include <ctype.h>
-
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
-	YY_BUFFER_STATE buf;		/* flex input control structure */
-	char	   *bufstring;		/* data actually being scanned by flex */
-	char	   *origstring;		/* copy of original data, if needed */
-	char	   *varname;		/* name of variable providing data, or NULL */
-	struct StackElem *next;
-} StackElem;
-
-/*
- * All working state of the lexer must be stored in PsqlScanStateData
- * between calls.  This allows us to have multiple open lexer operations,
- * which is needed for nested include files.  The lexer itself is not
- * recursive, but it must be re-entrant.
- */
-typedef struct PsqlScanStateData
-{
-	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
-	/*
-	 * These variables always refer to the outer buffer, never to any
-	 * stacked variable-expansion buffer.
-	 */
-	YY_BUFFER_STATE scanbufhandle;
-	char	   *scanbuf;		/* start of outer-level input buffer */
-	const char *scanline;		/* current input line at outer level */
-
-	/* safe_encoding, curline, refline are used by emit() to replace FFs */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
-	const char *curline;		/* actual flex input string for cur buf */
-	const char *refline;		/* original data for cur buffer */
-
-	/*
-	 * All this state lives across successive input lines, until explicitly
-	 * reset by psql_scan_reset.
-	 */
-	int			start_state;	/* saved YY_START */
-	int			paren_depth;	/* depth of nesting in parentheses */
-	int			xcdepth;		/* depth of nesting in slash-star comments */
-	char	   *dolqstart;		/* current $foo$ quote start string */
-} PsqlScanStateData;
-
-static PsqlScanState cur_state;	/* current state while active */
-
-static PQExpBuffer output_buf;	/* current output buffer */
-
-/* these variables do not need to be saved across calls */
-static enum slash_option_type option_type;
-static char *option_quote;
-static int	unquoted_option_chars;
-static int	backtick_start_offset;
-
-
-/* Return values from yylex() */
-#define LEXRES_EOL			0	/* end of input */
-#define LEXRES_SEMI			1	/* command-terminating semicolon found */
-#define LEXRES_BACKSLASH	2	/* backslash command start */
-#define LEXRES_OK			3	/* OK completion of backslash argument */
-
-
-static void evaluate_backtick(void);
-static void push_new_buffer(const char *newstr, const char *varname);
-static void pop_buffer_stack(PsqlScanState state);
-static bool var_is_current_source(PsqlScanState state, const char *varname);
-static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
-									  char **txtcopy);
-static void emit(const char *txt, int len);
-static char *extract_substring(const char *txt, int len);
-static void escape_variable(bool as_ident);
-
-#define ECHO emit(yytext, yyleng)
-
-%}
-
-%option 8bit
-%option never-interactive
-%option nodefault
-%option noinput
-%option nounput
-%option noyywrap
-%option warn
-
-/*
- * All of the following definitions and rules should exactly match
- * src/backend/parser/scan.l so far as the flex patterns are concerned.
- * The rule bodies are just ECHO as opposed to what the backend does,
- * however.  (But be sure to duplicate code that affects the lexing process,
- * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
- * scan.l has a separate one for each exclusive state.
- */
-
-/*
- * OK, here is a short description of lex/flex rules behavior.
- * The longest pattern which matches an input string is always chosen.
- * For equal-length patterns, the first occurring in the rules list is chosen.
- * INITIAL is the starting state, to which all non-conditional rules apply.
- * Exclusive states change parsing rules while the state is active.  When in
- * an exclusive state, only those rules defined for that state apply.
- *
- * We use exclusive states for quoted strings, extended comments,
- * and to eliminate parsing troubles for numeric strings.
- * Exclusive states:
- *  <xb> bit string literal
- *  <xc> extended C-style comments
- *  <xd> delimited identifiers (double-quoted identifiers)
- *  <xh> hexadecimal numeric string
- *  <xq> standard quoted strings
- *  <xe> extended quoted strings (support backslash escape sequences)
- *  <xdolq> $foo$ quoted strings
- *  <xui> quoted identifier with Unicode escapes
- *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
- *  <xus> quoted string with Unicode escapes
- *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
- *
- * Note: we intentionally don't mimic the backend's <xeu> state; we have
- * no need to distinguish it from <xe> state, and no good way to get out
- * of it in error cases.  The backend just throws yyerror() in those
- * cases, but that's not an option here.
- */
-
-%x xb
-%x xc
-%x xd
-%x xh
-%x xe
-%x xq
-%x xdolq
-%x xui
-%x xuiend
-%x xus
-%x xusend
-/* Additional exclusive states for psql only: lex backslash commands */
-%x xslashcmd
-%x xslashargstart
-%x xslasharg
-%x xslashquote
-%x xslashbackquote
-%x xslashdquote
-%x xslashwholeline
-%x xslashend
-
-/*
- * In order to make the world safe for Windows and Mac clients as well as
- * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
- * sequence will be seen as two successive newlines, but that doesn't cause
- * any problems.  Comments that start with -- and extend to the next
- * newline are treated as equivalent to a single whitespace character.
- *
- * NOTE a fine point: if there is no newline following --, we will absorb
- * everything to the end of the input as a comment.  This is correct.  Older
- * versions of Postgres failed to recognize -- as a comment if the input
- * did not end with a newline.
- *
- * XXX perhaps \f (formfeed) should be treated as a newline as well?
- *
- * XXX if you change the set of whitespace characters, fix scanner_isspace()
- * to agree, and see also the plpgsql lexer.
- */
-
-space			[ \t\n\r\f]
-horiz_space		[ \t\f]
-newline			[\n\r]
-non_newline		[^\n\r]
-
-comment			("--"{non_newline}*)
-
-whitespace		({space}+|{comment})
-
-/*
- * SQL requires at least one newline in the whitespace separating
- * string literals that are to be concatenated.  Silly, but who are we
- * to argue?  Note that {whitespace_with_newline} should not have * after
- * it, whereas {whitespace} should generally have a * after it...
- */
-
-special_whitespace		({space}+|{comment}{newline})
-horiz_whitespace		({horiz_space}|{comment})
-whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
-
-/*
- * To ensure that {quotecontinue} can be scanned without having to back up
- * if the full pattern isn't matched, we include trailing whitespace in
- * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
- * except for {quote} followed by whitespace and just one "-" (not two,
- * which would start a {comment}).  To cover that we have {quotefail}.
- * The actions for {quotestop} and {quotefail} must throw back characters
- * beyond the quote proper.
- */
-quote			'
-quotestop		{quote}{whitespace}*
-quotecontinue	{quote}{whitespace_with_newline}{quote}
-quotefail		{quote}{whitespace}*"-"
-
-/* Bit string
- * It is tempting to scan the string for only those characters
- * which are allowed. However, this leads to silently swallowed
- * characters if illegal characters are included in the string.
- * For example, if xbinside is [01] then B'ABCD' is interpreted
- * as a zero-length string, and the ABCD' is lost!
- * Better to pass the string forward and let the input routines
- * validate the contents.
- */
-xbstart			[bB]{quote}
-xbinside		[^']*
-
-/* Hexadecimal number */
-xhstart			[xX]{quote}
-xhinside		[^']*
-
-/* National character */
-xnstart			[nN]{quote}
-
-/* Quoted string that allows backslash escapes */
-xestart			[eE]{quote}
-xeinside		[^\\']+
-xeescape		[\\][^0-7]
-xeoctesc		[\\][0-7]{1,3}
-xehexesc		[\\]x[0-9A-Fa-f]{1,2}
-xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
-xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
-
-/* Extended quote
- * xqdouble implements embedded quote, ''''
- */
-xqstart			{quote}
-xqdouble		{quote}{quote}
-xqinside		[^']+
-
-/* $foo$ style quotes ("dollar quoting")
- * The quoted string starts with $foo$ where "foo" is an optional string
- * in the form of an identifier, except that it may not contain "$",
- * and extends to the first occurrence of an identical string.
- * There is *no* processing of the quoted text.
- *
- * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
- * fails to match its trailing "$".
- */
-dolq_start		[A-Za-z\200-\377_]
-dolq_cont		[A-Za-z\200-\377_0-9]
-dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
-dolqfailed		\${dolq_start}{dolq_cont}*
-dolqinside		[^$]+
-
-/* Double quote
- * Allows embedded spaces and other special characters into identifiers.
- */
-dquote			\"
-xdstart			{dquote}
-xdstop			{dquote}
-xddouble		{dquote}{dquote}
-xdinside		[^"]+
-
-/* Unicode escapes */
-uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
-/* error rule to avoid backup */
-uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
-
-/* Quoted identifier with Unicode escapes */
-xuistart		[uU]&{dquote}
-
-/* Quoted string with Unicode escapes */
-xusstart		[uU]&{quote}
-
-/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
-xustop1		{uescapefail}?
-xustop2		{uescape}
-
-/* error rule to avoid backup */
-xufailed		[uU]&
-
-
-/* C-style comments
- *
- * The "extended comment" syntax closely resembles allowable operator syntax.
- * The tricky part here is to get lex to recognize a string starting with
- * slash-star as a comment, when interpreting it as an operator would produce
- * a longer match --- remember lex will prefer a longer match!  Also, if we
- * have something like plus-slash-star, lex will think this is a 3-character
- * operator whereas we want to see it as a + operator and a comment start.
- * The solution is two-fold:
- * 1. append {op_chars}* to xcstart so that it matches as much text as
- *    {operator} would. Then the tie-breaker (first matching rule of same
- *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
- *    in case it contains a star-slash that should terminate the comment.
- * 2. In the operator rule, check for slash-star within the operator, and
- *    if found throw it back with yyless().  This handles the plus-slash-star
- *    problem.
- * Dash-dash comments have similar interactions with the operator rule.
- */
-xcstart			\/\*{op_chars}*
-xcstop			\*+\/
-xcinside		[^*/]+
-
-digit			[0-9]
-ident_start		[A-Za-z\200-\377_]
-ident_cont		[A-Za-z\200-\377_0-9\$]
-
-identifier		{ident_start}{ident_cont}*
-
-/* Assorted special-case operators and operator-like tokens */
-typecast		"::"
-dot_dot			\.\.
-colon_equals	":="
-equals_greater	"=>"
-less_equals		"<="
-greater_equals	">="
-less_greater	"<>"
-not_equals		"!="
-
-/*
- * "self" is the set of chars that should be returned as single-character
- * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
- * which can be one or more characters long (but if a single-char token
- * appears in the "self" set, it is not to be returned as an Op).  Note
- * that the sets overlap, but each has some chars that are not in the other.
- *
- * If you change either set, adjust the character lists appearing in the
- * rule for "operator"!
- */
-self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
-op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
-operator		{op_chars}+
-
-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
- *
- * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
- *
- * {realfail1} and {realfail2} are added to prevent the need for scanner
- * backup when the {real} rule fails to match completely.
- */
-
-integer			{digit}+
-decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
-decimalfail		{digit}+\.\.
-real			({integer}|{decimal})[Ee][-+]?{digit}+
-realfail1		({integer}|{decimal})[Ee]
-realfail2		({integer}|{decimal})[Ee][-+]
-
-param			\${integer}
-
-/* psql-specific: characters allowed in variable names */
-variable_char	[A-Za-z\200-\377_0-9]
-
-other			.
-
-/*
- * Dollar quoted strings are totally opaque, and no escaping is done on them.
- * Other quoted strings must allow some special characters such as single-quote
- *  and newline.
- * Embedded single-quotes are implemented both in the SQL standard
- *  style of two adjacent single quotes "''" and in the Postgres/Java style
- *  of escaped-quote "\'".
- * Other embedded escaped characters are matched explicitly and the leading
- *  backslash is dropped from the string.
- * Note that xcstart must appear before operator, as explained above!
- *  Also whitespace (comment) must appear before operator.
- */
-
-%%
-
-{whitespace}	{
-					/*
-					 * Note that the whitespace rule includes both true
-					 * whitespace and single-line ("--" style) comments.
-					 * We suppress whitespace at the start of the query
-					 * buffer.  We also suppress all single-line comments,
-					 * which is pretty dubious but is the historical
-					 * behavior.
-					 */
-					if (!(output_buf->len == 0 || yytext[0] == '-'))
-						ECHO;
-				}
-
-{xcstart}		{
-					cur_state->xcdepth = 0;
-					BEGIN(xc);
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstart}	{
-					cur_state->xcdepth++;
-					/* Put back any characters past slash-star; see above */
-					yyless(2);
-					ECHO;
-				}
-
-<xc>{xcstop}	{
-					if (cur_state->xcdepth <= 0)
-					{
-						BEGIN(INITIAL);
-					}
-					else
-						cur_state->xcdepth--;
-					ECHO;
-				}
-
-<xc>{xcinside}	{
-					ECHO;
-				}
-
-<xc>{op_chars}	{
-					ECHO;
-				}
-
-<xc>\*+			{
-					ECHO;
-				}
-
-{xbstart}		{
-					BEGIN(xb);
-					ECHO;
-				}
-<xb>{quotestop}	|
-<xb>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xh>{xhinside}	|
-<xb>{xbinside}	{
-					ECHO;
-				}
-<xh>{quotecontinue}	|
-<xb>{quotecontinue}	{
-					ECHO;
-				}
-
-{xhstart}		{
-					/* Hexadecimal bit type.
-					 * At some point we should simply pass the string
-					 * forward to the parser and label it there.
-					 * In the meantime, place a leading "x" on the string
-					 * to mark it for the input routine as a hex string.
-					 */
-					BEGIN(xh);
-					ECHO;
-				}
-<xh>{quotestop}	|
-<xh>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-
-{xnstart}		{
-					yyless(1);				/* eat only 'n' this time */
-					ECHO;
-				}
-
-{xqstart}		{
-					if (standard_strings())
-						BEGIN(xq);
-					else
-						BEGIN(xe);
-					ECHO;
-				}
-{xestart}		{
-					BEGIN(xe);
-					ECHO;
-				}
-{xusstart}		{
-					BEGIN(xus);
-					ECHO;
-				}
-<xq,xe>{quotestop}	|
-<xq,xe>{quotefail} {
-					yyless(1);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xus>{quotestop} |
-<xus>{quotefail} {
-					yyless(1);
-					BEGIN(xusend);
-					ECHO;
-				}
-<xusend>{whitespace} {
-					ECHO;
-				}
-<xusend>{other} |
-<xusend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xusend>{xustop2} {
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xq,xe,xus>{xqdouble} {
-					ECHO;
-				}
-<xq,xus>{xqinside}  {
-					ECHO;
-				}
-<xe>{xeinside}  {
-					ECHO;
-				}
-<xe>{xeunicode} {
-					ECHO;
-				}
-<xe>{xeunicodefail}	{
-					ECHO;
-				}
-<xe>{xeescape}  {
-					ECHO;
-				}
-<xe>{xeoctesc}  {
-					ECHO;
-				}
-<xe>{xehexesc}  {
-					ECHO;
-				}
-<xq,xe,xus>{quotecontinue} {
-					ECHO;
-				}
-<xe>.			{
-					/* This is only needed for \ just before EOF */
-					ECHO;
-				}
-
-{dolqdelim}		{
-					cur_state->dolqstart = pg_strdup(yytext);
-					BEGIN(xdolq);
-					ECHO;
-				}
-{dolqfailed}	{
-					/* throw back all but the initial "$" */
-					yyless(1);
-					ECHO;
-				}
-<xdolq>{dolqdelim} {
-					if (strcmp(yytext, cur_state->dolqstart) == 0)
-					{
-						free(cur_state->dolqstart);
-						cur_state->dolqstart = NULL;
-						BEGIN(INITIAL);
-					}
-					else
-					{
-						/*
-						 * When we fail to match $...$ to dolqstart, transfer
-						 * the $... part to the output, but put back the final
-						 * $ for rescanning.  Consider $delim$...$junk$delim$
-						 */
-						yyless(yyleng-1);
-					}
-					ECHO;
-				}
-<xdolq>{dolqinside} {
-					ECHO;
-				}
-<xdolq>{dolqfailed} {
-					ECHO;
-				}
-<xdolq>.		{
-					/* This is only needed for $ inside the quoted text */
-					ECHO;
-				}
-
-{xdstart}		{
-					BEGIN(xd);
-					ECHO;
-				}
-{xuistart}		{
-					BEGIN(xui);
-					ECHO;
-				}
-<xd>{xdstop}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xui>{dquote} {
-					yyless(1);
-					BEGIN(xuiend);
-					ECHO;
-				}
-<xuiend>{whitespace} {
-					ECHO;
-				}
-<xuiend>{other} |
-<xuiend>{xustop1} {
-					yyless(0);
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xuiend>{xustop2}	{
-					BEGIN(INITIAL);
-					ECHO;
-				}
-<xd,xui>{xddouble}	{
-					ECHO;
-				}
-<xd,xui>{xdinside}	{
-					ECHO;
-				}
-
-{xufailed}	{
-					/* throw back all but the initial u/U */
-					yyless(1);
-					ECHO;
-				}
-
-{typecast}		{
-					ECHO;
-				}
-
-{dot_dot}		{
-					ECHO;
-				}
-
-{colon_equals}	{
-					ECHO;
-				}
-
-{equals_greater} {
-					ECHO;
-				}
-
-{less_equals}	{
-					ECHO;
-				}
-
-{greater_equals} {
-					ECHO;
-				}
-
-{less_greater}	{
-					ECHO;
-				}
-
-{not_equals}	{
-					ECHO;
-				}
-
-	/*
-	 * These rules are specific to psql --- they implement parenthesis
-	 * counting and detection of command-ending semicolon.  These must
-	 * appear before the {self} rule so that they take precedence over it.
-	 */
-
-"("				{
-					cur_state->paren_depth++;
-					ECHO;
-				}
-
-")"				{
-					if (cur_state->paren_depth > 0)
-						cur_state->paren_depth--;
-					ECHO;
-				}
-
-";"				{
-					ECHO;
-					if (cur_state->paren_depth == 0)
-					{
-						/* Terminate lexing temporarily */
-						return LEXRES_SEMI;
-					}
-				}
-
-	/*
-	 * psql-specific rules to handle backslash commands and variable
-	 * substitution.  We want these before {self}, also.
-	 */
-
-"\\"[;:]		{
-					/* Force a semicolon or colon into the query buffer */
-					emit(yytext + 1, 1);
-				}
-
-"\\"			{
-					/* Terminate lexing temporarily */
-					return LEXRES_BACKSLASH;
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
-
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
-
-					if (value)
-					{
-						/* It is a variable, check for recursion */
-						if (var_is_current_source(cur_state, varname))
-						{
-							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
-									   varname);
-							/* Instead copy the string as is */
-							ECHO;
-						}
-						else
-						{
-							/* OK, perform substitution */
-							push_new_buffer(value, varname);
-							/* yy_scan_string already made buffer active */
-						}
-					}
-					else
-					{
-						/*
-						 * if the variable doesn't exist we'll copy the
-						 * string as is
-						 */
-						ECHO;
-					}
-
-					free(varname);
-				}
-
-:'{variable_char}+'	{
-					escape_variable(false);
-				}
-
-:\"{variable_char}+\"	{
-					escape_variable(true);
-				}
-
-	/*
-	 * These rules just avoid the need for scanner backup if one of the
-	 * two rules above fails to match completely.
-	 */
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					ECHO;
-				}
-
-	/*
-	 * Back to backend-compatible rules.
-	 */
-
-{self}			{
-					ECHO;
-				}
-
-{operator}		{
-					/*
-					 * Check for embedded slash-star or dash-dash; those
-					 * are comment starts, so operator must stop there.
-					 * Note that slash-star or dash-dash at the first
-					 * character will match a prior rule, not this one.
-					 */
-					int		nchars = yyleng;
-					char   *slashstar = strstr(yytext, "/*");
-					char   *dashdash = strstr(yytext, "--");
-
-					if (slashstar && dashdash)
-					{
-						/* if both appear, take the first one */
-						if (slashstar > dashdash)
-							slashstar = dashdash;
-					}
-					else if (!slashstar)
-						slashstar = dashdash;
-					if (slashstar)
-						nchars = slashstar - yytext;
-
-					/*
-					 * For SQL compatibility, '+' and '-' cannot be the
-					 * last char of a multi-char operator unless the operator
-					 * contains chars that are not in SQL operators.
-					 * The idea is to lex '=-' as two operators, but not
-					 * to forbid operator names like '?-' that could not be
-					 * sequences of SQL operators.
-					 */
-					while (nchars > 1 &&
-						   (yytext[nchars-1] == '+' ||
-							yytext[nchars-1] == '-'))
-					{
-						int		ic;
-
-						for (ic = nchars-2; ic >= 0; ic--)
-						{
-							if (strchr("~!@#^&|`?%", yytext[ic]))
-								break;
-						}
-						if (ic >= 0)
-							break; /* found a char that makes it OK */
-						nchars--; /* else remove the +/-, and check again */
-					}
-
-					if (nchars < yyleng)
-					{
-						/* Strip the unwanted chars from the token */
-						yyless(nchars);
-					}
-					ECHO;
-				}
-
-{param}			{
-					ECHO;
-				}
-
-{integer}		{
-					ECHO;
-				}
-{decimal}		{
-					ECHO;
-				}
-{decimalfail}	{
-					/* throw back the .., and treat as integer */
-					yyless(yyleng-2);
-					ECHO;
-				}
-{real}			{
-					ECHO;
-				}
-{realfail1}		{
-					/*
-					 * throw back the [Ee], and treat as {decimal}.  Note
-					 * that it is possible the input is actually {integer},
-					 * but since this case will almost certainly lead to a
-					 * syntax error anyway, we don't bother to distinguish.
-					 */
-					yyless(yyleng-1);
-					ECHO;
-				}
-{realfail2}		{
-					/* throw back the [Ee][+-], and proceed as above */
-					yyless(yyleng-2);
-					ECHO;
-				}
-
-
-{identifier}	{
-					ECHO;
-				}
-
-{other}			{
-					ECHO;
-				}
-
-
-	/*
-	 * Everything from here down is psql-specific.
-	 */
-
-<<EOF>>			{
-					StackElem  *stackelem = cur_state->buffer_stack;
-
-					if (stackelem == NULL)
-						return LEXRES_EOL; /* end of input reached */
-
-					/*
-					 * We were expanding a variable, so pop the inclusion
-					 * stack and keep lexing
-					 */
-					pop_buffer_stack(cur_state);
-
-					stackelem = cur_state->buffer_stack;
-					if (stackelem != NULL)
-					{
-						yy_switch_to_buffer(stackelem->buf);
-						cur_state->curline = stackelem->bufstring;
-						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
-					}
-					else
-					{
-						yy_switch_to_buffer(cur_state->scanbufhandle);
-						cur_state->curline = cur_state->scanbuf;
-						cur_state->refline = cur_state->scanline;
-					}
-				}
-
-	/*
-	 * Exclusive lexer states to handle backslash command lexing
-	 */
-
-<xslashcmd>{
-	/* command name ends at whitespace or backslash; eat all else */
-
-{space}|"\\"	{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashargstart>{
-	/*
-	 * Discard any whitespace before argument, then go to xslasharg state.
-	 * An exception is that "|" is only special at start of argument, so we
-	 * check for it here.
-	 */
-
-{space}+		{ }
-
-"|"				{
-					if (option_type == OT_FILEPIPE)
-					{
-						/* treat like whole-string case */
-						ECHO;
-						BEGIN(xslashwholeline);
-					}
-					else
-					{
-						/* vertical bar is not special otherwise */
-						yyless(0);
-						BEGIN(xslasharg);
-					}
-				}
-
-{other}			{
-					yyless(0);
-					BEGIN(xslasharg);
-				}
-
-}
-
-<xslasharg>{
-	/*
-	 * Default processing of text in a slash command's argument.
-	 *
-	 * Note: unquoted_option_chars counts the number of characters at the
-	 * end of the argument that were not subject to any form of quoting.
-	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
-	 */
-
-{space}|"\\"	{
-					/*
-					 * Unquoted space is end of arg; do not eat.  Likewise
-					 * backslash is end of command or next command, do not eat
-					 *
-					 * XXX this means we can't conveniently accept options
-					 * that include unquoted backslashes; therefore, option
-					 * processing that encourages use of backslashes is rather
-					 * broken.
-					 */
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{quote}			{
-					*option_quote = '\'';
-					unquoted_option_chars = 0;
-					BEGIN(xslashquote);
-				}
-
-"`"				{
-					backtick_start_offset = output_buf->len;
-					*option_quote = '`';
-					unquoted_option_chars = 0;
-					BEGIN(xslashbackquote);
-				}
-
-{dquote}		{
-					ECHO;
-					*option_quote = '"';
-					unquoted_option_chars = 0;
-					BEGIN(xslashdquote);
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						char   *varname;
-						const char *value;
-
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
-
-						/*
-						 * The variable value is just emitted without any
-						 * further examination.  This is consistent with the
-						 * pre-8.0 code behavior, if not with the way that
-						 * variables are handled outside backslash commands.
-						 * Note that we needn't guard against recursion here.
-						 */
-						if (value)
-							appendPQExpBufferStr(output_buf, value);
-						else
-							ECHO;
-
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}+'	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(false);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-
-:\"{variable_char}+\"	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(true);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-{other}			{
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-}
-
-<xslashquote>{
-	/*
-	 * single-quoted text: copy literally except for '' and backslash
-	 * sequences
-	 */
-
-{quote}			{ BEGIN(xslasharg); }
-
-{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
-
-"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
-"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
-"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
-"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
-"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
-
-{xeoctesc}		{
-					/* octal case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 1, NULL, 8));
-				}
-
-{xehexesc}		{
-					/* hex case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 2, NULL, 16));
-				}
-
-"\\".			{ emit(yytext + 1, 1); }
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashbackquote>{
-	/*
-	 * backticked text: copy everything until next backquote, then evaluate.
-	 *
-	 * XXX Possible future behavioral change: substitute for :VARIABLE?
-	 */
-
-"`"				{
-					/* In NO_EVAL mode, don't evaluate the command */
-					if (option_type != OT_NO_EVAL)
-						evaluate_backtick();
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashdquote>{
-	/* double-quoted text: copy verbatim, including the double quotes */
-
-{dquote}		{
-					ECHO;
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashwholeline>{
-	/* copy everything until end of input line */
-	/* but suppress leading whitespace */
-
-{space}+		{
-					if (output_buf->len > 0)
-						ECHO;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashend>{
-	/* at end of command, eat a double backslash, but not anything else */
-
-"\\\\"			{ return LEXRES_OK; }
-
-{other}|\n		{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-}
-
-%%
-
-/*
- * Create a lexer working state struct.
- */
-PsqlScanState
-psql_scan_create(void)
-{
-	PsqlScanState state;
-
-	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
-
-	psql_scan_reset(state);
-
-	return state;
-}
-
-/*
- * Destroy a lexer working state struct, releasing all resources.
- */
-void
-psql_scan_destroy(PsqlScanState state)
-{
-	psql_scan_finish(state);
-
-	psql_scan_reset(state);
-
-	free(state);
-}
-
-/*
- * Set up to perform lexing of the given input line.
- *
- * The text at *line, extending for line_len bytes, will be scanned by
- * subsequent calls to the psql_scan routines.  psql_scan_finish should
- * be called when scanning is complete.  Note that the lexer retains
- * a pointer to the storage at *line --- this string must not be altered
- * or freed until after psql_scan_finish is called.
- */
-void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
-{
-	/* Mustn't be scanning already */
-	Assert(state->scanbufhandle == NULL);
-	Assert(state->buffer_stack == NULL);
-
-	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
-
-	/* needed for prepare_buffer */
-	cur_state = state;
-
-	/* Set up flex input buffer with appropriate translation and padding */
-	state->scanbufhandle = prepare_buffer(line, line_len,
-										  &state->scanbuf);
-	state->scanline = line;
-
-	/* Set lookaside data in case we have to map unsafe encoding */
-	state->curline = state->scanbuf;
-	state->refline = state->scanline;
-}
-
-/*
- * Do lexical analysis of SQL command text.
- *
- * The text previously passed to psql_scan_setup is scanned, and appended
- * (possibly with transformation) to query_buf.
- *
- * The return value indicates the condition that stopped scanning:
- *
- * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
- * transferred to query_buf.)  The command accumulated in query_buf should
- * be executed, then clear query_buf and call again to scan the remainder
- * of the line.
- *
- * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
- * Any previous data on the line has been transferred to query_buf.
- * The caller will typically next call psql_scan_slash_command(),
- * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
- *
- * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
- * incomplete SQL command.  *prompt is set to the appropriate prompt type.
- *
- * PSCAN_EOL: the end of the line was reached, and there is no lexical
- * reason to consider the command incomplete.  The caller may or may not
- * choose to send it.  *prompt is set to the appropriate prompt type if
- * the caller chooses to collect more input.
- *
- * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
- * be called next, then the cycle may be repeated with a fresh input line.
- *
- * In all cases, *prompt is set to an appropriate prompt type code for the
- * next line-input operation.
- */
-PsqlScanResult
-psql_scan(PsqlScanState state,
-		  PQExpBuffer query_buf,
-		  promptStatus_t *prompt)
-{
-	PsqlScanResult result;
-	int			lexresult;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = query_buf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(state->start_state);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/* Update static vars back to the state struct */
-	state->start_state = YY_START;
-
-	/*
-	 * Check termination state and return appropriate result info.
-	 */
-	switch (lexresult)
-	{
-		case LEXRES_EOL:		/* end of input */
-			switch (state->start_state)
-			{
-				/* This switch must cover all non-slash-command states. */
-				case INITIAL:
-				case xuiend:	/* we treat these like INITIAL */
-				case xusend:
-					if (state->paren_depth > 0)
-					{
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_PAREN;
-					}
-					else if (query_buf->len > 0)
-					{
-						result = PSCAN_EOL;
-						*prompt = PROMPT_CONTINUE;
-					}
-					else
-					{
-						/* never bother to send an empty buffer */
-						result = PSCAN_INCOMPLETE;
-						*prompt = PROMPT_READY;
-					}
-					break;
-				case xb:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xc:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_COMMENT;
-					break;
-				case xd:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xh:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xe:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				case xdolq:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOLLARQUOTE;
-					break;
-				case xui:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_DOUBLEQUOTE;
-					break;
-				case xus:
-					result = PSCAN_INCOMPLETE;
-					*prompt = PROMPT_SINGLEQUOTE;
-					break;
-				default:
-					/* can't get here */
-					fprintf(stderr, "invalid YY_START\n");
-					exit(1);
-			}
-			break;
-		case LEXRES_SEMI:		/* semicolon */
-			result = PSCAN_SEMICOLON;
-			*prompt = PROMPT_READY;
-			break;
-		case LEXRES_BACKSLASH:	/* backslash */
-			result = PSCAN_BACKSLASH;
-			*prompt = PROMPT_READY;
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid yylex result\n");
-			exit(1);
-	}
-
-	return result;
-}
-
-/*
- * Clean up after scanning a string.  This flushes any unread input and
- * releases resources (but not the PsqlScanState itself).  Note however
- * that this does not reset the lexer scan state; that can be done by
- * psql_scan_reset(), which is an orthogonal operation.
- *
- * It is legal to call this when not scanning anything (makes it easier
- * to deal with error recovery).
- */
-void
-psql_scan_finish(PsqlScanState state)
-{
-	/* Drop any incomplete variable expansions. */
-	while (state->buffer_stack != NULL)
-		pop_buffer_stack(state);
-
-	/* Done with the outer scan buffer, too */
-	if (state->scanbufhandle)
-		yy_delete_buffer(state->scanbufhandle);
-	state->scanbufhandle = NULL;
-	if (state->scanbuf)
-		free(state->scanbuf);
-	state->scanbuf = NULL;
-}
-
-/*
- * Reset lexer scanning state to start conditions.  This is appropriate
- * for executing \r psql commands (or any other time that we discard the
- * prior contents of query_buf).  It is not, however, necessary to do this
- * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
- * PSCAN_EOL scan result, because the scan state must be INITIAL when those
- * conditions are returned.
- *
- * Note that this is unrelated to flushing unread input; that task is
- * done by psql_scan_finish().
- */
-void
-psql_scan_reset(PsqlScanState state)
-{
-	state->start_state = INITIAL;
-	state->paren_depth = 0;
-	state->xcdepth = 0;			/* not really necessary */
-	if (state->dolqstart)
-		free(state->dolqstart);
-	state->dolqstart = NULL;
-}
-
-/*
- * Return true if lexer is currently in an "inside quotes" state.
- *
- * This is pretty grotty but is needed to preserve the old behavior
- * that mainloop.c drops blank lines not inside quotes without even
- * echoing them.
- */
-bool
-psql_scan_in_quote(PsqlScanState state)
-{
-	return state->start_state != INITIAL;
-}
-
-/*
- * Scan the command name of a psql backslash command.  This should be called
- * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
- * has been consumed through the leading backslash.
- *
- * The return value is a malloc'd copy of the command name, as parsed off
- * from the input.
- */
-char *
-psql_scan_slash_command(PsqlScanState state)
-{
-	PQExpBufferData mybuf;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashcmd);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-
-	return mybuf.data;
-}
-
-/*
- * Parse off the next argument for a backslash command, and return it as a
- * malloc'd string.  If there are no more arguments, returns NULL.
- *
- * type tells what processing, if any, to perform on the option string;
- * for example, if it's a SQL identifier, we want to downcase any unquoted
- * letters.
- *
- * if quote is not NULL, *quote is set to 0 if no quoting was found, else
- * the last quote symbol used in the argument.
- *
- * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
- * be taken as part of the option string will be stripped.
- *
- * NOTE: the only possible syntax errors for backslash options are unmatched
- * quotes, which are detected when we run out of input.  Therefore, on a
- * syntax error we just throw away the string and return NULL; there is no
- * need to worry about flushing remaining input.
- */
-char *
-psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon)
-{
-	PQExpBufferData mybuf;
-	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
-	char		local_quote;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	if (quote == NULL)
-		quote = &local_quote;
-	*quote = 0;
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-	option_type = type;
-	option_quote = quote;
-	unquoted_option_chars = 0;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	if (type == OT_WHOLE_LINE)
-		BEGIN(xslashwholeline);
-	else
-		BEGIN(xslashargstart);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/*
-	 * Check the lex result: we should have gotten back either LEXRES_OK
-	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
-	 * a quoted string, as indicated by YY_START, EOL is an error.
-	 */
-	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
-
-	switch (YY_START)
-	{
-		case xslashargstart:
-			/* empty arg */
-			break;
-		case xslasharg:
-			/* Strip any unquoted trailing semi-colons if requested */
-			if (semicolon)
-			{
-				while (unquoted_option_chars-- > 0 &&
-					   mybuf.len > 0 &&
-					   mybuf.data[mybuf.len - 1] == ';')
-				{
-					mybuf.data[--mybuf.len] = '\0';
-				}
-			}
-
-			/*
-			 * If SQL identifier processing was requested, then we strip out
-			 * excess double quotes and downcase unquoted letters.
-			 * Doubled double-quotes become output double-quotes, per spec.
-			 *
-			 * Note that a string like FOO"BAR"BAZ will be converted to
-			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
-			 * which would have us parse it as several identifiers.  But
-			 * for psql's purposes, we want a string like "foo"."bar" to
-			 * be treated as one option, so there's little choice.
-			 */
-			if (type == OT_SQLID || type == OT_SQLIDHACK)
-			{
-				bool		inquotes = false;
-				char	   *cp = mybuf.data;
-
-				while (*cp)
-				{
-					if (*cp == '"')
-					{
-						if (inquotes && cp[1] == '"')
-						{
-							/* Keep the first quote, remove the second */
-							cp++;
-						}
-						inquotes = !inquotes;
-						/* Collapse out quote at *cp */
-						memmove(cp, cp + 1, strlen(cp));
-						mybuf.len--;
-						/* do not advance cp */
-					}
-					else
-					{
-						if (!inquotes && type == OT_SQLID)
-							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
-					}
-				}
-			}
-			break;
-		case xslashquote:
-		case xslashbackquote:
-		case xslashdquote:
-			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
-			termPQExpBuffer(&mybuf);
-			return NULL;
-		case xslashwholeline:
-			/* always okay */
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid YY_START\n");
-			exit(1);
-	}
-
-	/*
-	 * An unquoted empty argument isn't possible unless we are at end of
-	 * command.  Return NULL instead.
-	 */
-	if (mybuf.len == 0 && *quote == 0)
-	{
-		termPQExpBuffer(&mybuf);
-		return NULL;
-	}
-
-	/* Else return the completed string. */
-	return mybuf.data;
-}
-
-/*
- * Eat up any unused \\ to complete a backslash command.
- */
-void
-psql_scan_slash_command_end(PsqlScanState state)
-{
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = NULL;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashend);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-}
-
-/*
- * Evaluate a backticked substring of a slash command's argument.
- *
- * The portion of output_buf starting at backtick_start_offset is evaluated
- * as a shell command and then replaced by the command's output.
- */
-static void
-evaluate_backtick(void)
-{
-	char	   *cmd = output_buf->data + backtick_start_offset;
-	PQExpBufferData cmd_output;
-	FILE	   *fd;
-	bool		error = false;
-	char		buf[512];
-	size_t		result;
-
-	initPQExpBuffer(&cmd_output);
-
-	fd = popen(cmd, PG_BINARY_R);
-	if (!fd)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (!error)
-	{
-		do
-		{
-			result = fread(buf, 1, sizeof(buf), fd);
-			if (ferror(fd))
-			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
-				error = true;
-				break;
-			}
-			appendBinaryPQExpBuffer(&cmd_output, buf, result);
-		} while (!feof(fd));
-	}
-
-	if (fd && pclose(fd) == -1)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (PQExpBufferDataBroken(cmd_output))
-	{
-		psql_error("%s: out of memory\n", cmd);
-		error = true;
-	}
-
-	/* Now done with cmd, delete it from output_buf */
-	output_buf->len = backtick_start_offset;
-	output_buf->data[output_buf->len] = '\0';
-
-	/* If no error, transfer result to output_buf */
-	if (!error)
-	{
-		/* strip any trailing newline */
-		if (cmd_output.len > 0 &&
-			cmd_output.data[cmd_output.len - 1] == '\n')
-			cmd_output.len--;
-		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
-	}
-
-	termPQExpBuffer(&cmd_output);
-}
-
-/*
- * Push the given string onto the stack of stuff to scan.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static void
-push_new_buffer(const char *newstr, const char *varname)
-{
-	StackElem  *stackelem;
-
-	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
-
-	/*
-	 * In current usage, the passed varname points at the current flex
-	 * input buffer; we must copy it before calling prepare_buffer()
-	 * because that will change the buffer state.
-	 */
-	stackelem->varname = varname ? pg_strdup(varname) : NULL;
-
-	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
-									&stackelem->bufstring);
-	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
-	{
-		stackelem->origstring = NULL;
-		cur_state->refline = stackelem->bufstring;
-	}
-	else
-	{
-		stackelem->origstring = pg_strdup(newstr);
-		cur_state->refline = stackelem->origstring;
-	}
-	stackelem->next = cur_state->buffer_stack;
-	cur_state->buffer_stack = stackelem;
-}
-
-/*
- * Pop the topmost buffer stack item (there must be one!)
- *
- * NB: after this, the flex input state is unspecified; caller must
- * switch to an appropriate buffer to continue lexing.
- */
-static void
-pop_buffer_stack(PsqlScanState state)
-{
-	StackElem  *stackelem = state->buffer_stack;
-
-	state->buffer_stack = stackelem->next;
-	yy_delete_buffer(stackelem->buf);
-	free(stackelem->bufstring);
-	if (stackelem->origstring)
-		free(stackelem->origstring);
-	if (stackelem->varname)
-		free(stackelem->varname);
-	free(stackelem);
-}
-
-/*
- * Check if specified variable name is the source for any string
- * currently being scanned
- */
-static bool
-var_is_current_source(PsqlScanState state, const char *varname)
-{
-	StackElem  *stackelem;
-
-	for (stackelem = state->buffer_stack;
-		 stackelem != NULL;
-		 stackelem = stackelem->next)
-	{
-		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
-			return true;
-	}
-	return false;
-}
-
-/*
- * Set up a flex input buffer to scan the given data.  We always make a
- * copy of the data.  If working in an unsafe encoding, the copy has
- * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
- *
- * cur_state must point to the active PsqlScanState.
- *
- * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
- */
-static YY_BUFFER_STATE
-prepare_buffer(const char *txt, int len, char **txtcopy)
-{
-	char	   *newtxt;
-
-	/* Flex wants two \0 characters after the actual data */
-	newtxt = pg_malloc(len + 2);
-	*txtcopy = newtxt;
-	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
-
-	if (cur_state->safe_encoding)
-		memcpy(newtxt, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		int		i = 0;
-
-		while (i < len)
-		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
-
-			/* first byte should always be okay... */
-			newtxt[i] = txt[i];
-			i++;
-			while (--thislen > 0 && i < len)
-				newtxt[i++] = (char) 0xFF;
-		}
-	}
-
-	return yy_scan_buffer(newtxt, len + 2);
-}
-
-/*
- * emit() --- body for ECHO macro
- *
- * NB: this must be used for ALL and ONLY the text copied from the flex
- * input data.  If you pass it something that is not part of the yytext
- * string, you are making a mistake.  Internally generated text can be
- * appended directly to output_buf.
- */
-static void
-emit(const char *txt, int len)
-{
-	if (cur_state->safe_encoding)
-		appendBinaryPQExpBuffer(output_buf, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			appendPQExpBufferChar(output_buf, ch);
-		}
-	}
-}
-
-/*
- * extract_substring --- fetch the true value of (part of) the current token
- *
- * This is like emit(), except that the data is returned as a malloc'd string
- * rather than being pushed directly to output_buf.
- */
-static char *
-extract_substring(const char *txt, int len)
-{
-	char	   *result = (char *) pg_malloc(len + 1);
-
-	if (cur_state->safe_encoding)
-		memcpy(result, txt, len);
-	else
-	{
-		/* Gotta do it the hard way */
-		const char *reference = cur_state->refline;
-		int		i;
-
-		reference += (txt - cur_state->curline);
-
-		for (i = 0; i < len; i++)
-		{
-			char	ch = txt[i];
-
-			if (ch == (char) 0xFF)
-				ch = reference[i];
-			result[i] = ch;
-		}
-	}
-	result[len] = '\0';
-	return result;
-}
-
-/*
- * escape_variable --- process :'VARIABLE' or :"VARIABLE"
- *
- * If the variable name is found, escape its value using the appropriate
- * quoting method and emit the value to output_buf.  (Since the result is
- * surely quoted, there is never any reason to rescan it.)  If we don't
- * find the variable or the escaping function fails, emit the token as-is.
- */
-static void
-escape_variable(bool as_ident)
-{
-	char	   *varname;
-	const char *value;
-
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
-
-	/* Escaping. */
-	if (value)
-	{
-		if (!pset.db)
-			psql_error("can't escape without active connection\n");
-		else
-		{
-			char   *escaped_value;
-
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
-		}
-	}
-
-	/*
-	 * If we reach this point, some kind of error has occurred.  Emit the
-	 * original text into the output buffer.
-	 */
-	emit(yytext, yyleng);
-}
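The OT_SQLID handling in psql_scan_slash_option() above strips excess double quotes and downcases unquoted letters. Below is a minimal standalone sketch of that loop, for illustration only. It assumes a single-byte encoding (the code above advances with PQmblen and downcases with pg_tolower), and `sqlid_normalize` is a hypothetical name that is not part of the patch. Unlike the loop above, the sketch leaves the quote state unchanged when it meets a doubled quote, so text after an embedded quote is still treated as quoted.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: normalize an SQL identifier
 * in place.  Double quotes are removed, a doubled quote inside quotes
 * becomes one literal quote, and unquoted letters are downcased when
 * "downcase" is true.  Assumes one byte per character.
 */
static void
sqlid_normalize(char *s, bool downcase)
{
	bool		inquotes = false;
	char	   *cp = s;

	while (*cp)
	{
		if (*cp == '"')
		{
			if (inquotes && cp[1] == '"')
				cp++;			/* keep the first quote, remove the second */
			else
				inquotes = !inquotes;
			/* collapse out the quote at *cp; this moves the NUL as well */
			memmove(cp, cp + 1, strlen(cp));
			/* do not advance cp */
		}
		else
		{
			if (!inquotes && downcase)
				*cp = (char) tolower((unsigned char) *cp);
			cp++;
		}
	}
}
```

With this sketch, FOO"BAR"BAZ becomes fooBARbaz, matching the example given in the comment above.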
diff --git a/src/bin/psql/psqlscan_int.h b/src/bin/psql/psqlscan_int.h
new file mode 100644
index 0000000..cf3b688
--- /dev/null
+++ b/src/bin/psql/psqlscan_int.h
@@ -0,0 +1,84 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_int.h
+ */
+#ifndef PSQLSCAN_INT_H
+#define PSQLSCAN_INT_H
+
+/* Abstract type for lexer's internal state */
+typedef struct PsqlScanStateData *PsqlScanState;
+
+/* Return values from yylex() */
+#define LEXRES_EOL			0	/* end of input */
+#define LEXRES_SEMI			1	/* command-terminating semicolon found */
+#define LEXRES_BACKSLASH	2	/* backslash command start */
+#define LEXRES_OK			3	/* OK completion of backslash argument */
+
+/*
+ * We use a stack of flex buffers to handle substitution of psql variables.
+ * Each stacked buffer contains the as-yet-unread text from one psql variable.
+ * When we pop the stack all the way, we resume reading from the outer buffer
+ * identified by scanbufhandle.
+ */
+typedef struct StackElem
+{
+	YY_BUFFER_STATE buf;		/* flex input control structure */
+	char	   *bufstring;		/* data actually being scanned by flex */
+	char	   *origstring;		/* copy of original data, if needed */
+	char	   *varname;		/* name of variable providing data, or NULL */
+	struct StackElem *next;
+} StackElem;
+
+/*
+ * All working state of the lexer must be stored in PsqlScanStateData
+ * between calls.  This allows us to have multiple open lexer operations,
+ * which is needed for nested include files.  The lexer itself is not
+ * recursive, but it must be re-entrant.
+ */
+typedef struct PsqlScanStateData
+{
+	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
+	/*
+	 * These variables always refer to the outer buffer, never to any
+	 * stacked variable-expansion buffer.
+	 */
+	YY_BUFFER_STATE scanbufhandle;
+	char	   *scanbuf;		/* start of outer-level input buffer */
+	const char *scanline;		/* current input line at outer level */
+
+	const char *curline;		/* actual flex input string for cur buf */
+	const char *refline;		/* original data for cur buffer */
+	int			curpos;			/* current position in curline  */
+
+	PsqlScanCallbacks cb;		/* callback given from outside */
+
+
+	/*
+	 * All this state lives across successive input lines, until explicitly
+	 * reset by psql_scan_reset.
+	 */
+	int			start_state;	/* saved YY_START */
+	int			paren_depth;	/* depth of nesting in parentheses */
+	int			xcdepth;		/* depth of nesting in slash-star comments */
+	char	   *dolqstart;		/* current $foo$ quote start string */
+
+	/* Scan, cleanup and reset function for the lexer for this scan state */
+	void	(*finish)(PsqlScanState state);
+	void	(*reset)(PsqlScanState state);
+	YY_BUFFER_STATE (*my_yy_scan_buffer)(char *base, yy_size_t size);
+} PsqlScanStateData;
+
+extern void psql_scan_switch_lexer(PsqlScanState state);
+extern char *extract_substring(const char *txt, int len);
+extern void escape_variable(bool as_ident);
+extern void push_new_buffer(const char *newstr, const char *varname);
+extern void pop_buffer_stack(PsqlScanState state);
+extern bool var_is_current_source(PsqlScanState state, const char *varname);
+extern void scan_emit(const char *txt, int len);
+extern YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
+									  char **txtcopy);
+
+#endif   /* PSQLSCAN_INT_H */
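The curline/refline pair in PsqlScanStateData exists because, in an unsafe encoding, the lexer scans a copy in which the trailing bytes of multibyte characters are replaced by 0xFF, and emitted text is restored from the original string. Here is a rough, self-contained sketch of that idea. `toy_mblen`, `mask_multibyte` and `unmask_token` are hypothetical names, and the toy length rule (any byte with the high bit set starts a 2-byte character) is an assumption standing in for a real encoding-aware function such as PQmblen.

```c
#include <stddef.h>

/* Toy stand-in for an encoding-aware length function (assumption). */
static int
toy_mblen(const char *s)
{
	return ((unsigned char) *s & 0x80) ? 2 : 1;
}

/*
 * Copy len bytes of src into dst, replacing every byte after the first of
 * a multibyte character with 0xFF so it cannot match any lexer rule.
 * dst must have room for len + 1 bytes.
 */
static void
mask_multibyte(const char *src, size_t len, char *dst)
{
	size_t		i = 0;

	while (i < len)
	{
		int			thislen = toy_mblen(src + i);

		dst[i] = src[i];		/* first byte is kept as-is */
		i++;
		while (--thislen > 0 && i < len)
			dst[i++] = (char) 0xFF;
	}
	dst[len] = '\0';
}

/*
 * Copy a token that lies inside the masked buffer (curline) into out,
 * taking the original byte from refline wherever a 0xFF mask is found.
 * out must have room for len + 1 bytes.
 */
static void
unmask_token(const char *curline, const char *refline,
			 const char *tok, size_t len, char *out)
{
	const char *reference = refline + (tok - curline);
	size_t		i;

	for (i = 0; i < len; i++)
	{
		char		ch = tok[i];

		if (ch == (char) 0xFF)
			ch = reference[i];
		out[i] = ch;
	}
	out[len] = '\0';
}
```

The masking matters for encodings where the second byte of a character can be an ASCII value such as a backslash or a quote, which the lexer rules would otherwise misread.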
diff --git a/src/bin/psql/psqlscan_slash.c b/src/bin/psql/psqlscan_slash.c
new file mode 100644
index 0000000..bf8c0f3
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.c
@@ -0,0 +1,19 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.c
+ *
+ */
+
+/*
+ * psqlscan_slashbody.c is #include'd here instead of being compiled on its own.
+ * This is because we need postgres_fe.h to be read before any system include
+ * files, else things tend to break on platforms that have multiple
+ * infrastructures for stdio.h and so on.  flex is absolutely uncooperative
+ * about that, so we can't compile psqlscan_slashbody.c on its own.
+ */
+#include "postgres_fe.h"
+#include "psqlscan.h"
+#include "psqlscan_slashbody.c"
diff --git a/src/bin/psql/psqlscan_slash.h b/src/bin/psql/psqlscan_slash.h
new file mode 100644
index 0000000..71acbfb
--- /dev/null
+++ b/src/bin/psql/psqlscan_slash.h
@@ -0,0 +1,31 @@
+/*
+ * psql - the PostgreSQL interactive terminal
+ *
+ * Copyright (c) 2000-2016, PostgreSQL Global Development Group
+ *
+ * src/bin/psql/psqlscan_slash.h
+ */
+#ifndef PSQLSCAN_SLASH_H
+#define PSQLSCAN_SLASH_H
+
+/* Different ways for scan_slash_option to handle parameter words */
+enum slash_option_type
+{
+	OT_NORMAL,					/* normal case */
+	OT_SQLID,					/* treat as SQL identifier */
+	OT_SQLIDHACK,				/* SQL identifier, but don't downcase */
+	OT_FILEPIPE,				/* it's a filename or pipe */
+	OT_WHOLE_LINE,				/* just snarf the rest of the line */
+	OT_NO_EVAL					/* no expansion of backticks or variables */
+};
+
+extern char *psql_scan_slash_command(PsqlScanState state);
+
+extern char *psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon);
+
+extern void psql_scan_slash_command_end(PsqlScanState state);
+
+#endif   /* PSQLSCAN_SLASH_H */
diff --git a/src/bin/psql/psqlscan_slashbody.l b/src/bin/psql/psqlscan_slashbody.l
new file mode 100644
index 0000000..ae51d3f
--- /dev/null
+++ b/src/bin/psql/psqlscan_slashbody.l
@@ -0,0 +1,766 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscan_slashbody.l
+ *	  lexical scanner for slash commands of psql
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscan_slashbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+#include "psqlscan_slash.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+static PQExpBuffer output_buf;	/* current output buffer */
+
+/* these variables do not need to be saved across calls */
+static enum slash_option_type option_type;
+static char *option_quote;
+static int	unquoted_option_chars;
+static int	backtick_start_offset;
+
+static void evaluate_backtick(void);
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless */
+#define my_yyless(n) do { cur_state->curpos -= (yyleng - (n)); yyless(n); } while (0)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+%option prefix="yys"
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/* Exclusive states for psql only: lex backslash commands */
+%x xslashargstart
+%x xslasharg
+%x xslashquote
+%x xslashbackquote
+%x xslashdquote
+%x xslashwholeline
+%x xslashend
+
+space			[ \t\n\r\f]
+quote			'
+
+/* Quoted string that allows backslash escapes */
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqdouble		{quote}{quote}
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+%%
+	/*
+	 * Exclusive lexer states to handle backslash command lexing
+	 */
+
+{
+	/* command name ends at whitespace or backslash; eat all else */
+
+{space}|"\\"	{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{other}			{ ECHO;}
+
+}
+
+<xslashargstart>{
+	/*
+	 * Discard any whitespace before argument, then go to xslasharg state.
+	 * An exception is that "|" is only special at start of argument, so we
+	 * check for it here.
+	 */
+
+{space}+		{ }
+
+"|"				{
+					if (option_type == OT_FILEPIPE)
+					{
+						/* treat like whole-string case */
+						ECHO;
+						BEGIN(xslashwholeline);
+					}
+					else
+					{
+						/* vertical bar is not special otherwise */
+						my_yyless(0);
+						BEGIN(xslasharg);
+					}
+				}
+
+{other}			{
+					my_yyless(0);
+					BEGIN(xslasharg);
+				}
+
+}
+
+<xslasharg>{
+	/*
+	 * Default processing of text in a slash command's argument.
+	 *
+	 * Note: unquoted_option_chars counts the number of characters at the
+	 * end of the argument that were not subject to any form of quoting.
+	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
+	 */
+
+{space}|"\\"	{
+					/*
+					 * Unquoted space is end of arg; do not eat.  Likewise
+					 * backslash is end of command or next command, do not eat
+					 *
+					 * XXX this means we can't conveniently accept options
+					 * that include unquoted backslashes; therefore, option
+					 * processing that encourages use of backslashes is rather
+					 * broken.
+					 */
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+{quote}			{
+					*option_quote = '\'';
+					unquoted_option_chars = 0;
+					BEGIN(xslashquote);
+				}
+
+"`"				{
+					backtick_start_offset = output_buf->len;
+					*option_quote = '`';
+					unquoted_option_chars = 0;
+					BEGIN(xslashbackquote);
+				}
+
+{dquote}		{
+					ECHO;
+					*option_quote = '"';
+					unquoted_option_chars = 0;
+					BEGIN(xslashdquote);
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						char	   *varname;
+						const char *value = NULL;
+						void	  (*free_fn)(void *) = NULL;
+
+						if (cur_state->cb.get_variable)
+						{
+							varname = extract_substring(yytext + 1, yyleng - 1);
+							value = cur_state->cb.get_variable(varname,
+												   false, false, &free_fn);
+							free(varname);
+						}
+
+						/*
+						 * The variable value is just emitted without any
+						 * further examination.  This is consistent with the
+						 * pre-8.0 code behavior, if not with the way that
+						 * variables are handled outside backslash commands.
+						 * Note that we needn't guard against recursion here.
+						 */
+						if (value)
+						{
+							appendPQExpBufferStr(output_buf, value);
+							if (free_fn)
+								free_fn((void*)value);
+						}
+						else
+							ECHO;
+
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}+'	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(false);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+
+:\"{variable_char}+\"	{
+					if (option_type == OT_NO_EVAL)
+						ECHO;
+					else
+					{
+						escape_variable(true);
+						*option_quote = ':';
+					}
+					unquoted_option_chars = 0;
+				}
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+{other}			{
+					unquoted_option_chars++;
+					ECHO;
+				}
+
+}
+
+<xslashquote>{
+	/*
+	 * single-quoted text: copy literally except for '' and backslash
+	 * sequences
+	 */
+
+{quote}			{ BEGIN(xslasharg); }
+
+{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
+
+"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
+"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
+"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
+"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
+"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
+
+{xeoctesc}		{
+					/* octal case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 1, NULL, 8));
+				}
+
+{xehexesc}		{
+					/* hex case */
+					appendPQExpBufferChar(output_buf,
+										  (char) strtol(yytext + 2, NULL, 16));
+				}
+
+"\\".			{ scan_emit(yytext + 1, 1); }
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashbackquote>{
+	/*
+	 * backticked text: copy everything until next backquote, then evaluate.
+	 *
+	 * XXX Possible future behavioral change: substitute for :VARIABLE?
+	 */
+
+"`"				{
+					/* In NO_EVAL mode, don't evaluate the command */
+					if (option_type != OT_NO_EVAL)
+						evaluate_backtick();
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashdquote>{
+	/* double-quoted text: copy verbatim, including the double quotes */
+
+{dquote}		{
+					ECHO;
+					BEGIN(xslasharg);
+				}
+
+{other}|\n		{ ECHO; }
+
+}
+
+<xslashwholeline>{
+	/* copy everything until end of input line */
+	/* but suppress leading whitespace */
+
+{space}+		{
+					if (output_buf->len > 0)
+						ECHO;
+				}
+
+{other}			{ ECHO; }
+
+}
+
+<xslashend>{
+	/* at end of command, eat a double backslash, but not anything else */
+
+"\\\\"			{ return LEXRES_OK; }
+
+{other}|\n		{
+					my_yyless(0);
+					return LEXRES_OK;
+				}
+
+}
+
+%%
+
+static void psql_scan_slash_command_finish(PsqlScanState state);
+static void psql_scan_slash_command_reset(PsqlScanState state);
+
+static void
+psql_scan_slash_command_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &psql_scan_slash_command_finish;
+	state->reset = &psql_scan_slash_command_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+static void
+psql_scan_slash_command_setup(PsqlScanState state,
+							  const char *line, int line_len,
+							  PsqlScanCallbacks *cb)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+	Assert(cb->error_out != NULL);
+
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	state->cb.enc_mblen = cb->enc_mblen;
+	state->cb.standard_strings = cb->standard_strings;
+	state->cb.error_out = cb->error_out;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Create new lexer scanning state for this lexer which parses from the current
+ * position of the given scanning state for another lexer. The given state is
+ * destroyed.
+ *
+ * Note: This function cannot access the yy* functions and variables of the
+ * given state because they belong to a different lexer.
+ */
+static void
+psql_scan_slash_command_switch_lexer(PsqlScanState state)
+{
+	const char *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_slash_command_initialize(state);
+	psql_scan_slash_command_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
+ * Scan the command name of a psql backslash command.  This should be called
+ * after psql_scan() on the main lexer returns PSCAN_BACKSLASH.  It is assumed
+ * that the input has been consumed through the leading backslash.
+ *
+ * The return value is a malloc'd copy of the command name, as parsed off
+ * from the input.
+ */
+char *
+psql_scan_slash_command(PsqlScanState state)
+{
+	PQExpBufferData mybuf;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	psql_scan_slash_command_switch_lexer(state);
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(INITIAL);
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+
+	return mybuf.data;
+}
+
+/*
+ * Parse off the next argument for a backslash command, and return it as a
+ * malloc'd string.  If there are no more arguments, returns NULL.
+ *
+ * type tells what processing, if any, to perform on the option string;
+ * for example, if it's a SQL identifier, we want to downcase any unquoted
+ * letters.
+ *
+ * if quote is not NULL, *quote is set to 0 if no quoting was found, else
+ * the last quote symbol used in the argument.
+ *
+ * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
+ * be taken as part of the option string will be stripped.
+ *
+ * NOTE: the only possible syntax errors for backslash options are unmatched
+ * quotes, which are detected when we run out of input.  Therefore, on a
+ * syntax error we just throw away the string and return NULL; there is no
+ * need to worry about flushing remaining input.
+ */
+char *
+psql_scan_slash_option(PsqlScanState state,
+					   enum slash_option_type type,
+					   char *quote,
+					   bool semicolon)
+{
+	PQExpBufferData mybuf;
+	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
+	char		local_quote;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	if (quote == NULL)
+		quote = &local_quote;
+	*quote = 0;
+
+	/* Build a local buffer that we'll return the data of */
+	initPQExpBuffer(&mybuf);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = &mybuf;
+	option_type = type;
+	option_quote = quote;
+	unquoted_option_chars = 0;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	if (type == OT_WHOLE_LINE)
+		BEGIN(xslashwholeline);
+	else
+		BEGIN(xslashargstart);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/*
+	 * Check the lex result: we should have gotten back either LEXRES_OK
+	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
+	 * a quoted string, as indicated by YY_START, EOL is an error.
+	 */
+	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
+
+	switch (YY_START)
+	{
+		case xslashargstart:
+			/* empty arg */
+			break;
+		case xslasharg:
+			/* Strip any unquoted trailing semi-colons if requested */
+			if (semicolon)
+			{
+				while (unquoted_option_chars-- > 0 &&
+					   mybuf.len > 0 &&
+					   mybuf.data[mybuf.len - 1] == ';')
+				{
+					mybuf.data[--mybuf.len] = '\0';
+				}
+			}
+
+			/*
+			 * If SQL identifier processing was requested, then we strip out
+			 * excess double quotes and downcase unquoted letters.
+			 * Doubled double-quotes become output double-quotes, per spec.
+			 *
+			 * Note that a string like FOO"BAR"BAZ will be converted to
+			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
+			 * which would have us parse it as several identifiers.  But
+			 * for psql's purposes, we want a string like "foo"."bar" to
+			 * be treated as one option, so there's little choice.
+			 */
+			if (type == OT_SQLID || type == OT_SQLIDHACK)
+			{
+				bool		inquotes = false;
+				char	   *cp = mybuf.data;
+
+				while (*cp)
+				{
+					if (*cp == '"')
+					{
+						if (inquotes && cp[1] == '"')
+						{
+							/* Keep the first quote, remove the second */
+							cp++;
+						}
+						inquotes = !inquotes;
+						/* Collapse out quote at *cp */
+						memmove(cp, cp + 1, strlen(cp));
+						mybuf.len--;
+						/* do not advance cp */
+					}
+					else
+					{
+						if (!inquotes && type == OT_SQLID)
+							*cp = pg_tolower((unsigned char) *cp);
+						if (ENC_IS_SAFE(cur_state))
+							cp++;
+						else
+							cp += cur_state->cb.enc_mblen(cp);
+					}
+				}
+			}
+			break;
+		case xslashquote:
+		case xslashbackquote:
+		case xslashdquote:
+			/* must have hit EOL inside quotes */
+			cur_state->cb.error_out("unterminated quoted string\n");
+			termPQExpBuffer(&mybuf);
+			return NULL;
+		case xslashwholeline:
+			/* always okay */
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid YY_START\n");
+			exit(1);
+	}
+
+	/*
+	 * An unquoted empty argument isn't possible unless we are at end of
+	 * command.  Return NULL instead.
+	 */
+	if (mybuf.len == 0 && *quote == 0)
+	{
+		termPQExpBuffer(&mybuf);
+		return NULL;
+	}
+
+	/* Else return the completed string. */
+	return mybuf.data;
+}
+
+/*
+ * Eat up any unused \\ to complete a backslash command.
+ */
+void
+psql_scan_slash_command_end(PsqlScanState state)
+{
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = NULL;
+
+	if (state->buffer_stack != NULL)
+		yys_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yys_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(xslashend);
+
+	/* And lex. */
+	yylex();
+
+	/* There are no possible errors in this lex state... */
+	psql_scan_switch_lexer(state);
+}
+
+/*
+ * Evaluate a backticked substring of a slash command's argument.
+ *
+ * The portion of output_buf starting at backtick_start_offset is evaluated
+ * as a shell command and then replaced by the command's output.
+ */
+static void
+evaluate_backtick(void)
+{
+	char	   *cmd = output_buf->data + backtick_start_offset;
+	PQExpBufferData cmd_output;
+	FILE	   *fd;
+	bool		error = false;
+	char		buf[512];
+	size_t		result;
+
+	initPQExpBuffer(&cmd_output);
+
+	fd = popen(cmd, PG_BINARY_R);
+	if (!fd)
+	{
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (!error)
+	{
+		do
+		{
+			result = fread(buf, 1, sizeof(buf), fd);
+			if (ferror(fd))
+			{
+				cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+				error = true;
+				break;
+			}
+			appendBinaryPQExpBuffer(&cmd_output, buf, result);
+		} while (!feof(fd));
+	}
+
+	if (fd && pclose(fd) == -1)
+	{
+		cur_state->cb.error_out("%s: %s\n", cmd, strerror(errno));
+		error = true;
+	}
+
+	if (PQExpBufferDataBroken(cmd_output))
+	{
+		cur_state->cb.error_out("%s: out of memory\n", cmd);
+		error = true;
+	}
+
+	/* Now done with cmd, delete it from output_buf */
+	output_buf->len = backtick_start_offset;
+	output_buf->data[output_buf->len] = '\0';
+
+	/* If no error, transfer result to output_buf */
+	if (!error)
+	{
+		/* strip any trailing newline */
+		if (cmd_output.len > 0 &&
+			cmd_output.data[cmd_output.len - 1] == '\n')
+			cmd_output.len--;
+		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
+	}
+
+	termPQExpBuffer(&cmd_output);
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+psql_scan_slash_command_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yys_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+psql_scan_slash_command_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
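The SQL-identifier processing in psql_scan_slash_option() (the OT_SQLID case
above) can be tried in isolation. The following is a minimal sketch only: it
assumes a single-byte encoding, so it advances one byte at a time, and it
leaves the quoting state unchanged when it collapses a doubled quote, which is
a choice of this sketch. The name normalize_sql_identifier is hypothetical and
not part of the patch.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: strip double quotes from s in
 * place and downcase the letters that were outside quotes.  A doubled quote
 * inside a quoted section becomes one literal double quote.
 */
static void
normalize_sql_identifier(char *s)
{
	int			inquotes = 0;
	char	   *cp = s;

	while (*cp)
	{
		if (*cp == '"')
		{
			if (inquotes && cp[1] == '"')
				cp++;			/* keep the first quote, remove the second */
			else
				inquotes = !inquotes;

			/* collapse out the quote at *cp, moving the NUL as well */
			memmove(cp, cp + 1, strlen(cp));
			/* do not advance cp */
		}
		else
		{
			if (!inquotes)
				*cp = (char) tolower((unsigned char) *cp);
			cp++;
		}
	}
}
```

Applied to FOO"BAR"BAZ this yields fooBARbaz, and "foo"."bar" becomes foo.bar,
matching the examples given in the comment in psql_scan_slash_option().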
diff --git a/src/bin/psql/psqlscanbody.l b/src/bin/psql/psqlscanbody.l
new file mode 100644
index 0000000..546fa12
--- /dev/null
+++ b/src/bin/psql/psqlscanbody.l
@@ -0,0 +1,1438 @@
+%{
+/*-------------------------------------------------------------------------
+ *
+ * psqlscanbody.l
+ *	  lexical scanner for psql
+ *
+ * This code is mainly needed to determine where the end of a SQL statement
+ * is: we are looking for semicolons that are not within quotes, comments,
+ * or parentheses.  The most reliable way to handle this is to borrow the
+ * backend's flex lexer rules, lock, stock, and barrel.  The rules below
+ * are (except for a few) the same as the backend's, but their actions are
+ * just ECHO whereas the backend's actions generally do other things.
+ *
+ * XXX The rules in this file must be kept in sync with the backend lexer!!!
+ *
+ * XXX Avoid creating backtracking cases --- see the backend lexer for info.
+ *
+ * The most difficult aspect of this code is that we need to work in multibyte
+ * encodings that are not ASCII-safe.  A "safe" encoding is one in which each
+ * byte of a multibyte character has the high bit set (it's >= 0x80).  Since
+ * all our lexing rules treat all high-bit-set characters alike, we don't
+ * really need to care whether such a byte is part of a sequence or not.
+ * In an "unsafe" encoding, we still expect the first byte of a multibyte
+ * sequence to be >= 0x80, but later bytes might not be.  If we scan such
+ * a sequence as-is, the lexing rules could easily be fooled into matching
+ * such bytes to ordinary ASCII characters.  Our solution for this is to
+ * substitute 0xFF for each non-first byte within the data presented to flex.
+ * The flex rules will then pass the FF's through unmolested.  The emit()
+ * subroutine is responsible for looking back to the original string and
+ * replacing FF's with the corresponding original bytes.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/bin/psql/psqlscanbody.l
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "psqlscan.h"
+#include "psqlscan_int.h"
+
+#include <ctype.h>
+
+static PsqlScanState cur_state;	/* current state while active */
+
+static PQExpBuffer output_buf;	/* current output buffer */
+
+#define ECHO scan_emit(yytext, yyleng)
+
+/* Adjust curpos on yyless */
+#define my_yyless(n) do { cur_state->curpos -= (yyleng - (n)); yyless(n); } while (0)
+
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
+
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
+%}
+
+%option 8bit
+%option never-interactive
+%option nodefault
+%option noinput
+%option nounput
+%option noyywrap
+%option warn
+
+/*
+ * All of the following definitions and rules should exactly match
+ * src/backend/parser/scan.l so far as the flex patterns are concerned.
+ * The rule bodies are just ECHO as opposed to what the backend does,
+ * however.  (But be sure to duplicate code that affects the lexing process,
+ * such as BEGIN().)  Also, psqlscan uses a single <<EOF>> rule whereas
+ * scan.l has a separate one for each exclusive state.
+ */
+
+/*
+ * OK, here is a short description of lex/flex rules behavior.
+ * The longest pattern which matches an input string is always chosen.
+ * For equal-length patterns, the first occurring in the rules list is chosen.
+ * INITIAL is the starting state, to which all non-conditional rules apply.
+ * Exclusive states change parsing rules while the state is active.  When in
+ * an exclusive state, only those rules defined for that state apply.
+ *
+ * We use exclusive states for quoted strings, extended comments,
+ * and to eliminate parsing troubles for numeric strings.
+ * Exclusive states:
+ *  <xb> bit string literal
+ *  <xc> extended C-style comments
+ *  <xd> delimited identifiers (double-quoted identifiers)
+ *  <xh> hexadecimal numeric string
+ *  <xq> standard quoted strings
+ *  <xe> extended quoted strings (support backslash escape sequences)
+ *  <xdolq> $foo$ quoted strings
+ *  <xui> quoted identifier with Unicode escapes
+ *  <xuiend> end of a quoted identifier with Unicode escapes, UESCAPE can follow
+ *  <xus> quoted string with Unicode escapes
+ *  <xusend> end of a quoted string with Unicode escapes, UESCAPE can follow
+ *
+ * Note: we intentionally don't mimic the backend's <xeu> state; we have
+ * no need to distinguish it from <xe> state, and no good way to get out
+ * of it in error cases.  The backend just throws yyerror() in those
+ * cases, but that's not an option here.
+ */
+
+%x xb
+%x xc
+%x xd
+%x xh
+%x xe
+%x xq
+%x xdolq
+%x xui
+%x xuiend
+%x xus
+%x xusend
+
+/*
+ * In order to make the world safe for Windows and Mac clients as well as
+ * Unix ones, we accept either \n or \r as a newline.  A DOS-style \r\n
+ * sequence will be seen as two successive newlines, but that doesn't cause
+ * any problems.  Comments that start with -- and extend to the next
+ * newline are treated as equivalent to a single whitespace character.
+ *
+ * NOTE a fine point: if there is no newline following --, we will absorb
+ * everything to the end of the input as a comment.  This is correct.  Older
+ * versions of Postgres failed to recognize -- as a comment if the input
+ * did not end with a newline.
+ *
+ * XXX perhaps \f (formfeed) should be treated as a newline as well?
+ *
+ * XXX if you change the set of whitespace characters, fix scanner_isspace()
+ * to agree, and see also the plpgsql lexer.
+ */
+
+space			[ \t\n\r\f]
+horiz_space		[ \t\f]
+newline			[\n\r]
+non_newline		[^\n\r]
+
+comment			("--"{non_newline}*)
+
+whitespace		({space}+|{comment})
+
+/*
+ * SQL requires at least one newline in the whitespace separating
+ * string literals that are to be concatenated.  Silly, but who are we
+ * to argue?  Note that {whitespace_with_newline} should not have * after
+ * it, whereas {whitespace} should generally have a * after it...
+ */
+
+special_whitespace		({space}+|{comment}{newline})
+horiz_whitespace		({horiz_space}|{comment})
+whitespace_with_newline	({horiz_whitespace}*{newline}{special_whitespace}*)
+
+/*
+ * To ensure that {quotecontinue} can be scanned without having to back up
+ * if the full pattern isn't matched, we include trailing whitespace in
+ * {quotestop}.  This matches all cases where {quotecontinue} fails to match,
+ * except for {quote} followed by whitespace and just one "-" (not two,
+ * which would start a {comment}).  To cover that we have {quotefail}.
+ * The actions for {quotestop} and {quotefail} must throw back characters
+ * beyond the quote proper.
+ */
+quote			'
+quotestop		{quote}{whitespace}*
+quotecontinue	{quote}{whitespace_with_newline}{quote}
+quotefail		{quote}{whitespace}*"-"
+
+/* Bit string
+ * It is tempting to scan the string for only those characters
+ * which are allowed. However, this leads to silently swallowed
+ * characters if illegal characters are included in the string.
+ * For example, if xbinside is [01] then B'ABCD' is interpreted
+ * as a zero-length string, and the ABCD' is lost!
+ * Better to pass the string forward and let the input routines
+ * validate the contents.
+ */
+xbstart			[bB]{quote}
+xbinside		[^']*
+
+/* Hexadecimal number */
+xhstart			[xX]{quote}
+xhinside		[^']*
+
+/* National character */
+xnstart			[nN]{quote}
+
+/* Quoted string that allows backslash escapes */
+xestart			[eE]{quote}
+xeinside		[^\\']+
+xeescape		[\\][^0-7]
+xeoctesc		[\\][0-7]{1,3}
+xehexesc		[\\]x[0-9A-Fa-f]{1,2}
+xeunicode		[\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})
+xeunicodefail	[\\](u[0-9A-Fa-f]{0,3}|U[0-9A-Fa-f]{0,7})
+
+/* Extended quote
+ * xqdouble implements embedded quote, ''''
+ */
+xqstart			{quote}
+xqdouble		{quote}{quote}
+xqinside		[^']+
+
+/* $foo$ style quotes ("dollar quoting")
+ * The quoted string starts with $foo$ where "foo" is an optional string
+ * in the form of an identifier, except that it may not contain "$",
+ * and extends to the first occurrence of an identical string.
+ * There is *no* processing of the quoted text.
+ *
+ * {dolqfailed} is an error rule to avoid scanner backup when {dolqdelim}
+ * fails to match its trailing "$".
+ */
+dolq_start		[A-Za-z\200-\377_]
+dolq_cont		[A-Za-z\200-\377_0-9]
+dolqdelim		\$({dolq_start}{dolq_cont}*)?\$
+dolqfailed		\${dolq_start}{dolq_cont}*
+dolqinside		[^$]+
+
+/* Double quote
+ * Allows embedded spaces and other special characters into identifiers.
+ */
+dquote			\"
+xdstart			{dquote}
+xdstop			{dquote}
+xddouble		{dquote}{dquote}
+xdinside		[^"]+
+
+/* Unicode escapes */
+uescape			[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']{quote}
+/* error rule to avoid backup */
+uescapefail		[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU]
+
+/* Quoted identifier with Unicode escapes */
+xuistart		[uU]&{dquote}
+
+/* Quoted string with Unicode escapes */
+xusstart		[uU]&{quote}
+
+/* Optional UESCAPE after a quoted string or identifier with Unicode escapes. */
+xustop1		{uescapefail}?
+xustop2		{uescape}
+
+/* error rule to avoid backup */
+xufailed		[uU]&
+
+
+/* C-style comments
+ *
+ * The "extended comment" syntax closely resembles allowable operator syntax.
+ * The tricky part here is to get lex to recognize a string starting with
+ * slash-star as a comment, when interpreting it as an operator would produce
+ * a longer match --- remember lex will prefer a longer match!  Also, if we
+ * have something like plus-slash-star, lex will think this is a 3-character
+ * operator whereas we want to see it as a + operator and a comment start.
+ * The solution is two-fold:
+ * 1. append {op_chars}* to xcstart so that it matches as much text as
+ *    {operator} would. Then the tie-breaker (first matching rule of same
+ *    length) ensures xcstart wins.  We put back the extra stuff with yyless()
+ *    in case it contains a star-slash that should terminate the comment.
+ * 2. In the operator rule, check for slash-star within the operator, and
+ *    if found throw it back with yyless().  This handles the plus-slash-star
+ *    problem.
+ * Dash-dash comments have similar interactions with the operator rule.
+ */
+xcstart			\/\*{op_chars}*
+xcstop			\*+\/
+xcinside		[^*/]+
+
+digit			[0-9]
+ident_start		[A-Za-z\200-\377_]
+ident_cont		[A-Za-z\200-\377_0-9\$]
+
+identifier		{ident_start}{ident_cont}*
+
+/* Assorted special-case operators and operator-like tokens */
+typecast		"::"
+dot_dot			\.\.
+colon_equals	":="
+equals_greater	"=>"
+less_equals		"<="
+greater_equals	">="
+less_greater	"<>"
+not_equals		"!="
+
+/*
+ * "self" is the set of chars that should be returned as single-character
+ * tokens.  "op_chars" is the set of chars that can make up "Op" tokens,
+ * which can be one or more characters long (but if a single-char token
+ * appears in the "self" set, it is not to be returned as an Op).  Note
+ * that the sets overlap, but each has some chars that are not in the other.
+ *
+ * If you change either set, adjust the character lists appearing in the
+ * rule for "operator"!
+ */
+self			[,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+op_chars		[\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
+operator		{op_chars}+
+
+/* we no longer allow unary minus in numbers.
+ * instead we pass it separately to parser. there it gets
+ * coerced via doNegate() -- Leon aug 20 1999
+ *
+ * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
+ *
+ * {realfail1} and {realfail2} are added to prevent the need for scanner
+ * backup when the {real} rule fails to match completely.
+ */
+
+integer			{digit}+
+decimal			(({digit}*\.{digit}+)|({digit}+\.{digit}*))
+decimalfail		{digit}+\.\.
+real			({integer}|{decimal})[Ee][-+]?{digit}+
+realfail1		({integer}|{decimal})[Ee]
+realfail2		({integer}|{decimal})[Ee][-+]
+
+param			\${integer}
+
+/* psql-specific: characters allowed in variable names */
+variable_char	[A-Za-z\200-\377_0-9]
+
+other			.
+
+/*
+ * Dollar quoted strings are totally opaque, and no escaping is done on them.
+ * Other quoted strings must allow some special characters such as single-quote
+ *  and newline.
+ * Embedded single-quotes are implemented both in the SQL standard
+ *  style of two adjacent single quotes "''" and in the Postgres/Java style
+ *  of escaped-quote "\'".
+ * Other embedded escaped characters are matched explicitly and the leading
+ *  backslash is dropped from the string.
+ * Note that xcstart must appear before operator, as explained above!
+ *  Also whitespace (comment) must appear before operator.
+ */
+
+%%
+
+{whitespace}	{
+					/*
+					 * Note that the whitespace rule includes both true
+					 * whitespace and single-line ("--" style) comments.
+					 * We suppress whitespace at the start of the query
+					 * buffer.  We also suppress all single-line comments,
+					 * which is pretty dubious but is the historical
+					 * behavior.
+					 */
+					if (!(output_buf->len == 0 || yytext[0] == '-'))
+						ECHO;
+				}
+
+{xcstart}		{
+					cur_state->xcdepth = 0;
+					BEGIN(xc);
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstart}	{
+					cur_state->xcdepth++;
+					/* Put back any characters past slash-star; see above */
+					my_yyless(2);
+					ECHO;
+				}
+
+<xc>{xcstop}	{
+					if (cur_state->xcdepth <= 0)
+					{
+						BEGIN(INITIAL);
+					}
+					else
+						cur_state->xcdepth--;
+					ECHO;
+				}
+
+<xc>{xcinside}	{
+					ECHO;
+				}
+
+<xc>{op_chars}	{
+					ECHO;
+				}
+
+<xc>\*+			{
+					ECHO;
+				}
+
+{xbstart}		{
+					BEGIN(xb);
+					ECHO;
+				}
+<xb>{quotestop}	|
+<xb>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xh>{xhinside}	|
+<xb>{xbinside}	{
+					ECHO;
+				}
+<xh>{quotecontinue}	|
+<xb>{quotecontinue}	{
+					ECHO;
+				}
+
+{xhstart}		{
+					/* Hexadecimal bit type.
+					 * At some point we should simply pass the string
+					 * forward to the parser and label it there.
+					 * In the meantime, place a leading "x" on the string
+					 * to mark it for the input routine as a hex string.
+					 */
+					BEGIN(xh);
+					ECHO;
+				}
+<xh>{quotestop}	|
+<xh>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+
+{xnstart}		{
+					my_yyless(1);				/* eat only 'n' this time */
+					ECHO;
+				}
+
+{xqstart}		{
+					if (cur_state->cb.standard_strings())
+						BEGIN(xq);
+					else
+						BEGIN(xe);
+					ECHO;
+				}
+{xestart}		{
+					BEGIN(xe);
+					ECHO;
+				}
+{xusstart}		{
+					BEGIN(xus);
+					ECHO;
+				}
+<xq,xe>{quotestop}	|
+<xq,xe>{quotefail} {
+					my_yyless(1);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xus>{quotestop} |
+<xus>{quotefail} {
+					my_yyless(1);
+					BEGIN(xusend);
+					ECHO;
+				}
+<xusend>{whitespace} {
+					ECHO;
+				}
+<xusend>{other} |
+<xusend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xusend>{xustop2} {
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xq,xe,xus>{xqdouble} {
+					ECHO;
+				}
+<xq,xus>{xqinside}  {
+					ECHO;
+				}
+<xe>{xeinside}  {
+					ECHO;
+				}
+<xe>{xeunicode} {
+					ECHO;
+				}
+<xe>{xeunicodefail}	{
+					ECHO;
+				}
+<xe>{xeescape}  {
+					ECHO;
+				}
+<xe>{xeoctesc}  {
+					ECHO;
+				}
+<xe>{xehexesc}  {
+					ECHO;
+				}
+<xq,xe,xus>{quotecontinue} {
+					ECHO;
+				}
+<xe>.			{
+					/* This is only needed for \ just before EOF */
+					ECHO;
+				}
+
+{dolqdelim}		{
+					cur_state->dolqstart = pg_strdup(yytext);
+					BEGIN(xdolq);
+					ECHO;
+				}
+{dolqfailed}	{
+					/* throw back all but the initial "$" */
+					my_yyless(1);
+					ECHO;
+				}
+<xdolq>{dolqdelim} {
+					if (strcmp(yytext, cur_state->dolqstart) == 0)
+					{
+						free(cur_state->dolqstart);
+						cur_state->dolqstart = NULL;
+						BEGIN(INITIAL);
+					}
+					else
+					{
+						/*
+						 * When we fail to match $...$ to dolqstart, transfer
+						 * the $... part to the output, but put back the final
+						 * $ for rescanning.  Consider $delim$...$junk$delim$
+						 */
+						my_yyless(yyleng-1);
+					}
+					ECHO;
+				}
+<xdolq>{dolqinside} {
+					ECHO;
+				}
+<xdolq>{dolqfailed} {
+					ECHO;
+				}
+<xdolq>.		{
+					/* This is only needed for $ inside the quoted text */
+					ECHO;
+				}
+
+{xdstart}		{
+					BEGIN(xd);
+					ECHO;
+				}
+{xuistart}		{
+					BEGIN(xui);
+					ECHO;
+				}
+<xd>{xdstop}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xui>{dquote} {
+					my_yyless(1);
+					BEGIN(xuiend);
+					ECHO;
+				}
+<xuiend>{whitespace} {
+					ECHO;
+				}
+<xuiend>{other} |
+<xuiend>{xustop1} {
+					my_yyless(0);
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xuiend>{xustop2}	{
+					BEGIN(INITIAL);
+					ECHO;
+				}
+<xd,xui>{xddouble}	{
+					ECHO;
+				}
+<xd,xui>{xdinside}	{
+					ECHO;
+				}
+
+{xufailed}	{
+					/* throw back all but the initial u/U */
+					my_yyless(1);
+					ECHO;
+				}
+
+{typecast}		{
+					ECHO;
+				}
+
+{dot_dot}		{
+					ECHO;
+				}
+
+{colon_equals}	{
+					ECHO;
+				}
+
+{equals_greater} {
+					ECHO;
+				}
+
+{less_equals}	{
+					ECHO;
+				}
+
+{greater_equals} {
+					ECHO;
+				}
+
+{less_greater}	{
+					ECHO;
+				}
+
+{not_equals}	{
+					ECHO;
+				}
+
+	/*
+	 * These rules are specific to psql --- they implement parenthesis
+	 * counting and detection of command-ending semicolon.  These must
+	 * appear before the {self} rule so that they take precedence over it.
+	 */
+
+"("				{
+					cur_state->paren_depth++;
+					ECHO;
+				}
+
+")"				{
+					if (cur_state->paren_depth > 0)
+						cur_state->paren_depth--;
+					ECHO;
+				}
+
+";"				{
+					ECHO;
+					if (cur_state->paren_depth == 0)
+					{
+						/* Terminate lexing temporarily */
+						return LEXRES_SEMI;
+					}
+				}
+
+	/*
+	 * psql-specific rules to handle backslash commands and variable
+	 * substitution.  We want these before {self}, also.
+	 */
+
+"\\"[;:]		{
+					/* Force a semicolon or colon into the query buffer */
+					scan_emit(yytext + 1, 1);
+				}
+
+"\\"			{
+					/* Terminate lexing temporarily */
+					return LEXRES_BACKSLASH;
+				}
+
+:{variable_char}+	{
+					/* Possible psql variable substitution */
+					char	   *varname = NULL;
+					const char *value = NULL;
+					void	  (*free_fn)(void *) = NULL;
+
+					if (cur_state->cb.get_variable)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = cur_state->cb.get_variable(varname,
+ 									false, false, &free_fn);
+					}
+
+					if (value)
+					{
+						/* It is a variable, check for recursion */
+						if (var_is_current_source(cur_state, varname))
+						{
+							/* Recursive expansion --- don't go there */
+							cur_state->cb.error_out("skipping recursive expansion of variable \"%s\"\n",
+									   varname);
+							/* Instead copy the string as is */
+							ECHO;
+						}
+						else
+						{
+							/* OK, perform substitution */
+							push_new_buffer(value, varname);
+							/* yy_scan_string already made buffer active */
+						}
+						if (free_fn)
+							free_fn((void*)value);
+					}
+					else
+					{
+						/*
+						 * if the variable doesn't exist we'll copy the
+						 * string as is
+						 */
+						ECHO;
+					}
+
+					if (varname)
+						free(varname);
+				}
+
+:'{variable_char}+'	{
+					escape_variable(false);
+				}
+
+:\"{variable_char}+\"	{
+					escape_variable(true);
+				}
+
+	/*
+	 * These rules just avoid the need for scanner backup if one of the
+	 * two rules above fails to match completely.
+	 */
+
+:'{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+:\"{variable_char}*	{
+					/* Throw back everything but the colon */
+					my_yyless(1);
+					ECHO;
+				}
+
+	/*
+	 * Back to backend-compatible rules.
+	 */
+
+{self}			{
+					ECHO;
+				}
+
+{operator}		{
+					/*
+					 * Check for embedded slash-star or dash-dash; those
+					 * are comment starts, so operator must stop there.
+					 * Note that slash-star or dash-dash at the first
+					 * character will match a prior rule, not this one.
+					 */
+					int		nchars = yyleng;
+					char   *slashstar = strstr(yytext, "/*");
+					char   *dashdash = strstr(yytext, "--");
+
+					if (slashstar && dashdash)
+					{
+						/* if both appear, take the first one */
+						if (slashstar > dashdash)
+							slashstar = dashdash;
+					}
+					else if (!slashstar)
+						slashstar = dashdash;
+					if (slashstar)
+						nchars = slashstar - yytext;
+
+					/*
+					 * For SQL compatibility, '+' and '-' cannot be the
+					 * last char of a multi-char operator unless the operator
+					 * contains chars that are not in SQL operators.
+					 * The idea is to lex '=-' as two operators, but not
+					 * to forbid operator names like '?-' that could not be
+					 * sequences of SQL operators.
+					 */
+					while (nchars > 1 &&
+						   (yytext[nchars-1] == '+' ||
+							yytext[nchars-1] == '-'))
+					{
+						int		ic;
+
+						for (ic = nchars-2; ic >= 0; ic--)
+						{
+							if (strchr("~!@#^&|`?%", yytext[ic]))
+								break;
+						}
+						if (ic >= 0)
+							break; /* found a char that makes it OK */
+						nchars--; /* else remove the +/-, and check again */
+					}
+
+					if (nchars < yyleng)
+					{
+						/* Strip the unwanted chars from the token */
+						my_yyless(nchars);
+					}
+					ECHO;
+				}
+
+{param}			{
+					ECHO;
+				}
+
+{integer}		{
+					ECHO;
+				}
+{decimal}		{
+					ECHO;
+				}
+{decimalfail}	{
+					/* throw back the .., and treat as integer */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+{real}			{
+					ECHO;
+				}
+{realfail1}		{
+					/*
+					 * throw back the [Ee], and treat as {decimal}.  Note
+					 * that it is possible the input is actually {integer},
+					 * but since this case will almost certainly lead to a
+					 * syntax error anyway, we don't bother to distinguish.
+					 */
+					my_yyless(yyleng-1);
+					ECHO;
+				}
+{realfail2}		{
+					/* throw back the [Ee][+-], and proceed as above */
+					my_yyless(yyleng-2);
+					ECHO;
+				}
+
+
+{identifier}	{
+					ECHO;
+				}
+
+{other}			{
+					ECHO;
+				}
+
+
+	/*
+	 * Everything from here down is psql-specific.
+	 */
+
+<<EOF>>			{
+					StackElem  *stackelem = cur_state->buffer_stack;
+
+					if (stackelem == NULL)
+						return LEXRES_EOL; /* end of input reached */
+
+					/*
+					 * We were expanding a variable, so pop the inclusion
+					 * stack and keep lexing
+					 */
+					pop_buffer_stack(cur_state);
+
+					stackelem = cur_state->buffer_stack;
+					if (stackelem != NULL)
+					{
+						yy_switch_to_buffer(stackelem->buf);
+						cur_state->curline = stackelem->bufstring;
+						cur_state->refline = stackelem->origstring ? stackelem->origstring : stackelem->bufstring;
+					}
+					else
+					{
+						yy_switch_to_buffer(cur_state->scanbufhandle);
+						cur_state->curline = cur_state->scanbuf;
+						cur_state->refline = cur_state->scanline;
+					}
+				}
+%%
+
+static void my_psql_scan_finish(PsqlScanState state);
+static void my_psql_scan_reset(PsqlScanState state);
+static void psql_error_errout(const char *fmt, ...)
+	__attribute__ ((format (printf, 1, 2)));
+static bool psql_standard_strings(void);
+
+static void
+psql_scan_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &my_psql_scan_finish;
+	state->reset = &my_psql_scan_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
+}
+
+/*
+ * Create a lexer working state struct.
+ */
+PsqlScanState
+psql_scan_create(void)
+{
+	PsqlScanState state;
+
+	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
+	psql_scan_initialize(state);
+
+	return state;
+}
+
+/*
+ * Destroy a lexer working state struct, releasing all resources.
+ */
+void
+psql_scan_destroy(PsqlScanState state)
+{
+	psql_scan_finish(state);
+
+	psql_scan_reset(state);
+
+	free(state);
+}
+
+/*
+ * Set up to perform lexing of the given input line.
+ *
+ * The text at *line, extending for line_len bytes, will be scanned by
+ * subsequent calls to the psql_scan routines.  psql_scan_finish should
+ * be called when scanning is complete.  Note that the lexer retains
+ * a pointer to the storage at *line --- this string must not be altered
+ * or freed until after psql_scan_finish is called.
+ */
+void
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PsqlScanCallbacks *cb)
+{
+	/* Mustn't be scanning already */
+	Assert(state->scanbufhandle == NULL);
+	Assert(state->buffer_stack == NULL);
+
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	if (cb->standard_strings)
+		state->cb.standard_strings = cb->standard_strings;
+	else
+		state->cb.standard_strings = &psql_standard_strings;
+
+	state->cb.enc_mblen = cb->enc_mblen;
+
+	if (cb->error_out)
+		state->cb.error_out = cb->error_out;
+	else
+		state->cb.error_out = &psql_error_errout;
+
+	/* needed for prepare_buffer */
+	cur_state = state;
+
+	/* Set up flex input buffer with appropriate translation and padding */
+	state->scanbufhandle = prepare_buffer(line, line_len,
+										  &state->scanbuf);
+	state->scanline = line;
+	state->curpos = 0;
+
+	/* Set lookaside data in case we have to map unsafe encoding */
+	state->curline = state->scanbuf;
+	state->refline = state->scanline;
+}
+
+/*
+ * Dispatch wrappers.  These forward to the finish/reset functions stored
+ * in the scan state, so they can also be called on a scan state that is
+ * owned by another lexer.
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	if (state->finish)
+		state->finish(state);
+}
+
+void
+psql_scan_reset(PsqlScanState state)
+{
+	if (state->reset)
+		state->reset(state);
+}
+
+
+/*
+ * Do lexical analysis of SQL command text.
+ *
+ * The text previously passed to psql_scan_setup is scanned, and appended
+ * (possibly with transformation) to query_buf.
+ *
+ * The return value indicates the condition that stopped scanning:
+ *
+ * PSCAN_SEMICOLON: found a command-ending semicolon.  (The semicolon is
+ * transferred to query_buf.)  The command accumulated in query_buf should
+ * be executed, then clear query_buf and call again to scan the remainder
+ * of the line.
+ *
+ * PSCAN_BACKSLASH: found a backslash that starts a psql special command.
+ * Any previous data on the line has been transferred to query_buf.
+ * The caller will typically next call psql_scan_slash_command(),
+ * perhaps psql_scan_slash_option(), and psql_scan_slash_command_end().
+ *
+ * PSCAN_INCOMPLETE: the end of the line was reached, but we have an
+ * incomplete SQL command.  *prompt is set to the appropriate prompt type.
+ *
+ * PSCAN_EOL: the end of the line was reached, and there is no lexical
+ * reason to consider the command incomplete.  The caller may or may not
+ * choose to send it.  *prompt is set to the appropriate prompt type if
+ * the caller chooses to collect more input.
+ *
+ * In the PSCAN_INCOMPLETE and PSCAN_EOL cases, psql_scan_finish() should
+ * be called next, then the cycle may be repeated with a fresh input line.
+ *
+ * In all cases, *prompt is set to an appropriate prompt type code for the
+ * next line-input operation.
+ */
+PsqlScanResult
+psql_scan(PsqlScanState state,
+		  PQExpBuffer query_buf,
+		  promptStatus_t *prompt)
+{
+	PsqlScanResult result;
+	int			lexresult;
+
+	/* Must be scanning already */
+	Assert(state->scanbufhandle != NULL);
+
+	/* Set up static variables that will be used by yylex */
+	cur_state = state;
+	output_buf = query_buf;
+
+	if (state->buffer_stack != NULL)
+		yy_switch_to_buffer(state->buffer_stack->buf);
+	else
+		yy_switch_to_buffer(state->scanbufhandle);
+
+	BEGIN(state->start_state);
+
+	/* And lex. */
+	lexresult = yylex();
+
+	/* Update static vars back to the state struct */
+	state->start_state = YY_START;
+
+	/*
+	 * Check termination state and return appropriate result info.
+	 */
+	switch (lexresult)
+	{
+		case LEXRES_EOL:		/* end of input */
+			switch (state->start_state)
+			{
+				/* This switch must cover all non-slash-command states. */
+				case INITIAL:
+				case xuiend:	/* we treat these like INITIAL */
+				case xusend:
+					if (state->paren_depth > 0)
+					{
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_PAREN;
+					}
+					else if (query_buf->len > 0)
+					{
+						result = PSCAN_EOL;
+						*prompt = PROMPT_CONTINUE;
+					}
+					else
+					{
+						/* never bother to send an empty buffer */
+						result = PSCAN_INCOMPLETE;
+						*prompt = PROMPT_READY;
+					}
+					break;
+				case xb:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xc:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_COMMENT;
+					break;
+				case xd:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xh:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xe:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				case xdolq:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOLLARQUOTE;
+					break;
+				case xui:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_DOUBLEQUOTE;
+					break;
+				case xus:
+					result = PSCAN_INCOMPLETE;
+					*prompt = PROMPT_SINGLEQUOTE;
+					break;
+				default:
+					/* can't get here */
+					fprintf(stderr, "invalid YY_START\n");
+					exit(1);
+			}
+			break;
+		case LEXRES_SEMI:		/* semicolon */
+			result = PSCAN_SEMICOLON;
+			*prompt = PROMPT_READY;
+			break;
+		case LEXRES_BACKSLASH:	/* backslash */
+			result = PSCAN_BACKSLASH;
+			*prompt = PROMPT_READY;
+			break;
+		default:
+			/* can't get here */
+			fprintf(stderr, "invalid yylex result\n");
+			exit(1);
+	}
+
+	return result;
+}
+
+/*
+ * Clean up after scanning a string.  This flushes any unread input and
+ * releases resources (but not the PsqlScanState itself).  Note however
+ * that this does not reset the lexer scan state; that can be done by
+ * psql_scan_reset(), which is an orthogonal operation.
+ *
+ * It is legal to call this when not scanning anything (makes it easier
+ * to deal with error recovery).
+ */
+static void
+my_psql_scan_finish(PsqlScanState state)
+{
+	/* Drop any incomplete variable expansions. */
+	while (state->buffer_stack != NULL)
+		pop_buffer_stack(state);
+
+	/* Done with the outer scan buffer, too */
+	if (state->scanbufhandle)
+		yy_delete_buffer(state->scanbufhandle);
+	state->scanbufhandle = NULL;
+	if (state->scanbuf)
+		free(state->scanbuf);
+	state->scanbuf = NULL;
+}
+
+/*
+ * Create new lexer scanning state for this lexer which parses from the current
+ * position of the given scanning state for another lexer. The given state is
+ * destroyed.
+ *
+ * Note: this function cannot use the yy* functions and variables of the given
+ * state, because they belong to a different lexer.
+ */
+void
+psql_scan_switch_lexer(PsqlScanState state)
+{
+	const char	   *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_initialize(state);
+	psql_scan_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
+ * Reset lexer scanning state to start conditions.  This is appropriate
+ * for executing \r psql commands (or any other time that we discard the
+ * prior contents of query_buf).  It is not, however, necessary to do this
+ * when we execute and clear the buffer after getting a PSCAN_SEMICOLON or
+ * PSCAN_EOL scan result, because the scan state must be INITIAL when those
+ * conditions are returned.
+ *
+ * Note that this is unrelated to flushing unread input; that task is
+ * done by psql_scan_finish().
+ */
+static void
+my_psql_scan_reset(PsqlScanState state)
+{
+	state->start_state = INITIAL;
+	state->paren_depth = 0;
+	state->xcdepth = 0;			/* not really necessary */
+	if (state->dolqstart)
+		free(state->dolqstart);
+	state->dolqstart = NULL;
+}
+
+/*
+ * Return true if lexer is currently in an "inside quotes" state.
+ *
+ * This is pretty grotty but is needed to preserve the old behavior
+ * that mainloop.c drops blank lines not inside quotes without even
+ * echoing them.
+ */
+bool
+psql_scan_in_quote(PsqlScanState state)
+{
+	return state->start_state != INITIAL;
+}
+
+/*
+ * Push the given string onto the stack of stuff to scan.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+void
+push_new_buffer(const char *newstr, const char *varname)
+{
+	StackElem  *stackelem;
+
+	stackelem = (StackElem *) pg_malloc(sizeof(StackElem));
+
+	/*
+	 * In current usage, the passed varname points at the current flex
+	 * input buffer; we must copy it before calling prepare_buffer()
+	 * because that will change the buffer state.
+	 */
+	stackelem->varname = varname ? pg_strdup(varname) : NULL;
+
+	stackelem->buf = prepare_buffer(newstr, strlen(newstr),
+									&stackelem->bufstring);
+	cur_state->curline = stackelem->bufstring;
+	if (ENC_IS_SAFE(cur_state))
+	{
+		stackelem->origstring = NULL;
+		cur_state->refline = stackelem->bufstring;
+	}
+	else
+	{
+		stackelem->origstring = pg_strdup(newstr);
+		cur_state->refline = stackelem->origstring;
+	}
+	stackelem->next = cur_state->buffer_stack;
+	cur_state->buffer_stack = stackelem;
+}
+
+/*
+ * Pop the topmost buffer stack item (there must be one!)
+ *
+ * NB: after this, the flex input state is unspecified; caller must
+ * switch to an appropriate buffer to continue lexing.
+ */
+void
+pop_buffer_stack(PsqlScanState state)
+{
+	StackElem  *stackelem = state->buffer_stack;
+
+	state->buffer_stack = stackelem->next;
+	yy_delete_buffer(stackelem->buf);
+	free(stackelem->bufstring);
+	if (stackelem->origstring)
+		free(stackelem->origstring);
+	if (stackelem->varname)
+		free(stackelem->varname);
+	free(stackelem);
+}
+
+/*
+ * Check if specified variable name is the source for any string
+ * currently being scanned
+ */
+bool
+var_is_current_source(PsqlScanState state, const char *varname)
+{
+	StackElem  *stackelem;
+
+	for (stackelem = state->buffer_stack;
+		 stackelem != NULL;
+		 stackelem = stackelem->next)
+	{
+		if (stackelem->varname && strcmp(stackelem->varname, varname) == 0)
+			return true;
+	}
+	return false;
+}
+
+/*
+ * Set up a flex input buffer to scan the given data.  We always make a
+ * copy of the data.  If working in an unsafe encoding, the copy has
+ * multibyte sequences replaced by FFs to avoid fooling the lexer rules.
+ *
+ * cur_state must point to the active PsqlScanState.
+ *
+ * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
+ */
+YY_BUFFER_STATE
+prepare_buffer(const char *txt, int len, char **txtcopy)
+{
+	char	   *newtxt;
+
+	/* Flex wants two \0 characters after the actual data */
+	newtxt = pg_malloc(len + 2);
+	*txtcopy = newtxt;
+	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
+
+	if (ENC_IS_SAFE(cur_state))
+		memcpy(newtxt, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		int		i = 0;
+
+		while (i < len)
+		{
+			int		thislen = cur_state->cb.enc_mblen(txt + i);
+
+			/* first byte should always be okay... */
+			newtxt[i] = txt[i];
+			i++;
+			while (--thislen > 0 && i < len)
+				newtxt[i++] = (char) 0xFF;
+		}
+	}
+
+	return cur_state->my_yy_scan_buffer(newtxt, len + 2);
+}
+
+/*
+ * scan_emit() --- body for ECHO macro
+ *
+ * NB: this must be used for ALL and ONLY the text copied from the flex
+ * input data.  If you pass it something that is not part of the yytext
+ * string, you are making a mistake.  Internally generated text can be
+ * appended directly to output_buf.
+ */
+void
+scan_emit(const char *txt, int len)
+{
+	if (ENC_IS_SAFE(cur_state))
+		appendBinaryPQExpBuffer(output_buf, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			appendPQExpBufferChar(output_buf, ch);
+		}
+	}
+}
+
+/*
+ * extract_substring --- fetch the true value of (part of) the current token
+ *
+ * This is like scan_emit(), except that the data is returned as a malloc'd
+ * string rather than being pushed directly to output_buf.
+ */
+char *
+extract_substring(const char *txt, int len)
+{
+	char	   *result = (char *) pg_malloc(len + 1);
+
+	if (ENC_IS_SAFE(cur_state))
+		memcpy(result, txt, len);
+	else
+	{
+		/* Gotta do it the hard way */
+		const char *reference = cur_state->refline;
+		int		i;
+
+		reference += (txt - cur_state->curline);
+
+		for (i = 0; i < len; i++)
+		{
+			char	ch = txt[i];
+
+			if (ch == (char) 0xFF)
+				ch = reference[i];
+			result[i] = ch;
+		}
+	}
+	result[len] = '\0';
+	return result;
+}
+
+/*
+ * escape_variable --- process :'VARIABLE' or :"VARIABLE"
+ *
+ * If the variable name is found, escape its value using the appropriate
+ * quoting method and emit the value to output_buf.  (Since the result is
+ * surely quoted, there is never any reason to rescan it.)  If we don't
+ * find the variable or the escaping function fails, emit the token as-is.
+ */
+void
+escape_variable(bool as_ident)
+{
+	/* Variable lookup if possible. */
+	if (cur_state->cb.get_variable)
+	{
+		char		*varname;
+		const char  *value;
+		void	   (*free_fn)(void *);
+
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = cur_state->cb.get_variable(varname, true, as_ident, &free_fn);
+		free(varname);
+
+		if (value)
+		{
+			appendPQExpBufferStr(output_buf, value);
+			if (free_fn)
+				free_fn((void*)value);
+			return;
+		}
+	}
+
+	/*
+	 * If we reach this point, some kind of error has occurred.  Emit the
+	 * original text into the output buffer.
+	 */
+	scan_emit(yytext, yyleng);
+}
+
+/* Default error output function */
+static void psql_error_errout(const char *fmt, ...)
+{
+	va_list	ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, _(fmt), ap);
+	va_end(ap);
+}
+
+/* Default function to check standard_conforming_strings */
+static bool psql_standard_strings(void)
+{
+	return false;
+}
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 6916f6f..47e9077 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -337,9 +337,12 @@ main(int argc, char *argv[])
 					puts(cell->val);
 
 				scan_state = psql_scan_create();
-				psql_scan_setup(scan_state,
-								cell->val,
-								strlen(cell->val));
+				/* set enc_mblen according to the encoding */
+				psqlscan_callbacks.enc_mblen =
+					(pg_valid_server_encoding_id(pset.encoding) ?
+					 NULL : &psql_mblen);
+				psql_scan_setup(scan_state,	cell->val, strlen(cell->val),
+								&psqlscan_callbacks);
 
 				successResult = HandleSlashCmds(scan_state, NULL) != PSQL_CMD_ERROR
 					? EXIT_SUCCESS : EXIT_FAILURE;
-- 
1.8.3.1
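
The trailing '+'/'-' handling in the {operator} action of the lexer above is easy to misread, so here is a minimal standalone sketch of the same trimming logic, outside flex. The function name operator_length() is hypothetical and is not part of the patch; it takes the matched operator text and returns how many leading characters the lexer would keep as the operator, under the assumption that the input consists only of op_chars.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): given the text matched by
 * {operator}, return the number of leading chars kept as the operator.
 */
int
operator_length(const char *tok)
{
	int			nchars;
	const char *slashstar;
	const char *dashdash;

	assert(tok != NULL);

	nchars = (int) strlen(tok);
	slashstar = strstr(tok, "/*");
	dashdash = strstr(tok, "--");

	/* An embedded comment start ends the operator; take the earliest one */
	if (slashstar && dashdash)
	{
		if (slashstar > dashdash)
			slashstar = dashdash;
	}
	else if (!slashstar)
		slashstar = dashdash;
	if (slashstar)
		nchars = (int) (slashstar - tok);

	/*
	 * A trailing '+' or '-' is dropped unless an earlier char is one that
	 * cannot appear in a standard SQL operator.
	 */
	while (nchars > 1 &&
		   (tok[nchars - 1] == '+' || tok[nchars - 1] == '-'))
	{
		int			ic;

		for (ic = nchars - 2; ic >= 0; ic--)
		{
			if (strchr("~!@#^&|`?%", tok[ic]))
				break;
		}
		if (ic >= 0)
			break;				/* found a char that makes it OK */
		nchars--;				/* else drop the +/- and check again */
	}

	return nchars;
}
```

With this sketch, "=-" is cut to length 1 (so it lexes as '=' followed by '-'), while "?-" keeps its full length of 2 because '?' is not a standard SQL operator character.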

0002-pgbench-uses-common-frontend-SQL-parser.patchtext/x-patch; charset=us-asciiDownload
From a68a6ed2db22a96083eb3f349d4be37d609bd36d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 16 Mar 2016 12:06:22 +0900
Subject: [PATCH 2/3] pgbench uses common frontend SQL parser

Make pgbench use the common frontend SQL parser instead of its
homegrown parser.
---
 src/bin/pgbench/Makefile  |   8 +-
 src/bin/pgbench/pgbench.c | 476 +++++++++++++++++++++++++++++++---------------
 2 files changed, 329 insertions(+), 155 deletions(-)

diff --git a/src/bin/pgbench/Makefile b/src/bin/pgbench/Makefile
index 560bfea..5bf8bab 100644
--- a/src/bin/pgbench/Makefile
+++ b/src/bin/pgbench/Makefile
@@ -5,11 +5,12 @@ PGAPPICON = win32
 
 subdir = src/bin/pgbench
 top_builddir = ../../..
+psqldir = ../psql
 include $(top_builddir)/src/Makefile.global
 
-OBJS = pgbench.o exprparse.o $(WIN32RES)
+OBJS = pgbench.o exprparse.o $(psqldir)/psqlscan.o $(WIN32RES)
 
-override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) $(CPPFLAGS)
+override CPPFLAGS := -I. -I$(srcdir) -I$(libpq_srcdir) -I$(psqldir) $(CPPFLAGS)
 
 ifneq ($(PORTNAME), win32)
 override CFLAGS += $(PTHREAD_CFLAGS)
@@ -24,6 +25,9 @@ pgbench: $(OBJS) | submake-libpq submake-libpgport
 # exprscan is compiled as part of exprparse
 exprparse.o: exprscan.c
 
+$(psqldir)/psqlscan.o:
+	make -C $(psqldir) psqlscan.o
+
 distprep: exprparse.c exprscan.c
 
 install: all installdirs
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 5a3c6cd..3b90bef 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -286,7 +287,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -296,6 +297,7 @@ typedef struct
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* next command, if any, for multistatements */
 } Command;
 
 static struct
@@ -304,6 +306,22 @@ static struct
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
+
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -433,6 +451,9 @@ usage(void)
 		   progname, progname);
 }
 
+PsqlScanCallbacks pgbench_scan_callbacks =
+{NULL, NULL, NULL};
+
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
@@ -2369,216 +2390,346 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* copy a string after removing newlines and collapsing whitespace */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines with spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace((unsigned char) *p) && isspace((unsigned char) *(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
-		return NULL;
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;	/* not a backslash command */
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace((unsigned char) *p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
+		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 	initSimpleStats(&my_commands->stats);
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
+	j = 0;
+	tok = strtok(++p, delim);
 
-		my_commands->type = META_COMMAND;
+	if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
+		max_args = 2;
 
-		j = 0;
-		tok = strtok(++p, delim);
+	while (tok != NULL)
+	{
+		my_commands->cols[j] = tok - buf + 1;
+		my_commands->argv[j++] = pg_strdup(tok);
+		my_commands->argc++;
+		if (max_args >= 0 && my_commands->argc >= max_args)
+			tok = strtok(NULL, "");
+		else
+			tok = strtok(NULL, delim);
+	}
+	parse_reset_outbuf(proc_state);
 
-		if (tok != NULL && pg_strcasecmp(tok, "set") == 0)
-			max_args = 2;
+	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
+	{
+		/*--------
+		 * parsing:
+		 *	 \setrandom variable min max [uniform]
+		 *	 \setrandom variable min max (gaussian|exponential) parameter
+		 */
 
-		while (tok != NULL)
+		if (my_commands->argc < 4)
 		{
-			my_commands->cols[j] = tok - buf + 1;
-			my_commands->argv[j++] = pg_strdup(tok);
-			my_commands->argc++;
-			if (max_args >= 0 && my_commands->argc >= max_args)
-				tok = strtok(NULL, "");
-			else
-				tok = strtok(NULL, delim);
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing arguments", NULL, -1);
 		}
 
-		if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
-		{
-			/*--------
-			 * parsing:
-			 *	 \setrandom variable min max [uniform]
-			 *	 \setrandom variable min max (gaussian|exponential) parameter
-			 */
+		/* argc >= 4 */
 
-			if (my_commands->argc < 4)
+		if (my_commands->argc == 4 ||		/* uniform without/with
+											 * "uniform" keyword */
+			(my_commands->argc == 5 &&
+			 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
+		{
+			/* nothing to do */
+		}
+		else if (			/* argc >= 5 */
+			(pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
+			(pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
+		{
+			if (my_commands->argc < 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing arguments", NULL, -1);
-			}
-
-			/* argc >= 4 */
-
-			if (my_commands->argc == 4 ||		/* uniform without/with
-												 * "uniform" keyword */
-				(my_commands->argc == 5 &&
-				 pg_strcasecmp(my_commands->argv[4], "uniform") == 0))
-			{
-				/* nothing to do */
-			}
-			else if (			/* argc >= 5 */
-					 (pg_strcasecmp(my_commands->argv[4], "gaussian") == 0) ||
-				   (pg_strcasecmp(my_commands->argv[4], "exponential") == 0))
-			{
-				if (my_commands->argc < 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							  "missing parameter", my_commands->argv[4], -1);
-				}
-				else if (my_commands->argc > 6)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "too many arguments", my_commands->argv[4],
-								 my_commands->cols[6]);
-				}
+							 "missing parameter", my_commands->argv[4], -1);
 			}
-			else	/* cannot parse, unexpected arguments */
+			else if (my_commands->argc > 6)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "unexpected argument", my_commands->argv[4],
-							 my_commands->cols[4]);
+							 "too many arguments", my_commands->argv[4],
+							 my_commands->cols[6]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+		else	/* cannot parse, unexpected arguments */
 		{
-			if (my_commands->argc < 3)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "unexpected argument", my_commands->argv[4],
+						 my_commands->cols[4]);
+		}
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "set") == 0)
+	{
+		if (my_commands->argc < 3)
+		{
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-			expr_scanner_init(my_commands->argv[2], source, lineno,
-							  my_commands->line, my_commands->argv[0],
-							  my_commands->cols[2] - 1);
+		expr_scanner_init(my_commands->argv[2], source, lineno,
+						  my_commands->line, my_commands->argv[0],
+						  my_commands->cols[2] - 1);
 
-			if (expr_yyparse() != 0)
-			{
-				/* dead code: exit done from syntax_error called by yyerror */
-				exit(1);
-			}
+		if (expr_yyparse() != 0)
+		{
+			/* dead code: exit done from syntax_error called by yyerror */
+			exit(1);
+		}
 
-			my_commands->expr = expr_parse_result;
+		my_commands->expr = expr_parse_result;
 
-			expr_scanner_finish();
-		}
-		else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+		expr_scanner_finish();
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "sleep") == 0)
+	{
+		if (my_commands->argc < 2)
 		{
-			if (my_commands->argc < 2)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
-			}
-
-			/*
-			 * Split argument into number and unit to allow "sleep 1ms" etc.
-			 * We don't have to terminate the number argument with null
-			 * because it will be parsed with atoi, which ignores trailing
-			 * non-digit characters.
-			 */
-			if (my_commands->argv[1][0] != ':')
-			{
-				char	   *c = my_commands->argv[1];
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
+		}
 
-				while (isdigit((unsigned char) *c))
-					c++;
-				if (*c)
-				{
-					my_commands->argv[2] = c;
-					if (my_commands->argc < 3)
-						my_commands->argc = 3;
-				}
-			}
+		/*
+		 * Split argument into number and unit to allow "sleep 1ms" etc.
+		 * We don't have to terminate the number argument with null
+		 * because it will be parsed with atoi, which ignores trailing
+		 * non-digit characters.
+		 */
+		if (my_commands->argv[1][0] != ':')
+		{
+			char	   *c = my_commands->argv[1];
 
-			if (my_commands->argc >= 3)
+			while (isdigit((unsigned char) *c))
+				c++;
+			if (*c)
 			{
-				if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
-					pg_strcasecmp(my_commands->argv[2], "s") != 0)
-				{
-					syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-								 "unknown time unit, must be us, ms or s",
-								 my_commands->argv[2], my_commands->cols[2]);
-				}
+				my_commands->argv[2] = c;
+				if (my_commands->argc < 3)
+					my_commands->argc = 3;
 			}
-
-			/* this should be an error?! */
-			for (j = 3; j < my_commands->argc; j++)
-				fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
-						my_commands->argv[0], my_commands->argv[j]);
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+
+		if (my_commands->argc >= 3)
 		{
-			if (my_commands->argc < 3)
+			if (pg_strcasecmp(my_commands->argv[2], "us") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "ms") != 0 &&
+				pg_strcasecmp(my_commands->argv[2], "s") != 0)
 			{
 				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing argument", NULL, -1);
+							 "unknown time unit, must be us, ms or s",
+							 my_commands->argv[2], my_commands->cols[2]);
 			}
 		}
-		else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+
+		/* this should be an error?! */
+		for (j = 3; j < my_commands->argc; j++)
+			fprintf(stderr, "%s: extra argument \"%s\" ignored\n",
+					my_commands->argv[0], my_commands->argv[j]);
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "setshell") == 0)
+	{
+		if (my_commands->argc < 3)
 		{
-			if (my_commands->argc < 1)
-			{
-				syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-							 "missing command", NULL, -1);
-			}
+			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+						 "missing argument", NULL, -1);
 		}
-		else
+	}
+	else if (pg_strcasecmp(my_commands->argv[0], "shell") == 0)
+	{
+		if (my_commands->argc < 1)
 		{
 			syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
-						 "invalid command", NULL, -1);
+						 "missing command", NULL, -1);
 		}
 	}
 	else
 	{
-		my_commands->type = SQL_COMMAND;
+		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
+					 "invalid command", NULL, -1);
+	}
+
+	return my_commands;
+}
+
+/* Parse an input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation of the backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
+	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), &pgbench_scan_callbacks);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Put this command at the end of the returned command chain */
+		if (!retcomd)
+			retcomd = command;
+		else
+		{
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
-			case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
-				break;
-			case QUERY_EXTENDED:
-			case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
-					exit(1);
-				break;
-			default:
+		case QUERY_SIMPLE:
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
+			break;
+		case QUERY_EXTENDED:
+		case QUERY_PREPARED:
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
+			break;
+		default:
+			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
+ 	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n",
+				source, lineno);
+		exit(1);
 	}
 
-	return my_commands;
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
+
+	return retcomd;
 }
 
 /*
@@ -2639,6 +2790,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2653,26 +2805,38 @@ process_file(char *filename)
 		return NULL;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command    *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 
 		free(buf);
 
 		if (command == NULL)
-			continue;
-
-		my_commands[index] = command;
-		index++;
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. In either case, immediately ask for the
+			 * next line.
+			 */
+ 			continue;
+		}
 
-		if (index >= alloc_num)
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
+		
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
 			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
@@ -2680,6 +2844,8 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	return my_commands;
@@ -2695,6 +2861,7 @@ process_builtin(const char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2721,10 +2888,12 @@ process_builtin(const char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2736,6 +2905,7 @@ process_builtin(const char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
-- 
1.8.3.1
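
For readers following the patch above: the continuation handling in process_backslash_commands can be looked at in isolation. The following is only a minimal sketch under assumptions, not the patch's code. The helper name append_line, the fixed-size accumulator and the -1 overflow return value are hypothetical; the patch itself accumulates into a PQExpBuffer. What it tries to show is the same rule: drop the trailing backslash, append the rest of the line, and add one space when the text before the backslash does not already end in whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): append one physical line to the
 * NUL-terminated accumulator acc, which has room for cap bytes.
 *
 * Returns 1 when the logical command is complete, 0 when the line ended
 * with a backslash and more input is expected, -1 when acc is too small.
 */
static int
append_line(char *acc, size_t cap, const char *line)
{
	size_t		used = strlen(acc);
	size_t		len = strcspn(line, "\n");	/* stop at the newline, if any */
	int			continued = (len > 0 && line[len - 1] == '\\');

	assert(used < cap);

	if (continued)
		len--;					/* drop the trailing backslash */

	/* room for the text, an optional separator and the terminator */
	if (used + len + 2 > cap)
		return -1;

	memcpy(acc + used, line, len);
	used += len;

	/* keep the words of adjacent lines apart */
	if (continued && len > 0 && !isspace((unsigned char) line[len - 1]))
		acc[used++] = ' ';

	acc[used] = '\0';
	return continued ? 0 : 1;
}
```

A caller would start with an empty accumulator, feed it lines until the function returns 1, tokenize the accumulated text, and then reset the accumulator. The length check before looking at line[len - 1] avoids reading before the start of an empty line, and the unsigned char cast keeps the isspace call well defined for bytes above 127.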

0003-Change-the-way-to-hold-command-list.patch (text/x-patch; charset=us-ascii)
From 75914b56cd5d6a2989cf82dc9c0eb508a76f17ce Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 16 Mar 2016 12:07:02 +0900
Subject: [PATCH 3/3] Change the way to hold command list.

Currently, commands for SQL statements are generated as a linked list
but are stored in and accessed through an array. This patch stores and
accesses them as a linked list throughout.
---
 src/bin/pgbench/pgbench.c | 189 +++++++++++++++++++++-------------------------
 1 file changed, 85 insertions(+), 104 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 3b90bef..daf73c6 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -192,6 +192,7 @@ typedef struct
 
 #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
+#define MAX_ARGS		10
 
 /*
  * Simple data structure to keep stats about something.
@@ -223,13 +224,29 @@ typedef struct StatsData
 } StatsData;
 
 /*
+ * Structure for individual command
+ */
+typedef struct Command_t
+{
+	char	   *line;			/* full text of command line */
+	int			command_num;	/* unique index of this Command struct */
+	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
+	int			argc;			/* number of command words */
+	char	   *argv[MAX_ARGS]; /* command word list */
+	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
+	PgBenchExpr *expr;			/* parsed expression */
+	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* more command if any, for multistatements */
+} Command;
+
+/*
  * Connection state
  */
 typedef struct
 {
 	PGconn	   *con;			/* connection handle to DB */
 	int			id;				/* client No. */
-	int			state;			/* state No. */
+	Command	   *curr;			/* current command */
 	bool		listen;			/* whether an async query has been sent */
 	bool		is_throttled;	/* whether transaction throttling is done */
 	bool		sleeping;		/* whether the client is napping */
@@ -274,7 +291,6 @@ typedef struct
  */
 #define SQL_COMMAND		1
 #define META_COMMAND	2
-#define MAX_ARGS		10
 
 typedef enum QueryMode
 {
@@ -287,23 +303,10 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct Command_t
-{
-	char	   *line;			/* full text of command line */
-	int			command_num;	/* unique index of this Command struct */
-	int			type;			/* command type (SQL_COMMAND or META_COMMAND) */
-	int			argc;			/* number of command words */
-	char	   *argv[MAX_ARGS]; /* command word list */
-	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
-	PgBenchExpr *expr;			/* parsed expression */
-	SimpleStats stats;			/* time spent in this command */
-	struct Command_t *next;		/* more command if any, for multistatements */
-} Command;
-
 static struct
 {
 	const char *name;
-	Command   **commands;
+	Command   *commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 
@@ -1348,7 +1351,7 @@ static bool
 doCustom(TState *thread, CState *st, StatsData *agg)
 {
 	PGresult   *res;
-	Command   **commands;
+	Command    *commands;
 	bool		trans_needs_throttle = false;
 	instr_time	now;
 
@@ -1432,13 +1435,13 @@ top:
 
 	if (st->listen)
 	{							/* are we receiver? */
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			if (debug)
 				fprintf(stderr, "client %d receiving\n", st->id);
 			if (!PQconsumeInput(st->con))
 			{					/* there's something wrong */
-				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->state);
+				fprintf(stderr, "client %d aborted in state %d; perhaps the backend died while processing\n", st->id, st->curr->command_num);
 				return clientDone(st, false);
 			}
 			if (PQisBusy(st->con))
@@ -1455,13 +1458,13 @@ top:
 				INSTR_TIME_SET_CURRENT(now);
 
 			/* XXX could use a mutex here, but we choose not to */
-			addToSimpleStats(&commands[st->state]->stats,
+			addToSimpleStats(&st->curr->stats,
 							 INSTR_TIME_GET_DOUBLE(now) -
 							 INSTR_TIME_GET_DOUBLE(st->stmt_begin));
 		}
 
 		/* transaction finished: calculate latency and log the transaction */
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (progress || throttle_delay || latency_limit ||
 				per_script_stats || use_log)
@@ -1470,7 +1473,7 @@ top:
 				thread->stats.cnt++;
 		}
 
-		if (commands[st->state]->type == SQL_COMMAND)
+		if (st->curr->type == SQL_COMMAND)
 		{
 			/*
 			 * Read and discard the query result; note this is not included in
@@ -1484,7 +1487,8 @@ top:
 					break;		/* OK */
 				default:
 					fprintf(stderr, "client %d aborted in state %d: %s",
-							st->id, st->state, PQerrorMessage(st->con));
+							st->id, st->curr->command_num,
+							PQerrorMessage(st->con));
 					PQclear(res);
 					return clientDone(st, false);
 			}
@@ -1492,7 +1496,7 @@ top:
 			discard_response(st);
 		}
 
-		if (commands[st->state + 1] == NULL)
+		if (st->curr->next == NULL)
 		{
 			if (is_connect)
 			{
@@ -1505,16 +1509,16 @@ top:
 				return clientDone(st, true);	/* exit success */
 		}
 
-		/* increment state counter */
-		st->state++;
-		if (commands[st->state] == NULL)
+		/* move to the next state */
+		st->curr = st->curr->next;
+		if (st->curr == NULL)
 		{
-			st->state = 0;
 			st->use_file = chooseScript(thread);
 			commands = sql_script[st->use_file].commands;
 			if (debug)
 				fprintf(stderr, "client %d executing script \"%s\"\n", st->id,
 						sql_script[st->use_file].name);
+			st->curr = commands;
 			st->is_throttled = false;
 
 			/*
@@ -1558,7 +1562,7 @@ top:
 
 	/* Record transaction start time under logging, progress or throttling */
 	if ((use_log || progress || throttle_delay || latency_limit ||
-		 per_script_stats) && st->state == 0)
+		 per_script_stats) && st->curr == commands)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1574,9 +1578,9 @@ top:
 	if (is_latencies)
 		INSTR_TIME_SET_CURRENT(st->stmt_begin);
 
-	if (commands[st->state]->type == SQL_COMMAND)
+	if (st->curr->type == SQL_COMMAND)
 	{
-		const Command *command = commands[st->state];
+		const Command *command = st->curr;
 		int			r;
 
 		if (querymode == QUERY_SIMPLE)
@@ -1610,18 +1614,19 @@ top:
 
 			if (!st->prepared[st->use_file])
 			{
-				int			j;
+				int			j = 0;
+				Command		*pcom = commands;
 
-				for (j = 0; commands[j] != NULL; j++)
+				for (; pcom ; pcom = pcom->next, j++)
 				{
 					PGresult   *res;
 					char		name[MAX_PREPARE_NAME];
 
-					if (commands[j]->type != SQL_COMMAND)
+					if (pcom->type != SQL_COMMAND)
 						continue;
 					preparedStatementName(name, st->use_file, j);
 					res = PQprepare(st->con, name,
-						  commands[j]->argv[0], commands[j]->argc - 1, NULL);
+						  pcom->argv[0], pcom->argc - 1, NULL);
 					if (PQresultStatus(res) != PGRES_COMMAND_OK)
 						fprintf(stderr, "%s", PQerrorMessage(st->con));
 					PQclear(res);
@@ -1630,7 +1635,7 @@ top:
 			}
 
 			getQueryParams(st, command, params);
-			preparedStatementName(name, st->use_file, st->state);
+			preparedStatementName(name, st->use_file, st->curr->command_num);
 
 			if (debug)
 				fprintf(stderr, "client %d sending %s\n", st->id, name);
@@ -1650,11 +1655,11 @@ top:
 		else
 			st->listen = true;	/* flags that should be listened */
 	}
-	else if (commands[st->state]->type == META_COMMAND)
+	else if (st->curr->type == META_COMMAND)
 	{
-		int			argc = commands[st->state]->argc,
+		int			argc = st->curr->argc,
 					i;
-		char	  **argv = commands[st->state]->argv;
+		char	  **argv = st->curr->argv;
 
 		if (debug)
 		{
@@ -1804,7 +1809,7 @@ top:
 		else if (pg_strcasecmp(argv[0], "set") == 0)
 		{
 			char		res[64];
-			PgBenchExpr *expr = commands[st->state]->expr;
+			PgBenchExpr *expr = st->curr->expr;
 			int64		result;
 
 			if (!evaluateExpr(st, expr, &result))
@@ -2779,36 +2784,28 @@ read_line_from_file(FILE *fd)
  * Given a file name, read it and return the array of Commands contained
  * therein.  "-" means to read stdin.
  */
-static Command **
+static Command *
 process_file(char *filename)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
 	FILE	   *fd;
-	int			lineno,
-				index;
+	int			lineno;
 	char	   *buf;
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	if (strcmp(filename, "-") == 0)
 		fd = stdin;
 	else if ((fd = fopen(filename, "r")) == NULL)
 	{
 		fprintf(stderr, "could not open file \"%s\": %s\n",
 				filename, strerror(errno));
-		pg_free(my_commands);
 		return NULL;
 	}
 
 	proc_state->mode = PS_IDLE;
 
 	lineno = 0;
-	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
@@ -2830,44 +2827,35 @@ process_file(char *filename)
  			continue;
 		}
 
-		while (command)
-		{
-			my_commands[index++] = command;
-			command = command->next;
-		}
-		
-		if (index > alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
-		}
+		/* Append new commands at the end of the list */
+		if (my_commands_tail)
+			my_commands_tail->next = command;
+		else
+			my_commands = my_commands_tail = command;
+
+		/* Seek to the tail of the list */
+		while (my_commands_tail->next)
+			my_commands_tail = my_commands_tail->next;
 	}
 	fclose(fd);
 
 	parse_finish_scan(proc_state);
 
-	my_commands[index] = NULL;
+	my_commands_tail->next = NULL;
 
 	return my_commands;
 }
 
-static Command **
+static Command *
 process_builtin(const char *tb, const char *source)
 {
-#define COMMANDS_ALLOC_NUM 128
-
-	Command   **my_commands;
-	int			lineno,
-				index;
+	Command    *my_commands = NULL,
+			   *my_commands_tail = NULL;
+	int			lineno;
 	char		buf[BUFSIZ];
-	int			alloc_num;
 	ParseInfo proc_state = createParseInfo();
 
-	alloc_num = COMMANDS_ALLOC_NUM;
-	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
-
 	lineno = 0;
-	index = 0;
 
 	for (;;)
 	{
@@ -2892,19 +2880,17 @@ process_builtin(const char *tb, const char *source)
 		if (command == NULL)
 			continue;
 
-		/* builtin doesn't need multistatements */
+		/* For simplicity, builtin scripts may not contain multistatements */
 		Assert(command->next == NULL);
-		my_commands[index] = command;
-		index++;
-
-		if (index >= alloc_num)
-		{
-			alloc_num += COMMANDS_ALLOC_NUM;
-			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
+		if (my_commands_tail)
+ 		{
+			my_commands_tail->next = command;
+			my_commands_tail = command;
 		}
+		else
+			my_commands = my_commands_tail = command;
 	}
 
-	my_commands[index] = NULL;
 	parse_finish_scan(proc_state);
 
 	return my_commands;
@@ -2956,10 +2942,9 @@ findBuiltin(const char *name, char **desc)
 }
 
 static void
-addScript(const char *name, Command **commands)
+addScript(const char *name, Command *commands)
 {
-	if (commands == NULL ||
-		commands[0] == NULL)
+	if (commands == NULL)
 	{
 		fprintf(stderr, "empty command list for script \"%s\"\n", name);
 		exit(1);
@@ -3082,17 +3067,16 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 			/* Report per-command latencies */
 			if (is_latencies)
 			{
-				Command   **commands;
+				Command   *command;
 
 				printf(" - statement latencies in milliseconds:\n");
 
-				for (commands = sql_script[i].commands;
-					 *commands != NULL;
-					 commands++)
+				for (command = sql_script[i].commands;
+					 command != NULL;
+					 command = command->next)
 					printf("   %11.3f  %s\n",
-						   1000.0 * (*commands)->stats.sum /
-						   (*commands)->stats.count,
-						   (*commands)->line);
+						   1000.0 * command->stats.sum / command->stats.count,
+						   command->line);
 			}
 		}
 	}
@@ -3902,20 +3886,19 @@ threadRun(void *arg)
 	{
 		CState	   *st = &state[i];
 		int			prev_ecnt = st->ecnt;
-		Command   **commands;
 
 		st->use_file = chooseScript(thread);
-		commands = sql_script[st->use_file].commands;
+		st->curr = sql_script[st->use_file].commands;
 		if (debug)
 			fprintf(stderr, "client %d executing script \"%s\"\n", st->id,
 					sql_script[st->use_file].name);
 		if (!doCustom(thread, st, &aggs))
 			remains--;			/* I've aborted */
 
-		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+		if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 		{
 			fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-					i, st->state);
+					i, st->curr->command_num);
 			remains--;			/* I've aborted */
 			PQfinish(st->con);
 			st->con = NULL;
@@ -3936,7 +3919,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_script[st->use_file].commands;
 			int			sock;
 
 			if (st->con == NULL)
@@ -3972,7 +3954,7 @@ threadRun(void *arg)
 						min_usec = this_usec;
 				}
 			}
-			else if (commands[st->state]->type == META_COMMAND)
+			else if (st->curr->type == META_COMMAND)
 			{
 				min_usec = 0;	/* the connection is ready to run */
 				break;
@@ -4042,7 +4024,6 @@ threadRun(void *arg)
 		for (i = 0; i < nstate; i++)
 		{
 			CState	   *st = &state[i];
-			Command   **commands = sql_script[st->use_file].commands;
 			int			prev_ecnt = st->ecnt;
 
 			if (st->con)
@@ -4056,17 +4037,17 @@ threadRun(void *arg)
 					goto done;
 				}
 				if (FD_ISSET(sock, &input_mask) ||
-					commands[st->state]->type == META_COMMAND)
+					st->curr->type == META_COMMAND)
 				{
 					if (!doCustom(thread, st, &aggs))
 						remains--;		/* I've aborted */
 				}
 			}
 
-			if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
+			if (st->ecnt > prev_ecnt && st->curr->type == META_COMMAND)
 			{
 				fprintf(stderr, "client %d aborted in state %d; execution of meta-command failed\n",
-						i, st->state);
+						i, st->curr->command_num);
 				remains--;		/* I've aborted */
 				PQfinish(st->con);
 				st->con = NULL;
-- 
1.8.3.1
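
The list handling that process_file switches to in the patch above (link the new chain after the tail, then seek to the new tail) can be sketched on its own. This is an illustration under assumptions rather than the patch's code: Node, List, list_append_chain and list_length are hypothetical names, and the sketch deliberately treats a NULL chain and an empty list as valid cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Command struct: only the link matters here */
typedef struct Node
{
	int			id;
	struct Node *next;
} Node;

/* Head and tail are kept together so that appending never rescans the list */
typedef struct List
{
	Node	   *head;
	Node	   *tail;
} List;

/* Append a chain of one or more nodes already linked through next */
static void
list_append_chain(List *list, Node *chain)
{
	assert(list != NULL);

	if (chain == NULL)
		return;					/* nothing to add; the list is unchanged */

	if (list->tail)
		list->tail->next = chain;
	else
		list->head = chain;

	/* seek to the last node of the appended chain */
	list->tail = chain;
	while (list->tail->next)
		list->tail = list->tail->next;
}

/* Count the nodes, walking the list the way the latency report loop does */
static size_t
list_length(const List *list)
{
	size_t		n = 0;
	const Node *p;

	for (p = list->head; p != NULL; p = p->next)
		n++;
	return n;
}
```

Because the tail is only dereferenced after a non-NULL chain has been linked in, a script with no commands simply leaves head and tail as NULL, and the caller can report the empty list instead of dereferencing the tail.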

psqlscanbody.l.diff (text/x-patch; charset=us-ascii)
--- psqlscan.l	2016-02-18 16:50:19.140495106 +0900
+++ psqlscanbody.l	2016-02-18 16:48:23.263579135 +0900
@@ -38,93 +38,22 @@
  *-------------------------------------------------------------------------
  */
-#include "postgres_fe.h"
-
 #include "psqlscan.h"
+#include "psqlscan_int.h"
 
 #include <ctype.h>
 
-#include "common.h"
-#include "settings.h"
-#include "variables.h"
-
-
-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
-	YY_BUFFER_STATE buf;		/* flex input control structure */
-	char	   *bufstring;		/* data actually being scanned by flex */
-	char	   *origstring;		/* copy of original data, if needed */
-	char	   *varname;		/* name of variable providing data, or NULL */
-	struct StackElem *next;
-} StackElem;
-
-/*
- * All working state of the lexer must be stored in PsqlScanStateData
- * between calls.  This allows us to have multiple open lexer operations,
- * which is needed for nested include files.  The lexer itself is not
- * recursive, but it must be re-entrant.
- */
-typedef struct PsqlScanStateData
-{
-	StackElem  *buffer_stack;	/* stack of variable expansion buffers */
-	/*
-	 * These variables always refer to the outer buffer, never to any
-	 * stacked variable-expansion buffer.
-	 */
-	YY_BUFFER_STATE scanbufhandle;
-	char	   *scanbuf;		/* start of outer-level input buffer */
-	const char *scanline;		/* current input line at outer level */
-
-	/* safe_encoding, curline, refline are used by emit() to replace FFs */
-	int			encoding;		/* encoding being used now */
-	bool		safe_encoding;	/* is current encoding "safe"? */
-	const char *curline;		/* actual flex input string for cur buf */
-	const char *refline;		/* original data for cur buffer */
-
-	/*
-	 * All this state lives across successive input lines, until explicitly
-	 * reset by psql_scan_reset.
-	 */
-	int			start_state;	/* saved YY_START */
-	int			paren_depth;	/* depth of nesting in parentheses */
-	int			xcdepth;		/* depth of nesting in slash-star comments */
-	char	   *dolqstart;		/* current $foo$ quote start string */
-} PsqlScanStateData;
-
 static PsqlScanState cur_state;	/* current state while active */
 
-static PQExpBuffer output_buf;	/* current output buffer */
+PQExpBuffer output_buf;	/* current output buffer */
+
+#define ECHO scan_emit(yytext, yyleng)
 
-/* these variables do not need to be saved across calls */
-static enum slash_option_type option_type;
-static char *option_quote;
-static int	unquoted_option_chars;
-static int	backtick_start_offset;
-
-
-/* Return values from yylex() */
-#define LEXRES_EOL			0	/* end of input */
-#define LEXRES_SEMI			1	/* command-terminating semicolon found */
-#define LEXRES_BACKSLASH	2	/* backslash command start */
-#define LEXRES_OK			3	/* OK completion of backslash argument */
-
-
-static void evaluate_backtick(void);
-static void push_new_buffer(const char *newstr, const char *varname);
-static void pop_buffer_stack(PsqlScanState state);
-static bool var_is_current_source(PsqlScanState state, const char *varname);
-static YY_BUFFER_STATE prepare_buffer(const char *txt, int len,
-									  char **txtcopy);
-static void emit(const char *txt, int len);
-static char *extract_substring(const char *txt, int len);
-static void escape_variable(bool as_ident);
+/* Adjust curpos on yyless */
+#define my_yyless(n) cur_state->curpos -= (yyleng - (n)); yyless(n)
 
-#define ECHO emit(yytext, yyleng)
+/* Track where lexer parsed up to */
+#define YY_USER_ACTION cur_state->curpos += yyleng;
 
+#define ENC_IS_SAFE(s) (!(s)->cb.enc_mblen)
 %}
 
@@ -186,13 +115,4 @@
 %x xus
 %x xusend
-/* Additional exclusive states for psql only: lex backslash commands */
-%x xslashcmd
-%x xslashargstart
-%x xslasharg
-%x xslashquote
-%x xslashbackquote
-%x xslashdquote
-%x xslashwholeline
-%x xslashend
 
 /*
@@ -434,5 +354,5 @@
 					BEGIN(xc);
 					/* Put back any characters past slash-star; see above */
-					yyless(2);
+					my_yyless(2);
 					ECHO;
 				}
@@ -441,5 +361,5 @@
 					cur_state->xcdepth++;
 					/* Put back any characters past slash-star; see above */
-					yyless(2);
+					my_yyless(2);
 					ECHO;
 				}
@@ -473,5 +393,5 @@
 <xb>{quotestop}	|
 <xb>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -498,5 +418,5 @@
 <xh>{quotestop}	|
 <xh>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -504,10 +424,10 @@
 
 {xnstart}		{
-					yyless(1);				/* eat only 'n' this time */
+					my_yyless(1);				/* eat only 'n' this time */
 					ECHO;
 				}
 
 {xqstart}		{
-					if (standard_strings())
+					if (cur_state->cb.standard_strings())
 						BEGIN(xq);
 					else
@@ -525,5 +445,5 @@
 <xq,xe>{quotestop}	|
 <xq,xe>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(INITIAL);
 					ECHO;
@@ -531,5 +451,5 @@
 <xus>{quotestop} |
 <xus>{quotefail} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(xusend);
 					ECHO;
@@ -540,5 +460,5 @@
 <xusend>{other} |
 <xusend>{xustop1} {
-					yyless(0);
+					my_yyless(0);
 					BEGIN(INITIAL);
 					ECHO;
@@ -587,5 +507,5 @@
 {dolqfailed}	{
 					/* throw back all but the initial "$" */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -604,5 +524,5 @@
 						 * $ for rescanning.  Consider $delim$...$junk$delim$
 						 */
-						yyless(yyleng-1);
+						my_yyless(yyleng-1);
 					}
 					ECHO;
@@ -632,5 +552,5 @@
 				}
 <xui>{dquote} {
-					yyless(1);
+					my_yyless(1);
 					BEGIN(xuiend);
 					ECHO;
@@ -641,5 +561,5 @@
 <xuiend>{other} |
 <xuiend>{xustop1} {
-					yyless(0);
+					my_yyless(0);
 					BEGIN(INITIAL);
 					ECHO;
@@ -658,5 +578,5 @@
 {xufailed}	{
 					/* throw back all but the initial u/U */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -727,5 +647,5 @@
 "\\"[;:]		{
 					/* Force a semicolon or colon into the query buffer */
-					emit(yytext + 1, 1);
+					scan_emit(yytext + 1, 1);
 				}
 
@@ -737,9 +657,14 @@
 :{variable_char}+	{
 					/* Possible psql variable substitution */
-					char   *varname;
-					const char *value;
+					char	   *varname = NULL;
+					const char *value = NULL;
+					void	  (*free_fn)(void *) = NULL;
 
-					varname = extract_substring(yytext + 1, yyleng - 1);
-					value = GetVariable(pset.vars, varname);
+					if (cur_state->cb.get_variable)
+					{
+						varname = extract_substring(yytext + 1, yyleng - 1);
+						value = cur_state->cb.get_variable(varname,
+ 									false, false, &free_fn);
+					}
 
 					if (value)
@@ -749,5 +674,5 @@
 						{
 							/* Recursive expansion --- don't go there */
-							psql_error("skipping recursive expansion of variable \"%s\"\n",
+							cur_state->cb.error_out("skipping recursive expansion of variable \"%s\"\n",
 									   varname);
 							/* Instead copy the string as is */
@@ -760,4 +685,6 @@
 							/* yy_scan_string already made buffer active */
 						}
+						if (free_fn)
+							free_fn((void*)value);
 					}
 					else
@@ -770,5 +697,6 @@
 					}
 
-					free(varname);
+					if (varname)
+						free(varname);
 				}
 
@@ -788,5 +716,5 @@
 :'{variable_char}*	{
 					/* Throw back everything but the colon */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -794,5 +722,5 @@
 :\"{variable_char}*	{
 					/* Throw back everything but the colon */
-					yyless(1);
+					my_yyless(1);
 					ECHO;
 				}
@@ -855,5 +783,5 @@
 					{
 						/* Strip the unwanted chars from the token */
-						yyless(nchars);
+						my_yyless(nchars);
 					}
 					ECHO;
@@ -872,5 +800,5 @@
 {decimalfail}	{
 					/* throw back the .., and treat as integer */
-					yyless(yyleng-2);
+					my_yyless(yyleng-2);
 					ECHO;
 				}
@@ -885,10 +813,10 @@
 					 * syntax error anyway, we don't bother to distinguish.
 					 */
-					yyless(yyleng-1);
+					my_yyless(yyleng-1);
 					ECHO;
 				}
 {realfail2}		{
 					/* throw back the [Ee][+-], and proceed as above */
-					yyless(yyleng-2);
+					my_yyless(yyleng-2);
 					ECHO;
 				}
@@ -934,260 +862,24 @@
 					}
 				}
+%%
 
-	/*
-	 * Exclusive lexer states to handle backslash command lexing
-	 */
-
-<xslashcmd>{
-	/* command name ends at whitespace or backslash; eat all else */
-
-{space}|"\\"	{
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashargstart>{
-	/*
-	 * Discard any whitespace before argument, then go to xslasharg state.
-	 * An exception is that "|" is only special at start of argument, so we
-	 * check for it here.
-	 */
-
-{space}+		{ }
-
-"|"				{
-					if (option_type == OT_FILEPIPE)
-					{
-						/* treat like whole-string case */
-						ECHO;
-						BEGIN(xslashwholeline);
-					}
-					else
-					{
-						/* vertical bar is not special otherwise */
-						yyless(0);
-						BEGIN(xslasharg);
-					}
-				}
-
-{other}			{
-					yyless(0);
-					BEGIN(xslasharg);
-				}
-
-}
-
-<xslasharg>{
-	/*
-	 * Default processing of text in a slash command's argument.
-	 *
-	 * Note: unquoted_option_chars counts the number of characters at the
-	 * end of the argument that were not subject to any form of quoting.
-	 * psql_scan_slash_option needs this to strip trailing semicolons safely.
-	 */
-
-{space}|"\\"	{
-					/*
-					 * Unquoted space is end of arg; do not eat.  Likewise
-					 * backslash is end of command or next command, do not eat
-					 *
-					 * XXX this means we can't conveniently accept options
-					 * that include unquoted backslashes; therefore, option
-					 * processing that encourages use of backslashes is rather
-					 * broken.
-					 */
-					yyless(0);
-					return LEXRES_OK;
-				}
-
-{quote}			{
-					*option_quote = '\'';
-					unquoted_option_chars = 0;
-					BEGIN(xslashquote);
-				}
-
-"`"				{
-					backtick_start_offset = output_buf->len;
-					*option_quote = '`';
-					unquoted_option_chars = 0;
-					BEGIN(xslashbackquote);
-				}
-
-{dquote}		{
-					ECHO;
-					*option_quote = '"';
-					unquoted_option_chars = 0;
-					BEGIN(xslashdquote);
-				}
-
-:{variable_char}+	{
-					/* Possible psql variable substitution */
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						char   *varname;
-						const char *value;
-
-						varname = extract_substring(yytext + 1, yyleng - 1);
-						value = GetVariable(pset.vars, varname);
-						free(varname);
-
-						/*
-						 * The variable value is just emitted without any
-						 * further examination.  This is consistent with the
-						 * pre-8.0 code behavior, if not with the way that
-						 * variables are handled outside backslash commands.
-						 * Note that we needn't guard against recursion here.
-						 */
-						if (value)
-							appendPQExpBufferStr(output_buf, value);
-						else
-							ECHO;
-
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}+'	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(false);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-
-:\"{variable_char}+\"	{
-					if (option_type == OT_NO_EVAL)
-						ECHO;
-					else
-					{
-						escape_variable(true);
-						*option_quote = ':';
-					}
-					unquoted_option_chars = 0;
-				}
-
-:'{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-:\"{variable_char}*	{
-					/* Throw back everything but the colon */
-					yyless(1);
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-{other}			{
-					unquoted_option_chars++;
-					ECHO;
-				}
-
-}
-
-<xslashquote>{
-	/*
-	 * single-quoted text: copy literally except for '' and backslash
-	 * sequences
-	 */
-
-{quote}			{ BEGIN(xslasharg); }
-
-{xqdouble}		{ appendPQExpBufferChar(output_buf, '\''); }
-
-"\\n"			{ appendPQExpBufferChar(output_buf, '\n'); }
-"\\t"			{ appendPQExpBufferChar(output_buf, '\t'); }
-"\\b"			{ appendPQExpBufferChar(output_buf, '\b'); }
-"\\r"			{ appendPQExpBufferChar(output_buf, '\r'); }
-"\\f"			{ appendPQExpBufferChar(output_buf, '\f'); }
-
-{xeoctesc}		{
-					/* octal case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 1, NULL, 8));
-				}
-
-{xehexesc}		{
-					/* hex case */
-					appendPQExpBufferChar(output_buf,
-										  (char) strtol(yytext + 2, NULL, 16));
-				}
-
-"\\".			{ emit(yytext + 1, 1); }
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashbackquote>{
-	/*
-	 * backticked text: copy everything until next backquote, then evaluate.
-	 *
-	 * XXX Possible future behavioral change: substitute for :VARIABLE?
-	 */
-
-"`"				{
-					/* In NO_EVAL mode, don't evaluate the command */
-					if (option_type != OT_NO_EVAL)
-						evaluate_backtick();
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashdquote>{
-	/* double-quoted text: copy verbatim, including the double quotes */
-
-{dquote}		{
-					ECHO;
-					BEGIN(xslasharg);
-				}
-
-{other}|\n		{ ECHO; }
-
-}
-
-<xslashwholeline>{
-	/* copy everything until end of input line */
-	/* but suppress leading whitespace */
-
-{space}+		{
-					if (output_buf->len > 0)
-						ECHO;
-				}
-
-{other}			{ ECHO; }
-
-}
-
-<xslashend>{
-	/* at end of command, eat a double backslash, but not anything else */
-
-"\\\\"			{ return LEXRES_OK; }
-
-{other}|\n		{
-					yyless(0);
-					return LEXRES_OK;
-				}
+static void my_psql_scan_finish(PsqlScanState state);
+static void my_psql_scan_reset(PsqlScanState state);
+static void psql_error_errout(const char *fmt, ...)
+	__attribute__ ((format (printf, 1, 2)));
+static bool psql_standard_strings(void);
 
+static void
+psql_scan_initialize(PsqlScanState state)
+{
+	psql_scan_finish(state);
+	psql_scan_reset(state);
+	memset(state, 0, sizeof(*state));
+	state->finish = &my_psql_scan_finish;
+	state->reset = &my_psql_scan_reset;
+	state->my_yy_scan_buffer = &yy_scan_buffer;
+	state->reset(state);
 }
 
-%%
-
 /*
  * Create a lexer working state struct.
@@ -1199,6 +891,5 @@
 
 	state = (PsqlScanStateData *) pg_malloc0(sizeof(PsqlScanStateData));
-
-	psql_scan_reset(state);
+	psql_scan_initialize(state);
 
 	return state;
@@ -1228,6 +919,6 @@
  */
 void
-psql_scan_setup(PsqlScanState state,
-				const char *line, int line_len)
+psql_scan_setup(PsqlScanState state, const char *line, int line_len,
+				PsqlScanCallbacks *cb)
 {
 	/* Mustn't be scanning already */
@@ -1235,7 +926,17 @@
 	Assert(state->buffer_stack == NULL);
 
-	/* Do we need to hack the character set encoding? */
-	state->encoding = pset.encoding;
-	state->safe_encoding = pg_valid_server_encoding_id(state->encoding);
+	/* copy callback functions */
+	state->cb.get_variable = cb->get_variable;
+	if (cb->standard_strings)
+		state->cb.standard_strings = cb->standard_strings;
+	else
+		state->cb.standard_strings = &psql_standard_strings;
+
+	state->cb.enc_mblen = cb->enc_mblen;
+
+	if (cb->error_out)
+		state->cb.error_out = cb->error_out;
+	else
+		state->cb.error_out = &psql_error_errout;
 
 	/* needed for prepare_buffer */
@@ -1246,4 +947,5 @@
 										  &state->scanbuf);
 	state->scanline = line;
+	state->curpos = 0;
 
 	/* Set lookaside data in case we have to map unsafe encoding */
@@ -1253,4 +955,23 @@
 
 /*
+ * Redirect functions for indirect calls. These functions may be called on
+ * the scan state of another lexer.
+ */
+void
+psql_scan_finish(PsqlScanState state)
+{
+	if (state->finish)
+		state->finish(state);
+}
+
+void
+psql_scan_reset(PsqlScanState state)
+{
+	if (state->reset)
+		state->reset(state);
+}
+
+
+/*
  * Do lexical analysis of SQL command text.
  *
@@ -1409,6 +1130,6 @@
  * to deal with error recovery).
  */
-void
-psql_scan_finish(PsqlScanState state)
+static void
+my_psql_scan_finish(PsqlScanState state)
 {
 	/* Drop any incomplete variable expansions. */
@@ -1426,4 +1147,22 @@
 
 /*
+ * Create new lexer scanning state for this lexer which parses from the current
+ * position of the given scanning state for another lexer. The given state is
+ * destroyed.
+ *
+ * Note: This function cannot access yy* functions and variables of the given
+ * state because they belong to a different lexer.
+ */
+void
+psql_scan_switch_lexer(PsqlScanState state)
+{
+	const char	   *newscanline = state->scanline + state->curpos;
+	PsqlScanCallbacks cb = state->cb;
+
+	psql_scan_initialize(state);
+	psql_scan_setup(state, newscanline, strlen(newscanline), &cb);
+}
+
+/*
  * Reset lexer scanning state to start conditions.  This is appropriate
  * for executing \r psql commands (or any other time that we discard the
@@ -1436,6 +1175,6 @@
  * done by psql_scan_finish().
  */
-void
-psql_scan_reset(PsqlScanState state)
+static void
+my_psql_scan_reset(PsqlScanState state)
 {
 	state->start_state = INITIAL;
@@ -1461,290 +1200,4 @@
 
 /*
- * Scan the command name of a psql backslash command.  This should be called
- * after psql_scan() returns PSCAN_BACKSLASH.  It is assumed that the input
- * has been consumed through the leading backslash.
- *
- * The return value is a malloc'd copy of the command name, as parsed off
- * from the input.
- */
-char *
-psql_scan_slash_command(PsqlScanState state)
-{
-	PQExpBufferData mybuf;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashcmd);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-
-	return mybuf.data;
-}
-
-/*
- * Parse off the next argument for a backslash command, and return it as a
- * malloc'd string.  If there are no more arguments, returns NULL.
- *
- * type tells what processing, if any, to perform on the option string;
- * for example, if it's a SQL identifier, we want to downcase any unquoted
- * letters.
- *
- * if quote is not NULL, *quote is set to 0 if no quoting was found, else
- * the last quote symbol used in the argument.
- *
- * if semicolon is true, unquoted trailing semicolon(s) that would otherwise
- * be taken as part of the option string will be stripped.
- *
- * NOTE: the only possible syntax errors for backslash options are unmatched
- * quotes, which are detected when we run out of input.  Therefore, on a
- * syntax error we just throw away the string and return NULL; there is no
- * need to worry about flushing remaining input.
- */
-char *
-psql_scan_slash_option(PsqlScanState state,
-					   enum slash_option_type type,
-					   char *quote,
-					   bool semicolon)
-{
-	PQExpBufferData mybuf;
-	int			lexresult PG_USED_FOR_ASSERTS_ONLY;
-	char		local_quote;
-
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	if (quote == NULL)
-		quote = &local_quote;
-	*quote = 0;
-
-	/* Build a local buffer that we'll return the data of */
-	initPQExpBuffer(&mybuf);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = &mybuf;
-	option_type = type;
-	option_quote = quote;
-	unquoted_option_chars = 0;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	if (type == OT_WHOLE_LINE)
-		BEGIN(xslashwholeline);
-	else
-		BEGIN(xslashargstart);
-
-	/* And lex. */
-	lexresult = yylex();
-
-	/*
-	 * Check the lex result: we should have gotten back either LEXRES_OK
-	 * or LEXRES_EOL (the latter indicating end of string).  If we were inside
-	 * a quoted string, as indicated by YY_START, EOL is an error.
-	 */
-	Assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
-
-	switch (YY_START)
-	{
-		case xslashargstart:
-			/* empty arg */
-			break;
-		case xslasharg:
-			/* Strip any unquoted trailing semi-colons if requested */
-			if (semicolon)
-			{
-				while (unquoted_option_chars-- > 0 &&
-					   mybuf.len > 0 &&
-					   mybuf.data[mybuf.len - 1] == ';')
-				{
-					mybuf.data[--mybuf.len] = '\0';
-				}
-			}
-
-			/*
-			 * If SQL identifier processing was requested, then we strip out
-			 * excess double quotes and downcase unquoted letters.
-			 * Doubled double-quotes become output double-quotes, per spec.
-			 *
-			 * Note that a string like FOO"BAR"BAZ will be converted to
-			 * fooBARbaz; this is somewhat inconsistent with the SQL spec,
-			 * which would have us parse it as several identifiers.  But
-			 * for psql's purposes, we want a string like "foo"."bar" to
-			 * be treated as one option, so there's little choice.
-			 */
-			if (type == OT_SQLID || type == OT_SQLIDHACK)
-			{
-				bool		inquotes = false;
-				char	   *cp = mybuf.data;
-
-				while (*cp)
-				{
-					if (*cp == '"')
-					{
-						if (inquotes && cp[1] == '"')
-						{
-							/* Keep the first quote, remove the second */
-							cp++;
-						}
-						inquotes = !inquotes;
-						/* Collapse out quote at *cp */
-						memmove(cp, cp + 1, strlen(cp));
-						mybuf.len--;
-						/* do not advance cp */
-					}
-					else
-					{
-						if (!inquotes && type == OT_SQLID)
-							*cp = pg_tolower((unsigned char) *cp);
-						cp += PQmblen(cp, pset.encoding);
-					}
-				}
-			}
-			break;
-		case xslashquote:
-		case xslashbackquote:
-		case xslashdquote:
-			/* must have hit EOL inside quotes */
-			psql_error("unterminated quoted string\n");
-			termPQExpBuffer(&mybuf);
-			return NULL;
-		case xslashwholeline:
-			/* always okay */
-			break;
-		default:
-			/* can't get here */
-			fprintf(stderr, "invalid YY_START\n");
-			exit(1);
-	}
-
-	/*
-	 * An unquoted empty argument isn't possible unless we are at end of
-	 * command.  Return NULL instead.
-	 */
-	if (mybuf.len == 0 && *quote == 0)
-	{
-		termPQExpBuffer(&mybuf);
-		return NULL;
-	}
-
-	/* Else return the completed string. */
-	return mybuf.data;
-}
-
-/*
- * Eat up any unused \\ to complete a backslash command.
- */
-void
-psql_scan_slash_command_end(PsqlScanState state)
-{
-	/* Must be scanning already */
-	Assert(state->scanbufhandle != NULL);
-
-	/* Set up static variables that will be used by yylex */
-	cur_state = state;
-	output_buf = NULL;
-
-	if (state->buffer_stack != NULL)
-		yy_switch_to_buffer(state->buffer_stack->buf);
-	else
-		yy_switch_to_buffer(state->scanbufhandle);
-
-	BEGIN(xslashend);
-
-	/* And lex. */
-	yylex();
-
-	/* There are no possible errors in this lex state... */
-}
-
-/*
- * Evaluate a backticked substring of a slash command's argument.
- *
- * The portion of output_buf starting at backtick_start_offset is evaluated
- * as a shell command and then replaced by the command's output.
- */
-static void
-evaluate_backtick(void)
-{
-	char	   *cmd = output_buf->data + backtick_start_offset;
-	PQExpBufferData cmd_output;
-	FILE	   *fd;
-	bool		error = false;
-	char		buf[512];
-	size_t		result;
-
-	initPQExpBuffer(&cmd_output);
-
-	fd = popen(cmd, PG_BINARY_R);
-	if (!fd)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (!error)
-	{
-		do
-		{
-			result = fread(buf, 1, sizeof(buf), fd);
-			if (ferror(fd))
-			{
-				psql_error("%s: %s\n", cmd, strerror(errno));
-				error = true;
-				break;
-			}
-			appendBinaryPQExpBuffer(&cmd_output, buf, result);
-		} while (!feof(fd));
-	}
-
-	if (fd && pclose(fd) == -1)
-	{
-		psql_error("%s: %s\n", cmd, strerror(errno));
-		error = true;
-	}
-
-	if (PQExpBufferDataBroken(cmd_output))
-	{
-		psql_error("%s: out of memory\n", cmd);
-		error = true;
-	}
-
-	/* Now done with cmd, delete it from output_buf */
-	output_buf->len = backtick_start_offset;
-	output_buf->data[output_buf->len] = '\0';
-
-	/* If no error, transfer result to output_buf */
-	if (!error)
-	{
-		/* strip any trailing newline */
-		if (cmd_output.len > 0 &&
-			cmd_output.data[cmd_output.len - 1] == '\n')
-			cmd_output.len--;
-		appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
-	}
-
-	termPQExpBuffer(&cmd_output);
-}
-
-/*
  * Push the given string onto the stack of stuff to scan.
  *
@@ -1753,5 +1206,5 @@
  * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
  */
-static void
+void
 push_new_buffer(const char *newstr, const char *varname)
 {
@@ -1770,5 +1223,5 @@
 									&stackelem->bufstring);
 	cur_state->curline = stackelem->bufstring;
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 	{
 		stackelem->origstring = NULL;
@@ -1790,5 +1243,5 @@
  * switch to an appropriate buffer to continue lexing.
  */
-static void
+void
 pop_buffer_stack(PsqlScanState state)
 {
@@ -1809,5 +1262,5 @@
  * currently being scanned
  */
-static bool
+bool
 var_is_current_source(PsqlScanState state, const char *varname)
 {
@@ -1833,5 +1286,5 @@
  * NOTE SIDE EFFECT: the new buffer is made the active flex input buffer.
  */
-static YY_BUFFER_STATE
+YY_BUFFER_STATE
 prepare_buffer(const char *txt, int len, char **txtcopy)
 {
@@ -1843,5 +1296,5 @@
 	newtxt[len] = newtxt[len + 1] = YY_END_OF_BUFFER_CHAR;
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(newtxt, txt, len);
 	else
@@ -1852,5 +1305,5 @@
 		while (i < len)
 		{
-			int		thislen = PQmblen(txt + i, cur_state->encoding);
+			int		thislen = cur_state->cb.enc_mblen(txt + i);
 
 			/* first byte should always be okay... */
@@ -1862,9 +1315,9 @@
 	}
 
-	return yy_scan_buffer(newtxt, len + 2);
+	return cur_state->my_yy_scan_buffer(newtxt, len + 2);
 }
 
 /*
- * emit() --- body for ECHO macro
+ * scan_emit() --- body for ECHO macro
  *
  * NB: this must be used for ALL and ONLY the text copied from the flex
@@ -1873,8 +1326,8 @@
  * appended directly to output_buf.
  */
-static void
-emit(const char *txt, int len)
+void
+scan_emit(const char *txt, int len)
 {
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		appendBinaryPQExpBuffer(output_buf, txt, len);
 	else
@@ -1900,13 +1353,13 @@
  * extract_substring --- fetch the true value of (part of) the current token
  *
- * This is like emit(), except that the data is returned as a malloc'd string
- * rather than being pushed directly to output_buf.
+ * This is like scan_emit(), except that the data is returned as a malloc'd
+ * string rather than being pushed directly to output_buf.
  */
-static char *
+char *
 extract_substring(const char *txt, int len)
 {
 	char	   *result = (char *) pg_malloc(len + 1);
 
-	if (cur_state->safe_encoding)
+	if (ENC_IS_SAFE(cur_state))
 		memcpy(result, txt, len);
 	else
@@ -1939,43 +1392,24 @@
  * find the variable or the escaping function fails, emit the token as-is.
  */
-static void
+void
 escape_variable(bool as_ident)
 {
-	char	   *varname;
-	const char *value;
-
-	/* Variable lookup. */
-	varname = extract_substring(yytext + 2, yyleng - 3);
-	value = GetVariable(pset.vars, varname);
-	free(varname);
-
-	/* Escaping. */
-	if (value)
+	/* Variable lookup if possible. */
+	if (cur_state->cb.get_variable)
 	{
-		if (!pset.db)
-			psql_error("can't escape without active connection\n");
-		else
-		{
-			char   *escaped_value;
+		char		*varname;
+		const char  *value;
+		void	   (*free_fn)(void *);
+
+		varname = extract_substring(yytext + 2, yyleng - 3);
+		value = cur_state->cb.get_variable(varname, true, as_ident, &free_fn);
+		free(varname);
 
-			if (as_ident)
-				escaped_value =
-					PQescapeIdentifier(pset.db, value, strlen(value));
-			else
-				escaped_value =
-					PQescapeLiteral(pset.db, value, strlen(value));
-
-			if (escaped_value == NULL)
-			{
-				const char *error = PQerrorMessage(pset.db);
-
-				psql_error("%s", error);
-			}
-			else
-			{
-				appendPQExpBufferStr(output_buf, escaped_value);
-				PQfreemem(escaped_value);
-				return;
-			}
+		if (value)
+		{
+			appendPQExpBufferStr(output_buf, value);
+			if (free_fn)
+				free_fn((void*)value);
+			return;
 		}
 	}
@@ -1985,4 +1419,20 @@
 	 * original text into the output buffer.
 	 */
-	emit(yytext, yyleng);
+	scan_emit(yytext, yyleng);
+}
+
+/* Default error output function */
+static void psql_error_errout(const char *fmt, ...)
+{
+	va_list	ap;
+
+	va_start(ap, fmt);
+	vfprintf(stderr, _(fmt), ap);
+	va_end(ap);
+}
+
+/* Default function to check standard_conforming_strings */
+static bool psql_standard_strings(void)
+{
+	return false;
 }
pgbench.c.patient.diff (text/x-patch; charset=us-ascii)
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 596d112..8793fd2 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -54,6 +54,7 @@
 #endif
 
 #include "pgbench.h"
+#include "psqlscan.h"
 
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
@@ -285,7 +286,7 @@ typedef enum QueryMode
 static QueryMode querymode = QUERY_SIMPLE;
 static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
 
-typedef struct
+typedef struct Command_t
 {
 	char	   *line;			/* full text of command line */
 	int			command_num;	/* unique index of this Command struct */
@@ -295,6 +296,7 @@ typedef struct
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
 	SimpleStats stats;			/* time spent in this command */
+	struct Command_t *next;		/* next command, if any, for multistatements */
 } Command;
 
 static struct
@@ -303,6 +305,22 @@ static struct
 	Command   **commands;
 	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
+
+typedef enum
+{
+	PS_IDLE,
+	PS_IN_STATEMENT,
+	PS_IN_BACKSLASH_CMD
+} ParseState;
+
+typedef struct ParseInfo
+{
+	PsqlScanState	scan_state;
+	PQExpBuffer		outbuf;
+	ParseState		mode;
+} ParseInfoData;
+typedef ParseInfoData *ParseInfo;
+
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
@@ -430,6 +448,9 @@ usage(void)
 		   progname, progname);
 }
 
+PsqlScanCallbacks pgbench_scan_callbacks =
+{NULL, NULL, NULL};
+
 /*
  * strtoint64 -- convert a string to 64-bit integer
  *
@@ -2287,43 +2308,109 @@ syntax_error(const char *source, const int lineno,
 	exit(1);
 }
 
-/* Parse a command; return a Command struct, or NULL if it's a comment */
+static ParseInfo
+createParseInfo(void)
+{
+	ParseInfo ret = (ParseInfo) pg_malloc(sizeof(ParseInfoData));
+
+	ret->scan_state = psql_scan_create();
+	ret->outbuf = createPQExpBuffer();
+	ret->mode = PS_IDLE;
+
+	return ret;
+}
+
+#define parse_reset_outbuf(pcs) resetPQExpBuffer((pcs)->outbuf)
+#define parse_finish_scan(pcs) psql_scan_finish((pcs)->scan_state)
+
+/* Copy a string, replacing newlines with spaces and collapsing whitespace */
+static char *
+strdup_nonl(const char *in)
+{
+	char *ret, *p, *q;
+
+	ret = pg_strdup(in);
+
+	/* Replace newlines with spaces */
+	for (p = ret ; *p ; p++)
+		if (*p == '\n') *p = ' ';
+
+	/* collapse successive spaces */
+	for (p = q = ret ; *p ; p++, q++)
+	{
+		while (isspace(*p) && isspace(*(p + 1))) p++;
+		if (p > q) *q = *p;
+	}
+	*q = '\0';
+
+	return ret;
+}
+
+/* Parse a backslash command; return a Command struct */
 static Command *
-process_commands(char *buf, const char *source, const int lineno)
+process_backslash_commands(ParseInfo proc_state, char *buf,
+						   const char *source, const int lineno)
 {
 	const char	delim[] = " \f\n\r\t\v";
 	Command    *my_commands;
 	int			j;
 	char	   *p,
+			   *start,
 			   *tok;
-
-	/* Make the string buf end at the next newline */
-	if ((p = strchr(buf, '\n')) != NULL)
-		*p = '\0';
+	int			max_args = -1;
 
 	/* Skip leading whitespace */
 	p = buf;
 	while (isspace((unsigned char) *p))
 		p++;
+	start = p;
 
-	/* If the line is empty or actually a comment, we're done */
-	if (*p == '\0' || strncmp(p, "--", 2) == 0)
+	if (proc_state->mode != PS_IN_BACKSLASH_CMD)
+	{
+		if (*p != '\\')
+			return NULL;	/* not a backslash command */
+
+		/* This is the first line of a backslash command  */
+		proc_state->mode = PS_IN_BACKSLASH_CMD;
+	}
+
+	/*
+	 * Make the string buf end at the next newline, or move to just after the
+	 * end of line
+	 */
+	if ((p = strchr(start, '\n')) != NULL)
+		*p = '\0';
+	else
+		p = start + strlen(start);
+
+	/* continued line ends with a backslash */
+	if (*(--p) == '\\')
+	{
+		*p-- = '\0';
+		appendPQExpBufferStr(proc_state->outbuf, start);
+
+		/* Add a delimiter at the end of the line if necessary */
+		if (!isspace(*p))
+			appendPQExpBufferChar(proc_state->outbuf, ' ');
  		return NULL;
+	}
+
+	appendPQExpBufferStr(proc_state->outbuf, start);
+	proc_state->mode = PS_IDLE;
+
+	/* Start parsing the backslash command */
+
+	p = proc_state->outbuf->data;
 
 	/* Allocate and initialize Command structure */
 	my_commands = (Command *) pg_malloc(sizeof(Command));
-	my_commands->line = pg_strdup(buf);
+	my_commands->line = pg_strdup(p);
 	my_commands->command_num = num_commands++;
-	my_commands->type = 0;		/* until set */
+	my_commands->type = META_COMMAND;
 	my_commands->argc = 0;
+	my_commands->next = NULL;
 	initSimpleStats(&my_commands->stats);
 
-	if (*p == '\\')
-	{
-		int			max_args = -1;
-
-		my_commands->type = META_COMMAND;
-
 	j = 0;
 	tok = strtok(++p, delim);
 
@@ -2340,6 +2427,7 @@ process_commands(char *buf, const char *source, const int lineno)
 		else
 			tok = strtok(NULL, delim);
 	}
+	parse_reset_outbuf(proc_state);
 
 	if (pg_strcasecmp(my_commands->argv[0], "setrandom") == 0)
 	{
@@ -2475,28 +2563,91 @@ process_commands(char *buf, const char *source, const int lineno)
 		syntax_error(source, lineno, my_commands->line, my_commands->argv[0],
 					 "invalid command", NULL, -1);
 	}
+
+	return my_commands;
+}
+
+/* Parse an input line, return non-null if any command terminates. */
+static Command *
+process_commands(ParseInfo proc_state, char *buf,
+				 const char *source, const int lineno)
+{
+	Command *command = NULL;
+	Command *retcomd = NULL;
+	PsqlScanState scan_state = proc_state->scan_state;
+	promptStatus_t prompt_status = PROMPT_READY; /* dummy  */
+	PQExpBuffer qbuf = proc_state->outbuf;
+	PsqlScanResult scan_result;
+
+	if (proc_state->mode != PS_IN_STATEMENT)
+	{
+		command = process_backslash_commands(proc_state, buf, source, lineno);
+
+		/* go to next line for continuation of the backslash command. */
+		if (command != NULL || proc_state->mode == PS_IN_BACKSLASH_CMD)
+			return command;
 	}
+
+	/* Parse statements */
+	psql_scan_setup(scan_state, buf, strlen(buf), &pgbench_scan_callbacks);
+
+next_command:	
+	scan_result = psql_scan(scan_state, qbuf, &prompt_status);
+
+	if (scan_result == PSCAN_SEMICOLON)
+	{
+		proc_state->mode = PS_IDLE;
+		/*
+		 * Command is terminated. Fill the struct.
+		 */
+		command = (Command*) pg_malloc(sizeof(Command));
+		command->line = strdup_nonl(qbuf->data);
+		command->command_num = num_commands++;
+		command->type = SQL_COMMAND;
+		command->argc = 0;
+		command->next = NULL;
+
+		/* Append this command to the end of the returned command chain */
+		if (!retcomd)
+			retcomd = command;
 		else
 		{
-		my_commands->type = SQL_COMMAND;
+			Command *pcomm = retcomd;
+			while (pcomm->next) pcomm = pcomm->next;
+			pcomm->next = command;
+		}
 
 		switch (querymode)
 		{
 		case QUERY_SIMPLE:
-				my_commands->argv[0] = pg_strdup(p);
-				my_commands->argc++;
+			command->argv[0] = pg_strdup(qbuf->data);
+			command->argc++;
 			break;
 		case QUERY_EXTENDED:
 		case QUERY_PREPARED:
-				if (!parseQuery(my_commands, p))
+			if (!parseQuery(command, qbuf->data))
 				exit(1);
 			break;
 		default:
 			exit(1);
 		}
+
+		parse_reset_outbuf(proc_state);
+
+		/* Ask for the next statement in this line */
+		goto next_command;
  	}
+	else if (scan_result == PSCAN_BACKSLASH)
+	{
+		fprintf(stderr, "Unexpected backslash in SQL statement: %s:%d\n",
+				source, lineno);
+		exit(1);
+	}
+
+	proc_state->mode = PS_IN_STATEMENT;
+	psql_scan_finish(scan_state);
 
-	return my_commands;
+	return retcomd;
 }
 
 /*
@@ -2557,6 +2708,7 @@ process_file(char *filename)
 				index;
 	char	   *buf;
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2571,26 +2723,38 @@ process_file(char *filename)
 		return NULL;
 	}
 
+	proc_state->mode = PS_IDLE;
+
 	lineno = 0;
 	index = 0;
 
 	while ((buf = read_line_from_file(fd)) != NULL)
 	{
-		Command    *command;
+		Command    *command = NULL;
 
 		lineno += 1;
 
-		command = process_commands(buf, filename, lineno);
+		command = process_commands(proc_state, buf, filename, lineno);
 
 		free(buf);
 
 		if (command == NULL)
+		{
+			/*
+			 * command is NULL when psql_scan returns PSCAN_EOL or
+			 * PSCAN_INCOMPLETE. In either case, immediately ask for the
+			 * next line.
+			 */
  			continue;
+		}
 
-		my_commands[index] = command;
-		index++;
+		while (command)
+		{
+			my_commands[index++] = command;
+			command = command->next;
+		}
 		
-		if (index >= alloc_num)
+		if (index > alloc_num)
 		{
 			alloc_num += COMMANDS_ALLOC_NUM;
 			my_commands = pg_realloc(my_commands, sizeof(Command *) * alloc_num);
@@ -2598,6 +2762,8 @@ process_file(char *filename)
 	}
 	fclose(fd);
 
+	parse_finish_scan(proc_state);
+
 	my_commands[index] = NULL;
 
 	return my_commands;
@@ -2613,6 +2779,7 @@ process_builtin(const char *tb, const char *source)
 				index;
 	char		buf[BUFSIZ];
 	int			alloc_num;
+	ParseInfo proc_state = createParseInfo();
 
 	alloc_num = COMMANDS_ALLOC_NUM;
 	my_commands = (Command **) pg_malloc(sizeof(Command *) * alloc_num);
@@ -2639,10 +2806,12 @@ process_builtin(const char *tb, const char *source)
 
 		lineno += 1;
 
-		command = process_commands(buf, source, lineno);
+		command = process_commands(proc_state, buf, source, lineno);
 		if (command == NULL)
 			continue;
 
+		/* builtin doesn't need multistatements */
+		Assert(command->next == NULL);
 		my_commands[index] = command;
 		index++;
 
@@ -2654,6 +2823,7 @@ process_builtin(const char *tb, const char *source)
 	}
 
 	my_commands[index] = NULL;
+	parse_finish_scan(proc_state);
 
 	return my_commands;
 }
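
A note on the process_file hunk above: the `while (command)` loop can store several entries for one input line, but the capacity check runs only once afterwards, and with `>` rather than `>=`, so a store (or the final NULL terminator) can land past the end of the array. What follows is a minimal sketch of one way to keep the array in bounds, not the code that was eventually committed. The `Command` struct is reduced to its `next` pointer, and `CommandArray`, `reserve_slots` and `append_chain` are hypothetical names introduced only for this illustration; the value of `COMMANDS_ALLOC_NUM` is likewise chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define COMMANDS_ALLOC_NUM 128	/* growth step; value assumed for this sketch */

/* Simplified stand-in for pgbench's Command: only the chain link is kept. */
typedef struct Command
{
	struct Command *next;
} Command;

/* Hypothetical wrapper around the my_commands / index / alloc_num trio. */
typedef struct CommandArray
{
	Command   **items;
	int			used;
	int			alloc;
} CommandArray;

/* Grow the array until it can hold at least "needed" pointers. */
void
reserve_slots(CommandArray *arr, int needed)
{
	while (arr->alloc < needed)
	{
		arr->alloc += COMMANDS_ALLOC_NUM;
		arr->items = realloc(arr->items, sizeof(Command *) * arr->alloc);
		assert(arr->items != NULL);
	}
}

/*
 * Append every command of a chain, checking capacity before each store and
 * always leaving room for the terminating NULL.
 */
void
append_chain(CommandArray *arr, Command *chain)
{
	for (; chain != NULL; chain = chain->next)
	{
		reserve_slots(arr, arr->used + 2);	/* this entry + terminator */
		arr->items[arr->used++] = chain;
	}
	reserve_slots(arr, arr->used + 1);
	arr->items[arr->used] = NULL;
}
```

The point of the design is only that the check happens per stored entry rather than per input line; whether to grow by a fixed step or geometrically is a separate choice.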
#55Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro HORIGUCHI (#49)
Re: pgbench - allow backslash-continuations in custom scripts

On Thu, Feb 18, 2016 at 6:54 AM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:

It is the SQL part of old psqlscan.l but the difference between
them is a bit bothersome to see. I attached the diff between them
as "psqlscanbody.l.diff" for convenience.

This is a huge diff, and I don't see that you've explained the reason for
all the changes. For example:

-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
- YY_BUFFER_STATE buf; /* flex input control structure */
- char *bufstring; /* data actually being scanned by flex */
- char *origstring; /* copy of original data, if needed */
- char *varname; /* name of variable providing data, or NULL */
- struct StackElem *next;
-} StackElem;

Perhaps we could separate this part of the code motion into its own
preliminary patch? I see this went to psqlscan_int.h, but there's no
obvious reason for that particular name, and the comments don't explain it;
in fact, they say that's psqlscan.h. psqlscan_slash.h has the same
problem; perhaps moving things there could be another preliminary patch.

-                                       yyless(0);
+                                       my_yyless(0);

Why do we need to do this? Is "my_" really the best prefix? Is this
another change that could be its own patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#56Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Robert Haas (#55)
Re: pgbench - allow backslash-continuations in custom scripts

Thank you for the comment, but could you please tell me what
criteria I should use to split this patch? The discussion of
splitting criteria is in my reply below (in the paragraph that
begins with "By the way").

At Wed, 16 Mar 2016 11:57:25 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+Tgmobvu1aBRdRaKvqMVp0ifhQpgvnOEZa2Rg3AHfRWPE5-Tg@mail.gmail.com>

On Thu, Feb 18, 2016 at 6:54 AM, Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp> wrote:

It is the SQL part of old psqlscan.l but the difference between
them is a bit bothersome to see. I attached the diff between them
as "psqlscanbody.l.diff" for convenience.

This is a huge diff, and I don't see that you've explained the reason for
all the changes. For example:

-/*
- * We use a stack of flex buffers to handle substitution of psql variables.
- * Each stacked buffer contains the as-yet-unread text from one psql variable.
- * When we pop the stack all the way, we resume reading from the outer buffer
- * identified by scanbufhandle.
- */
-typedef struct StackElem
-{
- YY_BUFFER_STATE buf; /* flex input control structure */
- char *bufstring; /* data actually being scanned by flex */
- char *origstring; /* copy of original data, if needed */
- char *varname; /* name of variable providing data, or NULL */
- struct StackElem *next;
-} StackElem;

Perhaps we could separate this part of the code motion into its own
preliminary patch?

The "preliminary patch" seems to mean the same thing as the
first patch in the following message.

/messages/by-id/20160107.173603.31865003.horiguchi.kyotaro@lab.ntt.co.jp

Its commit log reads as follows.

| Subject: [PATCH 1/5] Prepare for sharing psqlscan with pgbench.
|
| Lexer is no longer compiled as a part of mainloop.c. The slash
| command lexer is brought out from the command line lexer. psql_scan
| no longer accesses directly to pset struct and VariableSpace. This
| change allows psqlscan to be used without these things.

The next two patches are as follows.

| Subject: [PATCH 2/5] Change the access method to shell variables
|
| Access to shell variables via a callback function so that the lexer no
| longer need to be aware of VariableSpace.

| Subject: [PATCH 3/5] Detach common.c from psqlscan
|
| Call standard_strings() and psql_error() via callback functions so
| that psqlscan.l can live totally without common.c stuff. They are
| bundled up with get_variable() callback in one struct since now we
| have as many as four callback functions.
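
For readers unfamiliar with the pattern those two commit messages describe, here is a rough sketch of a lexer reaching variables and error reporting only through caller-supplied callbacks. It is an illustration under assumptions, not the code from the patches: `ScanCallbacks`, `lookup_variable`, `DemoVars` and the `demo_*` functions are hypothetical names, and only three callbacks are shown rather than the four the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bundle of callbacks handed to the lexer by its caller. */
typedef struct ScanCallbacks
{
	const char *(*get_variable) (const char *name, void *ctx);
	int			(*standard_strings) (void *ctx);
	void		(*report_error) (const char *msg, void *ctx);
} ScanCallbacks;

/*
 * Lexer side: resolve a variable without knowing where variables live.
 * Returns NULL (after reporting, if possible) when the name is unknown.
 */
const char *
lookup_variable(const ScanCallbacks *cb, void *ctx, const char *name)
{
	const char *val;

	assert(cb != NULL && cb->get_variable != NULL);
	val = cb->get_variable(name, ctx);
	if (val == NULL && cb->report_error != NULL)
		cb->report_error("undefined variable", ctx);
	return val;
}

/* Caller side: a one-entry variable table standing in for VariableSpace. */
typedef struct DemoVars
{
	const char *name;
	const char *value;
	int			errors;
} DemoVars;

const char *
demo_get_variable(const char *name, void *ctx)
{
	const DemoVars *vars = (const DemoVars *) ctx;

	return strcmp(name, vars->name) == 0 ? vars->value : NULL;
}

void
demo_report_error(const char *msg, void *ctx)
{
	(void) msg;					/* a real caller would print this */
	((DemoVars *) ctx)->errors++;
}
```

With this shape, psql and pgbench could each pass their own callback table and context pointer while sharing the same lexer source.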

These patches were made so that the source files stay compilable
and workable after each one. They might be a bit more readable if
freed from that restriction.

I see this went to psqlscan_int.h, but there's no
obvious reason for that particular name, and the comments don't explain it;

I assumed that was a naming convention, going by libpq-int.h
(though that uses a hyphen, not an underscore). But the file
needs a comment like the one in libpq-int.h. I'll rename it and
add such a comment.

By the way, the patch set mentioned above provides these preliminary
steps separately. Should I base the next patch set on that older
one, which separates the preliminary steps? Or would a new split
that is free from the compilable-and-workable restriction be
preferable?

in fact, they say that's psqlscan.h. psqlscan_slash.h has the same
problem; perhaps moving things there could be another preliminary patch.

It is also included in the 0001 patch.

-                                       yyless(0);
+                                       my_yyless(0);

Why do we need to do this? Is "my_" really the best prefix? Is this
another change that could be its own patch?

Oops! Sorry for the silly name. I was not able to think up a
proper name for it. Does psqlscan_yyless seem good?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#57Robert Haas
robertmhaas@gmail.com
In reply to: Kyotaro HORIGUCHI (#56)
Re: pgbench - allow backslash-continuations in custom scripts

On Thu, Mar 17, 2016 at 4:12 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Thank you for the comment, but could you please tell me what
criteria I should use to split this patch? The discussion of
splitting criteria is in my reply below (in the paragraph that
begins with "By the way").

Well, I'm trying to find a piece of this patch that is small enough
that I can understand it and in good enough shape that I can commit it
independently, but I am having some difficulty with that. I keep
hoping some other committer is going to come along and be able to grok
this well enough to apply it based on what you've already done, but so
far it seems to be the all-me show.

These patches were made so that the source files stay compilable
and workable after each one. They might be a bit more readable if
freed from that restriction.

Keeping it compilable and workable after each patch is essential, but
the first patch is still big and doing a lot of stuff. I'm wondering
if it can be further decomposed.

I see this went to psqlscan_int.h, but there's no
obvious reason for that particular name, and the comments don't explain it;

I assumed that was a naming convention, going by libpq-int.h
(though that uses a hyphen, not an underscore). But the file
needs a comment like the one in libpq-int.h. I'll rename it and
add such a comment.

OK.

-                                       yyless(0);
+                                       my_yyless(0);

Why do we need to do this? Is "my_" really the best prefix? Is this
another change that could be its own patch?

Oops! Sorry for the silly name. I was not able to think up a
proper name for it. Does psqlscan_yyless seem good?

That does sound better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#58Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#57)
Re: pgbench - allow backslash-continuations in custom scripts

Robert Haas <robertmhaas@gmail.com> writes:

Well, I'm trying to find a piece of this patch that is small enough
that I can understand it and in good enough shape that I can commit it
independently, but I am having some difficulty with that. I keep
hoping some other committer is going to come along and be able to grok
this well enough to apply it based on what you've already done, but so
far it seems to be the all-me show.

This is mostly a flex/bison hack, isn't it? If you like I'll take it.

regards, tom lane

#59Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#58)
Re: pgbench - allow backslash-continuations in custom scripts

On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Well, I'm trying to find a piece of this patch that is small enough
that I can understand it and in good enough shape that I can commit it
independently, but I am having some difficulty with that. I keep
hoping some other committer is going to come along and be able to grok
this well enough to apply it based on what you've already done, but so
far it seems to be the all-me show.

This is mostly a flex/bison hack, isn't it? If you like I'll take it.

I would be delighted if you would.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#59)
Re: pgbench - allow backslash-continuations in custom scripts

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is mostly a flex/bison hack, isn't it? If you like I'll take it.

I would be delighted if you would.

I've committed changes equivalent to Horiguchi-san's 0001 and 0002
patches, though rather different in detail. I concur with the upthread
opinion that 0003 doesn't seem really necessary.

This solves the problem of allowing SQL commands in scripts to span
lines, but it doesn't do anything about backslash commands, which was
the original point according to the thread title ;-). I can think of
two somewhat-independent changes we might want to make at this point,
since we're breaking exact script compatibility for 9.6 anyway:

* Allow multiple backslash commands on one line, eg
\set foo 5 \set bar 6
The main reason for that is that psql allows it, and one of the things
we're supposedly trying to do here is reduce the behavioral distance
between psql and pgbench parsing rules.

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command. This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

I don't have strong feelings about either.
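
To make the second proposal concrete, here is a minimal string-level sketch of the rule "backslash immediately followed by newline is ignored". It is an assumption-laden illustration: `splice_backslash_newline` is a hypothetical helper, not pgbench code, and a lexer-based implementation may treat word boundaries differently than this plain splice does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, dropping every backslash that is
 * immediately followed by a newline, together with that newline.  A backslash
 * followed by anything else is copied unchanged.  dst must have room for
 * strlen(src) + 1 bytes.  Returns the length of the result.
 */
size_t
splice_backslash_newline(const char *src, char *dst)
{
	size_t		n = 0;

	assert(src != NULL && dst != NULL);

	while (*src != '\0')
	{
		if (src[0] == '\\' && src[1] == '\n')
		{
			src += 2;			/* swallow the continuation */
			continue;
		}
		dst[n++] = *src++;
	}
	dst[n] = '\0';
	return n;
}
```

Applied to a `\set` command split over two lines with a trailing backslash, the helper yields the same text as if the command had been written on one line.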

regards, tom lane

#61David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#60)
Re: pgbench - allow backslash-continuations in custom scripts

On Sunday, March 20, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command. This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

This would be appreciated. The main case I find wanting this is writing
out long \copy expressions. Solving really complex ones using temporary
tables works but being able to spread it out over multiple lines would be a
welcomed addition.

David J.

#62Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#60)
Re: pgbench - allow backslash-continuations in custom scripts

On Sun, Mar 20, 2016 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is mostly a flex/bison hack, isn't it? If you like I'll take it.

I would be delighted if you would.

I've committed changes equivalent to Horiguchi-san's 0001 and 0002
patches, though rather different in detail. I concur with the upthread
opinion that 0003 doesn't seem really necessary.

This solves the problem of allowing SQL commands in scripts to span
lines, ...

Excellent.

but it doesn't do anything about backslash commands, which was
the original point according to the thread title ;-).

Wait, was it really? I'd been thinking it was mostly to continue
queries, not metacommands, but maybe I missed the boat.

I can think of
two somewhat-independent changes we might want to make at this point,
since we're breaking exact script compatibility for 9.6 anyway:

* Allow multiple backslash commands on one line, eg
\set foo 5 \set bar 6
The main reason for that is that psql allows it, and one of the things
we're supposedly trying to do here is reduce the behavioral distance
between psql and pgbench parsing rules.

This seems to me to be going in the wrong direction.

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command. This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

This might have some point to it, though, if you want to say \set i
<incredibly long expression not easily contained on a single line>

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#62)
Re: pgbench - allow backslash-continuations in custom scripts

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, Mar 20, 2016 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This solves the problem of allowing SQL commands in scripts to span
lines, ...

Excellent.

but it doesn't do anything about backslash commands, which was
the original point according to the thread title ;-).

Wait, was it really? I'd been thinking it was mostly to continue
queries, not metacommands, but maybe I missed the boat.

Nah, you're right, it was about continuing queries. Still, we've had
complaints about the other thing too, and I think if we're going to
change anything here, we should change it all in the same release.

I can think of
two somewhat-independent changes we might want to make at this point,
since we're breaking exact script compatibility for 9.6 anyway:

* Allow multiple backslash commands on one line, eg
\set foo 5 \set bar 6
The main reason for that is that psql allows it, and one of the things
we're supposedly trying to do here is reduce the behavioral distance
between psql and pgbench parsing rules.

This seems to me to be going in the wrong direction.

Um, why exactly? That psql behavior is of really ancient standing, and
we have not had complaints about it.

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command. This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

This might have some point to it, though, if you want to say \set i
<incredibly long expression not easily contained on a single line>

Shall I make a patch that allows backslash-newline to be handled this way
in both psql and pgbench backslash commands? At least in psql, there
would be no backwards compatibility problem, since right now the case
just fails:

regression=# \set x y \
Invalid command \. Try \? for help.

regards, tom lane

#64Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#63)
Re: pgbench - allow backslash-continuations in custom scripts

On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Wait, was it really? I'd been thinking it was mostly to continue
queries, not metacommands, but maybe I missed the boat.

Nah, you're right, it was about continuing queries. Still, we've had
complaints about the other thing too, and I think if we're going to
change anything here, we should change it all in the same release.

Fair enough.

* Allow multiple backslash commands on one line, eg
\set foo 5 \set bar 6
The main reason for that is that psql allows it, and one of the things
we're supposedly trying to do here is reduce the behavioral distance
between psql and pgbench parsing rules.

This seems to me to be going in the wrong direction.

Um, why exactly? That psql behavior is of really ancient standing, and
we have not had complaints about it.

I think that's mostly because the psql metacommands are ridiculously
impoverished. I'm guessing that pgbench's expression language is
eventually going to support strings as a data type, for example, and
those strings might want to contain backslashes. There's basically no
value in cramming multiple metacommands onto a single line, but there
is the risk of creating unnecessary lexing or parsing difficulties in
the future.

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command. This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

This might have some point to it, though, if you want to say \set i
<incredibly long expression not easily contained on a single line>

Shall I make a patch that allows backslash-newline to be handled this way
in both psql and pgbench backslash commands? At least in psql, there
would be no backwards compatibility problem, since right now the case
just fails:

regression=# \set x y \
Invalid command \. Try \? for help.

I certainly don't object to such a patch, although if it's between you
writing that patch and you getting Tomas Vondra's multivariate
statistics stuff committed, I'll take the latter. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#65Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#64)
Re: pgbench - allow backslash-continuations in custom scripts

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Um, why exactly? That psql behavior is of really ancient standing, and
we have not had complaints about it.

I think that's mostly because the psql metacommands are ridiculously
impoverished. I'm guessing that pgbench's expression language is
eventually going to support strings as a data type, for example, and
those strings might want to contain backslashes.

Sure, but once we define strings, they'll be quoted, and the behavior
can/should/will be different for a backslash inside quotes than one
outside them --- as it is already in psql. Moreover, if you're on
board with the backslash-newline proposal, you've already bought into
the idea that backslashes outside quotes will behave differently from
those inside.

Shall I make a patch that allows backslash-newline to be handled this way
in both psql and pgbench backslash commands?

I certainly don't object to such a patch, although if it's between you
writing that patch and you getting Tomas Vondra's multivariate
statistics stuff committed, I'll take the latter. :-)

I'll get to that, but I'd like to get this area fully dealt with before
context-swapping to that one.

regards, tom lane

#66Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#65)
Re: pgbench - allow backslash-continuations in custom scripts

On Mon, Mar 21, 2016 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Um, why exactly? That psql behavior is of really ancient standing, and
we have not had complaints about it.

I think that's mostly because the psql metacommands are ridiculously
impoverished. I'm guessing that pgbench's expression language is
eventually going to support strings as a data type, for example, and
those strings might want to contain backslashes.

Sure, but once we define strings, they'll be quoted, and the behavior
can/should/will be different for a backslash inside quotes than one
outside them --- as it is already in psql. Moreover, if you're on
board with the backslash-newline proposal, you've already bought into
the idea that backslashes outside quotes will behave differently from
those inside.

Mmph. I just don't see any benefit in being able to start a command
in the middle of a line. Even if it's not dangerous to future things
we might want to implement, I'd argue it's still poor style, and of no
practical utility. But if I lose that argument, then I do.

Shall I make a patch that allows backslash-newline to be handled this way
in both psql and pgbench backslash commands?

I certainly don't object to such a patch, although if it's between you
writing that patch and you getting Tomas Vondra's multivariate
statistics stuff committed, I'll take the latter. :-)

I'll get to that, but I'd like to get this area fully dealt with before
context-swapping to that one.

Understood.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#67Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#66)
Re: pgbench - allow backslash-continuations in custom scripts

Robert Haas <robertmhaas@gmail.com> writes:

Mmph. I just don't see any benefit in being able to start a command
in the middle of a line.

I agree that there's no functional benefit; it's a matter of consistency.
In particular, psql has always allowed you to write multiple SQL commands
per line:

SELECT 2+2; SELECT x FROM tab; SELECT y FROM othertab;

and as of yesterday pgbench supports that as well. So allowing multiple
backslash commands on a line improves consistency both with psql and with
pgbench's own behavior, IMV.

regards, tom lane

#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#67)
1 attachment(s)
Re: pgbench - allow backslash-continuations in custom scripts

So I looked into this, and found that persuading psql to let backslash
commands cross line boundaries is a much bigger deal than just fixing the
lexer. The problem is that MainLoop would need to grow an understanding
of having received only a partial backslash command and needing to go back
to readline() for another line. And probably HandleSlashCmds would need
to be changed to stop parsing and back out without doing anything when it
hits backslash-newline. It's do-able no doubt, but it's not going to be a
small and simple patch.

However, since pgbench is already set up to slurp the entire file and
lex it in one go, it is just a trivial adjustment to the lexer rules in
that program. The only thing I found that made it complicated is that
syntax_error() had too simplistic an understanding of how to position
the error cursor usefully, so that needed a bit of work.

I think it'd be okay to commit this without fixing psql at the same time;
if you try it in psql you get an error, so on that side it's unimplemented
behavior rather than an actual incompatibility. Perhaps somebody will be
motivated to fix it later, but I'm not going to spend that kind of time
on it right now.

I've not written a docs update, but otherwise I think this is committable.
Comments?

regards, tom lane

Attachments:

pgbench-backslash-newline-in-meta-commands.patchtext/x-diff; charset=us-ascii; name=pgbench-backslash-newline-in-meta-commands.patchDownload
diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 825dacc..12b5f7e 100644
*** a/src/bin/pgbench/exprscan.l
--- b/src/bin/pgbench/exprscan.l
***************
*** 6,17 ****
   *
   * This lexer supports two operating modes:
   *
!  * In INITIAL state, just parse off whitespace-separated words (this mode
!  * is basically equivalent to strtok(), which is what we used to use).
   *
   * In EXPR state, lex for the simple expression syntax of exprparse.y.
   *
!  * In either mode, stop upon hitting newline or end of string.
   *
   * Note that this lexer operates within the framework created by psqlscan.l,
   *
--- 6,18 ----
   *
   * This lexer supports two operating modes:
   *
!  * In INITIAL state, just parse off whitespace-separated words.  (This mode
!  * is basically equivalent to strtok(), which is what we used to use.)
   *
   * In EXPR state, lex for the simple expression syntax of exprparse.y.
   *
!  * In either mode, stop upon hitting newline, end of string, or unquoted
!  * backslash (except that backslash-newline is silently swallowed).
   *
   * Note that this lexer operates within the framework created by psqlscan.l,
   *
*************** extern void expr_yyset_column(int column
*** 61,69 ****
  alpha			[a-zA-Z_]
  digit			[0-9]
  alnum			[a-zA-Z0-9_]
! /* {space} + {nonspace} + {newline} should cover all characters */
  space			[ \t\r\f\v]
! nonspace		[^ \t\r\f\v\n]
  newline			[\n]
  
  /* Exclusive states */
--- 62,71 ----
  alpha			[a-zA-Z_]
  digit			[0-9]
  alnum			[a-zA-Z0-9_]
! /* {space} + {nonspace} + {backslash} + {newline} should cover all characters */
  space			[ \t\r\f\v]
! nonspace		[^ \t\r\f\v\\\n]
! backslash		[\\]
  newline			[\n]
  
  /* Exclusive states */
*************** newline			[\n]
*** 98,103 ****
--- 100,113 ----
  
  {space}+		{ /* ignore */ }
  
+ {backslash}{newline}	{ /* ignore */ }
+ 
+ {backslash}		{
+ 					/* do not eat, and report end of command */
+ 					yyless(0);
+ 					return 0;
+ 				}
+ 
  {newline}		{
  					/* report end of command */
  					last_was_newline = true;
*************** newline			[\n]
*** 130,143 ****
  					return FUNCTION;
  				}
  
  {newline}		{
  					/* report end of command */
  					last_was_newline = true;
  					return 0;
  				}
  
- {space}+		{ /* ignore */ }
- 
  .				{
  					/*
  					 * must strdup yytext so that expr_yyerror_more doesn't
--- 140,161 ----
  					return FUNCTION;
  				}
  
+ {space}+		{ /* ignore */ }
+ 
+ {backslash}{newline}	{ /* ignore */ }
+ 
+ {backslash}		{
+ 					/* do not eat, and report end of command */
+ 					yyless(0);
+ 					return 0;
+ 				}
+ 
  {newline}		{
  					/* report end of command */
  					last_was_newline = true;
  					return 0;
  				}
  
  .				{
  					/*
  					 * must strdup yytext so that expr_yyerror_more doesn't
*************** expr_yyerror_more(yyscan_t yyscanner, co
*** 177,183 ****
  	/*
  	 * While parsing an expression, we may not have collected the whole line
  	 * yet from the input source.  Lex till EOL so we can report whole line.
! 	 * (If we're at EOF, it's okay to call yylex() an extra time.)
  	 */
  	if (!last_was_newline)
  	{
--- 195,201 ----
  	/*
  	 * While parsing an expression, we may not have collected the whole line
  	 * yet from the input source.  Lex till EOL so we can report whole line.
! 	 * (If we're at backslash/EOF, it's okay to call yylex() an extra time.)
  	 */
  	if (!last_was_newline)
  	{
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4196b0e..e947f77 100644
*** a/src/bin/pgbench/pgbench.c
--- b/src/bin/pgbench/pgbench.c
*************** syntax_error(const char *source, int lin
*** 2413,2426 ****
  	fprintf(stderr, "\n");
  	if (line != NULL)
  	{
! 		fprintf(stderr, "%s\n", line);
! 		if (column >= 0)
  		{
  			int			i;
  
! 			for (i = 0; i < column; i++)
! 				fprintf(stderr, " ");
! 			fprintf(stderr, "^ error found here\n");
  		}
  	}
  	exit(1);
--- 2413,2455 ----
  	fprintf(stderr, "\n");
  	if (line != NULL)
  	{
! 		/*
! 		 * Multi-line backslash commands make this harder than you'd think; we
! 		 * have to identify which line to put the error cursor on.  So print
! 		 * one line at a time.
! 		 */
! 		for (;;)
  		{
+ 			const char *nlpos = strchr(line, '\n');
+ 			int			len;
  			int			i;
  
! 			if (nlpos)
! 			{
! 				/*
! 				 * It's tempting to use fprintf("%.*s"), but that can fail if
! 				 * glibc has a different idea of the encoding than we do.
! 				 */
! 				len = nlpos - line + 1;
! 				for (i = 0; i < len; i++)
! 					fputc(line[i], stderr);
! 			}
! 			else
! 			{
! 				len = column + 1;		/* ensure we print ^ if not done */
! 				fprintf(stderr, "%s\n", line);
! 			}
! 			if (column >= 0 && column < len)
! 			{
! 				for (i = 0; i < column; i++)
! 					fputc(' ', stderr);
! 				fprintf(stderr, "^ error found here\n");
! 			}
! 			column -= len;
! 			if (nlpos)
! 				line = nlpos + 1;
! 			else
! 				break;
  		}
  	}
  	exit(1);
#69Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Tom Lane (#68)
Re: pgbench - allow backslash-continuations in custom scripts

First, thank you to everyone involved, and thank you, Tom, for
polishing and committing this.

At Mon, 21 Mar 2016 17:15:18 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <1596.1458594918@sss.pgh.pa.us>

So I looked into this, and found that persuading psql to let backslash
commands cross line boundaries is a much bigger deal than just fixing the
lexer. The problem is that MainLoop would need to grow an understanding
of having received only a partial backslash command and needing to go back
to readline() for another line. And probably HandleSlashCmds would need
to be changed to stop parsing and back out without doing anything when it
hits backslash-newline. It's do-able no doubt, but it's not going to be a
small and simple patch.

I agree.

However, since pgbench is already set up to slurp the entire file and
lex it in one go, it is just a trivial adjustment to the lexer rules in
that program. The only thing I found that made it complicated is that
syntax_error() had too simplistic an understanding of how to position
the error cursor usefully, so that needed a bit of work.

The modified lexer treats {backslash}{newline} the same as
whitespace, and that looks OK to me.

/test.sql:6: syntax error, unexpected FUNCTION, expecting $end in command "set"
\set naccounts\
10x0
^ error found here

The error message seems fine. (The cryptic wording of the message is
a separate problem.) But when the command continues past the error,
the remaining lines are printed after the error indicator:

\set naccounts\
10x0\
^ error found here
* :scale

I suppose it would be better not to print the trailing
lines. (gcc doesn't seem to print them.)
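
A minimal sketch of that suggestion, for illustration only and not a tested patch: it keeps every line up to and including the one that holds the error offset, adds the cursor line, and drops whatever follows. The function name format_error_trimmed and the buffer-based interface are hypothetical, and the sketch assumes the column is a byte offset into the whole command text, as in syntax_error().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical variant: write into buf the lines of "text" up to and
 * including the line containing byte offset "column", followed by a cursor
 * line.  Lines after the error line are omitted.
 * Returns 0 on success, -1 if column is out of range or buf is too small.
 */
int
format_error_trimmed(char *buf, size_t cap, const char *text, int column)
{
	static const char marker[] = "^ error found here\n";
	const char *start;			/* start of the line holding the error */
	const char *end;			/* end of that line, excluding its newline */
	const char *nl;
	size_t		shown;
	size_t		indent;

	assert(buf != NULL && text != NULL);
	if (column < 0 || (size_t) column > strlen(text))
		return -1;

	/* Skip every line whose newline comes before the error offset. */
	start = text;
	while ((nl = strchr(start, '\n')) != NULL && nl < text + column)
		start = nl + 1;

	end = strchr(start, '\n');
	if (end == NULL)
		end = start + strlen(start);

	shown = (size_t) (end - text);				/* bytes of command text kept */
	indent = (size_t) ((text + column) - start);	/* cursor offset in its line */

	/* text + '\n' + indent + marker (sizeof counts the trailing NUL) */
	if (shown + 1 + indent + sizeof(marker) > cap)
		return -1;

	memcpy(buf, text, shown);
	buf[shown] = '\n';
	memset(buf + shown + 1, ' ', indent);
	memcpy(buf + shown + 1 + indent, marker, sizeof(marker));
	return 0;
}
```

With the three-line example above and a column pointing at the start of the second line, the result ends at the cursor line and the `* :scale` line is no longer shown.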

I think it'd be okay to commit this without fixing psql at the same time;
if you try it in psql you get an error, so on that side it's unimplemented
behavior rather than an actual incompatibility. Perhaps somebody will be
motivated to fix it later, but I'm not going to spend that kind of time
on it right now.

I've not written a docs update, but otherwise I think this is committable.
Comments?

regards, tom lane

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#70Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#68)
Re: pgbench - allow backslash-continuations in custom scripts

Tom Lane wrote:

So I looked into this, and found that persuading psql to let backslash
commands cross line boundaries is a much bigger deal than just fixing the
lexer. The problem is that MainLoop would need to grow an understanding
of having received only a partial backslash command and needing to go back
to readline() for another line. And probably HandleSlashCmds would need
to be changed to stop parsing and back out without doing anything when it
hits backslash-newline. It's do-able no doubt, but it's not going to be a
small and simple patch.

FWIW, I would love to see this in some future release: particularly for
\copy lines with large queries, the limitation that only single-line
input is accepted is very annoying -- much more so when the query comes
pasted from some other input, or when you have a file with a query and
just want to add a quick \copy prefix.

(Hmm, a much simpler alternative would be to allow \g-like syntax, i.e.
the query is already in the query buffer and the \copy line just
specifies the output file. In fact, for queries in history this is much
more convenient than the current syntax ...)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
