Allowing line-continuation in pgbench custom scripts
Hi,
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
For example,
BEGIN;
\setrandom id 1 16500000
UPDATE table \
SET col2 = (clock_timestamp() + '10s'::interval * random() * 1000), \
col3 = (clock_timestamp() + '10s'::interval * sin(random() * (2*pi()) ) * 1000) \
WHERE col1 = :id;
COMMIT;
instead of:
BEGIN;
\setrandom id 1 16500000
UPDATE table SET col2 = (clock_timestamp() + '10s'::interval *
random() * :id), col3 = (clock_timestamp() + '10s'::interval *
sin(random() * (2*pi()) ) * 100000) WHERE col1 = :id;
COMMIT;
Thoughts?
--
Amit
Attachments:
pgbench-custom-script-line-continuation.patch (application/octet-stream)
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..b2b003d 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -2047,6 +2047,13 @@ read_line_from_file(FILE *fd)
{
size_t thislen = strlen(tmpbuf);
+ /* allow line continuation with escape character '\' */
+ if(tmpbuf[thislen - 2] == '\\')
+ {
+ tmpbuf[thislen - 2] = '\0';
+ thislen -= 2;
+ }
+
/* Append tmpbuf to whatever we had already */
memcpy(buf + used, tmpbuf, thislen + 1);
used += thislen;
On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09@gmail.com> wrote:
Hi,
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
For example,
BEGIN;
\setrandom id 1 16500000
UPDATE table \
SET col2 = (clock_timestamp() + '10s'::interval * random() * 1000), \
col3 = (clock_timestamp() + '10s'::interval * sin(random() * (2*pi()) ) * 1000) \
WHERE col1 = :id;
COMMIT;
instead of:
BEGIN;
\setrandom id 1 16500000
UPDATE table SET col2 = (clock_timestamp() + '10s'::interval *
random() * :id), col3 = (clock_timestamp() + '10s'::interval *
sin(random() * (2*pi()) ) * 100000) WHERE col1 = :id;
COMMIT;
Thoughts?
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Regards,
--
Fujii Masao
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09@gmail.com> wrote:
Thoughts?
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Yeah, that would be much cleaner.
--
Amit
Amit Langote wrote:
On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09@gmail.com> wrote:
Thoughts?
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Yeah, that would be much cleaner.
But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Amit Langote <amitlangote09@gmail.com> writes:
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?
The patch as written has got serious problems even discounting any
compatibility risk: it will be fooled by a backslash near the end of a
bufferload that doesn't end with a newline, and it doesn't allow for
DOS-style newlines (\r\n), and it indexes off the array if the buffer
contains *only* a newline (and, assuming that it fails to crash in that
case, it'd also fail to note a backslash that had been in the previous
bufferload).
regards, tom lane
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Amit Langote wrote:
On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Yeah, that would be much cleaner.
But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.
Well, we only parse the script file(s) once at run start, and that time
isn't included in the TPS timing, so I don't think performance is really
an issue here. But yeah, the amount of code that would have to be
duplicated out of psql is pretty daunting --- it'd be a maintenance
nightmare, for what seems like not a lot of gain. There would also
be a compatibility issue if we went this way, because existing scripts
that haven't bothered with semicolon line terminators would break.
regards, tom lane
On Tue, May 27, 2014 at 12:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Langote <amitlangote09@gmail.com> writes:
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?
The patch as written has got serious problems even discounting any
compatibility risk: it will be fooled by a backslash near the end of a
bufferload that doesn't end with a newline, and it doesn't allow for
DOS-style newlines (\r\n), and it indexes off the array if the buffer
contains *only* a newline (and, assuming that it fails to crash in that
case, it'd also fail to note a backslash that had been in the previous
bufferload).
Sorry, the patch was in a really bad shape. Should have pondered these
points before submitting it.
Even if I dropped the backslash line-continuation idea in favor of
semicolons as SQL command separators, the compatibility issues
mentioned downthread suggest it would not be worthwhile.
--
Amit
On Monday, May 26, 2014, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Langote <amitlangote09@gmail.com> writes:
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?
Do you mean due to the bugs you point out, or in general? Is it really at
all likely that someone has ended a line of their custom benchmark file
with a backslash? I'm having a hard time seeing what, other than malice,
would prod someone to do that.
Cheers,
Jeff
On 2014-05-26 11:44:31 -0400, Tom Lane wrote:
Amit Langote <amitlangote09@gmail.com> writes:
In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c
This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?
Weaknesses in the implementation aside, I don't think pgbench has to be
held up to the same level of compatibility as many of our other
tools. It's just a benchmark tool, after all.
I've more than once wished for the capability.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Jeff Janes <jeff.janes@gmail.com> writes:
On Monday, May 26, 2014, Tom Lane <tgl@sss.pgh.pa.us> wrote:
This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?
Do you mean due to the bugs you point out, or in general? Is it really at
all likely that someone has ended a line of their custom benchmark file
with a backslash? I'm having a hard time seeing what, other than malice,
would prod someone to do that.
No, I was worried that the feature would pose such a risk even when
correctly implemented. But on reflection, you're right, it seems a bit
hard to credit that any existing script file would have a backslash just
before EOL. There's certainly no SQL syntax in which that could be
valid; perhaps someone would do it in a "--" comment, but that seems a
tad far-fetched.
regards, tom lane
Andres Freund wrote:
I've more than once wished for the capability.
+1
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 05/26/2014 08:52 AM, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Amit Langote wrote:
On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Yeah, that would be much cleaner.
But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.
Well, we only parse the script file(s) once at run start, and that time
isn't included in the TPS timing, so I don't think performance is really
an issue here. But yeah, the amount of code that would have to be
duplicated out of psql is pretty daunting --- it'd be a maintenance
nightmare, for what seems like not a lot of gain. There would also
be a compatibility issue if we went this way, because existing scripts
that haven't bothered with semicolon line terminators would break.
What if we make using semicolons or not a config option in the file? i.e.:
\multiline
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: Tom Lane 2014-05-26 <26629.1401119523@sss.pgh.pa.us>
Yeah, that would be much cleaner.
But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.
Well, we only parse the script file(s) once at run start, and that time
isn't included in the TPS timing, so I don't think performance is really
an issue here. But yeah, the amount of code that would have to be
duplicated out of psql is pretty daunting --- it'd be a maintenance
nightmare, for what seems like not a lot of gain. There would also
be a compatibility issue if we went this way, because existing scripts
that haven't bothered with semicolon line terminators would break.
Fwiw, I would love to have some \ line continuation thing also for
.psqlrc. I have some dozen \set in there containing queries for
looking into stats/locks/whatever I can invoke just typing e.g.
:user_tables, and these are pretty hard to edit as they are squeezed
on one line.
I agree that putting an SQL parser into the backslash command parser
is overkill, but there's hardly a chance backslashes at the end of a
backslash command line would break anything, except for meeting what
most people would expect.
Christoph
--
cb@df7cb.de | http://www.df7cb.de/
On Tue, May 27, 2014 at 6:19 AM, Josh Berkus <josh@agliodbs.com> wrote:
On 05/26/2014 08:52 AM, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Amit Langote wrote:
On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.
Yeah, that would be much cleaner.
But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.
Well, we only parse the script file(s) once at run start, and that time
isn't included in the TPS timing, so I don't think performance is really
an issue here. But yeah, the amount of code that would have to be
duplicated out of psql is pretty daunting --- it'd be a maintenance
nightmare, for what seems like not a lot of gain. There would also
be a compatibility issue if we went this way, because existing scripts
that haven't bothered with semicolon line terminators would break.
What if we make using semicolons or not a config option in the file? i.e.:
\multiline
And perhaps 'off' would be the default, if I understand correctly?
It would apply only to the SQL commands though, no?
--
Amit