Two small patches for the isolationtester lexer

Started by Daniel Gustafssonalmost 8 years ago10 messages
#1Daniel Gustafsson
daniel@yesql.se
2 attachment(s)

When writing an isolation testcase recently I bumped into the 1024 line buffer
size limit in the lexer for my setup block. Adding some stored procedures to
the test makes it quite easy to break 1024 characters, and while these could be
added as steps it, it’s not a good workaround since the permutation order
becomes trickier (and more set in stone). As far as I can see in the history,
this limit is chosen as a decent sized buffer and not rooted in a specific
requirement, so I propose to bump it slightly to 2048 instead (an equally
arbitrarily chosen number). Is there a reason to keep it at 1024 that I’m
missing?

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies. Since SQL comments will be counted towards the line
buffer, and sent with the command, supporting both kinds of comments seems
reasonable and consistent.

make check passes with these patches applies, but I’m quite rusty in this area
so I might be missing something.

cheers ./daniel

Attachments:

0001-Increase-the-linebuf-in-the-isolation-spec-scanner.patchapplication/octet-stream; name=0001-Increase-the-linebuf-in-the-isolation-spec-scanner.patchDownload
From cffcde237cb19792181fd060c10198c26a1705bd Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Wed, 21 Feb 2018 20:42:24 +0100
Subject: [PATCH 1/2] Increase the linebuf in the isolation spec scanner

The previous value of 1024 characters for the line buffer is quite
easy to bump into with large setup blocks. Increase the buffer to
an arbitrarily chosen 2048 characters which allows for double the
size of blocks.
---
 src/test/isolation/specscanner.l | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
index 481b32d1d7..7d371ebbca 100644
--- a/src/test/isolation/specscanner.l
+++ b/src/test/isolation/specscanner.l
@@ -12,7 +12,7 @@
 
 static int	yyline = 1;			/* line number for error reporting */
 
-static char litbuf[1024];
+static char litbuf[2048];
 static int litbufpos = 0;
 
 static void addlitchar(char c);
-- 
2.14.1.145.gb3622a4ee

0002-Allow-comments-in-the-isolation-spec-SQL-blocks.patchapplication/octet-stream; name=0002-Allow-comments-in-the-isolation-spec-SQL-blocks.patchDownload
From 45e944bde1b18c6a0ddcba3e95d29540ab583b5d Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Wed, 21 Feb 2018 20:42:28 +0100
Subject: [PATCH 2/2] Allow # comments in the isolation spec SQL blocks

The documentation for the isolationtester spec files states that
lines starting with # are treated as comments. This doesm't apply
to inside the SQL blocks, so add support for # comments there as
well. While normal -- SQL comments are perfectly valid to use,
they will count against the linebuffer so supporthing both has a
value outside just being consistent.
---
 src/test/isolation/specscanner.l | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
index 7d371ebbca..84194b3a87 100644
--- a/src/test/isolation/specscanner.l
+++ b/src/test/isolation/specscanner.l
@@ -72,6 +72,9 @@ teardown		{ return TEARDOWN; }
 					litbufpos = 0;
 					BEGIN(sql);
 				}
+<sql>{comment}	{
+					/* ignore */
+				}
 <sql>{space}*"}" {
 					litbuf[litbufpos] = '\0';
 					yylval.str = pg_strdup(litbuf);
-- 
2.14.1.145.gb3622a4ee

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#1)
Re: Two small patches for the isolationtester lexer

Daniel Gustafsson <daniel@yesql.se> writes:

When writing an isolation testcase recently I bumped into the 1024 line buffer
size limit in the lexer for my setup block. Adding some stored procedures to
the test makes it quite easy to break 1024 characters, and while these could be
added as steps it, it’s not a good workaround since the permutation order
becomes trickier (and more set in stone). As far as I can see in the history,
this limit is chosen as a decent sized buffer and not rooted in a specific
requirement, so I propose to bump it slightly to 2048 instead (an equally
arbitrarily chosen number). Is there a reason to keep it at 1024 that I’m
missing?

I can't think of one; but I wonder if it's worth working a bit harder and
removing the fixed limit altogether, probably by using a PQExpBuffer.
If you've hit 1024 today, somebody will bump up against 2048 tomorrow.

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies. Since SQL comments will be counted towards the line
buffer, and sent with the command, supporting both kinds of comments seems
reasonable and consistent.

Hmm, not sure this is a good idea. # is a valid SQL operator name, so
doing this would create some risk of breaking legal queries. Admittedly,
those operators are rare enough that maybe nobody would ever need them in
isolationtester scripts, but I'm not sure that providing an additional
way to spell "comment" is worth that.

regards, tom lane

#3Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#2)
Re: Two small patches for the isolationtester lexer

On 21 Feb 2018, at 21:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

When writing an isolation testcase recently I bumped into the 1024 line buffer
size limit in the lexer for my setup block. Adding some stored procedures to
the test makes it quite easy to break 1024 characters, and while these could be
added as steps it, it’s not a good workaround since the permutation order
becomes trickier (and more set in stone). As far as I can see in the history,
this limit is chosen as a decent sized buffer and not rooted in a specific
requirement, so I propose to bump it slightly to 2048 instead (an equally
arbitrarily chosen number). Is there a reason to keep it at 1024 that I’m
missing?

I can't think of one; but I wonder if it's worth working a bit harder and
removing the fixed limit altogether, probably by using a PQExpBuffer.
If you've hit 1024 today, somebody will bump up against 2048 tomorrow.

The thought did cross my mind, but I opted for the simple hack first. I can
take a stab at using a PQExpBuffer to see where that leads.

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies. Since SQL comments will be counted towards the line
buffer, and sent with the command, supporting both kinds of comments seems
reasonable and consistent.

Hmm, not sure this is a good idea. # is a valid SQL operator name, so
doing this would create some risk of breaking legal queries. Admittedly,
those operators are rare enough that maybe nobody would ever need them in
isolationtester scripts, but I'm not sure that providing an additional
way to spell "comment" is worth that.

Good point, didn’t think about that.

cheers ./daniel

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: Two small patches for the isolationtester lexer

I wrote;

Daniel Gustafsson <daniel@yesql.se> writes:

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies.

Hmm, not sure this is a good idea. # is a valid SQL operator name, so
doing this would create some risk of breaking legal queries.

Actually, looking closer, this would also trigger on '#' used inside a
SQL literal, which seems to move the problem cases into the "pretty
likely" category instead of the "far-fetched" one. So I'd only be OK
with it if we made the lexer smart enough to distinguish inside-a-SQL-
literal from not. That might be a good thing anyway, since it would
allow us to not choke on literals containing '}', but it'd be a great
deal more work. You might be able to steal code from the psql lexer
though.

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#3)
Re: Two small patches for the isolationtester lexer

Daniel Gustafsson <daniel@yesql.se> writes:

On 21 Feb 2018, at 21:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I can't think of one; but I wonder if it's worth working a bit harder and
removing the fixed limit altogether, probably by using a PQExpBuffer.
If you've hit 1024 today, somebody will bump up against 2048 tomorrow.

The thought did cross my mind, but I opted for the simple hack first. I can
take a stab at using a PQExpBuffer to see where that leads.

Another idea is just to teach addlitchar to realloc the buffer bigger
when necessary.

Show quoted text

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies. Since SQL comments will be counted towards the line
buffer, and sent with the command, supporting both kinds of comments seems
reasonable and consistent.

Hmm, not sure this is a good idea. # is a valid SQL operator name, so
doing this would create some risk of breaking legal queries. Admittedly,
those operators are rare enough that maybe nobody would ever need them in
isolationtester scripts, but I'm not sure that providing an additional
way to spell "comment" is worth that.

Good point, didn’t think about that.

cheers ./daniel

#6Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#5)
1 attachment(s)
Re: Two small patches for the isolationtester lexer

On 22 Feb 2018, at 05:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

On 21 Feb 2018, at 21:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I can't think of one; but I wonder if it's worth working a bit harder and
removing the fixed limit altogether, probably by using a PQExpBuffer.
If you've hit 1024 today, somebody will bump up against 2048 tomorrow.

The thought did cross my mind, but I opted for the simple hack first. I can
take a stab at using a PQExpBuffer to see where that leads.

Another idea is just to teach addlitchar to realloc the buffer bigger
when necessary.

I think this is the best approach for the task, the attached patch changes the
static allocation to instead realloc when required. Having an upper limit on
the allocation seemed like a good idea to me, but perhaps it’s overthinking and
complicating things for no good reason.

cheers ./daniel

Attachments:

0001-Grow-isolation-spec-scanner-buffer-on-large-SQL-step.patchapplication/octet-stream; name=0001-Grow-isolation-spec-scanner-buffer-on-large-SQL-step.patchDownload
From d862dd97e76b4cec256f90999b049edb6e5bc958 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <daniel@yesql.se>
Date: Wed, 21 Feb 2018 20:42:24 +0100
Subject: [PATCH] Grow isolation spec scanner buffer on large SQL steps

The previous value of 1024 characters for the line buffer is quite
easy to bump into with large setup blocks. In case the SQL step is
larger than the current limit of 1024 chars, reallocate the buffer
until we hit an upper limit of LITBUF_MAX.
---
 src/test/isolation/specscanner.l | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
index 481b32d1d7..6e8e4a6ff0 100644
--- a/src/test/isolation/specscanner.l
+++ b/src/test/isolation/specscanner.l
@@ -12,7 +12,11 @@
 
 static int	yyline = 1;			/* line number for error reporting */
 
-static char litbuf[1024];
+#define LITBUF_MAX	1024 * 1024 * 10
+#define LITBUF_INIT	1024
+
+static size_t litbufsize;
+static char *litbuf = NULL;
 static int litbufpos = 0;
 
 static void addlitchar(char c);
@@ -40,6 +44,10 @@ space			[ \t\r\f]
 comment			("#"{non_newline}*)
 
 %%
+%{
+	litbuf = pg_malloc(LITBUF_INIT);
+	litbufsize = LITBUF_INIT;
+%}
 
 permutation		{ return PERMUTATION; }
 session			{ return SESSION; }
@@ -100,10 +108,15 @@ teardown		{ return TEARDOWN; }
 static void
 addlitchar(char c)
 {
-	if (litbufpos >= sizeof(litbuf) - 1)
+	if (litbufpos >= litbufsize - 1)
 	{
-		fprintf(stderr, "SQL step too long\n");
-		exit(1);
+		litbufsize += litbufsize;
+		if (litbufsize > LITBUF_MAX)
+		{
+			fprintf(stderr, "SQL step too long\n");
+			exit(1);
+		}
+		litbuf = pg_realloc(litbuf, litbufsize);
 	}
 	litbuf[litbufpos++] = c;
 }
-- 
2.14.1.145.gb3622a4ee

#7Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#4)
Re: Two small patches for the isolationtester lexer

On 22 Feb 2018, at 05:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote;

Daniel Gustafsson <daniel@yesql.se> writes:

I also (again) forgot about the # comments not being allowed inside setup and
teardown blocks, so patch 0002 proposes adding support for these as the
documentation implies.

Hmm, not sure this is a good idea. # is a valid SQL operator name, so
doing this would create some risk of breaking legal queries.

Actually, looking closer, this would also trigger on '#' used inside a
SQL literal, which seems to move the problem cases into the "pretty
likely" category instead of the "far-fetched" one. So I'd only be OK
with it if we made the lexer smart enough to distinguish inside-a-SQL-
literal from not. That might be a good thing anyway, since it would
allow us to not choke on literals containing '}', but it'd be a great
deal more work. You might be able to steal code from the psql lexer
though.

I agree, patch 0002 was broken and the correct fix is a much bigger project -
one too big for me to tackle right now (but hopefully at some point in the near
future). Thanks for the review of it though!

cheers ./daniel

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#6)
Re: Two small patches for the isolationtester lexer

Daniel Gustafsson <daniel@yesql.se> writes:

On 22 Feb 2018, at 05:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Another idea is just to teach addlitchar to realloc the buffer bigger
when necessary.

I think this is the best approach for the task, the attached patch changes the
static allocation to instead realloc when required. Having an upper limit on
the allocation seemed like a good idea to me, but perhaps it’s overthinking and
complicating things for no good reason.

Yeah, it doesn't seem to me that we need any fixed limit on the length,
so I deleted that part of the logic, and pushed it with one or two other
cosmetic adjustments.

regards, tom lane

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#7)
Re: Two small patches for the isolationtester lexer

Daniel Gustafsson <daniel@yesql.se> writes:

On 22 Feb 2018, at 05:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Actually, looking closer, this would also trigger on '#' used inside a
SQL literal, which seems to move the problem cases into the "pretty
likely" category instead of the "far-fetched" one. So I'd only be OK
with it if we made the lexer smart enough to distinguish inside-a-SQL-
literal from not. That might be a good thing anyway, since it would
allow us to not choke on literals containing '}', but it'd be a great
deal more work. You might be able to steal code from the psql lexer
though.

I agree, patch 0002 was broken and the correct fix is a much bigger project -
one too big for me to tackle right now (but hopefully at some point in the near
future). Thanks for the review of it though!

OK. I'm going to mark this commitfest entry closed, since the other patch
is in and this one needs a good bit more work. Please start a new thread,
or at least a new CF entry, if you do more work in this area.

regards, tom lane

#10Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#9)
Re: Two small patches for the isolationtester lexer

On 01 Mar 2018, at 06:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:

I agree, patch 0002 was broken and the correct fix is a much bigger project -
one too big for me to tackle right now (but hopefully at some point in the near
future). Thanks for the review of it though!

OK. I'm going to mark this commitfest entry closed, since the other patch
is in and this one needs a good bit more work. Please start a new thread,
or at least a new CF entry, if you do more work in this area.

I absolutely agree, thanks!

cheers ./daniel