Leading comments and client applications

Started by Philip Semanchukabout 4 years ago11 messagesgeneral
Jump to latest
#1Philip Semanchuk
philip@americanefficient.com

Hi,
I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding.

Here's the contents of foo.sql --

-- this is a comment
CREATE FUNCTION foo(bar text) RETURNS text AS $$
SELECT bar
$$
LANGUAGE sql IMMUTABLE PARALLEL SAFE
;

When I feed that to 'psql -f foo.sql', the function is created as I expect. In the Postgres log, the leading comment *doesn't* appear. I see the same behavior if I just copy/paste the function into psql.

Our test system uses Python 3.8, SQLAlchemy 1.3.6, and psycopg 2.8.5, and when our test harness reads foo.sql and passes it to SQLAlchemy's execute(), I can see in the Postgres log that the leading comment is *not* stripped, and the function isn't created.

The server is Postgres 11.

My naive interpretation is that one of the client layers (SQLAlchemy or psycopg2) should be stripping the leading comment but isn't, but that seems like a lot of responsibility to push onto a client application. I figured that would be the responsibility of the Postgres parser.

I'd be grateful for any insights about what I'm missing.

Thanks
Philip

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Philip Semanchuk (#1)
Re: Leading comments and client applications

On Fri, Mar 25, 2022 at 8:32 AM Philip Semanchuk <
philip@americanefficient.com> wrote:

Here's the contents of foo.sql --

-- this is a comment
CREATE FUNCTION foo(bar text) RETURNS text AS $$
SELECT bar
$$
LANGUAGE sql IMMUTABLE PARALLEL SAFE
;

When I feed that to 'psql -f foo.sql', the function is created as I
expect. In the Postgres log, the leading comment *doesn't* appear. I see
the same behavior if I just copy/paste the function into psql.

Our test system uses Python 3.8, SQLAlchemy 1.3.6, and psycopg 2.8.5, and
when our test harness reads foo.sql and passes it to SQLAlchemy's
execute(), I can see in the Postgres log that the leading comment is *not*
stripped, and the function isn't created.

I think you need to provide these log entries you are referring to.

The comment form using the -- prefix ends at the first newline
encountered. This is server behavior, not client-side [1]https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-COMMENTS. But the
comment doesn't actually belong with any individual command, it (the line)
is simply ignored by the server when seen.

I suspect that the newline is being removed in order by SQLAlchemy to do
all of its helpful stuff, like statement caching. I also suspect that you
are probably mis-using the driver since the execute(string) method is
marked deprecated [2]https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Connection.execute, possibly for this very reason, but you also haven't
shown that code so it is hard to say (I don't actually use the tool myself
though).

[1]: https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-COMMENTS
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-COMMENTS
[2]: https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Connection.execute
https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Connection.execute

David J.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Philip Semanchuk (#1)
Re: Leading comments and client applications

Philip Semanchuk <philip@americanefficient.com> writes:

I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding.

Are you certain there's actually a newline after the comment?
The easiest explanation for this would be if something in the
SQLAlchemy code path were munging the newline.

A completely different line of thought is that the function
*does* get created, but inside a transaction that never
gets committed. Either way, I think this is mainly a SQLAlchemy
question not a Postgres question.

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.
The server is perfectly capable of ignoring those by itself,
though. (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)

regards, tom lane

#4Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#3)
Re: Leading comments and client applications

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.

That story aught to be worth a $beer or two

Show quoted text

The server is perfectly capable of ignoring those by itself,
though. (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)

regards, tom lane

#5Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Philip Semanchuk (#1)
Re: Leading comments and client applications

On 2022-03-25 11:32:24 -0400, Philip Semanchuk wrote:

I'm trying to understand a behavior where, with our Postgres client, a
leading comment in a SQL script causes the CREATE FUNCTION statement
following it to be not executed. I can't figure out if this is a bug
somewhere or just a misunderstanding on my part. I would appreciate
some help understanding.

Here's the contents of foo.sql --

-- this is a comment
CREATE FUNCTION foo(bar text) RETURNS text AS $$
SELECT bar
$$
LANGUAGE sql IMMUTABLE PARALLEL SAFE
;

When I feed that to 'psql -f foo.sql', the function is created as I
expect. In the Postgres log, the leading comment *doesn't* appear. I
see the same behavior if I just copy/paste the function into psql.

Our test system uses Python 3.8, SQLAlchemy 1.3.6, and psycopg 2.8.5,
and when our test harness reads foo.sql and passes it to SQLAlchemy's
execute(), I can see in the Postgres log that the leading comment is
*not* stripped, and the function isn't created.

I cannot reproduce this with plain psycopg:

% cat foo
#!/usr/bin/python3

import psycopg2

db = psycopg2.connect("")
csr = db.cursor()

csr.execute(
"""
-- this is a comment
CREATE FUNCTION foo(bar text) RETURNS text AS $$
SELECT bar
$$
LANGUAGE sql IMMUTABLE PARALLEL SAFE
""")
db.commit()
% ./foo
% psql
psql (13.6 (Ubuntu 13.6-1.pgdg20.04+1), server 11.15 (Ubuntu 11.15-1.pgdg20.04+1))
Type "help" for help.

hjp=> \df foo
List of functions
╔════════╤══════╤══════════════════╤═════════════════════╤══════╗
║ Schema │ Name │ Result data type │ Argument data types │ Type ║
╟────────┼──────┼──────────────────┼─────────────────────┼──────╢
║ public │ foo │ text │ bar text │ func ║
╚════════╧══════╧══════════════════╧═════════════════════╧══════╝
(1 row)

hjp=> select foo('x*');
╔═════╗
║ foo ║
╟─────╢
║ x* ║
╚═════╝
(1 row)

Time: 1.296 ms
hjp=> \q

So like others I suspect that SQLAlchemy is doing something weird here.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

#6Philip Semanchuk
philip@americanefficient.com
In reply to: Tom Lane (#3)
Re: Leading comments and client applications

On Mar 25, 2022, at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Philip Semanchuk <philip@americanefficient.com> writes:

I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding.

Are you certain there's actually a newline after the comment?
The easiest explanation for this would be if something in the
SQLAlchemy code path were munging the newline.

I verified that there is a newline after the comment. But yes, thanks to your suggestion and others, I was able to narrow this down to something in SQLAlchemy behavior. In case anyone else comes across this and is wondering --

In addition to accepting a plain string, execute() accepts a number of different SQLAlchemy data types, including TextClause and DDLElement. We used to pass a DDLElement to execute(), but a few months ago we switched to passing a TextClause because DDLElement interprets % signs anywhere in SQL scripts as Python string interpolation markers and that was causing us headaches in some scripts. Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover. I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected.

https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.Connection.execute

As David J pointed out, execute() is deprecated as of version 1.4. We’re still on 1.3 but we’ll have to move away from this code eventually so maybe this is a good inspiration to move away from execute() now and reduce the number of deprecation warnings we have to deal with in the future.

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.
The server is perfectly capable of ignoring those by itself,
though. (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)

Thanks, this is most helpful. I use psql to double check I think SQLAlchemy is doing something odd. It’s good to know that psql's behavior in this case is a choice and not required behavior for clients. Peter J. Holzer’s psycopg2 example could have showed me the same; I wish I had thought of that.

I appreciate all the help!

Cheers
Philip

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rob Sargent (#4)
Re: Leading comments and client applications

Rob Sargent <robjsargent@gmail.com> writes:

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.

That story aught to be worth a $beer or two

Hmm. The original reasoning is lost in the mists of time;
I dug in our git history and traced the behavior as far back
as a45195a19 of 1999-11-04, but I'll bet a nickel that Peter
doesn't remember exactly why he did that.

But I can show you why I gave up on removing the behavior:
it's an important part of psql's strategy of discarding
leading whitespace before a query. Our regression test
scripts are full of cases like

-- comments here

SELECT intentionally-wrong-query;

and what they're expecting to get from that is output like

ERROR: column "intentionally" does not exist
LINE 1: SELECT intentionally-wrong-query;
^

When I changed psql's parser to not remove comments, that output
suddenly changed to say "LINE 3:", because now the query string
sent to the server included the "-- comments here" line as well
as the blank line after it. While we could have changed all the
expected output, or changed how the server counts lines within
a query, we concluded that this would confuse too many people and
break too many applications; so we left it alone.

(As of v15, psql will send -- comments that begin *after* the
first non-whitespace token of a query [1]https://git.postgresql.org/gitweb/?p=postgresql.git&amp;a=commitdiff&amp;h=83884682f4df96184549b91869a1cf79dafb4f94. But leading comments
and whitespace will still get stripped.)

regards, tom lane

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git&amp;a=commitdiff&amp;h=83884682f4df96184549b91869a1cf79dafb4f94

#8Rob Sargent
robjsargent@gmail.com
In reply to: Tom Lane (#7)
Re: Leading comments and client applications

On 3/25/22 13:30, Tom Lane wrote:

Rob Sargent<robjsargent@gmail.com> writes:

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.

That story aught to be worth a $beer or two

Hmm. The original reasoning is lost in the mists of time;
I dug in our git history and traced the behavior as far back
as a45195a19 of 1999-11-04, but I'll bet a nickel that Peter
doesn't remember exactly why he did that.

But I can show you why I gave up on removing the behavior:
it's an important part of psql's strategy of discarding
leading whitespace before a query. Our regression test
scripts are full of cases like

-- comments here

SELECT intentionally-wrong-query;

and what they're expecting to get from that is output like

ERROR: column "intentionally" does not exist
LINE 1: SELECT intentionally-wrong-query;
^

When I changed psql's parser to not remove comments, that output
suddenly changed to say "LINE 3:", because now the query string
sent to the server included the "-- comments here" line as well
as the blank line after it. While we could have changed all the
expected output, or changed how the server counts lines within
a query, we concluded that this would confuse too many people and
break too many applications; so we left it alone.

(As of v15, psql will send -- comments that begin *after* the
first non-whitespace token of a query [1]. But leading comments
and whitespace will still get stripped.)

regards, tom lane

[1]https://git.postgresql.org/gitweb/?p=postgresql.git&amp;a=commitdiff&amp;h=83884682f4df96184549b91869a1cf79dafb4f94

Thank you for the indulgence! I clearly owe you (another) one.

#9Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Rob Sargent (#8)
Re: Leading comments and client applications

On Fri, 2022-03-25 at 13:35 -0600, Rob Sargent wrote:

On 3/25/22 13:30, Tom Lane wrote:

Bob Sargent <robjsargent@gmail.com> writes:

That story aught to be worth a $beer or two

Hmm. [story]

 Thank you for the indulgence! I clearly owe you (another) one.

I think Tom prefers wine: /messages/by-id/14929.1358317689@sss.pgh.pa.us

Yours,
Laurenz Albe

#10Philippe Doussot
philippe.doussot@arche-mc2.fr
In reply to: Philip Semanchuk (#6)
Re: Leading comments and client applications

Hi,

I was able to narrow this down to something in SQLAlchemy behavior.

Fine :)

Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover.
I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected.

Alternate option if you want continue to use  TextClause:

use /* comment */ for first prefix comment.

Comment is logged and query executed (tested on Java ( not on SQLAlchemy )).
We use it to track back the request id executed like that

query = em.createNativeQuery("/*requete_enregistree_num_" + requete.getId() + "*/ " + requete.getReqRequete().trim());

Philippe

On 25/03/2022 19:05, Philip Semanchuk wrote:

On Mar 25, 2022, at 11:59 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Philip Semanchuk<philip@americanefficient.com> writes:

I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding.

Are you certain there's actually a newline after the comment?
The easiest explanation for this would be if something in the
SQLAlchemy code path were munging the newline.

I verified that there is a newline after the comment. But yes, thanks to your suggestion and others, I was able to narrow this down to something in SQLAlchemy behavior. In case anyone else comes across this and is wondering --

In addition to accepting a plain string, execute() accepts a number of different SQLAlchemy data types, including TextClause and DDLElement. We used to pass a DDLElement to execute(), but a few months ago we switched to passing a TextClause because DDLElement interprets % signs anywhere in SQL scripts as Python string interpolation markers and that was causing us headaches in some scripts. Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover. I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected.

https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.Connection.execute

As David J pointed out, execute() is deprecated as of version 1.4. We’re still on 1.3 but we’ll have to move away from this code eventually so maybe this is a good inspiration to move away from execute() now and reduce the number of deprecation warnings we have to deal with in the future.

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.
The server is perfectly capable of ignoring those by itself,
though. (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)

Thanks, this is most helpful. I use psql to double check I think SQLAlchemy is doing something odd. It’s good to know that psql's behavior in this case is a choice and not required behavior for clients. Peter J. Holzer’s psycopg2 example could have showed me the same; I wish I had thought of that.

I appreciate all the help!

Cheers
Philip

--

📌 Le nom de domaine de nos adresses mails évolue et devient @arche-mc2.fr.

My Logo

arche-mc2.fr <https://cloud.letsignit.com/collect/bc/61f021ab21488edd71fe55fe?p=T6Q_2TuhlnOa6Vsx6-MT2IBatnqI7vmjUul2-_EjYgUpYaHQ5q1LqjIR9-pxUHfikOYW7y2eM8WGWMEw6PlnMfKnTaOZP-y2iUHa7jlBrUK_DYDxjdcFR32CefiLPg-hrxT5OaW-c3bv9zu5BEdrLA==&gt;

Philippe DOUSSOT

ARCHITECTE TECHNIQUE

DIRECTION DES SOLUTIONS ARCHE MC2 DOMICILE

philippe.doussot@arche‑mc2.fr

#11Philip Semanchuk
philip@americanefficient.com
In reply to: Philippe Doussot (#10)
Re: Leading comments and client applications

On Mar 28, 2022, at 5:42 AM, Philippe Doussot <philippe.doussot@arche-mc2.fr> wrote:

Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover.
I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected.

Alternate option if you want continue to use TextClause:

use /* comment */ for first prefix comment.

Comment is logged and query executed (tested on Java ( not on SQLAlchemy )).
We use it to track back the request id executed like that

query = em.createNativeQuery("/*requete_enregistree_num_" + requete.getId() + "*/ " + requete.getReqRequete().trim());

Thanks for the suggestion! In my testing, both single line and multiline comment blocks cause the same problem for me. I *was* able to resolve this with a simple change. I was calling SQLAlchemy’s engine.execute(). When I call connection.execute() instead, the problem resolves. This also solves a future deprecation problem for us. engine.execute() is deprecated in SQLAlchemy 1.4, but connection.execute() is not.

I didn’t expect this to fix the problem. There’s no difference in the Postgres log that I can see, so I think the SQL that SQLAlchemy sends to postgres is the same. If it’s a commit/transaction problem, it should affect all of our functions equally, not just the ones that start with comments.

I clearly don’t understand this problem fully. Although I'm curious about it, I’m eager to move on to other things. I plan to proceed with this fix and not investigate any more.

THanks everyone for all the help and suggestions

Cheers
Philip

Show quoted text

On 25/03/2022 19:05, Philip Semanchuk wrote:

On Mar 25, 2022, at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us>
wrote:

Philip Semanchuk
<philip@americanefficient.com>
writes:

I'm trying to understand a behavior where, with our Postgres client, a leading comment in a SQL script causes the CREATE FUNCTION statement following it to be not executed. I can't figure out if this is a bug somewhere or just a misunderstanding on my part. I would appreciate some help understanding.

Are you certain there's actually a newline after the comment?
The easiest explanation for this would be if something in the
SQLAlchemy code path were munging the newline.

I verified that there is a newline after the comment. But yes, thanks to your suggestion and others, I was able to narrow this down to something in SQLAlchemy behavior. In case anyone else comes across this and is wondering --

In addition to accepting a plain string, execute() accepts a number of different SQLAlchemy data types, including TextClause and DDLElement. We used to pass a DDLElement to execute(), but a few months ago we switched to passing a TextClause because DDLElement interprets % signs anywhere in SQL scripts as Python string interpolation markers and that was causing us headaches in some scripts. Something about the way TextClause changes the raw SQL string causes the behavior I’m seeing, although we didn’t notice it at the time of the changeover. I don’t know what exactly it’s doing yet, but when I switch back to passing a DDLElement to execute(), my SQL function is created as I expected.

https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.Connection.execute

As David J pointed out, execute() is deprecated as of version 1.4. We’re still on 1.3 but we’ll have to move away from this code eventually so maybe this is a good inspiration to move away from execute() now and reduce the number of deprecation warnings we have to deal with in the future.

As far as the comparison behavior goes, psql's parser strips
comments that start with double dashes, for $obscure_reasons.
The server is perfectly capable of ignoring those by itself,
though. (Awhile back I tried to remove that psql behavior,
but it caused too much churn in our regression tests.)

Thanks, this is most helpful. I use psql to double check I think SQLAlchemy is doing something odd. It’s good to know that psql's behavior in this case is a choice and not required behavior for clients. Peter J. Holzer’s psycopg2 example could have showed me the same; I wish I had thought of that.

I appreciate all the help!

Cheers
Philip

--

📌 Le nom de domaine de nos adresses mails évolue et devient @arche-mc2.fr.

arche-mc2.fr

Philippe DOUSSOT

ARCHITECTE TECHNIQUE

DIRECTION DES SOLUTIONS ARCHE MC2 DOMICILE

philippe.doussot@arche‑mc2.fr