Allowing additional commas between columns, and at the end of the SELECT clause

Started by Artur Formellaalmost 2 years ago8 messageshackers
Jump to latest
#1Artur Formella
artur.formella3@gmail.com

Hello!
I have created a patch to allow additional commas between columns, and
at the end of the SELECT clause.

Motivation:
Commas of this type are allowed in many programming languages, in some
it is even recommended to use them at the ends of lists or objects. A
new generation of programmers expects a more forgiving language just as
our generation enjoyed LIMIT and the ability to write `select` in lowercase.

Accepted:
    SELECT 1,;
    SELECT 1,,,,,;
    SELECT *, from information_schema.sql_features;
    (...) RETURNING a,,b,c,;

Not accepted:
    SELECT ,;
    SELECT ,1;
    SELECT ,,,;

Advantages:
- simplifies the creation and debugging of queries by reducing the most
common syntax error,
- eliminates the need to use the popular `1::int as dummy` at the end of
a SELECT list,
- simplifies query generators,
- the query is still deterministic,

Disadvantages:
- counting of returned columns can be difficult,
- syntax checkers will still report errors,
- probably not SQL standard compliant,
- functionality can be controversial,

I attach the patch along with the tests.

What do you think?

Your opinions are very much welcome!

Attachments:

0001-additional_commas.patchtext/plain; charset=UTF-8; name=0001-additional_commas.patchDownload+69-4
#2Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Artur Formella (#1)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

On Mon, 13 May 2024 at 10:42, Artur Formella <artur.formella3@gmail.com> wrote:

Motivation:
Commas of this type are allowed in many programming languages, in some
it is even recommended to use them at the ends of lists or objects.

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

Accepted:
SELECT 1,;
SELECT 1,,,,,;
SELECT *, from information_schema.sql_features;
(...) RETURNING a,,b,c,;

Not accepted:
SELECT ,;
SELECT ,1;
SELECT ,,,;

Advantages:
- simplifies the creation and debugging of queries by reducing the most
common syntax error,
- eliminates the need to use the popular `1::int as dummy` at the end of
a SELECT list,

This is the first time I've heard of this `1 as dummy`.

- simplifies query generators,
- the query is still deterministic,

What part of a query would (or would not) be deterministic? I don't
think I understand the potential concern here. Is it about whether the
statement can be parsed deterministically?

Disadvantages:
- counting of returned columns can be difficult,
- syntax checkers will still report errors,
- probably not SQL standard compliant,

I'd argue you better raise this with the standard committee if this
isn't compliant. I don't see enough added value to break standard
compliance here, especially when the standard may at some point allow
only a single trailing comma (and not arbitrarily many).

What do you think?

Do you expect `SELECT 1,,,,,,,` to have an equivalent query identifier
to `SELECT 1;` in pg_stat_statements? Why, or why not?

Overall, I don't think unlimited commas is a good feature. A trailing
comma in the select list would be less problematic, but I'd still want
to follow the standard first and foremost.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#3Étienne BERSAC
etienne.bersac@dalibo.com
In reply to: Artur Formella (#1)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

Hi,

As a developer, I love this feature.

But as a developer of an universal TDOP SQL parser[1]https://gitlab.com/dalibo/transqlate, this can be a
pain. Please request it to the standard.

Regards,
Étienne

[1]: https://gitlab.com/dalibo/transqlate

In reply to: Matthias van de Meent (#2)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

On Mon, 13 May 2024 at 10:42, Artur Formella <artur.formella3@gmail.com> wrote:

Motivation:
Commas of this type are allowed in many programming languages, in some
it is even recommended to use them at the ends of lists or objects.

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

The only one I can think of is Perl, which I'm not sure counts as
popular any more. JavaScript allows consecutive commas in array
literals, but they're not no-ops, they create empty array slots:

❯ js
Welcome to Node.js v18.19.0.
Type ".help" for more information.

[1,,2,,]

[ 1, <1 empty item>, 2, <1 empty item> ]

- ilmari

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#4)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

The only one I can think of is Perl, which I'm not sure counts as
popular any more. JavaScript allows consecutive commas in array
literals, but they're not no-ops, they create empty array slots:

I'm fairly down on this idea for SQL, because I think it creates
ambiguity for the ROW() constructor syntax. That is:

(x,y) is understood to be shorthand for ROW(x,y)

(x) is not ROW(x), it's just x

(x,) means what?

I realize the original proposal intended to restrict the legality of
excess commas to only a couple of places, but to me that just flags
it as a kluge. ROW(...) ought to work pretty much the same as a
SELECT list.

As already mentioned, if you can get some variant of this through the
SQL standards process, we'll probably adopt it. But I doubt that we
want to get out front of the committee in this area.

regards, tom lane

In reply to: Tom Lane (#5)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

Tom Lane <tgl@sss.pgh.pa.us> writes:

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

The only one I can think of is Perl, which I'm not sure counts as
popular any more. JavaScript allows consecutive commas in array
literals, but they're not no-ops, they create empty array slots:

I'm fairly down on this idea for SQL, because I think it creates
ambiguity for the ROW() constructor syntax. That is:

(x,y) is understood to be shorthand for ROW(x,y)

(x) is not ROW(x), it's just x

(x,) means what?

Python has a similar issue: (x, y) is a tuple, but (x) is just x, and
they use the trailing comma to disambiguate, so (x,) creates a
single-item tuple. AFAIK it's the only place where the trailing comma
is significant.

I realize the original proposal intended to restrict the legality of
excess commas to only a couple of places, but to me that just flags
it as a kluge. ROW(...) ought to work pretty much the same as a
SELECT list.

Yeah, a more principled approach would be to not special-case target
lists, but to allow one (and only one) trailing comma everywhere:
select, order by, group by, array constructors, row constructors,
everything that looks like a function call, etc.

As already mentioned, if you can get some variant of this through the
SQL standards process, we'll probably adopt it. But I doubt that we
want to get out front of the committee in this area.

Agreed.

regards, tom lane

- ilmari

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dagfinn Ilmari Mannsåker (#6)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes:

Tom Lane <tgl@sss.pgh.pa.us> writes:

I'm fairly down on this idea for SQL, because I think it creates
ambiguity for the ROW() constructor syntax. That is:
(x,y) is understood to be shorthand for ROW(x,y)
(x) is not ROW(x), it's just x
(x,) means what?

Python has a similar issue: (x, y) is a tuple, but (x) is just x, and
they use the trailing comma to disambiguate, so (x,) creates a
single-item tuple. AFAIK it's the only place where the trailing comma
is significant.

Ugh :-(. The semantic principle I'd prefer to have here is "a trailing
comma is ignored", but what they did breaks that. But then again,
I'm not particularly a fan of anything about Python's syntax.

Yeah, a more principled approach would be to not special-case target
lists, but to allow one (and only one) trailing comma everywhere:
select, order by, group by, array constructors, row constructors,
everything that looks like a function call, etc.

If it can be made to work everywhere, that would get my vote.
I'm not sure if any other ambiguities arise, though. SQL has
a lot of weird syntax corners (and the committee keeps adding
more :-().

regards, tom lane

#8Artur Formella
artur.formella3@gmail.com
In reply to: Matthias van de Meent (#2)
Re: Allowing additional commas between columns, and at the end of the SELECT clause

On 13.05.2024 11:24, Matthias van de Meent wrote:

On Mon, 13 May 2024 at 10:42, Artur Formella<artur.formella3@gmail.com> wrote:

Motivation:
Commas of this type are allowed in many programming languages, in some
it is even recommended to use them at the ends of lists or objects.

Single trailing commas are a feature that's more and more common in
languages, yes, but arbitrary excess commas is new to me. Could you
provide some examples of popular languages which have that, as I can't
think of any.

Thank for your comment.
I meant commas are recommended at the end of the list. Sorry for the
lack of precision.
Typescript has a popular directive "rules": { "trailing-comma": false }
in the tslint.json file, which forces trailing commas. Popular Airbnb
coding style require trailing commas by eslint
(https://github.com/airbnb/javascript?tab=readme-ov-file#functions--signature-invocation-indentation).

This is the first time I've heard of this `1 as dummy`.

dummy column is a popular way to end SELECT list on R&D phase to avoid
the most common syntax error. This way you don't have to pay attention
to commas.

SELECT <hacking /> , 1::int AS ignoreme FROM <hacking />

- simplifies query generators,
- the query is still deterministic,

What part of a query would (or would not) be deterministic? I don't
think I understand the potential concern here. Is it about whether the
statement can be parsed deterministically?

Bison doesn't report error or conflict.

I'd argue you better raise this with the standard committee if this
isn't compliant. I don't see enough added value to break standard
compliance here, especially when the standard may at some point allow
only a single trailing comma (and not arbitrarily many).

Do you expect `SELECT 1,,,,,,,` to have an equivalent query identifier
to `SELECT 1;` in pg_stat_statements? Why, or why not?

I don't know, I have a feeling that the queries are equivalent, but I
don't know the mechanism.

Overall, I don't think unlimited commas is a good feature. A trailing
comma in the select list would be less problematic, but I'd still want
to follow the standard first and foremost.

I will prepare a patch with trailing comma only tomorrow.

Thank you.

Artur