\crosstabview fixes

Started by Tom Laneabout 10 years ago17 messageshackers

tgl@sss.pgh.pa.us

about 10 years ago

I noticed that the \crosstabview documentation asserts that column name
arguments are handled per standard SQL semantics. In point of fact,
though, the patch expends a couple hundred lines to implement what is
NOT standard SQL semantics: matching unquoted names case-insensitively
is anything but that. I think we should rip all that out and do it as
per attached. (I also took the trouble to make the error messages conform
to project style.)

Comments/objections?

regards, tom lane

Daniel Verite

daniel@manitou-mail.org

about 10 years ago

In reply to: Tom Lane (#1)

Re: \crosstabview fixes

Tom Lane wrote:

I noticed that the \crosstabview documentation asserts that column name
arguments are handled per standard SQL semantics. In point of fact,
though, the patch expends a couple hundred lines to implement what is
NOT standard SQL semantics: matching unquoted names case-insensitively
is anything but that. I think we should rip all that out and do it as
per attached. (I also took the trouble to make the error messages conform
to project style.)

Comments/objections?

+1 for ripping it out in favor of the option parser with OT_SQLID,
but then there's a problem with the colH:scolH syntax.
For instance when refering to columns by positions, with the
patch applied, it accepts this invocation without error:
\crosstabview 1 "2:4" 3
whereas before it produced this error:
Invalid column name: "2:4"

To avoid the confusion between "2:4" and "2":"4" or 2:4,
and the ambiguity with a possibly existing "2:4" column,
maybe we should abandon this syntax and require the optional
scolH to be on its own at the end of the command.

The simplified invocation of the command would be
\crosstabview colV colH colD [scolH]
(without any default, just scolH being optional):

Or if it's preferrable to have colH just near scolH:
\crosstabview colD colV colH [scolH]

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Alvaro Herrera

alvherre@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#1)

Re: \crosstabview fixes

Tom Lane wrote:

I noticed that the \crosstabview documentation asserts that column name
arguments are handled per standard SQL semantics. In point of fact,
though, the patch expends a couple hundred lines to implement what is
NOT standard SQL semantics: matching unquoted names case-insensitively
is anything but that. I think we should rip all that out and do it as
per attached.

Ah, yeah, I hadn't realized this bogosity. Haven't verified the patch in
detail.

(I also took the trouble to make the error messages conform
to project style.)

Not sure about this part. Many psql error messages are full sentences (start
with uppercase, end in period); others start with the \ command being
complained about. Compare

alvherre=# \foobar
Invalid command \foobar. Try \? for help.
alvherre=# \copy foobar
\copy: parse error at end of line

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Daniel Verite (#2)

Re: \crosstabview fixes

"Daniel Verite" <daniel@manitou-mail.org> writes:

To avoid the confusion between "2:4" and "2":"4" or 2:4,
and the ambiguity with a possibly existing "2:4" column,
maybe we should abandon this syntax and require the optional
scolH to be on its own at the end of the command.

That would be OK with me; it's certainly less of a hack than what's
there now. (I went back and forth about how much effort to put into
dealing with the colon syntax; I think the version I have in my patch
would be all right, but it's not perfect.)

The simplified invocation of the command would be
\crosstabview colV colH colD [scolH]
(without any default, just scolH being optional):

Or if it's preferrable to have colH just near scolH:
\crosstabview colD colV colH [scolH]

Uh, why not

\crosstabview [ colV colH [ colD [ scolH ]]]

I see no particular reason that the existing abbreviation styles aren't
good. In any case, forcing colD to be specified is kind of annoying ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Alvaro Herrera (#3)

Re: \crosstabview fixes

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

(I also took the trouble to make the error messages conform
to project style.)

Not sure about this part. Many psql error messages are full sentences (start
with uppercase, end in period); others start with the \ command being
complained about. Compare

alvherre=# \foobar
Invalid command \foobar. Try \? for help.
alvherre=# \copy foobar
\copy: parse error at end of line

Well, the first of those is meant to be two full sentences, though they
are a bit abbreviated.

I think putting "\crosstabview: " in front of all the error messages
in this patch would be a fine idea, though. Will do that unless there
are further objections.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tom Lane (#4)

Re: \crosstabview fixes

I wrote:

"Daniel Verite" <daniel@manitou-mail.org> writes:

To avoid the confusion between "2:4" and "2":"4" or 2:4,
and the ambiguity with a possibly existing "2:4" column,
maybe we should abandon this syntax and require the optional
scolH to be on its own at the end of the command.

That would be OK with me; it's certainly less of a hack than what's
there now. (I went back and forth about how much effort to put into
dealing with the colon syntax; I think the version I have in my patch
would be all right, but it's not perfect.)

Here's a patch along those lines. Any objections?

regards, tom lane

Christoph Berg

myon@debian.org

about 10 years ago

In reply to: Tom Lane (#6)

Re: \crosstabview fixes

Re: Tom Lane 2016-04-14 <15673.1460592362@sss.pgh.pa.us>

Here's a patch along those lines. Any objections?

<term><literal>\crosstabview [
<replaceable class="parameter">colV</replaceable>
! [ <replaceable class="parameter">colH</replaceable>
! [ <replaceable class="parameter">colD</replaceable>
! [ <replaceable class="parameter">scolH</replaceable>
! ] ] ] ] </literal></term>

Maybe use "sortcolH" to make it immediately clear what it does?

When I first read the docs it seemed to me that this scolH thing would
introduce a lot of magic until I finished the text, sortcolH would
have made me more comfortable with it.

Christoph

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Daniel Verite

daniel@manitou-mail.org

about 10 years ago

In reply to: Tom Lane (#6)

Re: \crosstabview fixes

Tom Lane wrote:

That would be OK with me; it's certainly less of a hack than what's
there now. (I went back and forth about how much effort to put into
dealing with the colon syntax; I think the version I have in my patch
would be all right, but it's not perfect.)

Here's a patch along those lines. Any objections?

There's the issue that it can no longer distinguish between numbers as
column positions and column names that consist entirely of numbers.

For example, before the patch:
=# SELECT 'a' as "4", 'b' as "5", 'x'
\crosstabview "4" "5"
4 | b
---+---
a | x

After the patch:
=# SELECT 'a' as "4", 'b' as "5", 'x' \crosstabview "4" "5"
\crosstabview: invalid column number: "4"

crosstabview's parseColumnRefs() knows that "4" is a column
name because of the quotes, but once it's replaced by the psql
ident parser, I guess the quotes are removed and the argument
becomes a bare number.

I don't quite see how to work around that, short of simply
removing the possibility of addressing columns by their
numbers. Which maybe is a bit sad for the end user, I'm not
sure, but ISTM that's a logical consequence of abandoning
the dedicated parser for columns.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Christoph Berg

myon@debian.org

about 10 years ago

In reply to: Daniel Verite (#8)

Re: \crosstabview fixes

Re: Daniel Verite 2016-04-14 <bc3a2e99-5030-4466-a248-98c5e9841205@mm>

I don't quite see how to work around that, short of simply
removing the possibility of addressing columns by their
numbers. Which maybe is a bit sad for the end user, I'm not
sure, but ISTM that's a logical consequence of abandoning
the dedicated parser for columns.

That would be bad news, given that \crosstabview is meant for
interactive use where these number shortcuts are much more likely to
be used than in proper production SQL code. Be it only for ease of
typing, or for the case where the columns are just called ?column?.

If there's no way out, what about changing it the other way, i.e.
breaking the case where the column is named by a number? That seems
much less of a problem in practice.

Christoph

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Daniel Verite

daniel@manitou-mail.org

about 10 years ago

In reply to: Christoph Berg (#9)

Re: \crosstabview fixes

Christoph Berg wrote:

I don't quite see how to work around that, short of simply
removing the possibility of addressing columns by their
numbers. [...]

That would be bad news, given that \crosstabview is meant for
interactive use where these number shortcuts are much more likely to
be used than in proper production SQL code. Be it only for ease of
typing, or for the case where the columns are just called ?column?.

Ah, that reminds me that there's another corner case.
In a resultset, two columns can share the exact same name.
I think the only way to distinguish them in that case is
to refer to them by position, so that's another argument
to not discard that possibility.

=# select 'X' as "a", 'Y' as "a", 'Z' \crosstabview a a
Ambiguous column name: a

versus

=# select 'X' as "a", 'Y' as "a", 'x' \crosstabview 1 2
a | Y
---+---
X | x

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Daniel Verite

daniel@manitou-mail.org

about 10 years ago

In reply to: Christoph Berg (#9)

Re: \crosstabview fixes

Christoph Berg wrote:

If there's no way out, what about changing it the other way, i.e.
breaking the case where the column is named by a number? That seems
much less of a problem in practice.

I don't think it would be acceptable.
But there's still the option of keeping the dedicated parser.

crosstabview doc says:
"The usual SQL case folding and quoting rules apply to column names"

Tom objected to that, upthread:

I noticed that the \crosstabview documentation asserts that column
name arguments are handled per standard SQL semantics. In point of
fact, though, the patch expends a couple hundred lines to implement
what is NOT standard SQL semantics: matching unquoted names
case-insensitively is anything but that.

Indeed it differs, but that's not necessarily bad. If there's a FOO
column and it's refered to as \crosstabview foo... it will complain about
an ambiguity only if there's another column that's named foo, or Foo, or
any case variant. Quoting becomes mandatory only in that case,
or of course if the column referred to contains spaces or double
quotes.

For example, both these invocations work:
current=# SELECT 'X' as "FOO", 'Y', 'z' \crosstabview foo 2
FOO | Y
-----+---
X | z

current=# SELECT 'X' as "FOO", 'Y', 'z' \crosstabview FOO 2
FOO | Y
-----+---
X | z

OTOH a detected ambiguity would be:
current=# SELECT 'X' as "FOO", 'Y' as "foo", 'z' \crosstabview FOO 2
Ambiguous column name: FOO

which is solved by quoting the argument:
current=# SELECT 'X' as "FOO", 'Y' as "foo", 'z' \crosstabview "FOO" 2
FOO | Y
-----+---
X | z

Whereas using the generic column parser with Tom's patch, the only
accepted invocation out of the 4 examples above is the last one:

new=# SELECT 'X' as "FOO", 'Y', 'z' \crosstabview foo 2
\crosstabview: column name not found: "foo"

new # SELECT 'X' as "FOO", 'Y', 'z' \crosstabview FOO 2
\crosstabview: column name not found: "foo"

new=# SELECT 'X' as "FOO", 'Y' as "foo", 'z' \crosstabview FOO 2
\crosstabview: vertical and horizontal headers must be different columns

new=# SELECT 'X' as "FOO", 'Y' as "foo", 'z' \crosstabview "FOO" 2
FOO | Y
-----+---
X | z

Personally I prefer the current behavior, but I can also understand
the appeal from a maintainer's point of view of getting rid of it in
favor of already existing, well-tested generic code.
In which case, what the dedicated parser does is a moot point.

However, because of the aforementioned problem of the interpretation
of column names as numbers, maybe the balance comes back to the
dedicated parser? In which case, there's the question of whether how
it handles case folding as shown above is OK, or whether it should
just downcase unquoted identifiers to strictly match SQL rules.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Daniel Verite (#11)

Re: \crosstabview fixes

"Daniel Verite" <daniel@manitou-mail.org> writes:

Christoph Berg wrote:

If there's no way out, what about changing it the other way, i.e.
breaking the case where the column is named by a number? That seems
much less of a problem in practice.

I don't think it would be acceptable.

I had figured it would be ;-) ... but if you object, let's see what
it would take to preserve that.

My feeling is that what we should do is undo the change to use OT_SQLID,
and in indexOfColumn() perform a downcasing/dequoting conversion that
duplicates what OT_SQLID does in psqlscanslash.l. That only adds a couple
dozen lines of code, so it will still be way smaller than the existing
logic in crosstabview.c, and it will match the behavior of the rest of
psql. By postponing that till after the "is it a number" check, we
restore the current behavior that a quoted number won't be taken as a
column number.

(Another approach would be to somehow modify psqlscanslash.l's API so that
we could tell externally whether an argument had been quoted or not. But
that would be much more invasive, and I doubt it's worth it for this one
use-case.)

I'll have a patch after I finish catching up the rest of my mail.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tom Lane (#12)

Re: \crosstabview fixes

I wrote:

My feeling is that what we should do is undo the change to use OT_SQLID,
and in indexOfColumn() perform a downcasing/dequoting conversion that
duplicates what OT_SQLID does in psqlscanslash.l.

Here's an updated patch that does it that way, and also adopts Christoph's
documentation suggestion about "sortcolH". Any further comments?

regards, tom lane

#14

Daniel Verite

daniel@manitou-mail.org

about 10 years ago

In reply to: Tom Lane (#13)

Re: \crosstabview fixes

Tom Lane wrote:

I wrote:

My feeling is that what we should do is undo the change to use OT_SQLID,
and in indexOfColumn() perform a downcasing/dequoting conversion that
duplicates what OT_SQLID does in psqlscanslash.l.

Here's an updated patch that does it that way, and also adopts Christoph's
documentation suggestion about "sortcolH". Any further comments?

For my part, I'm fine with this. Thanks!

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Daniel Verite

daniel@manitou-mail.org

almost 10 years ago

In reply to: Tom Lane (#6)

Re: \crosstabview fixes

Tom Lane wrote:

"Daniel Verite" <daniel@manitou-mail.org> writes:

To avoid the confusion between "2:4" and "2":"4" or 2:4,
and the ambiguity with a possibly existing "2:4" column,
maybe we should abandon this syntax and require the optional
scolH to be on its own at the end of the command.

That would be OK with me; it's certainly less of a hack than what's
there now. (I went back and forth about how much effort to put into
dealing with the colon syntax; I think the version I have in my patch
would be all right, but it's not perfect.)

Here's a patch along those lines. Any objections?

Checking http://www.postgresql.org/docs/devel/static/app-psql.html ,
I notice that the last example is still using the syntax for arguments
that has been deprecated by commit 6f0d6a507, as discussed in this
thread.
A fix to psql-ref.sgml is attached.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

#16

Tom Lane

tgl@sss.pgh.pa.us

almost 10 years ago

In reply to: Daniel Verite (#15)

Re: \crosstabview fixes

"Daniel Verite" <daniel@manitou-mail.org> writes:

Checking http://www.postgresql.org/docs/devel/static/app-psql.html ,
I notice that the last example is still using the syntax for arguments
that has been deprecated by commit 6f0d6a507, as discussed in this
thread.

Ooops.

A fix to psql-ref.sgml is attached.

Pushed, thanks.

BTW, I see we've been spelling your name with an insufficient number
of accents in the commit logs and release notes. Can't do much about
the logs, but will fix the release notes.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Daniel Verite

daniel@manitou-mail.org

almost 10 years ago

In reply to: Tom Lane (#16)

Re: \crosstabview fixes

Tom Lane wrote:

Pushed, thanks.
BTW, I see we've been spelling your name with an insufficient number
of accents in the commit logs and release notes. Can't do much about
the logs, but will fix the release notes.

I use myself the nonaccented version of my name in "From" headers, as
a concession to mailers that are not always rfc2047-friendly, so I can't
blame anyone for leaving out one or both of these accents :)
Thanks for taking care of this anyway!

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

\crosstabview fixes

Attachments:

Attachments:

Attachments:

Attachments: