Proposal for psql wildcarding behavior w/schemas

Started by Tom Laneover 23 years ago14 messages
#1Tom Lane
tgl@sss.pgh.pa.us

So far as I recall, no one's really taken up the challenge of deciding
how psql's various \d commands should work in the presence of schemas.
Here's a straw-man proposal:

1. A wildcardable pattern must consist of either "namepattern" or
"namepattern.namepattern". In the first case we match against all names
visible in the current search path. In the second case, we consider all
names matching the second part of the pattern within all schemas
matching the first part, without regard to search path visibility.
(For the moment, anyway, patterns containing more than one dot are an
error.)

2. I'd like to switch over to using explicit wildcard characters.
There are presently some cases where psql assumes an implicit "*" at the
end of a name pattern, but I find this surprising. Seems like it would
be more consistent if foo meant foo, and you had to write "foo*" to get
a wildcard search.

3. As for the specific wildcard characters, I propose accepting "*"
and "?" with the same meanings as in common shell filename globbing.
This could be extended to include character classes (eg, [0-9]) if
anyone feels like it. Following shell practice rather than (say)
regexp or LIKE rules avoids problems with dot and underscore, two
characters that we definitely don't want to be pattern match characters
in this context.

4. The wildcard characters "*" and "?" are problematic for \do
(display operators), since they are valid characters in operator names.
I can see three possible answers to this:
A. Don't do any wildcarding in operator searches.
B. Treat "*" and "?" as wildcards, and expect the user to quote
them with backslashes if he wants to use them as regular
characters in an operator search.
C. Treat "*" and "?" as regular characters in operator search,
and let "\*" and "\?" be the wildcards in this context.
A is the current behavior but lacks functionality. C might be the most
convenient once you got used to it, but I suspect people will find it
too confusing. So I'm leaning to B.

Comments, better ideas?

regards, tom lane

#2Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#1)
Re: Proposal for psql wildcarding behavior w/schemas

Tom Lane wrote:

1. A wildcardable pattern must consist of either "namepattern" or
"namepattern.namepattern". In the first case we match against all names
visible in the current search path. In the second case, we consider all
names matching the second part of the pattern within all schemas
matching the first part, without regard to search path visibility.
(For the moment, anyway, patterns containing more than one dot are an
error.)

I like this.

2. I'd like to switch over to using explicit wildcard characters.
There are presently some cases where psql assumes an implicit "*" at the
end of a name pattern, but I find this surprising. Seems like it would
be more consistent if foo meant foo, and you had to write "foo*" to get
a wildcard search.

Agree

3. As for the specific wildcard characters, I propose accepting "*"
and "?" with the same meanings as in common shell filename globbing.
This could be extended to include character classes (eg, [0-9]) if
anyone feels like it. Following shell practice rather than (say)
regexp or LIKE rules avoids problems with dot and underscore, two
characters that we definitely don't want to be pattern match characters
in this context.

Agree again

4. The wildcard characters "*" and "?" are problematic for \do
(display operators), since they are valid characters in operator names.
I can see three possible answers to this:
A. Don't do any wildcarding in operator searches.
B. Treat "*" and "?" as wildcards, and expect the user to quote
them with backslashes if he wants to use them as regular
characters in an operator search.
C. Treat "*" and "?" as regular characters in operator search,
and let "\*" and "\?" be the wildcards in this context.
A is the current behavior but lacks functionality. C might be the most
convenient once you got used to it, but I suspect people will find it
too confusing. So I'm leaning to B.

I would definitely vote for B.

Joe

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#1)
Re: Proposal for psql wildcarding behavior w/schemas

I said:

So far as I recall, no one's really taken up the challenge of deciding
how psql's various \d commands should work in the presence of schemas.
Here's a straw-man proposal:

It occurs to me that I wasn't thinking about the effects of
double-quoted identifiers. Should dot, star, and question mark
be taken as non-special characters if they're inside double quotes?
(Probably.) Does that mean that we don't need backslash-oriented
escaping conventions? (Maybe; would people expect 'em anyway?)
Any other implications I missed? (Very likely.)

regards, tom lane

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#3)
Re: Proposal for psql wildcarding behavior w/schemas

Tom Lane wrote:

I said:

So far as I recall, no one's really taken up the challenge of deciding
how psql's various \d commands should work in the presence of schemas.
Here's a straw-man proposal:

It occurs to me that I wasn't thinking about the effects of
double-quoted identifiers. Should dot, star, and question mark
be taken as non-special characters if they're inside double quotes?
(Probably.) Does that mean that we don't need backslash-oriented
escaping conventions? (Maybe; would people expect 'em anyway?)
Any other implications I missed? (Very likely.)

Uh, if we follow the shell rules, quote-star-quote means star has no
special meaning:

$ echo "*"
*

$ echo \*
*

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: Proposal for psql wildcarding behavior w/schemas

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Uh, if we follow the shell rules, quote-star-quote means star has no
special meaning:

Interesting analogy. We can't take it too far, because the shell quote
rules don't agree with SQL:

$ echo "aaa""zzz"
aaazzz

Under SQL rules the produced identifier would be aaa"zzz. Still, this
provides some ammunition for not processing wildcard characters that
are within quotes.

$ echo \*
*

That analogy says we need to accept both quote and backslash quoting.
Not sure about this. Again, SQL doesn't quite agree with the shell
about how these interact. For example:

egression=# select "foo\bar";
ERROR: Attribute "foo\bar" not found
regression=# \q
$ echo "foo\bar"
foar <--- \b went to backspace

So backslash isn't special within quotes according to SQL, but it
is according to the shell.

I still like "use the shell wildcards" as a rough design principle,
but the devil is in the details ...

regards, tom lane

#6Alvaro Herrera
alvherre@atentus.com
In reply to: Tom Lane (#5)
Re: Proposal for psql wildcarding behavior w/schemas

Tom Lane dijo:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Uh, if we follow the shell rules, quote-star-quote means star has no
special meaning:

Interesting analogy. We can't take it too far, because the shell quote
rules don't agree with SQL:

[...]

$ echo "foo\bar"
foar <--- \b went to backspace

So backslash isn't special within quotes according to SQL, but it
is according to the shell.

Note that GNU echo has actually two different behaviours:

$ echo "a\bb"
a\bb
$ echo -e "a\bb"
b

Also note that since the backslash is between quotes you are not actually
testing shell behaviour but echo(1) behaviour. bash(1) and tcsh(1) both
say

$ echo a\bb
abb

The shell will interpret anything that is outside quotes and leave
anything inside quotes alone, but of course you already knew that. It's
echo that's interpreting further the backslashed string. In that light,
I'd say * should be left alone (no special behaviour) if between quotes.

My 10 chilean pesos.

--
Alvaro Herrera (<alvherre[a]atentus.com>)
"Porque Kim no hacia nada, pero, eso si,
con extraordinario exito" ("Kim", Kipling)

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: Proposal for psql wildcarding behavior w/schemas

Alvaro Herrera <alvherre@atentus.com> writes:

Also note that since the backslash is between quotes you are not actually
testing shell behaviour but echo(1) behaviour.

Duh. Time to go to bed ;-)

regards, tom lane

#8Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Tom Lane (#7)
Re: Proposal for psql wildcarding behavior w/schemas

On Tue, 6 Aug 2002, Tom Lane wrote:

Alvaro Herrera <alvherre@atentus.com> writes:

Also note that since the backslash is between quotes you are not actually
testing shell behaviour but echo(1) behaviour.

Duh. Time to go to bed ;-)

Hmm...that's not how I've always understood shell quoting, at least for bash:

~$ aa=3
~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
1 3 2 3,4,5,6
~$

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

#9Alvaro Herrera
alvherre@atentus.com
In reply to: Nigel J. Andrews (#8)
Re: Proposal for psql wildcarding behavior w/schemas

Nigel J. Andrews dijo:

Hmm...that's not how I've always understood shell quoting, at least for bash:

~$ aa=3
~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
1 3 2 3,4,5,6
~$

What's the difference? What your example is saying basically is that
the shell is treating the "1 $aa 2 3" as a single parameter (i.e. spaces
do not have the usual parameter-separating behaviour), _but_ variables
are interpreted. Using '' prevents variable substitution, so

~$ perl -e 'print join(",",@ARGV), "\n";' '1 $aa 2 3' 4 5 6

should give
1 $aa 2 3,4,5,6

--
Alvaro Herrera (<alvherre[a]atentus.com>)
FOO MANE PADME HUM

#10Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Alvaro Herrera (#9)
Re: Proposal for psql wildcarding behavior w/schemas

On Tue, 6 Aug 2002, Alvaro Herrera wrote:

Nigel J. Andrews dijo:

Hmm...that's not how I've always understood shell quoting, at least for bash:

~$ aa=3
~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
1 3 2 3,4,5,6
~$

What's the difference? What your example is saying basically is that
the shell is treating the "1 $aa 2 3" as a single parameter (i.e. spaces
do not have the usual parameter-separating behaviour), _but_ variables
are interpreted. Using '' prevents variable substitution, so

~$ perl -e 'print join(",",@ARGV), "\n";' '1 $aa 2 3' 4 5 6

should give
1 $aa 2 3,4,5,6

Oops, I've just realised the original was about glob expansion whereas I was
looking at other special characters.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#1)
Re: Proposal for psql wildcarding behavior w/schemas

Tom Lane writes:

1. A wildcardable pattern must consist of either "namepattern" or
"namepattern.namepattern".

Regarding the use of quotes: Would

\d "foo.bar"

show the table "foo.bar", whereas

\d "foo"."bar"

would show the table "bar" in schema "foo"?

--
Peter Eisentraut peter_e@gmx.net

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#11)
Re: Proposal for psql wildcarding behavior w/schemas

Peter Eisentraut <peter_e@gmx.net> writes:

Tom Lane writes:

1. A wildcardable pattern must consist of either "namepattern" or
"namepattern.namepattern".

Regarding the use of quotes: Would

\d "foo.bar"

show the table "foo.bar", whereas

\d "foo"."bar"

would show the table "bar" in schema "foo"?

That'd be my interpretation of what it should do. Okay with you?

regards, tom lane

#13Greg Sabino Mullane
greg@turnstep.com
In reply to: Tom Lane (#12)
Re: Proposal for psql wildcarding behavior w/schemas

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Here's my thoughts on the matter:

More than one period throws an error (something pleasant, mentioning
that we do not quite support cross-database queries yet).

I'll assume we are switching to the "explicit wildcard" system for this.
(I think the proposed *? wildcards are a great idea). Basically, for
those functions (esp. \d) that can return a list or a single item,
the wildcard indicates that we want a list, and a lack of wildcard
indicates we want a single item.

\d foo.baz Shows details about the table bar in the schema foo

\d baz Attempts to find a visible baz in the schemas, shows first
one that matches.

\d baz* Shows all tables starting with "baz", in all available schemas

\d *.baz* Same as above

\d baz.* Shows all tables in the baz schema

\d foo.baz* Shows all tables starting with "baz" in the "foo" schema

\d *.baz Similar to \d baz, but the wildcard forces all matches
to be shown.

\d foo*.baz Shows all tables named "baz" in schemas staring with "foo"

\d *.* Same as plain old \d (which is actually a special case now)

\d *oo. Matches all tables in schemas ending in "oo"

\d *oo.* Same as above - a plain * on either side of the dot is
usually superfluous, except that it forces a list of matches.

The only one I have a problem with is the '\d baz' which some may argue should
continue the current behavior and return details about the first "baz"
table it encounters. On the other hand, people really need to start thinking
about schemas and the consequences of having more than one table named
"baz". Still, typing in the schema could get tedious very quickly. Perhaps it
should only return a list if more than one table was found, otherwise it
will return information about that table.

The good thing about the above is that \d and \dt will actually have the
same functionality. The current behavior can be a bit confusing, in that
some functions have implicit wildcards (\dt and friends) and some do
not (\d).

Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200208091849

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE9VEb6vJuQZxSWSsgRAsHNAKDCXz4vUY1A79EaVmfPIfcNS7VyxgCaA5Tx
HrNmY0vITbb2G1fZFBCPgBM=
=qhYp
-----END PGP SIGNATURE-----

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Sabino Mullane (#13)
Re: Proposal for psql wildcarding behavior w/schemas

"Greg Sabino Mullane" <greg@turnstep.com> writes:

Here's my thoughts on the matter:
More than one period throws an error (something pleasant, mentioning
that we do not quite support cross-database queries yet).

The just-committed implementation doesn't throw an error, but silently
discards name fragments to the left of the last two --- for example,
foo.bar.baz is silently treated as bar.baz. This could probably be
improved, but I haven't quite figured out how psql deals with error
recovery...

Otherwise I agree with your comments, except for

\d baz* Shows all tables starting with "baz", in all available schemas

Make that "shows visible tables whose names begin with baz". There is
a subtle difference.

\d *.baz* Same as above

This shows tables whose names begin with baz, in any schema in the
database --- without regard to visibility.

\d *.* Same as plain old \d (which is actually a special case now)

\d without an argument is still a special case: it transforms to \dtvs
with no argument. Other than that little usability kluge, the general
rule is that for any object-type x, \dx is the same as \dx *, which is
*not* the same as \dx *.* ... the former shows all visible objects, the
latter all objects in the database.

The current behavior can be a bit confusing, in that some functions
have implicit wildcards (\dt and friends) and some do not (\d).

As of cvs tip, all the \d family take wildcards.

regards, tom lane