Thread: Proposal for psql wildcarding behavior w/schemas

Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
So far as I recall, no one's really taken up the challenge of deciding
how psql's various \d commands should work in the presence of schemas.
Here's a straw-man proposal:

1.  A wildcardable pattern must consist of either "namepattern" or
"namepattern.namepattern".  In the first case we match against all names
visible in the current search path.  In the second case, we consider all
names matching the second part of the pattern within all schemas
matching the first part, without regard to search path visibility.
(For the moment, anyway, patterns containing more than one dot are an
error.)

2.  I'd like to switch over to using explicit wildcard characters.
There are presently some cases where psql assumes an implicit "*" at the
end of a name pattern, but I find this surprising.  Seems like it would
be more consistent if foo meant foo, and you had to write "foo*" to get
a wildcard search.

3.  As for the specific wildcard characters, I propose accepting "*"
and "?" with the same meanings as in common shell filename globbing.
This could be extended to include character classes (eg, [0-9]) if
anyone feels like it.  Following shell practice rather than (say)
regexp or LIKE rules avoids problems with dot and underscore, two
characters that we definitely don't want to be pattern match characters
in this context.

4.  The wildcard characters "*" and "?" are problematic for \do
(display operators), since they are valid characters in operator names.
I can see three possible answers to this:   A.  Don't do any wildcarding in operator searches.   B.  Treat "*" and "?"
aswildcards, and expect the user to quotethem with backslashes if he wants to use them as regularcharacters in an
operatorsearch.   C.  Treat "*" and "?" as regular characters in operator search,and let "\*" and "\?" be the wildcards
inthis context.
 
A is the current behavior but lacks functionality.  C might be the most
convenient once you got used to it, but I suspect people will find it
too confusing.  So I'm leaning to B.

Comments, better ideas?
        regards, tom lane


Re: Proposal for psql wildcarding behavior w/schemas

From
Joe Conway
Date:
Tom Lane wrote:
> 1.  A wildcardable pattern must consist of either "namepattern" or
> "namepattern.namepattern".  In the first case we match against all names
> visible in the current search path.  In the second case, we consider all
> names matching the second part of the pattern within all schemas
> matching the first part, without regard to search path visibility.
> (For the moment, anyway, patterns containing more than one dot are an
> error.)

I like this.

> 2.  I'd like to switch over to using explicit wildcard characters.
> There are presently some cases where psql assumes an implicit "*" at the
> end of a name pattern, but I find this surprising.  Seems like it would
> be more consistent if foo meant foo, and you had to write "foo*" to get
> a wildcard search.

Agree

> 
> 3.  As for the specific wildcard characters, I propose accepting "*"
> and "?" with the same meanings as in common shell filename globbing.
> This could be extended to include character classes (eg, [0-9]) if
> anyone feels like it.  Following shell practice rather than (say)
> regexp or LIKE rules avoids problems with dot and underscore, two
> characters that we definitely don't want to be pattern match characters
> in this context.

Agree again


> 
> 4.  The wildcard characters "*" and "?" are problematic for \do
> (display operators), since they are valid characters in operator names.
> I can see three possible answers to this:
>     A.  Don't do any wildcarding in operator searches.
>     B.  Treat "*" and "?" as wildcards, and expect the user to quote
>     them with backslashes if he wants to use them as regular
>     characters in an operator search.
>     C.  Treat "*" and "?" as regular characters in operator search,
>     and let "\*" and "\?" be the wildcards in this context.
> A is the current behavior but lacks functionality.  C might be the most
> convenient once you got used to it, but I suspect people will find it
> too confusing.  So I'm leaning to B.

I would definitely vote for B.

Joe



Re: Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
I said:
> So far as I recall, no one's really taken up the challenge of deciding
> how psql's various \d commands should work in the presence of schemas.
> Here's a straw-man proposal:

It occurs to me that I wasn't thinking about the effects of
double-quoted identifiers.  Should dot, star, and question mark
be taken as non-special characters if they're inside double quotes?
(Probably.)  Does that mean that we don't need backslash-oriented
escaping conventions?  (Maybe; would people expect 'em anyway?)
Any other implications I missed?  (Very likely.)
        regards, tom lane


Re: Proposal for psql wildcarding behavior w/schemas

From
Bruce Momjian
Date:
Tom Lane wrote:
> I said:
> > So far as I recall, no one's really taken up the challenge of deciding
> > how psql's various \d commands should work in the presence of schemas.
> > Here's a straw-man proposal:
> 
> It occurs to me that I wasn't thinking about the effects of
> double-quoted identifiers.  Should dot, star, and question mark
> be taken as non-special characters if they're inside double quotes?
> (Probably.)  Does that mean that we don't need backslash-oriented
> escaping conventions?  (Maybe; would people expect 'em anyway?)
> Any other implications I missed?  (Very likely.)

Uh, if we follow the shell rules, quote-star-quote means star has no
special meaning:
$ echo "*"*
$ echo \**

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Uh, if we follow the shell rules, quote-star-quote means star has no
> special meaning:

Interesting analogy.  We can't take it too far, because the shell quote
rules don't agree with SQL:

$ echo "aaa""zzz"
aaazzz

Under SQL rules the produced identifier would be aaa"zzz.  Still, this
provides some ammunition for not processing wildcard characters that
are within quotes.

>     $ echo \*
>     *

That analogy says we need to accept both quote and backslash quoting.
Not sure about this.  Again, SQL doesn't quite agree with the shell
about how these interact.  For example:

egression=# select "foo\bar";
ERROR:  Attribute "foo\bar" not found
regression=# \q
$ echo "foo\bar"
foar                              <--- \b went to backspace

So backslash isn't special within quotes according to SQL, but it
is according to the shell.

I still like "use the shell wildcards" as a rough design principle,
but the devil is in the details ...
        regards, tom lane


Re: Proposal for psql wildcarding behavior w/schemas

From
Alvaro Herrera
Date:
Tom Lane dijo: 

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Uh, if we follow the shell rules, quote-star-quote means star has no
> > special meaning:
> 
> Interesting analogy.  We can't take it too far, because the shell quote
> rules don't agree with SQL:
[...]

> $ echo "foo\bar"
> foar                              <--- \b went to backspace
> 
> So backslash isn't special within quotes according to SQL, but it
> is according to the shell.

Note that GNU echo has actually two different behaviours:

$ echo "a\bb"
a\bb
$ echo -e "a\bb"
b

Also note that since the backslash is between quotes you are not actually
testing shell behaviour but echo(1) behaviour. bash(1) and tcsh(1) both
say

$ echo a\bb
abb

The shell will interpret anything that is outside quotes and leave
anything inside quotes alone, but of course you already knew that.  It's
echo that's interpreting further the backslashed string.  In that light,
I'd say * should be left alone (no special behaviour) if between quotes.

My 10 chilean pesos.

-- 
Alvaro Herrera (<alvherre[a]atentus.com>)
"Porque Kim no hacia nada, pero, eso si,
con extraordinario exito" ("Kim", Kipling)



Re: Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
Alvaro Herrera <alvherre@atentus.com> writes:
> Also note that since the backslash is between quotes you are not actually
> testing shell behaviour but echo(1) behaviour.

Duh.  Time to go to bed ;-)
        regards, tom lane


Re: Proposal for psql wildcarding behavior w/schemas

From
"Nigel J. Andrews"
Date:
On Tue, 6 Aug 2002, Tom Lane wrote:

> Alvaro Herrera <alvherre@atentus.com> writes:
> > Also note that since the backslash is between quotes you are not actually
> > testing shell behaviour but echo(1) behaviour.
> 
> Duh.  Time to go to bed ;-)


Hmm...that's not how I've always understood shell quoting, at least for bash:

~$ aa=3
~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
1 3 2 3,4,5,6
~$


-- 
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants



Re: Proposal for psql wildcarding behavior w/schemas

From
Alvaro Herrera
Date:
Nigel J. Andrews dijo: 

> Hmm...that's not how I've always understood shell quoting, at least for bash:
> 
> ~$ aa=3
> ~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
> 1 3 2 3,4,5,6
> ~$

What's the difference?  What your example is saying basically is that
the shell is treating the "1 $aa 2 3" as a single parameter (i.e. spaces
do not have the usual parameter-separating behaviour), _but_ variables
are interpreted.  Using '' prevents variable substitution, so 

> ~$ perl -e 'print join(",",@ARGV), "\n";' '1 $aa 2 3' 4 5 6

should give
1 $aa 2 3,4,5,6

-- 
Alvaro Herrera (<alvherre[a]atentus.com>)
FOO MANE PADME HUM



Re: Proposal for psql wildcarding behavior w/schemas

From
"Nigel J. Andrews"
Date:
On Tue, 6 Aug 2002, Alvaro Herrera wrote:

> Nigel J. Andrews dijo: 
> 
> > Hmm...that's not how I've always understood shell quoting, at least for bash:
> > 
> > ~$ aa=3
> > ~$ perl -e 'print join(",",@ARGV), "\n";' "1 $aa 2 3" 4 5 6
> > 1 3 2 3,4,5,6
> > ~$
> 
> What's the difference?  What your example is saying basically is that
> the shell is treating the "1 $aa 2 3" as a single parameter (i.e. spaces
> do not have the usual parameter-separating behaviour), _but_ variables
> are interpreted.  Using '' prevents variable substitution, so 
> 
> > ~$ perl -e 'print join(",",@ARGV), "\n";' '1 $aa 2 3' 4 5 6
> 
> should give
> 1 $aa 2 3,4,5,6


Oops, I've just realised the original was about glob expansion whereas I was
looking at other special characters.



-- 
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants



Re: Proposal for psql wildcarding behavior w/schemas

From
Peter Eisentraut
Date:
Tom Lane writes:

> 1.  A wildcardable pattern must consist of either "namepattern" or
> "namepattern.namepattern".

Regarding the use of quotes:  Would

\d "foo.bar"

show the table "foo.bar", whereas

\d "foo"."bar"

would show the table "bar" in schema "foo"?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> 1.  A wildcardable pattern must consist of either "namepattern" or
>> "namepattern.namepattern".

> Regarding the use of quotes:  Would

> \d "foo.bar"

> show the table "foo.bar", whereas

> \d "foo"."bar"

> would show the table "bar" in schema "foo"?

That'd be my interpretation of what it should do.  Okay with you?
        regards, tom lane


Re: Proposal for psql wildcarding behavior w/schemas

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Here's my thoughts on the matter:

More than one period throws an error (something pleasant, mentioning 
that we do not quite support cross-database queries yet).

I'll assume we are switching to the "explicit wildcard" system for this. 
(I think the proposed *? wildcards are a great idea). Basically, for 
those functions (esp. \d) that can return a list or a single item, 
the wildcard indicates that we want a list, and a lack of wildcard 
indicates we want a single item.

\d foo.baz    Shows details about the table bar in the schema foo

\d baz        Attempts to find a visible baz in the schemas, shows first              one that matches.

\d baz*       Shows all tables starting with "baz", in all available schemas

\d *.baz*     Same as above

\d baz.*      Shows all tables in the baz schema

\d foo.baz*   Shows all tables starting with "baz" in the "foo" schema

\d *.baz      Similar to \d baz, but the wildcard forces all matches              to be shown.

\d foo*.baz   Shows all tables named "baz" in schemas staring with "foo"

\d *.*        Same as plain old \d (which is actually a special case now)

\d *oo.       Matches all tables in schemas ending in "oo"

\d *oo.*      Same as above - a plain * on either side of the dot is             usually superfluous, except that it
forcesa list of matches.
 


The only one I have a problem with is the '\d baz' which some may argue should 
continue the current behavior and return details about the first "baz" 
table it encounters. On the other hand, people really need to start thinking 
about schemas and the consequences of having more than one table named 
"baz". Still, typing in the schema could get tedious very quickly. Perhaps it 
should only return a list if more than one table was found, otherwise it 
will return information about that table.

The good thing about the above is that \d and \dt will actually have the 
same functionality. The current behavior can be a bit confusing, in that 
some functions have implicit wildcards (\dt and friends) and some do 
not (\d).

Greg Sabino Mullane  greg@turnstep.com
PGP Key: 0x14964AC8 200208091849

-----BEGIN PGP SIGNATURE-----
Comment: http://www.turnstep.com/pgp.html

iD8DBQE9VEb6vJuQZxSWSsgRAsHNAKDCXz4vUY1A79EaVmfPIfcNS7VyxgCaA5Tx
HrNmY0vITbb2G1fZFBCPgBM=
=qhYp
-----END PGP SIGNATURE-----




Re: Proposal for psql wildcarding behavior w/schemas

From
Tom Lane
Date:
"Greg Sabino Mullane" <greg@turnstep.com> writes:
> Here's my thoughts on the matter:
> More than one period throws an error (something pleasant, mentioning 
> that we do not quite support cross-database queries yet).

The just-committed implementation doesn't throw an error, but silently
discards name fragments to the left of the last two --- for example,
foo.bar.baz is silently treated as bar.baz.  This could probably be
improved, but I haven't quite figured out how psql deals with error
recovery...

Otherwise I agree with your comments, except for

> \d baz*       Shows all tables starting with "baz", in all available schemas

Make that "shows visible tables whose names begin with baz".  There is
a subtle difference.

> \d *.baz*     Same as above

This shows tables whose names begin with baz, in any schema in the
database --- without regard to visibility.

> \d *.*        Same as plain old \d (which is actually a special case now)

\d without an argument is still a special case: it transforms to \dtvs
with no argument.  Other than that little usability kluge, the general
rule is that for any object-type x, \dx is the same as \dx *, which is
*not* the same as \dx *.* ... the former shows all visible objects, the
latter all objects in the database.

> The current behavior can be a bit confusing, in that some functions
> have implicit wildcards (\dt and friends) and some do not (\d).

As of cvs tip, all the \d family take wildcards.
        regards, tom lane