Thread: psql/pg_dump vs. dollar signs in identifiers
An example being discussed on the jdbc list led me to try this:

    regression=# create table a$b$c (f1 int);
    CREATE TABLE
    regression=# \d a$b$c
    Did not find any relation named "a$b$c".

It works if you use quotes:

    regression=# \d "a$b$c"
      Table "public.a$b$c"
     Column |  Type   | Modifiers
    --------+---------+-----------
     f1     | integer |

The reason it doesn't work without quotes is that processSQLNamePattern() thinks this:

     * Inside double quotes, or at all times if force_escape is true,
     * quote regexp special characters with a backslash to avoid
     * regexp errors.  Outside quotes, however, let them pass through
     * as-is; this lets knowledgeable users build regexp expressions
     * that are more powerful than shell-style patterns.

and of course $ is a regexp special character, so it bollixes up the match.

Now, because we surround the pattern with ^...$ anyway, I can't offhand see a use-case for putting $ with its regexp meaning into the pattern. And since we do allow $ as a non-first character of identifiers, there is a use-case for expecting it to be treated like an ordinary character.

So I'm thinking that $ ought to be quoted whether it's inside double quotes or not.

This change would affect psql's describe commands as well as pg_dump -t and -n patterns.

Comments?

regards, tom lane
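The failure described above can be reproduced in miniature. This is only a rough sketch, not psql's C code: Python's `re` module stands in for PostgreSQL's regex engine (the two differ in details), and the helper name `anchored` is made up for this illustration.

```python
import re

def anchored(pattern_body):
    # Hypothetical helper: wrap a pattern body in ^...$ the way the
    # message says the name pattern is wrapped before matching.
    return re.compile("^(" + pattern_body + ")$")

name = "a$b$c"

# Unquoted today: "$" passes through with its regexp meaning
# (end of string), so the pattern cannot match the literal name.
unescaped = anchored("a$b$c")

# Proposed behavior: "$" is backslash-escaped even outside quotes.
escaped = anchored(r"a\$b\$c")

print(unescaped.match(name))            # None
print(escaped.match(name) is not None)  # True
```

Because the whole pattern is already anchored, an unescaped `$` in the middle can only ever assert "end of string" at a place where more text must follow, which is why the first match fails.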
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Now, because we surround the pattern with ^...$ anyway, I can't offhand
> see a use-case for putting $ with its regexp meaning into the pattern.

It's possible to still usefully use $ in the regexp, but its existence at the end means there should always be a way to write the regexp without needing another one inside.

Incidentally, are these really regexps? I always thought they were globs. And experiments seem to back up my memory:

    postgres=# \d foo*
      Table "public.foo^bar"
     Column |  Type   | Modifiers
    --------+---------+-----------
     i      | integer |

    postgres=# \d foo.*
    Did not find any relation named "foo.*".

> Comments?

The first half of the logic applies to ^ as well. There's no use case for regexps using ^ inside. You would have to use quotes to create the table but we could have \d foo^* work:

    postgres=# \d foo^*
    Did not find any relation named "foo^*".

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes:

> Incidentally, are these really regexps? I always thought they were globs.

They're regexps under the hood, but we treat . as a schema separator and translate * to .*, which makes it look like mostly a glob scheme. But you can make use of brackets, |, +, ...

regards, tom lane
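The translation described above can be sketched as follows. This is a simplified Python illustration, not the actual processSQLNamePattern() code: the function name `translate`, the `always_escape_dollar` flag, and the `SPECIALS` set are assumptions made for this example, and details such as doubled quotes inside a quoted name are ignored.

```python
import re

# Characters treated as regexp-special in this sketch (an assumption).
SPECIALS = set("|*+?()[]{}.^$\\")

def translate(pattern, always_escape_dollar=True):
    """Return (schema_regex, name_regex); schema_regex is None without a dot."""
    parts = [""]
    in_quotes = False
    for ch in pattern:
        if ch == '"':
            in_quotes = not in_quotes          # toggle quoted mode
        elif in_quotes:
            # Inside quotes every special character is escaped.
            parts[-1] += ("\\" + ch) if ch in SPECIALS else ch
        elif ch == ".":
            parts.append("")                   # schema separator
        elif ch == "*":
            parts[-1] += ".*"                  # glob star becomes regexp .*
        elif ch == "$" and always_escape_dollar:
            parts[-1] += "\\$"                 # the change proposed in this thread
        else:
            parts[-1] += ch                    # other regexp syntax passes through
    anchored = ["^(" + p + ")$" for p in parts]
    schema = anchored[-2] if len(anchored) > 1 else None
    return schema, anchored[-1]

print(translate("foo*"))     # (None, '^(foo.*)$')
print(translate("foo.*"))    # ('^(foo)$', '^(.*)$')
print(translate("foo|bar"))  # (None, '^(foo|bar)$')
```

Under this reading, `\d foo*` matches the table foo^bar, while `\d foo.*` asks for every relation in a schema named foo, which is consistent with the "Did not find any relation" result in the earlier experiment.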
On Mon, Jul 09, 2007 at 07:04:27PM +0100, Gregory Stark wrote:

> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>
> > Now, because we surround the pattern with ^...$ anyway, I can't offhand
> > see a use-case for putting $ with its regexp meaning into the pattern.
>
> It's possible to still usefully use $ in the regexp, but its existence at the
> end means there should always be a way to write the regexp without needing
> another one inside.

Unless you're doing multi-line regex, what's the point of a $ anywhere but the end of the expression? Am I missing something? Likewise with ^.

I'm inclined to escape $ as Tom suggested.

--
Jim Nasby                      decibel@decibel.org
EnterpriseDB                   http://enterprisedb.com
512.569.9461 (cell)
"Jim C. Nasby" <decibel@decibel.org> writes:

> Unless you're doing multi-line regex, what's the point of a $ anywhere
> but the end of the expression? Am I missing something? Likewise with ^.

Leaving out the backslashes, you can do things like

    (foo$|baz|qux)(baz|qux|)

to say that all 9 combinations of those two tokens are valid except that foo must be followed by the empty second half.

But it can always be refactored into something more normal like

    (foo|((baz|qux)(baz|qux)?))

> I'm inclined to escape $ as Tom suggested.

Yeah, I have a tendency to look for the most obscure counter-example if only to be sure I really understand precisely how obscure it is. I do agree that it's not a realistic concern. Especially since I never even realized we handled regexps here at all :)

IIRC some regexp engines don't treat $ specially at all except at the end of the regexp. Tom's just suggesting doing the same thing here, where complicated regexps are even *less* likely and literal dollar signs more so.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
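The claim that the embedded-$ pattern can be refactored away is easy to check mechanically. The sketch below uses Python's `re` rather than PostgreSQL's regex engine (an assumption for illustration) and compares the two patterns, each wrapped in ^...$, over every concatenation of the tokens involved.

```python
import re
from itertools import product

# Both patterns are anchored with ^(...)$, as the thread says psql does.
with_dollar = re.compile(r"^((foo$|baz|qux)(baz|qux|))$")
refactored = re.compile(r"^((foo|((baz|qux)(baz|qux)?)))$")

# Every pairing of the three tokens and the empty string.
tokens = ["foo", "baz", "qux", ""]
candidates = sorted({a + b for a, b in product(tokens, repeat=2)})

accepted_a = [s for s in candidates if with_dollar.match(s)]
accepted_b = [s for s in candidates if refactored.match(s)]

print(accepted_a)
print(accepted_a == accepted_b)  # True
```

Both patterns accept the same seven names (foo alone, plus baz or qux optionally followed by baz or qux), so on this candidate set the inner $ adds nothing that plain alternation cannot express.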