Thread: \dn [PATTERN] handling not quite right...
I haven't looked in great detail into why this is happpening, but it seems as though processNamePattern() doesn't handle ?'s correctly in the negative lookahead context correctly. 1) \dn [pattern] strips ?'s and replaces them with periods. This may be intentional (as the comment in describe.c suggests, converting input from shell-style wildcards gets converted into regexp notation), but is quite annoying. Ex: test=# \dn foo(?!_log|_shadow) ********* QUERY ********** SELECT n.nspname AS "Name", u.usename AS "Owner" FROM pg_catalog.pg_namespace n LEFT JOIN pg_catalog.pg_user u ON n.nspowner=u.usesysid WHERE (n.nspname NOT LIKE 'pg\\_temp\\_%' OR n.nspname = (pg_catalog.current_schemas(true))[1]) AND n.nspname ~ '^foo(.!_log|_shadow)$' ORDER BY 1; ************************** Which is incorrect, IMHO. Instead the last bit of the query should be: AND n.nspname ~ '^foo(?!_log|_shadow)$' 2) This brings up a large deficiency with the way that \d? [pattern] handling is done in psql(1). It'd be slick if there was a way to have psql's pattern routine look at the first non-whitespace character or two to change change the structure of the query. Something like \dn !.*_shadow% would change the RE operator from ~ to !~ and \dn %bar% would translate to LIKE('bar%'). Doing the regexp equiv of !LIKE('%_shadow') isn't trivial because '^.*(?!_shadow)$' doesn't return the expected result for various reasons. Oh! This'd be a "gun pointed at foot" feature, but having the first character being an = would, without escaping, drop the remainder of the input directly into the query (ex: \dn =nspname != (LIKE('%_log') OR LIKE('%_shadow'))). Maybe a psql(1) variable that changes the behavior of the pattern queries from using an RE to a LIKE statement could also be a possibility. The more I think about this, a leading pipe could be used to pipe the output to a utility, so that \dn | egrep -v '(log|shadow) would work and would be the easiest solution. Maybe a better "bug report" would be, what's the suggested way of doing: n.nspname !~ '_(log|shadow)$'? from a list pattern? -sc -- Sean Chittenden
Sean Chittenden <sean@chittenden.org> writes: > I haven't looked in great detail into why this is happpening, but it > seems as though processNamePattern() doesn't handle ?'s correctly in > the negative lookahead context correctly. Negative lookahead context!? You are several sigmas beyond the subset of regex functionality that \d and friends are intended to support. Given that we're defining * and ? as shell-style wildcards, it's not going to be feasible to handle very much of ordinary regex usage let alone advanced. > The more I think about this, a leading pipe could be used > to pipe the output to a utility, so that \dn | egrep -v '(log|shadow) > would work and would be the easiest solution. This on the other hand seems more like a potentially useful feature, although I'm unclear on what you expect to get sent through the pipe exactly --- you want column headers for instance? What if you're using a nondefault display layout? regards, tom lane
>> I haven't looked in great detail into why this is happpening, but it >> seems as though processNamePattern() doesn't handle ?'s correctly in >> the negative lookahead context correctly. > > Negative lookahead context!? You are several sigmas beyond the subset > of regex functionality that \d and friends are intended to support. > Given that we're defining * and ? as shell-style wildcards, it's not > going to be feasible to handle very much of ordinary regex usage let > alone advanced. I was worried you'd say as much. I'm in a situation where I've got a few hundred schemas floating around now and about half of them end with _log or _shadow and I was surprised at how non-trivial it was to filter out the _log or _shadow schemas with \dn. I tried thinking up the psql equiv of tcsh's fignore but had no luck (ex: set fignore = (\~ .o .bak)). >> The more I think about this, a leading pipe could be used >> to pipe the output to a utility, so that \dn | egrep -v '(log|shadow) >> would work and would be the easiest solution. > > This on the other hand seems more like a potentially useful feature, > although I'm unclear on what you expect to get sent through the pipe > exactly --- you want column headers for instance? What if you're using > a nondefault display layout? Instead of using printf(), fprintf(), fwrite(), or whatever it is that psql(1) uses internally for displaying result sets, have it use the following chunk of pseudo code: if (pipe_symbol_found) { char buf[8192]; size_t len = 0; memset(&buf, 0, sizeof(buf)); fh = popen(..., "r+"); /* or setsocketpair() + fork() */ fwrite(formatted_output_buffer, strlen(formatted_output_buffer), 1, fh); while((len = read(fileno(fh), buf, sizeof(buf)))) { fwrite(buf, len, 1, stdout); } } else { /* whatever the current code does */ } That doesn't take into account the set option that lets you write the output to a file, but that is trivial to handle. To answer your question, I'd send _everything_ through the pipe and use the pipe as a blanket IO filter. I haven't thought about this, but would it be possible to hand the data off to sh(1) and have it handle the pipe/redirection foo that way psql doesn't have to have any pipe/redirection brains? If so, I think that'd be slick since you could do things like '\dn | tail -n +3 | grep -v blah' to handle your concern about having the header sent through and a utility not wanting it. Too bad tee(1) doesn't support a -p option to have tee(1)'s argument sent to sh(1) or a pipe instead of a file, then there'd be some real interesting things that one could script. Ex: \dn | tee -p 'head -n 3 >> /dev/stdout' | tail -n +3 | egrep -v '_(log|shadow)$' Which'd show you the header, but everything after the header would be sent to egrep(1). I can't understand why win32 users think *NIX's CLI can be confusing.... *grin*. Just some thoughts. -sc -- Sean Chittenden