Thread: psql - better support pipe line
Hi
I found so a set "psql" and "xargs" is pretty strong. But I miss a psql option for simple returning current database and continuing in pipeline.where relname = 'pg_attribute' and n_dead_tup > 100000" |
xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
--echo-db requires -q option
On 08/24/2015 08:06 AM, Pavel Stehule wrote: > Hi > > I found so a set "psql" and "xargs" is pretty strong. But I miss a psql > option for simple returning current database and continuing in pipeline. > > What I am doing: > > psql postgres -At -c "select datname from pg_database" | > xargs -P 3 -I % psql % -At -c "select current_databe() from > pg_stat_all_tables > where relname = 'pg_attribute' and n_dead_tup > 100000" | > xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" | > xargs -P 3 -I % psql % -At -c "select curren_database() from ..." > > it works perfectly - but the line > > xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" > > is little bit ugly - with some psql option it can be cleaned to > > xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ... > > --echo-db requires -q option > > What are you thinking about this idea? Seems like a one-tricky-pony to me. You're just as likely to need to print a relation name or something else, as the current database. I don't actually understandu what you'd put in the "..." above. One you've analyze'd the table, what more do you want to do? Overall, once your pipeline gets that complicated, I'd rather write a little bash or perl script with for-loops and variables. - Heikki
2015-08-24 12:49 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 08/24/2015 08:06 AM, Pavel Stehule wrote:Hi
I found so a set "psql" and "xargs" is pretty strong. But I miss a psql
option for simple returning current database and continuing in pipeline.
What I am doing:
psql postgres -At -c "select datname from pg_database" |
xargs -P 3 -I % psql % -At -c "select current_databe() from
pg_stat_all_tables
where relname = 'pg_attribute' and n_dead_tup > 100000" |
xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" |
xargs -P 3 -I % psql % -At -c "select curren_database() from ..."
it works perfectly - but the line
xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
is little bit ugly - with some psql option it can be cleaned to
xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
--echo-db requires -q option
What are you thinking about this idea?
Seems like a one-tricky-pony to me. You're just as likely to need to print a relation name or something else, as the current database.
yes, you have true
I don't actually understandu what you'd put in the "..." above. One you've analyze'd the table, what more do you want to do?
do other filtering and for example run VACUUM FULL when it can have a significant effect
Overall, once your pipeline gets that complicated, I'd rather write a little bash or perl script with for-loops and variables.
It is pipeline of trivial statements - so still it is pretty simple - and with multiprocess suppport
Regards
Pavel
- Heikki
On 08/24/2015 06:49 AM, Heikki Linnakangas wrote: > On 08/24/2015 08:06 AM, Pavel Stehule wrote: >> Hi >> >> I found so a set "psql" and "xargs" is pretty strong. But I miss a psql >> option for simple returning current database and continuing in pipeline. >> >> What I am doing: >> >> psql postgres -At -c "select datname from pg_database" | >> xargs -P 3 -I % psql % -At -c "select current_databe() from >> pg_stat_all_tables >> where relname = 'pg_attribute' and n_dead_tup > 100000" | >> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" | >> xargs -P 3 -I % psql % -At -c "select curren_database() from ..." >> >> it works perfectly - but the line >> >> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" >> >> is little bit ugly - with some psql option it can be cleaned to >> >> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ... >> >> --echo-db requires -q option >> >> What are you thinking about this idea? > > Seems like a one-tricky-pony to me. You're just as likely to need to > print a relation name or something else, as the current database. > > I don't actually understandu what you'd put in the "..." above. One > you've analyze'd the table, what more do you want to do? > > Overall, once your pipeline gets that complicated, I'd rather write a > little bash or perl script with for-loops and variables. > > Yes, the use case for this is way too narrow. cheers andrew
Heikki Linnakangas <hlinnaka@iki.fi> writes: > On 08/24/2015 08:06 AM, Pavel Stehule wrote: >> it works perfectly - but the line >> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" >> is little bit ugly - with some psql option it can be cleaned to >> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ... >> --echo-db requires -q option >> What are you thinking about this idea? > Seems like a one-tricky-pony to me. You're just as likely to need to > print a relation name or something else, as the current database. Not only that, but: (1) there is no reason to believe that the db name and only the db name is needed to do another connection; what about port, host, user, etc? (2) this commandeers the pipe connection to transmit out-of-band data, making it impossible to use the pipe for its natural function, viz transmitting ordinary data from one processing step to the next. Sure, there are use-cases where there's no such data and you can repurpose the pipe like that, but that's an enormous limitation. > Overall, once your pipeline gets that complicated, I'd rather write a > little bash or perl script with for-loops and variables. Yeah, on the whole this seems like a band-aid to let a bad scripting approach limp a few steps further before it collapses completely. regards, tom lane
2015-08-24 16:02 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 08/24/2015 08:06 AM, Pavel Stehule wrote:
>> it works perfectly - but the line
>> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
>> is little bit ugly - with some psql option it can be cleaned to
>> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
>> --echo-db requires -q option
>> What are you thinking about this idea?
> Seems like a one-tricky-pony to me. You're just as likely to need to
> print a relation name or something else, as the current database.
Not only that, but:
(1) there is no reason to believe that the db name and only the db name
is needed to do another connection; what about port, host, user, etc?
I have to agree - the possibilities is much more than database name - so one option is not good idea.
(2) this commandeers the pipe connection to transmit out-of-band data,
making it impossible to use the pipe for its natural function, viz
transmitting ordinary data from one processing step to the next. Sure,
there are use-cases where there's no such data and you can repurpose the
pipe like that, but that's an enormous limitation.
I wrote some bash or perl scripts and I don't think so described style is less readable than other.
But it has one pretty advantage - paralelism without any line more, without higher complexity.
Regards
Pavel
> Overall, once your pipeline gets that complicated, I'd rather write a
> little bash or perl script with for-loops and variables.
Yeah, on the whole this seems like a band-aid to let a bad scripting
approach limp a few steps further before it collapses completely.
regards, tom lane
On 8/24/15 3:04 PM, Pavel Stehule wrote: > (1) there is no reason to believe that the db name and only the db name > is needed to do another connection; what about port, host, user, etc? > > > I have to agree - the possibilities is much more than database name - so > one option is not good idea. What I've had problems with is trying to correlate psql specified connection attributes with things like DBI. It would be nice if there was a way to get a fully formed connection URI for the current connection. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Jim Nasby <Jim.Nasby@bluetreble.com> writes: > What I've had problems with is trying to correlate psql specified > connection attributes with things like DBI. It would be nice if there > was a way to get a fully formed connection URI for the current connection. Yeah, although I'd think the capability to create such a URI is libpq's province not psql's. Maybe a PQgetConnectionURI(PGConn) function in libpq, and some psql backslash command to access that? Or maybe a nicer API would be that there's a magic psql variable containing the URI; not sure. regards, tom lane
Hi
2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.
Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's. Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that? Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.
proof concept of PQGetConnectionUri and \uri command.
missing:
connection options
uri encoding
regards, tom lane
Attachment
On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.
Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's. Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that? Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.proof concept of PQGetConnectionUri and \uri command.
I like the idea, thanks!
missing:connection optionsuri encoding
Attached adds implementation of both. Still missing:
- documentation
Maybe we should provide a bool parameter to this new function so that additional parameters could be ignored. Right now it will print a few default values, that are of no great use anyway:
$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'
postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disable
I don't think we can detect and remove the default values from this output in a reliable way?
--
Alex
Attachment
2015-08-27 11:54 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hi2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.
Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's. Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that? Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.proof concept of PQGetConnectionUri and \uri command.I like the idea, thanks!missing:connection optionsuri encodingAttached adds implementation of both. Still missing:- documentationMaybe we should provide a bool parameter to this new function so that additional parameters could be ignored. Right now it will print a few default values, that are of no great use anyway:$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disableI don't think we can detect and remove the default values from this output in a reliable way?
This is pretty difficult - any parameter can be important, and hard to identify default values on client side without connect to server side. I don't see any other way, than hold bitmap for identification entered values and default values from the input.
Pavel
--Alex
2015-08-27 11:54 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:
On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Hi2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.
Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's. Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that? Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.proof concept of PQGetConnectionUri and \uri command.I like the idea, thanks!missing:connection optionsuri encodingAttached adds implementation of both. Still missing:
thank you for enhancing this patch
Pavel
- documentationMaybe we should provide a bool parameter to this new function so that additional parameters could be ignored. Right now it will print a few default values, that are of no great use anyway:$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disableI don't think we can detect and remove the default values from this output in a reliable way?--Alex
On 8/27/15 8:37 AM, Pavel Stehule wrote: > > I don't think we can detect and remove the default values from this > output in a reliable way? > > > This is pretty difficult - any parameter can be important, and hard to > identify default values on client side without connect to server side. I > don't see any other way, than hold bitmap for identification entered > values and default values from the input. That would be nice, but probably not that big a deal. What I think would be more useful is a way to exclude the password if you didn't need it. It occurs to me the most flexible thing that could be done here would be providing a libpq function that spits out JSON connection parameters and have psql turn that into a variable. It would be easy to feed that to a SQL statement and do whatever you want with it at that point, including format it to a connection URI. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On Fri, Aug 28, 2015 at 3:29 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 8/27/15 8:37 AM, Pavel Stehule wrote:
I don't think we can detect and remove the default values from this
output in a reliable way?
This is pretty difficult - any parameter can be important, and hard to
identify default values on client side without connect to server side. I
don't see any other way, than hold bitmap for identification entered
values and default values from the input.
That would be nice, but probably not that big a deal.
What I think would be more useful is a way to exclude the password if you didn't need it.
Good point, and maybe even exclude it by default and provide only if asked for that explicitly.
It occurs to me the most flexible thing that could be done here would be providing a libpq function that spits out JSON connection parameters and have psql turn that into a variable. It would be easy to feed that to a SQL statement and do whatever you want with it at that point, including format it to a connection URI.
Hm... but that would mean that suddenly psql would need JSON parsing capabilities and URI escaping code would have to be moved there too? So every client that links to libpq and wants to use this feature going as far as reconstructing an URI would need both of the capabilities.
Why instead of JSON not spit conninfo format, with proper escaping? That could be a separate library call, e.g. PGgetConnectionString() and a separate backslash command: \conninfo
--
Alex
On 8/28/15 3:58 AM, Shulgin, Oleksandr wrote: > It occurs to me the most flexible thing that could be done here > would be providing a libpq function that spits out JSON connection > parameters and have psql turn that into a variable. It would be easy > to feed that to a SQL statement and do whatever you want with it at > that point, including format it to a connection URI. > > > Hm... but that would mean that suddenly psql would need JSON parsing > capabilities and URI escaping code would have to be moved there too? So > every client that links to libpq and wants to use this feature going as > far as reconstructing an URI would need both of the capabilities. Anything that's doing this presumably has connected to the database, which on any recent version means you have plenty of ability to process JSON at the SQL layer. > Why instead of JSON not spit conninfo format, with proper escaping? > That could be a separate library call, e.g. PGgetConnectionString() and > a separate backslash command: \conninfo Do you mean as a URI? The downside to that it's it's more difficult to parse than JSON. Another option might be an array. The other issue is there's no way to capture \conninfo inside of psql and do something with it. If instead this was exposed as a variable, you could handle it in SQL if you wanted to. All that said, the patch already adds significant value and you could always parse the URI if you really needed to. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On Fri, Aug 28, 2015 at 9:52 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 8/28/15 3:58 AM, Shulgin, Oleksandr wrote:It occurs to me the most flexible thing that could be done here
would be providing a libpq function that spits out JSON connection
parameters and have psql turn that into a variable. It would be easy
to feed that to a SQL statement and do whatever you want with it at
that point, including format it to a connection URI.
Hm... but that would mean that suddenly psql would need JSON parsing
capabilities and URI escaping code would have to be moved there too? So
every client that links to libpq and wants to use this feature going as
far as reconstructing an URI would need both of the capabilities.
Anything that's doing this presumably has connected to the database, which on any recent version means you have plenty of ability to process JSON at the SQL layer.
*Cough*... Well, the fact that it's technically not impossible, doesn't mean it's the right way to do it. By the same reasoning we can also ask the server to calculate 1+1 for us in SQL. :-)
And that will work even with a 9.0 server, while parsing JSON -- not really. Another point is that you don't need an *alive* connection to be able to extract its URI/conninfo string, while when offloading JSON parsing part to the server you suddenly do. Bottom line for me: while still possible, this can't be portable.
Why instead of JSON not spit conninfo format, with proper escaping?
That could be a separate library call, e.g. PGgetConnectionString() and
a separate backslash command: \conninfo
Do you mean as a URI? The downside to that it's it's more difficult to parse than JSON. Another option might be an array.
Hm... actually why not just use the existing call:
PQconninfoOption *PQconninfo(PGconn *conn);
and move whatever code is needed to form an URI or conninfo string to psql itself?
The other issue is there's no way to capture \conninfo inside of psql and do something with it. If instead this was exposed as a variable, you could handle it in SQL if you wanted to.
Yeah, I forgot about the variable proposal, that would be a more useful way to expose it for sure.
--
Alex
On 8/29/15 8:10 AM, Shulgin, Oleksandr wrote: > The other issue is there's no way to capture \conninfo inside of > psql and do something with it. If instead this was exposed as a > variable, you could handle it in SQL if you wanted to. > > > Yeah, I forgot about the variable proposal, that would be a more useful > way to expose it for sure. Right. My only other point is it would be nice if what we exposed there could be easily parsed in SQL. But as I said, having *anything* would be an improvement. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com