Thread: psql - better support pipe line

psql - better support pipe line

From
Pavel Stehule
Date:
Hi

I have found that the combination of "psql" and "xargs" is pretty powerful. But I miss a psql option for simply returning the current database name and continuing in a pipeline.

What I am doing:

psql postgres -At -c "select datname from pg_database" |
xargs -P 3 -I % psql % -At -c "select current_database() from pg_stat_all_tables
where relname = 'pg_attribute' and n_dead_tup > 100000" |
xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" |
xargs -P 3 -I % psql % -At -c "select current_database() from ..."

it works perfectly - but the line

xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"

is a little bit ugly - with some psql option it could be cleaned up to

xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...

--echo-db requires -q option

What do you think about this idea?

Regards

Pavel

Re: psql - better support pipe line

From
Heikki Linnakangas
Date:
On 08/24/2015 08:06 AM, Pavel Stehule wrote:
> Hi
>
> I have found that the combination of "psql" and "xargs" is pretty powerful.
> But I miss a psql option for simply returning the current database name and
> continuing in a pipeline.
>
> What I am doing:
>
> psql postgres -At -c "select datname from pg_database" |
> xargs -P 3 -I % psql % -At -c "select current_database() from
> pg_stat_all_tables
> where relname = 'pg_attribute' and n_dead_tup > 100000" |
> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" |
> xargs -P 3 -I % psql % -At -c "select current_database() from ..."
>
> it works perfectly - but the line
>
> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
>
> is a little bit ugly - with some psql option it could be cleaned up to
>
> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
>
> --echo-db requires -q option
>
> What do you think about this idea?

Seems like a one-trick pony to me. You're just as likely to need to
print a relation name or something else as the current database.

I don't actually understand what you'd put in the "..." above. Once
you've analyzed the table, what more do you want to do?

Overall, once your pipeline gets that complicated, I'd rather write a 
little bash or perl script with for-loops and variables.

- Heikki




Re: psql - better support pipe line

From
Pavel Stehule
Date:


2015-08-24 12:49 GMT+02:00 Heikki Linnakangas <hlinnaka@iki.fi>:
On 08/24/2015 08:06 AM, Pavel Stehule wrote:
Hi

I have found that the combination of "psql" and "xargs" is pretty powerful. But
I miss a psql option for simply returning the current database name and
continuing in a pipeline.

What I am doing:

psql postgres -At -c "select datname from pg_database" |
xargs -P 3 -I % psql % -At -c "select current_database() from
pg_stat_all_tables
where relname = 'pg_attribute' and n_dead_tup > 100000" |
xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" |
xargs -P 3 -I % psql % -At -c "select current_database() from ..."

it works perfectly - but the line

xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"

is a little bit ugly - with some psql option it could be cleaned up to

xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...

--echo-db requires -q option

What do you think about this idea?

Seems like a one-trick pony to me. You're just as likely to need to print a relation name or something else as the current database.

yes, you are right


I don't actually understand what you'd put in the "..." above. Once you've analyzed the table, what more do you want to do?

do more filtering and, for example, run VACUUM FULL where it can have a significant effect

Overall, once your pipeline gets that complicated, I'd rather write a little bash or perl script with for-loops and variables.

It is a pipeline of trivial statements - so it is still pretty simple - and with multiprocess support

Regards

Pavel
 


- Heikki


Re: psql - better support pipe line

From
Andrew Dunstan
Date:

On 08/24/2015 06:49 AM, Heikki Linnakangas wrote:
> On 08/24/2015 08:06 AM, Pavel Stehule wrote:
>> Hi
>>
>> I have found that the combination of "psql" and "xargs" is pretty powerful.
>> But I miss a psql option for simply returning the current database name and
>> continuing in a pipeline.
>>
>> What I am doing:
>>
>> psql postgres -At -c "select datname from pg_database" |
>> xargs -P 3 -I % psql % -At -c "select current_database() from
>> pg_stat_all_tables
>> where relname = 'pg_attribute' and n_dead_tup > 100000" |
>> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %" |
>> xargs -P 3 -I % psql % -At -c "select current_database() from ..."
>>
>> it works perfectly - but the line
>>
>> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
>>
>> is a little bit ugly - with some psql option it could be cleaned up to
>>
>> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
>>
>> --echo-db requires -q option
>>
>> What do you think about this idea?
>
> Seems like a one-trick pony to me. You're just as likely to need to
> print a relation name or something else as the current database.
>
> I don't actually understand what you'd put in the "..." above. Once
> you've analyzed the table, what more do you want to do?
>
> Overall, once your pipeline gets that complicated, I'd rather write a 
> little bash or perl script with for-loops and variables.
>
>


Yes, the use case for this is way too narrow.

cheers

andrew








Re: psql - better support pipe line

From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 08/24/2015 08:06 AM, Pavel Stehule wrote:
>> it works perfectly - but the line
>> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
>> is a little bit ugly - with some psql option it could be cleaned up to
>> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
>> --echo-db requires -q option
>> What do you think about this idea?

> Seems like a one-trick pony to me. You're just as likely to need to
> print a relation name or something else as the current database.

Not only that, but:

(1) there is no reason to believe that the db name and only the db name
is needed to do another connection; what about port, host, user, etc?

(2) this commandeers the pipe connection to transmit out-of-band data,
making it impossible to use the pipe for its natural function, viz
transmitting ordinary data from one processing step to the next.  Sure,
there are use-cases where there's no such data and you can repurpose the
pipe like that, but that's an enormous limitation.

> Overall, once your pipeline gets that complicated, I'd rather write a 
> little bash or perl script with for-loops and variables.

Yeah, on the whole this seems like a band-aid to let a bad scripting
approach limp a few steps further before it collapses completely.
        regards, tom lane



Re: psql - better support pipe line

From
Pavel Stehule
Date:


2015-08-24 16:02 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 08/24/2015 08:06 AM, Pavel Stehule wrote:
>> it works perfectly - but the line
>> xargs -P 3 -I % sh -c "psql % -q -c 'analyze pg_attribute'; echo %"
>> is a little bit ugly - with some psql option it could be cleaned up to
>> xargs -P3 -I % psql % -q --echo-db -c "analyze pg_attribute" | ...
>> --echo-db requires -q option
>> What do you think about this idea?

> Seems like a one-trick pony to me. You're just as likely to need to
> print a relation name or something else as the current database.

Not only that, but:

(1) there is no reason to believe that the db name and only the db name
is needed to do another connection; what about port, host, user, etc?

I have to agree - there are many more possibilities than just the database name - so one option is not a good idea.
 

(2) this commandeers the pipe connection to transmit out-of-band data,
making it impossible to use the pipe for its natural function, viz
transmitting ordinary data from one processing step to the next.  Sure,
there are use-cases where there's no such data and you can repurpose the
pipe like that, but that's an enormous limitation.

I have written some bash and perl scripts, and I don't think the described style is less readable than the others.

But it has one nice advantage - parallelism without any extra lines and without higher complexity.

Regards

Pavel
 
 

> Overall, once your pipeline gets that complicated, I'd rather write a
> little bash or perl script with for-loops and variables.

Yeah, on the whole this seems like a band-aid to let a bad scripting
approach limp a few steps further before it collapses completely.


 

                        regards, tom lane

Re: psql - better support pipe line

From
Jim Nasby
Date:
On 8/24/15 3:04 PM, Pavel Stehule wrote:
>     (1) there is no reason to believe that the db name and only the db name
>     is needed to do another connection; what about port, host, user, etc?
>
>
> I have to agree - there are many more possibilities than just the database
> name - so one option is not a good idea.

What I've had problems with is trying to correlate psql specified 
connection attributes with things like DBI. It would be nice if there 
was a way to get a fully formed connection URI for the current connection.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: psql - better support pipe line

From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified 
> connection attributes with things like DBI. It would be nice if there 
> was a way to get a fully formed connection URI for the current connection.

Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's.  Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that?  Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.
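
Just to illustrate the shape of it, here is a rough standalone sketch built on
the existing PQconninfo() call -- nothing below exists in libpq today; it skips
the password, does no percent-encoding, and simply dumps every set parameter
into the query-string form of the URI, which libpq happens to accept:

/* hypothetical sketch, not an existing libpq function */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("");     /* uses PG* environment defaults */
    PQconninfoOption *opt, *opts;
    int         first = 1;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    opts = PQconninfo(conn);
    printf("postgresql://");
    for (opt = opts; opts != NULL && opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL || opt->val[0] == '\0')
            continue;                       /* parameter not set */
        if (strcmp(opt->keyword, "password") == 0)
            continue;                       /* never echo the password */
        printf("%c%s=%s", first ? '?' : '&', opt->keyword, opt->val);
        first = 0;
    }
    printf("\n");

    PQconninfoFree(opts);
    PQfinish(conn);
    return 0;
}

A real PQgetConnectionURI() would presumably return a malloc'd string rather
than printing, reconstruct the usual host:port/dbname syntax, and
percent-encode the values.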
        regards, tom lane



Re: psql - better support pipe line

From
Pavel Stehule
Date:
Hi

2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.

Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's.  Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that?  Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.

a proof of concept of PQGetConnectionUri and a \uri command.

missing:

connection options
uri encoding

 

                        regards, tom lane

Attachment

Re: psql - better support pipe line

From
"Shulgin, Oleksandr"
Date:


On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.

Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's.  Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that?  Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.

a proof of concept of PQGetConnectionUri and a \uri command.

I like the idea, thanks!
 
missing:

connection options
uri encoding

Attached adds implementation of both.  Still missing:

- documentation

Maybe we should provide a bool parameter to this new function so that additional parameters could be ignored.  Right now it will print a few default values, that are of no great use anyway:

$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'
postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disable

I don't think we can detect and remove the default values from this output in a reliable way?

--
Alex

Attachment

Re: psql - better support pipe line

From
Pavel Stehule
Date:


2015-08-27 11:54 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:


On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.

Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's.  Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that?  Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.

a proof of concept of PQGetConnectionUri and a \uri command.

I like the idea, thanks!
 
missing:

connection options
uri encoding

Attached adds implementation of both.  Still missing:

- documentation

Maybe we should provide a bool parameter to this new function so that additional parameters could be ignored.  Right now it will print a few default values, that are of no great use anyway:

$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'
postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disable

I don't think we can detect and remove the default values from this output in a reliable way?

This is pretty difficult - any parameter can be important, and it is hard to identify default values on the client side without connecting to the server. I don't see any other way than holding a bitmap that distinguishes entered values from default values in the input.

Pavel
 

--
Alex


Re: psql - better support pipe line

From
Pavel Stehule
Date:


2015-08-27 11:54 GMT+02:00 Shulgin, Oleksandr <oleksandr.shulgin@zalando.de>:


On Wed, Aug 26, 2015 at 5:44 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

2015-08-25 17:21 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> What I've had problems with is trying to correlate psql specified
> connection attributes with things like DBI. It would be nice if there
> was a way to get a fully formed connection URI for the current connection.

Yeah, although I'd think the capability to create such a URI is libpq's
province not psql's.  Maybe a PQgetConnectionURI(PGConn) function in
libpq, and some psql backslash command to access that?  Or maybe a nicer
API would be that there's a magic psql variable containing the URI; not
sure.

a proof of concept of PQGetConnectionUri and a \uri command.

I like the idea, thanks!
 
missing:

connection options
uri encoding

Attached adds implementation of both.  Still missing:

thank you for enhancing this patch

Pavel
 

- documentation

Maybe we should provide a bool parameter to this new function so that additional parameters could be ignored.  Right now it will print a few default values, that are of no great use anyway:

$ ./bin/psql -c '\uri' 'postgresql://username@/postgres'
postgresql:/username@:5432/postgres?client_encoding=UTF8&fallback_application_name=psql&sslmode=disable

I don't think we can detect and remove the default values from this output in a reliable way?

--
Alex


Re: psql - better support pipe line

From
Jim Nasby
Date:
On 8/27/15 8:37 AM, Pavel Stehule wrote:
>
>     I don't think we can detect and remove the default values from this
>     output in a reliable way?
>
>
> This is pretty difficult - any parameter can be important, and it is hard to
> identify default values on the client side without connecting to the server.
> I don't see any other way than holding a bitmap that distinguishes entered
> values from default values in the input.

That would be nice, but probably not that big a deal.

What I think would be more useful is a way to exclude the password if 
you didn't need it.

It occurs to me the most flexible thing that could be done here would be 
providing a libpq function that spits out JSON connection parameters and 
have psql turn that into a variable. It would be easy to feed that to a 
SQL statement and do whatever you want with it at that point, including 
format it to a connection URI.
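
As a rough sketch of the libpq side (purely hypothetical, the same
PQconninfo()-driven pattern as the patch in this thread; the JSON escaping
below only handles quotes and backslashes, and the password is left out):

/* hypothetical sketch: dump connection parameters as a JSON object */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void
json_string(const char *s)
{
    putchar('"');
    for (; *s; s++)
    {
        if (*s == '"' || *s == '\\')
            putchar('\\');
        putchar(*s);
    }
    putchar('"');
}

int
main(void)
{
    PGconn     *conn = PQconnectdb("");     /* uses PG* environment defaults */
    PQconninfoOption *opt, *opts;
    int         first = 1;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    opts = PQconninfo(conn);
    putchar('{');
    for (opt = opts; opts != NULL && opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL || opt->val[0] == '\0')
            continue;                       /* parameter not set */
        if (strcmp(opt->keyword, "password") == 0)
            continue;                       /* leave the password out */
        if (!first)
            printf(", ");
        json_string(opt->keyword);
        printf(": ");
        json_string(opt->val);
        first = 0;
    }
    printf("}\n");

    PQconninfoFree(opts);
    PQfinish(conn);
    return 0;
}

psql would then only need to stuff that string into a variable; on the SQL side
something like select :'conninfo_json'::json->>'dbname' (variable name made up)
could pick out individual fields.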
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: psql - better support pipe line

From
"Shulgin, Oleksandr"
Date:
On Fri, Aug 28, 2015 at 3:29 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 8/27/15 8:37 AM, Pavel Stehule wrote:

    I don't think we can detect and remove the default values from this
    output in a reliable way?


This is pretty difficult - any parameter can be important, and it is hard to
identify default values on the client side without connecting to the server.
I don't see any other way than holding a bitmap that distinguishes entered
values from default values in the input.

That would be nice, but probably not that big a deal.

What I think would be more useful is a way to exclude the password if you didn't need it.

Good point, and maybe even exclude it by default and provide it only if asked for explicitly.
 
It occurs to me the most flexible thing that could be done here would be providing a libpq function that spits out JSON connection parameters and have psql turn that into a variable. It would be easy to feed that to a SQL statement and do whatever you want with it at that point, including format it to a connection URI.

Hm... but that would mean that suddenly psql would need JSON parsing capabilities and URI escaping code would have to be moved there too?  So every client that links to libpq and wants to use this feature going as far as reconstructing an URI would need both of the capabilities.

Why not spit out conninfo format instead of JSON, with proper escaping?  That could be a separate library call, e.g. PGgetConnectionString(), and a separate backslash command: \conninfo

--
Alex

Re: psql - better support pipe line

From
Jim Nasby
Date:
On 8/28/15 3:58 AM, Shulgin, Oleksandr wrote:
>     It occurs to me the most flexible thing that could be done here
>     would be providing a libpq function that spits out JSON connection
>     parameters and have psql turn that into a variable. It would be easy
>     to feed that to a SQL statement and do whatever you want with it at
>     that point, including format it to a connection URI.
>
>
> Hm... but that would mean that suddenly psql would need JSON parsing
> capabilities and URI escaping code would have to be moved there too?  So
> every client that links to libpq and wants to use this feature going as
> far as reconstructing an URI would need both of the capabilities.

Anything that's doing this presumably has connected to the database, 
which on any recent version means you have plenty of ability to process 
JSON at the SQL layer.

> Why not spit out conninfo format instead of JSON, with proper escaping?
> That could be a separate library call, e.g. PGgetConnectionString(), and
> a separate backslash command: \conninfo

Do you mean as a URI? The downside to that is that it's more difficult to
parse than JSON. Another option might be an array.

The other issue is there's no way to capture \conninfo inside of psql 
and do something with it. If instead this was exposed as a variable, you 
could handle it in SQL if you wanted to.

All that said, the patch already adds significant value and you could 
always parse the URI if you really needed to.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: psql - better support pipe line

From
"Shulgin, Oleksandr"
Date:
On Fri, Aug 28, 2015 at 9:52 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 8/28/15 3:58 AM, Shulgin, Oleksandr wrote:
    It occurs to me the most flexible thing that could be done here
    would be providing a libpq function that spits out JSON connection
    parameters and have psql turn that into a variable. It would be easy
    to feed that to a SQL statement and do whatever you want with it at
    that point, including format it to a connection URI.


Hm... but that would mean that suddenly psql would need JSON parsing
capabilities and URI escaping code would have to be moved there too?  So
every client that links to libpq and wants to use this feature going as
far as reconstructing an URI would need both of the capabilities.

Anything that's doing this presumably has connected to the database, which on any recent version means you have plenty of ability to process JSON at the SQL layer.

*Cough*...  Well, the fact that it's technically not impossible doesn't mean it's the right way to do it.  By the same reasoning we can also ask the server to calculate 1+1 for us in SQL. :-)

And that will work even with a 9.0 server, while parsing JSON -- not really.  Another point is that you don't need a *live* connection to be able to extract its URI/conninfo string, while when offloading the JSON parsing to the server you suddenly do.  Bottom line for me: while still possible, this can't be portable.

Why not spit out conninfo format instead of JSON, with proper escaping?
That could be a separate library call, e.g. PGgetConnectionString(), and
a separate backslash command: \conninfo

Do you mean as a URI? The downside to that is that it's more difficult to parse than JSON. Another option might be an array.

Hm... actually why not just use the existing call:

PQconninfoOption *PQconninfo(PGconn *conn);

and move whatever code is needed to form an URI or conninfo string to psql itself?
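
The psql-side code would be small.  A rough sketch (only an illustration, not
the posted patch), using the conninfo quoting rules -- wrap values in single
quotes, backslash-escape ' and \ -- and skipping unset options and the
password:

/* hypothetical sketch: rebuild a keyword='value' conninfo string from PQconninfo() */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void
conninfo_value(const char *s)
{
    putchar('\'');
    for (; *s; s++)
    {
        if (*s == '\'' || *s == '\\')
            putchar('\\');
        putchar(*s);
    }
    putchar('\'');
}

int
main(void)
{
    PGconn     *conn = PQconnectdb("");     /* uses PG* environment defaults */
    PQconninfoOption *opt, *opts;
    int         first = 1;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    opts = PQconninfo(conn);
    for (opt = opts; opts != NULL && opt->keyword != NULL; opt++)
    {
        if (opt->val == NULL || opt->val[0] == '\0')
            continue;                       /* parameter not set */
        if (strcmp(opt->keyword, "password") == 0)
            continue;                       /* leave the password out */
        if (!first)
            putchar(' ');
        printf("%s=", opt->keyword);
        conninfo_value(opt->val);
        first = 0;
    }
    putchar('\n');

    PQconninfoFree(opts);
    PQfinish(conn);
    return 0;
}

The same loop with different escaping would cover the URI form as well.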

The other issue is there's no way to capture \conninfo inside of psql and do something with it. If instead this was exposed as a variable, you could handle it in SQL if you wanted to.

Yeah, I forgot about the variable proposal; that would be a more useful way to expose it, for sure.

--
Alex

Re: psql - better support pipe line

From
Jim Nasby
Date:
On 8/29/15 8:10 AM, Shulgin, Oleksandr wrote:
>     The other issue is there's no way to capture \conninfo inside of
>     psql and do something with it. If instead this was exposed as a
>     variable, you could handle it in SQL if you wanted to.
>
>
> Yeah, I forgot about the variable proposal; that would be a more useful
> way to expose it, for sure.

Right. My only other point is it would be nice if what we exposed there 
could be easily parsed in SQL. But as I said, having *anything* would be 
an improvement.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com