Home > mailing lists

Thread: proposal: psql \setfileref

proposal: psql \setfileref

From

Pavel Stehule

Date:

31 August 2016, 15:33:30

Hi

I propose a new type of psql variables - file references. The content of file reference is specified by referenced file. It allows simple inserting large data without necessity of manual escaping or using LO api.

When I wrote the patch, I used parametrized queries for these data instead escaped strings - the code is not much bigger, and the error messages are much more friendly if query is not bloated by bigger content. The text mode is used only - when escaping is not required, then content is implicitly transformed to bytea. By default the content of file is bytea. When use requires escaping, then he enforces text escaping - because it has sense only for text type.

postgres=# \setfileref a ~/test2.xml
postgres=# \setfileref b ~/avatar.gif
postgres=# insert into test values(convert_from(:a, 'latin2')::xml, :b); -- xml is passed as bytea
postgres=# insert into test values(:'a', :b); -- xml is passed via unknown text value

The content of file reference variables is not persistent in memory.

Comments, notes?

Regards

Pavel

Attachment

psql-setfileref-initial.patch

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

31 August 2016, 16:25:00

On Wed, Aug 31, 2016 at 11:32 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

I propose a new type of psql variables - file references. The content of file reference is specified by referenced file. It allows simple inserting large data without necessity of manual escaping or using LO api.

When I wrote the patch, I used parametrized queries for these data instead escaped strings - the code is not much bigger, and the error messages are much more friendly if query is not bloated by bigger content. The text mode is used only - when escaping is not required, then content is implicitly transformed to bytea. By default the content of file is bytea. When use requires escaping, then he enforces text escaping - because it has sense only for text type.

postgres=# \setfileref a ~/test2.xml
postgres=# \setfileref b ~/avatar.gif
postgres=# insert into test values(convert_from(:a, 'latin2')::xml, :b); -- xml is passed as bytea
postgres=# insert into test values(:'a', :b); -- xml is passed via unknown text value

The content of file reference variables is not persistent in memory.

Comments, notes?

Regards

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Clearly jumping ahead on this one, but if the fileref is essentially a pipe to "cat /path/to/file.name", is there anything stopping us from setting pipes?
My interest is primarily in ways that COPY could use this.

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

31 August 2016, 16:30:33

2016-08-31 18:24 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:

On Wed, Aug 31, 2016 at 11:32 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

I propose a new type of psql variables - file references. The content of file reference is specified by referenced file. It allows simple inserting large data without necessity of manual escaping or using LO api.

When I wrote the patch, I used parametrized queries for these data instead escaped strings - the code is not much bigger, and the error messages are much more friendly if query is not bloated by bigger content. The text mode is used only - when escaping is not required, then content is implicitly transformed to bytea. By default the content of file is bytea. When use requires escaping, then he enforces text escaping - because it has sense only for text type.

postgres=# \setfileref a ~/test2.xml
postgres=# \setfileref b ~/avatar.gif
postgres=# insert into test values(convert_from(:a, 'latin2')::xml, :b); -- xml is passed as bytea
postgres=# insert into test values(:'a', :b); -- xml is passed via unknown text value

The content of file reference variables is not persistent in memory.

Comments, notes?

Regards

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Clearly jumping ahead on this one, but if the fileref is essentially a pipe to "cat /path/to/file.name", is there anything stopping us from setting pipes?

My interest is primarily in ways that COPY could use this.

I don't see a reason why it should not be possible - the current code can't do it, but with 20 lines more, it should be possible

There is one disadvantage against copy - the content should be fully loaded to memory, but any other functionality should be same.

Regards

Pavel

Re: proposal: psql \setfileref

From

Ryan Murphy

Date:

26 September 2016, 12:58:04

Hi Pavel,

I just tried to apply your patch psql-setfileref-initial.patch (using git apply) to the newest revision of postgres at
thetime (da6c4f6ca88) and it failed to patch startup.c.  Thinking that the patch was for some previous revision, I then
checkedout d062245b5, which was from 2016-08-31, and tried applying the patch from there.  I had the same problem:
 
   $ git apply psql-setfileref-initial.patch   error: patch failed: src/bin/psql/startup.c:106   error:
src/bin/psql/startup.c:patch does not apply
 

However, when I applied the changes to startup.c manually and removed them from the patch, the rest of the patch
appliedwithout a problem.  I don't know if I may have done something wrong in trying to apply the patch, but you may
wantto double check if you need to regenerate your patch from the latest revision so it will apply smoothly for
reviewers.

In the meantime, I haven't had a chance to try out the fileref feature yet but I'll give it a go when I get a chance!

Thanks!
Ryan

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

26 September 2016, 16:20:09

2016-09-26 14:57 GMT+02:00 Ryan Murphy <ryanfmurphy@gmail.com>:

Hi Pavel,

I just tried to apply your patch psql-setfileref-initial.patch (using git apply) to the newest revision of postgres at the time (da6c4f6ca88) and it failed to patch startup.c. Thinking that the patch was for some previous revision, I then checked out d062245b5, which was from 2016-08-31, and tried applying the patch from there. I had the same problem:

$ git apply psql-setfileref-initial.patch
error: patch failed: src/bin/psql/startup.c:106
error: src/bin/psql/startup.c: patch does not apply

However, when I applied the changes to startup.c manually and removed them from the patch, the rest of the patch applied without a problem. I don't know if I may have done something wrong in trying to apply the patch, but you may want to double check if you need to regenerate your patch from the latest revision so it will apply smoothly for reviewers.

Thank you for info. I'll do it immediately.

Regards

Pavel

In the meantime, I haven't had a chance to try out the fileref feature yet but I'll give it a go when I get a chance!

Thanks!
Ryan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

26 September 2016, 19:20:48

Hi

2016-09-26 14:57 GMT+02:00 Ryan Murphy <ryanfmurphy@gmail.com>:

Hi Pavel,

I just tried to apply your patch psql-setfileref-initial.patch (using git apply) to the newest revision of postgres at the time (da6c4f6ca88) and it failed to patch startup.c. Thinking that the patch was for some previous revision, I then checked out d062245b5, which was from 2016-08-31, and tried applying the patch from there. I had the same problem:

$ git apply psql-setfileref-initial.patch
error: patch failed: src/bin/psql/startup.c:106
error: src/bin/psql/startup.c: patch does not apply

However, when I applied the changes to startup.c manually and removed them from the patch, the rest of the patch applied without a problem. I don't know if I may have done something wrong in trying to apply the patch, but you may want to double check if you need to regenerate your patch from the latest revision so it will apply smoothly for reviewers.

please, can you check attached patch? It worked in my laptop.

Regards

Pavel

In the meantime, I haven't had a chance to try out the fileref feature yet but I'll give it a go when I get a chance!

Thanks!
Ryan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

psql-setfileref-2016-09-26.patch

Re: proposal: psql \setfileref

From

Ryan Murphy

Date:

26 September 2016, 19:48:05

please, can you check attached patch? It worked in my laptop.

Regards

Pavel

Yes, that one applied for me without any problems.

Thanks,

Ryan

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

26 September 2016, 19:50:20

2016-09-26 21:47 GMT+02:00 Ryan Murphy <ryanfmurphy@gmail.com>:

please, can you check attached patch? It worked in my laptop.

Regards

Pavel

Yes, that one applied for me without any problems.

Great,

Thank you

Pavel

Thanks,
Ryan

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

03 October 2016, 19:55:09

The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            not tested

Contents & Purpose
==================

This patch adds a new type of psql variables: file references, that can be
set using command \setfileref. The value of the named variable is the path
to the referenced file. It allows simple inserting of large data without
necessity of manual escaping or using LO api. Use:
postgres=# create table test (col1 bytea);postgres=# \setfileref a ~/avatar.gifpostgres=# insert into test values(:a);

Content of file is returned as bytea, the text output can be used when
escaping is not required, in this case use convert_from() to obtain a
text output.
postgres=# create table test (col1 text);postgres=# \setfileref a ~/README.txtpostgres=# insert into test
values(convert_from(:a,'utf8'));
 

The content of file reference variables is not persistent in memory.

List of file referenced variable can be listed using \set
postgres=# \set...b = ^'~/README.txt'

Empty file outputs an empty bytea (\x).

Initial Run
===========

The patch applies cleanly to HEAD and doesn't break anything, at least the
regression tests all pass successfully.

Feedback on testing
===============

I see several problems.

1) Error reading referenced file returns the system error and a syntax error
on variable:
postgres=# \setfileref a /etc/sudoerspostgres=# insert into test values (:a);/etc/sudoers: Permission deniedERROR:
syntaxerror at or near ":"LINE 1: insert into t1 values (:a);
 

2) Trying to load a file with size upper than the 1GB limit reports the following
error (size 2254110720 bytes):
postgres=# \setfileref b /home/VMs/ISOs/sol-10-u11-ga-x86-dvd.isopostgres=# insert into test values (:b);INSERT 0
1postgres=#ANALYZE test;postgres=# SELECT relname, relkind, relpages, pg_size_pretty(relpages::bigint*8192) FROM
pg_classWHERE relname='test'; relname | relkind | relpages | pg_size_pretty
---------+---------+----------+----------------test    | r       |        1 | 8192 bytes(1 row)
 
postgres=# select * from test; col1 ------ \x(1 row)

This should not insert an empty bytea but might raise an error instead.

Trying to load smaller file but with bytea escaping total size upper than
the 1GB limit (size 666894336 bytes) reports:
postgres=# \setfileref a /var/ISOs/CentOS-7-x86_64-Minimal-1503-01.isopostgres=# insert into t1 values (1, :a,
'tr');ERROR: out of memoryDETAIL:  Cannot enlarge string buffer containing 0 bytes by 1333788688 more bytes.Time:
1428.657ms (00:01.429)
 

This raise an error as bytea escaping increase content size which is the behavior expected.

3) The path doesn't not allow the use of pipe to system command, which is a secure
behavior, but it is quite easy to perform a DOS by using special files like:
postgres=# \setfileref b /dev/random                                       postgres=# insert into t1 (:b);.

this will never end until we kill the psql session. I think we must also prevent
non regular files to be referenced using S_ISREG.

I think all these three errors can be caught and prevented at variable declaration using some tests on files like:
is readable?is a regular file?is small size enough?

to report an error directly at variable file reference setting.

4) An other problem is that like this this patch will allow anyone to upload into a
column the content of any system file that can be read by postgres system user
and then allow non system user to read its content. For example, connected as
a basic PostgreSQL only user:
testdb=> select current_user; current_user -------------- user1(1 row)testdb=> \setfileref b /etc/passwdtestdb=> insert
intotest values (:b);INSERT 0 1
 

then to read the file:
testdb=> select convert_from(col1, 'utf8') from t1; 

Maybe the referenced files must be limited to some part of the filesystem
or the feature limited to superuser only.

5) There is no documentation at all on this feature. Here a proposal
for inclusion in doc/src/sgml/ref/psql-ref.sgml 
\setfileref name valueSets the internal variable name as a reference to the file pathset as value. To unset a variable,
usethe \unset command.
 
File references are shown as file path prefixed with the ^ characterwhen using the \set command alone.
Valid variable names can contain characters, digits, and underscores.See the section Variables below for details.
Variablenames arecase-sensitive.
 
More detail here about file size, file privilege, etc followingwhat will be decided.


6) I would also like to see \setfileref display all file references variables
defined instead of message "missing required argument". Just like \set and
\pset alone are working. \set can still show the file references variables, that's
not a problem for me.


The new status of this patch is: Waiting on Author

Re: proposal: psql \setfileref

From

Robert Haas

Date:

03 October 2016, 21:03:17

On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
> 4) An other problem is that like this this patch will allow anyone to upload into a
> column the content of any system file that can be read by postgres system user
> and then allow non system user to read its content.

I thought this was a client-side feature, so that it would let a
client upload any file that the client can read, but not things that
can only be read by the postgres system user.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

03 October 2016, 21:24:03

Le 03/10/2016 à 23:03, Robert Haas a écrit :
> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>> 4) An other problem is that like this this patch will allow anyone to upload into a
>> column the content of any system file that can be read by postgres system user
>> and then allow non system user to read its content.
> I thought this was a client-side feature, so that it would let a
> client upload any file that the client can read, but not things that
> can only be read by the postgres system user.
>

Yes that's right, sorry for the noise, forget this fourth report.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

04 October 2016, 07:18:07

Le 03/10/2016 à 23:23, Gilles Darold a écrit :
> Le 03/10/2016 à 23:03, Robert Haas a écrit :
>> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>>> 4) An other problem is that like this this patch will allow anyone to upload into a
>>> column the content of any system file that can be read by postgres system user
>>> and then allow non system user to read its content.
>> I thought this was a client-side feature, so that it would let a
>> client upload any file that the client can read, but not things that
>> can only be read by the postgres system user.
>>
> Yes that's right, sorry for the noise, forget this fourth report.
>

After some more though there is still a security issue here. For a
PostgreSQL user who also have login acces to the server, it is possible
to read any file that the postgres system user can read, especially a
.pgpass or a recovery.conf containing password.


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

04 October 2016, 15:30:35

2016-10-04 9:18 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:

Le 03/10/2016 à 23:23, Gilles Darold a écrit :
> Le 03/10/2016 à 23:03, Robert Haas a écrit :
>> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>>> 4) An other problem is that like this this patch will allow anyone to upload into a
>>> column the content of any system file that can be read by postgres system user
>>> and then allow non system user to read its content.
>> I thought this was a client-side feature, so that it would let a
>> client upload any file that the client can read, but not things that
>> can only be read by the postgres system user.
>>
> Yes that's right, sorry for the noise, forget this fourth report.
>

After some more though there is still a security issue here. For a
PostgreSQL user who also have login acces to the server, it is possible
to read any file that the postgres system user can read, especially a
.pgpass or a recovery.conf containing password.

This patch doesn't introduce any new server side functionality, so if there is some vulnerability, then it is exists now too.

Regards

Pavel

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

04 October 2016, 22:12:43

Le 04/10/2016 à 17:29, Pavel Stehule a écrit :

2016-10-04 9:18 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:
Le 03/10/2016 à 23:23, Gilles Darold a écrit :
> Le 03/10/2016 à 23:03, Robert Haas a écrit :
>> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>>> 4) An other problem is that like this this patch will allow anyone to upload into a
>>> column the content of any system file that can be read by postgres system user
>>> and then allow non system user to read its content.
>> I thought this was a client-side feature, so that it would let a
>> client upload any file that the client can read, but not things that
>> can only be read by the postgres system user.
>>
> Yes that's right, sorry for the noise, forget this fourth report.
>

After some more though there is still a security issue here. For a
PostgreSQL user who also have login acces to the server, it is possible
to read any file that the postgres system user can read, especially a
.pgpass or a recovery.conf containing password.

This patch doesn't introduce any new server side functionality, so if there is some vulnerability, then it is exists now too.

It doesn't exists, that was my system user which have extended privilege. You can definitively forget the fouth point.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

09 October 2016, 09:49:42

hi

2016-10-04 9:18 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:

Le 03/10/2016 à 23:23, Gilles Darold a écrit :
> Le 03/10/2016 à 23:03, Robert Haas a écrit :
>> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>>> 4) An other problem is that like this this patch will allow anyone to upload into a
>>> column the content of any system file that can be read by postgres system user
>>> and then allow non system user to read its content.
>> I thought this was a client-side feature, so that it would let a
>> client upload any file that the client can read, but not things that
>> can only be read by the postgres system user.
>>
> Yes that's right, sorry for the noise, forget this fourth report.
>

After some more though there is still a security issue here. For a
PostgreSQL user who also have login acces to the server, it is possible
to read any file that the postgres system user can read, especially a
.pgpass or a recovery.conf containing password.

here is new update - some mentioned issues are fixed + regress tests and docs

regards

Pavel

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

psql-setfileref-2016-10-09.patch

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

09 October 2016, 15:28:11

here is new update - some mentioned issues are fixed + regress tests and docs

This might tie into some work I'm doing. Is there any way these filerefs could be used as the inline portion of a COPY command?

i.e. like this:

COPY my_table FROM STDIN

:file_ref
\.

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

09 October 2016, 16:13:06

2016-10-09 17:27 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:

here is new update - some mentioned issues are fixed + regress tests and docs

This might tie into some work I'm doing. Is there any way these filerefs could be used as the inline portion of a COPY command?
i.e. like this:

COPY my_table FROM STDIN
:file_ref
\.

I understand, but I am not sure how difficult implementation it is. This part (COPY input) doesn't support parametrization - and parametrization can have a negative performance impact.

Regards

Pavel

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

09 October 2016, 17:15:15

On Sun, Oct 9, 2016 at 12:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2016-10-09 17:27 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:
here is new update - some mentioned issues are fixed + regress tests and docs

This might tie into some work I'm doing. Is there any way these filerefs could be used as the inline portion of a COPY command?
i.e. like this:

COPY my_table FROM STDIN
:file_ref
\.

I understand, but I am not sure how difficult implementation it is. This part (COPY input) doesn't support parametrization - and parametrization can have a negative performance impact.

Regards

I may not understand your response. I was thinking we'd want :file_ref to simply 'cat' the file (or program) in place. The fact that the output was consumed by COPY was coincidental. Does that chance your response?

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

09 October 2016, 17:21:58

2016-10-09 19:14 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:

On Sun, Oct 9, 2016 at 12:12 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2016-10-09 17:27 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:
here is new update - some mentioned issues are fixed + regress tests and docs

This might tie into some work I'm doing. Is there any way these filerefs could be used as the inline portion of a COPY command?
i.e. like this:

COPY my_table FROM STDIN
:file_ref
\.

I understand, but I am not sure how difficult implementation it is. This part (COPY input) doesn't support parametrization - and parametrization can have a negative performance impact.

Regards

I may not understand your response. I was thinking we'd want :file_ref to simply 'cat' the file (or program) in place. The fact that the output was consumed by COPY was coincidental. Does that chance your response?

look to code - some parts allows psql session variables, some not - I can use variables in statements - not in data block.

More, there is ambiguity - :xxx should not be part of SQL statement (and then psql variables are possible), but :xxxx should be part of data.

Regards

Pavel

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

09 October 2016, 17:27:35

look to code - some parts allows psql session variables, some not - I can use variables in statements - not in data block.

More, there is ambiguity - :xxx should not be part of SQL statement (and then psql variables are possible), but :xxxx should be part of data.

Regards

Pavel

Makes sense, thanks for clearing it up.

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

09 October 2016, 17:33:41

2016-10-09 19:27 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:

look to code - some parts allows psql session variables, some not - I can use variables in statements - not in data block.

More, there is ambiguity - :xxx should not be part of SQL statement (and then psql variables are possible), but :xxxx should be part of data.

Regards

Pavel

Makes sense, thanks for clearing it up.

ok

Regards

Pavel

Re: proposal: psql \setfileref

From

Bruce Momjian

Date:

10 October 2016, 02:54:08

On Sun, Oct  9, 2016 at 06:12:16PM +0200, Pavel Stehule wrote:
> 
> 
> 2016-10-09 17:27 GMT+02:00 Corey Huinker <corey.huinker@gmail.com>:
> 
>             here is new update - some mentioned issues are fixed + regress
>             tests and docs
> 
>        
> 
> 
>      
>     This might tie into some work I'm doing. Is there any way these filerefs
>     could be used as the inline portion of a COPY command?
>     i.e. like this:
> 
>     COPY my_table FROM STDIN
>     :file_ref
>     \.
> 
> 
> 
> 
> I understand, but I am not sure how difficult implementation it is. This part
> (COPY input) doesn't support parametrization - and parametrization can have a
> negative performance impact.

And it would need to be \:file_ref in COPY so real data doesn't trigger
it.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

Re: proposal: psql \setfileref

From

Jim Nasby

Date:

10 October 2016, 03:26:19

On 10/9/16 9:54 PM, Bruce Momjian wrote:
>> I understand, but I am not sure how difficult implementation it is. This part
>> > (COPY input) doesn't support parametrization - and parametrization can have a
>> > negative performance impact.
> And it would need to be \:file_ref in COPY so real data doesn't trigger
> it.

ISTM it'd be much saner to just restrict file ref's to only working with 
psql's \COPY, and not server-side COPY.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

10 October 2016, 04:04:33

On Sun, Oct 9, 2016 at 11:26 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 10/9/16 9:54 PM, Bruce Momjian wrote:
I understand, but I am not sure how difficult implementation it is. This part
> (COPY input) doesn't support parametrization - and parametrization can have a
> negative performance impact.
And it would need to be \:file_ref in COPY so real data doesn't trigger
it.

ISTM it'd be much saner to just restrict file ref's to only working with psql's \COPY, and not server-side COPY.

Which is a great discussion for the thread on "COPY as a set returning function". I didn't mean to hijack this thread with a misguided tie-in.

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

10 October 2016, 04:07:54

2016-10-10 5:26 GMT+02:00 Jim Nasby <Jim.Nasby@bluetreble.com>:

On 10/9/16 9:54 PM, Bruce Momjian wrote:
I understand, but I am not sure how difficult implementation it is. This part
> (COPY input) doesn't support parametrization - and parametrization can have a
> negative performance impact.
And it would need to be \:file_ref in COPY so real data doesn't trigger
it.

ISTM it'd be much saner to just restrict file ref's to only working with psql's \COPY, and not server-side COPY.

What should be a benefit, use case for file ref for client side \COPY ?

Regards

Pavel

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

10 October 2016, 07:23:21

Le 09/10/2016 à 11:48, Pavel Stehule a écrit :

hi

2016-10-04 9:18 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:
Le 03/10/2016 à 23:23, Gilles Darold a écrit :
> Le 03/10/2016 à 23:03, Robert Haas a écrit :
>> On Mon, Oct 3, 2016 at 3:54 PM, Gilles Darold <gilles@darold.net> wrote:
>>> 4) An other problem is that like this this patch will allow anyone to upload into a
>>> column the content of any system file that can be read by postgres system user
>>> and then allow non system user to read its content.
>> I thought this was a client-side feature, so that it would let a
>> client upload any file that the client can read, but not things that
>> can only be read by the postgres system user.
>>
> Yes that's right, sorry for the noise, forget this fourth report.
>

After some more though there is still a security issue here. For a
PostgreSQL user who also have login acces to the server, it is possible
to read any file that the postgres system user can read, especially a
.pgpass or a recovery.conf containing password.

here is new update - some mentioned issues are fixed + regress tests and docs

Looks very good for me minus the two following points:

1) I think \setfileref must return the same syntax than \set command

postgres=# \setfileref a testfile.txt
postgres=# \setfileref
a = 'testfile.txt'

postgres=# \setfileref
...
a = ^'testfile.txt'

I think it would be better to prefixed the variable value with the ^ too like in the \set report even if we know by using this command that reported variables are file references.

2) You still allow special file to be used, I understand that this is from the user responsibility but I think it could be a wise precaution.

postgres=# \setfileref b /dev/random
postgres=# insert into test (:b);

Process need to be killed using SIGTERM.

However if this last point is not critical and should be handle by the user, I think this patch is ready to be reviewed by a committer after fixing the first point.

Regards,

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

"Daniel Verite"

Date:

10 October 2016, 12:38:31

Gilles Darold wrote:

>    postgres=# \setfileref b /dev/random
>    postgres=# insert into test (:b);
>
> Process need to be killed using SIGTERM.

This can be fixed by setting sigint_interrupt_enabled to true
before operating on the file. Then ctrl-C would be able to cancel
the command.

See comment in common.c, above the declaration of
sigint_interrupt_enabled:

/*[....]* SIGINT is supposed to abort all long-running psql operations, not only* database queries.  In most places,
thisis accomplished by checking* cancel_pressed during long-running loops.  However, that won't work when* blocked on
userinput (in readline() or fgets()).  In those places, we* set sigint_interrupt_enabled TRUE while blocked,
instructingthe signal* catcher to longjmp through sigint_interrupt_jmp.  We assume readline and* fgets are coded to
handlepossible interruption.  (XXX currently this does* not work on win32, so control-C is less useful there)*/ 

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

10 October 2016, 13:17:43

Le 10/10/2016 à 14:38, Daniel Verite a écrit :
>     Gilles Darold wrote:
>
>>    postgres=# \setfileref b /dev/random
>>    postgres=# insert into test (:b);
>>
>> Process need to be killed using SIGTERM.
> This can be fixed by setting sigint_interrupt_enabled to true
> before operating on the file. Then ctrl-C would be able to cancel
> the command.

If we do not prevent the user to be able to load special files that
would be useful yes.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

10 October 2016, 17:51:15

2016-10-10 15:17 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:

Le 10/10/2016 à 14:38, Daniel Verite a écrit :
> Gilles Darold wrote:
>
>> postgres=# \setfileref b /dev/random
>> postgres=# insert into test (:b);
>>
>> Process need to be killed using SIGTERM.
> This can be fixed by setting sigint_interrupt_enabled to true
> before operating on the file. Then ctrl-C would be able to cancel
> the command.

If we do not prevent the user to be able to load special files that
would be useful yes.

I don't think so special files has some sense, just I had not time fix this issue. I will do it early - I hope.

Regards

Pavel

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

11 October 2016, 05:54:06

2016-10-10 19:50 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2016-10-10 15:17 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:
Le 10/10/2016 à 14:38, Daniel Verite a écrit :
> Gilles Darold wrote:
>
>> postgres=# \setfileref b /dev/random
>> postgres=# insert into test (:b);
>>
>> Process need to be killed using SIGTERM.
> This can be fixed by setting sigint_interrupt_enabled to true
> before operating on the file. Then ctrl-C would be able to cancel
> the command.

If we do not prevent the user to be able to load special files that
would be useful yes.

I don't think so special files has some sense, just I had not time fix this issue. I will do it early - I hope.

fresh patch - only regular files are allowed

Regards

Pavel

Regards

Pavel

--
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Attachment

psql-setfileref-2016-10-11.patch

Re: proposal: psql \setfileref

From

Gilles Darold

Date:

11 October 2016, 07:32:59

Le 11/10/2016 à 07:53, Pavel Stehule a écrit :

2016-10-10 19:50 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2016-10-10 15:17 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:
Le 10/10/2016 à 14:38, Daniel Verite a écrit :
> Gilles Darold wrote:
>
>> postgres=# \setfileref b /dev/random
>> postgres=# insert into test (:b);
>>
>> Process need to be killed using SIGTERM.
> This can be fixed by setting sigint_interrupt_enabled to true
> before operating on the file. Then ctrl-C would be able to cancel
> the command.

If we do not prevent the user to be able to load special files that
would be useful yes.

I don't think so special files has some sense, just I had not time fix this issue. I will do it early - I hope.

fresh patch - only regular files are allowed

Regards

Pavel

Hi,

The patch works as expected, the last two remaining issues have been fixed. I've changed the status to "Ready for committers".

Thanks Pavel.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

11 October 2016, 07:40:58

2016-10-11 9:32 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:

Le 11/10/2016 à 07:53, Pavel Stehule a écrit :

2016-10-10 19:50 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2016-10-10 15:17 GMT+02:00 Gilles Darold <gilles.darold@dalibo.com>:
Le 10/10/2016 à 14:38, Daniel Verite a écrit :
> Gilles Darold wrote:
>
>> postgres=# \setfileref b /dev/random
>> postgres=# insert into test (:b);
>>
>> Process need to be killed using SIGTERM.
> This can be fixed by setting sigint_interrupt_enabled to true
> before operating on the file. Then ctrl-C would be able to cancel
> the command.

If we do not prevent the user to be able to load special files that
would be useful yes.

I don't think so special files has some sense, just I had not time fix this issue. I will do it early - I hope.

fresh patch - only regular files are allowed

Regards

Pavel

Hi,

The patch works as expected, the last two remaining issues have been fixed. I've changed the status to "Ready for committers".
Thanks Pavel.

Thank you very much

Regards

Pavel

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

Re: proposal: psql \setfileref

From

Tom Lane

Date:

09 November 2016, 21:47:47

Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ psql-setfileref-2016-10-11.patch ]

I haven't been paying any attention to this thread, but I got around to
looking at it finally.  I follow the idea of wanting to be able to shove
the contents of a file into a query literal, but there isn't much else
that I like about this patch.  In particular:

* I really dislike the notion of keying the behavior to a special type of
psql variable.  psql variables are untyped at the moment, and if we were
going to introduce typing, this wouldn't be what I'd want to use it for.
I really don't want to introduce typing and then invent one-off,
unextensible syntax like '^' prefixes to denote what type a variable is.

Aside from being conceptually a mess, I don't even find it particularly
convenient.  In the shell, if you want to source from a file, you write
"<filename".  You aren't compelled to assign the filename to a variable
and then write "<$filename" ... although you can if that's actually
helpful.

Going by the notion of driving it off syntax not variable type, I'd
suggest that we extend the colon-variablename syntax to indicate
desire to read a file.  :<filename< is one pretty obvious idea.
Maybe we could use :<:variablename< to indicate substituting the
content of a variable as the file name to read.

* I'm a bit queasy about the idea of automatically switching over to
parameterized queries when we have one of these things in the query.
I'm afraid that that will have user-visible consequences, so I would
rather that psql not do that behind the user's back.  Plus, that assumes
a fact not in evidence, namely that you only ever want to insert data
and not code this way.  (If \i were more flexible, that objection would
be moot, but you can't source just part of a query from \i AFAIK.)
There might be something to be said for a psql setting that controls
whether to handle colon-insertions this way, and make it apply to
the existing :'var' syntax as well as the filename syntax.

* I find the subthread about attaching this to COPY to be pretty much of
a red herring.  What would that do that you can't do today with \copy?
        regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

10 November 2016, 14:40:46

Hi

2016-11-09 22:47 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ psql-setfileref-2016-10-11.patch ]

I haven't been paying any attention to this thread, but I got around to
looking at it finally. I follow the idea of wanting to be able to shove
the contents of a file into a query literal, but there isn't much else
that I like about this patch. In particular:

* I really dislike the notion of keying the behavior to a special type of
psql variable. psql variables are untyped at the moment, and if we were
going to introduce typing, this wouldn't be what I'd want to use it for.
I really don't want to introduce typing and then invent one-off,
unextensible syntax like '^' prefixes to denote what type a variable is.

still I am thinking so some differencing can be nice, because we can use psql file path tab autocomplete.

Maybe \setfileref can stay - it will set any variable, but the autocomplete will be based on file path.

Aside from being conceptually a mess, I don't even find it particularly
convenient. In the shell, if you want to source from a file, you write
"<filename". You aren't compelled to assign the filename to a variable
and then write "<$filename" ... although you can if that's actually
helpful.

Going by the notion of driving it off syntax not variable type, I'd
suggest that we extend the colon-variablename syntax to indicate
desire to read a file. :<filename< is one pretty obvious idea.
Maybe we could use :<:variablename< to indicate substituting the
content of a variable as the file name to read.

I used the concept of file references because I would not to invent new syntax of psql variables evaluation.

If we introduce new syntax, then the variables are not necessary. The syntax :some op has sense, and be used and enhanced in future.

What do you thing about following example?

INSERT INTO tab VALUES(1, :<varname); -- insert text value -- used text escaping

INSERT INTO tab VALUES(1, :<#varname); -- insert bytea value -- used bytea escaping

* I'm a bit queasy about the idea of automatically switching over to
parameterized queries when we have one of these things in the query.
I'm afraid that that will have user-visible consequences, so I would
rather that psql not do that behind the user's back. Plus, that assumes
a fact not in evidence, namely that you only ever want to insert data
and not code this way. (If \i were more flexible, that objection would
be moot, but you can't source just part of a query from \i AFAIK.)
There might be something to be said for a psql setting that controls
whether to handle colon-insertions this way, and make it apply to
the existing :'var' syntax as well as the filename syntax.

I understand to this objection - The my motivation for parametrized queries was better (user friendly) reaction on syntax errors. In this case the content can be big, the query can be big. When we use parametrized queries, then the error message can be short and readable. Another advantage of parametrized queries is possibility to set parameter type. It is important for binary content. And last advantage is a possibility to use binary passing - it is interesting for XML - it allows automatic encoding conversions. These features are nice, but are not necessary for this patch.

* I find the subthread about attaching this to COPY to be pretty much of
a red herring. What would that do that you can't do today with \copy?

The primary task is simple - import big XML, JSON document or some binary data to database. This can be partially solved by ref variables, but COPY has more verbose and more natural syntax - the file path autocomplete can be used.

\COPY table(column) FROM file FLAG;

Second task is not too complex too - export binary data from Postgres and store these data in binary files. Now I have to use final transformation on client side.

Third task - one interesting feature of XML type (automatic encoding conversion) is available only with binary input output functions. I would to find a way how this functionality can be accessed without "hard" programming.

\COPY (SELECT xmldoc FROM xxx WHERE id = 111) TO file BINARY ENCODING latin1;

Regards

Pavel

regards, tom lane

Re: proposal: psql \setfileref

From

Tom Lane

Date:

10 November 2016, 17:56:18

Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2016-11-09 22:47 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> * I really dislike the notion of keying the behavior to a special type of
>> psql variable.

> still I am thinking so some differencing can be nice, because we can use
> psql file path tab autocomplete.
> Maybe \setfileref can stay - it will set any variable, but the autocomplete
> will be based on file path.

Personally, I'd rather have filename tab completion after ":<", and forget
\setfileref --- I do not think that setting a variable first is going to
be the primary use-case for this.  In fact, I could happily dispense with
the variable case entirely, except that sometimes people will want to
reference file names that don't syntactically fit into the colon syntax
(eg, names with spaces in them).  Even that seems surmountable, if people
are okay with requiring the '<' trailer --- I don't think we need to worry
too much about supporting file names with '<' in them.  (Likewise for
'>', if you feel like :<filename> is a less ugly syntax.)

> What do you thing about following example?

> INSERT INTO tab VALUES(1, :<varname); -- insert text value  -- used text
> escaping
> INSERT INTO tab VALUES(1, :<#varname); -- insert bytea value  -- used bytea
> escaping

Seems a bit arbitrary, and not readily extensible to more datatypes.

I guess an advantage of the parameterized-query approach is that we
wouldn't really have to distinguish different datatypes in the syntax,
because we could use the result of Describe Statement to figure out what
to do.  Maybe that outweighs the concern about magic behavioral changes.

On the other hand, it's also arguable that trying to cater for automatic
handling of raw files as bytea literals is overcomplicating the feature
in support of a very infrequent use-case, and giving up some other
infrequent use-cases to get that.  An example is that if we insist on
doing this through parameterized queries, it will not work for inserting
literals into utility commands.  I don't know which of these things is
more important to have.
        regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

10 November 2016, 18:54:32

2016-11-10 18:56 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2016-11-09 22:47 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> * I really dislike the notion of keying the behavior to a special type of
>> psql variable.

> still I am thinking so some differencing can be nice, because we can use
> psql file path tab autocomplete.
> Maybe \setfileref can stay - it will set any variable, but the autocomplete
> will be based on file path.

Personally, I'd rather have filename tab completion after ":<", and forget
\setfileref --- I do not think that setting a variable first is going to
be the primary use-case for this. In fact, I could happily dispense with
the variable case entirely, except that sometimes people will want to
reference file names that don't syntactically fit into the colon syntax
(eg, names with spaces in them). Even that seems surmountable, if people
are okay with requiring the '<' trailer --- I don't think we need to worry
too much about supporting file names with '<' in them. (Likewise for
'>', if you feel like :<filename> is a less ugly syntax.)

In this case I dislike '>' on the end. The semantic is less clear with it.

I though about dropping variables too, but I expecting a expandable variable after colon syntax. So it need special clear syntax for disabling variable expansion.

What about :<{filename} ?

INSERT INTO tab VALUES(1, :<{~/data/doc.txt});

> What do you thing about following example?

> INSERT INTO tab VALUES(1, :<varname); -- insert text value -- used text
> escaping
> INSERT INTO tab VALUES(1, :<#varname); -- insert bytea value -- used bytea
> escaping

Seems a bit arbitrary, and not readily extensible to more datatypes.

there are only two possibilities - text with client encoding to server encoding conversions, and binary - without any conversions.

I guess an advantage of the parameterized-query approach is that we
wouldn't really have to distinguish different datatypes in the syntax,
because we could use the result of Describe Statement to figure out what
to do. Maybe that outweighs the concern about magic behavioral changes.

I don't understand - there is a possibility to detect type of parameter on client side?

On the other hand, it's also arguable that trying to cater for automatic
handling of raw files as bytea literals is overcomplicating the feature
in support of a very infrequent use-case, and giving up some other
infrequent use-cases to get that. An example is that if we insist on
doing this through parameterized queries, it will not work for inserting
literals into utility commands. I don't know which of these things is
more important to have.

I had not idea a complete replacement of current mechanism by query parameters. But a partial usage can be source of inconsistencies :(

Pavel

regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

12 November 2016, 20:44:48

2016-11-10 18:56 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2016-11-09 22:47 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
>> * I really dislike the notion of keying the behavior to a special type of
>> psql variable.

> still I am thinking so some differencing can be nice, because we can use
> psql file path tab autocomplete.
> Maybe \setfileref can stay - it will set any variable, but the autocomplete
> will be based on file path.

Personally, I'd rather have filename tab completion after ":<", and forget
\setfileref --- I do not think that setting a variable first is going to
be the primary use-case for this. In fact, I could happily dispense with
the variable case entirely, except that sometimes people will want to
reference file names that don't syntactically fit into the colon syntax
(eg, names with spaces in them). Even that seems surmountable, if people
are okay with requiring the '<' trailer --- I don't think we need to worry
too much about supporting file names with '<' in them. (Likewise for
'>', if you feel like :<filename> is a less ugly syntax.)

I found early stop - there are not easy implement any complex syntax in lexer, without complex backup rules.

Now, I am coming with little bit different idea, different syntax.

:xxxx means insert some content based on psql variable

We can introduce psql content commands that produces some content. The syntax can be

:{cmd parameters}

Some commands:

* r - read

* rq - read and quote, it can has a alias "<"

* rbq - read binary and quote

Other commands can be introduced in future

Usage:

INSERT INTO foo(image) VALUES(:{rbq ~/avatar.jpg})

INSERT INTO foo(xmdoc) VALUES(:{rq "~/current doc.xml"})

INSERT INTO foo(longtext) VALUES(:{< ~/book.txt})

It is general with possible simple implementation - doesn't big impact on psql lexer complexity.

What do you think about it?

Regards

Pavel

p.s. the colon is not necessary - we can use {} as special case of ``.

INSERT INTO foo(image) VALUES({rbq ~/avatar.jpg})

INSERT INTO foo(xmdoc) VALUES({rq "~/current doc.xml"})

INSERT INTO foo(longtext) VALUES({< ~/book.txt})

Then we can support psql variables there

\set avatar ~/avatar.jpg

INSERT INTO foo(image) VALUES({rbq :avatar})

> What do you thing about following example?

> INSERT INTO tab VALUES(1, :<varname); -- insert text value -- used text
> escaping
> INSERT INTO tab VALUES(1, :<#varname); -- insert bytea value -- used bytea
> escaping

Seems a bit arbitrary, and not readily extensible to more datatypes.

I guess an advantage of the parameterized-query approach is that we
wouldn't really have to distinguish different datatypes in the syntax,
because we could use the result of Describe Statement to figure out what
to do. Maybe that outweighs the concern about magic behavioral changes.

On the other hand, it's also arguable that trying to cater for automatic
handling of raw files as bytea literals is overcomplicating the feature
in support of a very infrequent use-case, and giving up some other
infrequent use-cases to get that. An example is that if we insist on
doing this through parameterized queries, it will not work for inserting
literals into utility commands. I don't know which of these things is
more important to have.

regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

13 November 2016, 18:47:18

Hi

I am sending a initial implementation of psql content commands. This design is reaction to Tom's objections against psql file ref variables. This design is more cleaner, more explicit and more practical - import can be in one step.

Now supported commands are:

r - read file without any modification

rq - read file, escape as literal and use outer quotes

rb - read binary file - transform to hex code string

rbq - read binary file, transform to hex code string and use outer quotes

Usage:

create table testt(a xml);
insert into test values( {rq /home/pavel/.local/share/rhythmbox/rhythmdb.xml} );

\set xxx {r /home/pavel/.local/share/rhythmbox/rhythmdb.xml}

This patch is demo of this design - one part is redundant - I'll clean it in next iteration.

Regards

Pavel

Attachment

psql-content-commands.patch

Re: proposal: psql \setfileref

From

Corey Huinker

Date:

15 November 2016, 15:41:44

On Sun, Nov 13, 2016 at 1:46 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

r - read file without any modification
rq - read file, escape as literal and use outer quotes
rb - read binary file - transform to hex code string
rbq - read binary file, transform to hex code string and use outer quotes

Usage:

I am not asking for this feature now, but do you see any barriers to later adding x/xq/xb/xbq equivalents for executing a shell command?

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

15 November 2016, 15:44:50

2016-11-15 16:41 GMT+01:00 Corey Huinker <corey.huinker@gmail.com>:

On Sun, Nov 13, 2016 at 1:46 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
r - read file without any modification
rq - read file, escape as literal and use outer quotes
rb - read binary file - transform to hex code string
rbq - read binary file, transform to hex code string and use outer quotes

Usage:

I am not asking for this feature now, but do you see any barriers to later adding x/xq/xb/xbq equivalents for executing a shell command?

any other new commands can be appended simply - this is really generic

Regards

Pavel

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

15 November 2016, 16:25:49

Hi

2016-11-13 19:46 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

I am sending a initial implementation of psql content commands. This design is reaction to Tom's objections against psql file ref variables. This design is more cleaner, more explicit and more practical - import can be in one step.

Now supported commands are:

r - read file without any modification
rq - read file, escape as literal and use outer quotes
rb - read binary file - transform to hex code string
rbq - read binary file, transform to hex code string and use outer quotes

Usage:

create table testt(a xml);
insert into test values( {rq /home/pavel/.local/share/rhythmbox/rhythmdb.xml} );
\set xxx {r /home/pavel/.local/share/rhythmbox/rhythmdb.xml}

This patch is demo of this design - one part is redundant - I'll clean it in next iteration.

Regards

here is cleaned patch

* the behave is much more psqlish - show error only when the command is exactly detected - errors are related to processed files only - the behave is similar to psql variables - we doesn't raise a error, when the variable doesn't exists.

* no more duplicate code

* some basic doc

* some basic regress tests

Regards

Pavel

Pavel

Attachment

psql-content-commands-01.patch

Re: proposal: psql \setfileref

From

"Daniel Verite"

Date:

15 November 2016, 16:39:51

    Corey Huinker wrote:

> I am not asking for this feature now, but do you see any barriers to later
> adding x/xq/xb/xbq equivalents for executing a shell command?

For text contents, we already have this (pasted right from the doc):

testdb=> \set content `cat my_file.txt`
testdb=> INSERT INTO my_table VALUES (:'content');

Maybe we might look at why it doesn't work for binary string and fix that?

I can think of three reasons:

- psql use C-style '\0' terminated string implying no nul byte in variables.
That could potentially be fixed by tweaking the data structures and
accessors.

- the backtick operator trims ending '\n' from the result of the command
(like the shell), but we could add a variant that doesn't have this behavior.

- we don't have "interpolate as binary", an operator that will
essentially apply PQescapeByteaConn rather than PQescapeStringConn.

For example, I'd suggest this syntax:

-- slurp a file into a variable, with no translation whatsoever
\set content ``cat my_binary_file``

-- interpolate as binary (backquotes instead of quotes)
INSERT INTO my_table VALUES(:`content`);

If we had something like this, would we still need filerefs? I feel like
the point of filerefs is mainly to work around lack of support for
binary in variables, but maybe supporting the latter directly would
be an easier change to accept.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

15 November 2016, 16:49:10

2016-11-15 17:39 GMT+01:00 Daniel Verite <daniel@manitou-mail.org>:

Corey Huinker wrote:

> I am not asking for this feature now, but do you see any barriers to later
> adding x/xq/xb/xbq equivalents for executing a shell command?

For text contents, we already have this (pasted right from the doc):

testdb=> \set content `cat my_file.txt`
testdb=> INSERT INTO my_table VALUES (:'content');

Maybe we might look at why it doesn't work for binary string and fix that?

I can think of three reasons:

- psql use C-style '\0' terminated string implying no nul byte in variables.
That could potentially be fixed by tweaking the data structures and
accessors.

- the backtick operator trims ending '\n' from the result of the command
(like the shell), but we could add a variant that doesn't have this behavior.

- we don't have "interpolate as binary", an operator that will
essentially apply PQescapeByteaConn rather than PQescapeStringConn.

For example, I'd suggest this syntax:

-- slurp a file into a variable, with no translation whatsoever
\set content ``cat my_binary_file``

-- interpolate as binary (backquotes instead of quotes)
INSERT INTO my_table VALUES(:`content`);

If we had something like this, would we still need filerefs? I feel like
the point of filerefs is mainly to work around lack of support for
binary in variables, but maybe supporting the latter directly would
be an easier change to accept.

I leaved a concept of fileref - see Tom's objection. I know, so I can hack ``, but I would not do it. It is used for shell (system) calls, and these functionality can depends on used os. The proposed content commands can be extended more, and you are doesn't depends on o.s. Another issue of your proposal is hard support for tab complete (file path).

Please, try to test my last patch.

Regards

Pavel

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Re: proposal: psql \setfileref

From

Robert Haas

Date:

16 November 2016, 15:59:16

On Tue, Nov 15, 2016 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> For text contents, we already have this (pasted right from the doc):
>>
>> testdb=> \set content `cat my_file.txt`
>> testdb=> INSERT INTO my_table VALUES (:'content');
>>
>> Maybe we might look at why it doesn't work for binary string and fix that?
>>
>> I can think of three reasons:
>>
>> - psql use C-style '\0' terminated string implying no nul byte in
>> variables.
>> That could potentially be fixed by tweaking the data structures and
>> accessors.
>>
>> - the backtick operator trims ending '\n' from the result of the command
>> (like the shell), but we could add a variant that doesn't have this
>> behavior.
>>
>> - we don't have "interpolate as binary", an operator that will
>> essentially apply PQescapeByteaConn rather than PQescapeStringConn.
>>
>> For example, I'd suggest this syntax:
>>
>> -- slurp a file into a variable, with no translation whatsoever
>> \set content ``cat my_binary_file``
>>
>> -- interpolate as binary (backquotes instead of quotes)
>> INSERT INTO my_table VALUES(:`content`);
>>
>> If we had something like this, would we still need filerefs? I feel like
>> the point of filerefs is mainly to work around lack of support for
>> binary in variables, but maybe supporting the latter directly would
>> be an easier change to accept.
>
> I leaved a concept of fileref - see Tom's objection. I know, so I can hack
> ``, but I would not do it. It is used for shell (system) calls, and these
> functionality can depends on used os. The proposed content commands can be
> extended more, and you are doesn't depends on o.s. Another issue of your
> proposal is hard support for tab complete (file path).

On the other hand, it seems like you've invented a whole new system of
escaping and interpolation that is completely disconnected from the
one we already have.  psql is already full of features that nobody
knows about identified by incomprehensible character combinations.
Daniel's suggestion would at least be more similar to what already
exists.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

16 November 2016, 16:16:15

2016-11-16 16:59 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:

On Tue, Nov 15, 2016 at 11:48 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> For text contents, we already have this (pasted right from the doc):
>>
>> testdb=> \set content `cat my_file.txt`
>> testdb=> INSERT INTO my_table VALUES (:'content');
>>
>> Maybe we might look at why it doesn't work for binary string and fix that?
>>
>> I can think of three reasons:
>>
>> - psql use C-style '\0' terminated string implying no nul byte in
>> variables.
>> That could potentially be fixed by tweaking the data structures and
>> accessors.
>>
>> - the backtick operator trims ending '\n' from the result of the command
>> (like the shell), but we could add a variant that doesn't have this
>> behavior.
>>
>> - we don't have "interpolate as binary", an operator that will
>> essentially apply PQescapeByteaConn rather than PQescapeStringConn.
>>
>> For example, I'd suggest this syntax:
>>
>> -- slurp a file into a variable, with no translation whatsoever
>> \set content ``cat my_binary_file``
>>
>> -- interpolate as binary (backquotes instead of quotes)
>> INSERT INTO my_table VALUES(:`content`);
>>
>> If we had something like this, would we still need filerefs? I feel like
>> the point of filerefs is mainly to work around lack of support for
>> binary in variables, but maybe supporting the latter directly would
>> be an easier change to accept.
>
> I leaved a concept of fileref - see Tom's objection. I know, so I can hack
> ``, but I would not do it. It is used for shell (system) calls, and these
> functionality can depends on used os. The proposed content commands can be
> extended more, and you are doesn't depends on o.s. Another issue of your
> proposal is hard support for tab complete (file path).

On the other hand, it seems like you've invented a whole new system of
escaping and interpolation that is completely disconnected from the
one we already have. psql is already full of features that nobody
knows about identified by incomprehensible character combinations.
Daniel's suggestion would at least be more similar to what already
exists.

The Daniel's proposal has important issues:

1. you need to store and hold the content in memory

2. you cannot use tab complete - without it this feature lost a usability

3. you have to do two steps

4. Depends on o.s.

Regards

Pavel

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: proposal: psql \setfileref

From

Tom Lane

Date:

16 November 2016, 16:58:11

Pavel Stehule <pavel.stehule@gmail.com> writes:
> The Daniel's proposal has important issues:

> 1. you need to store and hold the content in memory
> 2. you cannot use tab complete - without it this feature lost a usability
> 3. you have to do two steps
> 4. Depends on o.s.

I think you're putting *way* too much emphasis on tab completion of the
filename.  That's a nice-to-have, but it should not cause us to contort
the feature design to get it.

I'm not especially impressed by objections 3 and 4, either.  Point #1
has some merit, but since the wire protocol is going to limit us to
circa 1GB anyway, it doesn't seem fatal.

What I like about Daniel's proposal is that because it adds two separate
new behaviors, it can be used for more things than just "interpolate a
file".  What I don't like is that we're still stuck in the land of textual
interpolation into query strings, which as you noted upthread is not
very user-friendly for long values.  I thought there was some value in
your original idea of having a way to get psql to use extended query mode
with the inserted value being sent as a parameter.  But again, I'd rather
see that decoupled as a separate feature and not tightly tied to the
use-case of interpolating a file.

Maybe using extended mode could be driven off a setting rather than being
identified by syntax?  There doesn't seem to be much reason why you'd want
some of the :-interpolated values in a query to be inlined and others sent
as parameters.  Also, for testing purposes, it'd be mighty nice to have a
way of invoking extended query mode in psql with a clear way to define
which things are sent as parameters.  But I don't want to have to make
separate files to contain the values being sent.
        regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

16 November 2016, 19:29:08

2016-11-16 17:58 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:
> The Daniel's proposal has important issues:

> 1. you need to store and hold the content in memory
> 2. you cannot use tab complete - without it this feature lost a usability
> 3. you have to do two steps
> 4. Depends on o.s.

I think you're putting *way* too much emphasis on tab completion of the
filename. That's a nice-to-have, but it should not cause us to contort

the feature design to get it.

I am sorry, I cannot to agree. When you cannot use tab complete in interactive mode, then you lost valuable help.

I'm not especially impressed by objections 3 and 4, either. Point #1
has some merit, but since the wire protocol is going to limit us to
circa 1GB anyway, it doesn't seem fatal.

There are not "cat" on ms win. For relative basic functionality you have to switch between platforms.

What I like about Daniel's proposal is that because it adds two separate
new behaviors, it can be used for more things than just "interpolate a
file". What I don't like is that we're still stuck in the land of textual
interpolation into query strings, which as you noted upthread is not
very user-friendly for long values. I thought there was some value in
your original idea of having a way to get psql to use extended query mode
with the inserted value being sent as a parameter. But again, I'd rather
see that decoupled as a separate feature and not tightly tied to the
use-case of interpolating a file.

With content commands syntax, I am able to control it simply - there is space for invention of new features.

the my patch has full functionality of Daniel's proposal

\set xxx {rb somefile} - is supported already in my last patch

Now I am working only with real files, but the patch can be simply extended to work with named pipes, with everything. There is a space for extending.

Maybe using extended mode could be driven off a setting rather than being
identified by syntax?

It is possible, but I don't think so it is user friendly - what is worst - use special syntax or search setting some psql set value?

There doesn't seem to be much reason why you'd want
some of the :-interpolated values in a query to be inlined and others sent
as parameters. Also, for testing purposes, it'd be mighty nice to have a
way of invoking extended query mode in psql with a clear way to define
which things are sent as parameters. But I don't want to have to make
separate files to contain the values being sent.

regards, tom lane

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

17 November 2016, 11:01:40

Hi

a independent implementation of parametrized queries can looks like attached patch:

this code is simple without any unexpected behave.

parameters are used when user doesn't require escaping and when PARAMETRIZED_QUERIES variable is on

Regards

Pavel

Attachment

psql-parametrized-queries.patch

Re: proposal: psql \setfileref

From

Pavel Stehule

Date:

17 November 2016, 18:54:44

Hi

2016-11-17 12:00 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

a independent implementation of parametrized queries can looks like attached patch:

this code is simple without any unexpected behave.

parameters are used when user doesn't require escaping and when PARAMETRIZED_QUERIES variable is on

Regards

here is a Daniel's syntax patch

The independent implementation of using query parameters is good idea, the patch looks well, and there is a agreement with Tom.

I implemented a Daniel proposal - it is few lines, but personally I prefer curly bracket syntax against `` syntax. When separator is double char, then the correct implementation will not be simple - the bad combinations "` ``, `` `" should be handled better. I wrote my arguments for my design, but if these arguments are not important for any other, then I can accept any other designs.

regards

Pavel

Pavel

Attachment

psql-backtick-hex-output.patch

Re: proposal: psql \setfileref

From

Haribabu Kommi

Date:

02 December 2016, 06:40:52

On Fri, Nov 18, 2016 at 5:53 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hi

2016-11-17 12:00 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

a independent implementation of parametrized queries can looks like attached patch:

this code is simple without any unexpected behave.

parameters are used when user doesn't require escaping and when PARAMETRIZED_QUERIES variable is on

Regards

here is a Daniel's syntax patch

The independent implementation of using query parameters is good idea, the patch looks well, and there is a agreement with Tom.

I implemented a Daniel proposal - it is few lines, but personally I prefer curly bracket syntax against `` syntax. When separator is double char, then the correct implementation will not be simple - the bad combinations "` ``, `` `" should be handled better. I wrote my arguments for my design, but if these arguments are not important for any other, then I can accept any other designs.

The recent updated patch as per the agreement hasn't received any feedback

from reviewers. Moved to next CF with "needs review" status.

Regards,

Hari Babu

Fujitsu Australia