Thread: Alternative to \copy in psql modelled after \g

Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
  Hi,

Currently \copy cannot span multiple lines (like any meta-command)
and cannot use psql variables whereas \g can do both.

The POC patch attached implements two meta-commands \copyfrom
and \copyto that are to COPY what \g is to any other query:

- they take the COPY query already var-expanded from the query buffer,
which must mention FROM STDIN or TO STDOUT.

- they accept an argument declaring the local data source or destination,
either a filename or a program (|command args) or empty for stdin/stdout.

By contrast \copy has a specific parser to extract the data source
or dest from its line of arguments, plus whether it's a COPY FROM or TO,
and build a COPY query from that.

Examples of use

1. $ psql -v filename="/path/data-$(date -I).csv"
COPY (SELECT *
   FROM table
   WHERE ...)
TO STDOUT (FORMAT csv) \copyto :filename

2. -- copy only the first 100 lines
COPY table FROM stdin \copyfrom |head -n 100 /data/datafile.txt

3. $ cat script.sql
COPY table1 FROM stdin;  -- copy inline data
data line
data line
\.

-- copy data from psql's stdin
COPY table2 FROM stdin \copyfrom

# copy both in-script and out-of-script data
$ psql -f script.sql < table2.data

Comments? Does that look useful as an alternative to \copy ?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

Re: Alternative to \copy in psql modelled after \g

From
"David G. Johnston"
Date:
On Fri, Nov 9, 2018 at 4:19 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> Examples of use
>
> 1. $ psql -v filename="/path/data-$(date -I).csv"
> COPY (SELECT *
>    FROM table
>    WHERE ...)
> TO STDOUT (FORMAT csv) \copyto :filename

Do I understand correctly that you are proposing a slightly less
verbose alternative of:

\o :filename
COPY TO STDOUT
\o

David J.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    David G. Johnston wrote:

> Do I understand correctly that you are proposing a slightly less
> verbose alternative of:
>
> \o :filename
> COPY TO STDOUT
> \o

Not strictly the same because of this:

\o or \out [ filename ]
....
  “Query results” includes all tables, command responses, and notices
  obtained from the database server, as well as output of various
  backslash commands that query the database (such as \d); but not
  error messages.

If for instance psql received a notification between the two \o it would
end up in the file, which it wouldn't with \copyto.
The same is true for  SELECT... \g file as opposed to \o file

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
"David G. Johnston"
Date:
On Fri, Nov 9, 2018 at 9:35 AM Daniel Verite <daniel@manitou-mail.org> wrote:
> If for instance psql received a notification between the two \o it would
> end up in the file, which it wouldn't with \copyto.
> The same is true for  SELECT... \g file as opposed to \o file

Is that something we could change instead of adding a new command?
POLA, for me at least, would be for \g [filename] to do exactly what
you are describing with the \copyto feature. That is doesn't and there
haven't been complaints I would chalk up to notices and command
responses not being prevalent in situations where people would use \g
[filename] or paired \o [filename].

David J.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    David G. Johnston wrote:

> POLA, for me at least, would be for \g [filename] to do exactly what
> you are describing with the \copyto feature.

I admit that if we could improve \g to handle COPY, it would be more
elegant than the current proposal adding two meta-commands.

But the copy-workflow and non-copy-workflow are different, and in
order to know which one to start, \g would need to analyze the query
to determine whether it's a COPY FROM, COPY TO or something else.
psql parses queries syntactically, but not semantically AFAIK, and I
suspect we don't want to start doing that, as it breaks a separation
of concerns.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    I wrote:

> I admit that if we could improve \g to handle COPY, it would be more
> elegant than the current proposal adding two meta-commands.
>
> But the copy-workflow and non-copy-workflow are different, and in
> order to know which one to start, \g would need to analyze the query

It turns out I was wrong on this. The workflows are different but when
psql receives the first PGresult, it's still time to handle the I/O
redirection. It just differs from \copy in the case of an error:
\copy won't even send a command to the server if the local output
stream can't be opened, whereas COPY \g would send the command, and
will have to deal with the client-side error afterwards.

Well, except that up to now, COPY ... TO STDOUT \g file or \g |command
has been ignoring "file" or "|command", which is arguably a bug as Tom
puts it in another discussion in [1].

So as a replacement for the \copyto I was proposing earlier, PFA a patch
for COPY TO STDOUT to make use of the argument to \g.


[1] bug #15535
https://www.postgresql.org/message-id/6309.1544031175@sss.pgh.pa.us


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> So as a replacement for the \copyto I was proposing earlier, PFA a patch
> for COPY TO STDOUT to make use of the argument to \g.

Sounds plausible, please add to next commitfest so we don't forget it.

            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Bonjour Daniel,

>> But the copy-workflow and non-copy-workflow are different, and in
>> order to know which one to start, \g would need to analyze the query
>
> It turns out I was wrong on this. The workflows are different but when
> psql receives the first PGresult, it's still time to handle the I/O
> redirection. It just differs from \copy in the case of an error:
> \copy won't even send a command to the server if the local output
> stream can't be opened, whereas COPY \g would send the command, and
> will have to deal with the client-side error afterwards.
>
> Well, except that up to now, COPY ... TO STDOUT \g file or \g |command
> has been ignoring "file" or "|command", which is arguably a bug as Tom
> puts it in another discussion in [1].
>
> So as a replacement for the \copyto I was proposing earlier, PFA a patch
> for COPY TO STDOUT to make use of the argument to \g.

Patch applies cleanly, compiles, make check is ok.

However, it does not contain any tests (bad:-) nor documentation (hmmm... 
maybe none needed, though).

Is this just kind of a bug fix? Beforehand the documentation says "\g fn" 
sends to file, but that was not happening with COPY, and now it does as it 
says?

A question: if opening the output file fails, should not the query be 
cancelled and an error be reported? Maybe it is too late for that. It 
seems that "SELECT pg_sleep(10) \g /bad/file" fails but executes the query 
nevertheless.

ISTM that overriding copystream on open failures is not necessary, because 
its value is already NULL in this case.

I'd suggest that is_pipe should be initialized to false, otherwise it is 
unclear from the code when it is set before use, and some compilers may 
complain.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> Is this just kind of a bug fix? Beforehand the documentation says "\g fn"
> sends to file, but that was not happening with COPY, and now it does as it
> says?

Yes. The doc says about \g:

  Sends the current query buffer to the server for execution. If an
  argument is given, the query's output is written to the named file
  or piped to the given shell command, instead of displaying it as
  usual.

It does not add "unless the query is a COPY", so it seems right
to make that just work, and call it a bug fix.

> A question: if opening the output file fails, should not the query
> be cancelled and an error be reported? Maybe it is too late for
> that. It seems that "SELECT pg_sleep(10) \g /bad/file" fails but
> executes the query nevertheless.

Yes. This part works as documented:

  "The file or command is written to only if the query successfully
  returns zero or more tuples, not if the query fails or is a
  non-data-returning SQL command."

> However, it does not contain any tests (bad:-)

Testing this requires at least some interaction with the OS which I'm
uncomfortable to add. There is a precedent in
regress/sql/hs_standby_allowed.sql doing:

  COPY hs1 TO '/tmp/copy_test'
  \! cat /tmp/copy_test

We could add something like that in psql.sql, but I'm not fond of it
because it assumes a Unix-ish environment, and that it's OK to clobber
the hardcoded /tmp/copy_test should it preexist. I'd rather work with
a dedicated temporary directory. On Linux mktemp -d could be used, but
I don't know about a portable solution, plus POSIX has deprecated
mktemp.
I'm open to ideas on a portable way for psql.sql to test \g writing to a
file or a program, but ATM my inclination is to not add that test.


> nor documentation (hmmm... maybe none needed, though).

\copy has this paragraph:

  "The syntax of this command is similar to that of the SQL COPY
  command. All options other than the data source/destination are as
  specified for COPY. Because of this, special parsing rules apply to
  the \copy meta-command. Unlike most other meta-commands, the entire
  remainder of the line is always taken to be the arguments of \copy,
  and neither variable interpolation nor backquote expansion are
  performed in the arguments".

We could add that COPY TO STDOUT with a redirection might be used as an
alternative. The downside is that with four paragraphs plus one tip,
the explanations on \copy give already a lot to chew on, so is it
worth to add more?


> ISTM that overriding copystream on open failures is not necessary, because
> its value is already NULL in this case.
>
> I'd suggest that is_pipe should be initialized to false, otherwise it is
> unclear from the code when it is set before use, and some compilers may
> complain.

OK, will consider these in the next revision.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Hello,

>> Is this just kind of a bug fix? Beforehand the documentation says "\g fn"
>> sends to file, but that was not happening with COPY, and now it does as it
>> says?
>
> Yes. [...]
>
> It does not add "unless the query is a COPY", so it seems right
> to make that just work, and call it a bug fix.

Does this suggest backpatching?

>> However, it does not contain any tests (bad:-)
>
> Testing this requires at least some interaction with the OS which I'm
> uncomfortable to add.

Hmmm. This means that "\g filename" goes fully untested:-( A casual grep 
seems to confirm this. Sigh:-(

> There is a precedent in regress/sql/hs_standby_allowed.sql doing:
>
>  COPY hs1 TO '/tmp/copy_test'
>  \! cat /tmp/copy_test

Indeed. I'm unsure windows has cat or /tmp, so I do not understand how it 
works on such platform. Maybe I'm missing something.

> We could add something like that in psql.sql, but I'm not fond of it
> because it assumes a Unix-ish environment,

Yep.

> I'm open to ideas on a portable way for psql.sql to test \g writing to a
> file or a program, but ATM my inclination is to not add that test.

A relative file could be ok? After some testing, the standard regression 
tests do not cd to some special place, so it may fail?

However TAP tests do that, and I have used this extensively with pgbench, 
so a psql TAP test could do that and other things, such as importing a csv 
file or whatever.

>> nor documentation (hmmm... maybe none needed, though).
>
> \copy has this paragraph:
>
>  "The syntax of this command is similar to that of the SQL COPY
>  command. All options other than the data source/destination are as
>  specified for COPY. Because of this, special parsing rules apply to
>  the \copy meta-command. Unlike most other meta-commands, the entire
>  remainder of the line is always taken to be the arguments of \copy,
>  and neither variable interpolation nor backquote expansion are
>  performed in the arguments".
>
> We could add that COPY TO STDOUT with a redirection might be used as an
> alternative. The downside is that with four paragraphs plus one tip,
> the explanations on \copy give already a lot to chew on, so is it
> worth to add more?

I'd say that a small paragraph with the tip would be ok.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> > It does not add "unless the query is a COPY", so it seems right
> > to make that just work, and call it a bug fix.
>
> Does this suggest backpatching?

Yes, I think it's a candidate for back-patching.

> > There is a precedent in regress/sql/hs_standby_allowed.sql doing:
> >
> >  COPY hs1 TO '/tmp/copy_test'
> >  \! cat /tmp/copy_test
>
> Indeed. I'm unsure windows has cat or /tmp, so I do not understand how it
> works on such platform. Maybe I'm missing something.

It's exercised only on a standby. Possibly few machines run this test,
among which none powered by Windows? And maybe it even works
on Windows in some cases: the reference to /tmp would work
in an MSYS/MingW environment and "cat" might too if \! gets
to the /bin/sh of that environment.

> However TAP tests do that, and I have used this extensively with pgbench,
> so a psql TAP test could do that and other things, such as importing a csv
> file or whatever.

It looks a significant step forward, to be brought by a patch on its own
without prospect of being back-patched.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
> It looks a significant step forward, to be brought by a patch on its own
> without prospect of being back-patched.

Dunno. Even if an additional tap test would not be backpatched, it could 
be added on master. I'm basically sadden by pg test coverage, especially 
psql which is under 40%, so I try to grasp any improvement opportunity…

See https://coverage.postgresql.org/

-- 
Fabien.

Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> Dunno. Even if an additional tap test would not be backpatched, it could
> be added on master. I'm basically sadden by pg test coverage, especially
> psql which is under 40%, so I try to grasp any improvement opportunity…
>
> See https://coverage.postgresql.org/

Maybe I misunderstand something, as I'm not familiar with TAP tests,
but hasn't psql no such test to begin with, as opposed to the
other programs in src/bin that have a t/ directory?

$ find . -name t
./pg_resetwal/t
./scripts/t
./pg_archivecleanup/t
./pg_verify_checksums/t
./pg_config/t
./pg_controldata/t
./pgbench/t
./pg_rewind/t
./pg_basebackup/t
./pg_dump/t
./initdb/t
./pg_ctl/t


In that case, the first thing we'd need is to add check and installcheck
targets in .../psql/Makefile, and a t/ directory with at least one Perl
script.
If that's the way to go forward, let's just do that in a patch
with a specific entry in the next CF like "Add TAP tests to psql".
Personally I'll be willing to submit and review new tests in t
independently of the patch discussed in $subject.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Alvaro Herrera
Date:
On 2018-Dec-27, Daniel Verite wrote:

> 
> Maybe I misunderstand something, as I'm not familiar with TAP tests,
> but hasn't psql no such test to begin with, as opposed to the
> other programs in src/bin that have a t/ directory?

That's correct.  psql does have some tests though, in
src/test/regress/sql/psql.sql and psql_crosstab.sql.  It's also tested
indirectly because it's used to run all the src/test/regress files.

If you want to add more tests and increase coverage, that's a good goal,
but keep in mind those other files that can also be used.  It doesn't
all have to be TAP.

Some things such as help.c, sql_help.c are hard to test. describe.c
could use more coverage for sure, but lots of it is version-specific,
which makes things harder.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
I took a quick look at this patch.  Some thoughts:

* I think the right way to proceed is to commit (and back-patch)
this without any regression test case, and maybe later look into
adding a TAP test that covers "\g file".  We have no test cases
for any variant of "\g file", and because of platform variability
and suchlike considerations, adding that seems like a project of
its own --- one I'd not want to back-patch.

* I do agree with documenting this by adding some small note to the
discussion of \copy.

* I think you've made the wrong choice about how this interacts with
the pset.copyStream option.  If copyStream is set, that should
continue to take priority, IMO, as anything else would break do_copy's
assumptions about what will happen.  One example of why it would be
problematic is that both levels of code would think they control what
to do with the sigpipe trap.  Now, it's likely that \copy's syntax
makes it impossible for both copyStream and gfname to be set at the
same time anyway, because \copy extends to the end of the line.  But
if it were to happen, we don't want it to be something that do_copy
has to take into account; better therefore just to leave gfname alone.
(Note also the comment just above where you patched, which you failed
to update.)

* You need to be more careful about error cases.  I note that
openQueryOutputFile is capable of returning failure with is_pipe
set true, which would lead this patch to disable the sigpipe trap
and never re-enable it.

            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> I took a quick look at this patch.

PFA an updated patch addressing your comments and Fabien's.

I've also changed handleCopyOut() to return success if it
could pump the data without writing it out locally for lack of
an output stream. It seems to make more sense like that.

While adding the note to the doc I've noticed that the other \copy
tip says:

 "This operation is not as efficient as the SQL COPY command because
 all data must pass through the client/server connection. For large
 amounts of data the SQL command might be preferable.

It doesn't specify that it's for COPY TO/FROM file, not COPY TO
STDOUT/FROM STDIN. Of course the latter would rank the same as \copy
with respect to client/server throughput.  Should this tip be more
specific?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Bonjour Daniel,

>> I took a quick look at this patch.
>
> PFA an updated patch addressing your comments and Fabien's.

About this v2.

Applies cleanly, compiles cleanly, "make check" ok.

No tests, but Tom suggests this does not belong here: the committer is 
right:-)

I tested the feature manually, and I noticed that with your patch
ROW_COUNT is set more consistently:

   sql> COPY pgbench_branches TO STDOUT \g /dev/null # COPY 10
   sql> \echo :ROW_COUNT # 10

But previously we had:

   sql> \echo :ROW_COUNT # 0

This is an improvement, although I'm not sure where it comes from.

> I've also changed handleCopyOut() to return success if it
> could pump the data without writing it out locally for lack of
> an output stream. It seems to make more sense like that.

I'm hesitating on this one.

On the one hand the SQL query is executed, on the other hand the \g 
command partially failed. There is a debatable side effect: If there is an 
error, eg:

   sql> COPY pgbench_branches TO STDOUT \g /NIET
   /NIET: Permission denied
   sql> \echo :ERROR :ROW_COUNT # true 0

I understand from the code that the COPY is really executed, so the ERROR
and so ROW_COUNT about the SQL should reflect that. Basically the change 
makes the client believe that there is an SQL error whereas the error is 
on the client.

Does anyone else have an opinion?

About the code:

I'm unclear whether the project policy accepts "foo" for "foo != NULL", 
whether the later is prefered, or whether it does not care about it.

   /*
    * COPY TO STDOUT \g [|]file may be used as an alternative
    * to \copy
    */

The later part of the comment is already stated in the documentation, I'm 
not sure it is worth repeating it here. I'd suggest a simpler /* handle 
"COPY TO STDOUT \g ..." */

> While adding the note to the doc I've noticed that the other \copy
> tip says:
>
> "This operation is not as efficient as the SQL COPY command because
> all data must pass through the client/server connection. For large
> amounts of data the SQL command might be preferable.
>
> It doesn't specify that it's for COPY TO/FROM file, not COPY TO
> STDOUT/FROM STDIN. Of course the latter would rank the same as \copy
> with respect to client/server throughput.  Should this tip be more
> specific?

Yep.

This tip also overlooks that the client and server are not necessary on 
the same host with the same permissions: it seems to say "prefer COPY to 
\copy", while the alternative may work only under special conditions which 
are not hinted about in any way.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> > I've also changed handleCopyOut() to return success if it
> > could pump the data without writing it out locally for lack of
> > an output stream. It seems to make more sense like that.
>
> I'm hesitating on this one.
>
> On the one hand the SQL query is executed, on the other hand the \g
> command partially failed. There is a debatable side effect: If there is an
> error, eg:

The change mentioned above is not supposed to have any user-visible
effect, compared to the previous version. The success of the command
as a whole, that is reflected by :ERROR, is the result of AND'ing the
successes of different steps of the command. Having handleCopyOut()
succeed does not change the falseness of the aggregated result,
which is still false when it fails to open the output stream.

> But previously we had:
>
>   sql> \echo :ROW_COUNT # 0

Previously "COPY... \g file" behaved as "COPY... \g", the argument being
ignored. In this case, and the patch doesn't change that, the code does:

    /*
     * Suppress status printing if the report would go to the same
     * place as the COPY data just went.  Note this doesn't
     * prevent error reporting, since handleCopyOut did that.
     */
    if (copystream == pset.queryFout)
    {
        PQclear(copy_result);
        copy_result = NULL;
    }

One effect of clearing that result is that :ROW_COUNT is going to
be set to zero by SetResultVariables(), which explains the above
output.

:ROW_COUNT is incorrect after COPY TO STDOUT but this is a PG11-only
bug, :ROW_COUNT being a recent addition, so maybe we should deal with
this separately, since the current patch is supposed to address all versions?

> I understand from the code that the COPY is really executed, so the ERROR
> and so ROW_COUNT about the SQL should reflect that. Basically the change
> makes the client believe that there is an SQL error whereas the error is
> on the client.

Right, but wether COPY fails because psql can't write the output,
possibly half-way because of a disk full condition, or because the
query was cancelled or the server went down, are these distinctions
meaningful for a script?
If we say yes, how can the user know that the data fetched is
empty or incomplete? We don't have a CLIENT_SIDE_ERROR variable.
Also, the fact that psql retrieves the results when it doesn't have
a valid destination for them is an implementation detail.
I think we could also cancel the query in a way that would be
technically an SQL error, as Ctrl+C would do.
Hiding these details under a generic ERROR=true seems reasonable,
especially since we expose SQLSTATE for fine-grained error checking,
should that be needed.
ERROR=true and SQLSTATE empty is already mentioned as plausible
in SetResultVariables():

    SetVariable(pset.vars, "ERROR", "true");

    /*
     * If there is no SQLSTATE code, use an empty string.  This can
happen
     * for libpq-detected errors (e.g., lost connection, ENOMEM).
     */
    if (code == NULL)
        code = "";
    SetVariable(pset.vars, "SQLSTATE", code);


> The later part of the comment is already stated in the documentation, I'm
> not sure it is worth repeating it here. I'd suggest a simpler /* handle
> "COPY TO STDOUT \g ..." */

The bug we're fixing here is due to missing the point the comment is
making, so being thick seems fair.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Bonjour Daniel,

> :ROW_COUNT is incorrect after COPY TO STDOUT but this is a PG11-only
> bug, :ROW_COUNT being a recent addition, so maybe we should deal with
> this separately, since the current patch is supposed to address all versions?

ISTM that the patch is considered a bug fix, so it shoud be applied to 
pg11 and fix it?

>> I understand from the code that the COPY is really executed, so the ERROR
>> and so ROW_COUNT about the SQL should reflect that. Basically the change
>> makes the client believe that there is an SQL error whereas the error is
>> on the client.
>
> Right, but wether COPY fails because psql can't write the output,
> possibly half-way because of a disk full condition, or because the
> query was cancelled or the server went down, are these distinctions
> meaningful for a script?

It could if the SQL command has side effects, but probably this does not 
apply to COPY TO which cannot have.

> If we say yes, how can the user know that the data fetched is
> empty or incomplete? We don't have a CLIENT_SIDE_ERROR variable.

Yep. Maybe we should.

The documentation states that ERROR is about SQL, not psql internal stuff:

  ERROR true if the last SQL query failed, false if it succeeded.
        See also SQLSTATE.

And this is somehow the behavior on all other SQL commands, with or 
without your patch:

   sql> SELECT 1 \g /BAD
   /BAD: Permission denied

   sql> \echo :ERROR
   false

Basically, with your patch, the behavior becomes inconsistent between COPY 
and other SQL commands, so there is something to fix.

Given that, I see two options:

(1) document ERROR as being muddy, i.e. there has been some error which 
may be SQL or possibly client side. Although SQLSTATE would still allow to 
know whether an SQL error occured, there is still no client side 
expressions, and even if I had moved pgbench expressions to psql they 
would still need to be extended to handle strings. This suggest that maybe 
there could be an SQL_ERROR boolean which does store whether SQL succeeded 
or not, and possibly a CLIENT_ERROR on the side, and ERROR = SQL_ERROR OR 
CLIENT_ERROR.

(2) keep ERROR as is, i.e. about SQL, and add some CLIENT_ERROR, but then 
I see another issue, if it is *only* the client error, then it should be 
true if there is an SQL error, thus checking if there is any error becomes 
ERROR OR CLIENT_ERROR, which is muddy as well especially as there are no 
client-side expressions in psql.

> Also, the fact that psql retrieves the results when it doesn't have
> a valid destination for them is an implementation detail.

Yep, but if there are side effects, a script could want to know if they 
occured?

> I think we could also cancel the query in a way that would be
> technically an SQL error, as Ctrl+C would do.

Hmmm.

> Hiding these details under a generic ERROR=true seems reasonable,
> especially since we expose SQLSTATE for fine-grained error checking,
> should that be needed.

Yep, but no expression.

> ERROR=true and SQLSTATE empty is already mentioned as plausible
> in SetResultVariables():

Indeed. This suggest that ERROR is already muddy, contrary to the 
documentation.

>     SetVariable(pset.vars, "ERROR", "true");
>     if (code == NULL)
>         code = "";
>     SetVariable(pset.vars, "SQLSTATE", code);

Overall, ISTM that it should point to solution (1).

  - document that ERROR is muddy, "some SQL or client error occured"
  - add SQL_ERROR, which is always about SQL error and nothing else
  - add CLIENT_ERROR, same
  - make the behavior consistent for all SQL command, COPY & others

>> The later part of the comment is already stated in the documentation, I'm
>> not sure it is worth repeating it here. I'd suggest a simpler /* handle
>> "COPY TO STDOUT \g ..." */
>
> The bug we're fixing here is due to missing the point the comment is
> making, so being thick seems fair.

I would not be, because ISTM that mentionning that COPY TO STDOUT is 
specially handled already points clearly to the previous issue. No big 
deal.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
>>> I understand from the code that the COPY is really executed, so the ERROR
>>> and so ROW_COUNT about the SQL should reflect that. Basically the change
>>> makes the client believe that there is an SQL error whereas the error is
>>> on the client.
>> 
>> Right, but wether COPY fails because psql can't write the output,
>> possibly half-way because of a disk full condition, or because the
>> query was cancelled or the server went down, are these distinctions
>> meaningful for a script?
>
> It could if the SQL command has side effects, but probably this does not 
> apply to COPY TO which cannot have.

Yes it can:

COPY (
   UPDATE pgbench_branches
     SET bbalance = bbalance + 1
     WHERE bid <= 5
   RETURNING *) TO STDOUT \g /BAD

The SQL command is executed but the backslash command fails.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

>   sql> SELECT 1 \g /BAD
>   /BAD: Permission denied
>
>   sql> \echo :ERROR
>   false

That seems broken, because it's pointless to leave out a class of errors
from ERROR. Presumably the purpose of ERROR is to enable
error checking like:
  \if :ERROR
    ... error processing
  \endif

Now if you download data with SELECT or COPY and we can't even
create the file, how is that a good idea to intentionally have the
script fail to detect it? What purpose does it satisfy?

> The documentation states that ERROR is about SQL, not psql internal stuff:
>
>  ERROR true if the last SQL query failed, false if it succeeded.
>        See also SQLSTATE.

ERROR is set by SetResultVariables(PGresult *results, bool success)
and takes the value of "success", ignoring the PGresult.
So why is ERROR independant of the SQL result, relatively to your
claim that it should never reflect anything else?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> (1) document ERROR as being muddy, i.e. there has been some error which
> may be SQL or possibly client side. Although SQLSTATE would still allow to
> know whether an SQL error occured, there is still no client side
> expressions, and even if I had moved pgbench expressions to psql they
> would still need to be extended to handle strings. This suggest that maybe
> there could be an SQL_ERROR boolean which does store whether SQL succeeded
> or not, and possibly a CLIENT_ERROR on the side, and ERROR = SQL_ERROR OR
> CLIENT_ERROR.
>
> (2) keep ERROR as is, i.e. about SQL, and add some CLIENT_ERROR, but then
> I see another issue, if it is *only* the client error, then it should be
> true if there is an SQL error, thus checking if there is any error becomes
> ERROR OR CLIENT_ERROR, which is muddy as well especially as there are no
> client-side expressions in psql.

(3) Set ERROR=true if ON_ERROR_STOP would quit on the exact same
conditions.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Bonsoir Daniel,

>>   sql> SELECT 1 \g /BAD
>>   /BAD: Permission denied
>>
>>   sql> \echo :ERROR
>>   false
>
> That seems broken, because it's pointless to leave out a class of errors
> from ERROR.

Yep. That is my point, but fixing it requires to decide whether it is okay 
to change ERROR documentation and behavior, and ISTM that there is some 
legit case to have the SQL_ERROR information independently.

> Now if you download data with SELECT or COPY and we can't even
> create the file, how is that a good idea to intentionally have the
> script fail to detect it? What purpose does it satisfy?

It means that the client knows that the SQL command, and possible 
associated side effects, were executed server-side, and that if we are in 
a transaction it is still going on:

   BEGIN;
   UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
   // the update is performed, the transaction is not rollbacked,
   // *but* the output file was not written, "COMMIT" save changes.

>> The documentation states that ERROR is about SQL, not psql internal stuff:
>>
>>  ERROR true if the last SQL query failed, false if it succeeded.
>>        See also SQLSTATE.
>
> ERROR is set by SetResultVariables(PGresult *results, bool success)
> and takes the value of "success", ignoring the PGresult.

Yes and no: "success" pretty often currently depends on PGresult, eg:

         if (PQresultStatus(results) != PGRES_COMMAND_OK) {
            ...
            SetResultVariables(results, false);


         results = PQexec(pset.db, buf.data);
         OK = AcceptResult(results); // true if SQL is ok
         ...
         SetResultVariables(results, OK);

         results = PQexec(pset.db, buf.data);
         OK = AcceptResult(results) &&
                 (PQresultStatus(results) == PGRES_COMMAND_OK);
         if (!OK)
                 SetResultVariables(results, OK);

> So why is ERROR independant of the SQL result, relatively to your
> claim that it should never reflect anything else?

AFAICS I'm not claiming that, on the contrary I'm basically ok with 
changing ERROR documentation and implementation (what I called option 1), 
although it would have to pass through committers, *AND* I'm still 
thinking that having a separate access to whether the SQL failed or not is 
of interest, so there is an argument to add a SQL_ERROR which reflects the 
current documentation, if not fully the implementation.

-- 
Fabien .


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> > Now if you download data with SELECT or COPY and we can't even
> > create the file, how is that a good idea to intentionally have the
> > script fail to detect it? What purpose does it satisfy?
>
> It means that the client knows that the SQL command, and possible
> associated side effects, were executed server-side, and that if we are in
> a transaction it is still going on:
>
>   BEGIN;
>   UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
>   // the update is performed, the transaction is not rollbacked,
>   // *but* the output file was not written, "COMMIT" save changes.

if PQexec() could not store the results for lack of memory, it would
yield a PGRES_FATAL_ERROR, then :ERROR would be set to true, and yet
the server-side operation would have been performed. Additionally, If
that BEGIN was not there, the statement would also have been
committed, so its effect would be durable independently of the value
of :ERROR.

My point is that client-side issues already affect :ERROR,
so it can't be assumed that :ERROR=true implies that the SQL
statement did not have effects on the server.

In that sense, the patch in its current state does not break this
guarantee, since it does not exist in the first place.

OTOH I hope that :ERROR = false is a true guarantee that there
have been no problem whatsoever in the execution of
the last statement.

> on the contrary I'm basically ok with changing ERROR
> documentation and implementation (what I called option 1),

The doc only says:

   ERROR
      true if the last SQL query failed, false if it succeeded.
      See also SQLSTATE.

If the failure to retrieve results is included in "query failed", which
seems a reasonable interpretation to me, it doesn't need to be changed.

What's not right is "SELECT ... \g /bad" having a different effect on
:ERROR than "COPY... \g /bad".
I plan to follow up on that because there are other problems with
SELECT ... \g anyway, for instance, when a disk full occurs,
it's not reported at all. But that problem is not in the code path
of COPY.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
>>   BEGIN;
>>   UPDATE pgbench_branches SET bbalance = bbalance + 1 RETURNING * \g /bad
>>   // the update is performed, the transaction is not rollbacked,
>>   // *but* the output file was not written, "COMMIT" save changes.
>
> if PQexec() could not store the results for lack of memory, it would
> yield a PGRES_FATAL_ERROR, then :ERROR would be set to true, and yet
> the server-side operation would have been performed. Additionally, If
> that BEGIN was not there, the statement would also have been
> committed, so its effect would be durable independently of the value
> of :ERROR.

Indeed, OOM is a special case when the client is unable to know whether 
the command was actually executed. However ISTM that the connection is 
likely to be aborted, so if there is an explicit transaction it would be 
aborted as well.

> My point is that client-side issues already affect :ERROR,
> so it can't be assumed that :ERROR=true implies that the SQL
> statement did not have effects on the server.

Not from my reading of the doc (which I probably wrote, and did the 
implementation without a clear view of the different client-specific error 
conditions, so all this is really my fault BTW), where I understand that 
ERROR as true says that the SQL failed.

> In that sense, the patch in its current state does not break this
> guarantee, since it does not exist in the first place.

Sure, but the patch in its current state creates an inconsistency between 
an SQL command with "\g /BAD" and the same command in "COPY (...) TO 
STDOUT \g /BAD". I think that it must at the minimum be consistent.

> OTOH I hope that :ERROR = false is a true guarantee that there have been 
> no problem whatsoever in the execution of the last statement.
>
>> on the contrary I'm basically ok with changing ERROR
>> documentation and implementation (what I called option 1),
>
> The doc only says:
>
>   ERROR
>      true if the last SQL query failed, false if it succeeded.
>      See also SQLSTATE.
>
> If the failure to retrieve results is included in "query failed", which
> seems a reasonable interpretation to me, it doesn't need to be changed.

I understand "SQL query failed" as a server issue, especially as SQLSTATE 
is the reported command status at the PQ protocol level.

> What's not right is "SELECT ... \g /bad" having a different effect on
> :ERROR than "COPY... \g /bad".

Yep, I've been saying that.

> I plan to follow up on that because there are other problems with
> SELECT ... \g anyway, for instance, when a disk full occurs,
> it's not reported at all. But that problem is not in the code path
> of COPY.

Sure. As there are several bugs (doubtful features) uncovered, a first 
patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR 
current behavior however debatable it is (i.e. your patch without the 
ERROR change, which is unrelated to the bug being fixed), and then another 
patch should fix/modify the behavior around ERROR (everywhere and 
consistently), and probably IMHO add an SQL_ERROR.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Fabien COELHO wrote:

> Sure. As there are several bugs (doubtful features) uncovered, a first
> patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR
> current behavior however debatable it is (i.e. your patch without the
> ERROR change, which is unrelated to the bug being fixed), and then another
> patch should fix/modify the behavior around ERROR (everywhere and
> consistently), and probably IMHO add an SQL_ERROR.

It's not as if the patch issued an explicit call SetVariable("ERROR", "true")
that could be commented, or something like that. The assignment
of the variable happens as a consequence of patched code that aims at
being correct in its error handling.
So I'm for leaving this decision to a maintainer, because I don't agree
with your conclusion that the current patch should be changed in
that regard.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Bonsoir Daniel,

>> Sure. As there are several bugs (doubtful features) uncovered, a first
>> patch could fix "COPY ...TO STDOUT \g file", but probably replicate ERROR
>> current behavior however debatable it is (i.e. your patch without the
>> ERROR change, which is unrelated to the bug being fixed), and then another
>> patch should fix/modify the behavior around ERROR (everywhere and
>> consistently), and probably IMHO add an SQL_ERROR.
>
> It's not as if the patch issued an explicit call SetVariable("ERROR", "true")
> that could be commented, or something like that. The assignment
> of the variable happens as a consequence of patched code that aims at
> being correct in its error handling.
> So I'm for leaving this decision to a maintainer, because I don't agree
> with your conclusion that the current patch should be changed in
> that regard.

Ok, fine with me.

To summarize the debate, when fixing "\g filename" for "COPY TO STDOUT" 
the patch does not set error consistently on:

   sql> COPY (SELECT 1) TO STDOUT \g /BAD
     # Permission denied on "/BAD"
     -> :ERROR is true
     # the command is executed but could not write to the file

compared to the existing behavior:

   sql> SELECT 1 \g BAD
     # Permission denied on "/BAD"
     -> :ERROR is false
     # the SQL command is fine, although writing to the file failed

Should we include this inconsistency and then fix the problem later, or 
replicate the initial (doubtful?) behavior for consistency and then fix 
the problem later, anyway?

Fixing the problem envolves deciding what is the right behavior, and 
update the documentation and the implementation accordingly. Currently the 
documentation suggests that :ERROR is about SQL error (subject to 
interpretation), and the implementation is more or less consistent with 
that view, but not always, as pointed out by Daniel.

Note that as the author of the initial patch adding :ERROR and others 
(69835bc8), I'm probably somehow responsible for this situation, so shame 
on me.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> Fixing the problem envolves deciding what is the right behavior, and 
> update the documentation and the implementation accordingly. Currently the 
> documentation suggests that :ERROR is about SQL error (subject to 
> interpretation), and the implementation is more or less consistent with 
> that view, but not always, as pointed out by Daniel.

FWIW, I think we should adopt the rule that :ERROR reflects any error
condition, whether server-side or client-side.  It's unlikely that
scripts would really not care about client-side errors.  Moreover,
I do not think we can reliably tell the difference; there are always
going to be corner cases that are hard to classify.  Example: if we
lose the server connection, is that a server-side error or client-side?
Or maybe neither, maybe the network went down.

Because of this concern, I find the idea of inventing separate
SQL_ERROR and CLIENT_ERROR variables to be a pretty terrible one.
I think we'd be creating a lot of make-work and hard choices for
ourselves, to do something that most script writers won't give a
fig about.  If you've got subtle error-handling logic in mind,
you shouldn't be trying to implement your app in a psql script
in the first place, IMO anyway.

It's unclear to me whether to push ahead with Daniel's existing
patch or not.  It doesn't look to me like it's making things
any worse from the error-consistency standpoint than they were
already, so I'd be inclined to consider error semantics cleanup
as something to be done separately/later.

BTW, it looks to me like the last patch is still not right as far
as when to close copystream --- won't it incorrectly close a
stream obtained from pset.copyStream?  While you could fix that
with

-    if (pset.gfname && copystream != NULL)
+    if (!pset.copyStream && pset.gfname && copystream != NULL)

I think that's getting overly complex and unmaintainable.  I'd
be inclined to code things more like


    FILE   *copystream = NULL;
    bool    need_close = false;

    ...

        need_close = openQueryOutputFile(...);

    ...

    if (need_close)
        // close copystream here


            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
Fabien COELHO
Date:
Hello Tom,

>> Fixing the problem envolves deciding what is the right behavior, and
>> update the documentation and the implementation accordingly. Currently the
>> documentation suggests that :ERROR is about SQL error (subject to
>> interpretation), and the implementation is more or less consistent with
>> that view, but not always, as pointed out by Daniel.
>
> FWIW, I think we should adopt the rule that :ERROR reflects any error
> condition, whether server-side or client-side.

I think that everybody agrees with that.

> It's unlikely that scripts would really not care about client-side 
> errors.  Moreover, I do not think we can reliably tell the difference; 
> there are always going to be corner cases that are hard to classify. 
> Example: if we lose the server connection, is that a server-side error 
> or client-side? Or maybe neither, maybe the network went down.
>
> Because of this concern, I find the idea of inventing separate
> SQL_ERROR and CLIENT_ERROR variables to be a pretty terrible one.

Then the committer is right:-)

> I think we'd be creating a lot of make-work and hard choices for
> ourselves, to do something that most script writers won't give a
> fig about.  If you've got subtle error-handling logic in mind,
> you shouldn't be trying to implement your app in a psql script
> in the first place, IMO anyway.

Possibly. I was also thinking of debug. ISTM that SQL_ERROR is pretty 
trivial to implement because we already set SQLSTATE, and SQL_ERROR is 
probably something like SQLSTATE != "0000" or close to that.

> It's unclear to me whether to push ahead with Daniel's existing
> patch or not.  It doesn't look to me like it's making things
> any worse from the error-consistency standpoint than they were
> already, so I'd be inclined to consider error semantics cleanup
> as something to be done separately/later.

Fine.

> BTW, it looks to me like the last patch is still not right as far
> as when to close copystream --- won't it incorrectly close a
> stream obtained from pset.copyStream?

Argh, I should have checked this point.

-- 
Fabien.


Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> It's unclear to me whether to push ahead with Daniel's existing
>> patch or not.  It doesn't look to me like it's making things
>> any worse from the error-consistency standpoint than they were
>> already, so I'd be inclined to consider error semantics cleanup
>> as something to be done separately/later.

> Fine.

OK.  I fixed the error-cleanup issue and pushed it.

The patch applied cleanly back to 9.5, but the code for \g is a good
bit different in 9.4.  I didn't have the interest to try to make the
patch work with that, so I just left 9.4 alone.

            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> OK.  I fixed the error-cleanup issue and pushed it.
>
> The patch applied cleanly back to 9.5, but the code for \g is a good
> bit different in 9.4.  I didn't have the interest to try to make the
> patch work with that, so I just left 9.4 alone.

Thanks!

Now as far as I can see, there is nothing that \copy to file or program
can do that COPY TO STDOUT cannot do. The next thing would be to
figure out how to similarly improve COPY FROM in psql, after which
\copy might be seen as obsolete.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> Now as far as I can see, there is nothing that \copy to file or program
> can do that COPY TO STDOUT cannot do.

I don't think there's a way to get the effect of "\copy to pstdout"
(which, IIRC without any caffeine, means write to psql's stdout regardless
of where queryFout is currently pointing).  Somebody was excited enough
about that to submit a patch for it, so maybe it's worth covering.
My first thought about syntax is to define "\g -" as meaning that.

> The next thing would be to
> figure out how to similarly improve COPY FROM in psql, after which
> \copy might be seen as obsolete.

I suggested upthread that we could just define "\g foo" as reading
from foo, not writing it, if the command turns out to be COPY FROM.
Maybe that's too weird/mistake-prone?  A variant that might or might
not be safer is "\g <foo", ie we insist on you putting a mark there
that shows you intended to read.

Also, not quite clear what we'd do about copy-from-program.
I think "\g |foo" is definitely confusing for that.  "\g foo|"
would be better if it doesn't have syntax issues.

            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> A variant that might or might not be safer is "\g <foo", ie we
> insist on you putting a mark there that shows you intended to read.
>
> Also, not quite clear what we'd do about copy-from-program.
> I think "\g |foo" is definitely confusing for that.  "\g foo|"
> would be better if it doesn't have syntax issues.

I haven't written any patch yet, but I was thinking of submitting
something like that, with the addition of "\g >foo" as a synonym of
"\g foo" for the symmetry with "<".
Perl's open() also uses the "program |" syntax.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Tom Lane wrote:
>> A variant that might or might not be safer is "\g <foo", ie we
>> insist on you putting a mark there that shows you intended to read.

> I haven't written any patch yet, but I was thinking of submitting
> something like that, with the addition of "\g >foo" as a synonym of
> "\g foo" for the symmetry with "<".

+1, the same had occurred to me.

            regards, tom lane


Re: Alternative to \copy in psql modelled after \g

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> > Now as far as I can see, there is nothing that \copy to file or program
> > can do that COPY TO STDOUT cannot do.
>
> I don't think there's a way to get the effect of "\copy to pstdout"
> (which, IIRC without any caffeine, means write to psql's stdout regardless
> of where queryFout is currently pointing).

\g /dev/stdout would work already on systems like Linux that provide
this alias.
Otherwise "\g -" looks good as a portable solution.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: Alternative to \copy in psql modelled after \g

From
Corey Huinker
Date:
> Otherwise "\g -" looks good as a portable solution.

+1