psql COPY vs. ON_ERROR_ROLLBACK, multi-command strings - Mailing list pgsql-hackers

From Noah Misch
Subject psql COPY vs. ON_ERROR_ROLLBACK, multi-command strings
Date
Msg-id 20120114154002.GB12188@tornado.leadboat.com
Whole thread Raw
Responses Re: psql COPY vs. ON_ERROR_ROLLBACK, multi-command strings
List pgsql-hackers
It has bothered me that psql's \copy ignores the ON_ERROR_ROLLBACK setting.
Only SendQuery() takes note of ON_ERROR_ROLLBACK, and \copy, like all
backslash commands, does not route through SendQuery().  Looking into this
turned up several other weaknesses in psql's handling of COPY.  For example,
SendQuery() does not release the savepoint after an ordinary COPY:

[local] test=# set log_statement = 'all'; set client_min_messages = 'log'; copy (select 0) to stdout;
SET
SET
LOG:  statement: RELEASE pg_psql_temporary_savepoint
LOG:  statement: SAVEPOINT pg_psql_temporary_savepoint
LOG:  statement: copy (select 0) to stdout;
0

When psql sends a command string containing more than one command, at least
one of which is a COPY, we stop processing results at the first COPY:

[local] test=# set log_statement = 'all'; set client_min_messages = 'log'; copy (select 0) to stdout\; select 1/0;
select1; 
SET
SET
LOG:  statement: RELEASE pg_psql_temporary_savepoint
LOG:  statement: SAVEPOINT pg_psql_temporary_savepoint
LOG:  statement: copy (select 0) to stdout; select 1/0;
0
LOG:  statement: select 1;
ERROR:  current transaction is aborted, commands ignored until end of transaction block

To make the above work normally, this patch improves SendQuery()-based COPY
command handling to process the entire queue of results whenever we've
processed a COPY.  It also brings sensible handling in the face of multiple
COPY commands in a single command string.  See the included test cases for
some examples.

We must prepare for COPY to fail for a local reason, like client-side malloc()
failure or network I/O problems.  The server will still have the connection in
a COPY mode, and we must get it out of that mode.  The \copy command was
prepared for the COPY FROM case with this code block:

    /*
     * Make sure we have pumped libpq dry of results; else it may still be in
     * ASYNC_BUSY state, leading to false readings in, eg, get_prompt().
     */
    while ((result = PQgetResult(pset.db)) != NULL)
    {
        success = false;
        psql_error("\\copy: unexpected response (%d)\n",
                   PQresultStatus(result));
        /* if still in COPY IN state, try to get out of it */
        if (PQresultStatus(result) == PGRES_COPY_IN)
            PQputCopyEnd(pset.db, _("trying to exit copy mode"));
        PQclear(result);
    }

It arose from these threads:
http://archives.postgresql.org/pgsql-hackers/2006-11/msg00694.php
http://archives.postgresql.org/pgsql-general/2009-08/msg00268.php

However, psql still enters an infinite loop when COPY TO STDOUT encounters a
client-side failure, such as malloc() failure.  I've merged the above
treatment into lower-level routines, granting plain COPY commands similar
benefits, and fixed the COPY TO handling.  To help you test the corner cases
involved here, I've attached a gdb script that will inject client side
failures into both kinds of COPY commands.  Attach gdb to your psql process,
source the script, and compare the behavior of commands like these with and
without the patch:

\copy (select 0) to pstdout

create table t (c int);
\copy t from stdin
1
\.

With plain COPY handled thoroughly, I fixed \copy by having it override the
source or destination stream, then call SendQuery().  This gets us support for
ON_ERROR_ROLLBACK, \timing, \set ECHO and \set SINGLESTEP for free.  I found
it reasonable to treat \copy's SQL commmand more like a user-entered command
than an internal command, because \copy is such a thin wrapper around COPY
FROM STDIN/COPY TO STDOUT.

This patch makes superfluous the second argument of PSQLexec(), but I have not
removed that argument.

Incidentally, psql's common.c had several switches on PQresultStatus(res) that
did not include a case for PGRES_COPY_BOTH and also silently assumed any
unlisted status was some kind of error.  I tightened these to distinguish all
known statuses and give a diagnostic upon receiving an unknown status.

Thanks,
nm


Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: psql filename completion: quoting
Next
From: Matthew Draper
Date:
Subject: Patch: Allow SQL-language functions to reference parameters by parameter name