Thread: pg_dump and backslash escapes

pg_dump and backslash escapes

From
Bruce Momjian
Date:
pg_dump is designed to produce dumps that can be loaded into newer
PostgreSQL versions, and it can read from older database versions.

The new backslash escape warning and sql standard strings might cause
problems for people moving from older versions of PostgreSQL, because
pg_dump assumes identifiers and strings allow escape processing.  (COPY
is unaffected.)

We have a few possible solutions.  One would be to backpatch older
backends to understand E'', and use that in pg_dump.  Another solution
is to use SET to turn off backslash warnings and SQL standard strings,
and have the older backend servers either ignore 'off' commands for
these, or wrap them in a way that they are invisible to the user if they
fail.

I have chosen the last option;  patch attached.  It will output this at
the top of dump files for releases 7.3.X, 7.4.X and 8.0.X.  8.1 and CVS
HEAD are fine.  The new dump output lines are:

    -- Set escape environment for possible loading into version >= 8.2.
    -- If variables are not supported, suppress error messages.
    SET client_min_messages = panic;
    SET log_min_messages = log;
    SET log_min_error_statement = panic;
    SET escape_string_warning = off;
    SET standard_conforming_strings = off;
    RESET log_min_error_statement;
    RESET log_min_messages;
    RESET client_min_messages;

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/pg_dump/pg_backup_archiver.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.127
diff -c -c -r1.127 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c    19 Apr 2006 16:02:17 -0000    1.127
--- src/bin/pg_dump/pg_backup_archiver.c    13 May 2006 05:02:23 -0000
***************
*** 2017,2022 ****
--- 2017,2039 ----
      /* Make sure function checking is disabled */
      ahprintf(AH, "SET check_function_bodies = false;\n");

+     /*
+      *    We are using backslash escapes, so make sure they are enabled.
+      *    Older servers might not understand these variables, so we
+      *    turn off error reporting.
+      */
+     ahprintf(AH, "\n-- Set escape environment for possible loading into version >= 8.2.\n");
+     ahprintf(AH, "-- If variables are not supported, suppress error messages.\n");
+     ahprintf(AH, "SET client_min_messages = panic;\n");
+     /* In 7.3, this was server_min_messages */
+     ahprintf(AH, "SET log_min_messages = log;\n");
+     ahprintf(AH, "SET log_min_error_statement = panic;\n");
+     ahprintf(AH, "SET escape_string_warning = off;\n");
+     ahprintf(AH, "SET standard_conforming_strings = off;\n");
+     ahprintf(AH, "RESET log_min_error_statement;\n");
+     ahprintf(AH, "RESET log_min_messages;\n");
+     ahprintf(AH, "RESET client_min_messages;\n");
+
      /* Avoid annoying notices etc */
      ahprintf(AH, "SET client_min_messages = warning;\n");


Re: pg_dump and backslash escapes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>     -- Set escape environment for possible loading into version >= 8.2.
>     -- If variables are not supported, suppress error messages.
>     SET client_min_messages = panic;
>     SET log_min_messages = log;
>     SET log_min_error_statement = panic;
>     SET escape_string_warning = off;
>     SET standard_conforming_strings = off;
>     RESET log_min_error_statement;
>     RESET log_min_messages;
>     RESET client_min_messages;

Thrashing about with the message level settings like that is useless.
Either the command will work or it won't.  And we've not bothered to
try to suppress warnings for any of the other SET commands pg_dump
issues.  AFAICS all you've accomplished here is to make the dump
dependent on even more GUC variables than it needs to be (consider
what will happen if we remove/redefine the log level variables in
future).

I don't particularly like the way that pg_dump is behaving at the
moment, ie cluttering the output with E'' strings.  That makes it
unnecessarily hard to use the output to load into other databases
or older PG versions.  What I'd like to do is SET
standard_conforming_strings appropriately (this probably has to be
a command line option, since it'll depend on where you want to use
the output) and then not use E'' strings at all.
        regards, tom lane


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >     -- Set escape environment for possible loading into version >= 8.2.
> >     -- If variables are not supported, suppress error messages.
> >     SET client_min_messages = panic;
> >     SET log_min_messages = log;
> >     SET log_min_error_statement = panic;
> >     SET escape_string_warning = off;
> >     SET standard_conforming_strings = off;
> >     RESET log_min_error_statement;
> >     RESET log_min_messages;
> >     RESET client_min_messages;
> 
> Thrashing about with the message level settings like that is useless.
> Either the command will work or it won't.  And we've not bothered to
> try to suppress warnings for any of the other SET commands pg_dump
> issues.  AFAICS all you've accomplished here is to make the dump

Well, the issue with back branches is there is no way to know if the
dump is going to be loaded into the same back-branch, or a newer one,
so I figured we would have to suppress any SET because in the existing
branch, it would fail.

We have discussed the idea of coding the PostgreSQL version number in
the dump somehow so the backend could adjust its behavior based on that
version.  For example, you do SET pg_dump_version = 7.3 and sql standard
strings and backslash warnings are turned off.  That could be
backpatched, I guess.

> dependent on even more GUC variables than it needs to be (consider
> what will happen if we remove/redefine the log level variables in
> future).

True.

> I don't particularly like the way that pg_dump is behaving at the
> moment, ie cluttering the output with E'' strings.  That makes it

Well, it should only do that if there is a backslash in the string.  I
tested a CHECK string and that is how it behaved.

> unnecessarily hard to use the output to load into other databases
> or older PG versions.  What I'd like to do is SET

True.

> standard_conforming_strings appropriately (this probably has to be
> a command line option, since it'll depend on where you want to use
> the output) and then not use E'' strings at all.

Yes, we could do that, but are you saying it is a pg_dump flag, and then
you have to make sure you load into something that has the same
behavior?  That seems quite error-prone.  Having the SET in the dump
seems easier, and it would eliminate the need for E'' in the pg_dump
file.

What if we do something like SET NO VALIDATOR var='val' and if the SET is
not understood, no error is generated?

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
I have seen no reply to my suggestion below, so I assume it is the way
people want to go for 7.3, 7.4, and 8.0.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > >     -- Set escape environment for possible loading into version >= 8.2.
> > >     -- If variables are not supported, suppress error messages.
> > >     SET client_min_messages = panic;
> > >     SET log_min_messages = log;
> > >     SET log_min_error_statement = panic;
> > >     SET escape_string_warning = off;
> > >     SET standard_conforming_strings = off;
> > >     RESET log_min_error_statement;
> > >     RESET log_min_messages;
> > >     RESET client_min_messages;
> > 
> > Thrashing about with the message level settings like that is useless.
> > Either the command will work or it won't.  And we've not bothered to
> > try to suppress warnings for any of the other SET commands pg_dump
> > issues.  AFAICS all you've accomplished here is to make the dump
> 
> Well, the issue with back branches is there is no way to know if the
> dump is going to be loaded into the same back-branch, or a newer one,
> so I figured we would have to suppress any SET because in the existing
> branch, it would fail.
> 
> We have discussed the idea of coding the PostgreSQL version number in
> the dump somehow so the backend could adjust its behavior based on that
> version.  For example, you do SET pg_dump_version = 7.3 and sql standard
> strings and backslash warnings are turned off.  That could be
> backpatched, I guess.
> 
> > dependent on even more GUC variables than it needs to be (consider
> > what will happen if we remove/redefine the log level variables in
> > future).
> 
> True.
> 
> > I don't particularly like the way that pg_dump is behaving at the
> > moment, ie cluttering the output with E'' strings.  That makes it
> 
> Well, it should only do that if there is a backslash in the string.  I
> tested a CHECK string and that is how it behaved.
> 
> > unnecessarily hard to use the output to load into other databases
> > or older PG versions.  What I'd like to do is SET
> 
> True.
> 
> > standard_conforming_strings appropriately (this probably has to be
> > a command line option, since it'll depend on where you want to use
> > the output) and then not use E'' strings at all.
> 
> Yes, we could do that, but are you saying it is a pg_dump flag, and then
> you have to make sure you load into something that has the same
> behavior?  That seems quite error-prone.  Having the SET in the dump
> seems easier, and it would eliminate the need for E'' in the pg_dump
> file.
> 
> What if we do something like SET NO VALIDATOR var='val' and if the SET is
> not understood, no error is generated?
> 
> -- 
>   Bruce Momjian   http://candle.pha.pa.us
>   EnterpriseDB    http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
> 

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I have seen no reply to my suggestion below, so I assume it is the way
> people want to go for 7.3, 7.4, and 8.0.

I'm not particularly for it, if that's what you meant, and certainly not
for hacking up old branches that way.  For one thing, you can't
retroactively cause servers that are already out there to not spit
errors for GUC variables they've not heard of; and even if you had such
a time-travel machine at hand, it's far from clear that it'd be a good
idea.

The pg_dump philosophy for cross-version updates is generally that the
dump should load if you are willing to ignore errors and press on.  Not
that there will never be errors.  See for example our previous handling
of the without_oids business, or search_path, or tablespaces.
        regards, tom lane


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I have seen no reply to my suggestion below, so I assume it is the way
> > people want to go for 7.3, 7.4, and 8.0.
> 
> I'm not particularly for it, if that's what you meant, and certainly not
> for hacking up old branches that way.  For one thing, you can't
> retroactively cause servers that are already out there to not spit
> errors for GUC variables they've not heard of; and even if you had such
> a time-travel machine at hand, it's far from clear that it'd be a good
> idea.
> 
> The pg_dump philosophy for cross-version updates is generally that the
> dump should load if you are willing to ignore errors and press on.  Not
> that there will never be errors.  See for example our previous handling
> of the without_oids business, or search_path, or tablespaces.

So, we should SET the variables and allow people to get the errors on
load?  And not supress them from the client and server logs?  Is that
better than suppressing them?

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
The basic issue is that we need to set standard_conforming_strings to
'off' for loading into newer releases, >= 8.2, but that SET command is
going to generate an error even dumping/loading into the same version of
PostgreSQL, like 7.3 to 7.3.  I don't think we want that, do we? I agree
we can have errors when doing cross-version dumps, but have we ever
generated errors when dumping/reloading into the same version?

We don't need to set escape_string_warning because it is just a warning
message and the warning will only happen when loading into 8.2 or later,
so we could skip that part of the patch. But, I figured as long as we
are suppressing warnings at that point, might as well add that too.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > I have seen no reply to my suggestion below, so I assume it is the way
> > > people want to go for 7.3, 7.4, and 8.0.
> > 
> > I'm not particularly for it, if that's what you meant, and certainly not
> > for hacking up old branches that way.  For one thing, you can't
> > retroactively cause servers that are already out there to not spit
> > errors for GUC variables they've not heard of; and even if you had such
> > a time-travel machine at hand, it's far from clear that it'd be a good
> > idea.
> > 
> > The pg_dump philosophy for cross-version updates is generally that the
> > dump should load if you are willing to ignore errors and press on.  Not
> > that there will never be errors.  See for example our previous handling
> > of the without_oids business, or search_path, or tablespaces.
> 
> So, we should SET the variables and allow people to get the errors on
> load?  And not supress them from the client and server logs?  Is that
> better than suppressing them?
> 
> -- 
>   Bruce Momjian   http://candle.pha.pa.us
>   EnterpriseDB    http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
> 

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> The basic issue is that we need to set standard_conforming_strings to
> 'off' for loading into newer releases, >= 8.2, but that SET command is
> going to generate an error even dumping/loading into the same version of
> PostgreSQL, like 7.3 to 7.3.  I don't think we want that, do we?

Why are you worried?  Have you counted how many SETs there are currently
that will generate errors in older versions of PG?  As long as the older
backend will load the data correctly after rejecting the SET, there's no
functional problem, and I think trying to hide the error is a cosmetic
thing that will likely do more harm than good in the long run.
        regards, tom lane


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > The basic issue is that we need to set standard_conforming_strings to
> > 'off' for loading into newer releases, >= 8.2, but that SET command is
> > going to generate an error even dumping/loading into the same version of
> > PostgreSQL, like 7.3 to 7.3.  I don't think we want that, do we?
> 
> Why are you worried?  Have you counted how many SETs there are currently
> that will generate errors in older versions of PG?  As long as the older
> backend will load the data correctly after rejecting the SET, there's no
> functional problem, and I think trying to hide the error is a cosmetic
> thing that will likely do more harm than good in the long run.

Fine.

You mean dumping and reloading pg_dump output in 7.3 generates errors? 
I didn't know.  Can you give an example?  I wasn't aware of that.

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> You mean dumping and reloading pg_dump output in 7.3 generates errors? 
> I didn't know.  Can you give an example?  I wasn't aware of that.

Well, looking at the SETs already currently emitted:

$ pg_dump -s regression | grep ^SET
SET client_encoding = 'SQL_ASCII';    fails before 7.1
SET check_function_bodies = false;    fails before 7.4
SET client_min_messages = warning;    fails before 7.3
SET search_path = public, pg_catalog;    fails before 7.3
SET default_tablespace = '';        fails before 8.0
SET default_with_oids = false;        fails before 8.0

so I'm not at all clear what you've got against this one.
        regards, tom lane


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > You mean dumping and reloading pg_dump output in 7.3 generates errors? 
> > I didn't know.  Can you give an example?  I wasn't aware of that.
> 
> Well, looking at the SETs already currently emitted:
> 
> $ pg_dump -s regression | grep ^SET
> SET client_encoding = 'SQL_ASCII';    fails before 7.1
> SET check_function_bodies = false;    fails before 7.4
> SET client_min_messages = warning;    fails before 7.3
> SET search_path = public, pg_catalog;    fails before 7.3
> SET default_tablespace = '';        fails before 8.0
> SET default_with_oids = false;        fails before 8.0
> 
> so I'm not at all clear what you've got against this one.

Very clear.  The issue is that I can't find any of these emitted by a
pg_dump version who's native backend doesn't understand them.

I assume that it is expected that a cross-db dump/reload will generate
errors, and it is done rarely for upgrades, but I assume same-version
dump/restore is done more frequently and people don't expect errors.
Is that not a significant distinction?

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
"Zeugswetter Andreas DCP SD"
Date:
> Very clear.  The issue is that I can't find any of these emitted by a
> pg_dump version who's native backend doesn't understand them.
>
> I assume that it is expected that a cross-db dump/reload will generate
> errors, and it is done rarely for upgrades, but I assume same-version
> dump/restore is done more frequently and people don't expect errors.
> Is that not a significant distinction?

I thought the suggested procedure (see migration doc) was to use the
new pg_dump to dump the older db version, so why backpatch ?

Andreas


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Zeugswetter Andreas DCP SD wrote:
> 
> > Very clear.  The issue is that I can't find any of these emitted by a
> > pg_dump version who's native backend doesn't understand them.
> > 
> > I assume that it is expected that a cross-db dump/reload will generate
> > errors, and it is done rarely for upgrades, but I assume same-version
> > dump/restore is done more frequently and people don't expect errors.
> > Is that not a significant distinction?
> 
> I thought the suggested procedure (see migration doc) was to use the 
> new pg_dump to dump the older db version, so why backpatch ?

Uh, you can suggest it, but I would guess < 50% do it, and once the old
database is gone, there is no way to re-do the dump.

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Very clear.  The issue is that I can't find any of these emitted by a
> pg_dump version who's native backend doesn't understand them.

So?  We're not doing anything differently in that regard either.  8.2
will understand the SET, what's the problem?
        regards, tom lane


Re: pg_dump and backslash escapes

From
"Zeugswetter Andreas DCP SD"
Date:
> > I thought the suggested procedure (see migration doc) was to use the

> > new pg_dump to dump the older db version, so why backpatch ?
>
> Uh, you can suggest it, but I would guess < 50% do it, and once the
old
> database is gone, there is no way to re-do the dump.

But you can still load the dump if you execute the two statements in the

new db psql session before loading the dump file, no ?

> > > SET escape_string_warning = off;
> > > SET standard_conforming_strings = off;

Andreas


Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Very clear.  The issue is that I can't find any of these emitted by a
> > pg_dump version who's native backend doesn't understand them.
> 
> So?  We're not doing anything differently in that regard either.  8.2
> will understand the SET, what's the problem?

Uh, 8.2 doesn't need these flags.  It is for back branches that assume
escape processing.

Anyway, I talked to Tom on the phone and he feels we just need to tell
people loading pre-8.1 dumps to use PGOPTIONS to set
escape_string_warning to false when loading the dump.  Seeing that many
pre-8.1 people are not going to upgrade the newest 8.0.X before
upgrading to 8.2 or 8.3, I suppose the PGOPTIONS approach is OK.  It
would have to be documented in the release notes, actually starting with
8.2 because that's the first release where escape_string_warning can be
true.

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_dump and backslash escapes

From
Greg Stark
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:

> Zeugswetter Andreas DCP SD wrote:
>
> > I thought the suggested procedure (see migration doc) was to use the 
> > new pg_dump to dump the older db version, so why backpatch ?
> 
> Uh, you can suggest it, but I would guess < 50% do it, and once the old
> database is gone, there is no way to re-do the dump.

I thought the plan was to have one version that supported
standards_conforming_strings but defaulted to false anyways. So this would
only bite people who were jumping over an entire major release.

-- 
greg



Re: pg_dump and backslash escapes

From
Bruce Momjian
Date:
Greg Stark wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> 
> > Zeugswetter Andreas DCP SD wrote:
> >
> > > I thought the suggested procedure (see migration doc) was to use the 
> > > new pg_dump to dump the older db version, so why backpatch ?
> > 
> > Uh, you can suggest it, but I would guess < 50% do it, and once the old
> > database is gone, there is no way to re-do the dump.
> 
> I thought the plan was to have one version that supported
> standards_conforming_strings but defaulted to false anyways. So this would
> only bite people who were jumping over an entire major release.

Yes, that is true.

--  Bruce Momjian   http://candle.pha.pa.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +