Thread: initdb failure (was Re: [GENERAL] sequence's plpgsql)

initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Tim McAuley <mcauleyt@tcd.ie> writes:
> Another question now. I am unable to compile Postgresql 7.4 beta 3 under 
> cygwin (Windows 2K, using cgyipc 2).

> I am getting the error:
> "
> creating information schema... ERROR:  end-of-copy marker does not match 
> previous newline style
> CONTEXT:  COPY FROM, line 361
> "

That's interesting.  COPY is complaining because the \. terminator in
the file it's been fed has a different kind of newline after it than
the newlines earlier in the file (LF vs CR/LF, no doubt).  The part
of the initdb script that must be causing this is
 echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported,
comments)FROM STDIN;" cat "$datadir"/sql_features.txt echo "\."
 
) \| "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
echo "ok"

so it appears that cygwin's "echo" generates a different newline style
than what got put into sql_features.txt.  A possible way to fix this is
to put the "\." line into sql_features.txt, but maybe there's a cleaner
answer.  Peter, any thoughts?
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"Matthew T. O'Connor"
Date:
On Fri, 2003-09-26 at 11:01, Tom Lane wrote:
> so it appears that cygwin's "echo" generates a different newline style
> than what got put into sql_features.txt.  A possible way to fix this is
> to put the "\." line into sql_features.txt, but maybe there's a cleaner
> answer.  Peter, any thoughts?

Does cygwin still have the install time option of what type of line feed
to use?  I know at one point (a long time ago) when I installed cygwin,
and chose windows line feeds (CRLF) that it caused problems with several
applications.  So the problem might be that with CYGWIN you could have
either type of line feed depending on what the user selected during
install.



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Michael Meskes
Date:
On Fri, Sep 26, 2003 at 11:01:34AM -0400, Tom Lane wrote:
> > I am getting the error:
> > "
> > creating information schema... ERROR:  end-of-copy marker does not match 
> > previous newline style
> > CONTEXT:  COPY FROM, line 361
> > "
> 
> That's interesting.  COPY is complaining because the \. terminator in
> the file it's been fed has a different kind of newline after it than
> the newlines earlier in the file (LF vs CR/LF, no doubt).  The part

Well actually I'm not so sure. I tried installing 7.4 from Oliver's
Debian packages and experienced exactly the same. We have yet to find
out why, but I doubt newlines changed between his Debian system and mine.

> of the initdb script that must be causing this is
> 
>   echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name,
is_supported,comments) FROM STDIN;"
 
>   cat "$datadir"/sql_features.txt
>   echo "\."
> ) \
>     | "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
> echo "ok"
> 
> so it appears that cygwin's "echo" generates a different newline style
> than what got put into sql_features.txt.  A possible way to fix this is
> to put the "\." line into sql_features.txt, but maybe there's a cleaner
> answer.  Peter, any thoughts?

Putting the "\." line into sql_features.txt did help me, too. 

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Tom Lane writes:

> so it appears that cygwin's "echo" generates a different newline style
> than what got put into sql_features.txt.  A possible way to fix this is
> to put the "\." line into sql_features.txt, but maybe there's a cleaner
> answer.  Peter, any thoughts?

There's no clean answer to this on Cygwin.  This specific case is just a
little problem that we could solve locally, but in general you'll just end
up annoying people if you require them to use consistent line endings on
Cygwin.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
> 
> > so it appears that cygwin's "echo" generates a different newline style
> > than what got put into sql_features.txt.  A possible way to fix this is
> > to put the "\." line into sql_features.txt, but maybe there's a cleaner
> > answer.  Peter, any thoughts?
> 
> There's no clean answer to this on Cygwin.  This specific case is just a
> little problem that we could solve locally, but in general you'll just end
> up annoying people if you require them to use consistent line endings on
> Cygwin.

This error is coming from the new 7.4 COPY code that allows \r\n as a
line terminator.  Requiring the end-of-line to be consistent seemed to
be the only way to be sure we were not eating a literal carriage return
in the data stream.  Let's put the "\." into sql_features.txt and see if
that fixes it, or can we use echo -c "\.\n" in initdb?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> so it appears that cygwin's "echo" generates a different newline style
>> than what got put into sql_features.txt.  A possible way to fix this is
>> to put the "\." line into sql_features.txt, but maybe there's a cleaner
>> answer.  Peter, any thoughts?

> There's no clean answer to this on Cygwin.  This specific case is just a
> little problem that we could solve locally, but in general you'll just end
> up annoying people if you require them to use consistent line endings on
> Cygwin.

Yeah, I was wondering whether you wouldn't propose dropping the newline
consistency check.  I'm not very comfortable with that, but maybe we
should.  Bruce?
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Tom Lane writes:
> >> so it appears that cygwin's "echo" generates a different newline style
> >> than what got put into sql_features.txt.  A possible way to fix this is
> >> to put the "\." line into sql_features.txt, but maybe there's a cleaner
> >> answer.  Peter, any thoughts?
> 
> > There's no clean answer to this on Cygwin.  This specific case is just a
> > little problem that we could solve locally, but in general you'll just end
> > up annoying people if you require them to use consistent line endings on
> > Cygwin.
> 
> Yeah, I was wondering whether you wouldn't propose dropping the newline
> consistency check.  I'm not very comfortable with that, but maybe we
> should.  Bruce?

I posted on that a few minutes ago.  Yea, we can drop it, but we risk
eating carraige returns as data values.  I am not sure how consistently
we output literal carriage returns in old dumps, nor how many apps
produce on literal carriage returns in COPY. If we conditionally eat
them, we run the risk of discarding some of their data without warning. 
Perhaps we can throw a warning rather than an error, and adjust initdb
to be consistent.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Andrew Dunstan
Date:
Bruce Momjian wrote:

>I posted on that a few minutes ago.  Yea, we can drop it, but we risk
>eating carraige returns as data values.  I am not sure how consistently
>we output literal carriage returns in old dumps, nor how many apps
>produce on literal carriage returns in COPY. If we conditionally eat
>them, we run the risk of discarding some of their data without warning. 
>Perhaps we can throw a warning rather than an error, and adjust initdb
>to be consistent.
>
>  
>

Would the best longterm solution be to require escaping CR in a data 
value? (Yes I know this might cause backwards compatibility problems, at 
least for a while).

cheers

andrew



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Perhaps we can throw a warning rather than an error, and adjust initdb
> to be consistent.

I like the idea of reducing the newline consistency check to a warning.
There is one thing we'd have to watch for though (it's already an issue
but would become a bigger one): client-side COPY code had better be
prepared to absorb backend Notice messages while processing COPY IN.
Currently libpq doesn't read input data at all during a COPY IN loop,
which means that if the COPY generates more than a few K of warning
messages, the backend gets blocked on a full pipe and the whole
operation locks up.  I have been meaning to fix that in libpq anyway,
but what other client libraries might have the same issue?  Anyone know
whether JDBC would need a similar fix?

            regards, tom lane

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> Bruce Momjian wrote:
> 
> >I posted on that a few minutes ago.  Yea, we can drop it, but we risk
> >eating carraige returns as data values.  I am not sure how consistently
> >we output literal carriage returns in old dumps, nor how many apps
> >produce on literal carriage returns in COPY. If we conditionally eat
> >them, we run the risk of discarding some of their data without warning. 
> >Perhaps we can throw a warning rather than an error, and adjust initdb
> >to be consistent.
> >
> >  
> >
> 
> Would the best longterm solution be to require escaping CR in a data 
> value? (Yes I know this might cause backwards compatibility problems, at 
> least for a while).

Yes, we do have that documented, but we supported it in earlier
releases, so we can't be sure who is using it, and it probably exists in
older dumps.

The particular problem is not literal carriage returns, which we don't
support any more, but carriage returns that happen to be at the end of
the line, right up against \n.  If we relax this, we will conditionally
strip off the \r.

In fact, we can do that now:
create table xx(text);

Right now, if all the column value end with \r, we will silently eat it.
But if some have \r and some do not, we will throw an error.  (Above,
when I say \r, I mean literal carriage return, not the "\r" string,
which we have always handled cleanly.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Perhaps we can throw a warning rather than an error, and adjust initdb
> > to be consistent.
>
> I like the idea of reducing the newline consistency check to a warning.
> There is one thing we'd have to watch for though (it's already an issue
> but would become a bigger one): client-side COPY code had better be
> prepared to absorb backend Notice messages while processing COPY IN.
> Currently libpq doesn't read input data at all during a COPY IN loop,
> which means that if the COPY generates more than a few K of warning
> messages, the backend gets blocked on a full pipe and the whole
> operation locks up.  I have been meaning to fix that in libpq anyway,
> but what other client libraries might have the same issue?  Anyone know
> whether JDBC would need a similar fix?

Wow, that sounds big.  The ERROR will only happen once, while the
WARNING could happen a lot --- we could add code to throw the WARNING
only once per COPY command --- that would probably make sense.  I don't
see how we could get all clients to handle this for 7.4, particularly
clients from previous releases.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
>
> > so it appears that cygwin's "echo" generates a different newline style
> > than what got put into sql_features.txt.  A possible way to fix this is
> > to put the "\." line into sql_features.txt, but maybe there's a cleaner
> > answer.  Peter, any thoughts?
>
> There's no clean answer to this on Cygwin.  This specific case is just a
> little problem that we could solve locally, but in general you'll just end
> up annoying people if you require them to use consistent line endings on
> Cygwin.

Here is a little diff to make initdb behave if we decide to keep the
COPY check.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/bin/initdb/initdb.sh
===================================================================
RCS file: /cvsroot/pgsql-server/src/bin/initdb/initdb.sh,v
retrieving revision 1.201
diff -c -c -r1.201 initdb.sh
*** src/bin/initdb/initdb.sh    7 Sep 2003 03:36:03 -0000    1.201
--- src/bin/initdb/initdb.sh    26 Sep 2003 18:23:16 -0000
***************
*** 1087,1094 ****
    echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name,
is_supported,comments) FROM STDIN;" 
    cat "$datadir"/sql_features.txt
    echo "\."
! ) \
!     | "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
  echo "ok"

  $ECHO_N "vacuuming database template1... "$ECHO_C
--- 1087,1095 ----
    echo "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name,
is_supported,comments) FROM STDIN;" 
    cat "$datadir"/sql_features.txt
    echo "\."
! ) |
! tr -d '\r' | # make newlines consistent for Win32
! "$PGPATH"/postgres $PGSQL_OPT template1 > /dev/null || exit_nicely
  echo "ok"

  $ECHO_N "vacuuming database template1... "$ECHO_C

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> ... we could add code to throw the WARNING
> only once per COPY command --- that would probably make sense.

Seems like a bit of a kluge, but perhaps the best compromise.  It would
be quite likely that you'd get the same warning on many lines of a COPY,
and that probably isn't really going to help people.

            regards, tom lane

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Tom Lane writes:

> Yeah, I was wondering whether you wouldn't propose dropping the newline
> consistency check.  I'm not very comfortable with that, but maybe we
> should.  Bruce?

I don't mind if we keep it on pure-POSIX platforms.  But one of the nicer
developments on Windows in recent(?) times is that you can actually use
any kind of line separator and most programs still work correctly (with
the notable exception of Notepad).  The drawback of this is that you never
really know what kind of line separator a program is actually going to
write, especially when you start mixing Cygwin, MinGW, and native.  So
this newline consistency check is not only going to be pretty annoying on
Windows, it's going to cancel a feature of the operating system
environment.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Bruce Momjian wrote:

> Tom Lane wrote:
> > Peter Eisentraut <peter_e@gmx.net> writes:
> > > Tom Lane writes:
> > >> so it appears that cygwin's "echo" generates a different newline style
> > >> than what got put into sql_features.txt.  A possible way to fix this is
> > >> to put the "\." line into sql_features.txt, but maybe there's a cleaner
> > >> answer.  Peter, any thoughts?
> > 
> > > There's no clean answer to this on Cygwin.  This specific case is just a
> > > little problem that we could solve locally, but in general you'll just end
> > > up annoying people if you require them to use consistent line endings on
> > > Cygwin.
> > 
> > Yeah, I was wondering whether you wouldn't propose dropping the newline
> > consistency check.  I'm not very comfortable with that, but maybe we
> > should.  Bruce?
> 
> I posted on that a few minutes ago.  Yea, we can drop it, but we risk
> eating carraige returns as data values.  I am not sure how consistently
> we output literal carriage returns in old dumps, nor how many apps
> produce on literal carriage returns in COPY. If we conditionally eat
> them, we run the risk of discarding some of their data without warning. 
> Perhaps we can throw a warning rather than an error, and adjust initdb
> to be consistent.

I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
to a 7.4beta3 database is producing some errors like this:

ERROR:  literal newline found in data
HINT:  Use "\n" to represent newline.
CONTEXT:  COPY FROM, line 59

ERROR:  literal carriage return found in data
HINT:  Use "\r" to represent carriage return.
CONTEXT:  COPY FROM, line 41


These show up with little or no context, only the "line number" of the 
dump file.  Since I'm wrapping these up in pg_dumpall, I don't have the 
dump file so I don't know where the error is really occuring.  It would be 
nice to have such occurances echo the table / row they are getting the 
error on, or maybe just the first 20 or so characters, so they'd be easier 
to identify.

Is this related to this issue?



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > ... we could add code to throw the WARNING
> > only once per COPY command --- that would probably make sense.
>
> Seems like a bit of a kluge, but perhaps the best compromise.  It would
> be quite likely that you'd get the same warning on many lines of a COPY,
> and that probably isn't really going to help people.

I can't think of any cases where we throw _tons_ of warnings at people,
like we could with COPY.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
> to a 7.4beta3 database is producing some errors like this:

> ERROR:  literal newline found in data
> HINT:  Use "\n" to represent newline.
> CONTEXT:  COPY FROM, line 59

> ERROR:  literal carriage return found in data
> HINT:  Use "\r" to represent carriage return.
> CONTEXT:  COPY FROM, line 41

Really?  7.2 should dump data \r or \n as the backslash versions ...
and does in my tests.  Can you make a reproducible test case?



> It would be nice to have such occurances echo the table / row they are
> getting the error on, or maybe just the first 20 or so characters, so
> they'd be easier to identify.

That's not a bad idea.  I think it would be fairly easy now for the
CONTEXT line of the error message to include the input data line:
CONTEXT:  COPY FROM, line 41: "data here ...."

at least up through the field where the error gets thrown, and with some
limit on the length of the data that will get echoed.  If people like
that idea I'll see about making it happen.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Tom Lane wrote:
> > ERROR:  literal carriage return found in data
> > HINT:  Use "\r" to represent carriage return.
> > CONTEXT:  COPY FROM, line 41
> 
> Really?  7.2 should dump data \r or \n as the backslash versions ...
> and does in my tests.  Can you make a reproducible test case?
> 
> 
> 
> > It would be nice to have such occurances echo the table / row they are
> > getting the error on, or maybe just the first 20 or so characters, so
> > they'd be easier to identify.
> 
> That's not a bad idea.  I think it would be fairly easy now for the
> CONTEXT line of the error message to include the input data line:
> 
>     CONTEXT:  COPY FROM, line 41: "data here ...."
> 
> at least up through the field where the error gets thrown, and with some
> limit on the length of the data that will get echoed.  If people like
> that idea I'll see about making it happen.

Also, he wanted table name too.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Yeah, I was wondering whether you wouldn't propose dropping the newline
>> consistency check.  I'm not very comfortable with that, but maybe we
>> should.  Bruce?

> I don't mind if we keep it on pure-POSIX platforms.  But one of the nicer
> developments on Windows in recent(?) times is that you can actually use
> any kind of line separator and most programs still work correctly (with
> the notable exception of Notepad).

Not sure if we should make the behavior Windows-specific though.  And
didn't Michael report seeing the same initdb failure on Debian?  That
confuses me a bit --- why would there be a newline discrepancy on Debian?
        regards, tom lane


Re: initdb failure

From
Andrew Dunstan
Date:
Tom Lane wrote:

>>I don't mind if we keep it on pure-POSIX platforms.  But one of the nicer
>>developments on Windows in recent(?) times is that you can actually use
>>any kind of line separator and most programs still work correctly (with
>>the notable exception of Notepad).
>>    
>>
>
>Not sure if we should make the behavior Windows-specific though.  And
>didn't Michael report seeing the same initdb failure on Debian?  That
>confuses me a bit --- why would there be a newline discrepancy on Debian?
>
>  
>
Is this affected by the fact that the platform where a dump is made 
might have a different line-end discipline from the one where the dump 
is restored?

cheers

andrew




Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Tom Lane wrote:

> "scott.marlowe" <scott.marlowe@ihs.com> writes:
> > I'm running into issues where 7.4's pg_dump/pg_dumpall from a 7.2 database 
> > to a 7.4beta3 database is producing some errors like this:
> 
> > ERROR:  literal newline found in data
> > HINT:  Use "\n" to represent newline.
> > CONTEXT:  COPY FROM, line 59
> 
> > ERROR:  literal carriage return found in data
> > HINT:  Use "\r" to represent carriage return.
> > CONTEXT:  COPY FROM, line 41
> 
> Really?  7.2 should dump data \r or \n as the backslash versions ...
> and does in my tests.  Can you make a reproducible test case?

The attached file produces this problem.  Note it's a blank trailing field 
that looks to be causing it.  The error for this .sql file is:

ERROR:  literal carriage return found in data
HINT:  Use "\r" to represent carriage return.
CONTEXT:  COPY FROM, line 2

Note that loading this into pico and saving it back out fixes the problem.

If I remove the preceding row that doesn't end in a blank field, I get a 
different error, this one:

ERROR:  end-of-copy marker does not match previous newline style
CONTEXT:  COPY FROM, line 2

> > It would be nice to have such occurances echo the table / row they are
> > getting the error on, or maybe just the first 20 or so characters, so
> > they'd be easier to identify.
> 
> That's not a bad idea.  I think it would be fairly easy now for the
> CONTEXT line of the error message to include the input data line:
> 
>     CONTEXT:  COPY FROM, line 41: "data here ...."
> 
> at least up through the field where the error gets thrown, and with some
> limit on the length of the data that will get echoed.  If people like
> that idea I'll see about making it happen.

table name too, like Bruce said.  The bothersome bit is that in pg_dump, 
it says the line, relative to just this part of the copy command, so you 
don't even know which table is giving the error.

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
scott.marlowe wrote:
> The attached file produces this problem.  Note it's a blank trailing field 
> that looks to be causing it.  The error for this .sql file is:
> 
> ERROR:  literal carriage return found in data
> HINT:  Use "\r" to represent carriage return.
> CONTEXT:  COPY FROM, line 2
> 
> Note that loading this into pico and saving it back out fixes the problem.
> 
> If I remove the preceding row that doesn't end in a blank field, I get a 
> different error, this one:
> 
> ERROR:  end-of-copy marker does not match previous newline style
> CONTEXT:  COPY FROM, line 2

OK, 'vi' shows it as:COPY people2 (id, persons) FROM stdin;59      Chance Terry--S60      ^M\.

which is _exactly the case the error was supposed to catch.  Now, the
big question is where did this dump come from?  Pg version?  OS platform?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> scott.marlowe wrote:
> > The attached file produces this problem.  Note it's a blank trailing field 
> > that looks to be causing it.  The error for this .sql file is:
> > 
> > ERROR:  literal carriage return found in data
> > HINT:  Use "\r" to represent carriage return.
> > CONTEXT:  COPY FROM, line 2
> > 
> > Note that loading this into pico and saving it back out fixes the problem.
> > 
> > If I remove the preceding row that doesn't end in a blank field, I get a 
> > different error, this one:
> > 
> > ERROR:  end-of-copy marker does not match previous newline style
> > CONTEXT:  COPY FROM, line 2
> 
> OK, 'vi' shows it as:
>     
>     COPY people2 (id, persons) FROM stdin;
>     59      Chance Terry--S
>     60      ^M
>     \.

The significant issue here is that we don't want to throw a warning in
this case;  it should be an error --- this is clearly a data value we
don't want to discard as part of the end-of-line.

The argument that you want a warning because you might have mixed
newlines in the file seems less likely than this case where they are
using a literal carriage return as a data value at the end of the line.

Seems we should just fix initdb as my patch does and see if we get any
more mixed-newline cases or literal carriage-return data cases.  My
guess is that we will get more of the latter.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Bruce Momjian wrote:

> scott.marlowe wrote:
> > The attached file produces this problem.  Note it's a blank trailing field 
> > that looks to be causing it.  The error for this .sql file is:
> > 
> > ERROR:  literal carriage return found in data
> > HINT:  Use "\r" to represent carriage return.
> > CONTEXT:  COPY FROM, line 2
> > 
> > Note that loading this into pico and saving it back out fixes the problem.
> > 
> > If I remove the preceding row that doesn't end in a blank field, I get a 
> > different error, this one:
> > 
> > ERROR:  end-of-copy marker does not match previous newline style
> > CONTEXT:  COPY FROM, line 2
> 
> OK, 'vi' shows it as:
>     
>     COPY people2 (id, persons) FROM stdin;
>     59      Chance Terry--S
>     60      ^M
>     \.
> 
> which is _exactly the case the error was supposed to catch.  Now, the
> big question is where did this dump come from?  Pg version?  OS platform?

The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
on it.

The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
running on it.

The data likely came out of a (gasp, horrors) windows box.



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
scott.marlowe wrote:
> > OK, 'vi' shows it as:
> >     
> >     COPY people2 (id, persons) FROM stdin;
> >     59      Chance Terry--S
> >     60      ^M
> >     \.
> > 
> > which is _exactly the case the error was supposed to catch.  Now, the
> > big question is where did this dump come from?  Pg version?  OS platform?
> 
> The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> on it.
> 
> The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> running on it.
> 
> The data likely came out of a (gasp, horrors) windows box.

OK, try this on your 7.2:
test=> create table test(x text);CREATE TABLEtest=> insert into test values ('\r');INSERT 17158 1test=> copy test to
'/tmp/out';COPY

Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Michael Meskes
Date:
On Fri, Sep 26, 2003 at 03:05:58PM -0400, Tom Lane wrote:
> Not sure if we should make the behavior Windows-specific though.  And
> didn't Michael report seeing the same initdb failure on Debian?  That
> confuses me a bit --- why would there be a newline discrepancy on Debian?

I take it there are no special characters in that COPY, right? The only
difference between the two machines we found was that my was running
under UTF-8. 

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: meskes@jabber.org
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Bruce Momjian wrote:

> scott.marlowe wrote:
> > > OK, 'vi' shows it as:
> > >     
> > >     COPY people2 (id, persons) FROM stdin;
> > >     59      Chance Terry--S
> > >     60      ^M
> > >     \.
> > > 
> > > which is _exactly the case the error was supposed to catch.  Now, the
> > > big question is where did this dump come from?  Pg version?  OS platform?
> > 
> > The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> > on it.
> > 
> > The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> > running on it.
> > 
> > The data likely came out of a (gasp, horrors) windows box.
> 
> OK, try this on your 7.2:
> 
>     test=> create table test(x text);
>     CREATE TABLE
>     test=> insert into test values ('\r');
>     INSERT 17158 1
>     test=> copy test to '/tmp/out';
>     COPY
> 
> Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.

I'm not much of a vi guy, so the out file thing didn't tell me much, but 
if I try and dump it from the 7.4 beta3 box into a new table, I get:

ERROR:  end-of-copy marker does not match previous newline style
CONTEXT:  COPY FROM, line 2




Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Bruce Momjian wrote:

> scott.marlowe wrote:
> > > OK, 'vi' shows it as:
> > >     
> > >     COPY people2 (id, persons) FROM stdin;
> > >     59      Chance Terry--S
> > >     60      ^M
> > >     \.
> > > 
> > > which is _exactly the case the error was supposed to catch.  Now, the
> > > big question is where did this dump come from?  Pg version?  OS platform?
> > 
> > The originating system is a RedHat 7.2 box with postgresql 7.2.x running 
> > on it.
> > 
> > The destination system is a RedHat 7.2 box with postgresql 7.4 beta3 
> > running on it.
> > 
> > The data likely came out of a (gasp, horrors) windows box.
> 
> OK, try this on your 7.2:
> 
>     test=> create table test(x text);
>     CREATE TABLE
>     test=> insert into test values ('\r');
>     INSERT 17158 1
>     test=> copy test to '/tmp/out';
>     COPY
> 
> Then 'vi' /tmp/out. It should show \r, not ^M.  Please report back.

Figured out iv (-b switch, man pages rock) and it's a ^M




Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
scott.marlowe writes:

> table name too, like Bruce said.  The bothersome bit is that in pg_dump,
> it says the line, relative to just this part of the copy command, so you
> don't even know which table is giving the error.

I don't see the problem.  Can't you identify the failing command by the
line number that psql gives you?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> The argument that you want a warning because you might have mixed
> newlines in the file seems less likely than this case where they are
> using a literal carriage return as a data value at the end of the line.

I don't agree with that assessment.  Who actually has CRs in their data?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

> Bruce Momjian writes:
> 
> > The argument that you want a warning because you might have mixed
> > newlines in the file seems less likely than this case where they are
> > using a literal carriage return as a data value at the end of the line.
> 
> I don't agree with that assessment.  Who actually has CRs in their data?

I do.  And so do lots of other people.



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

> scott.marlowe writes:
> 
> > table name too, like Bruce said.  The bothersome bit is that in pg_dump,
> > it says the line, relative to just this part of the copy command, so you
> > don't even know which table is giving the error.
> 
> I don't see the problem.  Can't you identify the failing command by the
> line number that psql gives you?

OK, here's the output from pg_dump while it's running

ERROR:  function "odbc_user" already exists with same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user "ayousuff".
ERROR:  literal newline found in data
HINT:  Use "\n" to represent newline.
CONTEXT:  COPY FROM, line 59
You are now connected as new user "smarlowe".
You are now connected as new user "ayousuff".
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
You are now connected as new user "smarlowe".
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX

So how am I supposed to figure out which table and which row broke?  Keep 
in mind, I'm doing this:

pg_dump -h otherserver dbname|psql -h desthost dbname

but I get basically the same thing if I dump it to a .sql file and do:

psql dbname <dbname.sql





Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Larry Rosenman
Date:

--On Friday, September 26, 2003 14:36:08 -0600 "scott.marlowe"
<scott.marlowe@ihs.com> wrote:

> On Fri, 26 Sep 2003, Peter Eisentraut wrote:
>
>> Bruce Momjian writes:
>>
>> > The argument that you want a warning because you might have mixed
>> > newlines in the file seems less likely than this case where they are
>> > using a literal carriage return as a data value at the end of the line.
>>
>> I don't agree with that assessment.  Who actually has CRs in their data?
>
> I do.  And so do lots of other people.
I store E-Mail type comments in some of my DB's and they may contain cr's.

Don't ASSUME that we don't do stuff like this with a database.



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
scott.marlowe writes:

> but I get basically the same thing if I dump it to a .sql file and do:
>
> psql dbname <dbname.sql

Use psql -f dbname.sql instead.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Peter Eisentraut wrote:

> scott.marlowe writes:
> 
> > but I get basically the same thing if I dump it to a .sql file and do:
> >
> > psql dbname <dbname.sql
> 
> Use psql -f dbname.sql instead.

and the output is:

psql:webport.sql:803: ERROR:  function "odbc_user" already exists with 
same argument types
REVOKE
REVOKE
GRANT
You are now connected as new user "ayousuff".
psql:webport.sql:869: ERROR:  literal newline found in data
HINT:  Use "\n" to represent newline.
CONTEXT:  COPY FROM, line 59
You are now connected as new user "smarlowe".
You are now connected as new user "ayousuff".
CREATE INDEX
CREATE INDEX
CREATE INDEX


the same.  It doesn't tell me which table in my dump caused the problem, 
and it certainly isn't line 59 of the dump file, but of the table 
producing the error.



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> scott.marlowe writes:
>> but I get basically the same thing if I dump it to a .sql file and do:
>> psql dbname <dbname.sql

> Use psql -f dbname.sql instead.

This doesn't seem like a good argument not to add more information to
the CONTEXT line for COPY errors.  Sure, in theory the existing info
should be sufficient, but what if the information is not coming in
through psql?  (For instance, maybe the COPY data is being generated
on-the-fly by some other program.)  Or what if the dump file is so large
you can't easily edit it to determine which line number is in question?
There are plenty of scenarios where it's not all that convenient to
triangulate on a problem from outside information.  Minimalism isn't
really a virtue in error reports anyway.

I'm thinking maybe:
CONTEXT:  COPY tablename, line 41: "data ..."

would serve the purpose nicely.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
"scott.marlowe"
Date:
On Fri, 26 Sep 2003, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > scott.marlowe writes:
> >> but I get basically the same thing if I dump it to a .sql file and do:
> >> psql dbname <dbname.sql
> 
> > Use psql -f dbname.sql instead.
> 
> This doesn't seem like a good argument not to add more information to
> the CONTEXT line for COPY errors.  Sure, in theory the existing info
> should be sufficient, but what if the information is not coming in
> through psql?  (For instance, maybe the COPY data is being generated
> on-the-fly by some other program.)  Or what if the dump file is so large
> you can't easily edit it to determine which line number is in question?
> There are plenty of scenarios where it's not all that convenient to
> triangulate on a problem from outside information.  Minimalism isn't
> really a virtue in error reports anyway.
> 
> I'm thinking maybe:
> 
>     CONTEXT:  COPY tablename, line 41: "data ..."
> 
> would serve the purpose nicely.

Yeah, just having the table name and line number would be plenty for me.  
It's the lack of a table name that makes it so frustrating.  I had to 
basically dump / restore the tables one at a time to figure out which one 
was causing the error.  On a database with hundreds of tables, that could 
be painful.



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Bruce Momjian writes:
>> The argument that you want a warning because you might have mixed
>> newlines in the file seems less likely than this case where they are
>> using a literal carriage return as a data value at the end of the line.

> I don't agree with that assessment.  Who actually has CRs in their data?

The question isn't so much "who has CRs in their data" as "who is trying
to import data files in which CRs aren't correctly represented as \r" ?
Not anyone upgrading from a recent PG release ... though 7.1 or before
would have an issue.

If we eliminate the mixed-newline check then we will silently discard
CRs that appear at the tail end of the last column of a data line.  This
is somewhat worrisome because the failure will not be obvious.  But in
most cases (CR in the middle of a data field, or in a field that isn't
last), we will detect and report an error due to field count mismatch.
I think that that might be good enough.

If we were designing COPY from scratch today, we'd certainly not think
that a mixed-newline check was appropriate; it's only because of our
historical handling of literal CRs that the question comes up at all.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
scott.marlowe writes:

> psql:webport.sql:803: ERROR:  function "odbc_user" already exists with
> same argument types
> REVOKE
> REVOKE
> GRANT
> You are now connected as new user "ayousuff".
> psql:webport.sql:869: ERROR:  literal newline found in data
> HINT:  Use "\n" to represent newline.
> CONTEXT:  COPY FROM, line 59
> You are now connected as new user "smarlowe".
> You are now connected as new user "ayousuff".
> CREATE INDEX
> CREATE INDEX
> CREATE INDEX
>
>
> the same.  It doesn't tell me which table in my dump caused the problem,
> and it certainly isn't line 59 of the dump file, but of the table
> producing the error.

The table is the one the COPY command on line 869 in file webport.sql
mentions and the problem data is on line 928 of that file.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
I said:
> The question isn't so much "who has CRs in their data" as "who is trying
> to import data files in which CRs aren't correctly represented as \r" ?
> Not anyone upgrading from a recent PG release ... though 7.1 or before
> would have an issue.

Actually, checking the CVS logs shows that 7.2.1 was the first release
that would emit a data CR or LF as \r or \n rather than literally.  If
Scott is dumping from an original 7.2 installation rather than a dot-
release, that would explain his report.

I talked to Bruce about this on the phone, and we now both feel that
that change was too recent to assume that people won't be trying to load
dumps containing bare CRs into 7.4.  Accordingly, it seems that we do
need to throw an error or at least a warning for mixed newlines.
Eventually we can get rid of that behavior and just treat all newline
sequences alike, but I think we can't do it until 7.2.0 is in the
"ancient history" category.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Tom Lane writes:

> This doesn't seem like a good argument not to add more information to
> the CONTEXT line for COPY errors.  Sure, in theory the existing info
> should be sufficient, but what if the information is not coming in
> through psql?  (For instance, maybe the COPY data is being generated
> on-the-fly by some other program.)

Then you look into the code of that other program.  Or you look into the
server log, where you should log the statements you generate if you are
testing your code.

> Or what if the dump file is so large you can't easily edit it to
> determine which line number is in question?

The line number is already contained in the available information about
the error.  If the file is too large to load into an editor, you could use

perl -npi -e '$. == 123456 and s/\r/\\r/;'

or something along those lines.

> There are plenty of scenarios where it's not all that convenient to
> triangulate on a problem from outside information.

Maybe, but the ones I've seen mentioned so far are not among them.  I'm
not opposed to making errors more easy to identify, but considering that
someone else in this thread didn't even know about psql's option to print
line numbers of errors, I think some people haven't done their homework
yet.

> Minimalism isn't really a virtue in error reports anyway.
>
> I'm thinking maybe:
>
>     CONTEXT:  COPY tablename, line 41: "data ..."
>
> would serve the purpose nicely.

The only thing that would really help in the general case is the number of
the character where the error occurs.  If you print the actual data you
lose if the data is repeated within the, er, data and/or if the section of
the data that you print contains crazy characters that mess up the
display.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> scott.marlowe writes:
> 
> > table name too, like Bruce said.  The bothersome bit is that in pg_dump,
> > it says the line, relative to just this part of the copy command, so you
> > don't even know which table is giving the error.
> 
> I don't see the problem.  Can't you identify the failing command by the
> line number that psql gives you?

If you are loading from pg_dump, you have lots of copy commands, so how
do you know which COPY command caused the failure.  You just have the
line number of _a_ copy.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > The argument that you want a warning because you might have mixed
> > newlines in the file seems less likely than this case where they are
> > using a literal carriage return as a data value at the end of the line.
> 
> I don't agree with that assessment.  Who actually has CRs in their data?

Uh, people storing Win32 text documents.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
>> Minimalism isn't really a virtue in error reports anyway.
>> I'm thinking maybe:
>> CONTEXT:  COPY tablename, line 41: "data ..."
>> would serve the purpose nicely.

> The only thing that would really help in the general case is the number of
> the character where the error occurs.  If you print the actual data you
> lose if the data is repeated within the, er, data and/or if the section of
> the data that you print contains crazy characters that mess up the
> display.

A messed-up display would be useful information in itself; and it's not
like we have no precedent for repeating the erroneous data in error
messages.  Consider pg_atoi:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> z44
>> \.
ERROR:  invalid input syntax for integer: "z44"
CONTEXT:  COPY FROM, line 1
regression=#

I was chewing this over with Bruce on the phone just now, and we refined
the idea a little.  Some errors (primarily those detected inside the
datatype input procedures) can be clearly traced to a specific column,
whereas others (such as too many fields on an input line) really can't
be nailed down more tightly than a line.  So we were thinking about two
different flavors of context info:
CONTEXT:  COPY tablename, line n, field colname: "column data"

versus
CONTEXT:  COPY tablename, line n: "line data"

where in each case the data display would be truncated at 100 characters
or so (and would have to be omitted in the COPY BINARY case anyway).

If you're really concerned about funny characters messing up the report,
we could imagine replacing them by backslash sequences or some such, but
I suspect that would create more confusion than it would solve.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
> 
> > so it appears that cygwin's "echo" generates a different newline style
> > than what got put into sql_features.txt.  A possible way to fix this is
> > to put the "\." line into sql_features.txt, but maybe there's a cleaner
> > answer.  Peter, any thoughts?
> 
> There's no clean answer to this on Cygwin.  This specific case is just a
> little problem that we could solve locally, but in general you'll just end
> up annoying people if you require them to use consistent line endings on
> Cygwin.

At this point, we have one mixed-EOL case (initdb, fixable), and one \r
in data case (from 7.2).  Seems the safest solution is to fix initdb and
see what other failure reports we get.

If we downgrade it to a warning, we will not know about the failures;
someone could miss a warning in a dump, but probably will not mess an
error and an empty table.  If we get more reports like initdb, we can
reevaluate, but it seems the safest course is to keep our existing code,
which only removes carriage returns when it quite sure.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [JDBC] initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Kris Jurka
Date:

On Fri, 26 Sep 2003, Tom Lane wrote:

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Perhaps we can throw a warning rather than an error, and adjust initdb
> > to be consistent.
>
> I like the idea of reducing the newline consistency check to a warning.
> There is one thing we'd have to watch for though (it's already an issue
> but would become a bigger one): client-side COPY code had better be
> prepared to absorb backend Notice messages while processing COPY IN.
> Currently libpq doesn't read input data at all during a COPY IN loop,
> which means that if the COPY generates more than a few K of warning
> messages, the backend gets blocked on a full pipe and the whole
> operation locks up.  I have been meaning to fix that in libpq anyway,
> but what other client libraries might have the same issue?  Anyone know
> whether JDBC would need a similar fix?
>

JDBC does not support the COPY protocol at the moment.

Kris Jurka


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Jon Jensen
Date:
On Fri, 26 Sep 2003, Tom Lane wrote:

> I was chewing this over with Bruce on the phone just now, and we refined
> the idea a little.  Some errors (primarily those detected inside the
> datatype input procedures) can be clearly traced to a specific column,
> whereas others (such as too many fields on an input line) really can't
> be nailed down more tightly than a line.  So we were thinking about two
> different flavors of context info:
> 
>     CONTEXT:  COPY tablename, line n, field colname: "column data"
> 
> versus
> 
>     CONTEXT:  COPY tablename, line n: "line data"

Those would be very helpful bits of information.

> where in each case the data display would be truncated at 100 characters
> or so (and would have to be omitted in the COPY BINARY case anyway).
> 
> If you're really concerned about funny characters messing up the report,
> we could imagine replacing them by backslash sequences or some such, but
> I suspect that would create more confusion than it would solve.

I hate to mention it, but would it be useful/non-overkill to make either
of those things (context message maximum length and funny character
escaping) configurable somehow?

Jon


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Jon Jensen wrote:
> On Fri, 26 Sep 2003, Tom Lane wrote:
> 
> > I was chewing this over with Bruce on the phone just now, and we refined
> > the idea a little.  Some errors (primarily those detected inside the
> > datatype input procedures) can be clearly traced to a specific column,
> > whereas others (such as too many fields on an input line) really can't
> > be nailed down more tightly than a line.  So we were thinking about two
> > different flavors of context info:
> > 
> >     CONTEXT:  COPY tablename, line n, field colname: "column data"
> > 
> > versus
> > 
> >     CONTEXT:  COPY tablename, line n: "line data"
> 
> Those would be very helpful bits of information.
> 
> > where in each case the data display would be truncated at 100 characters
> > or so (and would have to be omitted in the COPY BINARY case anyway).
> > 
> > If you're really concerned about funny characters messing up the report,
> > we could imagine replacing them by backslash sequences or some such, but
> > I suspect that would create more confusion than it would solve.
> 
> I hate to mention it, but would it be useful/non-overkill to make either
> of those things (context message maximum length and funny character
> escaping) configurable somehow?

Overkill.  Sorry.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Oliver Elphick
Date:
On Fri, 2003-09-26 at 23:18, Bruce Momjian wrote:
> If you are loading from pg_dump, you have lots of copy commands, so how
> do you know which COPY command caused the failure.  You just have the
> line number of _a_ copy.

I would recommend using psql -e
so that the sql commands are output too.

Here is a documentation patch:

Index: backup.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v
retrieving revision 2.30
diff -c -r2.30 backup.sgml
*** backup.sgml    31 Aug 2003 17:32:18 -0000    2.30
--- backup.sgml    27 Sep 2003 05:45:04 -0000
***************
*** 100,106 ****     be read in by the <application>psql</application> program. The     general command form to restore
adump is <synopsis>
 
! psql <replaceable class="parameter">dbname</replaceable> < <replaceable class="parameter">infile</replaceable>
</synopsis>    where <replaceable class="parameter">infile</replaceable> is what     you used as <replaceable
class="parameter">outfile</replaceable>
--- 100,106 ----     be read in by the <application>psql</application> program. The     general command form to restore
adump is <synopsis>
 
! psql -f <replaceable class="parameter">infile</replaceable> -d <replaceable class="parameter">dbname</replaceable>
</synopsis>    where <replaceable class="parameter">infile</replaceable> is what     you used as <replaceable
class="parameter">outfile</replaceable>
***************
*** 112,117 ****
--- 112,124 ----     <application>psql</> supports similar options to <application>pg_dump</>      for controlling the
databaseserver location and the user name. See     its reference page for more information.
 
+    </para>
+ 
+    <para>
+     With a large dump, it may be difficult to identify where any errors are
+     occurring.  You may use the -e option to psql to print the SQL commands
+     as they are run, so that it is easy to see precisely which commands are
+     causing errors.    </para>     <para>

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
========================================   "My brethren, count it all joy when ye fall into     various trials, Knowing
thatthe testing of your faith     produces endurance."          James 1:2,3 
 



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Jon Jensen writes:

> I hate to mention it, but would it be useful/non-overkill to make either
> of those things (context message maximum length and funny character
> escaping) configurable somehow?

I'm thinking the same.  We have conflicting needs of different groups:
Users on "real" Unix platforms converting old data need the checks, users
on Windows platforms with new data don't want them.  (I wonder what people
on Mac OS that mix Unix-style and native applications would want.)

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Oliver Elphick writes:

> +    <para>
> +     With a large dump, it may be difficult to identify where any errors are
> +     occurring.  You may use the -e option to psql to print the SQL commands
> +     as they are run, so that it is easy to see precisely which commands are
> +     causing errors.
>      </para>

That is just not true.  If you use -f, it will tell you the line number of
the command causing the error.  Add the line number of the COPY error
message, there you have it.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Tom Lane writes:
> > 
> > > so it appears that cygwin's "echo" generates a different newline style
> > > than what got put into sql_features.txt.  A possible way to fix this is
> > > to put the "\." line into sql_features.txt, but maybe there's a cleaner
> > > answer.  Peter, any thoughts?
> > 
> > There's no clean answer to this on Cygwin.  This specific case is just a
> > little problem that we could solve locally, but in general you'll just end
> > up annoying people if you require them to use consistent line endings on
> > Cygwin.
> 
> At this point, we have one mixed-EOL case (initdb, fixable), and one \r
> in data case (from 7.2).  Seems the safest solution is to fix initdb and
> see what other failure reports we get.
> 
> If we downgrade it to a warning, we will not know about the failures;
> someone could miss a warning in a dump, but probably will not mess an
> error and an empty table.  If we get more reports like initdb, we can
> reevaluate, but it seems the safest course is to keep our existing code,
> which only removes carriage returns when it quite sure.

OK, I have applied a patch to make initdb create a COPY file with
consistent end-of-line termination, so it passes the COPY EOL checking.

If we find more cases where load files have intended inconsistent EOL
termination, we can revisit the COPY error for inconsistent EOL
termination.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Oliver Elphick writes:
> 
> > +    <para>
> > +     With a large dump, it may be difficult to identify where any errors are
> > +     occurring.  You may use the -e option to psql to print the SQL commands
> > +     as they are run, so that it is easy to see precisely which commands are
> > +     causing errors.
> >      </para>
> 
> That is just not true.  If you use -f, it will tell you the line number of
> the command causing the error.  Add the line number of the COPY error
> message, there you have it.

You are assuming it is easy to find what is on a specific line of the
dump file.  I am not sure that is always easy for people with limited
Unix skills, or MSWin folks.  I am not sure I would have thought to add
the file offset to find the problem COPY line.  I guess I would have
eventually, but it wouldn't have been my first idea, and I might _not_
have used -f on the load, and if the load took an hour, I would have to
run it again to get that line number.

I can see the point that the table name is only really valuable when you
are loading a dump, and not valuabvle when you are just doing a copy. 
However, copy is used enought in dumps that the exta word (the table
name) doesn't see to hurt.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> You are assuming it is easy to find what is on a specific line of the
> dump file.  I am not sure that is always easy for people with limited
> Unix skills, or MSWin folks.  I am not sure I would have thought to add
> the file offset to find the problem COPY line.  I guess I would have
> eventually, but it wouldn't have been my first idea, and I might _not_
> have used -f on the load, and if the load took an hour, I would have to
> run it again to get that line number.

That is all besides the point.  If adding -f to the command line is for
some reason prohibitive, then the same applies to -e.  That is all.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Peter Eisentraut
Date:
Bruce Momjian writes:

> OK, I have applied a patch to make initdb create a COPY file with
> consistent end-of-line termination, so it passes the COPY EOL checking.

Maybe instead we can change the COPY command to read the file directly and
not via stdin?  Then we don't need the \. marker.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > You are assuming it is easy to find what is on a specific line of the
> > dump file.  I am not sure that is always easy for people with limited
> > Unix skills, or MSWin folks.  I am not sure I would have thought to add
> > the file offset to find the problem COPY line.  I guess I would have
> > eventually, but it wouldn't have been my first idea, and I might _not_
> > have used -f on the load, and if the load took an hour, I would have to
> > run it again to get that line number.
> 
> That is all besides the point.  If adding -f to the command line is for
> some reason prohibitive, then the same applies to -e.  That is all.

I see, both -e give query before error, -f gives line number before
error.  I suppose the -e is clearer because you don't have to find the
line in the file, but the -e output makes it more likely they would miss
an error line in the output.

Seems we should recommend -f rather than "<" for restores anyway, right?

Reporting the table with the error is clearer, but this brings up
another case --- what happens with pg_dumpall?  Do we print the database
name or will they know the database name from the table name?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Bruce Momjian writes:
> 
> > OK, I have applied a patch to make initdb create a COPY file with
> > consistent end-of-line termination, so it passes the COPY EOL checking.
> 
> Maybe instead we can change the COPY command to read the file directly and
> not via stdin?  Then we don't need the \. marker.

Yes, we need a temp table then.  The single-line fix seemed easier.

Also, the file is a mix of things from a file and echo's, it isn't just
the \.:
  echo "UPDATE information_schema.sql_implementation_info SET character_value = '$combined_version' WHERE
implementation_info_name= 'DBMS VERSION';"  echo "COPY information_schema.sql_features (feature_id, feature_name,
sub_feature_id,sub_feature_name, is_supported, comments) FROM STDIN;"  cat "$datadir"/sql_features.txt  echo "\.") |tr
-d'\r' | # make newlines consistent for Win32
 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Alvaro Herrera
Date:
On Sat, Sep 27, 2003 at 01:08:37PM -0400, Bruce Momjian wrote:
> Peter Eisentraut wrote:

> > That is all besides the point.  If adding -f to the command line is for
> > some reason prohibitive, then the same applies to -e.  That is all.
> 
> Seems we should recommend -f rather than "<" for restores anyway, right?

Please keep in mind that it is not always possible to use -f.  In my
case I had a file larger than 2GB and psql was compiled without large
file support.  Detecting an error in that situation would have required
recompiling psql.

Also I don't see the point in not extending the context message of the
error.  It's not like it's going to take too much processing power, nor
screen estate, so what is it?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Right now the sectors on the hard disk run clockwise, but I heard a rumor that
you can squeeze 0.2% more throughput by running them counterclockwise.
It's worth the effort. Recommended."  (Gerry Pourwelle)


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Peter Eisentraut wrote:
>> Maybe instead we can change the COPY command to read the file directly and
>> not via stdin?  Then we don't need the \. marker.

I like that idea too.

> Yes, we need a temp table then.  The single-line fix seemed easier.

Why a temp table?  The COPY command is only sourcing the contents of
"$datadir"/sql_features.txt.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Jon Jensen writes:
>> I hate to mention it, but would it be useful/non-overkill to make either
>> of those things (context message maximum length and funny character
>> escaping) configurable somehow?

> I'm thinking the same.  We have conflicting needs of different groups:
> Users on "real" Unix platforms converting old data need the checks, users
> on Windows platforms with new data don't want them.  (I wonder what people
> on Mac OS that mix Unix-style and native applications would want.)

I wouldn't object to adding a GUC variable that turns off the
mixed-newlines error (as long as it defaults to enabled ;-)).

Our intention to remove the check in the future might then reduce
to simply flipping the factory default for this variable.
        regards, tom lane


Re: initdb failure (was Re: [GENERAL] sequence's plpgsql)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Peter Eisentraut wrote:
> >> Maybe instead we can change the COPY command to read the file directly and
> >> not via stdin?  Then we don't need the \. marker.
> 
> I like that idea too.
> 
> > Yes, we need a temp table then.  The single-line fix seemed easier.
> 
> Why a temp table?  The COPY command is only sourcing the contents of
> "$datadir"/sql_features.txt.

Sorry, I meant temp file.  I have SQL on the brain.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: initdb failure

From
Christopher Browne
Date:
pgman@candle.pha.pa.us (Bruce Momjian) writes:
> Peter Eisentraut wrote:
>> Bruce Momjian writes:
>> 
>> > The argument that you want a warning because you might have mixed
>> > newlines in the file seems less likely than this case where they are
>> > using a literal carriage return as a data value at the end of the line.
>> 
>> I don't agree with that assessment.  Who actually has CRs in their data?
>
> Uh, people storing Win32 text documents.

Or people storing XML data that may contain all sorts of whitespace
that may include various inter-line "metadata."
-- 
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)