Thread: [Fwd: Re: [pgsql-hackers-win32] Import from Linux to Windows]

[Fwd: Re: [pgsql-hackers-win32] Import from Linux to Windows]

From
Andrew Dunstan
Date:
[redirecting]

I have abstracted this problem, and we definitely have a newline bug
that has to be fixed, IMNSHO.

Attached are 2 scripts that are identical except that one has DOS-style
line endings and one has Unix style line endings. The DOS-style just
fails miserably with no warning. It's not related to Windows file-end
munging - the problem is observable on Linux - psql doesn't correctly
detect the end of copy input with "\." if it's followed by CRNL.

The attached patch appears to solve the problem. However, while it makes
us conform to the first sentence below from the docs, it doesn't comply
with the second. Not sure what to do about that. Maybe there's a better
solution?

"COPY FROM can handle lines ending with newlines, carriage returns, or
carriage return/newlines. To reduce the risk of error due to
un-backslashed newlines or carriage returns that were meant as data,
COPY FROM will complain if the line endings in the input are not all alike."

cheers

andrew


Tom Haddon wrote:

>Here's what I can share, as there's some sensitive stuff in others.
>
>This is just a dump from one of the databases. I was able to load it
>using the \i switch, but it has only populated data from the first
>table. Also, it kept prompting me to hit a key as the screen was
>scrolling. This doesn't seem normal to me. And I tried running it as a
>SQL statement from within PgAdmin3, and got:
>
>"ERROR:  syntax error at or near "1" at character 15030"
>
>The first "1" you see is where this error is.
>
>COPY backup_data (id, lu, ub, tape, contents, last_writter,
>times_written) FROM stdin;
>1    2004-07-13 09:37:14.78254    thaddon    DLT000005    Full
>Backup (including only 2004 filings) 6/9/04\r\n\r\nNow archived
>2004-06-09    6
>4    2004-07-13 09:40:57.876355    thaddon    DLT000009    Full
>Backup 7/12/04    2004-07-12    4
>3    2004-07-21 16:28:09.843069    thaddon    DLT000007    Full
>Backup 7/21/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR \r\nD:\\backup\\*.*
>/SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE
>\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR \r\nYODA\\qa\\*.*
>/SUBDIR \r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n    2004-07-21
>14
>2    2004-07-22 13:27:01.657706    thaddon    DLT000006    Full
>Backup 7/22/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR \r\nD:\\backup\\*.*
>/SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE
>\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR \r\nYODA\\qa\\*.*
>/SUBDIR \r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n    2004-07-22
>10
>5    2004-07-27 10:49:22.786997    thaddon    DLT000008
>Incremental backup since last full 7/21/04 last updated
>7/23/04\r\n\r\nWill include (as of 7/23/04):\r\n\r\nC:\\Perforce\\*.*
>/SUBDIR \r\nD:\\backup\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE
>\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR \r\nYODA\\qa\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR
>\r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n    2004-07-23    3
>6    2004-07-27 10:50:11.802647    thaddon    DLT000010
>Differential backup since 7/21/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR
>\r\nD:\\backup\\*.* /SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.*
>/SUBDIR \r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR
>\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE
>\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR \r\nYODA\\qa\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR
>\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR
>\r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n    2004-07-27    1
>\.
>
>Thanks, Tom
>
>-----Original Message-----
>From: Andrew Dunstan [mailto:andrew@dunslane.net]
>Sent: Thursday, August 12, 2004 8:40 AM
>To: Tom Haddon
>Cc: pgsql-hackers-win32@postgresql.org
>Subject: Re: [pgsql-hackers-win32] Import from Linux to Windows
>
>
>
>
>Tom Haddon wrote:
>
>
>
>>I don't think so. Did a search for it in vi and nothing. It doesn't
>>give me an error, just exits. Last statement is "CREATE TABLE".
>>
>>Sorry, not very helpful...
>>
>>
>>
>>
>>
>>
>>
>
>I think you'll need to let us look at the dump file to make any progress
>
>- otherwise we are just guessing in the dark.
>
>thanks
>
>andrew
>
>

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


CREATE TABLE bar (
    c text,
    d text
);
CREATE TABLE foo (
    a text,
    b text
);
COPY bar (c, d) FROM stdin;
c    d
\.
COPY foo (a, b) FROM stdin;
a    b
\.
select * from foo;
select * from bar;
drop table foo;
drop table bar;
CREATE TABLE bar (
    c text,
    d text
);
CREATE TABLE foo (
    a text,
    b text
);
COPY bar (c, d) FROM stdin;
c    d
\.
COPY foo (a, b) FROM stdin;
a    b
\.
select * from foo;
select * from bar;
drop table foo;
drop table bar;
Index: src/bin/psql/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/copy.c,v
retrieving revision 1.50
diff -c -r1.50 copy.c
*** src/bin/psql/copy.c    11 Jul 2004 21:34:03 -0000    1.50
--- src/bin/psql/copy.c    13 Aug 2004 12:02:17 -0000
***************
*** 700,705 ****
--- 700,707 ----
                  c = getc(copystream);
                  if (c == '\n' || c == EOF)
                  {
+                     if (c == '\n' && s != copybuf && *(s-1) == '\r')
+                         *(s-1) = '\0';
                      linedone = true;
                      break;
                  }

Re: [Fwd: Re: [pgsql-hackers-win32] Import from Linux to Windows]

From
Andrew Dunstan
Date:

Andrew Dunstan wrote:

>
> The attached patch appears to solve the problem. However, while it
> makes us conform to the first sentence below from the docs, it doesn't
> comply with the second. Not sure what to do about that. Maybe there's
> a better solution?
>
>

Attached patch seems much better, I think.

cheers

andrew
Index: src/bin/psql/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/copy.c,v
retrieving revision 1.50
diff -c -r1.50 copy.c
*** src/bin/psql/copy.c    11 Jul 2004 21:34:03 -0000    1.50
--- src/bin/psql/copy.c    13 Aug 2004 12:41:15 -0000
***************
*** 717,723 ****
              PQputline(conn, copybuf);
              if (firstload)
              {
!                 if (!strcmp(copybuf, "\\."))
                  {
                      copydone = true;
                      break;
--- 717,724 ----
              PQputline(conn, copybuf);
              if (firstload)
              {
!                 if ((!strcmp(copybuf, "\\.")) ||
!                     (c == '\n' && !strcmp(copybuf,"\\.\r")))
                  {
                      copydone = true;
                      break;

Andrew Dunstan <andrew@dunslane.net> writes:
>> The attached patch appears to solve the problem. However, while it
>> makes us conform to the first sentence below from the docs, it doesn't
>> comply with the second. Not sure what to do about that. Maybe there's
>> a better solution?

> Attached patch seems much better, I think.

I think it is still not quite there.  Since as you noted the backend
will complain if line endings don't match, if we hit EOF then we have
to cons up a \. line with the correct ending.  (BTW, this is not
actually necessary when talking 3.0 protocol, but it is when talking
to an older server.)

I modified the patch a little more and applied the attached.  It seems
to work for me but could use more testing.

            regards, tom lane

*** src/bin/psql/copy.c.orig    Fri Aug 13 10:47:23 2004
--- src/bin/psql/copy.c    Fri Aug 13 18:51:25 2004
***************
*** 663,668 ****
--- 663,669 ----
      bool        copydone = false;
      bool        firstload;
      bool        linedone;
+     bool        saw_cr = false;
      char        copybuf[COPYBUFSIZ];
      char       *s;
      int            bufleft;
***************
*** 695,724 ****

          while (!linedone)
          {                        /* for each bufferload in line ... */
              s = copybuf;
              for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--)
              {
                  c = getc(copystream);
!                 if (c == '\n' || c == EOF)
                  {
                      linedone = true;
                      break;
                  }
                  *s++ = c;
              }
              *s = '\0';
              if (c == EOF && s == copybuf && firstload)
              {
!                 PQputline(conn, "\\.");
                  copydone = true;
                  if (pset.cur_cmd_interactive)
                      puts("\\.");
                  break;
              }
              PQputline(conn, copybuf);
              if (firstload)
              {
!                 if (!strcmp(copybuf, "\\."))
                  {
                      copydone = true;
                      break;
--- 696,744 ----

          while (!linedone)
          {                        /* for each bufferload in line ... */
+             /* Fetch string until \n, EOF, or buffer full */
              s = copybuf;
              for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--)
              {
                  c = getc(copystream);
!                 if (c == EOF)
                  {
                      linedone = true;
                      break;
                  }
                  *s++ = c;
+                 if (c == '\n')
+                 {
+                     linedone = true;
+                     break;
+                 }
+                 if (c == '\r')
+                     saw_cr = true;
              }
              *s = '\0';
+             /* EOF with empty line-so-far? */
              if (c == EOF && s == copybuf && firstload)
              {
!                 /*
!                  * We are guessing a little bit as to the right line-ending
!                  * here...
!                  */
!                 if (saw_cr)
!                     PQputline(conn, "\\.\r\n");
!                 else
!                     PQputline(conn, "\\.\n");
                  copydone = true;
                  if (pset.cur_cmd_interactive)
                      puts("\\.");
                  break;
              }
+             /* No, so pass the data to the backend */
              PQputline(conn, copybuf);
+             /* Check for line consisting only of \. */
              if (firstload)
              {
!                 if (strcmp(copybuf, "\\.\n") == 0 ||
!                     strcmp(copybuf, "\\.\r\n") == 0)
                  {
                      copydone = true;
                      break;
***************
*** 726,732 ****
                  firstload = false;
              }
          }
-         PQputline(conn, "\n");
          linecount++;
      }
      ret = !PQendcopy(conn);
--- 746,751 ----

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>
>>>The attached patch appears to solve the problem. However, while it
>>>makes us conform to the first sentence below from the docs, it doesn't
>>>comply with the second. Not sure what to do about that. Maybe there's
>>>a better solution?
>>>
>>>
>
>
>
>>Attached patch seems much better, I think.
>>
>>
>
>I think it is still not quite there.  Since as you noted the backend
>will complain if line endings don't match, if we hit EOF then we have
>to cons up a \. line with the correct ending.  (BTW, this is not
>actually necessary when talking 3.0 protocol, but it is when talking
>to an older server.)
>
>I modified the patch a little more and applied the attached.  It seems
>to work for me but could use more testing.
>
>
>
>

WorksForMe, and looks good. You're right, I had forgotten the EOF case.

Should it be backported for the upcoming stable release(s)? Bruce and I
were discussing this earlier.

Pro: it's an ugly bug and hard to diagnose - things just seem to die for
no apparent reason.
Con: there's a workaround - just make sure to run dos2unix on your file
if necessary.

cheers

andrew

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Should it be backported for the upcoming stable release(s)? Bruce and I
> were discussing this earlier.

Probably a good idea, since we do support psql on Windows even in the
older releases.

My personal opinion is to back-port only as far as 7.4, but if you feel
like doing and testing it for 7.3 then I'll apply the patch.  I need it
tomorrow (Sat) though, as I'd like to wrap these releases Sunday.

            regards, tom lane

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
"Andrew Dunstan"
Date:
Tom Lane said:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Should it be backported for the upcoming stable release(s)? Bruce and
>> I  were discussing this earlier.
>
> Probably a good idea, since we do support psql on Windows even in the
> older releases.
>
> My personal opinion is to back-port only as far as 7.4, but if you feel
> like doing and testing it for 7.3 then I'll apply the patch.  I need it
> tomorrow (Sat) though, as I'd like to wrap these releases Sunday.
>

No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround
available. Are you going to do the 7.4 patch, or do you need me to? I
normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs
suggests the patch should apply cleanly but with different line offsets.

cheers

andrew



Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround
> available. Are you going to do the 7.4 patch, or do you need me to? I
> normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs
> suggests the patch should apply cleanly but with different line offsets.

If you're sure the code in that routine hasn't changed since 7.4, then I
can just apply the patch to that branch.

            regards, tom lane

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Andrew Dunstan
Date:

Tom Lane wrote:

>"Andrew Dunstan" <andrew@dunslane.net> writes:
>
>
>>No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround
>>available. Are you going to do the 7.4 patch, or do you need me to? I
>>normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs
>>suggests the patch should apply cleanly but with different line offsets.
>>
>>
>
>If you're sure the code in that routine hasn't changed since 7.4, then I
>can just apply the patch to that branch.
>
>
>
>

It has - the prompt changes in version 1.37. But I don't think that
conflict with this patch. I'll have a look this morning.

cheers

andrew

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Andrew Dunstan
Date:

Andrew Dunstan wrote:

>
>
> Tom Lane wrote:
>
>>
>> If you're sure the code in that routine hasn't changed since 7.4, then I
>> can just apply the patch to that branch.
>>
>
> It has - the prompt changes in version 1.37. But I don't think that
> conflict with this patch. I'll have a look this morning.


As expected:

Hunk #1 succeeded at 496 (offset -167 lines).
Hunk #2 succeeded at 689 (offset -7 lines).
Hunk #3 succeeded at 579 (offset -167 lines).



Here's the diff with 7.4 line numbers - tests fine for me.

cheers

andrew
*** copy.c.orig    Mon Aug  4 19:59:39 2003
--- copy.c    Sat Aug 14 09:12:04 2004
***************
*** 496,501 ****
--- 496,502 ----
      bool        copydone = false;
      bool        firstload;
      bool        linedone;
+     bool        saw_cr = false;
      char        copybuf[COPYBUFSIZ];
      char       *s;
      int            bufleft;
***************
*** 521,550 ****

          while (!linedone)
          {                        /* for each bufferload in line ... */
              s = copybuf;
              for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--)
              {
                  c = getc(copystream);
!                 if (c == '\n' || c == EOF)
                  {
                      linedone = true;
                      break;
                  }
                  *s++ = c;
              }
              *s = '\0';
              if (c == EOF && s == copybuf && firstload)
              {
!                 PQputline(conn, "\\.");
                  copydone = true;
                  if (pset.cur_cmd_interactive)
                      puts("\\.");
                  break;
              }
              PQputline(conn, copybuf);
              if (firstload)
              {
!                 if (!strcmp(copybuf, "\\."))
                  {
                      copydone = true;
                      break;
--- 522,570 ----

          while (!linedone)
          {                        /* for each bufferload in line ... */
+             /* Fetch string until \n, EOF, or buffer full */
              s = copybuf;
              for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--)
              {
                  c = getc(copystream);
!                 if (c == EOF)
                  {
                      linedone = true;
                      break;
                  }
                  *s++ = c;
+                 if (c == '\n')
+                 {
+                     linedone = true;
+                     break;
+                 }
+                 if (c == '\r')
+                     saw_cr = true;
              }
              *s = '\0';
+             /* EOF with empty line-so-far? */
              if (c == EOF && s == copybuf && firstload)
              {
!                 /*
!                  * We are guessing a little bit as to the right line-ending
!                  * here...
!                  */
!                 if (saw_cr)
!                     PQputline(conn, "\\.\r\n");
!                 else
!                     PQputline(conn, "\\.\n");
                  copydone = true;
                  if (pset.cur_cmd_interactive)
                      puts("\\.");
                  break;
              }
+             /* No, so pass the data to the backend */
              PQputline(conn, copybuf);
+             /* Check for line consisting only of \. */
              if (firstload)
              {
!                 if (strcmp(copybuf, "\\.\n") == 0 ||
!                     strcmp(copybuf, "\\.\r\n") == 0)
                  {
                      copydone = true;
                      break;
***************
*** 552,558 ****
                  firstload = false;
              }
          }
-         PQputline(conn, "\n");
          linecount++;
      }
      ret = !PQendcopy(conn);
--- 572,577 ----

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
>
>
> Tom Lane wrote:
>
> >"Andrew Dunstan" <andrew@dunslane.net> writes:
> >
> >
> >>No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround
> >>available. Are you going to do the 7.4 patch, or do you need me to? I
> >>normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs
> >>suggests the patch should apply cleanly but with different line offsets.
> >>
> >>
> >
> >If you're sure the code in that routine hasn't changed since 7.4, then I
> >can just apply the patch to that branch.
> >
> >
> >
> >
>
> It has - the prompt changes in version 1.37. But I don't think that
> conflict with this patch. I'll have a look this morning.

One issue is that pre-8.0, psql files were opened in Win32 text mode, so
we wouldn't have seen this bug on Win32, but we would on Linux.
Because we open them on Win32 now in binary mode so we see control-Z it
will show up on Win32 too.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Andrew Dunstan
Date:

Bruce Momjian wrote:

>One issue is that pre-8.0, psql files were opened in Win32 text mode, so
>we wouldn't have seen this bug on Win32, but we would on Linux.
>Because we open them on Win32 now in binary mode so we see control-Z it
>will show up on Win32 too.
>
>
>

true, *BUT*

The patch is not platform-specific. It simply makes psql accept the same
line endings on COPY FROM that the backend will accept - in effect it
makes it line-end agnostic - this is a Good Thing (tm).

We are still months away from releasing 8.0, so I think doing this for
tomorrow's 7.4 bundle makes sense.

cheers

andrew

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
>
>
> Bruce Momjian wrote:
>
> >One issue is that pre-8.0, psql files were opened in Win32 text mode, so
> >we wouldn't have seen this bug on Win32, but we would on Linux.
> >Because we open them on Win32 now in binary mode so we see control-Z it
> >will show up on Win32 too.
> >
> >
> >
>
> true, *BUT*
>
> The patch is not platform-specific. It simply makes psql accept the same
> line endings on COPY FROM that the backend will accept - in effect it
> makes it line-end agnostic - this is a Good Thing (tm).
>
> We are still months away from releasing 8.0, so I think doing this for
> tomorrow's 7.4 bundle makes sense.

Yes, totally agree.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Here's the diff with 7.4 line numbers - tests fine for me.

Patch applied.  Thanks for double-checking it.

            regards, tom lane

Re: [HACKERS] [Fwd: Re: [pgsql-hackers-win32] Import from Linux to

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> The patch is not platform-specific. It simply makes psql accept the same
> line endings on COPY FROM that the backend will accept - in effect it
> makes it line-end agnostic - this is a Good Thing (tm).

Strictly speaking it's not there yet --- psql still doesn't cope with
line endings that are \r only (original Mac OS style).  However, since
we do not and probably never will have a port to original Mac OS, I'm
satisfied with the code as-is.

> We are still months away from releasing 8.0, so I think doing this for
> tomorrow's 7.4 bundle makes sense.

I agree (and did in fact just commit the back-patch a bit ago).

            regards, tom lane