Thread: [Fwd: Re: [pgsql-hackers-win32] Import from Linux to Windows]
[redirecting] I have abstracted this problem, and we definitely have a newline bug that has to be fixed, IMNSHO. Attached are 2 scripts that are identical except that one has DOS-style line endings and one has Unix style line endings. The DOS-style just fails miserably with no warning. It's not related to Windows file-end munging - the problem is observable on Linux - psql doesn't correctly detect the end of copy input with "\." if it's followed by CRNL. The attached patch appears to solve the problem. However, while it makes us conform to the first sentence below from the docs, it doesn't comply with the second. Not sure what to do about that. Maybe there's a better solution? "COPY FROM can handle lines ending with newlines, carriage returns, or carriage return/newlines. To reduce the risk of error due to un-backslashed newlines or carriage returns that were meant as data, COPY FROM will complain if the line endings in the input are not all alike." cheers andrew Tom Haddon wrote: >Here's what I can share, as there's some sensitive stuff in others. > >This is just a dump from one of the databases. I was able to load it >using the \i switch, but it has only populated data from the first >table. Also, it kept prompting me to hit a key as the screen was >scrolling. This doesn't seem normal to me. And I tried running it as a >SQL statement from within PgAdmin3, and got: > >"ERROR: syntax error at or near "1" at character 15030" > >The first "1" you see is where this error is. > >COPY backup_data (id, lu, ub, tape, contents, last_writter, >times_written) FROM stdin; >1 2004-07-13 09:37:14.78254 thaddon DLT000005 Full >Backup (including only 2004 filings) 6/9/04\r\n\r\nNow archived >2004-06-09 6 >4 2004-07-13 09:40:57.876355 thaddon DLT000009 Full >Backup 7/12/04 2004-07-12 4 >3 2004-07-21 16:28:09.843069 thaddon DLT000007 Full >Backup 7/21/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR \r\nD:\\backup\\*.* >/SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE >\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR \r\nYODA\\qa\\*.* >/SUBDIR \r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n 2004-07-21 >14 >2 2004-07-22 13:27:01.657706 thaddon DLT000006 Full >Backup 7/22/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR \r\nD:\\backup\\*.* >/SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE >\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR \r\nYODA\\qa\\*.* >/SUBDIR \r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n 2004-07-22 >10 >5 2004-07-27 10:49:22.786997 thaddon DLT000008 >Incremental backup since last full 7/21/04 last updated >7/23/04\r\n\r\nWill include (as of 7/23/04):\r\n\r\nC:\\Perforce\\*.* >/SUBDIR \r\nD:\\backup\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\insight_backup\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE >\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR \r\nYODA\\qa\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR >\r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n 2004-07-23 3 >6 2004-07-27 10:50:11.802647 thaddon DLT000010 >Differential backup since 7/21/04\r\n\r\nC:\\Perforce\\*.* /SUBDIR >\r\nD:\\backup\\*.* /SUBDIR \r\nDMTNJ1-SERVER\\insight_backup\\*.* >/SUBDIR \r\nDMTNJ1-SERVER\\equilar_ca\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\*.* /SUBDIR >\r\nDMTNJ1-SERVER\\dmt_media\\public\\software\\*.* /SUBDIR /EXCLUDE >\r\nHOTNSOUR\\sql_backup\\*.* /SUBDIR \r\nYODA\\qa\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\06\\*.* /SUBDIR >\r\nYODA\\sec_filings\\filings\\2004\\07\\*.* /SUBDIR >\r\nMILLENIUM\\ExchangeBackup\\*.* /SUBDIR \r\n 2004-07-27 1 >\. > >Thanks, Tom > >-----Original Message----- >From: Andrew Dunstan [mailto:andrew@dunslane.net] >Sent: Thursday, August 12, 2004 8:40 AM >To: Tom Haddon >Cc: pgsql-hackers-win32@postgresql.org >Subject: Re: [pgsql-hackers-win32] Import from Linux to Windows > > > > >Tom Haddon wrote: > > > >>I don't think so. Did a search for it in vi and nothing. It doesn't >>give me an error, just exits. Last statement is "CREATE TABLE". >> >>Sorry, not very helpful... >> >> >> >> >> >> >> > >I think you'll need to let us look at the dump file to make any progress > >- otherwise we are just guessing in the dark. > >thanks > >andrew > > ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster CREATE TABLE bar ( c text, d text ); CREATE TABLE foo ( a text, b text ); COPY bar (c, d) FROM stdin; c d \. COPY foo (a, b) FROM stdin; a b \. select * from foo; select * from bar; drop table foo; drop table bar; CREATE TABLE bar ( c text, d text ); CREATE TABLE foo ( a text, b text ); COPY bar (c, d) FROM stdin; c d \. COPY foo (a, b) FROM stdin; a b \. select * from foo; select * from bar; drop table foo; drop table bar; Index: src/bin/psql/copy.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/copy.c,v retrieving revision 1.50 diff -c -r1.50 copy.c *** src/bin/psql/copy.c 11 Jul 2004 21:34:03 -0000 1.50 --- src/bin/psql/copy.c 13 Aug 2004 12:02:17 -0000 *************** *** 700,705 **** --- 700,707 ---- c = getc(copystream); if (c == '\n' || c == EOF) { + if (c == '\n' && s != copybuf && *(s-1) == '\r') + *(s-1) = '\0'; linedone = true; break; }
Andrew Dunstan wrote: > > The attached patch appears to solve the problem. However, while it > makes us conform to the first sentence below from the docs, it doesn't > comply with the second. Not sure what to do about that. Maybe there's > a better solution? > > Attached patch seems much better, I think. cheers andrew Index: src/bin/psql/copy.c =================================================================== RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/copy.c,v retrieving revision 1.50 diff -c -r1.50 copy.c *** src/bin/psql/copy.c 11 Jul 2004 21:34:03 -0000 1.50 --- src/bin/psql/copy.c 13 Aug 2004 12:41:15 -0000 *************** *** 717,723 **** PQputline(conn, copybuf); if (firstload) { ! if (!strcmp(copybuf, "\\.")) { copydone = true; break; --- 717,724 ---- PQputline(conn, copybuf); if (firstload) { ! if ((!strcmp(copybuf, "\\.")) || ! (c == '\n' && !strcmp(copybuf,"\\.\r"))) { copydone = true; break;
Andrew Dunstan <andrew@dunslane.net> writes: >> The attached patch appears to solve the problem. However, while it >> makes us conform to the first sentence below from the docs, it doesn't >> comply with the second. Not sure what to do about that. Maybe there's >> a better solution? > Attached patch seems much better, I think. I think it is still not quite there. Since as you noted the backend will complain if line endings don't match, if we hit EOF then we have to cons up a \. line with the correct ending. (BTW, this is not actually necessary when talking 3.0 protocol, but it is when talking to an older server.) I modified the patch a little more and applied the attached. It seems to work for me but could use more testing. regards, tom lane *** src/bin/psql/copy.c.orig Fri Aug 13 10:47:23 2004 --- src/bin/psql/copy.c Fri Aug 13 18:51:25 2004 *************** *** 663,668 **** --- 663,669 ---- bool copydone = false; bool firstload; bool linedone; + bool saw_cr = false; char copybuf[COPYBUFSIZ]; char *s; int bufleft; *************** *** 695,724 **** while (!linedone) { /* for each bufferload in line ... */ s = copybuf; for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--) { c = getc(copystream); ! if (c == '\n' || c == EOF) { linedone = true; break; } *s++ = c; } *s = '\0'; if (c == EOF && s == copybuf && firstload) { ! PQputline(conn, "\\."); copydone = true; if (pset.cur_cmd_interactive) puts("\\."); break; } PQputline(conn, copybuf); if (firstload) { ! if (!strcmp(copybuf, "\\.")) { copydone = true; break; --- 696,744 ---- while (!linedone) { /* for each bufferload in line ... */ + /* Fetch string until \n, EOF, or buffer full */ s = copybuf; for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--) { c = getc(copystream); ! if (c == EOF) { linedone = true; break; } *s++ = c; + if (c == '\n') + { + linedone = true; + break; + } + if (c == '\r') + saw_cr = true; } *s = '\0'; + /* EOF with empty line-so-far? */ if (c == EOF && s == copybuf && firstload) { ! /* ! * We are guessing a little bit as to the right line-ending ! * here... ! */ ! if (saw_cr) ! PQputline(conn, "\\.\r\n"); ! else ! PQputline(conn, "\\.\n"); copydone = true; if (pset.cur_cmd_interactive) puts("\\."); break; } + /* No, so pass the data to the backend */ PQputline(conn, copybuf); + /* Check for line consisting only of \. */ if (firstload) { ! if (strcmp(copybuf, "\\.\n") == 0 || ! strcmp(copybuf, "\\.\r\n") == 0) { copydone = true; break; *************** *** 726,732 **** firstload = false; } } - PQputline(conn, "\n"); linecount++; } ret = !PQendcopy(conn); --- 746,751 ----
Tom Lane wrote: >Andrew Dunstan <andrew@dunslane.net> writes: > > >>>The attached patch appears to solve the problem. However, while it >>>makes us conform to the first sentence below from the docs, it doesn't >>>comply with the second. Not sure what to do about that. Maybe there's >>>a better solution? >>> >>> > > > >>Attached patch seems much better, I think. >> >> > >I think it is still not quite there. Since as you noted the backend >will complain if line endings don't match, if we hit EOF then we have >to cons up a \. line with the correct ending. (BTW, this is not >actually necessary when talking 3.0 protocol, but it is when talking >to an older server.) > >I modified the patch a little more and applied the attached. It seems >to work for me but could use more testing. > > > > WorksForMe, and looks good. You're right, I had forgotten the EOF case. Should it be backported for the upcoming stable release(s)? Bruce and I were discussing this earlier. Pro: it's an ugly bug and hard to diagnose - things just seem to die for no apparent reason. Con: there's a workaround - just make sure to run dos2unix on your file if necessary. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Should it be backported for the upcoming stable release(s)? Bruce and I > were discussing this earlier. Probably a good idea, since we do support psql on Windows even in the older releases. My personal opinion is to back-port only as far as 7.4, but if you feel like doing and testing it for 7.3 then I'll apply the patch. I need it tomorrow (Sat) though, as I'd like to wrap these releases Sunday. regards, tom lane
Tom Lane said: > Andrew Dunstan <andrew@dunslane.net> writes: >> Should it be backported for the upcoming stable release(s)? Bruce and >> I were discussing this earlier. > > Probably a good idea, since we do support psql on Windows even in the > older releases. > > My personal opinion is to back-port only as far as 7.4, but if you feel > like doing and testing it for 7.3 then I'll apply the patch. I need it > tomorrow (Sat) though, as I'd like to wrap these releases Sunday. > No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround available. Are you going to do the 7.4 patch, or do you need me to? I normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs suggests the patch should apply cleanly but with different line offsets. cheers andrew
"Andrew Dunstan" <andrew@dunslane.net> writes: > No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround > available. Are you going to do the 7.4 patch, or do you need me to? I > normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs > suggests the patch should apply cleanly but with different line offsets. If you're sure the code in that routine hasn't changed since 7.4, then I can just apply the patch to that branch. regards, tom lane
Tom Lane wrote: >"Andrew Dunstan" <andrew@dunslane.net> writes: > > >>No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround >>available. Are you going to do the 7.4 patch, or do you need me to? I >>normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs >>suggests the patch should apply cleanly but with different line offsets. >> >> > >If you're sure the code in that routine hasn't changed since 7.4, then I >can just apply the patch to that branch. > > > > It has - the prompt changes in version 1.37. But I don't think that conflict with this patch. I'll have a look this morning. cheers andrew
Andrew Dunstan wrote: > > > Tom Lane wrote: > >> >> If you're sure the code in that routine hasn't changed since 7.4, then I >> can just apply the patch to that branch. >> > > It has - the prompt changes in version 1.37. But I don't think that > conflict with this patch. I'll have a look this morning. As expected: Hunk #1 succeeded at 496 (offset -167 lines). Hunk #2 succeeded at 689 (offset -7 lines). Hunk #3 succeeded at 579 (offset -167 lines). Here's the diff with 7.4 line numbers - tests fine for me. cheers andrew *** copy.c.orig Mon Aug 4 19:59:39 2003 --- copy.c Sat Aug 14 09:12:04 2004 *************** *** 496,501 **** --- 496,502 ---- bool copydone = false; bool firstload; bool linedone; + bool saw_cr = false; char copybuf[COPYBUFSIZ]; char *s; int bufleft; *************** *** 521,550 **** while (!linedone) { /* for each bufferload in line ... */ s = copybuf; for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--) { c = getc(copystream); ! if (c == '\n' || c == EOF) { linedone = true; break; } *s++ = c; } *s = '\0'; if (c == EOF && s == copybuf && firstload) { ! PQputline(conn, "\\."); copydone = true; if (pset.cur_cmd_interactive) puts("\\."); break; } PQputline(conn, copybuf); if (firstload) { ! if (!strcmp(copybuf, "\\.")) { copydone = true; break; --- 522,570 ---- while (!linedone) { /* for each bufferload in line ... */ + /* Fetch string until \n, EOF, or buffer full */ s = copybuf; for (bufleft = COPYBUFSIZ - 1; bufleft > 0; bufleft--) { c = getc(copystream); ! if (c == EOF) { linedone = true; break; } *s++ = c; + if (c == '\n') + { + linedone = true; + break; + } + if (c == '\r') + saw_cr = true; } *s = '\0'; + /* EOF with empty line-so-far? */ if (c == EOF && s == copybuf && firstload) { ! /* ! * We are guessing a little bit as to the right line-ending ! * here... ! */ ! if (saw_cr) ! PQputline(conn, "\\.\r\n"); ! else ! PQputline(conn, "\\.\n"); copydone = true; if (pset.cur_cmd_interactive) puts("\\."); break; } + /* No, so pass the data to the backend */ PQputline(conn, copybuf); + /* Check for line consisting only of \. */ if (firstload) { ! if (strcmp(copybuf, "\\.\n") == 0 || ! strcmp(copybuf, "\\.\r\n") == 0) { copydone = true; break; *************** *** 552,558 **** firstload = false; } } - PQputline(conn, "\n"); linecount++; } ret = !PQendcopy(conn); --- 572,577 ----
Andrew Dunstan wrote: > > > Tom Lane wrote: > > >"Andrew Dunstan" <andrew@dunslane.net> writes: > > > > > >>No, I think 7.4 should do. 7.3 users will still have the dos2unix workaround > >>available. Are you going to do the 7.4 patch, or do you need me to? I > >>normally only keep a HEAD tree checked out. A quick look at the cvsweb diffs > >>suggests the patch should apply cleanly but with different line offsets. > >> > >> > > > >If you're sure the code in that routine hasn't changed since 7.4, then I > >can just apply the patch to that branch. > > > > > > > > > > It has - the prompt changes in version 1.37. But I don't think that > conflict with this patch. I'll have a look this morning. One issue is that pre-8.0, psql files were opened in Win32 text mode, so we wouldn't have seen this bug on Win32, but we would on Linux. Because we open them on Win32 now in binary mode so we see control-Z it will show up on Win32 too. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote: >One issue is that pre-8.0, psql files were opened in Win32 text mode, so >we wouldn't have seen this bug on Win32, but we would on Linux. >Because we open them on Win32 now in binary mode so we see control-Z it >will show up on Win32 too. > > > true, *BUT* The patch is not platform-specific. It simply makes psql accept the same line endings on COPY FROM that the backend will accept - in effect it makes it line-end agnostic - this is a Good Thing (tm). We are still months away from releasing 8.0, so I think doing this for tomorrow's 7.4 bundle makes sense. cheers andrew
Andrew Dunstan wrote: > > > Bruce Momjian wrote: > > >One issue is that pre-8.0, psql files were opened in Win32 text mode, so > >we wouldn't have seen this bug on Win32, but we would on Linux. > >Because we open them on Win32 now in binary mode so we see control-Z it > >will show up on Win32 too. > > > > > > > > true, *BUT* > > The patch is not platform-specific. It simply makes psql accept the same > line endings on COPY FROM that the backend will accept - in effect it > makes it line-end agnostic - this is a Good Thing (tm). > > We are still months away from releasing 8.0, so I think doing this for > tomorrow's 7.4 bundle makes sense. Yes, totally agree. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Andrew Dunstan <andrew@dunslane.net> writes: > Here's the diff with 7.4 line numbers - tests fine for me. Patch applied. Thanks for double-checking it. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > The patch is not platform-specific. It simply makes psql accept the same > line endings on COPY FROM that the backend will accept - in effect it > makes it line-end agnostic - this is a Good Thing (tm). Strictly speaking it's not there yet --- psql still doesn't cope with line endings that are \r only (original Mac OS style). However, since we do not and probably never will have a port to original Mac OS, I'm satisfied with the code as-is. > We are still months away from releasing 8.0, so I think doing this for > tomorrow's 7.4 bundle makes sense. I agree (and did in fact just commit the back-patch a bit ago). regards, tom lane