Thread: Patch: incorrect array offset in backend replication tar header
While researching the way streaming replication works I was examining the construction of the tar file header. By comparing documentation on the tar header format from various sources I certain the following patch should be applied to so the group identifier is put into thee header properly. While I realize that wikipedia isn't always the best source of information, the header offsets seem to match the other documentation I've found. The format is just easier to read on wikipedia http://en.wikipedia.org/wiki/Tar_(file_format)#File_header Here is the trivial patch: diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 4aaa9e3..524223e 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -871,7 +871,7 @@ _tarWriteHeader(const char *filename, const char *linktarget, sprintf(&h[108], "%07o ", statbuf->st_uid); /* Group 8 */ - sprintf(&h[117], "%07o ", statbuf->st_gid); + sprintf(&h[116], "%07o ", statbuf->st_gid); /* File size 12 - 11 digits, 1 space, no NUL */ if (linktarget != NULL || S_ISDIR(statbuf->st_mode)) -- Brian -- /* insert witty comment here */
Actually I found one other issue while continuing my investigation. The insertion of the 'ustar' and version '00' has the '00' version at the wrong offset. The patch is attached. -- Brian On Mon, Sep 24, 2012 at 7:51 PM, Brian Weaver <cmdrclueless@gmail.com> wrote: > While researching the way streaming replication works I was examining > the construction of the tar file header. By comparing documentation on > the tar header format from various sources I certain the following > patch should be applied to so the group identifier is put into thee > header properly. > > While I realize that wikipedia isn't always the best source of > information, the header offsets seem to match the other documentation > I've found. The format is just easier to read on wikipedia > > http://en.wikipedia.org/wiki/Tar_(file_format)#File_header > > Here is the trivial patch: > > diff --git a/src/backend/replication/basebackup.c > b/src/backend/replication/basebackup.c > index 4aaa9e3..524223e 100644 > --- a/src/backend/replication/basebackup.c > +++ b/src/backend/replication/basebackup.c > @@ -871,7 +871,7 @@ _tarWriteHeader(const char *filename, const char > *linktarget, > sprintf(&h[108], "%07o ", statbuf->st_uid); > > /* Group 8 */ > - sprintf(&h[117], "%07o ", statbuf->st_gid); > + sprintf(&h[116], "%07o ", statbuf->st_gid); > > /* File size 12 - 11 digits, 1 space, no NUL */ > if (linktarget != NULL || S_ISDIR(statbuf->st_mode)) > > > -- Brian > > -- > > /* insert witty comment here */ -- /* insert witty comment here */
Attachment
Um.... I apologize for the third e-mail on the topic. It seems that my C coding is a bit rusty from years of neglect. No sooner had I hit the send button then I realized that trying to embed a null character in a string might not work, especially when it's followed by two consecutive zeros. Here is a safer fix which is more in line with the rest of the code in the file. I guess this is what I get for being cleaver. Patch is attached -- /* insert witty comment here */
Attachment
Brian Weaver <cmdrclueless@gmail.com> writes: > While researching the way streaming replication works I was examining > the construction of the tar file header. By comparing documentation on > the tar header format from various sources I certain the following > patch should be applied to so the group identifier is put into thee > header properly. Yeah, this is definitely wrong. > While I realize that wikipedia isn't always the best source of > information, the header offsets seem to match the other documentation > I've found. The format is just easier to read on wikipedia The authoritative specification can be found in the "pax" page in the POSIX spec, which is available here: http://pubs.opengroup.org/onlinepubs/9699919799/ I agree that the 117 number is bogus, and also that the magic "ustar" string is written incorrectly. What's more, it appears that the latter error has been copied from pg_dump (but the 117 seems to be just a new bug in pg_basebackup). I wonder what else might be wrong hereabouts :-( Will sit down and take a closer look. I believe what we need to do about this is: 1. fix pg_dump and pg_basebackup output to conform to spec. 2. make sure pg_restore will accept both conformant and previous-generation files. Am I right in believing that we don't have any code that's expected to read pg_basebackup output? We just feed it to "tar", no? I'm a bit concerned about backwards compatibility issues. It looks to me like existing versions of pg_restore will flat out reject files that have a spec-compliant "ustar\0" MAGIC field. Is it going to be sufficient if we fix this in minor-version updates, or are we going to need to have a switch that tells pg_dump to emit the incorrect old format? (Ick.) regards, tom lane
Tom, I'm still investigating and I have been looking at various sources. I have checked lots of pages on the web and I was just looking at the libarchive source from github. I found an interesting sequence in libarchive that implies that the 'ustar00\0' marks the header as GNU Tar format. Here are lines 321 through 329 of 'archive_read_support_format_tar.c' from libarchive 321 /* Recognize POSIX formats. */322 if ((memcmp(header->magic, "ustar\0", 6) == 0)323 && (memcmp(header->version,"00", 2) == 0))324 bid += 56;325326 /* Recognize GNU tar format. */327 if ((memcmp(header->magic, "ustar ", 6) == 0)328 && (memcmp(header->version, " \0", 2) == 0))329 bid += 56; I'm wondering if the original committer put the 'ustar00\0' string in by design? Regardless I'll look at it more tomorrow 'cause I'm calling it a night. I need to send a note to the libarchive folks too because I *think* I found a potential buffer overrun in one of their octal conversion routines. -- Brian On Mon, Sep 24, 2012 at 10:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brian Weaver <cmdrclueless@gmail.com> writes: >> While researching the way streaming replication works I was examining >> the construction of the tar file header. By comparing documentation on >> the tar header format from various sources I certain the following >> patch should be applied to so the group identifier is put into thee >> header properly. > > Yeah, this is definitely wrong. > >> While I realize that wikipedia isn't always the best source of >> information, the header offsets seem to match the other documentation >> I've found. The format is just easier to read on wikipedia > > The authoritative specification can be found in the "pax" page in the > POSIX spec, which is available here: > http://pubs.opengroup.org/onlinepubs/9699919799/ > > I agree that the 117 number is bogus, and also that the magic "ustar" > string is written incorrectly. What's more, it appears that the latter > error has been copied from pg_dump (but the 117 seems to be just a new > bug in pg_basebackup). I wonder what else might be wrong hereabouts :-( > Will sit down and take a closer look. > > I believe what we need to do about this is: > > 1. fix pg_dump and pg_basebackup output to conform to spec. > > 2. make sure pg_restore will accept both conformant and > previous-generation files. > > Am I right in believing that we don't have any code that's expected to > read pg_basebackup output? We just feed it to "tar", no? > > I'm a bit concerned about backwards compatibility issues. It looks to > me like existing versions of pg_restore will flat out reject files that > have a spec-compliant "ustar\0" MAGIC field. Is it going to be > sufficient if we fix this in minor-version updates, or are we going to > need to have a switch that tells pg_dump to emit the incorrect old > format? (Ick.) > > regards, tom lane -- /* insert witty comment here */
Brian Weaver <cmdrclueless@gmail.com> writes: > Here are lines 321 through 329 of 'archive_read_support_format_tar.c' > from libarchive > 321 /* Recognize POSIX formats. */ > 322 if ((memcmp(header->magic, "ustar\0", 6) == 0) > 323 && (memcmp(header->version, "00", 2) == 0)) > 324 bid += 56; > 325 > 326 /* Recognize GNU tar format. */ > 327 if ((memcmp(header->magic, "ustar ", 6) == 0) > 328 && (memcmp(header->version, " \0", 2) == 0)) > 329 bid += 56; > I'm wondering if the original committer put the 'ustar00\0' string in by design? The second part of that looks to me like it matches "ustar \0", not "ustar00\0". I think the pg_dump coding is just wrong. I've already noticed that its code for writing the checksum is pretty brain-dead too :-( Note that according to the wikipedia page, tar programs typically accept files as pre-POSIX format if the checksum is okay, regardless of what is in the magic field; and the fields that were added by POSIX are noncritical so we'd likely never notice that they were being ignored. (In fact, looking closer, pg_dump isn't even filling those fields anyway, so the fact that it's not producing a compliant magic field may be a good thing ...) regards, tom lane
Tom, I actually plan on doing a lot of work on the frontend pg_basebackup for my employer. pg_basebackup is 90% of the way to a solution that I need for doing backups of *large* databases while allowing the database to continue to work. The problem is a lack of secondary disk space to save a replication of the original database cluster. I want to modify pg_basebackup to include the WAL files in the tar output. I have several ideas but I need to code and test them. That was the main reason I was examining the backend code. If you're willing to wait a bit on me to code and test my extensions to pg_basebackup I will try to address some of the deficiencies as well add new features. I agree the checksum algorithm could definitely use some refactoring. I was already working on that before I retired last night. -- Brian On Mon, Sep 24, 2012 at 10:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brian Weaver <cmdrclueless@gmail.com> writes: >> Here are lines 321 through 329 of 'archive_read_support_format_tar.c' >> from libarchive > >> 321 /* Recognize POSIX formats. */ >> 322 if ((memcmp(header->magic, "ustar\0", 6) == 0) >> 323 && (memcmp(header->version, "00", 2) == 0)) >> 324 bid += 56; >> 325 >> 326 /* Recognize GNU tar format. */ >> 327 if ((memcmp(header->magic, "ustar ", 6) == 0) >> 328 && (memcmp(header->version, " \0", 2) == 0)) >> 329 bid += 56; > >> I'm wondering if the original committer put the 'ustar00\0' string in by design? > > The second part of that looks to me like it matches "ustar \0", > not "ustar00\0". I think the pg_dump coding is just wrong. I've > already noticed that its code for writing the checksum is pretty > brain-dead too :-( > > Note that according to the wikipedia page, tar programs typically > accept files as pre-POSIX format if the checksum is okay, regardless of > what is in the magic field; and the fields that were added by POSIX > are noncritical so we'd likely never notice that they were being > ignored. (In fact, looking closer, pg_dump isn't even filling those > fields anyway, so the fact that it's not producing a compliant magic > field may be a good thing ...) > > regards, tom lane -- /* insert witty comment here */
On 9/25/12 3:38 PM, Brian Weaver wrote: > I want > to modify pg_basebackup to include the WAL files in the tar output. Doesn't pg_basebackup -x do exactly that? Regards, Marko Tiikkaja
Brian Weaver <cmdrclueless@gmail.com> writes: > If you're willing to wait a bit on me to code and test my extensions > to pg_basebackup I will try to address some of the deficiencies as > well add new features. I think it's a mistake to try to handle these issues in the same patch as feature extensions. If you want to submit a patch for them, I'm happy to let you do the legwork, but please keep it narrowly focused on fixing file-format deficiencies. The notes I had last night after examining pg_dump were: magic number written incorrectly, but POSIX fields aren't filled anyway (which is why tar tvf doesn't show them) checksum code is brain-dead; no use in "lastSum" nor in looping per spec, there should be 1024 zeroes not 512 at end of file; this explains why tar whines about a "lone zero block" ... Not sure which of these apply to pg_basebackup. As far as the backwards compatibility issue goes, what seems like a good idea after sleeping on it is (1) fix pg_dump in HEAD to emit standard-compliant tar files; (2) fix pg_restore in HEAD and all back branches to accept both the standard and the incorrect magic field. This way, the only people with a compatibility problem would be those trying to use by-then-ancient pg_restore versions to read 9.3 or later pg_dump output. regards, tom lane
Unless I misread the code, the tar format and streaming xlog are mutually exclusive. Considering my normal state of fatigue it's not unlikely. I don't want to have to set my wal_keep_segments artificially high just for the backup On Tue, Sep 25, 2012 at 10:05 AM, Marko Tiikkaja <pgmail@joh.to> wrote: > On 9/25/12 3:38 PM, Brian Weaver wrote: >> >> I want >> to modify pg_basebackup to include the WAL files in the tar output. > > > Doesn't pg_basebackup -x do exactly that? > > > Regards, > Marko Tiikkaja > -- /* insert witty comment here */
Tom, I'm fine with submitting highly focused patches first. I was just explaining my end-goal. Still I will need time to patch, compile, and test before submitting so you're not going to see any output from me for a few days. That's all assuming my employer can leave me alone long enough to focus on a single task. I'm far too interrupt driven at work. -- Brian On Tue, Sep 25, 2012 at 10:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brian Weaver <cmdrclueless@gmail.com> writes: >> If you're willing to wait a bit on me to code and test my extensions >> to pg_basebackup I will try to address some of the deficiencies as >> well add new features. > > I think it's a mistake to try to handle these issues in the same patch > as feature extensions. If you want to submit a patch for them, I'm > happy to let you do the legwork, but please keep it narrowly focused > on fixing file-format deficiencies. > > The notes I had last night after examining pg_dump were: > > magic number written incorrectly, but POSIX fields aren't filled anyway > (which is why tar tvf doesn't show them) > > checksum code is brain-dead; no use in "lastSum" nor in looping > > per spec, there should be 1024 zeroes not 512 at end of file; > this explains why tar whines about a "lone zero block" ... > > Not sure which of these apply to pg_basebackup. > > As far as the backwards compatibility issue goes, what seems like > a good idea after sleeping on it is (1) fix pg_dump in HEAD to emit > standard-compliant tar files; (2) fix pg_restore in HEAD and all back > branches to accept both the standard and the incorrect magic field. > This way, the only people with a compatibility problem would be those > trying to use by-then-ancient pg_restore versions to read 9.3 or later > pg_dump output. > > regards, tom lane -- /* insert witty comment here */
On Tue, Sep 25, 2012 at 5:08 PM, Brian Weaver <cmdrclueless@gmail.com> wrote: > Unless I misread the code, the tar format and streaming xlog are > mutually exclusive. Considering my normal state of fatigue it's not > unlikely. I don't want to have to set my wal_keep_segments > artificially high just for the backup Correct, you can't use both of those at the same time. That can certainly be improved - but injecting a file into the tar from a different process is far from easy. But one idea might be to just stream the WAL into a *separate* tarfile in this case. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brian Weaver <cmdrclueless@gmail.com> writes: >> While researching the way streaming replication works I was examining >> the construction of the tar file header. By comparing documentation on >> the tar header format from various sources I certain the following >> patch should be applied to so the group identifier is put into thee >> header properly. > > Yeah, this is definitely wrong. > >> While I realize that wikipedia isn't always the best source of >> information, the header offsets seem to match the other documentation >> I've found. The format is just easier to read on wikipedia > > The authoritative specification can be found in the "pax" page in the > POSIX spec, which is available here: > http://pubs.opengroup.org/onlinepubs/9699919799/ > > I agree that the 117 number is bogus, and also that the magic "ustar" > string is written incorrectly. What's more, it appears that the latter > error has been copied from pg_dump (but the 117 seems to be just a new > bug in pg_basebackup). I wonder what else might be wrong hereabouts :-( > Will sit down and take a closer look. > > I believe what we need to do about this is: > > 1. fix pg_dump and pg_basebackup output to conform to spec. > > 2. make sure pg_restore will accept both conformant and > previous-generation files. > > Am I right in believing that we don't have any code that's expected to > read pg_basebackup output? We just feed it to "tar", no? Yes. We generate a .tarfile, but we have no tool that reads it ourselves. (unlike pg_dump where we have pg_restore that actually reads it) > I'm a bit concerned about backwards compatibility issues. It looks to > me like existing versions of pg_restore will flat out reject files that > have a spec-compliant "ustar\0" MAGIC field. Is it going to be > sufficient if we fix this in minor-version updates, or are we going to > need to have a switch that tells pg_dump to emit the incorrect old > format? (Ick.) Do we officially support using an older pg_restore to reload a newer dump? I think not? As long as we don't officially support that, I think we'll be ok. -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus, I probably just did a poor job of explaining what I wanted to try. I was going to have the base backup open two connections; one to stream the tar archive, the second to receive the wal files like pg_receivexlog. The wal files received on the second connection would be streamed to a temporary file, with tar headers. Then when the complete tar archive from the first header was complete received simply replay the contents from the temporary file to append them to the tar archive. Turns out that isn't necessary..... It was an idea borne from my misunderstanding of how the pg_basebackup worked. The archive will include all the wal files if I make wal_keep_segments high enough. -- Brian On Thu, Sep 27, 2012 at 6:01 PM, Magnus Hagander <magnus@hagander.net> wrote: > On Tue, Sep 25, 2012 at 5:08 PM, Brian Weaver <cmdrclueless@gmail.com> wrote: >> Unless I misread the code, the tar format and streaming xlog are >> mutually exclusive. Considering my normal state of fatigue it's not >> unlikely. I don't want to have to set my wal_keep_segments >> artificially high just for the backup > > Correct, you can't use both of those at the same time. That can > certainly be improved - but injecting a file into the tar from a > different process is far from easy. But one idea might be to just > stream the WAL into a *separate* tarfile in this case. > > > -- > Magnus Hagander > Me: http://www.hagander.net/ > Work: http://www.redpill-linpro.com/ -- /* insert witty comment here */
OK, here is my attempt at patching and correcting the issue in this thread. I have done my best to test to ensure that hot standby, pg_basebackup, and pg_restore of older files work without issues. I think this might be a larger patch that expected, I took some liberties of trying to clean up a bit. For example the block size '512' was scattered throughout the code regarding the tar block size. I've replace instances of that with a defined constant TAR_BLOCK_SIZE. I've likewise created other constants and used them in place of raw numbers in what I hope makes the code a bit more readable. I've also used functions like strncpy(), strnlen(), and the like in place of sprintf() where I could. Also instead of using sscanf() I used a custom octal conversion routine which has a hard limit on how many character it will process like strncpy() & strnlen(). I expect comments, hopefully they'll be positive. -- Brian -- /* insert witty comment here */
Attachment
On Fri, Sep 28, 2012 at 12:12 AM, Brian Weaver <cmdrclueless@gmail.com> wrote: > Magnus, > > I probably just did a poor job of explaining what I wanted to try. I > was going to have the base backup open two connections; one to stream > the tar archive, the second to receive the wal files like > pg_receivexlog. This is what --xlog=stream does. > The wal files received on the second connection would be streamed to a > temporary file, with tar headers. Then when the complete tar archive > from the first header was complete received simply replay the contents > from the temporary file to append them to the tar archive. Ah, yeah, that should also work I guess. But you could also just leave the the data in the temporary tarfile the whole time. IIRC, you can just cat one tarfile to the end of another one, right? -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm a bit concerned about backwards compatibility issues. It looks to >> me like existing versions of pg_restore will flat out reject files that >> have a spec-compliant "ustar\0" MAGIC field. Is it going to be >> sufficient if we fix this in minor-version updates, or are we going to >> need to have a switch that tells pg_dump to emit the incorrect old >> format? (Ick.) > Do we officially support using an older pg_restore to reload a newer > dump? I think not? As long as we don't officially support that, I > think we'll be ok. Well, for the -Fc format, we have an explicit version number, and pg_restore is supposed to be able to read anything with current or prior version number. We don't bump the version number too often, but we've definitely done it anytime we added new features at the file-format level. However, since the whole point of the -Ft format is to be standard-compliant, people might be surprised if it fell over in a backwards-compatibility situation. Having said all that, I don't think we have a lot of choices here. A "tar format" output option that isn't actually tar format has hardly any excuse to live at all. regards, tom lane
On Fri, Sep 28, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm a bit concerned about backwards compatibility issues. It looks to >>> me like existing versions of pg_restore will flat out reject files that >>> have a spec-compliant "ustar\0" MAGIC field. Is it going to be >>> sufficient if we fix this in minor-version updates, or are we going to >>> need to have a switch that tells pg_dump to emit the incorrect old >>> format? (Ick.) > >> Do we officially support using an older pg_restore to reload a newer >> dump? I think not? As long as we don't officially support that, I >> think we'll be ok. > > Well, for the -Fc format, we have an explicit version number, and > pg_restore is supposed to be able to read anything with current or prior > version number. We don't bump the version number too often, but we've > definitely done it anytime we added new features at the file-format > level. However, since the whole point of the -Ft format is to be > standard-compliant, people might be surprised if it fell over in a > backwards-compatibility situation. > > Having said all that, I don't think we have a lot of choices here. > A "tar format" output option that isn't actually tar format has hardly > any excuse to live at all. Yeah, that's what I'm thinking - it's really a bugfix... -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Brian Weaver <cmdrclueless@gmail.com> writes: > OK, here is my attempt at patching and correcting the issue in this > thread. I have done my best to test to ensure that hot standby, > pg_basebackup, and pg_restore of older files work without issues. I > think this might be a larger patch that expected, I took some > liberties of trying to clean up a bit. > For example the block size '512' was scattered throughout the code > regarding the tar block size. I've replace instances of that with a > defined constant TAR_BLOCK_SIZE. I've likewise created other constants > and used them in place of raw numbers in what I hope makes the code a > bit more readable. That seems possibly reasonable ... > I've also used functions like strncpy(), strnlen(), and the like in > place of sprintf() where I could. Also instead of using sscanf() I > used a custom octal conversion routine which has a hard limit on how > many character it will process like strncpy() & strnlen(). ... but I doubt that this really constitutes a readability improvement. Or a portability improvement. strnlen for instance is not to be found in Single Unix Spec v2 (http://pubs.opengroup.org/onlinepubs/007908799/) which is what we usually take as our baseline assumption about which system functions are available everywhere. By and large, I think the more different system functions you rely on, the harder it is to read your code, even if some unusual system function happens to exactly match your needs in particular places. It also greatly increases the risk of having portability problems, eg on Windows, or non-mainstream Unix platforms. But a larger point is that the immediate need is to fix bugs. Code beautification is a separate activity and would be better submitted as a separate patch. There is no way I'd consider applying most of this patch to the back branches, for instance. regards, tom lane
Magnus Hagander <magnus@hagander.net> writes: > Ah, yeah, that should also work I guess. But you could also just leave > the the data in the temporary tarfile the whole time. IIRC, you can > just cat one tarfile to the end of another one, right? Not if they're written according to spec, I think. There is an EOF marker consisting of 2 blocks of zeroes (which pg_dump is failing to create correctly, but that's a bug not a feature). Possibly you could strip the EOF marker though, if the file join code is allowed to be something smarter than "cat". regards, tom lane
On 09/27/2012 06:30 PM, Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: >> On Tue, Sep 25, 2012 at 4:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I'm a bit concerned about backwards compatibility issues. It looks to >>> me like existing versions of pg_restore will flat out reject files that >>> have a spec-compliant "ustar\0" MAGIC field. Is it going to be >>> sufficient if we fix this in minor-version updates, or are we going to >>> need to have a switch that tells pg_dump to emit the incorrect old >>> format? (Ick.) >> Do we officially support using an older pg_restore to reload a newer >> dump? I think not? As long as we don't officially support that, I >> think we'll be ok. > Well, for the -Fc format, we have an explicit version number, and > pg_restore is supposed to be able to read anything with current or prior > version number. We don't bump the version number too often, but we've > definitely done it anytime we added new features at the file-format > level. However, since the whole point of the -Ft format is to be > standard-compliant, people might be surprised if it fell over in a > backwards-compatibility situation. > > Having said all that, I don't think we have a lot of choices here. > A "tar format" output option that isn't actually tar format has hardly > any excuse to live at all. I agree, but it's possibly worth pointing out that GNU tar has no trouble at all processing the erroneous format, and the "file" program on my Linux system has no trouble recognizing it as a tar archive. Nevertheless, I think we should fix all live versions of pg_dump make all live versions of pg-restore accept both formats. cheers andrew
On Fri, Sep 28, 2012 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Magnus Hagander <magnus@hagander.net> writes: >> Ah, yeah, that should also work I guess. But you could also just leave >> the the data in the temporary tarfile the whole time. IIRC, you can >> just cat one tarfile to the end of another one, right? > > Not if they're written according to spec, I think. There is an EOF > marker consisting of 2 blocks of zeroes (which pg_dump is failing to > create correctly, but that's a bug not a feature). Possibly you could > strip the EOF marker though, if the file join code is allowed to be > something smarter than "cat". Hmm. Yeah. It seems gnu tar has "--concatenate".... Not sure if it's in the standard or a GNU extension though. But it says: " However, tar archives incorporate an end-of-file marker which must be removed if the concatenated archives are to be read properly as one archive. `--concatenate' removes the end-of-archive marker from the target archive before each new archive is appended. If you use cat to combine the archives, the result will not be a valid tar format archive. If you need to retrieve files from an archive that was added to using the cat utility, use the --ignore-zeros (-i) option. " -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Andrew Dunstan <andrew@dunslane.net> writes: > On 09/27/2012 06:30 PM, Tom Lane wrote: >> Having said all that, I don't think we have a lot of choices here. >> A "tar format" output option that isn't actually tar format has hardly >> any excuse to live at all. > I agree, but it's possibly worth pointing out that GNU tar has no > trouble at all processing the erroneous format, and the "file" program > on my Linux system has no trouble recognizing it as a tar archive. Well, they're falling back to assuming that the file is a pre-POSIX tarfile, which is why you don't see string user/group names for instance. > Nevertheless, I think we should fix all live versions of pg_dump make > all live versions of pg-restore accept both formats. I think it's clear that we should make all versions of pg_restore accept either spelling of the magic string. It's less clear that we should change the output of pg_dump in back branches though. I think the only reason we'd not get complaints about that is that not that many people are relying on tar-format output anyway. Anybody who is would probably be peeved if version 8.3.21 pg_restore couldn't read the output of version 8.3.22 pg_dump. regards, tom lane
On Fri, Sep 28, 2012 at 12:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 09/27/2012 06:30 PM, Tom Lane wrote: >>> Having said all that, I don't think we have a lot of choices here. >>> A "tar format" output option that isn't actually tar format has hardly >>> any excuse to live at all. > >> I agree, but it's possibly worth pointing out that GNU tar has no >> trouble at all processing the erroneous format, and the "file" program >> on my Linux system has no trouble recognizing it as a tar archive. > > Well, they're falling back to assuming that the file is a pre-POSIX > tarfile, which is why you don't see string user/group names for > instance. > >> Nevertheless, I think we should fix all live versions of pg_dump make >> all live versions of pg-restore accept both formats. > > I think it's clear that we should make all versions of pg_restore accept > either spelling of the magic string. It's less clear that we should > change the output of pg_dump in back branches though. I think the only > reason we'd not get complaints about that is that not that many people > are relying on tar-format output anyway. Anybody who is would probably > be peeved if version 8.3.21 pg_restore couldn't read the output of > version 8.3.22 pg_dump. There's no real point to using the tar format in pg_dump, really, is there? Which is why I think most people just don't use it. pg_basebackup in tar format is a much more useful thing, of course... So we could fix just pg_basebackup in backbranches (since we never read anything, it shouldn't be that big a problem), and then do pg_dump in HEAD only? -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
Magnus Hagander <magnus@hagander.net> writes: > On Fri, Sep 28, 2012 at 12:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think it's clear that we should make all versions of pg_restore accept >> either spelling of the magic string. It's less clear that we should >> change the output of pg_dump in back branches though. I think the only >> reason we'd not get complaints about that is that not that many people >> are relying on tar-format output anyway. Anybody who is would probably >> be peeved if version 8.3.21 pg_restore couldn't read the output of >> version 8.3.22 pg_dump. > There's no real point to using the tar format in pg_dump, really, is > there? Which is why I think most people just don't use it. I think folks who know the tool realize that custom format is better. But we've seen plenty of indications that novices sometimes choose tar format. For instance, people occasionally complain about its 8GB-per-file limit. > pg_basebackup in tar format is a much more useful thing, of course... Right. > So we could fix just pg_basebackup in backbranches (since we never > read anything, it shouldn't be that big a problem), and then do > pg_dump in HEAD only? Yeah, pg_basebackup is an entirely different tradeoff. I think we want to fix its problems in all branches. regards, tom lane
On Thu, Sep 27, 2012 at 6:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brian Weaver <cmdrclueless@gmail.com> writes: >> OK, here is my attempt at patching and correcting the issue in this >> thread. I have done my best to test to ensure that hot standby, >> pg_basebackup, and pg_restore of older files work without issues. I >> think this might be a larger patch that expected, I took some >> liberties of trying to clean up a bit. > >> For example the block size '512' was scattered throughout the code >> regarding the tar block size. I've replace instances of that with a >> defined constant TAR_BLOCK_SIZE. I've likewise created other constants >> and used them in place of raw numbers in what I hope makes the code a >> bit more readable. > > That seems possibly reasonable ... > >> I've also used functions like strncpy(), strnlen(), and the like in >> place of sprintf() where I could. Also instead of using sscanf() I >> used a custom octal conversion routine which has a hard limit on how >> many character it will process like strncpy() & strnlen(). > > ... but I doubt that this really constitutes a readability improvement. > Or a portability improvement. strnlen for instance is not to be found > in Single Unix Spec v2 (http://pubs.opengroup.org/onlinepubs/007908799/) > which is what we usually take as our baseline assumption about which > system functions are available everywhere. By and large, I think the > more different system functions you rely on, the harder it is to read > your code, even if some unusual system function happens to exactly match > your needs in particular places. It also greatly increases the risk > of having portability problems, eg on Windows, or non-mainstream Unix > platforms. > > But a larger point is that the immediate need is to fix bugs. Code > beautification is a separate activity and would be better submitted as > a separate patch. There is no way I'd consider applying most of this > patch to the back branches, for instance. > > regards, tom lane Here's a very minimal fix then, perhaps it will be more palatable. Even though I regret the effort I put into the first patch it's in my employer's best interest that it's fixed. I'm obliged to try to remediate the problem in something more acceptable to the community. enjoy -- /* insert witty comment here */
Attachment
Brian Weaver <cmdrclueless@gmail.com> writes: > Here's a very minimal fix then, perhaps it will be more palatable. I did some further work on this to improve comments and clean up the pg_dump end of things, and have committed it. > Even though I regret the effort I put into the first patch it's in my > employer's best interest that it's fixed. I'm obliged to try to > remediate the problem in something more acceptable to the community. You're welcome to submit further cleanup as a follow-on patch --- I just want to keep that separate from back-patchable bug fixing. regards, tom lane