Thread: pg_dump not in very good shape
I have repaired the most recently introduced coredump in pg_dump, but it still crashes on the regression test database. Issue 1: The "most recently introduced coredump" came from the change to oidvector/int2vector to suppress trailing zeroes in the output routine. pg_dump was assuming that it would see exactly the right number of zeroes, and wasn't bothering to initialize any leftover array locations --- but it would happily try to dereference those locations later on. Ugh. Although cleaning up pg_dump's code is clearly good practice, maybe this should raise a flag about whether suppressing the zeroes is a good idea. Are there any client applications that will break because of this change? I'm not sure... Issue 2: The reason it's still broken is that the pqexpbuffer.c code I added to libpq doesn't support adding more than 1K characters to an "expansible string" in any one appendPQExpBuffer() call. pg_dump tries to use that routine to format function definitions, which can easily be over 1K. (Very likely there are other places in pg_dump that have similar problems, but this is the one you hit first when trying to pg_dump the regression DB.) That 1K limitation was OK when the module was just used internally in libpq, but if we're going to allow pg_dump to use it, we probably ought to relax the limitation. The equivalent backend code already has solved this problem, but it solved it by using vsnprintf() which isn't available everywhere. We have a vsnprintf() emulation in backend/port, so in theory we could link that routine into libpq if we are on a platform that hasn't got vsnprintf. The thing that bothers me about that is that if libpq exports a vsnprintf routine that's different from the system version, we could find ourselves changing the behavior of applications that thought they were calling the local system's vsnprintf. (The backend/port module would get linked if either snprintf() or vsnprintf() is missing --- there are machines that have only one --- and we'd effectively replace the system definition of the one that the local system did have.) That's not good. However, the alternative of hacking pg_dump so it doesn't try to format more than 1K at a time is mighty unattractive as well. I am inclined to go ahead and insert vsnprintf into libpq. The risk of problems seems pretty small (and it's zero on any machine with a reasonably recent libc, since then vsnprintf will be in libc and we won't link our version). The risk of missing a buffer-overrun condition in pg_dump, and shipping a pg_dump that will fail on someone's database, seems worse. Comments? Better ideas? regards, tom lane
> I have repaired the most recently introduced coredump in pg_dump, > but it still crashes on the regression test database. > > Issue 1: > > The "most recently introduced coredump" came from the change to > oidvector/int2vector to suppress trailing zeroes in the output > routine. pg_dump was assuming that it would see exactly the > right number of zeroes, and wasn't bothering to initialize any > leftover array locations --- but it would happily try to dereference > those locations later on. Ugh. > > Although cleaning up pg_dump's code is clearly good practice, maybe > this should raise a flag about whether suppressing the zeroes is > a good idea. Are there any client applications that will break > because of this change? I'm not sure... I think we are OK. There are very few places the vectors are used. They really weren't used even as part of initdb except to define the types. Makes sense pg_dump uses it, I guess, but I can't imagine other apps using it. With a definable length, I think we have to supress the zero padding. > I am inclined to go ahead and insert vsnprintf into libpq. > The risk of problems seems pretty small (and it's zero on any > machine with a reasonably recent libc, since then vsnprintf > will be in libc and we won't link our version). The risk of > missing a buffer-overrun condition in pg_dump, and shipping > a pg_dump that will fail on someone's database, seems worse. You bring up an interesting point. I say just link it in and see what happens. No real good way to know how much memory sprintf needs. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
On 2000-01-15, Tom Lane mentioned: > I have repaired the most recently introduced coredump in pg_dump, > but it still crashes on the regression test database. Which brings up the idea why the regression tests don't test pg_dump. It's just as important to people as the backend. psql already gets tested more or less. Would it not be a decent idea to do a pg_dump regress > out diff out expected.out at the end of the tests? That way we could catch these problems earlier. (Especially since I'm not sure how many people use pg_dump at all during development.) -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
On 2000-01-15, Tom Lane mentioned: > I am inclined to go ahead and insert vsnprintf into libpq. > The risk of problems seems pretty small (and it's zero on any > machine with a reasonably recent libc, since then vsnprintf > will be in libc and we won't link our version). The risk of > missing a buffer-overrun condition in pg_dump, and shipping > a pg_dump that will fail on someone's database, seems worse. > > Comments? Better ideas? I think including this in libpq is the wrong way to go. It's not meant for external clients. If you open this can of worms then anything psql or pg_dump feel like using that day becomes part of the library interface. We'd be stuck with supporting this forever. A better idea would be to do what psql does with snprintf: Just include the [v]snprintf.o file in the compilation (linking) conditionally. (Of course a better plan might even be consolidating all the backend/port and utils stuff into one unified port directory that everyone in the source tree can use, but that's probably too much bother right now.) One thing that I hope I can tackle for 7.1 is cleaning up the build process (with automake?) and that would take care of missing functions automatically by substituting a replacement contained in the distribution, as I suggested above. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Peter Eisentraut <peter_e@gmx.net> writes: > On 2000-01-15, Tom Lane mentioned: >> I am inclined to go ahead and insert vsnprintf into libpq. > I think including this in libpq is the wrong way to go. [snip] > A better idea would be to do what psql does with snprintf: Just include > the [v]snprintf.o file in the compilation (linking) conditionally. Sorry if I was unclear, but that was exactly what I meant. BTW, since this is now done in libpq, you could probably remove snprintf.o from psql ... regards, tom lane
[Charset ISO-8859-1 unsupported, filtering to ASCII...] > On 2000-01-15, Tom Lane mentioned: > > > I have repaired the most recently introduced coredump in pg_dump, > > but it still crashes on the regression test database. > > Which brings up the idea why the regression tests don't test pg_dump. It's > just as important to people as the backend. psql already gets tested more > or less. Would it not be a decent idea to do a > > pg_dump regress > out > diff out expected.out > > at the end of the tests? That way we could catch these problems > earlier. (Especially since I'm not sure how many people use pg_dump at all > during development.) Actually the megatest is: pg_dump regress > outdropdb regressioncreatedb regressionpsql regression < outpg_dump regress > out2diff out out2 That is the pg_dump test, and someone usually does it as part of regression testing before each release. It would be nice to add this to test/regress/Makefile maybe. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Peter Eisentraut <peter_e@gmx.net> writes: > Which brings up the idea why the regression tests don't test pg_dump. That'd be nice ... > Would it not be a decent idea to do a > pg_dump regress > out > diff out expected.out > at the end of the tests? There's a couple of small practical problems with that. Number one: pg_dump regression | wc 118211 565800 3516170 Adding a 3.5meg comparison file to the distribution isn't too appetizing; nor is the prospect of trying to keep it up to date via cvs. (*How* much storage did you just add to hub, Marc? ;-)) Number two is that we'd never get consistent dump results across different platforms. There are the known cross-platform variations (float roundoff, DST handling, etc) already accounted for by platform-specific substitute comparison files. Worse, a dump will see the platform-dependent variations in tuple update order that we currently mask in many tests by asking for ordered select results. I don't think anyone will hold still for a bunch of 3.5meg platform-specific dump comparison files. In short, it'd be a great idea, but figuring out a practical testing method will take some work. regards, tom lane
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Actually the megatest is: > pg_dump regress > out > dropdb regression > createdb regression > psql regression < out > pg_dump regress > out2 > diff out out2 > That is the pg_dump test, and someone usually does it as part of > regression testing before each release. > It would be nice to add this to test/regress/Makefile maybe. That's a good thought --- it eliminates both the platform-specific issues and the problem of adding a bulky reference file to the distribution. I'd suggest, though, that the test *not* clobber the regression DB. Instead pg_dump regression >outcreatedb regression2psql regression2 <outpg_dump regression >out2dropdb regression2 -- maybediffout out2 This leaves you a better chance of investigating the diff if you get one. regards, tom lane
On Mon, 17 Jan 2000, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On 2000-01-15, Tom Lane mentioned: > >> I am inclined to go ahead and insert vsnprintf into libpq. > > > I think including this in libpq is the wrong way to go. [snip] > > A better idea would be to do what psql does with snprintf: Just include > > the [v]snprintf.o file in the compilation (linking) conditionally. > > Sorry if I was unclear, but that was exactly what I meant. > > BTW, since this is now done in libpq, you could probably remove > snprintf.o from psql ... Hmm, maybe this is not what I meant. I meant adding the linking line to pg_dump, not libpq. But I guess as long as we don't tell anyone about it (vsnprintf being in libpq) we can safely take it out later if someone has a better plan. -- Peter Eisentraut Sernanders vaeg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Actually the megatest is: > > > pg_dump regress > out > > dropdb regression > > createdb regression > > psql regression < out > > pg_dump regress > out2 > > diff out out2 > > > That is the pg_dump test, and someone usually does it as part of > > regression testing before each release. > > > It would be nice to add this to test/regress/Makefile maybe. > > That's a good thought --- it eliminates both the platform-specific > issues and the problem of adding a bulky reference file to the > distribution. Still an incomplete test at all. It doesn't guarantee, that the resulting dump is what you really need to restore the database. For example, I'm not sure that FOREIGN KEY constraints are actually dumped correct (as constraint trigger statements after the data load). So it might work, and result in the same, wrong dump again. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Peter Eisentraut <e99re41@DoCS.UU.SE> writes: >>>> A better idea would be to do what psql does with snprintf: Just include >>>> the [v]snprintf.o file in the compilation (linking) conditionally. >> >> Sorry if I was unclear, but that was exactly what I meant. >> >> BTW, since this is now done in libpq, you could probably remove >> snprintf.o from psql ... > Hmm, maybe this is not what I meant. I meant adding the linking line to > pg_dump, not libpq. Not a usable answer: that would mean that *every* application using libpq would have to start including backend/port/snprintf.o, on the platforms where vsnprintf doesn't exist. > But I guess as long as we don't tell anyone about it > (vsnprintf being in libpq) It's certainly not going to become a published part of the interface, if that's what you mean. regards, tom lane
On 2000-01-17, Tom Lane mentioned: > > Hmm, maybe this is not what I meant. I meant adding the linking line to > > pg_dump, not libpq. > > Not a usable answer: that would mean that *every* application using > libpq would have to start including backend/port/snprintf.o, on the > platforms where vsnprintf doesn't exist. Nevermind. I got it all backwards. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden