Re: Parallel pg_restore versus old dump files - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: Parallel pg_restore versus old dump files |
Date | |
Msg-id | 4C215D24.9020603@dunslane.net |
In response to | Parallel pg_restore versus old dump files (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Parallel pg_restore versus old dump files (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Tom Lane wrote:
> In short, parallel pg_restore is guaranteed to fail on any input file
> made with a pre-8.4 pg_dump on Windows. It may be that there's some
> other mechanism involved in the reports we've gotten of parallel restore
> failing only some of the time, but I'm thinking that the heretofore
> unrecognized dependency on pg_dump-time seekability could well explain
> those too.

IIRC, you can reproduce this on Unix too by sending the output of pg_dump into a pipe. So it's not uniquely a Windows problem.

As Greg suggests, the solution would be to have a second TOC at the end of the file with the offsets. But I think that's way beyond what we should do on the back branches, and really beyond what we should do for 9.0. We should document the limitation.

> I see several action items here:
>
> 1. The error message emitted by _PrintTocData is incredibly misleading.
> It needs to be fixed to tell people if the problem is lack of data
> offsets rather than lack of seek capability.

Agreed.

> Another possibility is to just remove the inside-the-loop error test
> altogether: make it just skip till it finds the desired item, and only
> throw an error if it hits EOF without finding it. In the case that
> the error test is trying to catch, this would mean significantly more
> work done before reporting the error, but do we really care? I'm
> leaning to this solution because it would not require exporting state
> from the parallel restore control logic.

Would exporting a bit of state be so bad? It seems like it would be a bit cleaner, and I'll be surprised if it's terribly difficult. It can be set at the top of parallel_restore().

> 3. Perhaps pg_dump ought to emit a warning when it can't seek, instead
> of just silently not writing the data offsets. That behavior was okay
> before when lack of data offsets didn't really matter that much, but
> lack of data offsets is a serious performance handicap for parallel
> restore even after we fix the outright failure condition (because each
> worker is going to read through a lot of data to find what it needs).

For now, yes. But in 9.1 we should write out a second TOC and teach pg_restore to look for it.

> 4. Is there any value in back-porting the Windows FSEEKO support into
> 8.3 and 8.2? Arguably, not writing the data offsets is a performance
> bug. However a back-port won't do anything for people who are dumping
> with less than the latest minor release of pg_dump, so doing this might
> be largely wasted effort.

I doubt it's worth it, but I could be persuaded otherwise.

cheers

andrew
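For readers following the thread, the "skip till it finds the desired item, and only throw an error if it hits EOF" idea can be sketched roughly as below. This is an illustration only, not pg_restore's code: the block layout (a 4-byte id and a 4-byte length before each payload) and the names `write_block`, `read_block_by_scanning`, and `build_archive` are all hypothetical and do not reflect the real custom archive format.

```python
import io
import struct

# Hypothetical block header: (dump_id, payload_length), both 32-bit unsigned.
# This is NOT the pg_dump custom format, just a stand-in for illustration.
HEADER = struct.Struct("<II")


def write_block(out, dump_id, payload):
    """Append one block (header plus payload) to a writable binary stream."""
    out.write(HEADER.pack(dump_id, len(payload)))
    out.write(payload)


def read_block_by_scanning(stream, wanted_id):
    """Read forward, without seeking, until wanted_id is found.

    No error is raised for blocks that are merely in the way; the only
    "not found" error is reported once EOF is reached.
    """
    while True:
        header = stream.read(HEADER.size)
        if not header:
            raise LookupError(f"reached EOF without finding block {wanted_id}")
        if len(header) < HEADER.size:
            raise ValueError("truncated block header")
        dump_id, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated block payload")
        if dump_id == wanted_id:
            return payload
        # Not the block we want: its payload is discarded and the scan continues.


def build_archive(blocks):
    """Serialize (dump_id, payload) pairs into one bytes object."""
    buf = io.BytesIO()
    for dump_id, payload in blocks:
        write_block(buf, dump_id, payload)
    return buf.getvalue()
```

Note that every lookup starts from the beginning of the stream and reads every earlier payload, which is the performance handicap Tom describes for parallel workers when the data offsets are missing.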