Re: Idea for improving speed of pg_restore - Mailing list pgsql-general

From cbbrowne@acm.org
Subject Re: Idea for improving speed of pg_restore
Date
Msg-id 60isnsh30t.fsf@dev6.int.libertyrms.info
Whole thread Raw
In response to Idea for improving speed of pg_restore  (Ron Johnson <ron.l.johnson@cox.net>)
List pgsql-general
tgl@sss.pgh.pa.us (Tom Lane) writes:
> Christopher Browne <cbbrowne@acm.org> writes:
>> Restoring a database involves, for each table:
>>  1. Reading table data from the source file;
>>  2. Writing data to the database file for the table;
>>  3. After that, reading the database file data, and
>>  4. Writing the sorted bits to the index file.
>>  5. Along with all of this, HEFTY amounts of updates to WAL.
>
> An idea that Marc and Jan and I were kicking around last night was
> to offer a GUC option to disable writes to WAL.  During initial data
> load, you might as well go back to initdb if you have any failure,
> so why bother with full ACID compliance?  I'm not sure if the
> performance benefit would be great enough to make it worth equipping
> the system with such a large-caliber foot-gun, but it's something to
> think about.

That is a good thought.

To make it _marginally_ less unsafe, it might be an idea to only
enable this in some variation on single user mode so that it is made
manifestly clear that this is a "recovery" process and that nobody
else should be poking their head in on.

(Or, in keeping with the metaphor, it involves having a large flashing
light that says "Stay off the range - live fire artillery exercise
under way!"  "Make sure this isn't pointed at Mabel or any other
highly trained monkeys or databases you want to keep around!"
<http://cbbrowne.com/info/images/shooting.jpg>  Similarly, if my
brother gets behind the wheel of Certain Large Vehicles, it is good to
keep any lurking trees out of the way...
<http://www3.ns.sympatico.ca/coffee/pics/brdtnk2.gif> :-))

> I tend to agree with your doubts about parallelizing index builds,
> but there may be scenarios where it's a win; it'd depend on your
> relative CPU and disk horsepower.  (Consider fast disk and multiple
> not-so-fast CPUs; serial index builds can only use one of the CPUs.)
> Question is, is it a big enough win for enough people to make it
> worth supporting?

That one would take some benchmarking to see if/where it would be a
win.  And "fast disk and multiple cheezy CPUs" sounds pretty atypical
in these days of Opterons and multiple GHz Intel hardware.  That being
said, I could point to a _perfect_ server to try it out on, with the
problem that it isn't representative of the sort of 'production'
environment where you'd actually care about tuning the builds.
Certainly doesn't fit into the set of things I can afford to kludge
into my schedule :-(.
--
let name="aa454" and tld="freenet.carleton.ca" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/x.html
If nothing ever sticks to Teflon, how do they make Teflon stick to the
pan?

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: State of Beta 2
Next
From: "Shawn Pinto"
Date:
Subject: Re: Red Hat 9 Postgres