Parallel backups (was Re: why postgresql over other RDBMS) - Mailing list pgsql-general

From Ron Johnson
Subject Parallel backups (was Re: why postgresql over other RDBMS)
Date
Msg-id 465620B8.2040107@cox.net
In response to Re: why postgresql over other RDBMS  (Chris Browne <cbbrowne@acm.org>)
List pgsql-general
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/24/07 17:21, Chris Browne wrote:
[snip]
>
> This would permit doing a neat parallel decomposition of pg_dump: you
> could do a 4-way parallelization of it that would function something
> like the following:
>
> - connection 1 opens, establishes the usual serialized mode transaction
>
> - connection 1 dumps the table metadata into one or more files in a
>   specified directory
>
> - then it forks 3 more connections, and seeds them with the same
>   serialized mode state
>
> - it then goes thru and can dump 4 tables concurrently at a time,
>   one apiece to a file in the directory.
>
> This could considerably improve speed of dumps, possibly of restores,
> too.
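
The four-way decomposition quoted above can be sketched in a few lines. This is only an illustration under loud assumptions: it does not talk to PostgreSQL, the `TABLES` dictionary is a hypothetical stand-in for rows that a real tool would read inside one shared snapshot, and `dump_table`, `parallel_dump` and the `toc.txt` file name are made up for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for table contents; a real tool would read
# these rows from the database inside a shared snapshot.
TABLES = {
    "accounts": [(1, "alice"), (2, "bob")],
    "orders": [(10, 1), (11, 2), (12, 2)],
    "items": [(100, "widget")],
    "audit": [],
    "notes": [(7, "hello")],
}


def dump_table(directory, name):
    """Write one table's rows to its own file; return (name, row count)."""
    path = os.path.join(directory, name + ".dat")
    rows = TABLES[name]
    with open(path, "w") as f:
        for row in rows:
            f.write("\t".join(str(col) for col in row) + "\n")
    return name, len(rows)


def parallel_dump(directory, jobs=4):
    """Dump metadata first, then up to `jobs` tables at a time."""
    # Step 1: a single writer records the table list (the "metadata").
    with open(os.path.join(directory, "toc.txt"), "w") as f:
        for name in sorted(TABLES):
            f.write(name + "\n")
    # Step 2: the pool dumps tables concurrently, one file per table.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = pool.map(lambda n: dump_table(directory, n), sorted(TABLES))
        return dict(results)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(parallel_dump(d))
```

The part this sketch leaves out is the hard one: making all four connections see the same transaction state.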

What about a master thread that "establishes the usual serialized
mode transaction" and then issues N asynchronous requests to the
database and, as they return with data, pipes the data to N
corresponding "writer" threads?  Matching N to the number of tape
drives comes to mind.

Yes, the master thread would be the choke point, but CPUs and RAM
are still a heck of a lot faster than disks, so maybe it wouldn't be
such a problem after all.

Of course, if libpq(??) doesn't handle async I/O, then this isn't
such a good idea after all.
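
A rough sketch of the master/writer arrangement described above, assuming nothing about libpq: the hypothetical `fetch_rows` generator stands in for the result stream of one asynchronous request, and plain Python lists stand in for the tape drives. The names `writer` and `run_master` are likewise invented for the illustration.

```python
import queue
import threading


def fetch_rows(stream_id, count):
    """Hypothetical stand-in for one asynchronous request's results."""
    for i in range(count):
        yield "stream%d-row%d" % (stream_id, i)


def writer(q, sink):
    """Drain one queue into one sink (a list here, a tape drive in spirit)."""
    while True:
        item = q.get()
        if item is None:  # sentinel: the master has no more data
            break
        sink.append(item)


def run_master(n_writers, rows_per_stream):
    """Interleave N result streams and hand each row to its own writer."""
    queues = [queue.Queue(maxsize=8) for _ in range(n_writers)]
    sinks = [[] for _ in range(n_writers)]
    threads = [
        threading.Thread(target=writer, args=(queues[i], sinks[i]))
        for i in range(n_writers)
    ]
    for t in threads:
        t.start()

    streams = [fetch_rows(i, rows_per_stream) for i in range(n_writers)]
    active = list(range(n_writers))
    while active:
        # Round-robin over the streams that still have data.
        for i in list(active):
            try:
                row = next(streams[i])
            except StopIteration:
                queues[i].put(None)  # tell writer i to finish
                active.remove(i)
            else:
                queues[i].put(row)  # blocks if writer i falls behind

    for t in threads:
        t.join()
    return sinks


if __name__ == "__main__":
    for sink in run_master(3, 4):
        print(sink)
```

The bounded queues make the choke point visible: when a writer falls behind, the master blocks on that writer's queue and the other streams wait too.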

> Note that this isn't related to subtransactions...

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFGViC4S9HxQb37XmcRAgkAAKC4pyZQWDF01S17uITbOkcj+KY8lgCg40pi
2B3xg2tnp554GGP0VsgACWE=
=eIUP
-----END PGP SIGNATURE-----