Thread: PostgreSQL theoretical maximums.
How many tables can PostgreSQL theoretically, and then practically, handle? What is the largest database size possible? What was the biggest database you've ever had on PostgreSQL? What were the challenges, and what kind of hardware and OS works best?

What is an effective way to predict database size when designing tables?

regards,
Karen Hill
Karen Hill wrote:
> How many tables can PostgreSQL theoretically and then practically
> handle? What is the largest database size possible? What was the
> biggest database you've ever had on PostgreSQL? What were the
> challenges and what kind of hardware and OS works best?

Maximum number of tables etc. is a FAQ:
http://www.postgresql.org/docs/faqs.FAQ.html#item4.4

It's been running on unix-like systems for much longer than Windows. Apart from that, the best system is probably determined by your experience.

> What is an effective way to predict database size when designing
> tables?

Rule of thumb: assume 3-5 times the size of the raw data to allow for overhead, indexes, etc. Other than that, you can find details of the on-disk format for rows towards the end of the manuals.

--
Richard Huxton
Archonet Ltd
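To put rough numbers on the 3-5x rule of thumb, here is a minimal sizing sketch in Python. The table layout, per-column byte widths, row count, and per-row overhead figure are all hypothetical assumptions for illustration, not measurements from a real schema:

    # Rough capacity estimate following the 3-5x rule of thumb above.
    # Column widths, row count and per-row overhead are illustrative
    # assumptions -- substitute figures from your own schema.

    ROW_OVERHEAD = 28            # approx. per-row header bytes on disk (assumption)
    columns = {                  # hypothetical "orders" table
        "id": 4,                 # integer
        "customer_id": 4,        # integer
        "placed_at": 8,          # timestamp
        "status": 12,            # short varchar, average payload
        "notes": 120,            # text, average payload
    }

    raw_row = ROW_OVERHEAD + sum(columns.values())
    rows = 50 * 1000 * 1000      # expected row count

    raw_bytes = raw_row * rows
    low, high = raw_bytes * 3, raw_bytes * 5     # the 3-5x rule of thumb

    gib = 1024 ** 3
    print("raw data : %.1f GiB" % (raw_bytes / float(gib)))
    print("plan for : %.1f - %.1f GiB with indexes and overhead"
          % (low / float(gib), high / float(gib)))

With these sample numbers the raw data comes to roughly 8 GiB, so the rule of thumb says to plan for roughly 25-41 GiB on disk.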
Hi,

Have a look at: http://www.postgresql.org/about/

/David

Karen Hill wrote:
> How many tables can PostgreSQL theoretically and then practically
> handle? What is the largest database size possible? What was the
> biggest database you've ever had on PostgreSQL? What were the
> challenges and what kind of hardware and OS works best?
>
> What is an effective way to predict database size when designing
> tables?
>
> regards,
Richard Huxton wrote:
> Karen Hill wrote:
>> How many tables can PostgreSQL theoretically and then practically
>> handle? What is the largest database size possible?
>
> Maximum number of tables etc. is a FAQ:
> http://www.postgresql.org/docs/faqs.FAQ.html#item4.4
>
> It's been running on unix-like systems for much longer than Windows.
> Apart from that, the best system is probably determined by your
> experience.
>
>> What is an effective way to predict database size when designing
>> tables?
>
> Rule of thumb: assume 3-5 times the size of the raw data to allow for
> overhead, indexes, etc.

I think I've read that there are multi-hundred GB PostgreSQL databases.

Since pg_dump is single-threaded, how long does it take to back up such a database? (Obviously it would need modern LTO, SuperDLT, etc. drives, either FC or U320.)

Are there any plans of making a multi-threaded, or even multi-process pg_dump?

--
Ron Johnson, Jr.
Jefferson LA USA
Ron Johnson wrote:
> I think I've read that there are multi-hundred GB PostgreSQL
> databases.
>
> Since pg_dump is single-threaded, how long does it take to back up
> such a database?

The evasive answer is that you probably don't run regular full pg_dump on such databases.

> Are there any plans of making a multi-threaded, or even
> multi-process pg_dump?

What do you hope to accomplish by that? pg_dump is not CPU bound.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Peter Eisentraut wrote:
> Ron Johnson wrote:
>> Since pg_dump is single-threaded, how long does it take to back up
>> such a database?
>
> The evasive answer is that you probably don't run regular full pg_dump
> on such databases.

Hmmm.

>> Are there any plans of making a multi-threaded, or even
>> multi-process pg_dump?
>
> What do you hope to accomplish by that? pg_dump is not CPU bound.

Write to multiple tape drives at the same time, thereby reducing the total wall time of the backup process.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-07-31 at 09:53 -0500, Ron Johnson wrote:
>> The evasive answer is that you probably don't run regular full pg_dump
>> on such databases.
>
> Hmmm.

You might want to use PITR for incremental backup, or maintain a standby system using Slony-I ( www.slony.info ).

>>> Are there any plans of making a multi-threaded, or even
>>> multi-process pg_dump?
>>
>> What do you hope to accomplish by that? pg_dump is not CPU bound.
>
> Write to multiple tape drives at the same time, thereby reducing the
> total wall time of the backup process.

pg_dump just produces output. You could pretty easily stripe that output across multiple devices just by using some scripts. Just make sure to write a script that can reconstruct the data again when you need to restore. You don't need multi-threaded pg_dump, you just need to use a script that produces multiple output streams. Multi-threaded design is only useful for CPU-bound applications.

Doing full backups of that much data is always a challenge, and I don't think PostgreSQL has limitations that another database doesn't.

Regards,
Jeff Davis
Jeff Davis wrote:
> You might want to use PITR for incremental backup, or maintain a standby
> system using Slony-I ( www.slony.info ).

We want PITR, Incremental & Full. Currently we do:
  Full        - every other day, directly to 4 tape drives.
  Incremental - on "the other" days, to 1 tape drive.
  PITR        - every 8 hours, to disk.

>> Write to multiple tape drives at the same time, thereby reducing the
>> total wall time of the backup process.
>
> pg_dump just produces output. You could pretty easily stripe that output
> across multiple devices just by using some scripts. Just make sure to
> write a script that can reconstruct the data again when you need to
> restore.

But doesn't that mean that pg_dump must dump to disk?

With a *big* database, that's a whole lot of extra kit (not just spindles) to buy.

> You don't need multi-threaded pg_dump, you just need to use a
> script that produces multiple output streams. Multi-threaded design is
> only useful for CPU-bound applications.
>
> Doing full backups of that much data is always a challenge, and I don't
> think PostgreSQL has limitations that another database doesn't.

I dunno about that. With Rdb/VMS it is trivial to back up a database directly to multiple tape drives. The Rdb backup utility has an algorithm which determines which tablespace data is copied to which tape drive.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 13:51, Ron Johnson wrote:
> Jeff Davis wrote:
>> pg_dump just produces output. You could pretty easily stripe that output
>> across multiple devices just by using some scripts. Just make sure to
>> write a script that can reconstruct the data again when you need to
>> restore.
>
> But doesn't that mean that pg_dump must dump to disk?
>
> With a *big* database, that's a whole lot of extra kit (not just
> spindles) to buy.

I'm certain you can back up direct to tape, as I've done it in the past with postgresql. This was in the day of 4 gig 4 mm tape drives...
Scott Marlowe wrote:
> On Mon, 2006-08-07 at 13:51, Ron Johnson wrote:
>> But doesn't that mean that pg_dump must dump to disk?
>>
>> With a *big* database, that's a whole lot of extra kit (not just
>> spindles) to buy.
>
> I'm certain you can back up direct to tape, as I've done it in the past
> with postgresql. This was in the day of 4 gig 4 mm tape drives...

Sure. That's why tar is named *Tape* ARchive.

Going along with the example of the 4GB 4mm tape drive, and to simplify, we are *not* using compression (neither on the drive nor s/w gzip or bzip):

Say we have a 7GB database. Is there a way to use 2 tape drives... Argh, I guess not, since the Unix pipe mentality presumes that the mode of operation will be:

  $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz

What would be darned useful (but only, I think, with heavy usage of tablespaces) is:

  $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

--
Ron Johnson, Jr.
Jefferson LA USA
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

Would passing the --multi-volume option be useful to you?

http://www.apl.jhu.edu/Misc/Unix-info/tar/tar_97.html

Regards,
Richard Broersma Jr.
On Mon, 2006-08-07 at 14:51, Ron Johnson wrote:
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

Oh, I see where you were headed.

I've found a bit here and there googling about for multiplex and tar, and found an IO::Multiplex module for perl.

I don't see why someone couldn't make use of it to create some kind of system where you'd do:

  pg_dump dbname | perl multiplexscript device1,device2,device3,devicen

No need for postgresql to support it directly. Restoring would likewise just be a reverse operation.
Richard Broersma Jr wrote:
>> Say we have a 7GB database. Is there a way to use 2 tape drives...
>> Argh, I guess not, since the Unix pipe mentality presumes that the
>> mode of operation will be:
>>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz
>>
>> What would be darned useful (but only, I think, with heavy usage of
>> tablespaces) is:
>>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3
>
> Would passing the --multi-volume option be useful to you?
>
> http://www.apl.jhu.edu/Misc/Unix-info/tar/tar_97.html

Mostly no, because it serializes access. Fill up one tape, go to the next, then the 3rd, etc.

--
Ron Johnson, Jr.
Jefferson LA USA
Ron Johnson <ron.l.johnson@cox.net> writes:
> Say we have a 7GB database. Is there a way to use 2 tape drives...
> Argh, I guess not, since the Unix pipe mentality presumes that the
> mode of operation will be:
>   $ pg_dump /some/database | tar cvfz /dev/st0/db.tgz

If you use "z" you're compressing, and then using multiple volumes is not possible. If you can forgo that, then you'll be able to use the "m" option and it will ask for another tape.

> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

You can create multiple files using the "m" option and then copy those to each individual tape drive. That's the only way I can think of to redirect your output to multiple tape drives. For multiple tapes, tar will ask you for the next, and all of them will be /dev/st<drive_number>.

--
Jorge Godoy <jgodoy@gmail.com>
Scott Marlowe wrote:
> Oh, I see where you were headed.
>
> I've found a bit here and there googling about for multiplex and tar,
> and found an IO::Multiplex module for perl.
>
> I don't see why someone couldn't make use of it to create some kind of
> system where you'd do:
>
>   pg_dump dbname | perl multiplexscript device1,device2,device3,devicen
>
> No need for postgresql to support it directly. Restoring would likewise
> just be a reverse operation.

Interesting. Many thanks. Scary, though, that it hasn't been touched in 30 months.

With multiple SCSI cards, each with its own tape drive, this immensely speeds up the backup operation.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 15:18, Ron Johnson wrote:
> Scott Marlowe wrote:
>> I've found a bit here and there googling about for multiplex and tar,
>> and found an IO::Multiplex module for perl.
>>
>> I don't see why someone couldn't make use of it to create some kind of
>> system where you'd do:
>>
>>   pg_dump dbname | perl multiplexscript device1,device2,device3,devicen
>>
>> No need for postgresql to support it directly. Restoring would likewise
>> just be a reverse operation.
>
> Interesting. Many thanks. Scary, though, that it hasn't been touched
> in 30 months.
>
> With multiple SCSI cards, each with its own tape drive, this
> immensely speeds up the backup operation.

Yeah, I read the description, and I think it's just a fancy name for tee. Sigh. Not REAL multiplexing, but stream duplication.
Scott Marlowe wrote:
> Yeah, I read the description, and I think it's just a fancy name for
> tee. Sigh. Not REAL multiplexing, but stream duplication.

This is where a multi-threaded pg_tapedump would be more effective, since it would be able to, for example, have 4 threads reading (different parts of) the database and writing to 4 separate tape drives.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 14:51 -0500, Ron Johnson wrote:
> What would be darned useful (but only, I think, with heavy usage of
> tablespaces) is:
>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3

I must be missing something. What is stopping you from doing something along the lines of:

  // mux.script pseudocode
  // X is the size of a stripe of data
  top:
    read X bytes from stdin, or exit if EOF
    asynchronously write those X bytes to /dev/st0
    read X bytes from stdin, or exit if EOF
    asynchronously write those X bytes to /dev/st1
    goto top

And then make an inverse script called demux.script.

Then:

  $ pg_dump somedatabase | gzip -c | mux.script

To restore:

  $ demux.script | gunzip -c | psql somedatabase

Would that work? Obviously you'd have to document the process well to make sure that someone didn't get confused 12 months later trying to restore. You may have to do something a little more sophisticated to make it work more generally, like adding header information to each tape that says "I am the 2nd tape of 3".

Regards,
Jeff Davis
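To make the pseudocode above concrete, here is a minimal Python sketch of such a mux script. The script name, stripe size, and device paths are illustrative assumptions, and a restore-side demux would be the same loop with reads and writes swapped:

    #!/usr/bin/env python
    # mux.py -- sketch of the mux.script idea above, not an existing tool.
    # Read stdin and round-robin fixed-size stripes across the devices
    # named on the command line.
    import sys

    STRIPE = 1024 * 1024   # 1 MiB per stripe; tune to your tape block size

    def mux(device_paths):
        outputs = [open(p, "wb") for p in device_paths]
        i = 0
        while True:
            chunk = sys.stdin.buffer.read(STRIPE)   # full stripe, or short only at EOF
            if not chunk:
                break
            outputs[i].write(chunk)                 # this stripe goes to device i
            i = (i + 1) % len(outputs)              # round-robin to the next device
        for f in outputs:
            f.close()

    if __name__ == "__main__":
        mux(sys.argv[1:])

Usage would then look something like: $ pg_dump somedatabase | gzip -c | python mux.py /dev/st0 /dev/st1, with the matching demux reading the stripes back in the same round-robin order and concatenating them to stdout for psql.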
Ron Johnson wrote:
> This is where a multi-threaded pg_tapedump would be more effective,
> since it would be able to, for example, have 4 threads reading
> (different parts of) the database and writing to 4 separate tape drives.

It will be difficult to have a consistent dump though. You can't do that with separate transactions. (And you can't have multiple simultaneous readers without separate transactions.)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, 2006-08-07 at 16:09, Ron Johnson wrote:
> Scott Marlowe wrote:
>> Yeah, I read the description, and I think it's just a fancy name for
>> tee. Sigh. Not REAL multiplexing, but stream duplication.
>
> This is where a multi-threaded pg_tapedump would be more effective,
> since it would be able to, for example, have 4 threads reading
> (different parts of) the database and writing to 4 separate tape drives.

Actually, I kinda prefer the idea of creating multiple streams using something like RAID, where if you have two streams, it's a mirror, and if you have three or more then you rotate around a parity stripe like RAID 5. Then, any error on any one tape drive could be recovered. Veritas has something like that for tape drives.

Heck, the more I think about it, the more I think it would be an interesting project for a device driver that was like /dev/mdt or something, and re-used the md libs from the hard drive universe.

Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0 and you're backing up. Restore the other way around.
Jeff Davis wrote:
> On Mon, 2006-08-07 at 14:51 -0500, Ron Johnson wrote:
>> What would be darned useful (but only, I think, with heavy usage of
>> tablespaces) is:
>>   $ pg_tapedump /some/database /dev/st0,/dev/st1,/dev/st2,/dev/st3
>
> I must be missing something. What is stopping you from doing something
> along the lines of:

Missing nothing. :)

>   // mux.script pseudocode
>   // X is the size of a stripe of data
>   top:
>     read X bytes from stdin, or exit if EOF
>     asynchronously write those X bytes to /dev/st0
>     read X bytes from stdin, or exit if EOF
>     asynchronously write those X bytes to /dev/st1
>     goto top

Python queues and threads would make that very simple. A master thread reads from the pg_dump pipe, passing X bytes at a time to a queue, and each writer thread performs this loop:
  - read from the queue
  - synchronously write to /dev/stN

> And then make an inverse script called demux.script.
>
> Then:
>   $ pg_dump somedatabase | gzip -c | mux.script
> To restore:
>   $ demux.script | gunzip -c | psql somedatabase
>
> Would that work? Obviously you'd have to document the process well to
> make sure that someone didn't get confused 12 months later trying to
> restore.

The disk system would have to be fast enough to keep all X tape drives full.

Depending on the speed of the tape systems and disk system, having X "threads" reading from the database/writing to tape, all at the same time, is much faster.

> You may have to do something a little more sophisticated to make it work
> more generally, like adding header information to each tape that says "I
> am the 2nd tape of 3".

Depending on the capabilities of your tape drive: CRC, error checking, load-next-tape, etc., etc.

--
Ron Johnson, Jr.
Jefferson LA USA
Alvaro Herrera wrote:
> Ron Johnson wrote:
>> This is where a multi-threaded pg_tapedump would be more effective,
>> since it would be able to, for example, have 4 threads reading
>> (different parts of) the database and writing to 4 separate tape drives.
>
> It will be difficult to have a consistent dump though. You can't do
> that with separate transactions. (And you can't have multiple
> simultaneous readers without separate transactions.)

Absolutely, you're right. All "threads" must run from within the same read-only transaction.

--
Ron Johnson, Jr.
Jefferson LA USA
Scott Marlowe wrote:
> Actually, I kinda prefer the idea of creating multiple streams using
> something like RAID, where if you have two streams, it's a mirror, and
> if you have three or more then you rotate around a parity stripe like
> RAID 5. Then, any error on any one tape drive could be recovered.
> Veritas has something like that for tape drives.
>
> Heck, the more I think about it, the more I think it would be an
> interesting project for a device driver that was like /dev/mdt or
> something, and re-used the md libs from the hard drive universe.
>
> Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0
> and you're backing up. Restore the other way around.

Tape RAID? I like that!!!!

If the disk system can keep the tape system full, though.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 16:51 -0500, Ron Johnson wrote:
> Python queues and threads would make that very simple. A master thread
> reads from the pg_dump pipe, passing X bytes at a time to a queue, and
> each writer thread performs this loop:
>   - read from the queue
>   - synchronously write to /dev/stN

Right, you can use threads also. You're doing synchronous writes from different threads, I was doing asynchronous writes from the same thread. It's just a matter of preference.

> The disk system would have to be fast enough to keep all X tape
> drives full.

The whole point of this is that pg_dump was producing output (from the disk) faster than you could store it (to the tape), right? So if you use multiple tapes and stripe with a multiplexing script, that should solve the problem, right?

Regards,
Jeff Davis
On Mon, 2006-08-07 at 16:53 -0500, Ron Johnson wrote:
>> It will be difficult to have a consistent dump though. You can't do
>> that with separate transactions. (And you can't have multiple
>> simultaneous readers without separate transactions.)
>
> Absolutely, you're right. All "threads" must run from within the
> same read-only transaction.

The idea is that pg_dump already works and already creates a good backup. Why not split up the data after pg_dump produces it? Of course it should be split up in a stream fashion, like I suggested before in this thread with my multiplexing script.

Regards,
Jeff Davis
On Mon, 2006-08-07 at 16:57 -0500, Ron Johnson wrote:
>> Attach X tape drives, put in a bunch of tapes, and just pg_dump > mdt0
>> and you're backing up. Restore the other way around.
>
> Tape RAID? I like that!!!!
>
> If the disk system can keep the tape system full, though.

What situation are you worried about here, exactly?

Regards,
Jeff Davis
Jeff Davis wrote:
> On Mon, 2006-08-07 at 16:57 -0500, Ron Johnson wrote:
>> Tape RAID? I like that!!!!
>>
>> If the disk system can keep the tape system full, though.
>
> What situation are you worried about here, exactly?

Backing up a large database in a limited amount of time.

If the DB is large enough, and the time is short enough, then the single-reader pg_dump cannot read the data fast enough (especially if other operations are pounding the disk system) to meet the time limit.

Thus, the need (sometimes) for multiple readers.

But, you say: do PITR and weekly full backups. OK. Say you do Saturday night backups and nightly PITR backups, and the disk system barfs hard on Friday. If the Tuesday night PITR tape has a parity error, you're hosed and have lost 3 days of data.

--
Ron Johnson, Jr.
Jefferson LA USA
On Mon, 2006-08-07 at 21:10 -0500, Ron Johnson wrote:
>> What situation are you worried about here, exactly?
>
> Backing up a large database in a limited amount of time.
>
> If the DB is large enough, and the time is short enough, then the
> single-reader pg_dump cannot read the data fast enough (especially
> if other operations are pounding the disk system) to meet the time
> limit.
>
> Thus, the need (sometimes) for multiple readers.

Oh, I see. You're not only worried about write speed (to tape), but also about the ability of pg_dump to read from all your disk spindles at once to get a fast enough read speed.

If you use PITR, then during the backup you can copy the data directory in any order that you want. It should be pretty easy to get it to copy from multiple streams that way. You can use tar to create a single stream out of each tablespace, and then multiplex each stream to multiple tapes so that it can write quickly enough.

It's not as easy to create multiple reading streams using pg_dump, because each transaction can have at most one COPY going at a time. You might be able to hack up something, but I think PITR is the way to go.

> But, you say: do PITR and weekly full backups. OK. Say you do
> Saturday night backups and nightly PITR backups, and the disk
> system barfs hard on Friday. If the Tuesday night PITR tape has a
> parity error, you're hosed and have lost 3 days of data.

By modifying the scripts you use to multiplex the data, you can employ some redundancy. I don't think it would be hard to include duplicate blocks or parity blocks in the multiplex script.

By creating multiple reading streams, one for each tablespace, and then writing each stream to multiple tapes (with parity blocks), you should be able to back up at the speed your hardware allows, and restore at the speed your hardware allows. Furthermore, with PITR you can do incremental backups continuously and wait 'til the weekend to do a full backup.

I'm imagining something similar to the following:

  mydb=# select pg_start_backup('mybackup');

  $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
  $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
  $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8

  mydb=# select pg_stop_backup(); -- in same connection as pg_start_backup

Where mux.script can write to three tapes and use parity blocks. Adjust based on the number of tapes you actually have.

Hope this helps,
Jeff Davis
On Tue, 2006-08-08 at 09:47 -0700, Jeff Davis wrote:
> I'm imagining something similar to the following:
>   mydb=# select pg_start_backup('mybackup');
>
>   $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
>   $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
>   $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8

Oops, let me clarify. Those tar commands are independent and can be run in parallel, which was the point I was trying to make. So, just run them in the background in whatever way is most convenient.

Regards,
Jeff Davis
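As a purely illustrative sketch of the parity-block idea mentioned above (the script name, stripe size, and padding scheme are assumptions, not an existing tool): every group of N-1 data stripes gets one XOR parity stripe, and the parity rotates across the tapes RAID-5 style, so the contents of any single lost or unreadable tape can be reconstructed from the others. A real script would also record the original byte count somewhere so the restore side can strip the zero padding:

    #!/usr/bin/env python
    # parity_mux.py -- hypothetical sketch, not an existing tool.
    # Stripe stdin across N tapes with one rotating XOR parity stripe
    # per group of N-1 data stripes (RAID-5 style).
    import sys

    STRIPE = 1024 * 1024

    def parity_mux(device_paths):
        outs = [open(p, "wb") for p in device_paths]
        n = len(outs)
        group = 0
        while True:
            raw = [sys.stdin.buffer.read(STRIPE) for _ in range(n - 1)]
            if not any(raw):
                break                                     # input ended on a group boundary
            done = any(len(c) < STRIPE for c in raw)      # end of input inside this group
            data = [c.ljust(STRIPE, b"\0") for c in raw]  # pad so stripes stay aligned
            parity = bytearray(STRIPE)
            for chunk in data:                            # XOR parity over the group
                for i, b in enumerate(chunk):
                    parity[i] ^= b
            pdev = group % n                              # rotate which tape gets parity
            it = iter(data)
            for d in range(n):
                outs[d].write(bytes(parity) if d == pdev else next(it))
            group += 1
            if done:
                break
        for f in outs:
            f.close()

    if __name__ == "__main__":
        parity_mux(sys.argv[1:])

Used in place of mux.script above (e.g. tar zcf - /path/to/pgdata | python parity_mux.py /dev/st0 /dev/st1 /dev/st2), a three-tape group stores two data stripes plus one parity stripe, so any one bad tape in the set can be reconstructed at restore time.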
Jeff Davis wrote:
> On Tue, 2006-08-08 at 09:47 -0700, Jeff Davis wrote:
>> I'm imagining something similar to the following:
>>   mydb=# select pg_start_backup('mybackup');
>>
>>   $ tar zcf - /path/to/pgdata | mux.script /dev/st0 /dev/st1 /dev/st2
>>   $ tar zcf - /path/to/tablespace1 | mux.script /dev/st3 /dev/st4 /dev/st5
>>   $ tar zcf - /path/to/tablespace2 | mux.script /dev/st6 /dev/st7 /dev/st8
>
> Oops, let me clarify. Those tar commands are independent and can be run
> in parallel, which was the point I was trying to make. So, just run them
> in the background in whatever way is most convenient.

Open database or closed database?

--
Ron Johnson, Jr.
Jefferson LA USA
On Tue, 2006-08-08 at 13:48 -0500, Ron Johnson wrote:
>> Oops, let me clarify. Those tar commands are independent and can be run
>> in parallel, which was the point I was trying to make. So, just run them
>> in the background in whatever way is most convenient.
>
> Open database or closed database?

See http://www.postgresql.org/docs/8.1/static/backup-online.html

In section 23.3.2, it suggests that you _don't_ need the same connection to do pg_start_backup('mybackup') as to do pg_stop_backup(). So, it appears you really need no database connections at all open during the process, but clients can freely connect if you want, with no interruption. This backup procedure will not affect your applications or stop your database.

All you need to do is:

(1) Connect, issue pg_start_backup('mybackup'), and disconnect.

(2) Back up the data dir + all tablespaces. You can do this in parallel if you want. You don't have to make a filesystem snapshot to do this; the restoration will fix any internal inconsistencies as long as you have archived the WAL files according to the above docs. So long as the backup makes it onto permanent media, and so have the WAL files, it should be fine.

(3) Connect, issue pg_stop_backup(), and disconnect.

I recommend reading the above link carefully and doing a few tests to make sure it works as you expect. But you certainly can do online backups (hot backups) with PITR, and the restoration as well. With PITR, the large portion of the data is moved using standard filesystem interaction, so you can do it in parallel with no problems.

Hope this helps,
Jeff Davis