Thread: tablespaces a priority for 7.5?

tablespaces a priority for 7.5?

From
"Brian Maguire"
Date:
Is support for tablespaces a priority feature for 7.5? I believe there
has been significant development in this area and it seems that
postgres' file structure opens it up nicely to support it.  What are the
chances this will be completed?

In my opinion, it really is a critical feature to support and administer
enterprise databases.  All the major databases currently support this
and it is a compelling enough reason drive big users from away from
using postgres for their enterprise/large databases.  It really is a
database administrator's feature.


Brian


Re: tablespaces a priority for 7.5?

From
Date:


In RAID era tablespaces are not such important regarding performance.

But for backup/restore - the ability to backup/restore selected tablespaces
while leaving other tablespaces is a big thing.
The whole point here is: it is assumed that backup/restore of tablespaces can
hapen quite quickly and as simple as to copy tablespace files from one
location to another(even while database is on - WAL can be used to handle
this) - this is compared to dump.

For example, index, tempoarary data tablespaces can be lost - not a big deal.

Undo(rollback) tablespaces - in a way can be lost as well.
While system data tablespace (table structure, stored procedures, etc) - at
no cost should be lost.
The same way application can be devided in "critical" and "not critical"
tablespaces and their backups maintained accordingly. For example, it may not
be a big deal to lose year 1996 tables while year 2004 tables should be
online.


Laimis

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire
> Sent: 21. janúar 2004 16:06
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] tablespaces a priority for 7.5?
>
>
> Is support for tablespaces a priority feature for 7.5? I
> believe there has been significant development in this area
> and it seems that postgres' file structure opens it up nicely
> to support it.  What are the chances this will be completed?
>
> In my opinion, it really is a critical feature to support and
> administer enterprise databases.  All the major databases
> currently support this and it is a compelling enough reason
> drive big users from away from using postgres for their
> enterprise/large databases.  It really is a database
> administrator's feature.
>
>
> Brian
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
               http://archives.postgresql.org

Re: tablespaces a priority for 7.5?

From
"Brian Maguire"
Date:
I agree that RAID provides similar performance benifits especially with striping io benifits, however it is powerful
andideal to have both options.  For example you may have a set of tables that are read-only for reporting and another
setmostly write only.   You could have they resting on different raid configurations ideal for each situtation.  
 
 
I also agree there are several admin benifits in the areas of backup.  You can also more easily create different
frequency/schedulesof backup for certain critical tables to a different schedule than other less important tables.
Theflexibility of easily growing your database beyond the current disk because of size limitations can be a life
savior.
 
brian

    -----Original Message----- 
    From: lnd@hnit.is [mailto:lnd@hnit.is] 
    Sent: Wed 1/21/2004 12:22 PM 
    To: Brian Maguire 
    Cc: 
    Subject: RE: [GENERAL] tablespaces a priority for 7.5?
    
    

    In RAID era tablespaces are not such important regarding performance. 

    But for backup/restore - the ability to backup/restore selected tablespaces 
    while leaving other tablespaces is a big thing. 
    The whole point here is: it is assumed that backup/restore of tablespaces can 
    hapen quite quickly and as simple as to copy tablespace files from one 
    location to another(even while database is on - WAL can be used to handle 
    this) - this is compared to dump. 

    For example, index, tempoarary data tablespaces can be lost - not a big deal. 

    Undo(rollback) tablespaces - in a way can be lost as well. 
    While system data tablespace (table structure, stored procedures, etc) - at 
    no cost should be lost.  
    The same way application can be devided in "critical" and "not critical" 
    tablespaces and their backups maintained accordingly. For example, it may not 
    be a big deal to lose year 1996 tables while year 2004 tables should be 
    online. 


    Laimis 

    > -----Original Message----- 
    > From: pgsql-general-owner@postgresql.org 
    > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Brian Maguire 
    > Sent: 21. janúar 2004 16:06 
    > To: pgsql-general@postgresql.org 
    > Subject: [GENERAL] tablespaces a priority for 7.5? 
    > 
    > 
    > Is support for tablespaces a priority feature for 7.5? I 
    > believe there has been significant development in this area 
    > and it seems that postgres' file structure opens it up nicely 
    > to support it.  What are the chances this will be completed?  
    > 
    > In my opinion, it really is a critical feature to support and 
    > administer enterprise databases.  All the major databases 
    > currently support this and it is a compelling enough reason 
    > drive big users from away from using postgres for their 
    > enterprise/large databases.  It really is a database 
    > administrator's feature.  
    > 
    >  
    > Brian 
    >  
    > 
    > ---------------------------(end of 
    > broadcast)--------------------------- 
    > TIP 6: Have you searched our list archives? 
    > 
                   http://archives.postgresql.org 


Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
>     The whole point here is: it is assumed that backup/restore of tablespaces can
>     hapen quite quickly and as simple as to copy tablespace files from one
>     location to another(even while database is on - WAL can be used to handle
>     this) - this is compared to dump.

This is not going to happen.  We intend to provide tablespaces in the
form of a simple management scheme for table files that are scattered
across multiple directories (typically on different filesystems).
That doesn't make it any safer to copy files behind the database's back.

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
Cott Lang
Date:
On Wed, 2004-01-21 at 09:05, Brian Maguire wrote:

> In my opinion, it really is a critical feature to support and administer
> enterprise databases.  All the major databases currently support this
> and it is a compelling enough reason drive big users from away from
> using postgres for their enterprise/large databases.  It really is a
> database administrator's feature.

It seems to me that the lack of point-in-time recovery is a much bigger
roadblock against big users. :(




Re: tablespaces a priority for 7.5?

From
Date:
> On Behalf Of Cott Lang
> It seems to me that the lack of point-in-time recovery is a
> much bigger roadblock against big users. :(

Meaning incremental (hot)-backups?
Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE?

With a WAL it should be doable in some 7.x version, all ingredients are
there.

Possibly someone suceeded in doing it already? Having a baseline backup and
saved WAL logs, shouldn't it be possible to recover?


Laimis

Re: tablespaces a priority for 7.5?

From
Cott Lang
Date:
On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:

> Meaning incremental (hot)-backups?
> Or as protection against DROP/TRUNCATE/DELETE ALL TABLE/SCHEMA/DATABASE?
>
> With a WAL it should be doable in some 7.x version, all ingredients are
> there.
>
> Possibly someone suceeded in doing it already? Having a baseline backup and
> saved WAL logs, shouldn't it be possible to recover?

Incremental pg_dumps would be a huge step in the right direction!

However, unless I am mistaken, a baseline backup would need to be taken
cold because you cannot take a consistent online backup of the data
files without using file system snapshots or split mirrors, and even
that's questionable.




Re: tablespaces a priority for 7.5?

From
Date:
> Incremental pg_dumps would be a huge step in the right direction!

For big people (meaning bid databases) - not shure if pg_dump is the right
direction. Pg_dump must be quite slow also compact. Raw file copy is the way
to go: quick and simple, virtually no configuration is required, no possible
pg_dump bugs - the latest quite important(of course, no DB version and OS
changes are possible, but not really needed - this is recovery, not
migration).


> However, unless I am mistaken, a baseline backup would need
> to be taken cold because you cannot take a consistent online
> backup of the data files without using file system snapshots
> or split mirrors, and even that's questionable.

That's doable: depends what you have in WAL logs.
In short: a baseline full hot database backup is taken while database is
running, then when recovering WAL logs are put on top of this baseline
backup. WAL logs must actually account for a lot: table, index changes, etc.
From my understanding, this is done not at SQL level, but at pages level. as
far as I remember the method - a full hot database backup must be taken at
page level as well, i.e. simple OS file copy-utility may not suite for the
purpose.

Regards, Laimis



> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Cott Lang
> Sent: 22. janúar 2004 15:00
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] tablespaces a priority for 7.5?
>
>
> On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:
>
> > Meaning incremental (hot)-backups?
> > Or as protection against DROP/TRUNCATE/DELETE ALL
> > TABLE/SCHEMA/DATABASE?
> >
> > With a WAL it should be doable in some 7.x version, all ingredients
> > are there.
> >
> > Possibly someone suceeded in doing it already? Having a baseline
> > backup and saved WAL logs, shouldn't it be possible to recover?
>
> Incremental pg_dumps would be a huge step in the right direction!
>
> However, unless I am mistaken, a baseline backup would need
> to be taken cold because you cannot take a consistent online
> backup of the data files without using file system snapshots
> or split mirrors, and even that's questionable.
>
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

Re: tablespaces a priority for 7.5?

From
Jeff
Date:
On Jan 22, 2004, at 10:00 AM, Cott Lang wrote:

> On Thu, 2004-01-22 at 07:38, lnd@hnit.is wrote:
>
>> Meaning incremental (hot)-backups?
>> Or as protection against DROP/TRUNCATE/DELETE ALL
>> TABLE/SCHEMA/DATABASE?
>>
>> With a WAL it should be doable in some 7.x version, all ingredients
>> are
>> there.
>>
>> Possibly someone suceeded in doing it already? Having a baseline
>> backup and
>> saved WAL logs, shouldn't it be possible to recover?
>
> Incremental pg_dumps would be a huge step in the right direction!
>
> However, unless I am mistaken, a baseline backup would need to be taken
> cold because you cannot take a consistent online backup of the data
> files without using file system snapshots or split mirrors, and even
> that's questionable.
>

pg_dump always takes a consistent dump - things won't change underneath
it.
So you're backup won't have anything that changed after pg_dump
started.. thus where incremental would come in.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
Cott Lang <cott@internetstaff.com> writes:
> However, unless I am mistaken, a baseline backup would need to be taken
> cold because you cannot take a consistent online backup of the data
> files without using file system snapshots or split mirrors, and even
> that's questionable.

No, it wouldn't.  All you need is to archive WAL beginning from the last
checkpoint record before you begin to take the baseline backup.  The
baseline backup would not be consistent in itself --- but replaying WAL
from the previous checkpoint to any time later than the end of the
backup would bring it to a consistent state.  Missed updates in the
backup are essentially damage that would get repaired by replay.

There are a few trivial things that still have to be done before this
can be a reality (one I can think of is that WAL really needs to have
entries for file creation/deletion), but it's not that far off in terms
of the base mechanisms.  Writing the management software is the main
task.

There is a lot more info available in the pghackers archives.

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
Cott Lang
Date:
> > However, unless I am mistaken, a baseline backup would need to be taken
> > cold because you cannot take a consistent online backup of the data
> > files without using file system snapshots or split mirrors, and even
> > that's questionable.
> >
>
> pg_dump always takes a consistent dump - things won't change underneath
> it.
> So you're backup won't have anything that changed after pg_dump
> started.. thus where incremental would come in.

Sorry, I'm referring to two entirely different things there. :)

1. Being able to do incremental pg_dumps would be a big plus, because
you could take much quicker dumps and thus do it more regularly. I have
a 50GB database I dump every 3 hours that takes 35 minutes to dump. :(

2. The rest was my hypothesizing on what might be necessary for
point-in-time recovery, which pg_dump isn't going to allow even if you
can perform incremental dumps.




Re: tablespaces a priority for 7.5?

From
Cott Lang
Date:
On Thu, 2004-01-22 at 09:04, Tom Lane wrote:

> No, it wouldn't.  All you need is to archive WAL beginning from the last
> checkpoint record before you begin to take the baseline backup.  The
> baseline backup would not be consistent in itself --- but replaying WAL
> from the previous checkpoint to any time later than the end of the
> backup would bring it to a consistent state.  Missed updates in the
> backup are essentially damage that would get repaired by replay.

I will experiment with this. I have plenty of databases to wreak havoc
on. :)

> There are a few trivial things that still have to be done before this
> can be a reality (one I can think of is that WAL really needs to have
> entries for file creation/deletion), but it's not that far off in terms
> of the base mechanisms.  Writing the management software is the main
> task.

Being able to write WAL logs to two locations would be handy.  Is it
currently possible to have the logs not be recycled and occasionally
deleted?

Is there any hope for support for all of this in 7.5?   I dread knowing
that at some point, I may be forced kicking and screaming back to Oracle
because of this. :)

thanks!





Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
Cott Lang <cott@internetstaff.com> writes:
> 2. The rest was my hypothesizing on what might be necessary for
> point-in-time recovery, which pg_dump isn't going to allow even if you
> can perform incremental dumps.

Right.  There seems to be some confusion about that in this thread,
so just for the record: pg_dump has nothing to do with our plans for
point-in-time recovery (or incremental backup, which is essentially the
same thing).  To do these, you would take a *physical* dump of the
database directory as a baseline, and thereafter copy WAL segments off
to tape or whatever you are using as archive media.  Recovery would
consist of restoring the physical baseline dump, and then replaying WAL
against it up to whatever point in time you wanted to recover to.  You
would, therefore, need to keep a continuous sequence of WAL files back
to the time of your most recent baseline backup.

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
"John Sidney-Woollett"
Date:
Tom Lane said:
> To do these, you would take a *physical* dump of the
> database directory as a baseline, and thereafter copy WAL segments off
> to tape or whatever you are using as archive media.  Recovery would
> consist of restoring the physical baseline dump, and then replaying WAL
> against it up to whatever point in time you wanted to recover to.  You
> would, therefore, need to keep a continuous sequence of WAL files back
> to the time of your most recent baseline backup.

This is similar to Oracle...

With Oracle you have the option of EITHER exporting the database
(equivalent to doing pg_dump) OR taking the database offline and carrying
out a file system level copy of the database files.

Recovery can be accomplished by either restoring the backup database files
or creating a new database by importing the most recent export/dump file.

After that you apply the archive logs (equivalent to WAL segments) to
bring your database up to date.

One caveat for Oracle (at least in 8.1.x) is that DDL statements are not
recorded in the archive logs, and can screw things up. Best to take a new
export/dump after making DDL changes!

The difference between Oracle and Postgres appears to be that posgres
requires a file system level copy of the database instead of being able to
make use of a dump file for this type of recovery. Is that correct?

John Sidney-Woollett


Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
Cott Lang <cott@internetstaff.com> writes:
> Being able to write WAL logs to two locations would be handy.  Is it
> currently possible to have the logs not be recycled and occasionally
> deleted?

The current thought is to add some sort of hook that allows a
user-defined action to be taken at the point where a WAL file would
normally get recycled.  Presumably this action would involve copying the
WAL file to someplace else (eg a tape), after which it could get
recycled.  If you've got any thoughts on exactly how to manage this,
let's take up a discussion on pgsql-hackers.

> Is there any hope for support for all of this in 7.5?

People are thinking about it, but I'm not seeing a lot of work getting
done.  (Partly my fault, since this is one of the things I'm supposed to
be working on...)

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
"John Sidney-Woollett" <johnsw@wardbrook.com> writes:
> With Oracle you have the option of EITHER exporting the database
> (equivalent to doing pg_dump) OR taking the database offline and carrying
> out a file system level copy of the database files.

The planned PITR feature would not require you to take anything offline.
The whole concept of an "offline" database is an Oracle-ism that I see
no value in emulating.

> One caveat for Oracle (at least in 8.1.x) is that DDL statements are not
> recorded in the archive logs, and can screw things up. Best to take a new
> export/dump after making DDL changes!

Well, we're ahead of them on that...

> The difference between Oracle and Postgres appears to be that posgres
> requires a file system level copy of the database instead of being able to
> make use of a dump file for this type of recovery. Is that correct?

I dunno what a "dump file" would equate to in Postgres terms, but yeah,
we're envisioning using ordinary filesystem tools (tar, say) as the
mechanism for handling a baseline backup.

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
"John Sidney-Woollett"
Date:
Tom Lane said:
> The planned PITR feature would not require you to take anything offline.
> The whole concept of an "offline" database is an Oracle-ism that I see
> no value in emulating.

> I dunno what a "dump file" would equate to in Postgres terms, but yeah,
> we're envisioning using ordinary filesystem tools (tar, say) as the
> mechanism for handling a baseline backup.

(I think) that was the point I was trying to make. Oracle lets you backup
while up and running (using export) OR when offline (copying db files)...

However, it looks like postgres will require a file system level backup to
recover from. Or will you be able to build a new db from the dump file,
and then apply the WAL segments?

John Sidney-Woollett




Re: tablespaces a priority for 7.5?

From
Cott Lang
Date:
> With Oracle you have the option of EITHER exporting the database
> (equivalent to doing pg_dump) OR taking the database offline and carrying
> out a file system level copy of the database files.

OR putting the tablespaces in hot backup mode and carrying out a file
system level copy of the database files, which is the option I certainly
prefer. :)

> Recovery can be accomplished by either restoring the backup
> database files or creating a new database by importing the most
> recent export/dump file.

> After that you apply the archive logs (equivalent to WAL segments)
> to bring your database up to date.

You can't do that with an export in Oracle. Archive logs are only useful
to apply to a cold or hot backup of the data files.

Export with Postgres works about the same as with Oracle.  It's the lack
of well-supported hot backups in Postgres that are the limitation.

> One caveat for Oracle (at least in 8.1.x) is that DDL statements
> are not recorded in the archive logs, and can screw things up.
> Best to take a new export/dump after making DDL changes!

I've not heard that one. That would make hot standby databases a serious
maintenance problem. :)



Re: tablespaces a priority for 7.5?

From
Date:
It's interesting to know what pg WAL's are:

If they are page level logs, then you theoretically can have a file system
backup - which is fast, reliable, no issues like pg_dump has, i.e. who is
first/last regarding stored procedures/views/triggers, to have users
precreated, etc, etc.

If the are SQL statements which are REPLAYD after pg_dump restore: well, then
file system backup has no chance. The issue is that you are missing index
pages and probably much more.


Regards, Laimis


> -----Original Message-----
> From: John Sidney-Woollett [mailto:johnsw@wardbrook.com]
> Sent: 22. janúar 2004 17:53
> To: Laimutis Nedzinskas
> Cc: johnsw@wardbrook.com
> Subject: RE: [GENERAL] tablespaces a priority for 7.5?
>
>
> lnd@hnit.is said:
> >> (I think) that was the point I was trying to make. Oracle lets you
> >> backup while up and running (using export) OR when offline
> (copying
> >> db files)...
> >
> > Not neccesseraly offline. In oracle you take tablespace in
> backup mode
> > and just use OCOPY(I believe you must watch out that file copy is
> > performed at OS block sizes not smaller than db page size -
> OCOPY does
> > that) to copy files.
> > REDO logs (i.e. WAL logs) will bring copied files in synch.
> > Or one can use RMAN tool which is the same as to copy files
> just it makes
> > book-keeping for backups and discards empty pages from
> backuped file.
>
> You obviously know Oracle better than me - I stand corrected! :)
>
> Actually I was trying to figure out whether the WAL segments
> could be applied to a database rebuilt using a file generated
> by pg_dump. I suspect not, and doesn't that mean that the
> postmaster needs to be stopped to carry out the file system
> level copy of the pg files?
>
> John
>

Re: tablespaces a priority for 7.5?

From
Tom Lane
Date:
"John Sidney-Woollett" <johnsw@wardbrook.com> writes:
> Tom Lane said:
>> I dunno what a "dump file" would equate to in Postgres terms, but yeah,
>> we're envisioning using ordinary filesystem tools (tar, say) as the
>> mechanism for handling a baseline backup.

> However, it looks like postgres will require a file system level backup to
> recover from. Or will you be able to build a new db from the dump file,
> and then apply the WAL segments?

What dump file?  I'm trying to say that we have no such concept and no
intention of inventing one.

            regards, tom lane

Re: tablespaces a priority for 7.5?

From
"Rick Gigger"
Date:
> On Wed, 2004-01-21 at 09:05, Brian Maguire wrote:
>
> > In my opinion, it really is a critical feature to support and administer
> > enterprise databases.  All the major databases currently support this
> > and it is a compelling enough reason drive big users from away from
> > using postgres for their enterprise/large databases.  It really is a
> > database administrator's feature.
>
> It seems to me that the lack of point-in-time recovery is a much bigger
> roadblock against big users. :(

 This is certainly my feeling.


Re: tablespaces a priority for 7.5?

From
Julian North
Date:
speaking as someone currently migrating enterprise stuff to postgres....

point-in-time is definitely the biggest issue.

this is the main thing we are having to look reproducing using some form of
replication to an alternative server.



-----Original Message-----
From: Rick Gigger [mailto:rick@alpinenetworking.com]
Sent: 22 January 2004 19:09
To: Cott Lang; pgsql-general@postgresql.org
Subject: Re: [GENERAL] tablespaces a priority for 7.5?



> On Wed, 2004-01-21 at 09:05, Brian Maguire wrote:
>
> > In my opinion, it really is a critical feature to support and administer
> > enterprise databases.  All the major databases currently support this
> > and it is a compelling enough reason drive big users from away from
> > using postgres for their enterprise/large databases.  It really is a
> > database administrator's feature.
>
> It seems to me that the lack of point-in-time recovery is a much bigger
> roadblock against big users. :(

 This is certainly my feeling.


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

________________________________________________________________________
This e-mail has been scanned for all viruses by Star Internet. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
________________________________________________________________________

________________________________________________________________________
This e-mail has been scanned for all viruses by Star Internet. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
http://www.star.net.uk
________________________________________________________________________