Re: Incremental backup - Mailing list pgsql-hackers

From Kevin Brown
Subject Re: Incremental backup
Date
Msg-id 20030215223029.GG1833@filer
Whole thread Raw
In response to Re: Incremental backup  (Curt Sampson <cjs@cynic.net>)
List pgsql-hackers
Curt Sampson wrote:
> Oracle9i has a new feature called a "flashback query," which uses the
> information in the rollback segements to let you query the database
> in a previous state. (I.e., "select such and such from this table as
> of two hours ago.") Postgres could do this using the older copies of
> rows as well, though the performance often wouldn't be pretty, since
> your indexes become useless, I believe. (Don't they point to only the
> latest copy of a row?) Still, it would be cool and life-saving in some
> situations.

That sounds like a really nice feature.  We don't do rollback
segments, but I suppose we could (perhaps at the expense of
performance).

> > For PITR, I assume we'd need an archivelog function that would
> > copy the WAL files as they're checkpointed to some other location
> > (with destination names that reflect their order in time), just for
> > starters.
> 
> Well, I'm not sure that you'd really need to have any special archiving
> facilities: you just need to let the old files sit there, and keep
> creating new ones. You do lose a bit of performance in that you can't
> recycle log segments, but you could always just fork off a (well niced)
> process to create a new, empty log segment at the time you start in on
> the last pre-created one, so that you will have another pre-created one
> ready when you finish the current one.

It's probably just as easy to fork off a well-niced process to copy
the log segments elsewhere.

If you're a shop trying to eke out the last bit of performance from
PostgreSQL, then you're probably going to put the transaction logs on
the fastest bit of disk you've got -- a solid-state disk if you can.
That means the amount of space you've got on the filesystem holding
the logs isn't necessarily all that high.  For PITR you almost
certainly want to put the old log files somewhere else for
safekeeping: since you're not using them anymore and you probably want
to keep a lot of them, the old log files need to go somewhere that has
tons of space and is highly reliable.  For that area, performance is
the least of your concerns, as long as it's fast enough that you can
keep up with the database as it writes transaction logs.

And chances are you'll want a background process that compresses the
old log files to minimize the amount of space they eat.


> It's not like there's much point in writing the data to the disk if
> you can't get to it. I'd understood fdatasync as just a way to avoid
> updaing the inode's last changed date.  Are there OSes that
> implement fdatasync in a way that you could still lose data?

I wouldn't think so.  Otherwise what would be the point of having it?

> And you can't do such a rollback at all, except on an entire database,
> because of the potential integrity violations. The best you could do
> would be to generate SQL for the changes you'd need to get back to the
> previous point, and see if you can execute these changes.

Well, you could do the rollback for all tables involved in the
referential integrity tree and leave it at that, but yeah it's
probably simpler to rollback the whole database.

It would still be a useful feature, but I don't know that it's worth
the upfront performance cost of having to save off rollback segments.
If you did have such a feature you'd want to be able to control
whether or not it's enabled, so that you take the performance hit only
if you need the feature.



-- 
Kevin Brown                          kevin@sysexperts.com


pgsql-hackers by date:

Previous
From: Marc Munro
Date:
Subject: Re: Incremental backup
Next
From: Ian Barwick
Date:
Subject: Todo "claim": psql tab completion on schema names