Thread: Restoring a postgres database

Restoring a postgres database

From
Timothy Brier
Date:
Hi,

I've run in to this problem with a database we are working.  When I
restore a database schema, I need to restore the schema 5 times to
ensure that the schema is complete.  Also some of the sequences are not
restored in a usable form.  E.g. If my next sequence should be 1000, my
sequence is set to 1 and I need to run a query to reset my sequences.

I have restored other simpler databases in PostgreSQL without a problem.
It is my view that this issue is caused by a dependency issue because
the items that don't get restored the first or second time complain that
a dependency on a function doesn't exist, but all is fine after the 5th
attempt.

I do two pg_dumps.  The first is:
pg_dump -Cs databasename | gzip -cv > databasenameschemayyyymmdd.gz
pg_dump -Ca databasename | gzip -cv > databasenamedatayyyymmdd.gz

The database contains 64 tables, 34 views, 244 user functions, 34 rules,
87 triggers, 202 indexes and 70 sequences.

We are also using inheritance in the database.  The schemas are standard
schemas created by PostgreSQL.

I have also used the ability of pg_dump to create a schema and data in a
tar format, but cannot get it to restore the schema from the tar.  It
always complains about the functions for plpgsql already existing and
stops.  No problem restoring the data from the tar.

I am using PostgreSQL 7.3.3 on RedHat 7.1 - 9.0.   The database itself
is great, we've converted a few projects from MSSQL to PostgreSQL but I
am concerned about the integrity of restoring the data.

Does anyone know if this will be improved in 7.4?  Is there a better way
to do a backup?

To the developers, support team and the community,

Keep up the good work.


Timothy Brier.


Re: Restoring a postgres database

From
Andrew Gould
Date:
--- Timothy Brier <briert@cepu.ca> wrote:
> Hi,
>
> I've run in to this problem with a database we are
> working.  When I
> restore a database schema, I need to restore the
> schema 5 times to
> ensure that the schema is complete.  Also some of
> the sequences are not
> restored in a usable form.  E.g. If my next sequence
> should be 1000, my
> sequence is set to 1 and I need to run a query to
> reset my sequences.
>
> I have restored other simpler databases in
> PostgreSQL without a problem.
> It is my view that this issue is caused by a
> dependency issue because
> the items that don't get restored the first or
> second time complain that
> a dependency on a function doesn't exist, but all is
> fine after the 5th
> attempt.
>
> I do two pg_dumps.  The first is:
> pg_dump -Cs databasename | gzip -cv >
> databasenameschemayyyymmdd.gz
> pg_dump -Ca databasename | gzip -cv >
> databasenamedatayyyymmdd.gz
>
> The database contains 64 tables, 34 views, 244 user
> functions, 34 rules,
> 87 triggers, 202 indexes and 70 sequences.
>
> We are also using inheritance in the database.  The
> schemas are standard
> schemas created by PostgreSQL.
>
> I have also used the ability of pg_dump to create a
> schema and data in a
> tar format, but cannot get it to restore the schema
> from the tar.  It
> always complains about the functions for plpgsql
> already existing and
> stops.  No problem restoring the data from the tar.
>
> I am using PostgreSQL 7.3.3 on RedHat 7.1 - 9.0.
> The database itself
> is great, we've converted a few projects from MSSQL
> to PostgreSQL but I
> am concerned about the integrity of restoring the
> data.
>
> Does anyone know if this will be improved in 7.4?
> Is there a better way
> to do a backup?
>
> To the developers, support team and the community,
>
> Keep up the good work.
>
> Timothy Brier.

I ran into a situation similar to yours regarding
tables with foreign references and escalation rules.
I've noticed that tables seem to get dumped/restored
in the order in which they were created. To fix my
problem, I rearranged the table order in my schema
files. Since the tables were then created in the
correct order, subsequent dumps/restores have gone
smoothly.  (I hope it wasn't just dumb luck.)

I dump the schema separately from the data. I have a
python script that separates the table creation
statements into one schema file and the index and
constraint creation statements into a second schema
file. This allows me to recreate the tables, restore
the data, and then recreate indexes and constraints. I
figure if the data does not comply with the
contraints, the dump was bad anyway. (This has yet to
occur.)

I can't help with the sequence field problem; but I
hope you're not having to restore too often.

Best of luck,

Andrew Gould

Re: Restoring a postgres database

From
Timothy Brier
Date:
Andrew Gould wrote:
> --- Timothy Brier <briert@cepu.ca> wrote:
>
>>Hi,
>>
>>I've run in to this problem with a database we are
>>working.  When I
>>restore a database schema, I need to restore the
>>schema 5 times to
>>ensure that the schema is complete.  Also some of
>>the sequences are not
>>restored in a usable form.  E.g. If my next sequence
>>should be 1000, my
>>sequence is set to 1 and I need to run a query to
>>reset my sequences.
>>
>>I have restored other simpler databases in
>>PostgreSQL without a problem.
>>It is my view that this issue is caused by a
>>dependency issue because
>>the items that don't get restored the first or
>>second time complain that
>>a dependency on a function doesn't exist, but all is
>>fine after the 5th
>>attempt.
>>
>>I do two pg_dumps.  The first is:
>>pg_dump -Cs databasename | gzip -cv >
>>databasenameschemayyyymmdd.gz
>>pg_dump -Ca databasename | gzip -cv >
>>databasenamedatayyyymmdd.gz
>>
>>The database contains 64 tables, 34 views, 244 user
>>functions, 34 rules,
>>87 triggers, 202 indexes and 70 sequences.
>>
>>We are also using inheritance in the database.  The
>>schemas are standard
>>schemas created by PostgreSQL.
>>
>>I have also used the ability of pg_dump to create a
>>schema and data in a
>>tar format, but cannot get it to restore the schema
>>from the tar.  It
>>always complains about the functions for plpgsql
>>already existing and
>>stops.  No problem restoring the data from the tar.
>>
>>I am using PostgreSQL 7.3.3 on RedHat 7.1 - 9.0.
>>The database itself
>>is great, we've converted a few projects from MSSQL
>>to PostgreSQL but I
>>am concerned about the integrity of restoring the
>>data.
>>
>>Does anyone know if this will be improved in 7.4?
>>Is there a better way
>>to do a backup?
>>
>>To the developers, support team and the community,
>>
>>Keep up the good work.
>>
>>Timothy Brier.
>
>
> I ran into a situation similar to yours regarding
> tables with foreign references and escalation rules.
> I've noticed that tables seem to get dumped/restored
> in the order in which they were created. To fix my
> problem, I rearranged the table order in my schema
> files. Since the tables were then created in the
> correct order, subsequent dumps/restores have gone
> smoothly.  (I hope it wasn't just dumb luck.)
>
> I dump the schema separately from the data. I have a
> python script that separates the table creation
> statements into one schema file and the index and
> constraint creation statements into a second schema
> file. This allows me to recreate the tables, restore
> the data, and then recreate indexes and constraints. I
> figure if the data does not comply with the
> contraints, the dump was bad anyway. (This has yet to
> occur.)
>
> I can't help with the sequence field problem; but I
> hope you're not having to restore too often.
>
> Best of luck,
>
> Andrew Gould
>
>
Thanks for the reply.  I don't do alot of restores. But I would like to
see the issue addressed so it would be easier for other people who use
PostgreSQL and need to do a restore without jumping through these hoops.
  At the same time I realize and appreciate the hard work that has gone
into this DB and that there are other priorities.

Tim.


Re: Restoring a postgres database

From
Martijn van Oosterhout
Date:
I've been reading about these problems (but never experienced myself). I've
always wondered if it would be possible to write a script to scan through a
schema dump searching for dependancies. You can then use tsort to dump out
the order that things should be restored in.

Has anyone attempted this at all?

On Wed, Jul 09, 2003 at 12:29:32AM -0400, Timothy Brier wrote:
> Andrew Gould wrote:
> >--- Timothy Brier <briert@cepu.ca> wrote:
> >
> >>Hi,
> >>
> >>I've run in to this problem with a database we are
> >>working.  When I
> >>restore a database schema, I need to restore the
> >>schema 5 times to
> >>ensure that the schema is complete.  Also some of
> >>the sequences are not
> >>restored in a usable form.  E.g. If my next sequence
> >>should be 1000, my
> >>sequence is set to 1 and I need to run a query to
> >>reset my sequences.
> >>
> >>I have restored other simpler databases in
> >>PostgreSQL without a problem.
> >>It is my view that this issue is caused by a
> >>dependency issue because
> >>the items that don't get restored the first or
> >>second time complain that
> >>a dependency on a function doesn't exist, but all is
> >>fine after the 5th
> >>attempt.
> >>
> >>I do two pg_dumps.  The first is:
> >>pg_dump -Cs databasename | gzip -cv >
> >>databasenameschemayyyymmdd.gz
> >>pg_dump -Ca databasename | gzip -cv >
> >>databasenamedatayyyymmdd.gz
> >>
> >>The database contains 64 tables, 34 views, 244 user
> >>functions, 34 rules,
> >>87 triggers, 202 indexes and 70 sequences.
> >>
> >>We are also using inheritance in the database.  The
> >>schemas are standard
> >>schemas created by PostgreSQL.
> >>
> >>I have also used the ability of pg_dump to create a
> >>schema and data in a
> >>tar format, but cannot get it to restore the schema
> >>from the tar.  It
> >>always complains about the functions for plpgsql
> >>already existing and
> >>stops.  No problem restoring the data from the tar.
> >>
> >>I am using PostgreSQL 7.3.3 on RedHat 7.1 - 9.0.
> >>The database itself
> >>is great, we've converted a few projects from MSSQL
> >>to PostgreSQL but I
> >>am concerned about the integrity of restoring the
> >>data.
> >>
> >>Does anyone know if this will be improved in 7.4?
> >>Is there a better way
> >>to do a backup?
> >>
> >>To the developers, support team and the community,
> >>
> >>Keep up the good work.
> >>
> >>Timothy Brier.
> >
> >
> >I ran into a situation similar to yours regarding
> >tables with foreign references and escalation rules.
> >I've noticed that tables seem to get dumped/restored
> >in the order in which they were created. To fix my
> >problem, I rearranged the table order in my schema
> >files. Since the tables were then created in the
> >correct order, subsequent dumps/restores have gone
> >smoothly.  (I hope it wasn't just dumb luck.)
> >
> >I dump the schema separately from the data. I have a
> >python script that separates the table creation
> >statements into one schema file and the index and
> >constraint creation statements into a second schema
> >file. This allows me to recreate the tables, restore
> >the data, and then recreate indexes and constraints. I
> >figure if the data does not comply with the
> >contraints, the dump was bad anyway. (This has yet to
> >occur.)
> >
> >I can't help with the sequence field problem; but I
> >hope you're not having to restore too often.
> >
> >Best of luck,
> >
> >Andrew Gould
> >
> >
> Thanks for the reply.  I don't do alot of restores. But I would like to
> see the issue addressed so it would be easier for other people who use
> PostgreSQL and need to do a restore without jumping through these hoops.
>  At the same time I realize and appreciate the hard work that has gone
> into this DB and that there are other priorities.
>
> Tim.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Attachment

Re: Restoring a postgres database

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> I've been reading about these problems (but never experienced myself). I've
> always wondered if it would be possible to write a script to scan through a
> schema dump searching for dependancies. You can then use tsort to dump out
> the order that things should be restored in.

Now that we have dependency tracking on the server side (see pg_depend)
this should be just a small matter of programming.  No one's tackled it
yet though.

            regards, tom lane

Re: Restoring a postgres database

From
Dennis Björklund
Date:
On Wed, 9 Jul 2003, Tom Lane wrote:

> > schema dump searching for dependancies. You can then use tsort to dump out
> > the order that things should be restored in.
>
> Now that we have dependency tracking on the server side (see pg_depend)
> this should be just a small matter of programming.  No one's tackled it
> yet though.

It's not enough to just order things. To get something that always works
one need code that can break up cycles in the graph. With alter table and
other constructs it's easy to create objects that depend on each other.

--
/Dennis


Re: Restoring a postgres database

From
Martijn van Oosterhout
Date:
On Wed, Jul 09, 2003 at 09:15:41AM +0200, Dennis Björklund wrote:
> On Wed, 9 Jul 2003, Tom Lane wrote:
>
> > > schema dump searching for dependancies. You can then use tsort to dump out
> > > the order that things should be restored in.
> >
> > Now that we have dependency tracking on the server side (see pg_depend)
> > this should be just a small matter of programming.  No one's tackled it
> > yet though.
>
> It's not enough to just order things. To get something that always works
> one need code that can break up cycles in the graph. With alter table and
> other constructs it's easy to create objects that depend on each other.

Ah yes, but tsort tells you where the loop is and would be able to break it.
For example, if it were a function you would make a list of CREATE FUNCTION
with dummy bodies at the beginning and then a list of CREATE OR REPLACE
FUNCTION at the end with the real bodies. If there is an issue with the
DEFAULT of a table field it could break it out to an ALTER TABLE SET
DEFAULT.

In fact, you could take the ultra pessimistic route and change pg_dump to
dump in such a way that it will always work. That should be possible. Say
in the order:

languages, types, domains, functions with dummy bodies, tables in top-down
order but no defaults, functions with real bodies, table fields defaults,
triggers.

I've never had a database with recursive dependancies so maybe I'm
underestimating the problems here.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Attachment

Re: Restoring a postgres database

From
Dennis Björklund
Date:
On Wed, 9 Jul 2003, Martijn van Oosterhout wrote:

> In fact, you could take the ultra pessimistic route and change pg_dump to
> dump in such a way that it will always work. That should be possible. Say
> in the order:

That is probably simplest way even if the dumps will be ugly. There might
also be some places in pg where you can't make the change needed to do it
like this. But pg have been getting a lot better in this respect with a
nice sql syntax to do things that you previously could only do by changing
system tables.

> I've never had a database with recursive dependancies so maybe I'm
> underestimating the problems here.

I've had it several times, and I don't think it's that uncommon.

--
/Dennis


Re: Restoring a postgres database

From
Martijn van Oosterhout
Date:
On Wed, Jul 09, 2003 at 10:48:46AM +0200, Dennis Björklund wrote:
> On Wed, 9 Jul 2003, Martijn van Oosterhout wrote:
>
> > In fact, you could take the ultra pessimistic route and change pg_dump to
> > dump in such a way that it will always work. That should be possible. Say
> > in the order:
>
> That is probably simplest way even if the dumps will be ugly. There might
> also be some places in pg where you can't make the change needed to do it
> like this. But pg have been getting a lot better in this respect with a
> nice sql syntax to do things that you previously could only do by changing
> system tables.

Indeed.

> > I've never had a database with recursive dependancies so maybe I'm
> > underestimating the problems here.
>
> I've had it several times, and I don't think it's that uncommon.

Ok, in your experience, are all depndancy loops either:

- CHECKs or DEFAULTs that refer to functions that don't exist yet
- FUNCTIONs that refer to tables that don't exist yet

Because they can all be solved predefining functions and then fully
declaring them at the end. Are there other possibilities?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington

Attachment

Re: Restoring a postgres database

From
Dennis Björklund
Date:
On Wed, 9 Jul 2003, Martijn van Oosterhout wrote:

> Ok, in your experience, are all depndancy loops either:
>
> - CHECKs or DEFAULTs that refer to functions that don't exist yet
> - FUNCTIONs that refer to tables that don't exist yet

I'm not sure which is the most common. I think for me it have been
functions that refer to tables that dont exist. I actually don't remember
exactly what it have been. I know I once tried to write a small parser
that concisted of a couple of functions that called each other. In the end
I wrote that in the client instead, but it is one example I've had.

> Because they can all be solved predefining functions and then fully
> declaring them at the end. Are there other possibilities?

I think almost all things is solvable like that. Even tables you can
create as an empty table and add columns afterwards if you like.

Foreign keys is something that very well can point in both directions
between two tables. I've had that in a database, at least once. In this
case there can be problems with restoring the data also unless you check
all constraints in the very end. Otherwise you need to add the rows first
and then update them to get all the "pointers" right.

Maybe a better solution is to make all constraints deferable until the end
of the dump, even such "constraints" as checking if a table exist that a
function refer to. Then you could put the things in any order in the file
and it will work anyway. This view appeals to me.

The problem with dumps is that you don't notice the problems until you
need to restore a database, something that you don't do very often. I have
a lot of dumps stored which is my backups, but I don't know for sure that
these will be easy to restore from (I can always fix them myslef by hand).

--
/Dennis