Thread: pg_restore and create FK without verification check

pg_restore and create FK without verification check

From
ow
Date:
--- ow <oneway_111@yahoo.com> wrote:
> IMHO, not only data need to loaded before FK constraints are created but also
> there has got to be a feature to allow creation of an FK constraint WITHOUT
> doing the verification that all loaded/existing records satisfy the FK
> constraint. The ability to create a FK constraint without verification of
> existing records should exist by itself (maybe only to superuser) and also as
> an option in pg_restore. 
> 
> More details:
> http://archives.postgresql.org/pgsql-admin/2003-11/msg00308.php
> http://archives.postgresql.org/pgsql-admin/2003-11/msg00323.php
>

It appears there's not a lot of interest in discussing the possibility of FK
constraint creation WITHOUT the verification check. How then should one handle
the situation with pg_restore and large dbs where creation of FK constraint(s)
may take hours?

Thanks




__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
ow wrote:

>
>It appears there's not a lot of interest in discussing the possibility of FK
>constraint creation WITHOUT the verification check. How then should one handle
>the situation with pg_restore and large dbs where creation of FK constraint(s)
>may take hours?
>  
>

I'd prefer a backup/restore method that dumps physical data, so at 
restore time there's no need for recreation of FKs. But I didn't see any 
feedback on this proposal either.

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
Hannu Krosing
Date:
Andreas Pflug kirjutas K, 26.11.2003 kell 12:09:
> ow wrote:
> 
> >
> >It appears there's not a lot of interest in discussing the possibility of FK
> >constraint creation WITHOUT the verification check. How then should one handle
> >the situation with pg_restore and large dbs where creation of FK constraint(s)
> >may take hours?
> >  
> >
> 
> I'd prefer a backup/restore method that dumps physical data, so at 
> restore time there's no need for recreation of FKs. But I didn't see any 
> feedback on this proposal either.

Was this proposal a separate one from using WAL logs for PITR ?

-------------
Hannu



Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Hannu Krosing wrote:

>Andreas Pflug kirjutas K, 26.11.2003 kell 12:09:
>  
>
>>ow wrote:
>>
>>    
>>
>>>It appears there's not a lot of interest in discussing the possibility of FK
>>>constraint creation WITHOUT the verification check. How then should one handle
>>>the situation with pg_restore and large dbs where creation of FK constraint(s)
>>>may take hours?
>>> 
>>>
>>>      
>>>
>>I'd prefer a backup/restore method that dumps physical data, so at 
>>restore time there's no need for recreation of FKs. But I didn't see any 
>>feedback on this proposal either.
>>    
>>
>
>Was this proposal a separate one from using WAL logs for PITR ?
>  
>

Yes, I mentioned it just a few days when discussing dependency in pg_dump.
This is somewhat complementary to WAL and PITR. I'm seeking for a fast 
way to dump and restore a complete database, like physical file copy, 
without shutting down the backend. I was thinking of a BACKUP command 
that streams out the files including any indexes and non-vacuumed 
tuples. A database recreated from that wouldn't be as clean as a 
pg_dump/pg_restored database, but it would be up much faster, and there 
wouldn't be any dependency problem.
This doesn't really replace pg_dump/pg_restore, because it probably 
wouldn't be able to upgrade a cluster. Still, it would be helpful for 
disaster recovery.

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
ow
Date:
--- Andreas Pflug <pgadmin@pse-consulting.de> wrote:
> Yes, I mentioned it just a few days when discussing dependency in pg_dump.
> This is somewhat complementary to WAL and PITR. I'm seeking for a fast 
> way to dump and restore a complete database, like physical file copy, 
> without shutting down the backend. I was thinking of a BACKUP command 
> that streams out the files including any indexes and non-vacuumed 
> tuples. A database recreated from that wouldn't be as clean as a 
> pg_dump/pg_restored database, but it would be up much faster, and there 
> wouldn't be any dependency problem.
> This doesn't really replace pg_dump/pg_restore, because it probably 
> wouldn't be able to upgrade a cluster. Still, it would be helpful for 
> disaster recovery.
> 

I think creating a FK without verification check is still needed, especially in
case if:
1) original db is corrupted
2) during cluster upgrade
3) there's a need to BACKUP/RESTORE a *schema* instead of db.

Thanks




__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
ow wrote:

>--- Andreas Pflug <pgadmin@pse-consulting.de> wrote:
>  
>
>>Yes, I mentioned it just a few days when discussing dependency in pg_dump.
>>This is somewhat complementary to WAL and PITR. I'm seeking for a fast 
>>way to dump and restore a complete database, like physical file copy, 
>>without shutting down the backend. I was thinking of a BACKUP command 
>>that streams out the files including any indexes and non-vacuumed 
>>tuples. A database recreated from that wouldn't be as clean as a 
>>pg_dump/pg_restored database, but it would be up much faster, and there 
>>wouldn't be any dependency problem.
>>This doesn't really replace pg_dump/pg_restore, because it probably 
>>wouldn't be able to upgrade a cluster. Still, it would be helpful for 
>>disaster recovery.
>>
>>    
>>
>
>I think creating a FK without verification check is still needed, especially in
>case if:
>1) original db is corrupted
>2) during cluster upgrade
>  
>
Agreed. This might be useful for replication purposes too; in MSSQL, you 
can write "CREATE TRIGGER ... NOT FOR REPLICATION". I'd like to see a 
transaction safe way (ENABLE/DISABLE TRIGGER command) for this.

>3) there's a need to BACKUP/RESTORE a *schema* instead of db.
>
>Thanks
>
>
>
>
>__________________________________
>Do you Yahoo!?
>Free Pop-Up Blocker - Get it now
>http://companion.yahoo.com/
>
>---------------------------(end of broadcast)---------------------------
>TIP 3: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly
>
>  
>




Re: pg_restore and create FK without verification check

From
Oli Sennhauser
Date:
Hello

I was asking about this too, one or two weeks ago.

>>>> It appears there's not a lot of interest in discussing the 
>>>> possibility of FK
>>>> constraint creation WITHOUT the verification check. How then should 
>>>> one handle
>>>> the situation with pg_restore and large dbs where creation of FK 
>>>> constraint(s)
>>>> may take hours?     
>>>
>>> I'd prefer a backup/restore method that dumps physical data, so at 
>>> restore time there's no need for recreation of FKs. But I didn't see 
>>> any feedback on this proposal either.   
>>
>> Was this proposal a separate one from using WAL logs for PITR ? 
>
My question then was:

>> Q2: New situation: Why is it not a good idea to backup the database
>> files of a cluster incl. all c_log and x_log (log files last) to get a
>> "physicaly hot backup".
>> In principle it is the same situation like a server which is crashing
>> (not a once but during some time). After restoring, it should do a redo
>> and rollback automatically like after a crash. This methode (physical
>> hot backup) would increas backup and restore times dramatically.

The answer from Robert Treat was:

> Essentially I think you're right, it should behave much like a crashing 
> server.  The main reason why people don't recommend it is that (depending on 
> your os setup) there is the potential to lose data that has been commited but 
> not actually written to disk.  Note that you shouldn't get corrupted data 
> from this, but in many cases losing data is just as bad so we don't recomend 
> it.  If you really want to do this, you should really either shut down the 
> database  or get LVM going.

I did not yet many tests. But in principle I was able to hot-backup a 
cluster or only one database and restore it. But the answer from Robert 
makes me a little afraid. It means for me he/they do not trust theire 
recovery mechanism. A definitive answer from Robert is still out.

In my opinion a high grade professional database system (like PostgreSQL 
is or want to be) should have some hot backup features. Otherwise you 
are NEVER able to handle VLDB's. They were discussing about a 32 TB 
PostgreSQL database. And I bet my next bonus this year :-), that they 
are also not able to backup and restore it in a reasonable time.

Regards Oli

-------------------------------------------------------

Oli Sennhauser
Database-Engineer (Oracle & PostgreSQL)
Rebenweg 6
CH - 8610 Uster / Switzerland

Phone (+41) 1 940 24 82 or Mobile (+41) 79 450 49 14
e-Mail oli.sennhauser@bluewin.ch
Website http://mypage.bluewin.ch/shinguz/PostgreSQL/

Secure (signed/encrypted) e-Mail with a Free Personal SwissSign ID: http://www.swisssign.ch

Import the SwissSign Root Certificate: http://swisssign.net/cgi-bin/trust/import


Re: pg_restore and create FK without verification check

From
"Zeugswetter Andreas SB SD"
Date:
> >> Q2: New situation: Why is it not a good idea to backup the database
> >> files of a cluster incl. all c_log and x_log (log files last) to get a
> >> "physicaly hot backup".
> >> In principle it is the same situation like a server which is crashing
> >> (not a once but during some time). After restoring, it should do a redo
> >> and rollback automatically like after a crash. This methode (physical
> >> hot backup) would increas backup and restore times dramatically.
>
> The answer from Robert Treat was:
>
> > Essentially I think you're right, it should behave much like a crashing
> > server.  The main reason why people don't recommend it is that (depending on
> > your os setup) there is the potential to lose data that has been commited but
> > not actually written to disk.  Note that you shouldn't get corrupted data
> > from this, but in many cases losing data is just as bad so we don't recomend
> > it.  If you really want to do this, you should really either shut down the
> > database  or get LVM going.

The key issue here is to have a pg_control file to start from with a
finished checkpoint from before you start to backup. Then you need to
ensure that you have all logfiles from checkpoint until backup finishes.
The last thing to backup must be the last active x_log.
It would prbbly be a good idea to not have a vacuum running concurrently :-)

And then you need to do a lot of tests, since nobody else does it yet.
I think this is an issue, since it is such high risk, nobody will
step up easily and say that it is safe.

Andreas


Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
> This is somewhat complementary to WAL and PITR. I'm seeking for a fast 
> way to dump and restore a complete database, like physical file copy, 
> without shutting down the backend. I was thinking of a BACKUP command 
> that streams out the files including any indexes and non-vacuumed 
> tuples. A database recreated from that wouldn't be as clean as a 
> pg_dump/pg_restored database, but it would be up much faster, and there 
> wouldn't be any dependency problem.

It's already intended to support this as part of the PITR work.  The
idea is you force a checkpoint and then make a tar-format dump of the
database tree (tar or whatever floats your boat, but anyway a
filesystem-level backup).  The database need not be stopped while you do
this, and you don't need a filesystem that can do snapshots or anything
fancy like that.  The tar dump itself most likely does not represent a
consistent state of the database by the time you are done making it.
That is okay, because you have also been archiving off to tape (or
someplace) all the WAL data generated since that pre-dump checkpoint.
You can continue archiving the WAL series for however far forward from
the original dump you feel like.  If you need to recover, you reload the
database from the tar dump and then replay the WAL series against it.
This is indistinguishable from a crash recovery situation --- the
"inconsistent" tar dump looks just like a disk that has received some
but not all of the updates since the last checkpoint.  Replay will fix it.

The cool thing about this is you can actually bring the DB to the state
it was in at any specific point in time covered by your WAL archive ---
just run the WAL replay as far as you want, then stop.  Solves the
"junior DBA deleted all my data Saturday morning" problem, thus "PITR".
Now the uncool thing is you need massive amounts of secondary storage to
archive all that WAL data, if your installation has heavy update
activity.  But it seems to me it would address the need you mention
above --- you'd just not bother to continue archiving WAL past the end
of the dump operation.

In principle you could do this today, but we don't have enough
support code in place to make it work smoothly, eg WAL segment files
aren't labeled with enough identifying information to let you manage
an archive full of 'em.  Still it doesn't seem that far away.

> This doesn't really replace pg_dump/pg_restore, because it probably 
> wouldn't be able to upgrade a cluster.

Right, any such physical dump would be limited to restoring a whole
cluster as-is: no imports into other clusters, no selectivity, no fancy
games.  The main reason is you'd have to dump and restore pg_clog along
with the data files.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
ow
Date:
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > This doesn't really replace pg_dump/pg_restore, because it probably 
> > wouldn't be able to upgrade a cluster.
> 
> Right, any such physical dump would be limited to restoring a whole
> cluster as-is: no imports into other clusters, no selectivity, no fancy
> games.  The main reason is you'd have to dump and restore pg_clog along
> with the data files.
> 

But that would not help people who would HAVE to use pg_dump/pg_restore (e.g.
to backup/restore a single schema), would it? Depending on the db size, etc.,
creation of FK constraint(s) may take many hours. How should this be handled
then?

Thanks




__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
ow <oneway_111@yahoo.com> writes:
> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Right, any such physical dump would be limited to restoring a whole
>> cluster as-is: no imports into other clusters, no selectivity, no fancy
>> games.

> But that would not help people who would HAVE to use pg_dump/pg_restore (e.g.
> to backup/restore a single schema), would it? Depending on the db size, etc.,
> creation of FK constraint(s) may take many hours. How should this be handled
> then?

Quite honestly, I think they should check their foreign keys.  In a
partial restore situation there is no guarantee that the referenced
table and the referencing table are being restored at the same time from
the same dump.  An override in that situation looks like a great tool
for shooting yourself in the foot.

People might be more interested in debating this topic with you if we
hadn't discussed it at length just a couple months back.  There wasn't
consensus then that we had to offer an escape hatch, and you've not
offered any argument that wasn't made before.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
ow
Date:
--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Quite honestly, I think they should check their foreign keys.

Generally speaking, I agree. The problem is that verification of FK
constraint(s) may take too long, depending on the size of the db and other
conditions. In my case, on test data, it takes about 1 hour to create tables
and copy the data, then about 40 min to create indexes, then 4.5 hours to
create one (1) FK constraint. In production, we'll have 10-100x more data than
we have for testing.
If we have a problem in production, the time necessary to restore the db is
simply going to kill us.

> People might be more interested in debating this topic with you if we
> hadn't discussed it at length just a couple months back.  There wasn't
> consensus then that we had to offer an escape hatch, and you've not
> offered any argument that wasn't made before.

I'm simply presenting a problem for which I currently do not see any solution
(it's very important for us to be able to restore db within a reasonable amount
of time). If there's no solution and none is planned, then we cannot use pgsql,
can we?

Thanks






__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Greg Stark
Date:
ow <oneway_111@yahoo.com> writes:

> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Quite honestly, I think they should check their foreign keys.

What should I do if I *know* there will be a FK failure but I want to correct
it manually. Perhaps by creating all the necessary target records, perhaps by
deleting or updating the dead references. Perhaps with a mix of these.

As it stands I have to delete the FK constraint, load the table, and fix the
data. Then recreate the FK constraint -- with the danger that I'll get the
definition wrong -- and wait for the constraint to be verified.

If I could disable and reenable the constraint the danger that I would get the
definition wrong would be eliminated. And if I had already done the work to
ensure there were no broken relationships I would optionally be able to skip
the redundant automatic check. I could even have done the verification myself
while the data wasn't live for example.

The database is a tool. It's annoying to have a tool that tries to outsmart
the user.

-- 
greg



Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Tom Lane wrote:

>Andreas Pflug <pgadmin@pse-consulting.de> writes:
>  
>
>>This is somewhat complementary to WAL and PITR. I'm seeking for a fast 
>>way to dump and restore a complete database, like physical file copy, 
>>without shutting down the backend. I was thinking of a BACKUP command 
>>that streams out the files including any indexes and non-vacuumed 
>>tuples. A database recreated from that wouldn't be as clean as a 
>>pg_dump/pg_restored database, but it would be up much faster, and there 
>>wouldn't be any dependency problem.
>>    
>>
>
>It's already intended to support this as part of the PITR work.  The
>idea is you force a checkpoint and then make a tar-format dump of the
>database tree (tar or whatever floats your boat, but anyway a
>filesystem-level backup).  The database need not be stopped while you do
>this, and you don't need a filesystem that can do snapshots or anything
>fancy like that.  The tar dump itself most likely does not represent a
>consistent state of the database by the time you are done making it.
>That is okay, because you have also been archiving off to tape (or
>someplace) all the WAL data generated since that pre-dump checkpoint.
>You can continue archiving the WAL series for however far forward from
>the original dump you feel like.  If you need to recover, you reload the
>database from the tar dump and then replay the WAL series against it.
>This is indistinguishable from a crash recovery situation --- the
>"inconsistent" tar dump looks just like a disk that has received some
>but not all of the updates since the last checkpoint.  Replay will fix it.
>
>The cool thing about this is you can actually bring the DB to the state
>it was in at any specific point in time covered by your WAL archive ---
>just run the WAL replay as far as you want, then stop.  Solves the
>"junior DBA deleted all my data Saturday morning" problem, thus "PITR".
>Now the uncool thing is you need massive amounts of secondary storage to
>archive all that WAL data, 
>
Shouldn't be a problem, since there are few databases out there 
worldwide exceeding today's average disk capacity...

>if your installation has heavy update
>activity.  But it seems to me it would address the need you mention
>above --- you'd just not bother to continue archiving WAL past the end
>of the dump operation.
>
>  
>
PITR is cool, no question, it's more than I've been requesting. When the 
database server burns, I'd be quite happy if I could restore to my 
latest tape's point in time, since the WAL log disk probably isn't 
functional too. So having a fast backup of the snapshot when the backup 
CHECKPOINT was issued would be enough, no WAL replay needed.

>In principle you could do this today, but we don't have enough
>support code in place to make it work smoothly, eg WAL segment files
>aren't labeled with enough identifying information to let you manage
>an archive full of 'em.  Still it doesn't seem that far away.
>  
>

So I issue CHECKPOINT, and tar the cluster or database. Still, I got two 
questions:
- how to restore a single database
- while tar is running, CHECKPOINT(n+1) might be recorded in some files, 
while others have CHECKPOINT(n). How does the backend know to rollback 
to CHECKPOINT(n)?

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Greg Stark wrote:

>If I could disable and reenable the constraint the danger that I would get the
>definition wrong would be eliminated. And if I had already done the work to
>ensure there were no broken relationships I would optionally be able to skip
>the redundant automatic check. I could even have done the verification myself
>while the data wasn't live for example.
>  
>

Since FKs are implemented as trigger, you could disable all triggers on 
the table right now, no? Could be a bit more comfortable, I agree, and 
hope for an upcoming DISABLE TRIGGER command.

While talking about this: I could add ENABLE/DISABLE TRIGGER 
functionality to pgadmin3. Unfortunately, on pg_trigger.tgenabled 
there's still the comment "not presently checked everywhere it should 
be, so disabling a trigger by setting this to false does not work 
reliably". I wonder if this is still true for 7.4. I can't imagine that 
this should be so hard to fix.

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
>> In principle you could do this today, but we don't have enough
>> support code in place to make it work smoothly, eg WAL segment files
>> aren't labeled with enough identifying information to let you manage
>> an archive full of 'em.  Still it doesn't seem that far away.

> So I issue CHECKPOINT, and tar the cluster or database. Still, I got two 
> questions:
> - how to restore a single database

You don't.  As I said, any physical backup is going to be
all-or-nothing.  These techniques are not a replacement for pg_dump.

> - while tar is running, CHECKPOINT(n+1) might be recorded in some files, 
> while others have CHECKPOINT(n). How does the backend know to rollback 
> to CHECKPOINT(n)?

That's part of the management code that we need to write before this
will really be very useful; you need to be able to associate the
starting time of a tar dump with the most recent previous CHECKPOINT
in the WAL logs.  Not hard in principle, just takes some code we ain't
got yet.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Tom Lane wrote:

>>- how to restore a single database
>>    
>>
>
>You don't.  As I said, any physical backup is going to be
>all-or-nothing.  These techniques are not a replacement for pg_dump.
>
>  
>

That's sad. I've been backing up and restoring single databases from a 
cluster frequently, so I'd really like the database to be selectable.

>That's part of the management code that we need to write before this
>will really be very useful; you need to be able to associate the
>starting time of a tar dump with the most recent previous CHECKPOINT
>in the WAL logs.  Not hard in principle, just takes some code we ain't
>got yet.
>  
>

So I lay back patiently (more or less :-)

Regards,
Andreas





Re: pg_restore and create FK without verification check

From
Stephan Szabo
Date:
On Wed, 26 Nov 2003, ow wrote:

> > People might be more interested in debating this topic with you if we
> > hadn't discussed it at length just a couple months back.  There wasn't
> > consensus then that we had to offer an escape hatch, and you've not
> > offered any argument that wasn't made before.
>
> I'm simply presenting a problem for which I currently do not see any solution
> (it's very important for us to be able to restore db within a reasonable amount
> of time). If there's no solution and none is planned, then we cannot use pgsql,
> can we?

You can make your own solution, that's the nice thing about open source
stuff.  If you wanted to go the SET variable route to control alter time
checks of CHECK and FOREIGN KEY constraints, it's almost certainly less
than an hours worth of work.


Re: pg_restore and create FK without verification check

From
Stephan Szabo
Date:
On Wed, 26 Nov 2003, Tom Lane wrote:

> Quite honestly, I think they should check their foreign keys.  In a
> partial restore situation there is no guarantee that the referenced
> table and the referencing table are being restored at the same time from
> the same dump.  An override in that situation looks like a great tool
> for shooting yourself in the foot.
>
> People might be more interested in debating this topic with you if we
> hadn't discussed it at length just a couple months back.  There wasn't
> consensus then that we had to offer an escape hatch, and you've not
> offered any argument that wasn't made before.

I actually thought the majority in the past discussion thought that an
escape hatch was a good idea, but that the discussion broke down in trying
to determine what sort of hatch that might be (iirc, it got off into the
general discussion of disabling constraints for normal operation as
opposed to at alter time).


Re: pg_restore and create FK without verification check

From
Robert Treat
Date:
On Wed, 2003-11-26 at 12:43, Andreas Pflug wrote:
> Greg Stark wrote:
> 
> >If I could disable and reenable the constraint the danger that I would get the
> >definition wrong would be eliminated. And if I had already done the work to
> >ensure there were no broken relationships I would optionally be able to skip
> >the redundant automatic check. I could even have done the verification myself
> >while the data wasn't live for example.
> >  
> >
> 
> Since FKs are implemented as trigger, you could disable all triggers on 
> the table right now, no? Could be a bit more comfortable, I agree, and 
> hope for an upcoming DISABLE TRIGGER command.

ISTM I've done this before... from a pg_dump -Fc backup first do a
pg_dump -s restore (schema only) and then a pg_dump -a
--disable-triggers to load the data without check foreign keys. 

Theres certainly potential for trouble with this method I suppose but it
seems like it accomplish what the original poster requires...


Robert Treat

-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
ow <oneway_111@yahoo.com> writes:
> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Quite honestly, I think they should check their foreign keys.

> Generally speaking, I agree. The problem is that verification of FK
> constraint(s) may take too long, depending on the size of the db and other
> conditions. In my case, on test data, it takes about 1 hour to create tables
> and copy the data, then about 40 min to create indexes, then 4.5 hours to
> create one (1) FK constraint.

If you're seeing this on 7.4, I'd like to see the details of the exact
commands being issued.  If it's not 7.4, it's not a relevant
measurement.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
Stephan Szabo
Date:
On Wed, 26 Nov 2003, Tom Lane wrote:

> ow <oneway_111@yahoo.com> writes:
> > --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Quite honestly, I think they should check their foreign keys.
>
> > Generally speaking, I agree. The problem is that verification of FK
> > constraint(s) may take too long, depending on the size of the db and other
> > conditions. In my case, on test data, it takes about 1 hour to create tables
> > and copy the data, then about 40 min to create indexes, then 4.5 hours to
> > create one (1) FK constraint.
>
> If you're seeing this on 7.4, I'd like to see the details of the exact
> commands being issued.  If it's not 7.4, it's not a relevant

IIRC, he was. I think the thing causing the difference between his times
and the ones we saw typically when doing the tests was that he didn't have
an index on the fktable's referencing column.


Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Stephan Szabo wrote:

>
>
>IIRC, he was. I think the thing causing the difference between his times
>and the ones we saw typically when doing the tests was that he didn't have
>an index on the fktable's referencing column.
>  
>

A common mistake, can't count how often I created this one... And not 
easy to find, because EXPLAIN won't explain triggers.
I'm planning to create some kind of fk index wizard in pgAdmin3, which 
finds out about fks using columns that aren't covered by an appropriate 
index. Maybe this check could be performed (as a NOTICE) when the fk is 
created?

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Wed, 26 Nov 2003, Tom Lane wrote:
>> If you're seeing this on 7.4, I'd like to see the details of the exact
>> commands being issued.  If it's not 7.4, it's not a relevant

> IIRC, he was. I think the thing causing the difference between his times
> and the ones we saw typically when doing the tests was that he didn't have
> an index on the fktable's referencing column.

I'm not convinced it should matter all that much, for the single-query
test method that 7.4 uses.  That's why I wanted to see details.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
ow
Date:
--- Andreas Pflug <pgadmin@pse-consulting.de> wrote:
> Stephan Szabo wrote:
> 
> >
> >
> >IIRC, he was. I think the thing causing the difference between his times
> >and the ones we saw typically when doing the tests was that he didn't have
> >an index on the fktable's referencing column.
> >  
> >
> 
> A common mistake, can't count how often I created this one...

Wrong. It's a mistake to think that you always HAVE to have an index on FK
column. See the links below for more details:

http://archives.postgresql.org/pgsql-admin/2003-11/msg00317.php
http://archives.postgresql.org/pgsql-admin/2003-11/msg00319.php




__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Christopher Kings-Lynne
Date:
> A common mistake, can't count how often I created this one... And not 
> easy to find, because EXPLAIN won't explain triggers.
> I'm planning to create some kind of fk index wizard in pgAdmin3, which 
> finds out about fks using columns that aren't covered by an appropriate 
> index. Maybe this check could be performed (as a NOTICE) when the fk is 
> created?

Weird - I'm planning the exact same thing for phpPgAdmin!

Great minds think alike :P

Chris




Re: pg_restore and create FK without verification check

From
Kevin Brown
Date:
Tom Lane wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
> >> In principle you could do this today, but we don't have enough
> >> support code in place to make it work smoothly, eg WAL segment files
> >> aren't labeled with enough identifying information to let you manage
> >> an archive full of 'em.  Still it doesn't seem that far away.
> 
> > So I issue CHECKPOINT, and tar the cluster or database. Still, I got two 
> > questions:
> > - how to restore a single database
> 
> You don't.  As I said, any physical backup is going to be
> all-or-nothing.  These techniques are not a replacement for pg_dump.

But this is just an artifact of the fact that the WAL is a single
instance-wide entity, rather than a per-database entity.  But since
databases are completely separate entities that cannot be simultaneously
accessed by any query (corrections welcome), there isn't any reason in
principle that the WAL files cannot also be created on a per-database
basis.

I'm sure, of course, that doing so would bring with it a new set of
problems and tradeoffs, so it might not be worth it...


-- 
Kevin Brown                          kevin@sysexperts.com


Re: pg_restore and create FK without verification check

From
Alvaro Herrera
Date:
On Thu, Nov 27, 2003 at 12:40:28AM +0100, Andreas Pflug wrote:

> A common mistake, can't count how often I created this one... And not 
> easy to find, because EXPLAIN won't explain triggers.

That's a pity.  And the lack of EXPLAINing function execution, too.
Maybe it's not that hard to do?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Just treat us the way you want to be treated + some extra allowance
for ignorance."                                    (Michael Brusser)


Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
Kevin Brown <kevin@sysexperts.com> writes:
> Tom Lane wrote:
>> You don't.  As I said, any physical backup is going to be
>> all-or-nothing.  These techniques are not a replacement for pg_dump.

> But this is just an artifact of the fact that the WAL is a single
> instance-wide entity, rather than a per-database entity.  But since
> databases are completely separate entities that cannot be simultaneously
> accessed by any query (corrections welcome), there isn't any reason in
> principle that the WAL files cannot also be created on a per-database
> basis.

WAL is not the bottleneck ... as I already mentioned today, pg_clog (and
more specifically the meaning of transaction IDs) is what really makes a
cluster an indivisible whole at the physical level.

If you want to do separate physical dumps/restores, the answer is to set
up separate clusters (separate postmasters).  Not so hard, is it?
        regards, tom lane


Re: pg_restore and create FK without verification check

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> That's a pity.  And the lack of EXPLAINing function execution, too.
> Maybe it's not that hard to do?

Not sure if it's hard or not, but it'd sure be a nice thing to have.
        regards, tom lane


Re: pg_restore and create FK without verification check

From
Kevin Brown
Date:
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Tom Lane wrote:
> >> You don't.  As I said, any physical backup is going to be
> >> all-or-nothing.  These techniques are not a replacement for pg_dump.
> 
> > But this is just an artifact of the fact that the WAL is a single
> > instance-wide entity, rather than a per-database entity.  But since
> > databases are completely separate entities that cannot be simultaneously
> > accessed by any query (corrections welcome), there isn't any reason in
> > principle that the WAL files cannot also be created on a per-database
> > basis.
> 
> WAL is not the bottleneck ... as I already mentioned today, pg_clog (and
> more specifically the meaning of transaction IDs) is what really makes a
> cluster an indivisible whole at the physical level.
> 
> If you want to do separate physical dumps/restores, the answer is to set
> up separate clusters (separate postmasters).  Not so hard, is it?

Well, aside from the fact that separate clusters have completely separate
user databases, listen on different ports, will compete with other
clusters on the same system for resources that would be better managed
by a single cluster, and generally have to be maintained as completely
separate entities from start to finish, no it's not that hard.  ;-)


The ability to restore a single large database quickly is, I think,
a reasonable request, it's just that right now it's difficult (perhaps
impossible) to satisfy that request.


It's probably something that we'll have to deal with if we want PG to be
useful to people managing really large databases on really, really big
iron, though.


-- 
Kevin Brown                          kevin@sysexperts.com


Re: pg_restore and create FK without verification check

From
ow
Date:
--- Alvaro Herrera <alvherre@dcc.uchile.cl> wrote:
> On Thu, Nov 27, 2003 at 12:40:28AM +0100, Andreas Pflug wrote:
> 
> > A common mistake, can't count how often I created this one... And not 
> > easy to find, because EXPLAIN won't explain triggers.
> 
> That's a pity.  And the lack of EXPLAINing function execution, too.
> Maybe it's not that hard to do?
> 

I'd like to emphasize again that NOT having an index on the FK column is a
perfectly valid approach, despite some opinions to the contrary. In fact,
requiring an index  on FK column(s) when it is not required by the
application's logic IS a mistake since it slows down
inserts/deletes/updates/vacume/reindex/etc on the respective table and wastes
disk space (could be considerable amount on large tables).

Also, FK column index DOES NOT, in general, solve performance issues with FK
verification check. Someone may (and, I'm sure, will) simply have more data or
more constraints.

The only solution here appears to be the "--disable-triggers" option as it was
suggested by Robert Treat. If it works then I'm fine, somehow I did not see
that option in the beginning.

Thanks






__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Alvaro Herrera
Date:
On Wed, Nov 26, 2003 at 10:11:20PM -0800, ow wrote:
> --- Alvaro Herrera <alvherre@dcc.uchile.cl> wrote:
> > On Thu, Nov 27, 2003 at 12:40:28AM +0100, Andreas Pflug wrote:
> > 
> > > A common mistake, can't count how often I created this one... And not 
> > > easy to find, because EXPLAIN won't explain triggers.
> > 
> > That's a pity.  And the lack of EXPLAINing function execution, too.
> > Maybe it's not that hard to do?
> 
> I'd like to emphasize again that NOT having an index on the FK column is a
> perfectly valid approach, despite some opinions to the contrary.

In what scenarios?  I'd easily buy this if you are talking about small
tables.

> Also, FK column index DOES NOT, in general, solve performance issues
> with FK verification check. Someone may (and, I'm sure, will) simply
> have more data or more constraints.

More data?  Hmm ... if you have a lot of data in the referenced table,
you'd better _have_ an index unless you want a lot of seqscans.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La tristeza es un muro entre dos jardines" (Khalil Gibran)


Re: pg_restore and create FK without verification check

From
ow
Date:
--- Alvaro Herrera <alvherre@dcc.uchile.cl> wrote:
> In what scenarios?  I'd easily buy this if you are talking about small
> tables.
> 

Read the message again.





__________________________________
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/


Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
ow wrote:

>I'd like to emphasize again that NOT having an index on the FK column is a
>perfectly valid approach, despite some opinions to the contrary. 
>

OW, you might insist that there are several cases when an index is not 
needed, but I didn't propose to create the index automatically (this 
really shouldn't happen), but merely to send a NOTICE to the user so he 
can check if he maybe eventually by chance in doubt should create an 
index. Databases will certainly suffer much more from missing indexes 
than from too many indexes.

Regards,
Andreas




Re: pg_restore and create FK without verification check

From
Andreas Pflug
Date:
Kevin Brown wrote:

>>WAL is not the bottleneck ... as I already mentioned today, pg_clog (and
>>more specifically the meaning of transaction IDs) is what really makes a
>>cluster an indivisible whole at the physical level.
>>
>
>
>The ability to restore a single large database quickly is, I think,
>a reasonable request, it's just that right now it's difficult (perhaps
>impossible) to satisfy that request.
>
>  
>


I could live perfectly with a single database restore solution that 
can't cope with WAL, but merely contains the very snapshot present at 
the CHECKPOINT when the backup started.

Additionally, I could imagine a restore where only one db is restored, 
and the WAL is replayed from the complete cluster backup set, while 
ignoring all WAL entries not meant for the database in restauration.
Imagine you have a full backup at midnight, and at at 5PM you say "sh*t, 
I need to have an 11am PITR on my ABC database, while leaving the other 
five in the cluster untouched". I'd drop that offending DB, restore it, 
and replay that WAL.
Does this sound too esoteric?

Regards,
Andreas



Re: pg_restore and create FK without verification check

From
Jan Wieck
Date:
ow wrote:

> --- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> People might be more interested in debating this topic with you if we
>> hadn't discussed it at length just a couple months back.  There wasn't
>> consensus then that we had to offer an escape hatch, and you've not
>> offered any argument that wasn't made before.
> 
> I'm simply presenting a problem for which I currently do not see any solution
> (it's very important for us to be able to restore db within a reasonable amount
> of time). If there's no solution and none is planned, then we cannot use pgsql,
> can we?

You're simply presenting a problem that isn't there in the first place. 
If you really feel the need to shoot yourself in the foot, use separate 
schema and data dumps and do the latter with "-X disable-triggers".

And now will you please put it to rest?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: pg_restore and create FK without verification check

From
Oli Sennhauser
Date:
>>> People might be more interested in debating this topic with you if we
>>> hadn't discussed it at length just a couple months back.  There wasn't
>>> consensus then that we had to offer an escape hatch, and you've not
>>> offered any argument that wasn't made before.
>>
>>
>> I'm simply presenting a problem for which I currently do not see any 
>> solution
>> (it's very important for us to be able to restore db within a 
>> reasonable amount
>> of time). If there's no solution and none is planned, then we cannot 
>> use pgsql,
>> can we?
>
>
> You're simply presenting a problem that isn't there in the first 
> place. If you really feel the need to shoot yourself in the foot, use 
> separate schema and data dumps and do the latter with "-X 
> disable-triggers".
>
> And now will you please put it to rest?

If this is not a prio 1 problem, what are then the prio one problems??? 
You are a developer, right? Did you ever manage a big database in 
production? What shoul I tell to my customers when they want to have a 
not that big database (100 GB) in PostgreSQL: "I am sorry, but we are 
not able to do performant backups, I recommend you to choos ORACLE 
instead???". Is it this we/you recommend?

Regards Oli

-------------------------------------------------------

Oli Sennhauser
Database-Engineer (Oracle & PostgreSQL)
Rebenweg 6
CH - 8610 Uster / Switzerland

Phone (+41) 1 940 24 82 or Mobile (+41) 79 450 49 14
e-Mail oli.sennhauser@bluewin.ch
Website http://mypage.bluewin.ch/shinguz/PostgreSQL/

Secure (signed/encrypted) e-Mail with a Free Personal SwissSign ID: http://www.swisssign.ch

Import the SwissSign Root Certificate: http://swisssign.net/cgi-bin/trust/import


Re: pg_restore and create FK without verification check

From
Jan Wieck
Date:
Oli Sennhauser wrote:

>>>> People might be more interested in debating this topic with you if we
>>>> hadn't discussed it at length just a couple months back.  There wasn't
>>>> consensus then that we had to offer an escape hatch, and you've not
>>>> offered any argument that wasn't made before.
>>>
>>>
>>> I'm simply presenting a problem for which I currently do not see any 
>>> solution
>>> (it's very important for us to be able to restore db within a 
>>> reasonable amount
>>> of time). If there's no solution and none is planned, then we cannot 
>>> use pgsql,
>>> can we?
>>
>>
>> You're simply presenting a problem that isn't there in the first 
>> place. If you really feel the need to shoot yourself in the foot, use 
>> separate schema and data dumps and do the latter with "-X 
>> disable-triggers".
>>
>> And now will you please put it to rest?
> 
> If this is not a prio 1 problem, what are then the prio one problems??? 

Did you read my mail or only that last sentence?

> You are a developer, right? Did you ever manage a big database in 
> production? What shoul I tell to my customers when they want to have a 
> not that big database (100 GB) in PostgreSQL: "I am sorry, but we are 
> not able to do performant backups, I recommend you to choos ORACLE 
> instead???". Is it this we/you recommend?

Among many other things I am a developer too, and I have managed 
customer databases up to 1.2 TB. But I wonder what you are.

You should tell your customers that they have to dump their databases as
    pg_dump -d swisscheese >swisscheese.schema.dump    pg_dump -a -X disable-triggers swisscheese
>swisscheese.data.dump

This is what I recommended in my previous mail. Is that an unacceptable 
solution for your customers or what is the problem?


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #