Thread: Re: [GENERAL] 7.4Beta

Re: [GENERAL] 7.4Beta

From
Bruce Momjian
Date:
[ Beta discussion moved to hackers.]

Peter Childs wrote:
>     Just loaded up 7.4Beta on a test system, to try and work out what the major 
> changes are/improvements.... I also started up pg_vacuum as this is now 
> working very well on our semi-live system (semi-live because we have not 
> finished developing the system yet.....)
>     Anyway. After trying to throw our 7.3 config file at it and it crashed badly 
> I resorted to the defaults. some of the settings in the config file have 
> changed. (Can somone make the config files backward compatable so at least 
> they ignor depricated settings!)

There will be a lost of config name and other changes in a special
section soon.

>     I throw last nights backup at it. Data went in in about 1/2 an hour then the 
> constraints went in and they took at age. about 2 hours.....
>     Is there anyway to speed up the database constraint code? Because quite 
> frankly at the current speed your probably better off without the 
> constraints.... (Same problem with 7.3 come to think about it.)
>     Otherwise 7.4. seams fine. 
>     Question, when is replication and rolling backup comming. I want to be able 
> to take the last backup and a list of all the updates since then and get back 
> to the current database quickly.

Rolling backups should be in 7.5, I hope.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [GENERAL] 7.4Beta

From
"Christopher Kings-Lynne"
Date:
> > I throw last nights backup at it. Data went in in about 1/2 an hour then
the
> > constraints went in and they took at age. about 2 hours.....
> > Is there anyway to speed up the database constraint code? Because quite
> > frankly at the current speed your probably better off without the
> > constraints.... (Same problem with 7.3 come to think about it.)

I can also attest to the horrendously long time it takes to restore the ADD
FOREIGN KEY section...

Chris



Re: [GENERAL] 7.4Beta

From
Stephan Szabo
Date:
On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:

>
> > > I throw last nights backup at it. Data went in in about 1/2 an hour then
> the
> > > constraints went in and they took at age. about 2 hours.....
> > > Is there anyway to speed up the database constraint code? Because quite
> > > frankly at the current speed your probably better off without the
> > > constraints.... (Same problem with 7.3 come to think about it.)
>
> I can also attest to the horrendously long time it takes to restore the ADD
> FOREIGN KEY section...

That really needs to be rewritten to do a single check over the table
rather than running the constraint for every row.  I keep meaning to get
around to it and never actually do. :(  I'm not sure that in practice
you'll get a better plan at restore time depending on what the default
statistics give you.



Re: [GENERAL] 7.4Beta

From
Gavin Sherry
Date:
On Thu, 14 Aug 2003, Stephan Szabo wrote:

> That really needs to be rewritten to do a single check over the table
> rather than running the constraint for every row.  I keep meaning to get
> around to it and never actually do. :(  I'm not sure that in practice
> you'll get a better plan at restore time depending on what the default
> statistics give you.

Perhaps it would be easier to allow SKIP VALIDATION (or something) with
ALTER TABLE ADD .... which can set FkConstraint->skip_validation. If we're
just handling pg_dump output, then presumably the data is already
validated. On the other handle, it might encourage users to bypass FKs
when they feel like it...

Thanks,

Gavin



Re: [GENERAL] 7.4Beta

From
"Christopher Kings-Lynne"
Date:
> > I can also attest to the horrendously long time it takes to restore the
ADD
> > FOREIGN KEY section...
>
> That really needs to be rewritten to do a single check over the table
> rather than running the constraint for every row.  I keep meaning to get
> around to it and never actually do. :(  I'm not sure that in practice
> you'll get a better plan at restore time depending on what the default
> statistics give you.

Surely in the default case it would reduce to using the new hashed IN()
feature, so it'd be a lot faster?

Chris



Re: [GENERAL] 7.4Beta

From
Andreas Pflug
Date:
Stephan Szabo wrote:

>On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:
>
>  
>
>>>>I throw last nights backup at it. Data went in in about 1/2 an hour then
>>>>        
>>>>
>>the
>>    
>>
>>>>constraints went in and they took at age. about 2 hours.....
>>>>Is there anyway to speed up the database constraint code? Because quite
>>>>frankly at the current speed your probably better off without the
>>>>constraints.... (Same problem with 7.3 come to think about it.)
>>>>        
>>>>
>>I can also attest to the horrendously long time it takes to restore the ADD
>>FOREIGN KEY section...
>>    
>>
>
>That really needs to be rewritten to do a single check over the table
>rather than running the constraint for every row.  I keep meaning to get
>around to it and never actually do. :(  I'm not sure that in practice
>you'll get a better plan at restore time depending on what the default
>statistics give you.
>
This is clearly a case for a statement level trigger, as soon as 
affected rows can be identified.
One remark on that enable/disable triggers stuff: from a user's 
perspective, I wouldn't consider a constraint trigger as a trigger, so 
if I'd disable all triggers on a table, I still would expect all 
constraints to be checked.


Regards,
Andreas



ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:
>> I can also attest to the horrendously long time it takes to restore the ADD
>> FOREIGN KEY section...

> That really needs to be rewritten to do a single check over the table
> rather than running the constraint for every row.  I keep meaning to get
> around to it and never actually do. :(  I'm not sure that in practice
> you'll get a better plan at restore time depending on what the default
> statistics give you.

In simple cases I think that the creation of indexes would be enough to
get you a passable plan --- CREATE INDEX does update pg_class.reltuples,
so the planner will know how big the tables are, and for single-column
primary keys the existence of a unique index is enough to cue the
planner that the column is unique, even without any ANALYZE stats.
Those are the biggest levers on the plan choice.

This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN
KEY; I'm not certain if there's anything to enforce that at the
moment...

I assume what you have in mind is to replace
validateForeignKeyConstraint() with something that does a join of the
two tables via an SPI command.  But supposing that we want to keep the
present ability to report (one of) the failing key values, it seems
like the query has to look likeSELECT keycolumns FROM referencing_table WHEREkeycolumns NOT IN (SELECT refcols FROM
referenced_table);
which is only gonna do the right thing for one of the MATCH styles
(not sure which, offhand ... actually it may not do the right thing
for any match style if there are nulls in referenced_table ...).
How would you make it work for all the MATCH styles?  And will it
really be all that efficient?  (NOT IN is a lot more circumscribed
than IN.)
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Fri, 15 Aug 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Fri, 15 Aug 2003, Christopher Kings-Lynne wrote:
> >> I can also attest to the horrendously long time it takes to restore the ADD
> >> FOREIGN KEY section...
>
> > That really needs to be rewritten to do a single check over the table
> > rather than running the constraint for every row.  I keep meaning to get
> > around to it and never actually do. :(  I'm not sure that in practice
> > you'll get a better plan at restore time depending on what the default
> > statistics give you.
>
> In simple cases I think that the creation of indexes would be enough to
> get you a passable plan --- CREATE INDEX does update pg_class.reltuples,
> so the planner will know how big the tables are, and for single-column
> primary keys the existence of a unique index is enough to cue the
> planner that the column is unique, even without any ANALYZE stats.
> Those are the biggest levers on the plan choice.
>
> This assumes that pg_dump always dumps CREATE INDEX before ADD FOREIGN
> KEY; I'm not certain if there's anything to enforce that at the
> moment...
>
> I assume what you have in mind is to replace
> validateForeignKeyConstraint() with something that does a join of the
> two tables via an SPI command.  But supposing that we want to keep the
> present ability to report (one of) the failing key values, it seems
> like the query has to look like
>     SELECT keycolumns FROM referencing_table WHERE
>     keycolumns NOT IN (SELECT refcols FROM referenced_table);
> which is only gonna do the right thing for one of the MATCH styles
> (not sure which, offhand ... actually it may not do the right thing
> for any match style if there are nulls in referenced_table ...).

Yes, in practice, you'd have to put IS NOT NULL checks in the subselect,
which is fine for the two match types we support since a referenced row
with a NULL isn't a choice for a referenced row for those.  I think MATCH
PARTIAL might have to fall back to the repeated check unless we can make
the query work which would be harder because you only want to compare the
columns for a particular row where the keycolumn case is not null and I
can't think of a query for that that'd be particularly clean and likely to
be fast, then again I don't think the constraint would be either. :(

It'd probably be:
MATCH unspecified:SELECT keycolumns FROM referencing_table WHERE (keycolumns) NOT IN (SELECT refcols FROM
referenced_table  WHERE refcol1 IS NOT NULL AND ... )AND keycolumn1 IS NOT NULL AND ...;
 

MATCH FULL: (something like, I haven't tried it)SELECT keycolumns FROM referencing_table WHERE ((keycolumns) NOT IN
(SELECTrefcols FROM referenced_table  WHERE refcol1 IS NOT NULL AND ...) AND  (keycolumn1 IS NOT NULL AND ...) ) OR
((keycolumn1IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)
 

> How would you make it work for all the MATCH styles?  And will it
> really be all that efficient?  (NOT IN is a lot more circumscribed
> than IN.)

I'm not really sure yet.  Limited tests seem to show that it'll probably
be as fast if not faster for all reasonable cases, but I'd want to
generate a much larger random data set and actually put it in to make a
fair comparison (maybe temporarily with a set to allow people to try both
cases on real world data).  One other advantage here is that we don't need
to get row locks while checking this if we've already gotten the exclusive
table locks on both tables involved.  I'm not sure if we do that currently
though.




Re: [GENERAL] 7.4Beta

From
Stephan Szabo
Date:
On Fri, 15 Aug 2003, Andreas Pflug wrote:

> Stephan Szabo wrote:
>
> >That really needs to be rewritten to do a single check over the table
> >rather than running the constraint for every row.  I keep meaning to get
> >around to it and never actually do. :(  I'm not sure that in practice
> >you'll get a better plan at restore time depending on what the default
> >statistics give you.
> >
> This is clearly a case for a statement level trigger, as soon as
> affected rows can be identified.

Well, I think single inserts might be more expensive (because the query is
more involved for the table joining case) using a statement level trigger,
so we'd probably want to profile the cases.



Re: [GENERAL] 7.4Beta

From
Andreas Pflug
Date:
Stephan Szabo wrote:

>On Fri, 15 Aug 2003, Andreas Pflug wrote:
>
>  
>
>>Stephan Szabo wrote:
>>
>>    
>>
>>>That really needs to be rewritten to do a single check over the table
>>>rather than running the constraint for every row.  I keep meaning to get
>>>around to it and never actually do. :(  I'm not sure that in practice
>>>you'll get a better plan at restore time depending on what the default
>>>statistics give you.
>>>
>>>      
>>>
>>This is clearly a case for a statement level trigger, as soon as
>>affected rows can be identified.
>>    
>>
>
>Well, I think single inserts might be more expensive (because the query is
>more involved for the table joining case) using a statement level trigger,
>so we'd probably want to profile the cases.
>  
>
This really depends. If a constraint is just a check on the 
inserted/updated column, so no other row needs to be checked, there's no 
faster way then the current row trigger. But FK constraints need to 
execute a query to retrieve the referenced row, and every RDBMS prefers 
to execute a single statement with many rows over many statements with a 
single row, because the first will profit from optimization. And even if 
only a single row is inserted or updated, there's still the need to 
lookup the reference.

Regards,
Andreas




Re: [GENERAL] 7.4Beta

From
Stephan Szabo
Date:
On Fri, 15 Aug 2003, Andreas Pflug wrote:

> Stephan Szabo wrote:
>
> >On Fri, 15 Aug 2003, Andreas Pflug wrote:
> >
> >
> >
> >>Stephan Szabo wrote:
> >>
> >>
> >>
> >>>That really needs to be rewritten to do a single check over the table
> >>>rather than running the constraint for every row.  I keep meaning to get
> >>>around to it and never actually do. :(  I'm not sure that in practice
> >>>you'll get a better plan at restore time depending on what the default
> >>>statistics give you.
> >>>
> >>>
> >>>
> >>This is clearly a case for a statement level trigger, as soon as
> >>affected rows can be identified.
> >>
> >>
> >
> >Well, I think single inserts might be more expensive (because the query is
> >more involved for the table joining case) using a statement level trigger,
> >so we'd probably want to profile the cases.
> >
> >
> This really depends. If a constraint is just a check on the
> inserted/updated column, so no other row needs to be checked, there's no
> faster way then the current row trigger. But FK constraints need to
> execute a query to retrieve the referenced row, and every RDBMS prefers
> to execute a single statement with many rows over many statements with a
> single row, because the first will profit from optimization. And even if
> only a single row is inserted or updated, there's still the need to
> lookup the reference.

I don't think that addresses the issue I brought up.  If you're doing a
bunch of single inserts:
begin;
insert into foo values (1);
insert into foo values (1);
insert into foo values (1);
insert into foo values (1);
insert into foo values (1);
end;

Each of those statement triggers is still only going to be dealing with a
single row.  If you're in immediate mode there's not much you can do about
that since the constraint is checked between inserts.  If you're in
deferred mode, right now it won't help because it's not going to batch
them, it's going to be 5 statement triggers AFAICT each with its own 1 row
affected table.

I believe that the more complicated join the old/new table with the pk
table and do the constraint check is going to be slightly slower than the
current row behavior for such cases because the trigger query is going to
be more complicated.  What would be nice would be some way to choose
whether to use a single query per statement vs a simpler query per row
based on what's happening.



Re: [GENERAL] 7.4Beta

From
Andreas Pflug
Date:
Stephan Szabo wrote:

>On Fri, 15 Aug 2003, Andreas Pflug wrote:
>
>  
>
>>Stephan Szabo wrote:
>>
>>    
>>
>>>On Fri, 15 Aug 2003, Andreas Pflug wrote:
>>>
>>>
>>>
>>>      
>>>
>>>>Stephan Szabo wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>
>>>>>That really needs to be rewritten to do a single check over the table
>>>>>rather than running the constraint for every row.  I keep meaning to get
>>>>>around to it and never actually do. :(  I'm not sure that in practice
>>>>>you'll get a better plan at restore time depending on what the default
>>>>>statistics give you.
>>>>>
>>>>>
>>>>>
>>>>>          
>>>>>
>>>>This is clearly a case for a statement level trigger, as soon as
>>>>affected rows can be identified.
>>>>
>>>>
>>>>        
>>>>
>>>Well, I think single inserts might be more expensive (because the query is
>>>more involved for the table joining case) using a statement level trigger,
>>>so we'd probably want to profile the cases.
>>>
>>>
>>>      
>>>
>>This really depends. If a constraint is just a check on the
>>inserted/updated column, so no other row needs to be checked, there's no
>>faster way then the current row trigger. But FK constraints need to
>>execute a query to retrieve the referenced row, and every RDBMS prefers
>>to execute a single statement with many rows over many statements with a
>>single row, because the first will profit from optimization. And even if
>>only a single row is inserted or updated, there's still the need to
>>lookup the reference.
>>    
>>
>
>I don't think that addresses the issue I brought up.  If you're doing a
>bunch of single inserts:
>begin;
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>end;
>
>Each of those statement triggers is still only going to be dealing with a
>single row.  If you're in immediate mode there's not much you can do about
>that since the constraint is checked between inserts.  If you're in
>deferred mode, right now it won't help because it's not going to batch
>them, it's going to be 5 statement triggers AFAICT each with its own 1 row
>affected table.
>
>I believe that the more complicated join the old/new table with the pk
>table and do the constraint check is going to be slightly slower than the
>current row behavior for such cases because the trigger query is going to
>be more complicated.  What would be nice would be some way to choose
>whether to use a single query per statement vs a simpler query per row
>based on what's happening.
>
>  
>

Deferring the constraint check would mean checking 5 single rows, right. 
But I still can't see why you think that a row level trigger would be 
cheaper in this case. I had a look at ri_triggers.c and what's coded 
there looks just as I expected, doing a query on the referenced table. 
the queries might look a bit different when checking multiple rows at 
once, but carefully designed I doubt that there would be a performance 
hit from this. In case it *is* significantly slower, single row updates 
could be handled separately using the current triggers, and statement 
triggers for multiple rows. This would cover both scenarios best. At the 
moment, update/insert scales not too good.

Best thing in the situation above would certainly be if all 5 rows would 
be checked in a single query, but that looks quite impossible because a 
mixture of inserts/updates/deletes on different tables might be deferred.

Regards,
Andreas




Re: [GENERAL] 7.4Beta

From
Stephan Szabo
Date:
On Fri, 15 Aug 2003, Andreas Pflug wrote:

> Stephan Szabo wrote:
>
> >On Fri, 15 Aug 2003, Andreas Pflug wrote:
> >
> >>Stephan Szabo wrote:
> >>
> >>>Well, I think single inserts might be more expensive (because the query is
> >>>more involved for the table joining case) using a statement level trigger,
> >>>so we'd probably want to profile the cases.
> >>>
> >>>
> >>>
> >>>
> >>This really depends. If a constraint is just a check on the
> >>inserted/updated column, so no other row needs to be checked, there's no
> >>faster way then the current row trigger. But FK constraints need to
> >>execute a query to retrieve the referenced row, and every RDBMS prefers
> >>to execute a single statement with many rows over many statements with a
> >>single row, because the first will profit from optimization. And even if
> >>only a single row is inserted or updated, there's still the need to
> >>lookup the reference.
> >>
> >>
> >
> >I don't think that addresses the issue I brought up.  If you're doing a
> >bunch of single inserts:
> >begin;
> >insert into foo values (1);
> >insert into foo values (1);
> >insert into foo values (1);
> >insert into foo values (1);
> >insert into foo values (1);
> >end;
> >
> >Each of those statement triggers is still only going to be dealing with a
> >single row.  If you're in immediate mode there's not much you can do about
> >that since the constraint is checked between inserts.  If you're in
> >deferred mode, right now it won't help because it's not going to batch
> >them, it's going to be 5 statement triggers AFAICT each with its own 1 row
> >affected table.
> >
> >I believe that the more complicated join the old/new table with the pk
> >table and do the constraint check is going to be slightly slower than the
> >current row behavior for such cases because the trigger query is going to
> >be more complicated.  What would be nice would be some way to choose
> >whether to use a single query per statement vs a simpler query per row
> >based on what's happening.
> >
>
> Deferring the constraint check would mean checking 5 single rows, right.
> But I still can't see why you think that a row level trigger would be
> cheaper in this case. I had a look at ri_triggers.c and what's coded
> there looks just as I expected, doing a query on the referenced table.
> the queries might look a bit different when checking multiple rows at
> once, but carefully designed I doubt that there would be a performance
> hit from this. In case it *is* significantly slower, single row updates

I don't know if there will be or not, but in one case it's a single table
select with constant values, in the other it's probably some kind of scan
and subselect. I'm just not going to rule out the possibility, so we
should profile it in large transactions with say 100k single inserts and
see.

> could be handled separately using the current triggers, and statement
> triggers for multiple rows. This would cover both scenarios best. At the

Yep.  I'd wish that it could do it without actually needing to queue up
both triggers, but I don't know how if that'd be possible without tying
some knowledge of the fk functions deeper down.

> Best thing in the situation above would certainly be if all 5 rows would
> be checked in a single query, but that looks quite impossible because a
> mixture of inserts/updates/deletes on different tables might be deferred.

Yeah, the 5 above are pretty easy to show that it's safe, but other cases
and referential action cases won't necessarily be so easy.



Re: [GENERAL] 7.4Beta

From
Andreas Pflug
Date:
Stephan Szabo wrote:

>I don't know if there will be or not, but in one case it's a single table
>select with constant values, in the other it's probably some kind of scan
>and subselect. I'm just not going to rule out the possibility, so we
>should profile it in large transactions with say 100k single inserts and
>see.
>  
>
You're talking about bulk operations, that should be handled carefully 
either. Usually loading all data into a temporary table, and making a 
INSERT INTO xxx SELECT FROM tmptable should give a better performance if 
indices and constraints are concerned. PostgreSQL shouldn't be 
considered to accept the most abusive ways of operation, but it should 
offer a reasonable set of tools enabling the jobs in a convenient way. 
Best situation available is if many small random transactions are 
performed good, for TPC like loads, as well as bulk operations. Nobody 
should expect that a database will smootly convert a bunch of single 
transactions into an optimized bulk one. That's the job of a programmer.

>Yeah, the 5 above are pretty easy to show that it's safe, but other cases
>and referential action cases won't necessarily be so easy.
>
So it's the programmers responsibility to offer mass data to the 
backend, not separate inserts that by chance might be handled in a 
similar way. A RDBMS is not a clairvoyant.

Regards,
Andreas




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
[ continuing a discussion from mid-August ]

Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
>> I assume what you have in mind is to replace
>> validateForeignKeyConstraint() with something that does a join of the
>> two tables via an SPI command.

> It'd probably be:
> MATCH unspecified:
>  SELECT keycolumns FROM referencing_table WHERE
>   (keycolumns) NOT IN (SELECT refcols FROM referenced_table
>     WHERE refcol1 IS NOT NULL AND ... )
>  AND keycolumn1 IS NOT NULL AND ...;

> MATCH FULL: (something like, I haven't tried it)
>  SELECT keycolumns FROM referencing_table WHERE
>   ((keycolumns) NOT IN (SELECT refcols FROM referenced_table
>    WHERE refcol1 IS NOT NULL AND ...)
>   AND
>    (keycolumn1 IS NOT NULL AND ...)
>   )
>   OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)

I thought of what seems to be a better design for the check query: use
a LEFT JOIN and check for NULL in the righthand joined column.  For
example, I think a MATCH UNSPECIFIED on two columns could be tested like
this:

select f1,f2 from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2) where pk.f1 is null and (fk.f1 is not null and fk.f2
isnot null);
 

and MATCH FULL is the same except
 where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null);

MATCH PARTIAL would be harder; I think you'd need to generate a separate
query for each subset of the columns, in which you would probe for
unmatched rows having exactly that subset non-null.  But it could be
done.

Do you see any logical error here?

In some preliminary tests, the planner seems to be able to choose
reasonable plans for this type of query even without pg_statistic data,
as long as it knows the table sizes (which it would do after CREATE INDEX).
So it would work reasonably well during a pg_dump script, I think.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Sat, 27 Sep 2003, Tom Lane wrote:

> [ continuing a discussion from mid-August ]
>
> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> >> I assume what you have in mind is to replace
> >> validateForeignKeyConstraint() with something that does a join of the
> >> two tables via an SPI command.
>
> > It'd probably be:
> > MATCH unspecified:
> >  SELECT keycolumns FROM referencing_table WHERE
> >   (keycolumns) NOT IN (SELECT refcols FROM referenced_table
> >     WHERE refcol1 IS NOT NULL AND ... )
> >  AND keycolumn1 IS NOT NULL AND ...;
>
> > MATCH FULL: (something like, I haven't tried it)
> >  SELECT keycolumns FROM referencing_table WHERE
> >   ((keycolumns) NOT IN (SELECT refcols FROM referenced_table
> >    WHERE refcol1 IS NOT NULL AND ...)
> >   AND
> >    (keycolumn1 IS NOT NULL AND ...)
> >   )
> >   OR ((keycolumn1 IS NOT NULL)!=(keycolumn2 IS NOT NULL) OR ...)
>
> I thought of what seems to be a better design for the check query: use
> a LEFT JOIN and check for NULL in the righthand joined column.  For
> example, I think a MATCH UNSPECIFIED on two columns could be tested like
> this:
>
> select f1,f2
>   from fk left join pk on (fk.f1=pk.f1 and fk.f2=pk.f2)
>   where pk.f1 is null and (fk.f1 is not null and fk.f2 is not null);
>
> and MATCH FULL is the same except
>
>   where pk.f1 is null and (fk.f1 is not null OR fk.f2 is not null);
>
> MATCH PARTIAL would be harder; I think you'd need to generate a separate
> query for each subset of the columns, in which you would probe for
> unmatched rows having exactly that subset non-null.  But it could be
> done.
>
> Do you see any logical error here?
>
> In some preliminary tests, the planner seems to be able to choose
> reasonable plans for this type of query even without pg_statistic data,
> as long as it knows the table sizes (which it would do after CREATE INDEX).
> So it would work reasonably well during a pg_dump script, I think.

Hmm, my initial testing showed that it really was a little slower
than a more complicated one with NOT EXISTS so I'd abandoned it. How does
it fare for you compared to:
select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

I believe the above is the appropriate not exists form for match
unspecified.

I've actually got code (that no longer cleanly applies, but...) that uses
the single query version with NOT EXISTS (which could be easily changed to
either of the other forms) and was planning to put it together for a patch
when 7.5 devel started because I figured it wasn't precisely a bug and
wouldn't get accepted for 7.4.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Stephan Szabo wrote:
> Hmm, my initial testing showed that it really was a little slower
> than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> it fare for you compared to:
> select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
> 
> I believe the above is the appropriate not exists form for match
> unspecified.
> 
> I've actually got code (that no longer cleanly applies, but...) that uses
> the single query version with NOT EXISTS (which could be easily changed to
> either of the other forms) and was planning to put it together for a patch
> when 7.5 devel started because I figured it wasn't precisely a bug and
> wouldn't get accepted for 7.4.

I am a little lost on this point myself --- are we talking 7.4 or 7.5
for this change?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Sat, 27 Sep 2003, Tom Lane wrote:
>> I thought of what seems to be a better design for the check query: use
>> a LEFT JOIN and check for NULL in the righthand joined column.

> Hmm, my initial testing showed that it really was a little slower
> than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> it fare for you compared to:
> select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;

Were you testing against 7.3 or 7.4?  On what kinds of tables?

In 7.4 I think that the JOIN would yield as good or better a plan.  The
best possible plan for the NOT EXISTS query is effectively a nestloop
with inner indexscan, which is great if the FK table is small and the
PK table is large, but it sucks otherwise.  The planner should choose a
plan of this form for the LEFT JOIN given that combination of table
sizes, and so there shouldn't be any great difference in runtime in that
case.  But in other combinations, such as large FK and small PK, other
plan types will beat the pants off nestloop.

> I've actually got code (that no longer cleanly applies, but...) that uses
> the single query version with NOT EXISTS (which could be easily changed to
> either of the other forms) and was planning to put it together for a patch
> when 7.5 devel started because I figured it wasn't precisely a bug and
> wouldn't get accepted for 7.4.

Well, Bruce has this on his open-items list, so I figure we have a green
light to do something for 7.4 if we can work out what to do.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Sun, 28 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Sat, 27 Sep 2003, Tom Lane wrote:
> >> I thought of what seems to be a better design for the check query: use
> >> a LEFT JOIN and check for NULL in the righthand joined column.
>
> > Hmm, my initial testing showed that it really was a little slower
> > than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> > it fare for you compared to:
> > select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> > and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
>
> Were you testing against 7.3 or 7.4?  On what kinds of tables?

7.4 with tables of 1-2 integer key columns with 10k-1m pk rows of
sequential data (in the 2 key case it was value and #rows-value iirc) and
1m-20m fk rows of randomly generated valid data.  But it wasn't any sort
of amazingly detailed test and those aren't huge tables, but I don't
exactly have a huge machine.  I can go back through, do more tests and
report back.

> In 7.4 I think that the JOIN would yield as good or better a plan.  The
> best possible plan for the NOT EXISTS query is effectively a nestloop
> with inner indexscan, which is great if the FK table is small and the
> PK table is large, but it sucks otherwise.  The planner should choose a
> plan of this form for the LEFT JOIN given that combination of table
> sizes, and so there shouldn't be any great difference in runtime in that
> case.  But in other combinations, such as large FK and small PK, other
> plan types will beat the pants off nestloop.

That's what I was expecting too.  I expected it to basically go, NOT IN,
LEFT JOIN, NOT EXISTS in speed (at least when the hashing stuff
happened for in given the not in enhancements), but didn't actually see
that.

> > I've actually got code (that no longer cleanly applies, but...) that uses
> > the single query version with NOT EXISTS (which could be easily changed to
> > either of the other forms) and was planning to put it together for a patch
> > when 7.5 devel started because I figured it wasn't precisely a bug and
> > wouldn't get accepted for 7.4.
>
> Well, Bruce has this on his open-items list, so I figure we have a green
> light to do something for 7.4 if we can work out what to do.

I must have missed that.  I'd have mentioned it earlier then.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Sun, 28 Sep 2003, Bruce Momjian wrote:

> Stephan Szabo wrote:
> > Hmm, my initial testing showed that it really was a little slower
> > than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> > it fare for you compared to:
> > select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> > and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
> >
> > I believe the above is the appropriate not exists form for match
> > unspecified.
> >
> > I've actually got code (that no longer cleanly applies, but...) that uses
> > the single query version with NOT EXISTS (which could be easily changed to
> > either of the other forms) and was planning to put it together for a patch
> > when 7.5 devel started because I figured it wasn't precisely a bug and
> > wouldn't get accepted for 7.4.
>
> I am a little lost on this point myself --- are we talking 7.4 or 7.5
> for this change?

I'd thought 7.5, but I now see that it's on the 7.4 open items list.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> > I've actually got code (that no longer cleanly applies, but...) that uses
> > the single query version with NOT EXISTS (which could be easily changed to
> > either of the other forms) and was planning to put it together for a patch
> > when 7.5 devel started because I figured it wasn't precisely a bug and
> > wouldn't get accepted for 7.4.
> 
> Well, Bruce has this on his open-items list, so I figure we have a green
> light to do something for 7.4 if we can work out what to do.

I put it on because I wasn't clear exactly what was happening in the
discussion.  There also was discussion that we want to improve this now
because everyone will be using for upgrading to 7.4, and with a ~50% db
reload speed improvement, it is hard to ignore.

I am not against the idea of adding it to 7.4 if we can do it cleanly,
and in fact we are sort of waiting for more serious bug reports at this
time, so doing something else to improve the code isn't out of the
question if we can do it without stumbling --- seems dump/reload gets
full attention only during beta, which makes sense.

However, I think we have to be honest that this is a performance
_improvement_, not a fix.  Yea, you can say it was a bug that we did it
the way we did in the past, but you have to look real hard to see it
that way.  :-)

Let's have multiple people eyeball the patch and give it an OK and we
can add it for 7.4 if people want it.  If you look really hard, you can
say it is a fix for a missing pg_upgrade!

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Let's have multiple people eyeball the patch and give it an OK and we
> can add it for 7.4 if people want it.

Well, we haven't even *got* a proposed patch yet, but yeah we should
tread carefully.  I do think it'd be okay to apply a patch if we can
come up with one that Stephan and Jan and I all like.  As you say,
dump/reload speed normally doesn't get thought about except at this
stage of the release cycle, so ...
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Let's have multiple people eyeball the patch and give it an OK and we
> > can add it for 7.4 if people want it.
> 
> Well, we haven't even *got* a proposed patch yet, but yeah we should
> tread carefully.  I do think it'd be okay to apply a patch if we can
> come up with one that Stephan and Jan and I all like.  As you say,
> dump/reload speed normally doesn't get thought about except at this
> stage of the release cycle, so ...

OK.  What releases had this slow restore problem?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Well, we haven't even *got* a proposed patch yet, but yeah we should
>> tread carefully.

> OK.  What releases had this slow restore problem?

We introduced it in 7.3 --- before that, FKs were simply dumped as
"create trigger" commands, and there was no check overhead.  So arguably
it is a bug; a performance bug maybe, but that's still a bug.  No one
has yet gone through a dump/reload cycle in which they had to face this
penalty.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> Well, we haven't even *got* a proposed patch yet, but yeah we should
> >> tread carefully.
> 
> > OK.  What releases had this slow restore problem?
> 
> We introduced it in 7.3 --- before that, FKs were simply dumped as
> "create trigger" commands, and there was no check overhead.  So arguably
> it is a bug; a performance bug maybe, but that's still a bug.  No one
> has yet gone through a dump/reload cycle in which they had to face this
> penalty.

Now that is a strong argument. I knew you would find one.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
You could just as easily argue that the lack of integrity testing at 
data load time was equally a bug.

I think we need someway of telling postgres to suppress a foreign key check.

The main problem is that the foreign key column is often not indexed.

Chris

Bruce Momjian wrote:
> Tom Lane wrote:
> 
>>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>
>>>Tom Lane wrote:
>>>
>>>>Well, we haven't even *got* a proposed patch yet, but yeah we should
>>>>tread carefully.
>>
>>>OK.  What releases had this slow restore problem?
>>
>>We introduced it in 7.3 --- before that, FKs were simply dumped as
>>"create trigger" commands, and there was no check overhead.  So arguably
>>it is a bug; a performance bug maybe, but that's still a bug.  No one
>>has yet gone through a dump/reload cycle in which they had to face this
>>penalty.
> 
> 
> Now that is a strong argument. I knew you would find one.  :-)
> 




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> You could just as easily argue that the lack of integrity testing at 
> data load time was equally a bug.
> 
> I think we need someway of telling postgres to suppress a foreign key check.
> 
> The main problem is that the foreign key column is often not indexed.

As I remember, the new code is showing full table checks of a few
seconds, rather than minutes, but I agree we do need a way to turn off
checks some times.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> I think we need someway of telling postgres to suppress a foreign key check.

Well, the subtext argument here is "do we fix it by providing a way to
suppress the check, or do we fix it by making the check fast enough to
be tolerable?"

I think the advantages of choice (b) are obvious --- it doesn't allow
bogus data to be loaded accidentally, and it doesn't create a problem
with loading existing 7.3 dump files that don't know how to suppress the
check.

If we find there is no way to do (b) acceptably well, then and only then
would I want to consider (a).
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
> I think the advantages of choice (b) are obvious --- it doesn't allow
> bogus data to be loaded accidentally, and it doesn't create a problem
> with loading existing 7.3 dump files that don't know how to suppress the
> check.

OK, I didn't realise there was a (b).  I volunteer to do speed tests on 
data reloading on real data for our site.

Chris





Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Andreas Pflug
Date:
Christopher Kings-Lynne wrote:

> You could just as easily argue that the lack of integrity testing at 
> data load time was equally a bug.
>
> I think we need someway of telling postgres to suppress a foreign key 
> check.
>
> The main problem is that the foreign key column is often not indexed. 

So a db designer made a bloody mistake.
The problem is there's no easy way to find out what's missing.
I'd really like EXPLAIN to display all subsequent triggered queries 
also, to see the full scans caused by missing indexes.

Regards,
Andreas




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
> So a db designer made a bloody mistake.
> The problem is there's no easy way to find out what's missing.
> I'd really like EXPLAIN to display all subsequent triggered queries 
> also, to see the full scans caused by missing indexes.

I'd sure second that!

Chris





Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Hannu Krosing
Date:
Andreas Pflug kirjutas E, 29.09.2003 kell 12:04:
> Christopher Kings-Lynne wrote:
> 
> > You could just as easily argue that the lack of integrity testing at 
> > data load time was equally a bug.
> >
> > I think we need someway of telling postgres to suppress a foreign key 
> > check.
> >
> > The main problem is that the foreign key column is often not indexed. 
> 
> So a db designer made a bloody mistake.
> The problem is there's no easy way to find out what's missing.
> I'd really like EXPLAIN to display all subsequent triggered queries 
> also, to see the full scans caused by missing indexes.

It could probably be doable for EXPLAIN ANALYZE (by actually tracing
execution), but then you will see really _all_ queries, i.e. for a 1000
row update you would see 1 UPDATE query and 1000 fk checks ...

OTOH, you probably can get that already from logs with right logging
parameters.

-------------
Hannu



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
>>So a db designer made a bloody mistake.
>>The problem is there's no easy way to find out what's missing.
>>I'd really like EXPLAIN to display all subsequent triggered queries 
>>also, to see the full scans caused by missing indexes.
> 
> 
> It could probably be doable for EXPLAIN ANALYZE (by actually tracing
> execution), but then you will see really _all_ queries, i.e. for a 1000
> row update you would see 1 UPDATE query and 1000 fk checks ...
> 
> OTOH, you probably can get that already from logs with right logging
> parameters.

Actually - it shouldn't be too hard to write a query that returns all 
unindexed foreign keys, surely?

Chris




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Shridhar Daithankar
Date:
On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote:
> >>So a db designer made a bloody mistake.
> >>The problem is there's no easy way to find out what's missing.
> >>I'd really like EXPLAIN to display all subsequent triggered queries
> >>also, to see the full scans caused by missing indexes.
> >
> > It could probably be doable for EXPLAIN ANALYZE (by actually tracing
> > execution), but then you will see really _all_ queries, i.e. for a 1000
> > row update you would see 1 UPDATE query and 1000 fk checks ...
> >
> > OTOH, you probably can get that already from logs with right logging
> > parameters.
>
> Actually - it shouldn't be too hard to write a query that returns all
> unindexed foreign keys, surely?

Correct me if I am wrong but I remember postgresql throwing error that foreign 
key field was not unique in foreign table. Obviously it can not detect that 
without an index. Either primary key or unique constraint would need an 
index.

What am I missing here?


IOW, how do I exactly create foreign keys without an index?
Shridhar



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
"Nigel J. Andrews"
Date:
On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote:

> > So a db designer made a bloody mistake.
> > The problem is there's no easy way to find out what's missing.
> > I'd really like EXPLAIN to display all subsequent triggered queries 
> > also, to see the full scans caused by missing indexes.
> 
> I'd sure second that!

That's only partially determinable though. The trigger code could branch and
run two different queries depending on the values supplied in the 'input' tuple
of the trigger.

Nigel




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Hannu Krosing
Date:
Shridhar Daithankar kirjutas E, 29.09.2003 kell 13:34:
> On Monday 29 September 2003 15:58, Christopher Kings-Lynne wrote:
> > >>So a db designer made a bloody mistake.
> > >>The problem is there's no easy way to find out what's missing.
> > >>I'd really like EXPLAIN to display all subsequent triggered queries
> > >>also, to see the full scans caused by missing indexes.
> > >
> > > It could probably be doable for EXPLAIN ANALYZE (by actually tracing
> > > execution), but then you will see really _all_ queries, i.e. for a 1000
> > > row update you would see 1 UPDATE query and 1000 fk checks ...
> > >
> > > OTOH, you probably can get that already from logs with right logging
> > > parameters.
> >
> > Actually - it shouldn't be too hard to write a query that returns all
> > unindexed foreign keys, surely?
> 
> Correct me if I am wrong but I remember postgresql throwing error that foreign 
> key field was not unique in foreign table. Obviously it can not detect that 
> without an index. Either primary key or unique constraint would need an 
> index.
> 
> What am I missing here?
> 
> 
> IOW, how do I exactly create foreign keys without an index?

hannu=# create table pkt(i int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
'pkt_pkey' for table 'pkt'
CREATE TABLE
hannu=# create table fkt(j int references pkt);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE TABLE
hannu=#

now the *foreygn key* column (fkt.j) is without index. As foreign keys
are enforced both ways, this can be a problem when changing table pkt or
bulk creating FK's on big tables.

----------------
Hannu



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Andreas Pflug
Date:
Nigel J. Andrews wrote:

>On Mon, 29 Sep 2003, Christopher Kings-Lynne wrote:
>
>  
>
>>>So a db designer made a bloody mistake.
>>>The problem is there's no easy way to find out what's missing.
>>>I'd really like EXPLAIN to display all subsequent triggered queries 
>>>also, to see the full scans caused by missing indexes.
>>>      
>>>
>>I'd sure second that!
>>    
>>
>
>That's only partially determinable though. The trigger code could branch and
>run two different queries depending on the values supplied in the 'input' tuple
>of the trigger.
>
That would be ok; if I got a problem with a certain query, I don't 
expect to find problems I might get with other queries. Though this 
would be nice, how about a general pg_gimme_all_problems() function :-)

Regards,
Andreas




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Greg Stark
Date:

> > > So a db designer made a bloody mistake.

Not necessarily. If I'm never going to update or delete from the parent table
the index would be useless. I find very few of my foreign key relationships
actually need indexes on the child table. I usually only have the unique index
on the parent table.

And often the child table is the big table. The index would be very large and
have awful selectivity. The last thing I want is a 5-million record table with
half a dozen indexes each with 10-20 unique values.

> > > The problem is there's no easy way to find out what's missing.
> > > I'd really like EXPLAIN to display all subsequent triggered queries 
> > > also, to see the full scans caused by missing indexes.
> > 
> > I'd sure second that!

I think the root of problem here is the same as the root of the problem with
foreign key checks being slow for large batch updates and inserts. Namely that
foreign key constraint checks are being handled as a million small queries.

To handle foreign key constraints optimally they would really have to be
merged into the plan in a kind of join. For most inserts/updates something
like a nested-loop join that is effectively the same as the current triggers
would be used. But for large batch updates/inserts it's quite possible that it
would look more like a hash join or even a merge join.

To do that would probably mean throwing out the whole trigger-based
implementation though, which seems like an awfully big project.

And being able to disable and reenable constraints would still be nice.
They're never going to be instantaneous. And besides, speed isn't the only
reason to want to disable constraints temporarily. The database is a tool, it
should be there to do the DBA's bidding, not the other way around :)

-- 
greg



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Sun, 28 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Sat, 27 Sep 2003, Tom Lane wrote:
> >> I thought of what seems to be a better design for the check query: use
> >> a LEFT JOIN and check for NULL in the righthand joined column.
>
> > Hmm, my initial testing showed that it really was a little slower
> > than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> > it fare for you compared to:
> > select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> > and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
>
> Were you testing against 7.3 or 7.4?  On what kinds of tables?
>
> In 7.4 I think that the JOIN would yield as good or better a plan.  The
> best possible plan for the NOT EXISTS query is effectively a nestloop
> with inner indexscan, which is great if the FK table is small and the
> PK table is large, but it sucks otherwise.  The planner should choose a
> plan of this form for the LEFT JOIN given that combination of table
> sizes, and so there shouldn't be any great difference in runtime in that
> case.  But in other combinations, such as large FK and small PK, other
> plan types will beat the pants off nestloop.

As an update, so far I still am getting better results with NOT EXISTS
than the left join.  For a 50m row fk, 10k row pk where the rows are just
the keys, I'm getting a plan like
Merge JoinIndex scan on pktableSort Seqscan on fktable

which is taking about 2-4 times longer for me than the not exists
depending on sort_mem (at 4096,64000,128000).

When I lowered random_page_cost to 1, I got an indexscan on fktable, but
that hadn't seemed to finish after about 2 hours (as opposed to about
30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the
sort+seqscan version.

I want to do some more tests where there's extraneous data in both tables
and see what that does to the results.



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Mon, 29 Sep 2003, Stephan Szabo wrote:

> When I lowered random_page_cost to 1, I got an indexscan on fktable, but
> that hadn't seemed to finish after about 2 hours (as opposed to about
> 30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the

Small correction, I'd meant to type 20-35 minutes above, not 30-35.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> As an update, so far I still am getting better results with NOT EXISTS
> than the left join.

Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
happening?  This is clearly a planner failure, although I'm unsure if we
can expect the planner to get the right answer with no pg_statistic entries.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Mon, 29 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > As an update, so far I still am getting better results with NOT EXISTS
> > than the left join.
>
> Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
> happening?  This is clearly a planner failure, although I'm unsure if we
> can expect the planner to get the right answer with no pg_statistic entries.

For the sort+seq one and the not exists, I had, but I'll re-run it (it's
on my home desktop that I won't be able to access).  The other when I
forced it to use an index scan I haven't let complete yet, and I don't
know how long that will take.

I was also planning to run a set after running analyze, so I'll include
those too.  It'll probably be a few hours before the results are in. :)

Are there any other options (enable_mergejoin, etc) that you want me to
try with?




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Mon, 29 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > As an update, so far I still am getting better results with NOT EXISTS
> > than the left join.
>
> Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
> happening?  This is clearly a planner failure, although I'm unsure if we
> can expect the planner to get the right answer with no pg_statistic entries.

The left join one seems to give me values like the following:

At sort_mem=4096Merge Right Join  (cost=9966071.76..10349763.45 rows=49501250 width=4)
(actual time=4383435.48..4383435.48 rows=0 loops=1)  Merge Cond: (("outer".a = "inner".b) AND ("outer".b = "inner".c))
Filter:("outer".a IS NULL)  ->  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
 
rows=1000 width=8) (actual time=17.82..1609.97 rows=10000 loops=1)  ->  Sort  (cost=9966071.76..10089824.88
rows=49501250width=8) (actual
 
time=3876614.87..4157850.82 rows=50000000 loops=1)        Sort Key: fktest.b, fktest.c        ->  Seq Scan on fktest
(cost=0.00..745099.00rows=49501250
 
width=8) (actual time=4.09..273798.65 rows=50000000 loops=1)              Filter: ((b IS NOT NULL) AND (c IS NOT
NULL))Totalruntime: 4384366.79 msec
 
(9 rows)

At sort_mem=128000Merge Right Join  (cost=69.32..134.00 rows=991 width=4) (actual
time=2183787.83..2183787.83 rows=0 loops=1)  Merge Cond: (("outer".a = "inner".b) AND ("outer".b = "inner".c))  Filter:
("outer".aIS NULL)  ->  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
 
rows=1000 width=8) (actual time=13.11..390.40 rows=10000
loops=1)  ->  Sort  (cost=69.32..71.79 rows=991 width=8) (actual
time=1944240.67..2048954.65 rows=50000000 loops=1)        Sort Key: fktest.b, fktest.c        ->  Seq Scan on fktest
(cost=0.00..20.00rows=991 width=8)
 
(actual time=2.61..225967.79 rows=50000000 loops=1)              Filter: ((b IS NOT NULL) AND (c IS NOT NULL))Total
runtime:2184348.78 msec
 

I haven't finished a run with it doing an index scan on fktestyet,
still...

The not exists gives me:
Seq Scan on fktest  (cost=0.00..242021289.48 rows=24750625 width=8)
(actual time=2032607.68..2032607.68 rows=0 loops=1)  Filter: ((b IS NOT NULL) AND (c IS NOT NULL) AND (NOT (subplan)))
SubPlan   ->  Index Scan using pktest_a_key on pktest  (cost=0.00..4.83 rows=1
 
width=0) (actual time=0.03..0.03 rows=1 loops=50000000)          Index Cond: ((a = $0) AND (b = $1))Total runtime:
2032607.87msec
 
(6 rows)

But this time was one of the higher times for this query.  I'd seen times
down at about 1400000 msec yesterday.



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Andreas Pflug
Date:
Greg Stark wrote:

>  
>
>>>>So a db designer made a bloody mistake.
>>>>        
>>>>
>
>Not necessarily. If I'm never going to update or delete from the parent table
>the index would be useless. I find very few of my foreign key relationships
>actually need indexes on the child table. I usually only have the unique index
>on the parent table.
>
>And often the child table is the big table. The index would be very large and
>have awful selectivity. The last thing I want is a 5-million record table with
>half a dozen indexes each with 10-20 unique values.
>
>  
>
>>>>The problem is there's no easy way to find out what's missing.
>>>>I'd really like EXPLAIN to display all subsequent triggered queries 
>>>>also, to see the full scans caused by missing indexes.
>>>>        
>>>>
>>>I'd sure second that!
>>>      
>>>
>
>I think the root of problem here is the same as the root of the problem with
>foreign key checks being slow for large batch updates and inserts. Namely that
>foreign key constraint checks are being handled as a million small queries.
>
Apart from missing indices, this is certainly a problem. Statement level 
triggers will solve this, as soon as they are fully implemented and 
support OLD and NEW record sets.

Regards,
Andreas



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
>> Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
>> happening?  This is clearly a planner failure, although I'm unsure if we
>> can expect the planner to get the right answer with no pg_statistic entries.

> The left join one seems to give me values like the following:

There are some fishy row estimates in here:

>    ->  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
> rows=1000 width=8) (actual time=17.82..1609.97 rows=10000 loops=1)

The system definitely should be expected to have the accurate row count
for the PK table, since an index should have been created on it (and we
do do that after loading the data, no?).  It is possible that it'd have
the default 1000 estimate for the FK table, if there are no indexes at
all on the FK table; otherwise it should have the right number.  It's
not real clear to me what conditions you're testing under, but the
estimates in the plans you're quoting aren't consistent ...
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Mon, 29 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> >> Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
> >> happening?  This is clearly a planner failure, although I'm unsure if we
> >> can expect the planner to get the right answer with no pg_statistic entries.
>
> > The left join one seems to give me values like the following:
>
> There are some fishy row estimates in here:
>
> >    ->  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
> > rows=1000 width=8) (actual time=17.82..1609.97 rows=10000 loops=1)
>
> The system definitely should be expected to have the accurate row count
> for the PK table, since an index should have been created on it (and we
> do do that after loading the data, no?).  It is possible that it'd have
> the default 1000 estimate for the FK table, if there are no indexes at
> all on the FK table; otherwise it should have the right number.  It's
> not real clear to me what conditions you're testing under, but the
> estimates in the plans you're quoting aren't consistent ...

Well, they're all from the same load of the same data with only stopping
and starting in between, but I did make the index on the pk table first
loaded the data and then built the fk table index ( because I'd wanted to
try without the index as well), which meant that it wouldn't match the
behavior of a dump. Ugh, I'd forgotten that the primary key didn't get
created until later too.

Okay, that's much better:Hash Left Join  (cost=203.00..1487869.29 rows=49501250 width=4) (actual
time=611632.67..611632.67 rows=0 loops=1)  Hash Cond: (("outer".b = "inner".a) AND ("outer".c = "inner".b))  Filter:
("inner".aIS NULL)  ->  Seq Scan on fktest  (cost=0.00..745099.00 rows=49501250 width=8)
 
(actual time=0.01..169642.48 rows=50000000 loops=1)        Filter: ((b IS NOT NULL) AND (c IS NOT NULL))  ->  Hash
(cost=152.00..152.00rows=10000 width=8) (actual
 
time=46.04..46.04 rows=0 loops=1)        ->  Seq Scan on pktest  (cost=0.00..152.00 rows=10000 width=8)
(actual time=0.02..21.38 rows=10000 loops=1)Total runtime: 611632.95 msec
(8 rows)

That's much better. :)  As long as the row estimates are reasonable it
seems to be okay, but I do wonder why it chose the merge join for the case
when it thought there was only 1000 rows though.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Jan Wieck
Date:
Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> I think we need someway of telling postgres to suppress a foreign key check.
> 
> Well, the subtext argument here is "do we fix it by providing a way to
> suppress the check, or do we fix it by making the check fast enough to
> be tolerable?"
> 
> I think the advantages of choice (b) are obvious --- it doesn't allow
> bogus data to be loaded accidentally, and it doesn't create a problem
> with loading existing 7.3 dump files that don't know how to suppress the
> check.
> 
> If we find there is no way to do (b) acceptably well, then and only then
> would I want to consider (a).

I think I can accept it to be the choice of the DBA what to do. Pg_dump 
has that kind of options already, one can choose between COPY and INSERT 
for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
or CREATE CONSTRAINT TRIGGER?

The whole "original" idea (way back a few years ago) of doing it with 
the CREATE CONSTRAINT TRIGGER command was, that your "backup" ought to 
be consistent anyway. Finding out that your tape contains inconsistent 
garbage _after_ your harddisk made that suspicious noise ... is a bit 
late, isn't it?

That ALTER TABLE ... ADD CONSTRAINT needs to be improved, because at the 
moment it is normally used we cannot make any assumptions about data 
consistency, no question. But just because we have such a nice and 
allways asked for ALTER TABLE command does not mean we have to force 
every DBA of every well maintained and stable system to perform hourly 
long nonsense-tests on known-to-be-good data.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
(I think my previous attempt got aborted by a lost connection, so a
message like this may arrive twice)

On Mon, 29 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> >> Hm.  Don't suppose you were using EXPLAIN ANALYZE so we could see what's
> >> happening?  This is clearly a planner failure, although I'm unsure if we
> >> can expect the planner to get the right answer with no pg_statistic entries.
>
> > The left join one seems to give me values like the following:
>
> There are some fishy row estimates in here:
>
> >    ->  Index Scan using pktest_a_key on pktest  (cost=0.00..52.00
> > rows=1000 width=8) (actual time=17.82..1609.97 rows=10000 loops=1)
>
> The system definitely should be expected to have the accurate row count
> for the PK table, since an index should have been created on it (and we
> do do that after loading the data, no?).  It is possible that it'd have
> the default 1000 estimate for the FK table, if there are no indexes at
> all on the FK table; otherwise it should have the right number.  It's
> not real clear to me what conditions you're testing under, but the
> estimates in the plans you're quoting aren't consistent ...

Also, the sequence was basically:
CREATE TABLE pktest(a int, b int, unique(a,b));
CREATE TABLE fktest(b int, c int);
COPY pktest FROM STDIN;
...
COPY fktest FROM STDIN;
...
<run some tests I didn't mention here>
CREATE INDEX fki on fktest(b,c);
<run the above test>

With stopping and restarting the server involved and running the tests
multiple times.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> I think I can accept it to be the choice of the DBA what to do. Pg_dump 
> has that kind of options already, one can choose between COPY and INSERT 
> for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
> or CREATE CONSTRAINT TRIGGER?

We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
loses pg_depend information and (b) it's too low-level a representation;
we couldn't ever change the implementation of foreign keys as long as
dumps look like that.

Also, I don't see why you'd want to make such a choice at pg_dump time.
Probably better to control it at restore time.  Accordingly, my proposal
if we were to go that route would be a boolean GUC variable that simply
prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Jan Wieck
Date:

Tom Lane wrote:

> Jan Wieck <JanWieck@Yahoo.com> writes:
>> I think I can accept it to be the choice of the DBA what to do. Pg_dump 
>> has that kind of options already, one can choose between COPY and INSERT 
>> for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
>> or CREATE CONSTRAINT TRIGGER?
> 
> We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
> loses pg_depend information and (b) it's too low-level a representation;
> we couldn't ever change the implementation of foreign keys as long as
> dumps look like that.

That's finally 2 points, okay.

> 
> Also, I don't see why you'd want to make such a choice at pg_dump time.
> Probably better to control it at restore time.  Accordingly, my proposal
> if we were to go that route would be a boolean GUC variable that simply
> prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.

Okay too. And this would be simple and safe enough to add it at the time 
being.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
> Correct me if I am wrong but I remember postgresql throwing error that foreign 
> key field was not unique in foreign table. Obviously it can not detect that 
> without an index. Either primary key or unique constraint would need an 
> index.
> 
> What am I missing here?
> 
> 
> IOW, how do I exactly create foreign keys without an index?

You are taling about the primary key or the referenced key, not the 
foreign key.  (eg. the source column)

Chris




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Jan Wieck wrote:
> 
> 
> Tom Lane wrote:
> 
> > Jan Wieck <JanWieck@Yahoo.com> writes:
> >> I think I can accept it to be the choice of the DBA what to do. Pg_dump 
> >> has that kind of options already, one can choose between COPY and INSERT 
> >> for example. Why not adding the choice of dumping FKeys as ALTER TABLE 
> >> or CREATE CONSTRAINT TRIGGER?
> > 
> > We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
> > loses pg_depend information and (b) it's too low-level a representation;
> > we couldn't ever change the implementation of foreign keys as long as
> > dumps look like that.
> 
> That's finally 2 points, okay.
> 
> > 
> > Also, I don't see why you'd want to make such a choice at pg_dump time.
> > Probably better to control it at restore time.  Accordingly, my proposal
> > if we were to go that route would be a boolean GUC variable that simply
> > prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
> 
> Okay too. And this would be simple and safe enough to add it at the time 
> being.

If we go that direction, why don't we just make a GUC variable to
disable constraint checking.  Is that what this will do, or is it more
limited.  I know it breaks referential integrity, but we have had many
folks as for it, it is on the TODO list, and there are tons of server
functions flying around that do just this by fiddling with pg_class.  I
would rather just have it be a GUC for that particular backend.  People
are going to need to turn it off anyway, so why not give them a clean
way to do it.

Also, how does someone turn it on at restore time if they are piping
into psql?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Jan Wieck wrote:
>> Tom Lane wrote:
>>> if we were to go that route would be a boolean GUC variable that simply
>>> prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
>> 
>> Okay too. And this would be simple and safe enough to add it at the time 
>> being.

> If we go that direction, why don't we just make a GUC variable to
> disable constraint checking.

You mean in general, even for plain old insert/update/delete changes?
Yipes.  What happened to ACID compliance?

What I actually expected to ensue was a discussion about how we could
narrow down the effects of a disable-foreign-key-verification switch to
reduce the odds of shooting oneself in the foot.  (For example, maybe
disallow it from being set in postgresql.conf.)  I wasn't expecting
proposals to enlarge the gauge of the foot-gun ...

> Also, how does someone turn it on at restore time if they are piping
> into psql?

Something like
export PGOPTIONS="-c disable-fk-verification=true"
then run psql or pg_restore.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Tue, 30 Sep 2003, Bruce Momjian wrote:

> Jan Wieck wrote:
> >
> >
> > Tom Lane wrote:
> >
> > > Jan Wieck <JanWieck@Yahoo.com> writes:
> > >> I think I can accept it to be the choice of the DBA what to do. Pg_dump
> > >> has that kind of options already, one can choose between COPY and INSERT
> > >> for example. Why not adding the choice of dumping FKeys as ALTER TABLE
> > >> or CREATE CONSTRAINT TRIGGER?
> > >
> > > We don't want it dumping as CREATE CONSTRAINT TRIGGER, because (a) that
> > > loses pg_depend information and (b) it's too low-level a representation;
> > > we couldn't ever change the implementation of foreign keys as long as
> > > dumps look like that.
> >
> > That's finally 2 points, okay.
> >
> > >
> > > Also, I don't see why you'd want to make such a choice at pg_dump time.
> > > Probably better to control it at restore time.  Accordingly, my proposal
> > > if we were to go that route would be a boolean GUC variable that simply
> > > prevents ALTER TABLE ADD FOREIGN KEY from doing the validity checks.
> >
> > Okay too. And this would be simple and safe enough to add it at the time
> > being.
>
> If we go that direction, why don't we just make a GUC variable to
> disable constraint checking.  Is that what this will do, or is it more
> limited.  I know it breaks referential integrity, but we have had many
> folks as for it, it is on the TODO list, and there are tons of server
> functions flying around that do just this by fiddling with pg_class.  I
> would rather just have it be a GUC for that particular backend.  People
> are going to need to turn it off anyway, so why not give them a clean
> way to do it.

But such a GUC wouldn't affect just one backend.  It'd potentially affect
all backends that were doing concurrent modifications that would be
involved since the locks aren't taken.  In addition, who would be allowed
to set this value and what constraints would it affect? If it's only
superusers, then it doesn't help for non-superuser restores.  If it's
settable by anyone and affects only constraints on tables that user owns
and that refer to tables that user owns it might be okay.  If it's
settable by anyone and affects all tables it renders the constraints
meaningless since anyone could break them.



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> > If we go that direction, why don't we just make a GUC variable to
> > disable constraint checking.
> 
> You mean in general, even for plain old insert/update/delete changes?
> Yipes.  What happened to ACID compliance?
> 
> What I actually expected to ensue was a discussion about how we could
> narrow down the effects of a disable-foreign-key-verification switch to
> reduce the odds of shooting oneself in the foot.  (For example, maybe
> disallow it from being set in postgresql.conf.)  I wasn't expecting
> proposals to enlarge the gauge of the foot-gun ...

Fact is, folks are doing it anyway by modifying pg_class.  I know one
guy who did it in a transaction so he was the only one to see the
triggers disabled!  The PostgreSQL cookbook page has an example too. 
People are always asking how to do this.  Why not just make it setable
only by the super-user.

FYI, TODO has:
* Allow triggers to be disabled [trigger]* With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN  KEY
The second one is the one we are discussing.  If we never want to do it,
I will remove it from the TODO list.  However, I think we might be
making things too controlled by not allowing administrators to do this.


> > Also, how does someone turn it on at restore time if they are piping
> > into psql?
> 
> Something like
> export PGOPTIONS="-c disable-fk-verification=true"
> then run psql or pg_restore.

How many folks are going to remember to do this?  Why make it hard for
them?  Someone is going to forget too easily.  "Why is this restore
taking so long?  Oh, I forgot that switch."  Or they put it in a login
file and forget it is set.  Seems safer for it to be in the dump file.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Stephan Szabo wrote:
> > If we go that direction, why don't we just make a GUC variable to
> > disable constraint checking.  Is that what this will do, or is it more
> > limited.  I know it breaks referential integrity, but we have had many
> > folks as for it, it is on the TODO list, and there are tons of server
> > functions flying around that do just this by fiddling with pg_class.  I
> > would rather just have it be a GUC for that particular backend.  People
> > are going to need to turn it off anyway, so why not give them a clean
> > way to do it.
> 
> But such a GUC wouldn't affect just one backend.  It'd potentially affect
> all backends that were doing concurrent modifications that would be
> involved since the locks aren't taken.  In addition, who would be allowed
> to set this value and what constraints would it affect? If it's only
> superusers, then it doesn't help for non-superuser restores.  If it's
> settable by anyone and affects only constraints on tables that user owns
> and that refer to tables that user owns it might be okay.  If it's
> settable by anyone and affects all tables it renders the constraints
> meaningless since anyone could break them.

I assume it would be only setable by the super-user.  They are mucking
around with pg_class anyway (and have permission to do so), so let them
do it cleanly at least.  Allowing non-supers to do it for tables they
own would be OK, I guess.  Is there a problem if some of the primary
table is owned by someone else?  Not sure.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> How many folks are going to remember to do this?  Why make it hard for
> them?  Someone is going to forget too easily.  "Why is this restore
> taking so long?  Oh, I forgot that switch."  Or they put it in a login
> file and forget it is set.  Seems safer for it to be in the dump file.

I disagree.  The "how many folks are going to remember to do this"
argument applies just as well to magic pg_dump switches; that's not
a tenable argument against doing it at restore time.

The difference between controlling it at pg_dump time and pg_restore
time is that if you change your mind after having made the dump, it's
too late, if the decision was nailed down in the dump file.  In an
upgrade situation it's very likely that you no longer have the option
to re-do your dump, because you already blew away your old installation.

Since there's no performance difference at pg_dump time, I can't see any
advantage to freezing your decision then.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I assume it would be only setable by the super-user.

That might be a good restriction too (on top of my speculation about not
allowing it in postgresql.conf).  Only allow it to be SET per-session,
and only by a superuser.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Tue, 30 Sep 2003, Bruce Momjian wrote:

> Stephan Szabo wrote:
> > > If we go that direction, why don't we just make a GUC variable to
> > > disable constraint checking.  Is that what this will do, or is it more
> > > limited.  I know it breaks referential integrity, but we have had many
> > > folks as for it, it is on the TODO list, and there are tons of server
> > > functions flying around that do just this by fiddling with pg_class.  I
> > > would rather just have it be a GUC for that particular backend.  People
> > > are going to need to turn it off anyway, so why not give them a clean
> > > way to do it.
> >
> > But such a GUC wouldn't affect just one backend.  It'd potentially affect
> > all backends that were doing concurrent modifications that would be
> > involved since the locks aren't taken.  In addition, who would be allowed
> > to set this value and what constraints would it affect? If it's only
> > superusers, then it doesn't help for non-superuser restores.  If it's
> > settable by anyone and affects only constraints on tables that user owns
> > and that refer to tables that user owns it might be okay.  If it's
> > settable by anyone and affects all tables it renders the constraints
> > meaningless since anyone could break them.
>
> I assume it would be only setable by the super-user.  They are mucking
> around with pg_class anyway (and have permission to do so), so let them
> do it cleanly at least.  Allowing non-supers to do it for tables they
> own would be OK, I guess.  Is there a problem if some of the primary
> table is owned by someone else?  Not sure.

The problem I have with a super-user only solution is that it doesn't
solve the problem for restores in general.  I think we need a mechanism
that works for any user that wants to restore a table (or tables) from
dump(s), so for the dump/restore mechanism I think we should be looking in
that direction.


Re: ADD FOREIGN KEY

From
Christopher Browne
Date:
Centuries ago, Nostradamus foresaw when tgl@sss.pgh.pa.us (Tom Lane) would write:
> Since there's no performance difference at pg_dump time, I can't see any
> advantage to freezing your decision then.

This parallels the common suggestion of throwing an ANALYZE in at the
bottom of a pg_dump script.  On that particular note, I'd think it
preferable to analyze after loading each table, since the data for the
specific table will still be in memory.  But that's a _bit_ of a
change of subject.

This looks like something where a "hook" would be valuable such that
there is something in the pg_dump that can be configured AFTER the
fact to control how it's loaded.

It would surely seem valuable to have a way of making loads go As Fast
As Possible, even with the possibility of "breakneck speed" offering
the possibility of actually getting seriously injured (breaking one's
neck?).  If the hardware fails during the recovery, consider that you
were _recovering_ from a _backup_; that surely ought to be an
eminently redoable operation, quite unlike accepting a random SQL
request from a user.

I have done some "recoveries" recently (well, more precisely,
"installs") by taking a tarball of a pre-existing database and
dropping it into place.  I had no problem with the fact that if my
hand slipped and hit ^C at the wrong moment ("quelle horreur!"), I
would be forced to restart the "cd $TARGETDIR; tar xfvz Flex.tgz"
process.

I would be pretty "game" for a near-single-user-mode approach that
would turn off some of the usual functionality that we knew we didn't
need because the data source was an already-committed-and-FK-checked
set of data.
-- 
output = reverse("ac.notelrac.teneerf" "@" "454aa")
http://www.ntlug.org/~cbbrowne/spiritual.html
"Another result of the tyranny of Pascal is that beginners don't use
function pointers."  --Rob Pike


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Stephan Szabo wrote:
> On Tue, 30 Sep 2003, Bruce Momjian wrote:
> 
> > Stephan Szabo wrote:
> > > > If we go that direction, why don't we just make a GUC variable to
> > > > disable constraint checking.  Is that what this will do, or is it more
> > > > limited.  I know it breaks referential integrity, but we have had many
> > > > folks as for it, it is on the TODO list, and there are tons of server
> > > > functions flying around that do just this by fiddling with pg_class.  I
> > > > would rather just have it be a GUC for that particular backend.  People
> > > > are going to need to turn it off anyway, so why not give them a clean
> > > > way to do it.
> > >
> > > But such a GUC wouldn't affect just one backend.  It'd potentially affect
> > > all backends that were doing concurrent modifications that would be
> > > involved since the locks aren't taken.  In addition, who would be allowed
> > > to set this value and what constraints would it affect? If it's only
> > > superusers, then it doesn't help for non-superuser restores.  If it's
> > > settable by anyone and affects only constraints on tables that user owns
> > > and that refer to tables that user owns it might be okay.  If it's
> > > settable by anyone and affects all tables it renders the constraints
> > > meaningless since anyone could break them.
> >
> > I assume it would be only setable by the super-user.  They are mucking
> > around with pg_class anyway (and have permission to do so), so let them
> > do it cleanly at least.  Allowing non-supers to do it for tables they
> > own would be OK, I guess.  Is there a problem if some of the primary
> > table is owned by someone else?  Not sure.
> 
> The problem I have with a super-user only solution is that it doesn't
> solve the problem for restores in general.  I think we need a mechanism
> that works for any user that wants to restore a table (or tables) from
> dump(s), so for the dump/restore mechanism I think we should be looking in
> that direction.

OK.  Let's explore that.  What does ownership mean?  If I grant all
permissions on an object I own to you, what can you not do?  I think
GRANT/REVOKE and ALTER TABLE are the only two ones, right?

So, if I own it, I am the only one who can ALTER the table to add/remove
the foreign key constraint.  So, if I already have a foreign key
constraint on a table, I can easily remove it if I am the owner and do
whatever I want with the table.

Now, the big question is, is there harm in my saying in the system
catalogs that I have a foreign key constraint on a table, when I might
have turned off the constraint via GUC and modified the table so the
foreign key constraint isn't valid?  I think that is the big question
--- is there harm to others in saying something I own has a foreign key,
when it might not?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > How many folks are going to remember to do this?  Why make it hard for
> > them?  Someone is going to forget too easily.  "Why is this restore
> > taking so long?  Oh, I forgot that switch."  Or they put it in a login
> > file and forget it is set.  Seems safer for it to be in the dump file.
> 
> I disagree.  The "how many folks are going to remember to do this"
> argument applies just as well to magic pg_dump switches; that's not
> a tenable argument against doing it at restore time.
> 
> The difference between controlling it at pg_dump time and pg_restore
> time is that if you change your mind after having made the dump, it's
> too late, if the decision was nailed down in the dump file.  In an
> upgrade situation it's very likely that you no longer have the option
> to re-do your dump, because you already blew away your old installation.
> 
> Since there's no performance difference at pg_dump time, I can't see any
> advantage to freezing your decision then.

I understand, and if everyone used pg_restore, then adding a flag to
pg_restore to do this would make sense.  However, everyone is used to
treating that dump file as a simple dump and throwing it into psql. 
Psql doesn't have any special dump flags, so you have to do the
environment variable trick, which you must admit is pretty ugly looking
and prone to typing errors, and forgetting, because they are used to
invoking psql all the time.

Maybe we need a psql dump reload flag?  Would we be able to do any other
optimizations, like increasing sort_mem or something?  That would be a
clean solution, and perhaps allow additional optimizations.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I assume it would be only setable by the super-user.
> 
> That might be a good restriction too (on top of my speculation about not
> allowing it in postgresql.conf).  Only allow it to be SET per-session,

We don't have a way to make something unsetable in postgresql.conf right
now, do we?

> and only by a superuser.

See my recent email on this about "ownership".  I personally am happy
with super-user only (or db-owner and super-user only).  As I said, it
is a question of what documenting a foreign key in the system catalogs
means to folks who don't own the table.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Andreas Pflug
Date:
Bruce Momjian wrote:

>Fact is, folks are doing it anyway by modifying pg_class.  I know one
>guy who did it in a transaction so he was the only one to see the
>triggers disabled!  The PostgreSQL cookbook page has an example too. 
>People are always asking how to do this.  Why not just make it setable
>only by the super-user.
>
>FYI, TODO has:
>
>    * Allow triggers to be disabled [trigger]
>    * With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN
>      KEY
>

For practical reasons, I'd prefer the "disable trigger" not to influence 
fk triggers, or at least to have such a default flavor. When restoring a 
database, you might consider the data as consistent and complete, so no 
triggers and ref checks are needed at all. But in the cases of some kind 
of application data import, you might like the data to have fk ref 
checked, but don't want to trigger all user triggers.
The implementation of fk checking by triggers should normally be hidden 
to the user.

Regards,
Andreas



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Since there's no performance difference at pg_dump time, I can't see any
>> advantage to freezing your decision then.

> I understand, and if everyone used pg_restore, then adding a flag to
> pg_restore to do this would make sense.  However, everyone is used to
> treating that dump file as a simple dump and throwing it into psql. 

So?  A GUC variable could be set equally easily either way.  In fact
more so.

> Psql doesn't have any special dump flags, so you have to do the
> environment variable trick,

You forgot SET ...
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> That might be a good restriction too (on top of my speculation about not
>> allowing it in postgresql.conf).  Only allow it to be SET per-session,

> We don't have a way to make something unsetable in postgresql.conf right
> now, do we?

Yeah, we do --- see GUC_DISALLOW_IN_FILE.  The existing variable
zero_damaged_pages has the same restrictions we're talking about here,
and for largely the same reasons: you can shoot yourself in the foot
with it.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Tue, 30 Sep 2003, Jan Wieck wrote:

> Stephan Szabo wrote:
> > On Tue, 30 Sep 2003, Tom Lane wrote:
> >
> >> I see where Stephan is coming from, but in my mind disabling consistency
> >> checks ought to be a feature reserved to the DBA (ie superuser), who
> >> presumably has some clue about the tradeoffs involved.  I don't think
> >> ordinary users should be able to do it.  If we can get the cost of
> >> performing the initial check down to something reasonable (and I don't
> >> mean "near zero", I mean something that's small in comparison to the
> >> other costs of loading data and creating indexes), then I think we've
> >> done as much as we should do for ordinary users.
> >
> > Limiting the cases under which constraint ignoring works is certainly
> > fine by me, but I was assuming that we were trying to make it accessable
> > to any restore. If that's not true, then we don't need to worry about that
> > part of the issue.
>
> It is not true.
>
> Fact is that restoring can require more rights than creating the dump.
> That is already the case if you want to restore anything that contains
> objects owned by different users. Trying to enable everyone who can take
> a dump also to restore it, by whatever mechanism, gives someone the
> right to revert things in time and create a situation (consistent or
> not) that he could not (re)create without doing dump/restore. This is
> wrong and should not be possible.

I think this is a larger argument than the one that was being discussed
above. Given a dump of objects I own, can I restore them without requiring
the fk check to be done if I alter table add constraint a foreign key? If
the answer to that is no, then the option can be put in as a superuser
only option and it's relatively easy. If the answer to that is yes, then
there are additional issues that need to be resolved.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Jan Wieck
Date:
Stephan Szabo wrote:

> On Tue, 30 Sep 2003, Jan Wieck wrote:
> 
>> Stephan Szabo wrote:
>> > On Tue, 30 Sep 2003, Tom Lane wrote:
>> >
>> >> I see where Stephan is coming from, but in my mind disabling consistency
>> >> checks ought to be a feature reserved to the DBA (ie superuser), who
>> >> presumably has some clue about the tradeoffs involved.  I don't think
>> >> ordinary users should be able to do it.  If we can get the cost of
>> >> performing the initial check down to something reasonable (and I don't
>> >> mean "near zero", I mean something that's small in comparison to the
>> >> other costs of loading data and creating indexes), then I think we've
>> >> done as much as we should do for ordinary users.
>> >
>> > Limiting the cases under which constraint ignoring works is certainly
>> > fine by me, but I was assuming that we were trying to make it accessable
>> > to any restore. If that's not true, then we don't need to worry about that
>> > part of the issue.
>>
>> It is not true.
>>
>> Fact is that restoring can require more rights than creating the dump.
>> That is already the case if you want to restore anything that contains
>> objects owned by different users. Trying to enable everyone who can take
>> a dump also to restore it, by whatever mechanism, gives someone the
>> right to revert things in time and create a situation (consistent or
>> not) that he could not (re)create without doing dump/restore. This is
>> wrong and should not be possible.
> 
> I think this is a larger argument than the one that was being discussed
> above. Given a dump of objects I own, can I restore them without requiring
> the fk check to be done if I alter table add constraint a foreign key? If
> the answer to that is no, then the option can be put in as a superuser
> only option and it's relatively easy. If the answer to that is yes, then
> there are additional issues that need to be resolved.

Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
a consensus that we don't _want_ that. Probably we should declare it 
deprecated and remove it in 7.5. And the option currently under 
discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
that _should_ be restricted.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Tue, 30 Sep 2003, Tom Lane wrote:

> I see where Stephan is coming from, but in my mind disabling consistency
> checks ought to be a feature reserved to the DBA (ie superuser), who
> presumably has some clue about the tradeoffs involved.  I don't think
> ordinary users should be able to do it.  If we can get the cost of
> performing the initial check down to something reasonable (and I don't
> mean "near zero", I mean something that's small in comparison to the
> other costs of loading data and creating indexes), then I think we've
> done as much as we should do for ordinary users.

Limiting the cases under which constraint ignoring works is certainly
fine by me, but I was assuming that we were trying to make it accessable
to any restore. If that's not true, then we don't need to worry about that
part of the issue.

As a side note, in the partial implementation I'd already done, I noticed
a potential problem if the person doing the alter table didn't have read
permissions on the pktable. I'd written it to bail and do the slow check
in that case (well actually in most error cases that didn't themselves
cause an elog), does anyone have a better idea?


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> As a side note, in the partial implementation I'd already done, I noticed
> a potential problem if the person doing the alter table didn't have read
> permissions on the pktable. I'd written it to bail and do the slow check
> in that case (well actually in most error cases that didn't themselves
> cause an elog), does anyone have a better idea?

Wouldn't all the subsequent triggers fail also in such a case?  (For
that matter, wouldn't the existing implementation of the initial check
fail?)  I can't see a reason to expend code to avoid failing here.  It's
not very sensible to be able to create an FK on a table you don't have
read permission for.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Stephan Szabo
Date:
On Tue, 30 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > As a side note, in the partial implementation I'd already done, I noticed
> > a potential problem if the person doing the alter table didn't have read
> > permissions on the pktable. I'd written it to bail and do the slow check
> > in that case (well actually in most error cases that didn't themselves
> > cause an elog), does anyone have a better idea?
>
> Wouldn't all the subsequent triggers fail also in such a case?  (For
> that matter, wouldn't the existing implementation of the initial check
> fail?)  I can't see a reason to expend code to avoid failing here.  It's

No, because the triggers change permissions to the owner of the
appropriate (either fk or pk) table before running the query, so the old
method works as well as the final constraint would. However, if the two
owners are not the same, you can't set to both during the single query.

> not very sensible to be able to create an FK on a table you don't have
> read permission for.

IIRC, you only need references permissions to make an fk constraint, not
select.


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Stephan Szabo wrote:
>> The problem I have with a super-user only solution is that it doesn't
>> solve the problem for restores in general.

> OK.  Let's explore that.  What does ownership mean?

It does not normally mean the ability to bypass consistency checks;
for example, if you put a CHECK constraint on a table, you don't get
to violate it because you own the table.  (Of course superuserness
doesn't let you do so either...)

I see where Stephan is coming from, but in my mind disabling consistency
checks ought to be a feature reserved to the DBA (ie superuser), who
presumably has some clue about the tradeoffs involved.  I don't think
ordinary users should be able to do it.  If we can get the cost of
performing the initial check down to something reasonable (and I don't
mean "near zero", I mean something that's small in comparison to the
other costs of loading data and creating indexes), then I think we've
done as much as we should do for ordinary users.
        regards, tom lane


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Jan Wieck
Date:
Stephan Szabo wrote:
> On Tue, 30 Sep 2003, Tom Lane wrote:
> 
>> I see where Stephan is coming from, but in my mind disabling consistency
>> checks ought to be a feature reserved to the DBA (ie superuser), who
>> presumably has some clue about the tradeoffs involved.  I don't think
>> ordinary users should be able to do it.  If we can get the cost of
>> performing the initial check down to something reasonable (and I don't
>> mean "near zero", I mean something that's small in comparison to the
>> other costs of loading data and creating indexes), then I think we've
>> done as much as we should do for ordinary users.
> 
> Limiting the cases under which constraint ignoring works is certainly
> fine by me, but I was assuming that we were trying to make it accessable
> to any restore. If that's not true, then we don't need to worry about that
> part of the issue.

It is not true.

Fact is that restoring can require more rights than creating the dump. 
That is already the case if you want to restore anything that contains 
objects owned by different users. Trying to enable everyone who can take 
a dump also to restore it, by whatever mechanism, gives someone the 
right to revert things in time and create a situation (consistent or 
not) that he could not (re)create without doing dump/restore. This is 
wrong and should not be possible.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: ADD FOREIGN KEY

From
Manfred Koizar
Date:
On Tue, 30 Sep 2003 08:00:07 -0400, Christopher Browne
<cbbrowne@acm.org> wrote:
>I would be pretty "game" for a near-single-user-mode approach that
>would turn off some of the usual functionality that we knew we didn't
>need because the data source was an already-committed-and-FK-checked
>set of data.

Single user mode is a good idea, IMHO.  But it should only make sure
that there is not more than one user connected to the database (or to
the postmaster).  Everything else should depend on special GUC
variables that are only settable in single user mode:

db=> SET disable-fk-verification = true;
ERROR: "disable-fk-verification" can only be set in single user mode
db=> SET SINGLE USER MODE ON;
ERROR: permission denied
HINT: Must be superuser or owner of database "db".
db=> \c - dbo
You are now connected as new user "dbo".
db=> SET SINGLE USER MODE ON;
ERROR: cannot enter single user mode
HINT: You are not the only user connected to database "db".
-- after other users have logged out ...
db=> SET SINGLE USER MODE ON;
SET
db=> SET disable-fk-verification = true;
SET

Single user mode would also help in several cases where now a
standalone backend is required ...

ServusManfred


Re: ADD FOREIGN KEY

From
Christopher Browne
Date:
mkoi-pg@aon.at (Manfred Koizar) writes:
> On Tue, 30 Sep 2003 08:00:07 -0400, Christopher Browne
> <cbbrowne@acm.org> wrote:
>>I would be pretty "game" for a near-single-user-mode approach that
>>would turn off some of the usual functionality that we knew we didn't
>>need because the data source was an already-committed-and-FK-checked
>>set of data.
>
> Single user mode is a good idea, IMHO.  But it should only make sure
> that there is not more than one user connected to the database (or
> to the postmaster).

Well, there already exists an honest-to-goodness single-user mode,
where you start a postmaster directly.  

This is the way that you need to connect to PG in order to be able to
regenerate indexes for any "nailed" system tables.

If I could be certain that a "pg_fast_recovery" program could run
several times faster than the existing approach of "psql <
recoveryfile.sql", then it might well be worthwhile to have something
invoked something like the following:

% zcat /backups/latest_backup.gz | postmaster -D $PGDATA -F -N 0 --fast-recovery-off-ACID --log /tmp/recovery.log mydb

-N 0 means that there won't even be as many as one user connected to
the database.

I would, given an ideal world, prefer to be able to have a connection
or two live during this to let me monitor the DB and even get an early
peek at the data.  But if I could save a few hours of recovery time,
it might be livable to lose that.
-- 
select 'cbbrowne' || '@' || 'libertyrms.info';
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)


Re: ADD FOREIGN KEY

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> db=> SET disable-fk-verification = true;
> ERROR: "disable-fk-verification" can only be set in single user mode

I don't really see the point of such a restriction.  Restricting the
system to a single user has nothing to do with making
disable-fk-verification more safe.  It would simply be an artificial
restriction making the feature harder to use.

Also, not very long ago we were speculating about the possible value of
parallel restore processes --- while I recall being unimpressed with the
likely gains, I wouldn't want to put a permanent kibosh on the idea by
adopting a philosophy that restores are supposed to be done in
single-user mode.
        regards, tom lane


Re: ADD FOREIGN KEY

From
Greg Stark
Date:
Christopher Browne <cbbrowne@libertyrms.info> writes:

> I would, given an ideal world, prefer to be able to have a connection
> or two live during this to let me monitor the DB and even get an early
> peek at the data.  

On that note, how hard would it be to implement a read-dirty mode in postgres?
This would be useful for few things, the only thing I can think of are
progress indicators for long-running updates/inserts.

It seems like it falls naturally out of the MVCC algorithm, simply have it set
the transaction id of the current transaction to be a magic value that
compares greater than any transaction id. So all uncommitted transactions are
seen as having been committed in the past.

I don't see any real need for updates or inserts, but reasonable semantics for
them also fall out of MVCC. Any updates or inserts should be seen as being
committed infinitely far in the future. So they can only be seen by other
read-dirty transactions.

The main use for this that I see are doing select count(*) on tables being
imported or inserted into. Or perhaps being able to peek at records being
updated by another session in a long-running job.

If nothing else it'll save the load on the mailing list every time people ask
how to calculate how much longer their data load is going to take based on the
size of the files in the postgres data directory.

I'm sure I'm skipping a few steps. What I said doesn't quite make sense on its
own. I think I'm missing some key elements of the postgres MVCC system.

-- 
greg



Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Jan Wieck wrote:
> > I think this is a larger argument than the one that was being discussed
> > above. Given a dump of objects I own, can I restore them without requiring
> > the fk check to be done if I alter table add constraint a foreign key? If
> > the answer to that is no, then the option can be put in as a superuser
> > only option and it's relatively easy. If the answer to that is yes, then
> > there are additional issues that need to be resolved.
> 
> Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
> a consensus that we don't _want_ that. Probably we should declare it 
> deprecated and remove it in 7.5. And the option currently under 
> discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
> that _should_ be restricted.

Added to TODO:
* Remove CREATE CONSTRAINT TRIGGER

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Christopher Kings-Lynne
Date:
>>Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
>>a consensus that we don't _want_ that. Probably we should declare it 
>>deprecated and remove it in 7.5. And the option currently under 
>>discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
>>that _should_ be restricted.

How can we ever remove it - what about people upgrading from 7.0, 7.1, 
7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...

Chris




Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> 
> >>Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
> >>a consensus that we don't _want_ that. Probably we should declare it 
> >>deprecated and remove it in 7.5. And the option currently under 
> >>discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
> >>that _should_ be restricted.
> 
> How can we ever remove it - what about people upgrading from 7.0, 7.1, 
> 7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...

Not sure.  We can remove documentation about it, at least.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)

From
Jan Wieck
Date:
Bruce Momjian wrote:

> Christopher Kings-Lynne wrote:
>> 
>> >>Well, with the CREATE CONSTRAINT TRIGGER you _can_, but we already have 
>> >>a consensus that we don't _want_ that. Probably we should declare it 
>> >>deprecated and remove it in 7.5. And the option currently under 
>> >>discussion is exactly what will cause ALTER TABLE to let you, but IMHO 
>> >>that _should_ be restricted.
>> 
>> How can we ever remove it - what about people upgrading from 7.0, 7.1, 
>> 7.2?  Also, people upgrading from 7.3 who've never heard of adddepend...
> 
> Not sure.  We can remove documentation about it, at least.
> 

If the idea is to support any 7.n -> 7.m (where n < m) upgrade directly, 
then it's IMHO time for 8.0 and clearly stating that 7.x -> 8.y only is 
supported as 7.x -> 7.3 -> 8.0 -> 8.y and "you're on your own with any 
other attempt".

Don't get this wrong, I am a big friend of easy upgrades. But I am not a 
big friend of making improvements impossible by "backward compatibility 
forever". It was the "backward compatibility" to CP/M-80 (v2.2) that 
caused MS-DOS 7.0 to have a maximum commandline length of 127 characters 
... that was taking compatibility too far. Well, M$ took it too far the 
other way from there and is compatible to nothing any more, not even to 
themself ... but at least they learned from that mistake.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #