Re: [GENERAL] 7.4Beta - Mailing list pgsql-hackers

From Andreas Pflug
Subject Re: [GENERAL] 7.4Beta
Date
Msg-id 3F3D1670.2040209@pse-consulting.de
Whole thread Raw
In response to Re: [GENERAL] 7.4Beta  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Responses Re: [GENERAL] 7.4Beta
List pgsql-hackers
Stephan Szabo wrote:

>On Fri, 15 Aug 2003, Andreas Pflug wrote:
>
>  
>
>>Stephan Szabo wrote:
>>
>>    
>>
>>>On Fri, 15 Aug 2003, Andreas Pflug wrote:
>>>
>>>
>>>
>>>      
>>>
>>>>Stephan Szabo wrote:
>>>>
>>>>
>>>>
>>>>        
>>>>
>>>>>That really needs to be rewritten to do a single check over the table
>>>>>rather than running the constraint for every row.  I keep meaning to get
>>>>>around to it and never actually do. :(  I'm not sure that in practice
>>>>>you'll get a better plan at restore time depending on what the default
>>>>>statistics give you.
>>>>>
>>>>>
>>>>>
>>>>>          
>>>>>
>>>>This is clearly a case for a statement level trigger, as soon as
>>>>affected rows can be identified.
>>>>
>>>>
>>>>        
>>>>
>>>Well, I think single inserts might be more expensive (because the query is
>>>more involved for the table joining case) using a statement level trigger,
>>>so we'd probably want to profile the cases.
>>>
>>>
>>>      
>>>
>>This really depends. If a constraint is just a check on the
>>inserted/updated column, so no other row needs to be checked, there's no
>>faster way then the current row trigger. But FK constraints need to
>>execute a query to retrieve the referenced row, and every RDBMS prefers
>>to execute a single statement with many rows over many statements with a
>>single row, because the first will profit from optimization. And even if
>>only a single row is inserted or updated, there's still the need to
>>lookup the reference.
>>    
>>
>
>I don't think that addresses the issue I brought up.  If you're doing a
>bunch of single inserts:
>begin;
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>insert into foo values (1);
>end;
>
>Each of those statement triggers is still only going to be dealing with a
>single row.  If you're in immediate mode there's not much you can do about
>that since the constraint is checked between inserts.  If you're in
>deferred mode, right now it won't help because it's not going to batch
>them, it's going to be 5 statement triggers AFAICT each with its own 1 row
>affected table.
>
>I believe that the more complicated join the old/new table with the pk
>table and do the constraint check is going to be slightly slower than the
>current row behavior for such cases because the trigger query is going to
>be more complicated.  What would be nice would be some way to choose
>whether to use a single query per statement vs a simpler query per row
>based on what's happening.
>
>  
>

Deferring the constraint check would mean checking 5 single rows, right. 
But I still can't see why you think that a row level trigger would be 
cheaper in this case. I had a look at ri_triggers.c and what's coded 
there looks just as I expected, doing a query on the referenced table. 
the queries might look a bit different when checking multiple rows at 
once, but carefully designed I doubt that there would be a performance 
hit from this. In case it *is* significantly slower, single row updates 
could be handled separately using the current triggers, and statement 
triggers for multiple rows. This would cover both scenarios best. At the 
moment, update/insert scales not too good.

Best thing in the situation above would certainly be if all 5 rows would 
be checked in a single query, but that looks quite impossible because a 
mixture of inserts/updates/deletes on different tables might be deferred.

Regards,
Andreas




pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: [GENERAL] 7.4Beta
Next
From: Stephan Szabo
Date:
Subject: Re: [GENERAL] 7.4Beta