Thread: Foreign key performance

Foreign key performance

From
Kevin Brown
Date:
I'm using 7.3.2 on Linux, with a decent amount of muscle behind it
(1.5 GHz PPro CPU, 1G mem, 20M/sec disks, xlog on different disk than
data).

I've got a database that has several foreign keys, and I'm copying a
bunch of data from an MS-SQL server into it via Perl DBI.  I noticed
that inserts into this database are very slow, on the order of 100 per
second on this hardware.  All the inserts are happening in a single
transaction.  The postmaster I'm connected to appears to be CPU
limited, as it's pegging the CPU at a constant 85 percent or more.

I have no problem with that under normal circumstances (i.e., the
foreign key constraints are actively being enforced): it may well be
the nature of foreign keys, but the problem is this: all the keys are
DEFERRABLE INITIALLY DEFERRED and, on top of that, the Perl program
will SET CONSTRAINTS ALL DEFERRED at the beginning of the transaction.

If I remove all the foreign key constraints, my performance goes up to
700 inserts per second!

Why isn't the insert performance with all the constraints deferred
approximating that of the performance I get without the foreign keys??
If anything, I should get a big delay at transaction commit time while
all the foreign key constraints are checked (and, indeed, I get that
too), but the performance during the transaction prior to the commit
should be the same as it is without the foreign key constraints.

It's almost as if the foreign key constraints are being invoked and
the results ignored during the inserts...

In essence, this smells like a bug to me, but I don't know enough
about the internals to really call it that.


Any ideas on what can be done about this?




--
Kevin Brown                          kevin@sysexperts.com


Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Thu, 17 Apr 2003, Kevin Brown wrote:

> I have no problem with that under normal circumstances (i.e., the
> foreign key constraints are actively being enforced): it may well be
> the nature of foreign keys, but the problem is this: all the keys are
> DEFERRABLE INITIALLY DEFERRED and, on top of that, the Perl program
> will SET CONSTRAINTS ALL DEFERRED at the beginning of the transaction.
>
> If I remove all the foreign key constraints, my performance goes up to
> 700 inserts per second!
>
> Why isn't the insert performance with all the constraints deferred
> approximating that of the performance I get without the foreign keys??

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

On batch inserts to three tables each with a foreign key to a table
containing one row (and inserts of lots of that value), I saw a ratio of
approximately 1:1.7:7 for normal inserts:non-deferred fk:deferred fk on my
7.4 dev server.


Re: [PERFORM] Foreign key performance

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> It appears (from some not terribly scientific experiments - see below)
> that it's likely to be related to managing the deferred trigger queue
> given that in my case at least running the constraints non-deferred was
> negligible in comparison.

At one time the deferred-trigger queue had an O(N^2) behavioral problem
for large N = number of pending trigger events.  But I thought we'd
fixed that.  What's the test case exactly?  Can you get a profile with
gprof?

            regards, tom lane


Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Fri, 18 Apr 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > It appears (from some not terribly scientific experiments - see below)
> > that it's likely to be related to managing the deferred trigger queue
> > given that in my case at least running the constraints non-deferred was
> > negligible in comparison.
>
> At one time the deferred-trigger queue had an O(N^2) behavioral problem
> for large N = number of pending trigger events.  But I thought we'd
> fixed that.  What's the test case exactly?  Can you get a profile with
> gprof?

I'm going to tomorrow hopefully - but it looks to me that we fixed one, but
possibly not another place where we read through the list unnecessarily
AFAICS. I think deferredTriggerInvokeEvents (when called with
immediate_only = true) is going to go through the entire list looking for
immediate triggers to fire after each statement.  However, excepting set
constraints, any immediate triggers for any event added prior to this
statement will by necessity have already been run unless I'm missing
something, which means that we're often looking through entries that
aren't going to have any triggers to run now in any case.

Keeping a pointer to the end of the list as of last statement and going
through the list from there cut the time for the deferred case in half in
my simple test (about 3.3x the no fk and just under 2x the immediate).


Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
[Not sure this really is relevant for -performance at this point]

On Thu, 17 Apr 2003, Stephan Szabo wrote:

> On Fri, 18 Apr 2003, Tom Lane wrote:
>
> > Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > > It appears (from some not terribly scientific experiments - see below)
> > > that it's likely to be related to managing the deferred trigger queue
> > > given that in my case at least running the constraints non-deferred was
> > > negligible in comparison.
> >
> > At one time the deferred-trigger queue had an O(N^2) behavioral problem
> > for large N = number of pending trigger events.  But I thought we'd
> > fixed that.  What's the test case exactly?  Can you get a profile with
> > gprof?
>
> I'm going to tomorrow hopefully - but it looks to me that we fixed one, but

Argh. I'm getting that state where gprof returns all 0s for times.  I'm
pretty sure this has come up before along with how to get it to work, but
I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

In any case, the call list seemed reasonable.  It's currently doing O(n^2)
calls to MemoryContextReset and deferredTriggerCheckState in InvokeEvents
I don't see anything else that's at that kind of number of calls (50
million calls for a backend that's only done 10000 inserts stands out a
bit).  Going only from last statement seems to make it linear (I think my
attempt is checking 1 too many trigger values, need to change that
probably).



Re: [PERFORM] Foreign key performance

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> Argh. I'm getting that state where gprof returns all 0s for times.  I'm
> pretty sure this has come up before along with how to get it to work, but
> I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

You're on Linux?  You need to compile postmaster.c with -DLINUX_PROFILE.
But the call counts do sound pretty damning.
        regards, tom lane



Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Fri, 18 Apr 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Argh. I'm getting that state where gprof returns all 0s for times.  I'm
> > pretty sure this has come up before along with how to get it to work, but
> > I couldn't find it in the archives. Someday I'll learn how to use gprof. :(
>
> You're on Linux?  You need to compile postmaster.c with -DLINUX_PROFILE.

Yep, thanks. :)

> But the call counts do sound pretty damning.

Yeah, but even with my hack last night it was still appreciably slower
than immediate constraints. Comparing the call counts in that function
for the immediate versus deferred(hacked) weren't giving me a good idea of
where that time was going.



Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Fri, 18 Apr 2003, Stephan Szabo wrote:

> On Fri, 18 Apr 2003, Tom Lane wrote:
>
> > But the call counts do sound pretty damning.
>
> Yeah, but even with my hack last night it was still appreciably slower
> than immediate constraints. Comparing the call counts in that function
> for the immediate versus deferred(hacked) weren't giving me a good idea of
> where that time was going.

This last was due to assert checking I think.  AllocSetCheck was the big
time waster on the hacked deferred case. Turning off assert checking I
get:

Median over 3 100000 inserts in one transaction (excepting the original
code which took a really long time so I ran it once) from time psql ...

No Fk                   24.14s
Immediate FK            42.80s
Original Deferred FK  1862.06s
Hacked Deferred FK      35.30s

The hack was just the keeping around the list pointer from the last run
through (see attached - passed simple fk tests and regression, but there
might be problems I don't see).  Looking at the code, I also wonder if we
would get some gain by not allocating the per_tuple_context at the
beginning but only when a non-deferred constraint is found since otherwise
we're creating and destroying the context and possibly never using it.
The cost would presumably be some boolean tests inside the inner loop.

Re: [PERFORM] Foreign key performance

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> The hack was just the keeping around the list pointer from the last run
> through (see attached - passed simple fk tests and regression, but there
> might be problems I don't see).

Shouldn't this patch update the comment in deferredTriggerInvokeEvents
(c. line 1860 in cvs tip)?

> Looking at the code, I also wonder if we
> would get some gain by not allocating the per_tuple_context at the
> beginning but only when a non-deferred constraint is found since otherwise
> we're creating and destroying the context and possibly never using it.

I doubt it's worth worrying over.  Creation/destruction of a never-used
memory context is pretty cheap, I think.
        regards, tom lane



Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Sat, 19 Apr 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > The hack was just the keeping around the list pointer from the last run
> > through (see attached - passed simple fk tests and regression, but there
> > might be problems I don't see).
>
> Shouldn't this patch update the comment in deferredTriggerInvokeEvents
> (c. line 1860 in cvs tip)?

Probably, since the second part of that is basically what this is.  I'll
update and send updated patch tomorrow.

> > Looking at the code, I also wonder if we
> > would get some gain by not allocating the per_tuple_context at the
> > beginning but only when a non-deferred constraint is found since otherwise
> > we're creating and destroying the context and possibly never using it.
>
> I doubt it's worth worrying over.  Creation/destruction of a never-used
> memory context is pretty cheap, I think.

Okay, sounds good enough for me. :)



Re: [PERFORM] Foreign key performance

From
Stephan Szabo
Date:
On Sat, 19 Apr 2003, Stephan Szabo wrote:

> On Sat, 19 Apr 2003, Tom Lane wrote:
>
> > Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > > The hack was just the keeping around the list pointer from the last run
> > > through (see attached - passed simple fk tests and regression, but there
> > > might be problems I don't see).
> >
> > Shouldn't this patch update the comment in deferredTriggerInvokeEvents
> > (c. line 1860 in cvs tip)?
>
> Probably, since the second part of that is basically what this is.  I'll
> update and send updated patch tomorrow.

Okay, this changes the second paragraph of that comment.  I left in the
comment that's really similar next to where I actually do the selection of
which start point to use.

Re: [PERFORM] Foreign key performance

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> Okay, this changes the second paragraph of that comment.  I left in the
> comment that's really similar next to where I actually do the selection of
> which start point to use.

This had a bit of a problem yet: the loop in deferredTriggerInvokeEvents
expects 'prev_event' to point to the list entry just before 'event'.
A nice byproduct of fixing that is we don't uselessly rescan the last list
entry.  I also tried to improve the comments a little.  You can see what
I actually applied at
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/trigger.c
        regards, tom lane



Re: [PERFORM] Foreign key performance

From
Kevin Brown
Date:
Tom Lane wrote:
> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Okay, this changes the second paragraph of that comment.  I left in the
> > comment that's really similar next to where I actually do the selection of
> > which start point to use.
> 
> This had a bit of a problem yet: the loop in deferredTriggerInvokeEvents
> expects 'prev_event' to point to the list entry just before 'event'.
> A nice byproduct of fixing that is we don't uselessly rescan the last list
> entry.  I also tried to improve the comments a little.  You can see what
> I actually applied at
>
> http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/trigger.c

Any chance of backporting these changes to 7_3_STABLE (when you're
satisfied they don't break anything)?  Just looking at the CVS log for
trigger.c, it appears there have been enough changes since then that
it might not be easy to do (and since it's not necessarily a "bug fix"
as such, it might not qualify for backporting to a stable version).

Even if it's not something that can be put into another release of
7.3, it would certainly be useful to me.  It might be useful to enough
people to justify releasing it as a patch on -patches, if nothing
else.

I'd do it myself but I don't understand the trigger code at all (and
if there's any documentation you can point me to that describes the
various functions and supporting data structures in trigger.c, that
would help a lot), and I'd rather not touch something like that until
I understand it thoroughly.


-- 
Kevin Brown                          kevin@sysexperts.com



Re: [PERFORM] Foreign key performance

From
Tom Lane
Date:
Kevin Brown <kevin@sysexperts.com> writes:
> Any chance of backporting these changes to 7_3_STABLE (when you're
> satisfied they don't break anything)?  Just looking at the CVS log for
> trigger.c, it appears there have been enough changes since then that
> it might not be easy to do (and since it's not necessarily a "bug fix"
> as such, it might not qualify for backporting to a stable version).

I'd be pretty hesitant to make such a change in the stable branch ---
at least not without a lot of testing.  If you and others want to
provide such testing, go to it.  The patch appears to apply cleanly
enough to 7.3, but here's an adjusted patch if fuzz makes you nervous...
        regards, tom lane

*** trigger.c~    Sun Apr 20 20:28:55 2003
--- trigger.c    Sun Apr 20 20:29:13 2003
***************
*** 1461,1472 ****
--- 1461,1478 ----  * Because this can grow pretty large, we don't use separate List nodes,  * but instead thread the
listthrough the dte_next fields of the member  * nodes.  Saves just a few bytes per entry, but that adds up.
 
+  * 
+  * deftrig_events_imm holds the tail pointer as of the last 
+  * deferredTriggerInvokeEvents call; we can use this to avoid rescanning
+  * entries unnecessarily.  It is NULL if deferredTriggerInvokeEvents
+  * hasn't run since the last state change.  *  * XXX Need to be able to shove this data out to a file if it grows too
*       large...  * ----------  */ static DeferredTriggerEvent deftrig_events;
 
+ static DeferredTriggerEvent deftrig_events_imm; static DeferredTriggerEvent deftrig_event_tail;  
***************
*** 1680,1686 **** deferredTriggerInvokeEvents(bool immediate_only) {     DeferredTriggerEvent event,
!                 prev_event = NULL;     MemoryContext per_tuple_context;     Relation    rel = NULL;     TriggerDesc
*trigdesc= NULL;
 
--- 1686,1692 ---- deferredTriggerInvokeEvents(bool immediate_only) {     DeferredTriggerEvent event,
!                 prev_event;     MemoryContext per_tuple_context;     Relation    rel = NULL;     TriggerDesc
*trigdesc= NULL;
 
***************
*** 1692,1704 ****      * are going to discard the whole event queue on return anyway, so no      * need to bother with
"retail"pfree's.      *
 
!      * In a scenario with many commands in a transaction and many
!      * deferred-to-end-of-transaction triggers, it could get annoying to
!      * rescan all the deferred triggers at each command end. To speed this
!      * up, we could remember the actual end of the queue at EndQuery and
!      * examine only events that are newer. On state changes we simply
!      * reset the saved position to the beginning of the queue and process
!      * all events once with the new states.      */      /* Make a per-tuple memory context for trigger function
calls*/
 
--- 1698,1709 ----      * are going to discard the whole event queue on return anyway, so no      * need to bother with
"retail"pfree's.      *
 
!      * If immediate_only is true, we need only scan from where the end of
!      * the queue was at the previous deferredTriggerInvokeEvents call;
!      * any non-deferred events before that point are already fired.
!      * (But if the deferral state changes, we must reset the saved position
!      * to the beginning of the queue, so as to process all events once with
!      * the new states.  See DeferredTriggerSetState.)      */      /* Make a per-tuple memory context for trigger
functioncalls */
 
***************
*** 1709,1715 ****                               ALLOCSET_DEFAULT_INITSIZE,
ALLOCSET_DEFAULT_MAXSIZE);
 
!     event = deftrig_events;     while (event != NULL)     {         bool        still_deferred_ones = false;
--- 1714,1735 ----                               ALLOCSET_DEFAULT_INITSIZE,
ALLOCSET_DEFAULT_MAXSIZE);
 
!     /*
!      * If immediate_only is true, then the only events that could need firing
!      * are those since deftrig_events_imm.  (But if deftrig_events_imm is
!      * NULL, we must scan the entire list.)
!      */
!     if (immediate_only && deftrig_events_imm != NULL)
!     {
!         prev_event = deftrig_events_imm;
!         event = prev_event->dte_next;
!     }
!     else
!     {
!         prev_event = NULL;
!         event = deftrig_events;
!     }
!      while (event != NULL)     {         bool        still_deferred_ones = false;
***************
*** 1830,1835 ****
--- 1850,1858 ----     /* Update list tail pointer in case we just deleted tail event */     deftrig_event_tail =
prev_event;
 
+     /* Set the immediate event pointer for next time */
+     deftrig_events_imm = prev_event;
+      /* Release working resources */     if (rel)         heap_close(rel, NoLock);
***************
*** 1917,1922 ****
--- 1940,1946 ----     MemoryContextSwitchTo(oldcxt);      deftrig_events = NULL;
+     deftrig_events_imm = NULL;     deftrig_event_tail = NULL; } 
***************
*** 2146,2153 ****      * CONSTRAINTS command applies retroactively. This happens "for free"      * since we have
alreadymade the necessary modifications to the      * constraints, and deferredTriggerEndQuery() is called by
 
!      * finish_xact_command().      */ }  
--- 2170,2180 ----      * CONSTRAINTS command applies retroactively. This happens "for free"      * since we have
alreadymade the necessary modifications to the      * constraints, and deferredTriggerEndQuery() is called by
 
!      * finish_xact_command().  But we must reset deferredTriggerInvokeEvents'
!      * tail pointer to make it rescan the entire list, in case some deferred
!      * events are now immediately invokable.      */
+     deftrig_events_imm = NULL; }  



Re: [PERFORM] Foreign key performance

From
Kevin Brown
Date:
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Any chance of backporting these changes to 7_3_STABLE (when you're
> > satisfied they don't break anything)?  Just looking at the CVS log for
> > trigger.c, it appears there have been enough changes since then that
> > it might not be easy to do (and since it's not necessarily a "bug fix"
> > as such, it might not qualify for backporting to a stable version).
> 
> I'd be pretty hesitant to make such a change in the stable branch ---
> at least not without a lot of testing.  If you and others want to
> provide such testing, go to it.  The patch appears to apply cleanly
> enough to 7.3, but here's an adjusted patch if fuzz makes you
> nervous...

Thanks, Tom.  I've applied the patch to my server and it has so far
passed the few tests I've thrown at it so far (it has detected foreign
key violations in both immediate and deferred trigger mode).  And just
so you know, it performs FAR better than the pre-patched version does
-- in the overall transaction I'm doing, I see very little difference
now between deferred triggers and no triggers!


-- 
Kevin Brown                          kevin@sysexperts.com



Re: [PERFORM] Foreign key performance

From
Jan Wieck
Date:
Stephan Szabo wrote:
> 
> [Not sure this really is relevant for -performance at this point]
> 
> On Thu, 17 Apr 2003, Stephan Szabo wrote:
> 
> > On Fri, 18 Apr 2003, Tom Lane wrote:
> >
> > > Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > > > It appears (from some not terribly scientific experiments - see below)
> > > > that it's likely to be related to managing the deferred trigger queue
> > > > given that in my case at least running the constraints non-deferred was
> > > > negligible in comparison.
> > >
> > > At one time the deferred-trigger queue had an O(N^2) behavioral problem
> > > for large N = number of pending trigger events.  But I thought we'd
> > > fixed that.  What's the test case exactly?  Can you get a profile with
> > > gprof?
> >
> > I'm going to tomorrow hopefully - but it looks to me that we fixed one, but
> 
> Argh. I'm getting that state where gprof returns all 0s for times.  I'm
> pretty sure this has come up before along with how to get it to work, but
> I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

You have to save and restore the timers around the fork() under Linux.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #